CLIs vs GUIs

I wanted to do a bit of research on what the standard folk wisdom is for when to use a GUI and when to use a command line tool, because I’m thinking of using it as an example in a blog post (working title: “Derrida with smaller words and more examples”). Same sort of idea as last year’s visual programming post (which I might also raid for examples), but I’m not planning to do a several-week deep dive this time.

First I googled ‘GUI vs CLI’ and found approximately 1 million crappy SEO-optimised listicles copying the same ten points from each other. So I gave up and typed the same thing into Hacker News search and got a real post, GUI vs CLI: Operation vs Expression by Vivek Haldar. This links to another of his posts, The Cognitive Style of Unix, which is also good. Also, there are the HN comments, which are mostly arguing over definitions but have a few insightful bits.

Haldar argues that the important feature of CLIs is the unboundedness you get from being able to freely combine elements. He calls this expression, and contrasts it with GUIs, where you are operating a fixed palette of options:

There is a time for operation, and a time for expression. Most interface debates are misinformed because they are about creators complaining about operation interfaces, or operators complaining about creation interfaces. We need both. But we also need to understand when.

I'm not too keen on the 'expression' terminology, because 'expression' to me also has connotations of being vividly, immediately meaningful, which is not necessarily the case here, and in fact you might find more of this kind of expressive meaning in using a colour picker than piping together commands. Maybe this is just too much Husserl, because he definitely uses 'expression' in that way. But also e.g. 'artistic expression' has shades of this.

(This is also the terminology that the HN commenters get bogged down with, mostly because it sounds like the good one, so various GUI defenders are shouting 'GUIs are expressive too!')

His starting examples are interesting to me. He uses car controls as a sort of GUI analogue:

The goal of these interfaces is to make you operate something, and operate it efficiently and safely. The grooves and clicks and limits constrain the range of motion and the number of choices. The visual look heavily hints at how to actively use it. They are usually not hard to learn. More importantly, the learning curve plateaus. Once you learn how to drive a car, there’s not much progression after that. Boundedness is an important property of these interfaces, by design.

And the frets of a guitar as a sort of CLI analogue:

The goal of these interfaces is to express and create. Note the almost total lack of constraints. There are no hints in the interface as to how to “correctly” use these. To a newbie, a first look offers no hints for how to operate them. You have to go through a steep learning process to even begin to use these. More importantly, the learning curve never flattens. You could be playing a guitar for decades and still learn new things. Unboundedness is an important property of these interfaces, by design.

There seem to be a few different things going on here, so I'm going to pedantically overanalyse them for a bit.

Visual representation. Car controls are supposed to 'look like the thing' in some way. Turn wheel right to go right. Hit left indicator to indicate left. This is only a partial similarity: the left indicator doesn't actually look like a flashing light in any way. Also it only holds for some controls: the brake doesn't 'look like slowing down'. By this standard, I'm not convinced that guitar strings are any less 'looking like the thing'. Play a note higher by going up the fretboard. Play a note lower by going down the fretboard. Play a higher string by moving across the fretboard.

Constraints. These can work by discretisation or limits on the range of motion. Car controls have this. But so do guitars! Frets discretise the fingerboard and also provide a 'highest note'.

Difference in learning curves. Now finally I think he's on to something, though it could do with a bit of unpacking. Both guitars and cars give you a palette of primitive actions: for a car it's turning the steering wheel left or right, moving the brake down or up, etc, and for a guitar it's plucking or strumming different strings with different actions, putting your fingers at different positions on the fingerboard, etc. So in theory in both cases you could combine these primitive actions in any way you liked, producing unboundedly complicated sequences of actions. Still, most ways of doing this with a car will leave you in a ditch somewhere. The main goal is to stay on the road, and this is designed to be manageable without having to do really elaborately patterned sequences of primitive actions. On the other hand, guitar actions can be clustered into e.g. scales and chords, which can themselves be combined in different ways at a higher level.

There's probably more to analyse here, as these examples are really rich. Too rich, in fact, to do what he wants them to do: they both pull in too many directions at once.

This bit is very relevant to the argument I want to make in the Derrida post, about iterable marks breaching context:

What makes one type of interface bounded and the other one unbounded? It’s the ability to combine. Physical interfaces (and by extension, GUIs) cannot combine at will to create new behavior. You can’t make changes to the way your odometer behaves, without ripping it out and putting in a new one. You can’t “pipe” the reading from it to an alarm that beeps when you go over a certain limit.

Again, he doesn't engage too well with the starting examples (a guitar is also a 'physical interface', so why doesn't the same argument hold there?) but there is something to it. To make things combinable in the sort of repeatable patterns you need for what Haldar would call 'expressiveness' (and what I would follow Derrida and call 'iterability') you need to be able to pull things out of their immediate context for reuse. You can do this with notes on a guitar, by playing essentially any note after any other note (maybe this overstates it, as many combinations will sound awful, but there is a lot of freedom). You can also play 'the same' middle C on a clarinet instead, or a piano... there's some core of the experience that isn't locked to context.

This doesn't work so well with primitive driving actions, because you are so tied to what the road requires you to do. So musical notes are more symbol-y than driving actions.

(I think this matches common sense pretty nicely. There are a number of useful notations for musical notes, but there isn't a 'driving notation' that drivers learn and study. Music has more 'symbolisation potential' than driving.)

Haldar links out to another of his posts, The Cognitive Style of Unix, which elaborates on another difference, internalisation vs externalisation (there are some potentially interesting paper references there). Vi is the classic tool you need to internalise, where beginners can't even work out how to quit it because :q! is a completely opaque command. Externalised tools give you more hints (e.g. autocomplete), allowing you to offload more thinking to the environment.

CLI tools tend to be more internalisation-heavy, and GUIs more externalisation-heavy. The paper Haldar quotes has a bit of an 'eating your vegetables is good for you' flavour, saying that internalisation requires you to actually think, improving problem-solving efficiency. Similar to the bat and ball 'cognitive reflection' argument really. In this view, CLIs are good because they're difficult.

That's probably part of it, but there's also a sort of 'iterability pressure' or 'iterability force' thing going on. I just made up those terms and don't think they're great... what I mean is that the more you want to make actions iterable (repeatable, reusable out of context), the more you need to detach them from specific situations. This makes them powerful but is also a force driving towards making them more abstract and more difficult to learn.
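As a toy illustration of that detachment, here's a minimal sketch in Python of the 'pipe the reading to an alarm' idea from Haldar's odometer example. Everything in it is my own invention for illustration: the function names over_limit and alarm, the sample numbers, and the assumption that the dashboard reading arrives as a plain list of speeds. The point is only that neither stage knows anything about cars, which is exactly what lets the same two pieces be reused on something unrelated.

```python
from typing import Iterable, Iterator, List


def over_limit(readings: Iterable[float], limit: float) -> Iterator[float]:
    """Pass through only the readings that exceed the limit."""
    return (r for r in readings if r > limit)


def alarm(readings: Iterable[float], label: str) -> List[str]:
    """Turn each incoming reading into a 'beep' message."""
    return [f"BEEP: {label} at {r}" for r in readings]


# Hypothetical speed readings (made-up numbers), 'piped' into the alarm.
speeds = [28.0, 31.5, 29.0, 42.0]
speed_beeps = alarm(over_limit(speeds, limit=30.0), "speed")

# The same two stages, pulled out of the driving context and reused
# on something else entirely (again, made-up numbers).
oven_temps = [180.0, 230.0]
oven_beeps = alarm(over_limit(oven_temps, limit=220.0), "oven temperature")

print(speed_beeps)
print(oven_beeps)
```

The price of this reuse is visible in the code too: 'readings' and 'limit' are more abstract than 'my speed' and 'the speed limit', and you have to already know that the two stages can be nested before you can do anything with them.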

I was going to dig through the HN comments as well but I've run out of steam now. Maybe that will be a follow-up post, or maybe not.