Variable names as handles or sigils

I'm trying to get back into using the notebook, so I'm digging through my drafts for posts I can rescue. This one bogged down because I made the mistake of doing some reading and deciding that it was More Complicated Than That. Anyway, this seems like one part of the story, even if it isn't a unified theory of variable names, so I thought I'd write it up and publish.

Mathematicians and physicists normally use single characters like \(p\) or \(\lambda\) for variable names, whereas programmers normally use strings of characters like width or lastName. Every so often I see a take that goes something like this:

"Single-character variable names in maths are an artefact of having to write everything down on paper. Longer variable names are more descriptive and an obviously better choice when you don't have this constraint."

There's a bit of truth to this, but I think it ignores at least one major tradeoff. This tradeoff is probably pretty obvious to a lot of people, but I haven't seen it written down in any detail, so I'm going to try here.

Variable names fall along a spectrum of how generally useful the thing you're labelling is. At one end of the spectrum you have pragmatic labels that refer to some one-off thing that you need to track in the problem in front of you. For example, you're writing the business logic for handling some particular type of widget, and so you add a widgetCount variable. At the other end of the spectrum you have labels for things that come up again and again in multiple problems. For example, in school geometry you come across angles all the time, and by default you label them with \(\theta\).

I'm going to call the first type handles because they are simple one-time attachments to a particular thing in the world. The complexity is out there in the world, in the particularities of whatever kind of widget you're tracking, and all the handle does is use your ordinary-language understanding of what widgets are to point to them and say "those ones". There's no magic in the handle itself.

In contrast, the second type of variable name has been used so much to refer to a particular thing that a strong association with the thing has accreted around the variable itself. I think of these as sigils because they've been charged with meaning through repeated use. To me, \(\theta\) just looks like an angle now, and it would be very weird and distracting to use it for a length.

Sigils are useful because they concentrate meaning into a very small space. You can just look at \(A_{ij}\), say, and say "ah, that's a matrix, I'd better do matrix things to it." This is particularly valuable in maths, where you deal with natural regularities that come up in a lot of different situations. Generally mathematical objects are fairly simple in themselves, at least compared to most of the bags of properties and functions you'll find in programming, and the complexity instead comes from combining and transforming them. That makes a compact notation especially useful there.

One-off labellings don't have the advantage of bringing an interpretation along with them. You could abbreviate "widget" to w, but that would probably be more confusing than helpful because w isn't charged with any particular feeling of widgetness. Programming normally has to connect to particulars to get anything done, so it needs a lot of handles.
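Here's a rough sketch of that difference in Python (where widgetCount would conventionally be spelled widget_count). The Widget class, its fields and both function names are made up for the illustration. The two functions do exactly the same thing; only the names differ.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    # Hypothetical record type, invented for this illustration.
    name: str
    in_stock: bool


def count_in_stock(widgets: list[Widget]) -> int:
    # Handles: each name uses ordinary language to point at the thing.
    widget_count = 0
    for widget in widgets:
        if widget.in_stock:
            widget_count += 1
    return widget_count


def count_in_stock_terse(ws: list[Widget]) -> int:
    # Same logic with abbreviated names. Nothing about `w` or `c`
    # carries any built-in sense of "widget" or "count".
    c = 0
    for w in ws:
        if w.in_stock:
            c += 1
    return c


inventory = [
    Widget("sprocket", True),
    Widget("flange", False),
    Widget("gear", True),
]
print(count_in_stock(inventory), count_in_stock_terse(inventory))
```

The terse version is shorter, but the reader has to reconstruct what c and w stand for from the surrounding code, because the letters bring no interpretation with them.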

This isn't a hard-and-fast rule, and sometimes maths has handles and programming has sigils.

For an example of handles in maths, sometimes you do a big calculation and end up with some intermediate quantity that's temporarily useful but which you don't have an interpretation for, and you're like "I dunno, let's call that \(Q\), I haven't used that one yet and it doesn't mean anything in particular in this context". In that case \(Q\) is just a handle, and not a particularly good one. A longer programming-style name would actually be better if you're not writing by hand, at least if you can find a reasonably meaningful one.

Another way you sometimes see handles appearing in maths is people tacking on a subscript or superscript text label to a single-character variable, as a kind of cheat code for making it into a handle. Here's an example I found from the Wikipedia article on the stress-strain curve:

Here \(\sigma\), \(F\) and \(A\) are standard symbols for the stress, force and cross-sectional area, but in this case \(A\) is the area of something called the "neck". I haven't read the article and don't know what this means, but presumably "the cross-sectional area of the neck" refers to something specific enough to not have a default symbol. So instead we get this sort of hybrid of a sigil \(A\) and a handle \(\text{neck}\).

(I deliberately chose an engineering-type subject when looking for an example, because I'd expect engineers to have to relate their formalism back to some particularity of the world. I thought I'd have to click through several pages, but got one on the first try!)

There are also a lot of repeated, general concepts in programming and often you do see short sigil-like variables there. A common example is the use of i and j as counter variables. Things like loops and arrays come up all the time, so it makes sense to have dedicated symbols for counting your place in them. This particular convention is strong enough that using i or j for anything else feels odd.
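As a small sketch of the convention, here is a matrix transpose in Python. The function and the nested-list representation are assumptions chosen for the example; the point is only that i and j read as "row index" and "column index" without any comment needing to say so.

```python
def transpose(A: list[list[float]]) -> list[list[float]]:
    # Assumes A is a non-empty rectangular matrix stored as a list of rows.
    n_rows, n_cols = len(A), len(A[0])
    T = [[0.0] * n_rows for _ in range(n_cols)]
    for i in range(n_rows):      # i counts rows of A
        for j in range(n_cols):  # j counts columns of A
            T[j][i] = A[i][j]
    return T


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
print(transpose(A))
```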

It's probably not a coincidence that counting is a maths thing and this convention is also borrowed from maths. Other single-character names like \(f\) for file and \(e\) for exception are common but less strongly locked to the one interpretation.
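A minimal sketch of the f and e conventions in Python follows. The helper name read_first_line and the file names are invented for the example, and it writes to a temporary directory so it can run anywhere.

```python
import os
import tempfile


def read_first_line(path: str) -> str:
    # Returns the first line of the file, or "" if it cannot be opened.
    try:
        with open(path, encoding="utf-8") as f:  # f: the file
            return f.readline().rstrip("\n")
    except OSError as e:                         # e: the exception
        print(f"could not read {path}: {e}")
        return ""


with tempfile.TemporaryDirectory() as tmp_dir:
    notes_path = os.path.join(tmp_dir, "notes.txt")
    with open(notes_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")
    first_line = read_first_line(notes_path)
    missing = read_first_line(os.path.join(tmp_dir, "no_such_file.txt"))
```

Unlike i and j, these letters lean on context: f right after open(...) reads as a file, but elsewhere the same letter could just as easily be a function.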

So, that's the expanded, tidied-up version of the draft I wrote before doing some reading. I don't remember exactly what I read, but it definitely involved going through the many good answers to this Math.SE question.

The top answer is interestingly different. It also points out the usefulness of compactness in mathematical notation, but instead of emphasising the use of single-character variables as centres of meaning it takes almost the exact opposite approach, thinking of them as abstract symbols where the important thing is that you know how to push them around in the right way. Tokens, not sigils. In this case the single-character name is useful because it removes the distracting details of the thing it represents.

This also seems kind of right to me? I guess this relates to the usual thing in maths, where both intuitive meaning and calculational fluency are important. But it's sort of surprising to me that the single-character variable convention can support both at once.