Différance in neural nets
[Clearing out my drafts, this is from a couple of months ago. I lost the thread of the paper halfway through but got through enough of it that I may as well post this.]
Sometimes I like to type keywords into Google Scholar to see if I can fish up something weird and interesting. This time I typed 'differance Derrida', and a few pages into the search results I found 'Derrida and connectionism: différance in neural nets', a 1992 paper by Gordon G. Globus. That sounds weird and interesting to me!
I looked up Globus and he has a seriously weird list of publications, mixing phenomenology, medical topics and various esoteric things that don't really fit in either category. Sample titles: 'Quantum brain theory and the appearing of world', 'Human vaginal odors', 'The effect of lorazepam on sleep', 'Unexpected symmetries in the "world knot"'. This could end up being too crankish even for me, but it reads coherently, and there's a version of this argument that would plausibly make sense to me, so it's worth a try.
Here's my understanding of why this sounds plausible. One way of thinking about différance is that it's Derrida's term for the sort of ground or substrate required to be able to create 'concepts', where concepts in this case are any kind of abstraction from sense data so that many different raw perceptions (what he calls 'events') can be identified as the same thing. Neural nets definitely have the ability to identify concepts in this narrow sense: they can label multiple different arrays of pixels as 'the letter A', or (these days) 'dogs' or 'cats'. So, if Derrida's really on to something, a neural net should provide the kind of substrate he calls 'différance'.
I've been wondering vaguely about simpler things in the same direction already. One basic property that neural nets have that's required for concept formation is what Brian Cantwell Smith would call 'flex and slop'. The inputs are not rigidly wired to the outputs. Instead they are multiply connected to a large number of intermediate nodes, with weights that can be freely adjusted. This gives them some space from the raw input data, which is necessary for forming concepts where many inputs map to the same feature. (Well, I guess you could hand-wire many different input patterns to the same output. But this would be brittle, and it wouldn't recognise a new input pattern.)
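Here's a toy sketch of that brittleness point in Python. Nothing in it comes from Globus or Smith: the six-'pixel' patterns, the labels and the function names are all made up, and I've used a single adjustable unit (a plain perceptron) rather than a net with intermediate nodes, just to keep it short. The hand-wired version is a lookup table from exact input patterns to outputs; the weighted version learns from the same patterns.

```python
# Toy comparison: a hand-wired lookup table vs. a unit with adjustable weights.
# All patterns and names here are invented for illustration.

# Six-"pixel" patterns. Label 1 = left half lit, label 0 = right half lit.
TRAINING = [
    ((1, 1, 1, 0, 0, 0), 1),
    ((1, 1, 0, 0, 0, 0), 1),
    ((1, 1, 1, 1, 0, 0), 1),
    ((0, 0, 0, 1, 1, 1), 0),
    ((0, 0, 0, 0, 1, 1), 0),
    ((1, 0, 0, 1, 1, 1), 0),
]

# Hand-wiring: every known input pattern is wired straight to its output.
lookup = dict(TRAINING)


def predict(weights, bias, pattern):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, pattern)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=10):
    """Perceptron rule: nudge the weights whenever a guess is wrong."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for pattern, label in examples:
            error = label - predict(weights, bias, pattern)
            if error:
                weights = [w + error * x for w, x in zip(weights, pattern)]
                bias += error
    return weights, bias


weights, bias = train_perceptron(TRAINING)

# A left-ish pattern that appears nowhere in the training set.
unseen = (1, 0, 1, 0, 0, 0)
print("lookup table:", lookup.get(unseen))
print("weighted unit:", predict(weights, bias, unseen))
```

The lookup table has no entry for the unseen pattern, while the weighted unit still puts it on the 'left' side, because what it has stored is a weighting over pixels rather than a list of exact inputs.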
Anyway I'm hoping that Globus has thought about this much harder and has done all the work for me. Let's find out...
He starts with a lot of contextualisation, as you might expect for such a weird collision of topics. Heidegger, Dreyfus... I'll skim this bit fast because I don't need convincing. Also Freud. This one is interesting to me because I know Derrida took inspiration from him for some of his ideas about memory and 'trace'. This feeds into différance through the 'deferral' part - in the abstract Globus talks about différance combining 'Saussurean difference and Freudian deferral'.
Freud apparently had what Globus calls 'an early neural network theory' (I'm not sure what he means by this, and I haven't read it), and also some ideas on memory which he wrote about in his 'mystic writing-pad' essay (I have read this one). Freud was influenced by Helmholtz and envisaged that these networks operated on thermodynamic principles, with a spontaneous movement towards energy reduction (which he called 'the pleasure principle') and some way of using energy to deflect from these attractor states (which he called 'the reality principle').
(This all seems very... sober? I thought Freud was the guy who went on about egos and ids and dream interpretation and the Oedipus complex.)
Now we're on to the introduction to différance itself. It resists easy definition, as Derrida wasn't trying to make a sharp-edged term. Nice quote I hadn't seen before:
. . . the word sheaf seems to mark more appropriately that the assemblage to be proposed [i.e. différance] has the complex structure of a weaving, an interlacing which permits the different threads and different lines of meaning—or of force—to go off again in different directions, just as it is always ready to tie itself up with others. (Derrida, 1982a, p. 3, bracket added)
So it's 'open' in the sense that it could always tie in new material, and also a kind of centre holding together multiple ideas that pull in different directions.
Still, one of the core things he returns to is this idea of différance being a combination of 'difference' and 'deferral'. Globus is going to concentrate on these as 'the most prominent strands of the sheaf'.
The difference of différance
This is the Saussure bit. I already understand it reasonably well. Signs are not meaningful in isolation, but only as part of a network of other signs. Holistic - change something somewhere and other signs change.
Saussure's example - mouton vs sheep. In English, 'sheep' is not used for the meat; we say 'mutton' instead. So the English word 'has beside it a second term' while the French word does not.
In network terms, English has two nodes which are mutually facilitating, where French has only one.
The difference of différance is like the difference in neural nets: nodes are part of the whole, there isn't a fully present local meaning concentrated at one node, independent of what the rest of the net does.
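A very crude way to see this in code, under my own assumptions: the two 'contexts' (animal, meat), the weights, and the 0.5 facilitation link between 'sheep' and 'mutton' are all invented, and this is not a model from the paper. Each word node gets direct input from the context, plus one pass of lateral input from its neighbouring word nodes.

```python
# Toy sketch: what a word node responds to depends on which other nodes
# sit beside it. Node names and weights are invented for illustration.

def activations(net, context):
    """One feedforward pass from the context, then one lateral pass."""
    direct = {
        word: sum(w * context.get(feature, 0.0) for feature, w in inputs.items())
        for word, inputs in net["features"].items()
    }
    return {
        word: direct[word]
        + sum(w * direct[other] for other, w in net["lateral"].get(word, {}).items())
        for word in direct
    }


# English: two word nodes that facilitate each other.
english = {
    "features": {"sheep": {"animal": 1.0}, "mutton": {"meat": 1.0}},
    "lateral": {"sheep": {"mutton": 0.5}, "mutton": {"sheep": 0.5}},
}

# French: a single word node covering both contexts.
french = {
    "features": {"mouton": {"animal": 1.0, "meat": 1.0}},
    "lateral": {},
}


def profile(net, word):
    """How strongly a word node responds in each context."""
    return {
        context: activations(net, {context: 1.0})[word]
        for context in ("animal", "meat")
    }


print("sheep :", profile(english, "sheep"))
print("mutton:", profile(english, "mutton"))
print("mouton:", profile(french, "mouton"))
```

In this toy, 'mouton' responds fully in both contexts, while 'sheep' responds fully to the animal and only weakly to the meat, and that weak response comes entirely from the link to 'mutton'. The profile of 'sheep' is not a property of that node alone.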
Différance is productive of difference, it's the ground that allows it. Globus associates this with the existence of the connection weights:
Différance is the condition for the possibility of meaning, present or absent. Connection weights are the condition for the possibility of meaning in neural nets. The unweighted wiring diagram is semantically uninterpretable.
The deferral of différance
I feel much more unclear on the deferral part of différance. I know roughly that it's the temporal correlate of difference, and that as well as meaning not being fully spatially concentrated at one node, it also isn't all immediately present at one time. And I also know that this is where Derrida brings in Husserl's phenomenology of time, and starts talking about retentions and protentions. But I always find that Derrida (and secondary sources) get even more obscure at this point, and that the examples run dry. No more sheep and mutton.
Globus has more on Freud.
Freud's networks support chains of memory traces.
This is a spreading out in time rather than space - current events call back to things that happened before.
Freud's network idea was apparently that there were a couple of different constraints, 'the reality principle' and 'the pleasure principle', and that these are in tension, so that memory traces could potentially fail to reach the intended object, leaving the memory repressed. So there's some kind of settling out of the network over time as these constraints fight it out.
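The nearest modern thing I know of to 'a network settling over time towards lower energy' is a Hopfield-style net, so here is a minimal sketch of one. To be clear, this is my stand-in and not Freud's or Globus's model: the stored pattern, the starting state and the function names are all assumptions, and it only shows the downhill movement towards an attractor, not anything that deflects away from it.

```python
# Toy Hopfield-style network: one stored pattern acts as an attractor.
# A stand-in illustration, not Freud's or Globus's actual model.

STORED = [1, -1, 1, -1]
N = len(STORED)

# Hebbian weights: units that agree in the stored pattern excite each other,
# units that disagree inhibit each other. No self-connections.
W = [[STORED[i] * STORED[j] if i != j else 0 for j in range(N)] for i in range(N)]


def energy(state):
    """Hopfield energy, summed once over each pair of units."""
    return -sum(
        W[i][j] * state[i] * state[j] for i in range(N) for j in range(i + 1, N)
    )


def settle(state, sweeps=3):
    """Update one unit at a time, recording the energy after each update."""
    state = list(state)
    trace = [energy(state)]
    for _ in range(sweeps):
        for i in range(N):
            field = sum(W[i][j] * state[j] for j in range(N))
            if field != 0:  # leave the unit alone if its inputs cancel out
                state[i] = 1 if field > 0 else -1
            trace.append(energy(state))
    return state, trace


start = [1, 1, 1, -1]  # the stored pattern with one unit flipped
final, trace = settle(start)
print("final state:", final)
print("energy over time:", trace)
```

The energy never goes up from one update to the next, and the state ends up at the stored pattern. The 'answer' isn't there at the first instant; it arrives after the units have had time to push on each other.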
Similarly, Derrida says for différance that:
There are delays, postponements, deferrals, temporizing in the play of différance so that an inscription is put off.
This is where I would like an example like the sheep one. But it's something to do with what Husserl would call retentions: some retained perception of what happened earlier is still live, so we're not just seeing one node at an instant.
Now for the correlates in neural nets. We need a temporal process that changes the connection weights. So deferral doesn't really fit for an already trained net. But during training there is a temporal process of trying to find a good enough solution to the various constraints on the net, which does match better.
Globus says that a more dynamic system would be a closer fit to the idea of deferral:
When it was noted above that the weights are adjustable, it was meant that the nets are relatively slowly adjustable through learning and the adjustment may remain fixed. What is required in addition now is adjustable weights which change continually, a fluid 'tuning' of the constraints on the nets provided by the connection weights.
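One crude reading of the contrast in that quote, as a sketch: a weight that is learned and then fixed, against a weight that keeps being tuned. This is my simplification, not Globus's proposal. The single weight, the drifting target, the learning rate and the `freeze_after` parameter are all hypothetical.

```python
# Toy contrast: a weight that is trained then frozen vs. one that keeps
# adjusting. Everything here (schedule, rate, names) is invented.

def run(target_schedule, learning_rate=0.5, freeze_after=None):
    """Track a target with one weight, optionally freezing it part-way."""
    weight = 0.0
    for step, target in enumerate(target_schedule):
        if freeze_after is None or step < freeze_after:
            # Delta rule with the input fixed at 1: move towards the target.
            weight += learning_rate * (target - weight)
    return weight


# The 'world' demands +1 for a while, then switches to -1.
schedule = [1.0] * 20 + [-1.0] * 20

frozen = run(schedule, freeze_after=20)  # learns the first regime, then stops
fluid = run(schedule)                    # never stops adjusting

frozen_error = abs(schedule[-1] - frozen)
fluid_error = abs(schedule[-1] - fluid)
print("frozen weight error:", frozen_error)
print("fluid weight error:", fluid_error)
```

The frozen weight is stuck with the old regime once things change, while the continually tuned one follows the switch. That is a long way from a 'fluid tuning of the constraints', but it shows why a trained-and-fixed net is a poor match for anything temporal.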
Then there's some discussion of how Freud's memory idea fits into this framework.
The rest of the paper
This is where I got bogged down and stopped reading, so these notes got stuck in my drafts folder. He starts talking about a discussion of différance by Gasché, and then he's on to Heidegger, and I'm quickly lost. The earlier parts of the paper are lucid enough that this part might be saying something interesting, but if it is I can't make any sense of it.
Then it finishes up with a 'maybe we can find some common ground in the Science Wars' type of conclusion:
From a wider purview, there is a shock of surprise that the continental existential tradition is at a place coherent with important events in the tradition of cognitive science, namely the surge of connectionism, for these traditions have been widely separated, indeed oppositional. This surprise 'solicits' the traditional polarity of the two traditions and wonderment results. In that there are few guide posts left these days to go by, what with all the controversy and deconstruction going on, crossing a bridge that invites wonderment is probably as good a strategy as any.
I'm not too sure what I think of this paper overall, but it's interesting that someone actually wrote up a proper version of the thing I was thinking about in a very vague way.