Thursday, April 28, 2011

A gauge theory of linguistics?

Here is a bit of idle speculation. In my reading on theoretical physics, I have learned something about gauge theories. A concise description is found, naturally, on the Wikipedia page, where it is explained that the root problem addressed by gauge theory is the excess degrees of freedom normally present in specific mathematical models of physical situations. For instance, in Newtonian dynamics, if two configurations are related by a Galilean transformation (a change of reference frame), they represent the same physical situation. The transformations form a symmetry group, so a physical situation is represented by a class of mathematical configurations, all related to one another by the symmetry group. Usually the symmetry group is some kind of Lie group, but it need not be a commutative (abelian) group. A gauge theory is then a mathematical model that has one or more symmetries of this kind. The Standard Model of elementary particles is one example.
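To make the Newtonian example concrete, here is a minimal worked sketch. The symbols x, t, v, m, F, g and G are introduced here purely for illustration, and I am assuming a single particle in one dimension with a force that is the same in both frames (for example, one that depends only on relative positions). Under those assumptions, a Galilean boost leaves the acceleration unchanged, so the equation of motion keeps its form, and a "physical situation" can be identified with the whole class of configurations related by the group.

```latex
\begin{align*}
  % Galilean boost by a constant velocity v (assumed notation)
  x' &= x - v t, \qquad t' = t \\
  % The acceleration is the same in both frames
  \frac{d^2 x'}{dt'^2} &= \frac{d^2}{dt^2}\,(x - v t) = \frac{d^2 x}{dt^2} \\
  % So Newton's second law has the same form in both frames
  m \frac{d^2 x'}{dt'^2} = F \quad &\Longleftrightarrow \quad m \frac{d^2 x}{dt^2} = F \\
  % A physical situation corresponds to a class of configurations
  [x] &= \{\, g \cdot x \;:\; g \in G \,\}
\end{align*}
```

Here G stands for the symmetry group and [x] for the class of all configurations reachable from x by some transformation in G.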

So far so good, but what does this have to do with linguistics? Well, it seems to me that mathematical models of language are often encumbered by irrelevant detail or an overly rigid dependence on conditions that are in reality never fixed. A simple example would be that a typical generative grammar of a language (in any theory) depends critically on the vocabulary and the categories assigned to it. In reality, different speakers have different vocabularies, and even different usages assigned to vocabulary items, although they may all feel they are speaking a common language. There is a sense in which we could really use a mathematical model of a language that is flexible enough to allow for some "insignificant" differences in the specific configuration assigned to the language. There may even be a useful notion of "Galilean transformation" of a language lurking here.

This idea is stated loosely by Edward Sapir in his classic Language. He applies it to phonology, where he explains his conviction that two dialects may be related by a "vowel shift" in which the specific uses or identities of vowels are changed, but (in a sense that is left vague) the "phonological system" of the vowels in the two dialects is not fundamentally different. This idea may help to explain how American English speakers from different parts of the country can understand one another with relative ease even though they may use different sets of specific vowel sounds.
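As a toy illustration of what "the same system under a vowel shift" might mean, here is a minimal sketch in Python. Everything in it is my own assumption and not Sapir's formalism: the word list, the vowel symbols, the function names, and the decision to identify the "phonological system" with the pattern of which words share a vowel. Under that assumption, a shift that relabels vowels one-to-one leaves the pattern untouched, while a merger changes it.

```python
# Toy illustration only: the words, vowel symbols, and shifts are hypothetical.

def contrast_partition(lexicon):
    """Group words by their vowel, then forget the vowel labels themselves."""
    classes = {}
    for word, vowel in lexicon.items():
        classes.setdefault(vowel, set()).add(word)
    return {frozenset(words) for words in classes.values()}

def apply_shift(lexicon, shift):
    """Relabel every vowel in the lexicon according to the mapping `shift`."""
    return {word: shift[vowel] for word, vowel in lexicon.items()}

# Hypothetical dialect A: each word is tagged with the vowel it uses.
dialect_a = {
    "bit": "ɪ", "bid": "ɪ",
    "bet": "ɛ", "bed": "ɛ",
    "bat": "æ", "bad": "æ",
}

# A chain shift is one-to-one: every vowel moves, but no two collapse.
chain_shift = {"ɪ": "ɛ", "ɛ": "æ", "æ": "a"}
# A merger is not one-to-one: two vowels collapse into one.
merger = {"ɪ": "ɛ", "ɛ": "ɛ", "æ": "æ"}

dialect_b = apply_shift(dialect_a, chain_shift)
dialect_c = apply_shift(dialect_a, merger)

# True when the pattern of contrasts is unchanged by the relabeling.
same_system = contrast_partition(dialect_a) == contrast_partition(dialect_b)
merged_system = contrast_partition(dialect_a) == contrast_partition(dialect_c)
print(same_system, merged_system)
```

In this sketch the one-to-one relabelings play the role of the symmetry group, and the partition of words into contrast classes is the quantity that stays invariant. Whether real dialect differences behave this cleanly is exactly the open question.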

This is all a very general idea, of course. Gauge theory as applied in physics is really quite intricate, and I do not know yet if the specifics of the formalism can be "ported" to the particular problems of linguistic variation in describing a "common system" for a language. But what better place than a blog to write down some half-baked ideas?

Thursday, April 14, 2011

Connectionism and emergence

I have not read much mathematical linguistics lately, but I have been reading a lot of cognitive science and neuroscience, as well as connectionist research. Let me start off with connectionism. This is the approach that uses artificial neural networks to perform "distributed processing" for computational purposes. I think that, in principle, such an approach to modeling language as a cognitive phenomenon will ultimately be the right approach. But there is a very large problem with current neural net modeling, namely that the neurons are too simple and the networks too small.

Neuroscience studies real neurons and their networks, although at present there are huge gaps in our understanding. While we are able to record signals from single neurons or very small groups, and we can also do "brain imaging" to track activity in huge (order of 10^9) numbers of neurons, we have no way to study activity in a few thousand neurons. It is precisely this "mesoscopic" regime where the phenomena of thought, memory, and knowledge are likely to be emergent from the nonlinear dynamical system known as the brain.

This brings me to the subject of "emergent phenomena," which refers to things that happen in a nonlinear dynamical system as a result of huge numbers of interactions among nonlinear dependencies. An emergent phenomenon on the ocean is a "rogue wave." An emergent phenomenon cannot be directly simulated through deterministic calculation, because it happens at a scale where there is not enough computing power in the world to run the simulation; there are too many interdependent variables.

Meanwhile, connectionism involves running simulations of neural networks that can be deterministically calculated. There are no emergent phenomena (so far as I know) in standard connectionist networks. This means they cannot, even in principle, manifest the most important thing happening in the brain. So there is no question that artificial neural networks fail to model the brain in even the slightest sense.

Meanwhile in linguistics, a "hot" idea is that classical linguistic categories like phonemes and parts of speech are "emergent" in a similar sense to an emergent phenomenon. The "emergentist" view of language holds that a phoneme emerges as an element of knowledge only after broad experience with "exemplars" in real speech. I am not exactly clear on the sense in which emergentist linguists think that such categories are emergent; do they mean it in some statistical sense, or do they mean "emergent" in the nonlinear, chaos-theory sense?

Conventional mathematical linguistics is lagging quite far behind these newer developments and directions, but there is no question that better mathematical analysis would really help everyone to understand new ideas like emergent linguistics.