Thursday, October 30, 2014

Recursion and the infinitude of language - another tempest in a teapot

The latest issue of the Journal of Logic, Language and Information contains a marvelous little paper by András Kornai, "Resolving the infinitude controversy." In it, Kornai meets the latest perplexing linguistic discovery head-on, and uses it to show that the generative linguistic fascination with recursion in grammar and the infinitude of language turns out to be just another tempest in a teapot, alongside so many other once-cherished notions and so-called problems in linguistics.

More than a generation of linguistics professors, myself included, have harped at our students about the importance of human language building an infinite edifice from finite materials. The hoopla more or less reached its zenith in 2002, when Hauser, Chomsky, and Fitch made much of the human capacity for recursion in language--indeed, they proposed that it was the main thing that separates us from the other apes.  But beyond the conventional wisdom, as usual, truly interesting things were being discovered.

Over the past ten or more years, Dan Everett, for one, has slowly convinced the linguistics community that the Pirahã language of the Amazon in fact has no recursive or iterative grammatical structures. But other, less highly advertised cases had long been lurking, among them Dyirbal and Warlpiri. What is linguistic theory to make of finite human languages?  Why would there be any, when we long thought that "infinitude" was a necessary property of a possible human language?

Although some have suggested otherwise, I think by now we must all admit that these descriptions are correct: there really are finite human languages, and this is within the scope of possibility. It's OK--Kornai shows that this is no cause for alarm. The argument rests on the important point that even in the infinite languages, such as English, the probability of a sentence actually being produced vanishes steadily as its length increases. This probability distribution over sentence length shows that 99.9% of everything English speakers actually say is communicated in relatively short, simple sentences. Moreover, whatever the infinitude of English is good for, it is not good for saying anything beyond what Pirahã speakers can say--their expressive power is the same, so long as a Pirahã speaker is allowed to use multiple sentences to say what might be said with one English sentence.  In this respect, the information capacity of real English is similar to that of Pirahã. In fact, a mathematical argument shows that a more complicated finite language could easily outstrip the information capacity of infinitary natural languages as they are actually spoken.
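
To see why vanishing probability mass does the work here, consider a minimal sketch. It assumes, purely for illustration, that sentence length follows a geometric distribution with a hypothetical continuation probability p = 0.8; this is not Kornai's actual model or real English data, but any length distribution with an exponentially decaying tail makes the same point, namely that a modest finite cutoff already covers essentially all of what speakers produce.

    # Toy illustration (my assumption, not Kornai's model): sentence length
    # is geometrically distributed, P(length = k) = (1 - p) * p**(k - 1).
    # The probability of seeing a sentence longer than n words then decays
    # exponentially in n.

    def mass_up_to(n, p=0.8):
        """Probability that a sentence has at most n words under the toy model."""
        return 1 - p ** n

    for n in (10, 20, 30, 50):
        print(f"length <= {n:2d} words covers {mass_up_to(n):.4%} of utterances")

Under these toy numbers, sentences of at most 30 words already account for about 99.88% of utterances, which gives the flavor of the 99.9% figure above.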

One conclusion to draw is that all the hubbub about recursion and infinity in grammar being the essence of the human condition is seriously misguided. Indeed, the evolutionary pressure, whatever it is, that leads most human languages to have infinitely many sentences is a bit of a mystery, since it appears to provide little discernible advantage--except perhaps for the inherent value of nifty tales like "This is the House that Jack Built."  For fun, here is its final sentence:

This is the horse and the hound and the horn
That belonged to the farmer sowing his corn
That kept the rooster that crowed in the morn
That woke the judge all shaven and shorn
That married the man all tattered and torn
That kissed the maiden all forlorn
That milked the cow with the crumpled horn
That tossed the dog that worried the cat
That chased the rat that ate the cheese
That lay in the house that Jack built.

Wednesday, September 3, 2014

AAAS meeting in San José

I hope you were aware that the American Association for the Advancement of Science has a section for "Linguistics and language sciences." The annual AAAS meeting is set for 12-16 February 2015 in San José, California, and there promises to be some activity from Section Z.

I am currently the Association for Symbolic Logic's liaison to this section, and I plan to post items from AAAS that may be of interest to the logic and mathematics of language community.
Future posts will focus on language- and logic-related items published in AAAS journals, such as Science.


Friday, July 4, 2014

Jim Lambek

Word has traveled around the mathematical linguistics community that we recently lost one of our "godfathers," Joachim Lambek. I met Jim during the 2001 Logical Aspects of Computational Linguistics conference, which took place at a seaside retreat outside Nantes. After that meeting ended, a number of us, Jim and myself included, stayed on in Nantes for a workshop on learning theory held at the university there. I had the pleasure of going to dinner with Jim and some other colleagues one of those days, but later that evening Jim unfortunately tripped at his hotel and broke his wrist. I waited with him outside his hotel while a colleague of his tried to get him some medical attention. In spite of the pain in his wrist, Jim recalled that I had asked him about procuring a copy of his book with P. J. Scott, Introduction to Higher Order Categorical Logic, which had fallen out of print. He gave me one of his cards and asked me to write my contact information on it.  I was pretty astounded that he would bother to talk to me about that while nursing a broken wrist and waiting for an ambulance in a foreign country. A few weeks later, when I was back at the University of Chicago, a copy of the book arrived in the mail. I'm even more grateful that he had signed the inside front page.

Wednesday, June 4, 2014

Recursion in linguistics, ad nauseam

After reading the discussions about the supposed role of recursion in Chomskyan linguistics, both in journals (see previous post) and on Norbert Hornstein's blog, my first thought was that if I see another linguist arguing about recursion I'm going to throw up. And yet, after thinking it over, I now see fit to add my own little tidbit to the mix.

Lobina argues, if I may paraphrase, that the stated or implied reasons for recursion in Chomsky's formalisms are vacuous, because supporters say things like "recursion is needed for a grammar to generate an infinite language."  Lobina correctly points out that this is not in fact true, so many of the stated reasons for recursion in linguistic theory turn out to be moot.
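
The point is easy to see with a toy example of my own (not one from Lobina's paper): an infinite language can be produced by plain iteration, with no recursive rule or self-referential function call anywhere. Here is a minimal Python sketch; the language and the function name are invented for illustration.

    # An infinite language generated purely by iteration:
    # { "the very^n old man slept" : n >= 0 }.  No function calls itself,
    # and no rule rewrites a category inside itself.

    from itertools import count

    def sentences():
        """Yield arbitrarily many distinct strings using only a loop."""
        for n in count():
            yield "the " + "very " * n + "old man slept"

    gen = sentences()
    for _ in range(4):
        print(next(gen))

So infinitude alone does not force recursion into the grammar, which is exactly Lobina's complaint about the usual justification.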

On thinking it over, I remembered that I myself had a need for recursion in past work. In my 2010 paper (with an erratum published in 2011), I demonstrated that a certain kind of recursion in the structural design of sentences is necessary for a class of infinite (tree-structured) languages to be learnable from finite data.  Reflecting on this in the context of all the recursion talk, I believe it may actually capture something like what Chomsky and his followers have meant over the years.  Recursion in syntax is not needed to generate the infinite capacity of language; rather, it is needed to make the infinite learnable from only finite data. This, at last, is a property of recursive structures that cannot be replicated by iterative or other methods.

Tuesday, June 3, 2014

When linguists talk mathematical logic. . .

. . .we screw it up, or so says David Lobina in an amusing critique of a paper by Watumull, Hauser, Roberts, and Hornstein.  Both articles were recently published in Frontiers in Psychology.  Since I am chiefly a linguist and only sort of a mathematician, I am always concerned about misunderstanding or misrepresenting the formal literature.  But the gaffes pointed out by Lobina are, I would hope, not the kinds of mistakes I would generally make.

For example, Watumull et al. seem to have gravely misunderstood Gödel's 1931 definition of the primitive recursive functions. While Lobina is too gentlemanly to say so, the misunderstanding he describes reminds me of what I see in undergraduate term papers. Gödel's definition begins by requiring a finite sequence of functions--the derivation of the target function from basic ones; Watumull et al. apparently took this to be part of the meaning of "recursive," and so paraphrased it by stating that a recursive function must specify a finite sequence.  Huh?  Perhaps Frontiers in Psychology should have considered using one or two referees with the pertinent logical background.
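
For readers who want the actual schema, here is a small sketch of primitive recursion in Python (my own illustration, not anything from Gödel, Lobina, or Watumull et al.). The "finite sequence" in Gödel's definition is the derivation of a function from basic functions by substitution and this schema; it is not a requirement that the function itself "specify a finite sequence."

    # The primitive recursion schema: given g and h, define f by
    #   f(0, y)     = g(y)
    #   f(n + 1, y) = h(n, f(n, y), y)

    def primitive_recursion(g, h):
        """Return the function f determined by g and h under the schema."""
        def f(n, y):
            acc = g(y)
            for k in range(n):        # unwind the recursion as a loop
                acc = h(k, acc, y)
            return acc
        return f

    # Example: addition arises from g(y) = y and h(n, acc, y) = successor(acc).
    add = primitive_recursion(lambda y: y, lambda n, acc, y: acc + 1)
    print(add(3, 4))   # 7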
  
While the original article may have flaws, I do stand behind the general point that recursion is incredibly important in natural language. But that point is better served when laughably wrong claims are not published in its support.

Thursday, May 1, 2014

Call for papers

Call for Papers
Annals of Mathematics and Artificial Intelligence special issue on:
Mathematical Theories of Natural Language Processing

Since the 1990s, the practice of natural language processing (NLP) has gradually shifted from logic-based symbol manipulation systems, first to purely statistical methods, and more recently to hybrid systems that combine structural and statistical techniques. The mathematical theory of hybrid NLP is still not fully mature, and this special issue will lend focus to the expanding research area by including papers from mathematicians, computer scientists, theoretical and computational linguists, and AI researchers with an interest in its foundations. Subjects suitable for the special issue include, but are not limited to, NLP-related advances in
  • inductive learning
  • spectral techniques
  • formal grammars
  • commonsense reasoning
  • low-pass semantics
  • sparse models
  • LSTM, deep learning
  • compressed sensing
  • cvs/distributional theories
Papers which were presented in January 2014 at the ISAIM special session on this topic are especially invited for submission, but other submissions not associated with ISAIM will be given equal consideration for publication. All papers will go through the standard refereeing process of the journal.

The submission deadline is July 31, 2014. Papers should be submitted through the Springer website for the journal, https://www.editorialmanager.com/amai, choosing the article type "Special Issue S79: Mathematical Theories of NLP".

Thank you very much. Best regards,
Sean A Fulop
AMAI Guest Editor