Sunday, March 28, 2010

Kornai on semantics

Chapter 6 of Kornai's "Mathematical Linguistics" considers ways of treating the semantics of natural language formally. While the syntax chapter was in many ways a run-down of other treatments, this chapter appears to make some new proposals. The first section discusses some classic problems in the philosophy of language, such as the Liar paradox, and dispenses with much of the angst that philosophers and logicians have invested in these matters. K. shows how many of the technical problems with these paradoxes disappear when one deals with meanings in a way that is more appropriate for natural language modeling. He finishes by proposing that natural language semantics has to have a paraconsistent logic as a formal model, one which allows for inconsistency and contradiction without a total breakdown of reasoning, which seems like a great idea to me.
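To make the point concrete (my gloss, not an example from the book): classical logic validates the principle of explosion,

p, ¬p ⊢ q,

so a single contradiction entails anything whatsoever. A paraconsistent logic invalidates exactly this inference, which is what lets a semantic theory tolerate the occasional contradictory meaning without thereby licensing arbitrary conclusions.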

The second section gives a very nice overview of Montague grammar (a must, of course, for any book on mathematical semantics). But the overview introduces some new ideas and draws in other methodologies. Prominent among them is the idea of a "construction," a semantically interpretable constituent structure analysis. K. discusses the importance of allowing "defeasible" information into a meaning, which is what calls for a paraconsistent logic. The seven-valued logic of Ginsberg (1986) is then invoked for this purpose, though it is never fully explained. Since Kornai seems to intend the book as a textbook, he might consider a real discussion of the system of logic he has settled on for applications. K. then lays out, in just a few pages, a variety of apparently novel suggestions for treating natural language constructions of various kinds using a blend of Montague grammar and paraconsistent logic. While intriguing, this little précis of how to do semantics in a different way deserves to be fleshed out into a substantial paper put before the community as a stand-alone publication. The new approach seems like a pretty big deal to me, but here it is hiding in the middle of a purported textbook, without the benefit of a thorough presentation.

The final substantive section connects K.'s ideas about formal semantics with the sorts of syntactic theories he earlier called "grammatical," which rely on notions of case, valence, and dependency more than on traditional syntactic structures. The proposed methodology is still presented as a variant of Montague grammar, but now he puts forth another novel set of proposals. Do these complement the proposals of the previous section, or do they replace them? I was left a bit confused about what K. is really advocating at this point, and the brevity of the section, together with the unfamiliarity of the methods, makes it read almost telegraphically.

My feeling about this chapter is that it makes a great many new proposals, and is too short as a result. Chapter 6 really deserves to be expanded into a whole book, which I, for one, would gladly read. One thing that struck me as a little surprising was the glib way in which Kornai sidesteps the hyperintensionality problem that has long been known to afflict Montague-style intensional logic treatments of semantics: because meanings are modeled as functions from possible worlds, all logically equivalent sentences come out synonymous, so that, for example, any two mathematical truths receive the same meaning. This is widely regarded as a very big deal, and several researchers have devoted large portions of their time to solutions. Witness the recent work of Carl Pollard to fix Montague's IL, presented in a course over several days at the European Summer School in Logic, Language and Information (Dublin, 2007), and a contemporaneous publication by Reinhard Muskens in the Journal of Symbolic Logic detailing a different approach to fixing intensional logic. Does Kornai feel these people are wasting their time on a non-issue? Or perhaps he would welcome joining forces between his proposals and those just cited.

Tuesday, March 16, 2010

Kornai on syntax

Chapter 5 of "Mathematical Linguistics" discusses a wide variety of approaches to syntax; surely such a variety has never before been treated in a mathematically sound way in a single work. He begins with relatively familiar theories such as phrase structure rules and categorial grammars, then moves on to dependency grammars and case-based approaches, which get increasingly unfamiliar to me. He then flies off to places unknown, in a fantastically interesting discussion of weighted and probabilistic grammars. The treatment of the "density" of a language is especially new to me.
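For readers to whom the notion is as new as it was to me, the simplest combinatorial version (I won't swear this is exactly Kornai's formulation, but it gives the flavor) measures what fraction of all strings of length n over the alphabet Σ belong to the language L:

δ(L) = lim_{n→∞} |L ∩ Σ^n| / |Σ|^n,

when the limit exists; weighted and probabilistic variants then replace the uniform count with a probability distribution over strings.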

There are a few trouble spots, mostly notions that get relied on at certain points without having been properly introduced. The discussion of weighted grammars, for instance, hinges on the "Dyck language," which is not a creature I was aware of, and I could not see where it had been introduced earlier in the book. This turns an already dense treatment into something overly cryptic.
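For the record, the notion is standard even if the book does not restate it: the Dyck language over a pair of brackets is the set of well-balanced strings, generated by the context-free grammar

S → ε | ( S ) S,

so that ()(()) belongs to it and )( does not. Its importance for general context-free languages lies in the Chomsky–Schützenberger theorem, which represents every context-free language as a homomorphic image of the intersection of a Dyck language with a regular language; I assume something of the sort is at work in the weighted-grammar discussion.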

The final discussion concerns "external evidence" in syntax, meaning properties of human languages that are more or less evident from nonlinguistic facts. For instance, it is pretty plain that natural languages are, in some sense, learnable. What is missing, as Kornai points out, is "a realistic theory of grammar induction." This is a subject dear to me, to which I have devoted some years of work; I invite readers to see my recent paper "Grammar induction by unification of type-logical lexicons," published online in the Journal of Logic, Language and Information. This brings me to another slight criticism: Kornai (probably without meaning to) appears to propagate a common misunderstanding of the import of Gold's inductive inference results. K. states that "context-free grammars cannot be learned this way" [by exposure to positive examples]. But Gold showed this only for classes that contain every finite language along with at least one infinite one; results can be much more positive if, for instance, the finite languages are excluded from the hypothesis space. In my paper I show how a restricted class including some context-sensitive grammars can still be learned from a finite sample set, which is an even stronger learning criterion than Gold's identifiability in the limit.
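To spell out the technical point: what Gold actually proved is that no "superfinite" class of languages (one containing every finite language plus at least one infinite language) is identifiable in the limit from positive data. Since the class of all context-free languages is superfinite, that whole class is unlearnable in Gold's sense, but nothing follows about well-chosen subclasses, and that is exactly the loophole that positive learnability results, including mine, exploit.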

On the whole, I was fascinated by this chapter: it contains a wealth of material that has never been treated very well formally, and simply having so many approaches collected for discussion is an immense contribution to our literature.

Monday, March 8, 2010

Kornai's Chapter 4

Chapter 4 of "Mathematical Linguistics" deals with morphology. At this point the book really shows its value not just as a mathematical compendium, but as a survey of linguistics. The treatment of word structure is founded on the notion of the morpheme, which I have spent some energy crusading against, having been swayed by the theory known as Whole Word Morphology, due, apparently, to Alan Ford and Rajendra Singh. I learned it from a graduate student at the University of Chicago, and we published one short paper stemming from his work in the ACL Workshop on Morphological and Phonological Learning (2002). Oddly enough, according to databases like Google Scholar, this is my most highly cited piece of work.

In any event, there are a few other things to quarrel with in Kornai's survey of morphology. For one thing, he restates the old yarn that derivational affixes appear closer to the stem than inflectional ones, which appears not to be the case in Athabaskan languages such as Navajo. He also takes the parts of speech to have a morphological foundation; but since they need to be used in syntactic derivations of sentences, it seems to me that parts of speech should instead have a syntactic and semantic foundation, as some kind of word-usage classes.

The section discussing Zipf's law (the observation that a word's frequency is roughly inversely proportional to its frequency rank) is very useful. In sum, I know of no survey of morphology that is anything like this chapter; it is a very important piece of literature. I wish Kornai would consider a mathematical treatment of word-based approaches like Whole Word Morphology; this is something I have been planning to work on for a long time now.

Friday, March 5, 2010

Chapter 3 of Kornai's book

In chapter 3 of "Mathematical Linguistics" Kornai deals with phonology. It is very nice to see the mathematical theory of phonology laid out here, formalizing the autosegmental framework. I do have a few questions about the treatment, since it relies heavily on the notion that there are phonemes and sound segments (consonants, vowels, etc.). Autosegmental phonology is then formalized into a computational system using biautomata, and it is emphasized that context-sensitive generating power is apparently not needed for phonology: finite-state power suffices.
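As an aside, here is a toy illustration (my own sketch, not Kornai's biautomaton construction) of why a local phonological process stays within finite-state power: a rule like nasal place assimilation only ever needs to remember one pending segment, so a transducer with a single bit of state suffices.

```python
# Toy finite-state transducer: nasal place assimilation (n -> m before
# a labial stop). The machine never remembers more than one pending
# segment, which is the hallmark of finite-state power.
# My own illustration, not a construction from Kornai's book.

LABIALS = {"p", "b"}

def nasal_assimilation(word):
    """Rewrite 'n' as 'm' immediately before a labial stop."""
    output = []
    pending_n = False            # the transducer's single bit of state
    for seg in word:
        if pending_n:
            output.append("m" if seg in LABIALS else "n")
            pending_n = False
        if seg == "n":
            pending_n = True     # delay the decision by one segment
        else:
            output.append(seg)
    if pending_n:                # a word-final 'n' surfaces unchanged
        output.append("n")
    return "".join(output)

assert nasal_assimilation("anpa") == "ampa"   # assimilation applies
assert nasal_assimilation("anta") == "anta"   # no labial, no change
assert nasal_assimilation("an") == "an"       # final 'n' untouched
```

Greater than finite-state power would only be called for if a rule had to count or match unboundedly distant material, which, as the chapter emphasizes, phonology does not seem to require.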

Kornai appears to state that we can solve the phonemicization problem for a language (meaning that the phonemic inventory is determined by the phonetic data). I thought Yuen Ren Chao proved otherwise in 1934, in his paper on the non-uniqueness of phonemic solutions of phonetic systems, and that this proof was formalized in Kracht 2003. For example, I fail to see how it is possible to prove that English affricates are actually single segments. Why aren't they sequences of two segments (a stop followed by a fricative)?

Another issue comes from speech perception research, where years of trying have failed to establish that people use consonants and vowels as perceptual units. Syllables appear to have much more traction in this regard. It is of course still desirable to transcribe syllables using segments, but this can be regarded as a convenient fiction, as was suggested already by Trager, I believe, in 1935. On the view just described, each syllable would then be formally treated as a sequence of distinctive features, with a certain ordering relation but without any timing slots.