Niyogi: Informational Complexity of Learning

This is the first in a series of posts highlighting the contents of Partha Niyogi's 1998 monograph The Informational Complexity of Learning, and using it as a jumping-off point for some discussion of my own.

I'll begin with a few ideas from Chapter 2. In this chapter, Niyogi emphasizes some central points of learning theory while developing a new analysis of learnability by neural nets. In a fairly standard mathematical model of learning championed by Vapnik and others, we view a learner as searching its hypothesis class for the hypothesis that best approximates a target concept, which is itself drawn from the concept class. The two classes need not coincide, so the target may lie outside anything the learner can represent.
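To make this setup concrete, here is one way to write it down. The notation is my own shorthand, not necessarily Niyogi's: C is the concept class, H the hypothesis class, c the target, d an assumed metric on functions, and \hat{h}_n whatever hypothesis the learner picks after seeing n examples. I also assume the minimum over H is attained.

```latex
% Best hypothesis available to the learner, given target c in the concept class C
h^{*} = \arg\min_{h \in \mathcal{H}} d(h, c), \qquad c \in \mathcal{C}

% Triangle inequality splits the learner's error into two parts
d(\hat{h}_n, c) \;\le\;
  \underbrace{d(h^{*}, c)}_{\text{approximation error}}
  \;+\;
  \underbrace{d(\hat{h}_n, h^{*})}_{\text{estimation error}}
```

The first term depends only on how well the hypothesis class can represent the target; the second depends on how much data the learner has seen. If c lies outside H, the first term stays positive no matter how many examples arrive.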

Let me just stop right there, in order to discuss the ramifications of this for language learning. Though this framework does not originate with Niyogi, in my view he is one of the few linguists who understood it. I believe that this setup alone may be used to all but prove the necessity of some kind of "Universal Grammar" of the sort commonly advocated. Universal Grammar should be seen as the set of limitations on "possible human languages" that in effect make the hypothesis class of languages used by the human learner sufficiently small. Numerous negative results, the classic example being Gold's 1967 theorem on identification in the limit, have shown time and again that overly large classes of languages are not strictly learnable, essentially because they are too big. To me, this speaks loudly against any language learning model which invokes a "general cognitive learning" idea, as if humans could leverage their general abilities to successfully learn whatever kind of language is thrown at them. We already know from experience that humans can only learn human languages. Creolization from simpler fabricated lingua francas and pidgins is easily understood in this way. In that scenario, the target concept is outside the hypothesis class, and the learners settle on the best hypothesis within the hypothesis class, which is in fact a human language.
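As a toy illustration of that last point, here is a minimal Python sketch. Everything in it is invented for illustration and none of it comes from Niyogi's book: the two-member hypothesis class, the small string sets standing in for languages, and the disagreement measure. It also hands the learner labeled negative examples, which real language learners mostly do not get.

```python
# Toy sketch only: the "languages" and the hypothesis class are hypothetical.

def disagreement(language, labeled_sample):
    """Fraction of (string, is_in_target) pairs that `language` gets wrong."""
    errors = sum((s in language) != label for s, label in labeled_sample)
    return errors / len(labeled_sample)


def best_in_class(hypothesis_class, labeled_sample):
    """Name of the hypothesis with the lowest disagreement on the sample."""
    return min(
        hypothesis_class,
        key=lambda name: disagreement(hypothesis_class[name], labeled_sample),
    )


# A tiny universe of strings the learner is exposed to.
STRINGS = ["ab", "aabb", "aaabbb", "abab", "ababab", "ba"]

# The learner's hypothesis class: only these two languages are available.
HYPOTHESIS_CLASS = {
    "lang_A": frozenset({"ab", "aabb", "aaabbb"}),
    "lang_B": frozenset({"ab", "abab", "ababab"}),
}

# The target (think: a pidgin) is deliberately NOT in the hypothesis class.
TARGET = frozenset({"ab", "aabb", "aaabbb", "ba"})

SAMPLE = [(s, s in TARGET) for s in STRINGS]
CHOICE = best_in_class(HYPOTHESIS_CLASS, SAMPLE)
RESIDUAL = disagreement(HYPOTHESIS_CLASS[CHOICE], SAMPLE)

print(CHOICE, RESIDUAL)
```

The learner settles on lang_A, the closest language it is able to represent, and still misclassifies one of the six strings. More data would not remove that residual error, because the target is simply not among the available hypotheses.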

I presume that Universal Grammar is an innate set of constraints delineating the required properties of a human language, and that it thereby also delineates the hypothesis class used by human learners. Beyond that, I do not know exactly what it is. I will continue to use Niyogi's book to further this discussion in later posts.

