At some point in this forum I think I posted about my old work on Whole Word Morphology. Next month I am attending the International Symposium on Artificial Intelligence and Mathematics, and speaking about this approach in a special session on Mathematics of Natural Language Processing. I think that it may be useful for NLP in a variety of languages. Here is the abstract:
Whole Word Morphology does away with morphemes, instead representing all morphology as relations among sets of words, which we call lexical correspondences. This paper presents a more formal treatment of Whole Word Morphology than has previously been published, demonstrating how the morphological relations are mediated by unification with sequence variables. Examples are presented from English, as well as from Eskimo, the latter providing an example of a highly complex polysynthetic lexicon. The lexical correspondences of Eskimo are operative through their interconnection in a network using a symmetric and an asymmetric relation. Finally, a learning algorithm for deriving lexical correspondences from an annotated lexicon is presented.
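To give a feel for the idea (this is my own toy sketch, not the formalism in the paper), a lexical correspondence can be modeled as a pair of patterns sharing a sequence variable: unify a word against one side to bind the variable, then substitute the binding into the other side. The patterns and helper names below are illustrative inventions.

```python
# Toy sketch of a lexical correspondence mediated by unification with a
# sequence variable.  A pattern is a string containing at most one occurrence
# of the variable "X", which may bind to any substring; all other characters
# must match literally.

def unify(pattern, word, var="X"):
    """Return a binding for the sequence variable, or None if unification fails."""
    if var not in pattern:
        return {} if pattern == word else None
    prefix, suffix = pattern.split(var, 1)
    if (word.startswith(prefix) and word.endswith(suffix)
            and len(word) >= len(prefix) + len(suffix)):
        return {var: word[len(prefix):len(word) - len(suffix)]}
    return None

def apply_correspondence(corr, word, var="X"):
    """Map a word across a correspondence (left_pattern, right_pattern)."""
    left, right = corr
    binding = unify(left, word, var)
    if binding is None:
        return None
    return right.replace(var, binding[var])

# Two toy English correspondences relating singular and plural forms.
plural = ("X", "Xs")       # cat  -> cats
y_plural = ("Xy", "Xies")  # fly  -> flies
```

Here `apply_correspondence(plural, "cat")` yields `"cats"`, and `apply_correspondence(y_plural, "fly")` yields `"flies"`; the point is that the whole word participates in the relation, with no decomposition into morphemes.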
Link to the paper at the ISAIM website
This research program fits with my general theme of learning language through unification procedures, which I think is both computationally useful and cognitively relevant. It seems to me that the cognitive version of unification is "analogical learning."