
e-Humanities in Action: Alexander Clark


January 15, 2014

Our next lecture in the series ‘e-Humanities in action’ will be given on January 15th by Alexander Clark from King’s College London. His research has focused on the computational acquisition of language and its relation to first language acquisition. He has written several relevant books, including “Linguistic Nativism and the Poverty of the Stimulus” and “The Handbook of Computational Linguistics and Natural Language Processing”, and has authored over 60 publications.


Who: Alexander Clark

What: Strong learning of Context-Free Grammars

Where: MPI for Psycholinguistics, room 236

When: Wednesday, January 15th, 14:30


Language acquisition is a central issue in cognitive science: here we present a family of algorithms that may provide a solution to this notoriously difficult and contentious problem.


Standard models of language learning are concerned with weak learning: the learner, receiving as input only information about the strings in the language, must learn to generalise, producing the correct, potentially infinite, set of strings generated by some target grammar. Here we define the corresponding notion of strong learning: the learner, again receiving only strings as input, must learn a grammar that generates the correct set of structures or parse trees. We formalise this using a modification of Gold’s identification in the limit model, requiring convergence to a grammar that is isomorphic to the target grammar.
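The weak/strong distinction can be made concrete with a toy example (the grammars and function names below are illustrative, not from the talk): two grammars that generate the same string language a⁺ but assign different parse trees, so a weak learner cannot distinguish them while a strong learner must.

```python
# Two toy CFGs that are *weakly* equivalent (same string language a+)
# but not *strongly* equivalent (they assign different parse trees).

def derive_right(n):
    """Right-branching derivation of 'a'*n under S -> a S | a."""
    tree = ('S', 'a')
    for _ in range(n - 1):
        tree = ('S', 'a', tree)
    return tree

def derive_left(n):
    """Left-branching derivation of 'a'*n under S -> S a | a."""
    tree = ('S', 'a')
    for _ in range(n - 1):
        tree = ('S', tree, 'a')
    return tree

def yield_of(tree):
    """Concatenate the leaves of a parse tree to recover the string."""
    if isinstance(tree, str):
        return tree
    return ''.join(yield_of(child) for child in tree[1:])

# Same strings (weak equivalence) ...
assert yield_of(derive_right(3)) == yield_of(derive_left(3)) == 'aaa'
# ... but different structures: only a strong learner must get these right.
assert derive_right(3) != derive_left(3)
```

A weak learner that converges to either grammar has succeeded; a strong learner has succeeded only if its grammar is isomorphic to the target, i.e. assigns the target's trees.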


We take as our starting point a simple learning algorithm for substitutable context-free languages, based on principles of distributional learning, and modify it so that it will converge to a canonical grammar for each language. This gives a strong learning result, where we can derive a grammar that defines a set of trees from a set of strings; as far as we are aware, this is the first nontrivial result of this type.
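The distributional idea behind such algorithms can be sketched as follows. In a substitutable language, two substrings that share even one context are interchangeable in every context, so the learner can group substrings into congruence classes and use those classes as nonterminals. A minimal sketch of the grouping step (this simplifies the actual construction; the function names are my own):

```python
# Sketch of the distributional grouping step for substitutable languages:
# substrings sharing a context are merged into one congruence class.
from collections import defaultdict
from itertools import combinations

def contexts(sample):
    """Map each substring u to the set of contexts (l, r) with l+u+r in the sample."""
    ctx = defaultdict(set)
    for w in sample:
        for i in range(len(w)):
            for j in range(i + 1, len(w) + 1):
                ctx[w[i:j]].add((w[:i], w[j:]))
    return ctx

def congruence_classes(sample):
    """Group substrings that share at least one context (transitively),
    using a simple union-find over the observed substrings."""
    ctx = contexts(sample)
    subs = list(ctx)
    parent = {u: u for u in subs}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v in combinations(subs, 2):
        if ctx[u] & ctx[v]:          # shared context => same class
            parent[find(u)] = find(v)

    classes = defaultdict(set)
    for u in subs:
        classes[find(u)].add(u)
    return list(classes.values())

# From a sample of the substitutable language {a^n b}, 'a' and 'aa'
# share the context ('', 'b') and so end up in one congruence class.
classes = congruence_classes(['ab', 'aab'])
assert any({'a', 'aa'} <= c for c in classes)
```

Each congruence class then serves as a candidate nonterminal; the further step in the talk's result, converging to a canonical grammar so that the learned trees match the target's, is what makes the learning strong rather than weak.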


The class of languages this works for is very small; we will therefore discuss how we could extend the result, first to a much larger class of context-free grammars, and then to a representative class of mildly context-sensitive grammars, namely well-nested multiple context-free grammars of dimension 2.