Big Language Data – Lecture by Antal van den Bosch

The next lecture in the Nijmegen e-Humanities Lectures, presented by TLA, will be given on the 5th of June by Antal van den Bosch. He will speak on Big Language Data.

Who: Antal van den Bosch, Centre for Language Studies, Radboud University Nijmegen
What: Big Language Data
Where: MPI for Psycholinguistics, room 1.63 (main lecture hall)
When: Tuesday 5th of June, 14:30

Abstract
Digitized written language can be scooped up at will from the internet and exploited for science. Even without any explicit linguistic annotation, the language data itself can be used directly for practical purposes such as spelling correction, text completion, and, if parallel text in two languages can be found, machine translation. Zipf’s law ensures that when you have more data, results will be better (log-linearly). In fact, many of the best natural language processing systems are based on data only, plus the power of sophisticated stochastic methods. I’ll argue that there is a less sophisticated class of methods, based on analogical reasoning, that produces the same impressive results. I’ll discuss the linguistic interestingness of this idea using centenary concepts such as Hermann Paul’s Analogiebildung and De Saussure’s quatrième proportionnelle.

You are all cordially invited to attend.
If you plan to attend, please send an email to tla-lectures@mpi.nl (for administrative reasons).