December 30, Tuesday
15:00 – 17:00

Climbing the Tower of Babel: Advances in Unsupervised Multilingual Learning
Computer Science seminar
Lecturer : Regina Barzilay
Affiliation : MIT
Location : 201/37
Host : Dr. Michel Elkin
For most natural language processing tasks, unsupervised methods significantly underperform their supervised counterparts. In this talk, I will demonstrate that multilingual learning can narrow this gap. The key insight is that joint learning from several languages reduces uncertainty about the linguistic structure of individual languages. These methods exploit the deep structural connections between languages, connections that have driven many important discoveries in anthropology and historical linguistics.

I will present multilingual unsupervised models for morphological segmentation and part-of-speech tagging. Multilingual data is modeled as arising through a combination of language-independent and language-specific probabilistic processes. This approach allows the model to identify and learn from recurring cross-lingual patterns, ultimately improving prediction accuracy in each language. I will also discuss ongoing work on unsupervised decoding of ancient Ugaritic tablets using data from related Semitic languages.

This is joint work with Benjamin Snyder, Tahira Naseem and Jacob Eisenstein.