January 1, Tuesday
12:00 – 13:00
Semi-Supervised Structured Prediction in Natural Language Processing through Declarative Knowledge Encoding
Computer Science seminar
Lecturer : Roi Reichart
Affiliation : University of Cambridge
Location : 202/37
Host : Dr. Aryeh Kontorovich
A large number of Natural Language Processing applications, including
syntactic parsing, information extraction and discourse analysis,
involve the prediction of a linguistic structure. It is often
challenging for standard feature-based machine learning algorithms to
perform well on these tasks, for both modeling and computational reasons.
Moreover, creating the large amounts of manually annotated data
required to train supervised models for such applications is usually
labor-intensive and error-prone.
In this talk we describe a series of works that integrate
feature-based methods with declarative task and domain knowledge in a
unified framework. We address a wide variety of NLP tasks and types of
domain knowledge: for syntactic parsing, we show how to parse multiple
sentences together while imposing consistency constraints; for
information extraction, we present a joint model that ties together a
number of related tasks through task and domain constraints; and for
discourse analysis, we present a model that exploits within-document
and cross-document regularities in a collection of documents.
Our models are implemented in the Markov Random Field (MRF) framework,
and the resulting hard global optimization problem is addressed by
approximate inference techniques based on linear programming (LP)
relaxations. We present improvements over state-of-the-art models in
five languages and across a wide range of supervision levels, from
fully unsupervised to fully supervised scenarios.