Michael Elhadad
Natural Language Processing (202-2-5211)
Meets:
Sun 14-16 Bdg 34 Room 007
Mon 12-14 Bdg 34 Room 005
News:
- May 15: Assignment 1 is available
- May 26: Added information on reading the Treebank files in Scheme on this page
- May 28: Added information on computing KL divergence on this page
- June 23: Assignment 2 is available
- June 23: Notes on summarization added.
- June 23: Notes on text clustering added.
Lecture Notes
- General Intro to NLP - Linguistic Concepts
- Part-of-Speech Tagging
- Context-Free Grammar Parsing
- Automatic Text Summarization
Topics covered in assignments include:
- Language Models and n-grams -- Statistical Models of Unseen Data (Smoothing)
- Information Extraction / Named Entity Recognition (see Assignment 2)
- Using Machine Learning Tools: Classification, Sequence Labeling / Supervised Methods / SVM (see Assignment 2)
- Compositional Semantics from CFG Parsing
- Sentence Simplification
Assignments
Software
- NLTK Installation: NLTK is a Python-based toolkit with wide coverage of NLP techniques, both statistical and knowledge-based.
- SISC Scheme Interpreter: we use Scheme examples to demonstrate algorithms in parsing, generation, and some semantic analysis. This interpreter is very small and convenient to use on any platform that supports Java (the full version with documentation is 2.4MB; the jar alone is 300KB).
Resources
Last modified April 19, 2009