Quiz 02: General Intro to NLP + Language Models

Name: ________________________

This quiz covers material from the second lecture of the NLP 20 course.
  1. Consider the various definitions of the linguistic unit called "word": orthographic (a sequence of letters separated by delimiters); semantic (an independent unit of meaning); dictionary (a unit listed in the lexicon of the language). For each of the following cases, indicate whether the string between brackets counts as a single word in the [O]rthographic, [S]emantic, and [D]ictionary senses:


    He [wouldn't] even agree to eat bread.

    [It's] on the right.

    [Trump's] tweet triggered a swarm of protests.

    [Come on]!

    [Beer Sheva] has more fountains than Tel Aviv.

    [Two hundred and fifty five] students attended the event.

  2. Define what a language model is.





  3. What are the two probabilistic models involved in spell checking, and how are they combined?





  4. Write the chain-rule expression for the joint probability p(w1,...,wn).







  5. What is the Markov assumption as applied to language modeling?







  6. Define the perplexity of a bigram language model on a dataset [w1...wN].







  7. How many parameters are needed to define a bigram language model over a vocabulary of size V?









Last modified 10 Nov 2019