Quiz 03: Deep Learning Intro and Neural LMs

This quiz covers material from the third lecture:
  1. In machine learning, we are given a dataset of the form {(xi, yi)}, i ∈ [1..N], and aim to learn a function f(x) that maps unseen input feature vectors x to ŷ, the predicted value. Distinguish between the 3 types of learning problems by characterizing the form of the predicted values ŷ:




  2. Given a training dataset (x,y) and a loss function L(y,ŷ), we want to identify a function fΘ() such that the predictions ŷ = fΘ(x) over the training dataset are as accurate as possible. Write the criterion that the optimal value of Θ must satisfy:

    Find Θ such that:

  3. Write the expression for the cross-entropy loss, which is useful when the predicted output of the model we learn is interpreted as a discrete distribution p(yc|x) for c ∈ [1..C] (C-way classification model). Here f(x) = ŷ = (ŷ1 ... ŷC) is a distribution over the C possible classes.
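
As an illustration of what "ŷ is a distribution over the C possible classes" means, here is a minimal sketch of one common way to turn C real-valued scores into such a distribution. The quiz does not say how ŷ is produced; the softmax function, the name `softmax`, and the toy scores below are assumptions made for this example only.

```python
import math

def softmax(scores):
    """Map C real-valued scores to C positive values that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a C = 3 class problem.
y_hat = softmax([2.0, 1.0, 0.1])
```

Each entry of `y_hat` plays the role of ŷc, the probability the model assigns to class c.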

    L(y,ŷ) =

  4. The deep learning approach learns a trainable non-linear mapping function φ from x to a representation φ(x), which can then be used as the input to a linear classifier, so that the classification problem becomes linearly separable. The general form of this trainable mapping we consider is:

    ŷ = W φ(x) + b
    φ(x) = g(W'x + b')

    where g is a non-linear function. Why do we need non-linear mappings in this formulation?
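
The formulation in question 4 can be sketched in a few lines of plain Python to make the shapes concrete. This is only an illustrative forward pass, not a training procedure: the helper names (`affine`, `phi`, `predict`), the choice of tanh for g, and the toy parameter values are all assumptions made for this example.

```python
import math

def affine(W, b, v):
    """Return W v + b, with W given as a list of rows and b, v as lists."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]

def phi(x, W_prime, b_prime, g=math.tanh):
    """Trainable representation phi(x) = g(W'x + b'), g applied element-wise."""
    return [g(z) for z in affine(W_prime, b_prime, x)]

def predict(x, W, b, W_prime, b_prime, g=math.tanh):
    """Output layer y_hat = W phi(x) + b."""
    return affine(W, b, phi(x, W_prime, b_prime, g))

# Hypothetical toy parameters: 2 inputs -> 3 hidden units -> 2 outputs.
W_prime = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]
b_prime = [0.0, 0.0, 0.0]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
b = [0.0, 0.0]

hidden = phi([1.0, 2.0], W_prime, b_prime)          # representation phi(x)
y_hat = predict([0.0, 0.0], W, b, W_prime, b_prime)  # prediction for x = 0
```

In a trained model, W, b, W' and b' would be learned from data rather than set by hand.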

  5. List 3 problems with count-based (n-gram) language models that neural language models address:

Last modified 04 Nov 2018