Quiz 04: Deep Learning Intro
This quiz covers material from the fourth lecture (second section, from
Loss Functions).
- Given a training dataset (x, y) and a loss function L(y, ŷ), we want to identify a function fΘ() such that the predictions ŷ = fΘ(x) over the training dataset are as accurate as possible. Write the criterion that the optimal value of Θ must satisfy:
Find Θ such that:
- Write the expression of the cross-entropy loss, which is useful when the predicted output of the model we learn is interpreted
as a discrete distribution p(yc|x) for c ∈ [1..C] (C-way classification model).
f(x) = ŷ = (ŷ1 ... ŷC) is a distribution over the C possible classes.
L(y, ŷ) =
- The deep learning approach learns a trainable non-linear mapping function φ from x to a representation φ(x) which can be used as
an input to a linearly separable classification problem. The general form of this trainable mapping we consider is:
ŷ = W φ(x) + b
φ(x) = g(W'x + b')
where g is a non-linear function.
Why do we need non-linear mappings?
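The two equations above can be sketched in plain Python to make the shapes concrete. This is only an illustration: the choice g = tanh, the helper names (affine, phi, predict) and all parameter values are assumptions made for the example, not part of the lecture material.

```python
import math

def affine(W, x, b):
    """Return W x + b for a matrix W (list of rows), a vector x and a bias b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def phi(x, W_hidden, b_hidden, g=math.tanh):
    """phi(x) = g(W'x + b'), with g applied element-wise (tanh is an assumption)."""
    return [g(z) for z in affine(W_hidden, x, b_hidden)]

def predict(x, W_hidden, b_hidden, W_out, b_out):
    """y_hat = W phi(x) + b: a linear layer on top of the learned representation."""
    return affine(W_out, phi(x, W_hidden, b_hidden), b_out)

# Hypothetical parameters: 2 inputs -> 3 hidden units -> 2 output scores.
W_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b_hidden = [0.0, 0.1, -0.1]
W_out = [[1.0, -1.0, 0.5], [0.2, 0.3, -0.7]]
b_out = [0.0, 0.0]

y_hat = predict([1.0, 2.0], W_hidden, b_hidden, W_out, b_out)
```

In a real model W, b, W' and b' would be learned by minimizing the loss, not fixed by hand as they are here.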
- Consider the case of sentiment analysis on input text documents di = {wij}, where we want to classify the
sentiment expressed by the document as positive, neutral or negative. Write the general form of a deep learning
classifier that takes as input the one-hot encodings vi of the words.
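As a reminder of what the input looks like, the following sketch builds one-hot encodings for the words of two toy documents. The documents, the vocabulary order (first appearance) and the helper names build_vocab and one_hot are all hypothetical. The sketch covers the input encoding only, not the classifier itself.

```python
def build_vocab(documents):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for doc in documents:
        for word in doc:
            vocab.setdefault(word, len(vocab))
    return vocab

def one_hot(word, vocab):
    """Return a vector of size |V| with a single 1 at the index of the word."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec

# Two toy documents, each given as a list of words.
documents = [["great", "movie"], ["boring", "movie"]]
vocab = build_vocab(documents)

# Each document becomes a sequence of one-hot vectors, one per word.
encoded = [[one_hot(w, vocab) for w in doc] for doc in documents]
```

Each vector has as many dimensions as there are words in the vocabulary, which is why such encodings are usually mapped to a denser representation before classification.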
- Define the notion of computational graph as used in Deep Learning libraries such as DyNet, PyTorch or TensorFlow.
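A toy illustration may help when thinking about this question. The sketch below records each operation as a node while the expression is evaluated, then walks the nodes in reverse order to accumulate gradients. It is a simplified, hypothetical Node class written for this example. It is not how DyNet, PyTorch or TensorFlow implement their graphs.

```python
class Node:
    """One node of the graph: a value, its parent nodes and the local gradients."""
    def __init__(self, value, parents=(), local_grads=()):
        self.value = value
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(self)/d(parent) for each parent
        self.grad = 0.0                 # d(output)/d(self), filled by backward()

    def __add__(self, other):
        return Node(self.value + other.value, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Node(self.value * other.value, (self, other),
                    (other.value, self.value))

def backward(output):
    """Propagate gradients from the output back to every node (chain rule)."""
    order, seen = [], set()

    def visit(node):
        # Depth-first traversal yields a topological order of the graph.
        if id(node) not in seen:
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += local * node.grad

# Building y = w * x + b creates the graph as a side effect of evaluation.
x, w, b = Node(2.0), Node(3.0), Node(1.0)
y = w * x + b
backward(y)
```

After the call to backward(y), each leaf node holds the derivative of y with respect to itself in its grad attribute.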
Last modified 26 Nov 2017