Quiz 06: Classification

This quiz covers material from the sixth lecture on Classification.
  1. Consider the feature extraction function we reviewed for POS tagging using a classifier:
    def features(sentence, index):
        """ sentence: [w1, w2, ...], index: the index of the word """
        return {
            'word': sentence[index],
            'is_first': index == 0,
            'is_last': index == len(sentence) - 1,
            'is_capitalized': sentence[index][0].upper() == sentence[index][0],
            'is_all_caps': sentence[index].upper() == sentence[index],
            'is_all_lower': sentence[index].lower() == sentence[index],
            'prefix-1': sentence[index][0],
            'prefix-2': sentence[index][:2],
            'prefix-3': sentence[index][:3],
            'suffix-1': sentence[index][-1],
            'suffix-2': sentence[index][-2:],
            'suffix-3': sentence[index][-3:],
            'prev_word': '' if index == 0 else sentence[index - 1],
            'next_word': '' if index == len(sentence) - 1 else sentence[index + 1],
            'has_hyphen': '-' in sentence[index],
            'is_numeric': sentence[index].isdigit(),
            'capitals_inside': sentence[index][1:].lower() != sentence[index][1:]
        }
    
    Indicate which of these features are of the following types:


    Lexical:
    Morphological:
    Syntactic:
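
    For reference, a minimal sketch of how this function might be used (the
    example sentence and the downstream classifiers named in the comments are
    illustrative, not part of the lecture code):

    sentence = ['The', 'cat', 'sat', 'on', 'the', 'mat', '.']

    # One feature dictionary per token; such dictionaries can feed a
    # dictionary-based classifier (e.g. nltk.NaiveBayesClassifier, or
    # scikit-learn's DictVectorizer followed by a linear model).
    feature_dicts = [features(sentence, i) for i in range(len(sentence))]

    print(feature_dicts[1]['word'], feature_dicts[1]['suffix-2'])  # cat at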

  2. Why is the classifier-based approach to POS tagging superior to combining multiple taggers with backoff (for example, a unigram tagger that backs off to an affix-based tagger)?
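
    For reference, a minimal sketch of the backoff combination mentioned above,
    assuming NLTK and its Penn Treebank sample (the training split is
    illustrative):

    import nltk
    from nltk.corpus import treebank   # may require nltk.download('treebank')

    train_sents = treebank.tagged_sents()[:3000]

    # The affix-based tagger guesses a tag from word endings; the unigram
    # tagger consults it only for words it has never seen (backoff).
    affix_tagger = nltk.AffixTagger(train_sents)
    combined_tagger = nltk.UnigramTagger(train_sents, backoff=affix_tagger)

    print(combined_tagger.tag(['Revenue', 'rose', 'sharply']))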





  3. Explain the intuition behind TF*IDF word weighting for bag-of-words document encoding.
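
    For reference, one common formulation of the weight, as a sketch (log base
    and smoothing conventions vary between implementations):

    import math

    def tf_idf(term, doc, docs):
        """Weight of `term` in `doc` (a token list) against the collection `docs`."""
        tf = doc.count(term) / len(doc)                # how often the term occurs in this document
        df = sum(1 for d in docs if term in d)         # how many documents contain the term at all
        idf = math.log(len(docs) / df) if df else 0.0  # rarer terms receive a larger weight
        return tf * idf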








  4. Why is it desirable to reduce the dimensionality of the bag-of-words representation before applying a classifier?







  5. Give three examples of dimensionality reduction techniques used for the bag-of-words representation.









Last modified 10 Dec 2017