Introduction to Artificial Intelligence

Assignment 5 (bonus)

Programming assignment - learning

In this assignment, you will implement the greedy heuristic decision-tree learning algorithm studied in class. The heuristic will be the standard information-gain heuristic. You will train classifiers for the possibly-hostile rock sample scenario from assignment 4.
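As a reminder of what the information-gain heuristic computes, here is a minimal sketch. It assumes each example is a Python dict mapping attribute names to values; the function names entropy and information_gain are illustrative choices, not part of the assignment.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Expected reduction in the entropy of `target` after splitting on `attribute`.

    Assumption: every example is a dict that contains both keys.
    """
    base = entropy([e[target] for e in examples])
    # Group the target labels by the value the attribute takes.
    groups = {}
    for e in examples:
        groups.setdefault(e[attribute], []).append(e[target])
    # Weighted average entropy of the groups (the "remainder").
    remainder = sum(len(g) / len(examples) * entropy(g) for g in groups.values())
    return base - remainder
```

An attribute that determines the target completely gets a gain equal to the entropy of the target, while an attribute that is independent of the target in the sample gets a gain of zero.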

Program description: your input will be a scenario and a set of observation locations, exactly as in part 1 of assignment 4, except that the values of the sensor readings will not be instantiated. From this input, you will use the distribution defined by the Bayes network you construct to generate a set of EXAMPLES (the number of examples should be a user-specified parameter, typical values being 100, 1000, etc.), and store them (preferably in a file). You can generate the examples by using the sampling algorithm studied in class.
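One possible way to generate the examples is sketched below. It assumes that the sampling algorithm is prior (forward) sampling, that all variables are boolean, and that the network is stored as a list of variables in topological order. The two-variable NETWORK shown here is purely hypothetical and its probabilities are made up; your own network comes from the assignment 4 scenario.

```python
import random

# Hypothetical toy network (NOT the assignment 4 network): Theory -> Reading.
# Each entry is (variable, parents, CPT), where the CPT maps a tuple of
# parent values to P(variable = True). Parents must appear before children.
NETWORK = [
    ("Theory", (), {(): 0.3}),
    ("Reading", ("Theory",), {(True,): 0.9, (False,): 0.2}),
]

def prior_sample(network, rng):
    """Draw one complete assignment by sampling each variable given its parents."""
    sample = {}
    for variable, parents, cpt in network:
        p_true = cpt[tuple(sample[p] for p in parents)]
        sample[variable] = rng.random() < p_true
    return sample

def generate_examples(network, n, seed=None):
    """Generate n independent examples; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [prior_sample(network, rng) for _ in range(n)]
```

Each example is a dict from variable name to truth value, so a list of them can be written to a file one row per example.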

Then, your program should query the user for a target attribute (one of the variables: either a specific site, or the truth value of Calan's theory), and learn the decision-tree classifier for that target variable. The learned decision tree is one of the outputs of your program.
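The recursive structure of a greedy decision-tree learner might look roughly like the sketch below. It is only an outline under several assumptions: examples are dicts, leaf values are not tuples (booleans work), and the heuristic is passed in as a function gain(examples, attribute, target) that returns a score. The names learn_tree, classify and majority, and the tuple representation of a node, are illustrative choices.

```python
from collections import Counter

def majority(examples, target):
    """Most common value of the target among the examples."""
    return Counter(e[target] for e in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, target, gain, default=None):
    """Greedy learner: returns a leaf value or (attribute, branches, fallback)."""
    if not examples:
        return default
    labels = {e[target] for e in examples}
    if len(labels) == 1:          # all examples agree: make a leaf
        return labels.pop()
    if not attributes:            # nothing left to split on: majority vote
        return majority(examples, target)
    # Greedy step: pick the attribute with the highest heuristic score.
    best = max(attributes, key=lambda a: gain(examples, a, target))
    fallback = majority(examples, target)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        branches[value] = learn_tree(subset, remaining, target, gain, fallback)
    return (best, branches, fallback)

def classify(tree, example):
    """Follow branches until a leaf; unseen values use the node's fallback."""
    while isinstance(tree, tuple):
        attribute, branches, fallback = tree
        tree = branches.get(example[attribute], fallback)
    return tree
```

Passing the heuristic as a parameter keeps the recursion independent of how the score is computed; for this assignment the score would be the information gain.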

Note that you should use part of the examples you generated as a training set, and another part as the test set, both randomly selected. The user should be able to select the size of the training set. You should also output the classification quality of your classifier on the test set.
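The split and the quality measurement could be organized as in the following sketch. It assumes that "classification quality" is measured as accuracy (the fraction of test examples classified correctly), which is one reasonable reading but not the only one, and that the classifier is any function from an example to a predicted value. The names split_examples and accuracy are hypothetical.

```python
import random

def split_examples(examples, train_size, seed=None):
    """Randomly partition the examples into a training set and a test set."""
    if not 0 <= train_size <= len(examples):
        raise ValueError("train_size must be between 0 and the number of examples")
    shuffled = list(examples)              # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]

def accuracy(classifier, test_set, target):
    """Fraction of test examples whose target value is predicted correctly."""
    if not test_set:
        raise ValueError("test set is empty")
    correct = sum(classifier(e) == e[target] for e in test_set)
    return correct / len(test_set)
```

Because the two sets are disjoint slices of one shuffled list, no example is used for both training and testing.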

Deliverables:

Your program.
Report on a set of experimental examples: a specific scenario, target attribute, the resulting tree, and the classifier quality. You should provide a graph of the test results versus the fraction of the examples used as the training set.

Deadline: March 10, 2007.