1) Find the most general unifier for the following pairs, if it exists
   (variables are identifiers with a ? prefix). Note that we assume the
   need to standardize apart first (we will "prime" the variable name in
   the second term, if necessary).

   a) P(A, ?z, C), P(?x, ?x, ?z)
      {?z|?x, ?x|A, ?z'|C} (alternately ?z|A, which is equivalent)

   b) Foo(?x, Foo(?x)), Foo(Foo(?y), ?y)
      Fail (occurs check)

   c) Q(?y, G(A, B)), Q(G(?x, ?x), ?y)
      {?y|G(?x, ?x), ?y'|G(A, B)}

2) Write down the axioms and facts for the example below in FOPC:

   The taxonomic hierarchy for animals contains birds, which in turn
   contains penguins and sparrows (which are disjoint). All birds can
   fly, unless they are penguins. No penguins can fly. Anything that
   can fly has wings. Poopoo is a penguin.

   We can use either set notation or predicate notation for sets in the
   hierarchy. In predicate notation we will have one-argument
   predicates: Animal, Bird, Sparrow, Penguin. Also the Can-fly(x)
   predicate and Wing(x) mean x can fly and x is a wing, respectively.
   We need the predicate Has-part(x, y), meaning x has part y.

   The hierarchy:
   (1) (forall x) Bird(x) => Animal(x)
   (2) (forall x) Penguin(x) => Bird(x)
   (3) (forall x) Sparrow(x) => Bird(x)
   (4) (forall x) ~(Sparrow(x) ^ Penguin(x))

   Other facts:
   (5) (forall x) Bird(x) ^ ~Penguin(x) => Can-fly(x)
   (6) (forall x) Penguin(x) => ~Can-fly(x)
   (7) (forall x) Can-fly(x) => (exists y) Has-part(x, y) ^ Wing(y)
   (8) Penguin(Poopoo)

a) Convert the knowledge base into normal form.

   Using ? as a prefix to denote variables, we get (in CNF):
   (1) ~Bird(?x) v Animal(?x)
   (2) ~Penguin(?x) v Bird(?x)
   (3) ~Sparrow(?x) v Bird(?x)
   (4) ~Sparrow(?x) v ~Penguin(?x)

   Other facts:
   (5) ~Bird(?x) v Penguin(?x) v Can-fly(?x)
   (6) ~Penguin(?x) v ~Can-fly(?x)
   (7) ~Can-fly(?x) v Has-part(?x, wing-of(?x))   ; wing-of: Skolem function
   (7') ~Can-fly(?x) v Wing(wing-of(?x))
   (8) Penguin(Poopoo)

b) Show how this knowledge base should be indexed - using a table.
   Using a simple predicate-based table, with clause.literal numbers
   (note that since this is CNF rather than INF, we do not have
   "conclusion" and "premise" columns):

   Predicate   Positive    Negative
   ---------------------------------------
   Animal      1.2
   Bird        2.2, 3.2    1.1, 5.1
   Penguin     8.1, 5.2    2.1, 4.2, 6.1
   Sparrow                 3.1, 4.1
   Can-fly     5.3         6.2, 7.1, 7'.1
   Has-part    7.2
   Wing        7'.2

c) Prove or disprove by using resolution (or argue that neither can be done):

   1) Poopoo has wings.
      Cannot be proved or disproved - only 7' has the predicate "Wing",
      and since we can prove ~Can-fly(Poopoo), there is no way to prove
      or disprove that Poopoo has wings.

   2) Poopoo can fly.
      From (6) and (8) with unifier {?x|Poopoo} we get
      (9) ~Can-fly(Poopoo)
      which disproves the statement.

   3) Poopoo is an animal.
      From (1) and (2) with unifier {?x|?x'}, resolving on Bird(?x),
      we get:
      (10) ~Penguin(?x') v Animal(?x')
      Then from (8) and (10) with unifier {?x'|Poopoo} we get:
      Animal(Poopoo)

   4) If Poopoo is a sparrow, then Poopoo cannot fly.
      We need to prove: ~Sparrow(Poopoo) v ~Can-fly(Poopoo)
      We add the negation of this as:
      (11)  Sparrow(Poopoo)
      (11') Can-fly(Poopoo)
      Now resolving (9) and (11') results in the empty clause
      (contradiction - proving the sentence).

d) Would using INPUT RESOLUTION only change the answers to c?

   No - we only used input resolution in the answers (every resolution
   step above has at least one parent from the input clauses).

3) It is known that there is a correlation between smoking and cancer.
   Medical associations claim that smoking causes cancer. Tobacco
   companies claim that there is a hidden element that causes both
   cancer and smoking. Draw a Bayes network for each of the above
   claims.

   The medical associations' claimed network is:

      smoking ---> cancer

   The tobacco companies' claimed network is:

      hidden element ---> smoking
                    \
                     ---> cancer

4) Consider the following variables and their immediate causes (parents):

      variable   parents
      --------------------
      A          none
      E          none
      B          A
      C          A E
      D          A
      F          B C

   Draw the network (assume arrows pointing downwards):

        A   E
       /|\ /
      D B C
         \|
          F

a) Is this network a poly-tree?

   No.
   A - B - F and A - C - F are two distinct paths between A and F, so
   the underlying undirected graph contains a cycle.

b) Determine the truth of the following independence statements (using
   d-separation):

   1) I({A}, {E} | {})      True  (paths blocked by the colliders C
                                   and F, respectively)
   2) I({A}, {E} | {B})     True  (paths blocked by C and F,
                                   respectively)
   3) I({A}, {E} | {C})     False (path A -> C <- E unblocked: the
                                   collider C is observed)
   4) I({A}, {E} | {F})     False (path A -> C <- E unblocked: F is an
                                   observed descendant of the collider C)
   5) I({A}, {C, F} | {B})  False (the edge A -> C is always unblocked)
   6) I({A}, {F} | {B, C})  True  (paths blocked by B and C,
                                   respectively)

c) Determine at least 2 different ways to convert the network into a
   poly-tree using clustering.

   1) Make B-C a cluster.
   2) Make A-B-C-F a cluster.

d) Suppose that P(A) = 0.9, P(E) = 0.3, P(B|A) = 0.1, P(B|~A) = 0.8.
   Find P(A|E).

   Since I({A}, {E} | {}), we have P(A|E) = P(A) = 0.9.

5) You need to construct a 3-input neural network that has value 1 in
   the output just when exactly 2 of its inputs are 1, and 0 otherwise
   (assume inputs are in {0, 1}).

a) Can this be done without hidden units?

   No - the function is not linearly separable in 3 dimensions.

b) Show a network using threshold elements that performs the required
   computation (specify the weights, too!)

   Use a hidden unit that is on iff ALL of its inputs are 1 (for
   example, input weights 1 and threshold 2.5). Use one output unit
   with input weight 1 from each of the input-layer neurons, weight -10
   on an arc from the hidden unit, and threshold 1.5. The output will
   be 1 if AT LEAST 2 of its inputs are 1, but NOT when all 3 inputs
   are 1, which is exactly what we need.

6) Using the perceptron learning rule (the "delta" rule), show the
   steps of learning the function OR with a single perceptron and a
   step activation function with threshold 1. Initially all weights
   are 1, and the learning rate is 0.4.

   If the threshold is such that a weighted sum EQUAL to the threshold
   causes an output of 1, a single epoch finds no errors and the
   perceptron is trained.
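The no-error claim for the equal-to-threshold convention can be checked with a short simulation (a minimal sketch; the variable names and loop structure are our own, not part of the original exercise):

```python
# Perceptron with step activation that outputs 1 when the weighted sum
# is >= the threshold (here 1), initial weights [1, 1], learning rate
# 0.4, trained on the four OR examples.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = [1.0, 1.0]
threshold = 1.0
eta = 0.4

errors = 0
for (x1, x2), target in examples:
    out = 1 if w[0] * x1 + w[1] * x2 >= threshold else 0
    delta = target - out
    if delta != 0:
        errors += 1
        # Delta rule: w_i <- w_i + eta * (target - out) * x_i
        w[0] += eta * delta * x1
        w[1] += eta * delta * x2

print(errors, w)  # 0 [1.0, 1.0] -- no errors, weights unchanged
```

As claimed, the initial weights already classify all four examples correctly under this convention, so the first epoch makes no updates.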
   If the output is 1 only when the weighted sum is STRICTLY greater
   than 1, then in the first epoch the examples 01 and 10 will increase
   the respective weights to 1.4, after which no further changes will
   occur.

7) Is it possible to combine back-propagation with genetic algorithms,
   and if so, how?

   Yes - use back-propagation as a local improvement operator on
   elements of the population, for example just before computing
   fitness in each iteration.

8) a) Construct by hand a decision tree for the following table, where
      the first decision is based on A, then if necessary B, etc.

         input variables   decision
         A  B  C  D
         ---------------------------
         T  1  F  5        fly
         T  1  F  4        fly
         T  1  F  3        fly
         T  1  T  3        fly
         T  2  F  3        walk
         T  2  F  4        walk
         F  1  F  2        fly
         F  1  T  4        crawl
         F  1  T  5        crawl

      The resulting tree:

         A ---(true)--- B ---(1)--- fly
          \              \
           \              --(2)--- walk
            \
             --(false)-- C --(false)-- fly
                          \
                           --(true)--- crawl

b) Which node will be pruned first, based on least information gain?

   Nodes B and C have the same information gain PER EXAMPLE ENCOUNTERED
   AT THE NODE. However, C is needed on fewer examples and should
   therefore be pruned first.

c) Is a more compact decision tree possible (possibly using a different
   ordering of the variables)? Which tree?

   Since there is no subset of 2 attributes which uniquely determines
   the decision variable, every decision tree for this table must have
   3 or more non-leaf nodes; thus the above tree is already optimal.
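The claim in 8c - that no pair of attributes functionally determines the decision - can be verified by brute force with a short script (a sketch; the row encoding and helper name are our own):

```python
# Check whether any 2-attribute subset of {A, B, C, D} uniquely
# determines the decision column of the table in problem 8.
from itertools import combinations

# Rows: (A, B, C, D, decision), transcribed from the table above.
rows = [
    ("T", 1, "F", 5, "fly"),
    ("T", 1, "F", 4, "fly"),
    ("T", 1, "F", 3, "fly"),
    ("T", 1, "T", 3, "fly"),
    ("T", 2, "F", 3, "walk"),
    ("T", 2, "F", 4, "walk"),
    ("F", 1, "F", 2, "fly"),
    ("F", 1, "T", 4, "crawl"),
    ("F", 1, "T", 5, "crawl"),
]

def determines(attrs):
    """True iff each value combination of the given attribute indices
    maps to a single decision across all rows."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in attrs)
        if seen.setdefault(key, row[4]) != row[4]:
            return False
    return True

pairs = [p for p in combinations(range(4), 2) if determines(p)]
print(pairs)  # [] -- no 2-attribute subset suffices
```

Every one of the six pairs produces a conflicting value combination (e.g. A=T, C=F yields both fly and walk), so any correct tree must test at least 3 attributes on some branch, consistent with the answer above.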