AI MidTerm 2, solutions
----------------------------

Question 1:
-----------

1) True. Using GAs to learn NNs requires at most a fitness function; no
knowledge of an error function, or even of "correct" behavior, is required.

2) False. Counterexamples exist. Also, finding a minimal-size decision tree
is NP-hard, while the greedy algorithm is polynomial...

3) False. Even 3 layers (1 hidden layer) suffice to implement XOR, as shown
in class. (The statement is true only for NNs with no hidden layers.)

4) True. Any optimization algorithm on arbitrary vectors can be used, and SA
is such an algorithm. It may not be very efficient, but at least it does not
get stuck at local minima of the number of errors!

Question 2:
-----------

a) In the same order as the sentences in the exam:

1) forall x (EviAgainst(x) ^ Criminal(x)) -> CanConvict(x)
2) forall x CrimeLord(x) -> ((Exists y Bribe(x, y)) v (Exists y SellDrugs(x, y)))
3) forall x ((Exists y Bribe(x, y)) v (Exists y SellDrugs(x, y))) -> Criminal(x)
4) forall x CrimeLord(x) -> (not CanConvict(x))
5) forall x (Criminal(x) ^ (not EviAgainst(x))) -> Slippery(x)

b) Again, in the same order (assume all names beginning with lower-case
letters are variables):

1:  not EviAgainst(x) v not Criminal(x) v CanConvict(x)
2:  not CrimeLord(x1) v Bribe(x1, B(x1)) v SellDrugs(x1, S(x1))
    where S(.) and B(.) are Skolem functions, since the y's were within the
    scope of x. Note that the y's were two DIFFERENT variables above!
3:  not Bribe(x2, y1) v Criminal(x2)
3a: not SellDrugs(x3, y2) v Criminal(x3)
4:  not CrimeLord(x4) v not CanConvict(x4)
5:  not Criminal(x5) v EviAgainst(x5) v Slippery(x5)

c) We need to prove:

Q: forall x CrimeLord(x) -> Slippery(x)

(Note that Q itself in CNF reads: "not CrimeLord(x) v Slippery(x)".)

Negating Q, we get:

Q': Exists x CrimeLord(x) ^ not Slippery(x)

Replacing x with a Skolem constant, we get the CNF version:

Q'a: CrimeLord(Capone)
Q'b: not Slippery(Capone)

We will now begin with the proof.
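As an aside, each resolution step below relies on a most general unifier such
as {x2/x1, y1/B(x1)}. These can be computed mechanically; here is a minimal
sketch of Robinson's unification algorithm. The term representation (strings
for variables and constants, tuples for compound terms, with the lower-case
convention from part b) is our own illustrative choice, not part of the exam:

```python
# Minimal first-order unification sketch (Robinson's algorithm).
# Terms: a string is a variable (lower-case first letter) or a constant;
# a tuple (functor, arg1, ...) is a compound term or atom.

def is_var(t):
    # Convention from part (b): names beginning with lower-case are variables.
    return isinstance(t, str) and t[0].islower()

def substitute(t, s):
    """Apply substitution s (a dict variable -> term) to term t."""
    if is_var(t):
        return substitute(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear in t under substitution s?"""
    t = substitute(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(t1, t2, s=None):
    """Return a most general unifier of t1 and t2 extending s, or None."""
    if s is None:
        s = {}
    t1, t2 = substitute(t1, s), substitute(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if is_var(t2):
        return unify(t2, t1, s)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

# Unifying Bribe(x2, y1) from clause 3 with Bribe(x1, B(x1)) from clause 2
# reproduces the first unifier used in the proof:
mgu = unify(('Bribe', 'x2', 'y1'), ('Bribe', 'x1', ('B', 'x1')))
print(mgu)  # {'x2': 'x1', 'y1': ('B', 'x1')}
```
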
To make this shorter, we will at first AVOID using Q', which will make the
answer to d) shorter.

From 2 and 3, with unifier {x2/x1, y1/B(x1)}, we get:
6: not CrimeLord(x1) v SellDrugs(x1, S(x1)) v Criminal(x1)

From 6 and 3a, with unifier {x3/x1, y2/S(x1)}, we get:
Criminal(x1) v Criminal(x1) v not CrimeLord(x1)
which simplifies to:
7: not CrimeLord(x1) v Criminal(x1)
(i.e., every crime lord is a criminal).

From 7 and 5, with unifier {x5/x1}, we get:
8: not CrimeLord(x1) v EviAgainst(x1) v Slippery(x1)

From 1 and 4, with unifier {x4/x}, we get:
9: not EviAgainst(x) v not CrimeLord(x) v not Criminal(x)

From 9 and 7, with unifier {x1/x}, we get:
not EviAgainst(x) v not CrimeLord(x) v not CrimeLord(x)
which is:
10: not EviAgainst(x) v not CrimeLord(x)

From 10 and 8, with unifier {x1/x}, we get:
not CrimeLord(x) v Slippery(x) v not CrimeLord(x)
which is:
11: not CrimeLord(x) v Slippery(x)

(Note - this is already the answer to d), since this is the CNF
representation of the query, which was proved without using Q'.
Nevertheless, we need a refutation proof - i.e. to reach the empty clause -
so we continue.)

Now, from Q'a and 11, with unifier {x/Capone}, we get:
12: Slippery(Capone)

And from 12 and Q'b, with the null unifier, we get the empty clause.

d) Yes, this can be done using resolution in a non-refutation proof, because
we proved 11, which is the query, without using the clauses Q'a or Q'b. See
above.

e) No, this cannot be done with backward chaining alone. This is because
clause 2 is not Horn, and there is no way to prove the query without it (it
is needed because we must show that crime lords are criminals; there is no
other way to show that they are slippery). Also, with backward chaining
there is a problem of how to represent the query, and additionally there are
no base facts in the KB. Any of these reasons alone would suffice.

Question 3:
-----------

a) Not a polytree, because of the undirected cycle A-E-C-B-D-A, among other
cycles.

b) The network is DP-singly connected.
We need to examine only nodes with in-degree > 1 as possible sinks, and
nodes with out-degree > 1 as possible sources. From A, there is only one
directed path to D, C, or F. Likewise, from B there is only one directed
path to C, D, or F. From E there is no path to D, and only one to C or F.

c) A is d-separated from B, because every path from A to B has C, D, or F
as a converging node. Since there is no evidence, these nodes block the
paths. Thus, A is independent of B.

d) A is not d-separated from B given C, because the path A-E-C-B is
pass-through at E (and E is not in the evidence set), and converging at C
(and C is in the evidence set). Since these are the only internal nodes on
the path, and neither blocks it, the path is unblocked. Thus, A and B are
dependent given C.

e) All paths from A to B that do not pass through E are blocked because of
converging non-evidence nodes on the path. The paths through E are all
pass-through at E, and are blocked since E is in the evidence set. Thus, A
is independent of B given C and E.

f) The same argument as above can be used to show that A is d-separated
from B given E. Thus:

P(A | B, E) = P(A | E) = P(A, E) / P(E)
            = P(E|A) P(A) / (P(E|A) P(A) + P(E|~A) P(~A))   (by Bayes' rule)
            = 0.5 * 0.9 / (0.5 * 0.9 + 0.1 * 0.1)           (all values given)
            = 0.45 / 0.46 =approx 0.98
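The arithmetic in part f) can be checked with a few lines of Python, using
the probability values stated above (P(A) = 0.9, P(E|A) = 0.5,
P(E|~A) = 0.1):

```python
# Verifying the computation in part f):
# P(A | E) = P(E|A) P(A) / (P(E|A) P(A) + P(E|~A) P(~A))
p_a = 0.9        # P(A)
p_e_a = 0.5      # P(E | A)
p_e_not_a = 0.1  # P(E | ~A)

numerator = p_e_a * p_a                       # 0.5 * 0.9 = 0.45
evidence = numerator + p_e_not_a * (1 - p_a)  # 0.45 + 0.01 = 0.46
p_a_given_e = numerator / evidence

print(round(p_a_given_e, 2))  # 0.98
```
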