Introduction to Artificial Inteligence

Assignment 3


Homework assignment - simulators, agents, search, games, logic, inference


1) (FOL and situation calculus)
   We need to represent and reason within the scenario defined by assignment 1.
   
   a) Write down the axioms for the behavior of the world using FOL and
      situation calculus. Assume only one agent.
      Use the predicates:
        >, <            ; (in order to check if the agent can clear the ice, etc).
        Edge(e, v1, v2) ; e is an edge between vertex v1 and vertex v2.
        Goal(v)         ; The agent's goal is vertex v.
     Also use the fluents (situation-dependent predicates)
        Salt(v, w, s)   ; Vertex v has w units of salt in situation s.
        Ice(e, i, s)    ; Edge e has i units of ice in situation s.
        Carrying(w, s)  ; Agent is carrying w units of salt in situation s.
        Loc(v,s)        ; Agent is at vertex v in situation s

     Functions:
        +, -     (in order to compute remaing amount of salt, etc.)
     And the actions: move(e), load(w).

Answer:
  In order to save on the size of the axioms, we will
define auxiliary predicates. These are not necessary, but will save some work.
       OK(a, s)        ; action a is successful in situation s


     Axioms for OK are as follows:

   ; An agent at location v, trying to move across an edge e from v to some u, carrying
   ; an amount of salt w that is no less than the amount of ice on edge e, the move succeeeds.

1. forall (v, u, e, w, w1, s)
      Loc(v, s) and Carrying(w, s) and Edge(e, v, u) and Ice(e, w1, s) and not (w < w1)
                -> OK(move(e), s)

   ; An agent at location v, that contains w1 units of salt,
   ; trying to load w units of salt, if w is no more than w1, succeeds.

2. forall (v, w, w1, s)  Loc(v, s) and Salt(v, w1, s) and not (w > w1) -> OK(load(w), s)

   ; Note that if there are also carrying capacity constraints, C then we also
   ; need to add: "and Carrying(w2, s) and not (w+w2 > C)" to the antecedent and in addition add
   ; w2 to the variables being universally quantified, as below. But we will ignore this
   ; henceforth.

3. forall (v, w, w1, w2, s)  
        Loc(v, s) and Salt(v, w1, s) Carrying(w2, s) and not (w+w2 > C) and not (w > w1)
              -> OK(load(w), s)

 We also need to state for completeness that these are the only cases where the actions
succeed. That is:

4. forall (e, s) OK(move(e), s) -> exists (v, u, w, w1)
        Loc(v, s) and Carrying(w, s) and Edge(e, v, u) and Ice(e, w1, s) and not (w < w1) 

 
5. forall (w, s) OK(load(w), s) -> exists (v, w1) Loc(v, s) and Salt(v, w1, s) and not (w > w1)

However, the latter 2 axioms will not be used in the proof.
Now we can proceed. A successful move(e) causes the agent to be at the other vertex:

6. forall (v, u, e, s) Loc(v, s) and Edge(e, v, u) and OK(move(e), s) -> Loc(u, Result(move(e), s)

and the ice on that edge is cleared:

7. forall (v, u, e, s) Edge(e, v, u) and OK(move(e), s) -> Ice(e, 0, Result(move(e), s)

and the agend carries less salt:

8. forall (w, w1, e, s) OK(move(e), s) and Carrying(w, s) and Ice(e, w1, s) 
         -> Carrying(w-w1, Result(move(e), s)

A successful load reduces the amount of salt in a vertex:

9. forall (v, w, w1, s) Loc(v, s) and OK(load(w), s) and Salt(v, w1, s)
             -> Salt(v, w1-w, Result(load(w), s))

and also increases what the agent is carrying:

10.forall (w, w1, s) OK(load(w), s) and Carrying(w, s) -> Carrying(w+w1, Result(load(w), s))

We also need frame axioms, loading does not change agent location, whether it succeeds or not:

11. forall (v, w, s) Loc(v, s) -> Loc(v, Result(load(w), s))

and a move action leaves the amount of salt in every vertex unchanged, regardless.

12. forall (v, w, e, s) Salt(v, w, s) -> Salt(v, w, Result(move(e), s))

We also need to state that edges are bidirectional:

13. forall (e, v, u) Edge(e, v, u) -> Edge(e, u, v)

And that there is only one amount of salt in a vertex:

14. forall (v, w1, w2, s) Salt(v, w1, s) and Salt(v, w2, s) -> w1=w2

one amount of ice on an edge:

15. forall (e, i1, i2, s) Ice(e, i1, s) and Ice(e, i2, s) -> i1=i2

the agent can carry one amount of salt:

16. forall (w1, w2, s) Carrying(w1, s) and Carrying(w2, s) -> w1=w2

and can only be in one vertex in any situation:

17. forall (v, u, s)  Loc(v, s) and Loc(u, s) -> u=v

Note that the last 4 axioms are not needed in the proof, but are important in general.



   b) Write down the additional facts representing the following scenario in situation S0.


Edge(E1, Vstart, Vmid)
Edge(E2, Vmid, Vend)
Goal(Vend)

; Situation S0
Salt(Vstart, 2, S0)
Ice(E1, 0, S0)
Ice(E2, 1, S0)
Carrying(0, S0)
Loc(Vstart, S0)


Answer: we only need the axioms on numbers, such as 1 > 0, 2 > 1, not 0 > 1, etc.
nothing more needed. 
But note that if we did not have axiom 13 we would
either need to add it now, or instead add:
  Edge(E1, Vmid, Vstart)
  Edge(E2, Vend, Vmid)


  c)  Now, we need to find a sequence of actions that will result in reaching the goal,
      and prove a theorem that it will do so. Do the following steps:

      A) Convert the knowledge base to conjunctive normal form.


Answer (we will be lazy and not standardize apart here):

1. not Loc(v, s) or not Carrying(w, s) or not Edge(e, v, u) or not Ice(e, w1, s) or (w < w1)
                or OK(move(e), s)

2. not Loc(v, s) or not Salt(v, w1, s) or (w > w1) or OK(load(w), s)

3. not Loc(v,s) or not Salt(v,w1,s) or not Carrying(w2,s) or (w+w2 > C) or (w > w1) or OK(load(w),s)


4a. not OK(move(e), s) or Loc(V-of(e,s))
4b. not OK(move(e), s) or Carrying(W-of(e,s), s)
4c. not OK(move(e), s) or Edge(e, V-of(e,s), U-of)e,s)
4d. not OK(move(e), s) or Ice(e, W1-of(e,s), s)
4e. not OK(move(e), s) or not W-of(e,s) < W1-of(e,s)

where V-of, U-of, W-of, and W1-of are Skolem functions.

 
5.a not OK(load(w), s) or  Loc(V'-of(w,s), s) 
5.b not OK(load(w), s) or  Salt(V'-of(w,s), W1'-of(w,s), s) 
5.a not OK(load(w), s) or  not w > W1'-of(w,s)

where V'-of, W'-of, and W1'-of are Skolem functions.


6. not Loc(v, s) or not Edge(e, v, u) or not OK(move(e), s) or Loc(u, Result(move(e), s)


7. not Edge(e, v, u) or not OK(move(e), s) or Ice(e, 0, Result(move(e), s)


8. not OK(move(e), s) or not Carrying(w, s) or not Ice(e, w1, s) or Carrying(w-w1, Result(move(e), s)


9. not Loc(v, s) or not OK(load(w), s) or not Salt(v, w1, s) or Salt(v, w1-w, Result(load(w), s))


10.not OK(load(w), s) or not Carrying(w1, s) or Carrying(w+w1, Result(load(w), s))


11.not Loc(v, s) or Loc(v, Result(load(w), s))


12.not Salt(v, w, s) or Salt(v, w, Result(move(e), s))


13.not Edge(e, v, u) or Edge(e, u, v)


14.not Salt(v, w1, s) or not Salt(v, w2, s) or w1=w2


15.not Ice(e, i1, s) or not Ice(e, i2, s) or i1=i2


16.not Carrying(w1, s) or not Carrying(w2, s) or w1=w2


17.not  Loc(v, s) or not Loc(u, s) or u=v


      B) Formally list what we need to prove, and state its negation.

Answer: A sequence of actions that reaches the goal is [load(1), move(E1), move(E2)].
To state formally that this results in reaching the goal when executed in S0
we say:

G.   exists (v) Goal(v) and Loc(v, Result(move(E2), Result(move(E1), Result(load(1), S0))))

The negation of this is:

G'.  not exists (v) Goal(v) and Loc(v, Result(move(E2), Result(move(E1), Result(load(1), S0))))

Moving the negation inwards, we get:

     forall (v) not Goal(v) or not  Loc(v, Result(move(E2), Result(move(E1), Result(load(1), S0))))

and now removing the quantifier we have a single CNF clause:

G'(CNF). not Goal(v) or not  Loc(v, Result(move(E2), Result(move(E1), Result(load(1), S0)))) 

      C) Use resolution to reach a contradiction, thus proving the theorem.

Answer: Resolve G'(CNF) with "Goal(Vend)", unifier {v|Vend} to get:

20. not Loc(Vend, Result(move(E2), Result(move(E1), Result(load(1), S0))))

Resolve 20 with 6, unifier {u|Vend, s|Result(move(E1), Result(load(1), S0)), e|E2}:

21. not Loc(v, Result(move(E1), Result(load(1), S0))) or not Edge(E2, v, Vend) 
            or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolve 21 with "Edge(E2, Vmid, Vend)", unifier {v|Vmid}, we get:

22. not Loc(Vmid, Result(move(E1), Result(load(1), S0)))
            or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

We will now work in another direction.
Resolve 2 with "Loc(Vstart, S0)", unifier {v|Vstart, s|S0} to get:

23. not Salt(Vstart,w1,S0) or (w > w1) or OK(load(w),S0)

Resolve 23 with "Salt(Vstart, 2, S0)", unifier {w1|2} to get:

24. or (w > 2) or OK(load(w),S0)

Which means: it is OK to load anything not more than 2 in S0.
Resolving 22 with 6, on the Loc predicate, unifier {u|Vmid, e|E1, s|Result(load(1), S0)}:

25. not Loc(v, Result(load(1), S0)) or not Edge(E2, v, Vmid) or not OK(move(E1), Result(load(1), S0))
            or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 25 with "Edge(E1, Vstart, Vmid)", unifier {v|Vstart}, we get:

26. not Loc(Vstart, Result(load(1), S0)) or not OK(move(E1), Result(load(1), S0))
            or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 26 with the frame axiom 11, unifier {v|Vstart, w|1, s|S0} we get:

27. not Loc(Vstart, S0) or not OK(move(E1), Result(load(1), S0))
            or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 27 with "Loc(Vstart, S0)", empty unifier, we get:

28. not OK(move(E1), Result(load(1), S0)) or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 28 with 1, unifier {e|E1, s|Result(load(1), S0)}, we get:

29. not Loc(v, Result(load(1), S0)) or not Carrying(w, Result(load(1), S0)) 
       or not Edge(E1, v, u) or not Ice(E1, w1, Result(load(1), S0)) or (w < w1)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 29 with the frame axiom 11 again, unifier {w|1, s|S0}, we get
(note that we MUST standardize apart the two different "w" here, or we
get an incorrect result!).

30. not Loc(v, S0) or not Carrying(w, Result(load(1), S0)) 
       or not Edge(E1, v, u) or not Ice(E1, w1, Result(load(1), S0)) or (w < w1)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 30 with "Loc(Vstart, S0)", unifier {v|Vstart} we get:

31. not Carrying(w, Result(load(1), S0)) 
       or not Edge(E1, Vstart, u) or not Ice(E1, w1, Result(load(1), S0)) or (w < w1)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Unifying 31 with "Edge(E1, Vstart, Vmid)", unifier {u|Vmid} we get:

32. not Carrying(w, Result(load(1), S0)) 
       or not Ice(E1, w1, Result(load(1), S0)) or (w < w1)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Here we observe that we actually forgot a frame axiom! Loading does not change
the amount of ice on an edge! We add it now in as a CNF clause:

A.  not Ice(e, i, s) or Ice(e, i, Result(load(w), s))

Also, moving does not change the amount of ice on an unrelated edge:

B.  not Ice(e, i, s) e=e1 or Ice(e, i, Result(move(e1), s))

Resolving 32 with A, unifier {e|E1, w1|i, w|1, s|S0} we get
(again, note the different "w" in these axioms, we will chane one of them to "w'"):

33. not Carrying(w', Result(load(1), S0)) 
       or not Ice(E1, i, S0) or (w' < i)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 33 with "Ice(E1, 0, S0)", unifier {i|0}, we get:

34. not Carrying(w', Result(load(1), S0)) or (w' < 0) 
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Resolving 34 with 10, unifier {w'|w+w1, w|1, s|S0} we get:

35. not OK(load(1), S0) or not Carrying(w1, S0) or (1+w1 < 0)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Unifying 35 with 24, unifier {w|1}, we get:

36. (1>2) or not Carrying(w1, S0) or (1+w1 < 0)
       or not OK(move(E2),Result(move(E1), Result(load(1), S0)))

Unifying 36 with "Carrying(0, S0)", unifier {w1|0}, we get:

38. (1>2) or (1+0 < 0) or not OK(move(E2),Result(move(E1), Result(load(1), S0)))


Unifying 34 with "not (1+0 < 0)" and with "not (1 > 2)", which is in the background
assumed numerical knowledge, we get:

39. not OK(move(E2),Result(move(E1), Result(load(1), S0)))

We are "almost" done, all we need to show now is that 39 leads to a contradiction.
Unfortunately, this "almost" requires quite a way to go, so here we will skip some steps.
Using 39, and 1, unifier {s|Result(move(E1), Result(load(1), S0)), e|E2} we get:

40. not Loc(v, Result(move(E1), Result(load(1), S0))
        or not Carrying(w, Result(move(E1), Result(load(1), S0))
        or not Edge(E2, v, u) 
        or not Ice(E2, w1, Result(move(E1), Result(load(1), S0)) or (w < w1)

We then use "Edge(E2, Vmid, Vend)", to drop this literal, and get:

41.  not Loc(Vmid, Result(move(E1), Result(load(1), S0))
        or not Carrying(w, Result(move(E1), Result(load(1), S0))
        or not Ice(E2, w1, Result(move(E1), Result(load(1), S0)) or (w < w1)

Now, the first literal can be removed by a sequence of operations similar to that
starting with 22. This will leave us with:

42.  or not Carrying(w, Result(move(E1), Result(load(1), S0))
        or not Ice(E2, w1, Result(move(E1), Result(load(1), S0)) or (w < w1)

Which is straightforward, but still rather long. The first literal can be dropped
due to the move axiom on "Carrying" and the proof that in Result(load(1), S0)
the agent carries 1 unit of salt. The second is dropped by using the frame axioms
on ice. Resolve 42 with 8, unifier {w'|w-w1, s|Result(load(1), S0), e|E1}:

43.  not OK(move(E1), Result(load(1), S0)) 
     or not Carrying(w, Result(load(1), S0) or not Ice(E1, w1',Result(load(1), S0)
     or not Ice(E2, w1', Result(move(E1), Result(load(1), S0)) or (w < w1')

Here the first clause can be dropped in a way exactly as in 28, and the 2nd clause
as done from 33 (which also forces the substitution {w|1})
The third is dropped using frame axiom A and the initial
state axiom "Ice(E1, 0, S0)". The 4th is dropped using the frame axioms for ice A and
B and the initial state axiom "Ice(E2,1,S0)", which also forces the substitution
{w1'|1} The last clause is now  "(1 < 1)" and can be dropped using resolution
with the fact "not (1 < 1)" in the numerical background knowledge, resulting in the empty clause.


  d) What would be a potential problem in the proof if we did not
     have "frame axioms" in our representation (i.e. stating what
     did not change)?

Answer: this was shown above, if we did not have the frame axioms we
would not have been able to prove that the agent is still carrying the salt after moving,
and that the ice on edge E1 is still 0 after loading, or that the location of the
agent is still at S0 after loading, etc.

  e) Would we be able to do the proof using only forward chaining?

This would not work in this case because we would not have been able to
get the correct complicated function expressions
"Result(.,Result(...))", and there would be no way to generate
them using unification. If we were allowed forward chaining plus
the ability to generate arbitrary substitutions, this would 
still not work as some of the clauses used in the proof are not Horn clauses
(but could have been re-written to be so, if, e.g. instead of "or (w+w2 > C)"
we had: "or not  (w+w2 <= C)", and assuming we had a "less than or equal to" predicate.


2) (search and logic)
   Finding an explanation for observed evidence is known as abductive
   reasoning.  The input is a KB of definite Horn clauses (i.e. "rules",
   see below), and a function that assigns each literal a
   non-negative cost.
   Also part of the input is a set of literals E (the observed
   evidence) to be explained.   
   An explanation is a set of assumed literals A, that together with KB
   proves E.

   In this exercise, we wish to solve the weighted abduction
   version: given E, find an explanation  A where
   the sum of the costs of the literals in A is minimal. For example, consider
   the following KB:

   R1:        a -> b
   R2:  b and c -> d
   R3:        a -> c
   R4:        x -> d

   The cost of assumption for literals is as follows:
      cost(b) = 7
      cost(c) = 5
      cost(d) = 21
      cost(a) = 10
      cost(x) = 50

   For the case where we wish to find an explanation for E = {d},
   one possible explanation is A = {d} for a cost of 21.
   But it is not an optimal explanation, because A = {a}
   also an explanation, with a weight of 10.

   a) Formalize the weighted abduction problem as a search problem,
      where the state is a partial solution A', and where an operator adds
      any literal not in the partial solution.

Answer:
     Initial state: A = {}.
     Goal test: E follows from A union KB
     Operators (one of several possibilities):
          Add any literal to A that is not already in A.
     Path cost: for each operator, the cost of the added literal,
     

   b) Same as (a), but this time a lexicographical ordering of literals
      is given (say a, then b, etc.), and one can add a literal x only
      if no literal y greater than x is in A'.

Answer:
     Initial state: A = {}.
     Goal test: E follows from A union KB
     Operators (one of several possibilities):
          Add any x literal to A that is not already in A, provided that
          for all y in A, y is "lexicigraphically before" x. Note that you need to
          this predicate for implementation, and that numerous
          definitions are possible.
     Path cost: for each operator, the cost of the added literal.
     

   c) Assume that we use uniform-cost search, are we assured of finding
      the minimal explanation using version a? If not, are we assured of such
      using option b? You need to sketch the proof for your answer.

Answer:
      Yes in both cases, ASSUMING COSTS ARE POSITIVE.
      This follows from the monotonicity of path costs and the
      monotonicity of the logic (i.e. that adding literals can never
      cause something that was provable to stop being provable).

      For case b you also need the fact that every A is reachable from
      the empty set despite the lexicographic restriction.

      (Observe that you need noth monotonicities, or this would not work!)

   d) We wish to use A* to perform weighted abduction. Find an
      admissible heuristic for versions a, b, (they may all be the
      same heuristic, and it does NOT have to be a GOOD heuristic).
      Now, show an execution of A* using your heuristic to solve the
      weighted abduction problem, given as input the above KB and
      the evidence to be explained E = {d}.
      Use only one of the search problem versions a, b - your choice
      (hint: which would result in the shortest trace?)
Answer:
      A trivial addmissible heuristic would be simply 0. However, let
      us do something a bit better:
         h(A) = 0 if A is a goal,
         h(A) = otherwise, cost of cheapest eligible literal not in A.

      We will use option b, so that we get a shorter trace... Assume
      the standard lexicographic ordering (a is smallest).

       Initialize:
          {} h = 5, f = 5
       Expand {} to get:
          {a} h =   0, f =  10
          {b} h =   5, f =  12
          {c} h =  21, f =  26
          {d} h =   0, f =  21
          {x} h = inf, f = inf
       Put them in the queue in this order.

       Now, get {a}, which is a goal state, and we are done...

       Note effectiveness of heuristic, because had we used 0 we would
       have gotten:
          {c} h =   0, f =   5
          {b} h =   0, f =   7
          {a} h =   0, f =  10
          {d} h =   0, f =  21
          {x} h =   0, f =  50

      And now we would try to expand {c}, and then try to expand {b}
      and only then get to {a}.

3) (Propositional logic)
   For each of the following sentences, determine whether they are
   satisfiable, unsatisfiable, and/or valid. For each sentence that
   is satisfiable, determine the number of models (over the propositional
   variables in the sentence).
   a) A or B or C or D
           Satisfiable, 15 models.

   b) (not (((not A) or B) and ((not B) or C))) or C 
           This is satisfiable.
           Models are: 4 in which C is true. If C is false, the expression becomes:
           "(not (((not A) or B) and ((not B))))" which is equal to:
            "not ((not A) or B) or B" which equals:
            "(A and not B) or B" which is true unless both A and B are false.
            Therefore, the original expression has 7 models. 

   c) (A and (not A)) or (B and not B)
           Unsatisfiable.

   d) (A or B) and ((not C) or not D)
           Satisfiable. There are 3 models for just the LHS, and 3 for just the RHS. Since
           they are satisfied independently, the total number of models is 3*3=9. 

   e) A -> not (A or B)
         Satisfiable, equal to "not A or (not A and not B)", which has 2 models
         (A is false, B can be either true or false).

   f) (A -> not A) and (not A -> A)
         Equivalent to:  (not A or not A) and (A or A) which simplifies to: not A or A,
         which is not satisfiable.

4) (Game trees and search)
   Consider the game to be implemented in assignment 2, but this time with a stochastic
   element: an agent can attempt to cross an icy edge, and succeeds with a given
   probability p (say p=0.1) even if it is carrying no salt.
   You may assume a time limit as desired for the length of the game in
   each scenario.

   a) Show a scenario with p=0 (no random element) where for the first agent to
      move (Clarification: not "whoever is the first to move", i.e. this need be
      shown only for one of the agents) under zero-sum scoring, 
      moving in a certain direction (say, crossing edge E1) gets a negative score,
      but under semi-cooperative the above action is optimal.

Answer: 

Consider the simple graph with 3 nodes where both agents are in vertex Vstart,
agent A needs to get to VA, and agent B needs to get to VB. There are
3 edges:

E1 between Vstart and VB, costs 5  and has 5 unit of ice.
E2 between Vstart and VA, costs 10 and has 4 unit of ice.
E3 between VA and VB, costs 7 and has no ice.
Agent A initially carries 5 unit of salt, agent B 4 units.

At the initial state, agent A can cross E2 at the cost of (1+5)*10 = 60,
and get to its goal VA (total cost 60).
Then agent B can only take this route to VA (it does not have enough salt to cross E1)
and then edge E3 to VB for a total cost
(1+4)*10 + (1*4)+7 = 85. In this case the score for agent A is 85-60=25 and agent A wins.

If agent A takes E1, this costs (1+5)*5=30. Agent B then can take E1 at the cost of (1+4)*5=25.
Agent A then takes E3 for the cost of 7, total cost for agent A is 37.
Score for agent A is thus 25-37=-12 so agent A loses.
So in this scoring A can win by taking E2, and lose by taking E1.

But if the scoring is semi-cooperative, taking E1 is better for agent A since the cost for A is
37, rather than 60.


   b) In the game p > 0 zero-sum scoring, show a scenario
   where whoever is the first agent to move, gets a negative score.

Answer: here we will assume that an agent may not do a load(0) action, acting as a no-op.
In fact, we will have no salt in vertices and therefore no load actions are allowed.
Suppose we have p=0.1, and a graph with 4 vertices (Vstartm Vmid, V1, V2) and 3 edges.

Initially, both agents A and B are at Vstart, each carrying 1 unit of salt. There
is an edge E1 from Vstart to Vmid with 1 unit of ice, and from Vmid an edge E2 to V1 and
an edge E3 to V2. These edges each have a cost of 1 and 1 unit of ice. The goal of agent A is
V1, the goal of agent B is V2. Note that the first agent to move must cross E1,
as it is the only available action, removing the ice.
And then it is "semi-stuck", that is, it cannot cross to its goal with certainty.
So the SECOND agent pays 2 to cross E1 (carrying the salt), and can then get to its goal
with an additional cost of 2 (total cost of 4 with probability 1).
The FIRST agent's optimal policy is then to just keep trying to reach its goal
without clearing the ice, it pays 1 more in any case, and then an additional 1 with
probability 0.9 (since success is with probability 0.1), until eventually it succeeds. Total
expected cost is 2 (first move) + 1 + 0.9*(1+0.9*1+....), for a total expected cost of 12
(using formula for infinite geometric series). So the FIRST player always loses.


   In both cases, you must PROVE the scores by generating the respective game tree, either
   to a terminal position or prove that you can prune some branches.

   c) Suppose the game is fully cooperative, that is, the agents try to minimize the sum
      of their costs (with p=0). Find heuristics:
     1) One that is a lower bound, but non-trivial.

ANSWER: sum of the shortest distances assuming no ice.
(note: sum of shortest distances assuming sufficient salt, which is a better heuristic
for a single agent, is not admissible in this case, as one agent can clear a path for the other).

     2) One that is an optimistic upper bound, but non-trivial

ANSWER: This is actually rather tough, as the answer may be infinite, which is neither optimistic
nor non-trivial. But one possibility is to solve the problem for ONE agent and note the cost, 
and then solve for the SECOND
agent on the graph resulting from the traversal of the first agent, and add the costs.
This is an upper bound, and is certainly non-trivial and hard to compute. However, it is a bit
easier than the original problem, as solving for the heuristic in two stages decreases the
depth of the solutions, in pronciple (though not always!).



5)  (Game-tree search - alpha-beta pruning)
   a) Consider the tic-tac-toe position (with O to move):

   | X |  
-----------------
 X | O | O
-----------------
   | X |

     Give an example in ordering the complete search tree starting from this position 
     where alpha-beta pruning saves as little as possible, and another where the savings
     is maximal. How many nodes are pruned in each case?
     You may denote moves by Xxy or Oxy in order to save space.
     (Clarification: in the case with no pruning, no need to specify
     the tree explicitly, only estimate the number of nodes).

Answer: assuming y axis 1 is at the bottom...

Tree with best pruning is (note also initial bounds plus-minus 1 on
alpha and beta!), value of each node in parenthesis, with "-"
indicating -1 and "+" indicating +1.

MAX     MIN     MAX

(+) O31 (+) X33 (+) O13               score: 1 (alpha=1)
                    Other O moves pruned
        (+) X13 (+) O33               score: 1 (alpha=1)
                    Other O moves pruned
        (+) X11 (+) O13               score: 1 (alpha=1)
                    Other O moves pruned
    Other O moves pruned

Best move for O is O31, and also (due to symmetry) O33, which is not
found in this search.


With bad sorting there is almost no pruning:
        
MAX     MIN     MAX     MIN

(0) O11 (+) X13 (-) O31 (-) X33       score: -1 (alpha=-1)
                (+) O33               score:  1 (alpha= 1)
        (+) X31 (-) O31 (-) X33       score: -1 (alpha=-1)
                (+) O33               score:  1 (alpha= 1)
        (0) X33 (0) O31 (0) X13       score:  0 (alpha= 0)
                (-) O13 (-) X31       score: -1 (alpha= 0)
(0) O13 (+) X11 (-) O33 (-) X31       score: -1 (alpha=-1)
                (+) O31               score:  1 (alpha= 1)
        (+) X31 (-) O31 (-) X33       score: -1 (alpha=-1)
                (+) O33               score:  1 (alpha= 1)
        (0) X31 (0) O33 (0) X11       score:  0 (alpha= 0)
                (-) O11 (-) X33       score: -1 (alpha= 0)
(+) O31 (+) X33 (-) O11 (-) X13       score: -1 (alpha=-1)
                (+) O31               score:  1 (alpha= 1)
        (+) X13 (-) O11 (-) X33       score: -1 (alpha=-1)
                (+) O33               score: 1 (alpha=1)
        (+) X11 (+) O13               score: 1 (alpha=1)
                    Other O move pruned
    Other O move pruned.
 
In this case 14 leaf nodes, and some intermediate nodes can be pruned by
correct ordering more than the worst ordering.

   b) Consider the same scenario as above, except that after every move by
      O, a random mark (equally probable to be X or O, uniformly distributed
      over the free locations) is generated. Draw and evaluate the
      search tree, and state where pruning is possible.

Answer:

MAX      CHANCE    MIN     MAX

(0) O31  (+) O11   (+) X33  (+) O13   score: 1
                   (+) X13  (+) O33   score: 1
         (+) O13                      score: 1
         (+) O33                      score: 1
         (-) X11   (-) X13            score: -1  (beta = -1)
                Other X move pruned
         (-) X13   (-) X11            score: -1  (beta = -1)
                Other X move pruned
         (-) X33   (-) X13            score: -1  (beta = -1)
                Other X move pruned
(0) O33  (symmetrical to O31)
(0+)O11  (+) O31   (+) X33  (+) O13   score: 1
                   (+) X13  (+) O33   score: 1
         (+) O33                      score: 1
         (+) O13   (+) X33  (+) O31   score: 1
                   (+) X31  (+) O33   score: 1
         (0) X31   (+) X13  (+) O33   score: -1  (beta = -1)
                   (0) X33  (0) O31   score: 0
         (-) X13   (0) X33            score: -1  (beta = -1)
                Other X move pruned
         (-) X33   (-) X13            score: -1  (beta = -1)
                Other X move pruned
(0+)O13  (symmetrical to O11)

Where (0+) is plus 1/6. So best move for O in this case is either O11 or O13, with an
expected utility of 1/6.


6)  (agents and environment simulators)
   For each of the following domains, which type of agent is appropriate
   (table lookup, reflex, goal based, or utility based). Also, indicate the
   type of environment:
   a) An agent that plays billiards on a real table.
         Utility based. Environment: accessible, stochastic, (possibly episodic),
         continuous, static, multi-agent (assuming an opponent).
         With an opponent, envionment becomes non-episodic as there is an opportunity
         to learn about opponent behaviour after several games.
 
   b) An agent that plays chess.
         Can use goal based or utility based. Environment: accessible, deterministic,
         discrete, semi-static (chess clock time limit), non-episodic (due to learning
         possibility of opponent), multi-agent.

   c) An agent that can play spider solitaire.
         Goal based or utility based. Environment: non-accessible, deterministic, episodic,
         discrete, static, single-agent. But actually it is better to model the
         non-accessibility as accessible and stochastic (usually easier to solve).
 
   d) An autonomous robot that can win the DARPA challenge (driving in
      a city).
         Utility based. Environment: non-accessible, stochastic, non-episodic,
         continuous, dynamic, and multi-agent. In short, the hardest possible.

   e) An internet shopping agent (flights reservation domain).
         Utility based. Environment: non-accessible, stochastic, semi-discrete
         (only costs may be seen as continuous), non-episodic, dynamic, multi-agent.

   f) An agent that can solve the 8-puzzle.
         Here a reflex agent, even using table lookup, is sufficient.
         Environment: accessible, deterministic, static, episodic, discrete,
         and single-agent.



Deadline: Noon (12:00), Wednesday, December 9, 2009 ( strict deadline).

Submissions are to be solo.