Introduction to Artificial Intelligence - Solutions

Assignment 3


Homework assignment - simulators, agents, search, games, logic, inference



1)  (agents and environment simulators)
   For each of the following domains, which type of agent is appropriate
   (table lookup, reflex, goal based, or utility based). Also, indicate the
   type of environment:
   a) An agent that plays Bridge.
ANSWER: Needs a utility-based agent to optimize the score.
        This environment can be seen as deterministic (despite the randomness of dealing
        cards initially, as this randomness is "swallowed" by the inaccessibility), inaccessible,
        non-episodic (can learn opponents' tactics, etc.), discrete, semi-static (not allowed
        to think forever, but nothing changes between moves), and multi-agent.
   b) An agent that can play Tic-Tac-Toe.
ANSWER: Can be done by a goal-based agent, but since this is a very small domain can be
        "compiled down" to a reflex agent, even using a table lookup. Environment
        is deterministic, accessible, episodic, static, discrete, and multi-agent.
   c) An autonomous humanoid robot that can win the DARPA robotics challenge (driving a car,
      walking across debris, connecting two pipes). Scoring depends on success in the tasks,
      on execution speed, and on using as little communication with the operator as possible.
ANSWER: Needs a utility-based agent. Environment is most complicated: stochastic, inaccessible, dynamic,
        non-episodic, and continuous. It is also made multi-agent due to the fact that there is an
        operator.
   d) An internet coordination agent (sets meetings between humans with conflicting goals).
ANSWER: Utility based. The environment here is deterministic, inaccessible, (perhaps) non-episodic
      (issues of trust/dependability of a participant), dynamic, (practically) discrete (you CAN set
      a meeting at 8 and 6.156247 minutes, but that is almost never done!), and multi-agent.
   e) An agent that can play peg-and-hole solitaire.
ANSWER: Goal based or utility based. Possibly can be compiled into rule-based and thus reflex, but
        domain is too large for straightforward table. Environment is deterministic, accessible,
        static, episodic, and single-agent.
   f) An agent that plays Checkers optimally.
ANSWER: Goal(search)-based. Environment is deterministic, accessible, discrete, episodic, static, and multi-agent.

2) (Search):
   Show the run of the following algorithms on the setting of question 6
   below, where the agent starting at V0 needs to reach V2 with the chems with minimal cost.
   The costs are: all edges cost 1, except for E5 which costs 2.
   Assume that h(n) is the cost to reach the goal state in the simplified problem 
   that assumes no armies, no terrorists, and no cost to travel unless with chems.

ANSWER: first we need to define the search space.
The agent, chem, and army can be anywhere. Terrorists can only be at E2 (or not).
So the state is defined by a 4-tuple:  (Agent Loc, Chem loc, Army loc, Terrorists)
where the first 3 range over V0, V1, V2, V3 and the last is binary.

The initial state is: S0= [V0, V0, V0, T]

Goal-test is true just for states that look like:  [X, V2, Y, Z]
where X, Y, Z are arbitrary values.
The operators are all relevant drive operations, with appropriate costs.

Also, since the shortest path from V0 to V2 has weight 2, and carrying the chems doubles the
travel cost, we have h(S0) = 4. In all cases, initialization begins by starting a heap with S0
with the appropriate f cost.
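As a sanity check, the heuristic can be computed mechanically (a sketch in Python; the graph and weights are those of question 6, and the factor 2 is the cost multiplier for carrying the chems):

```python
import heapq

# undirected graph of question 6; E5 has weight 2, all other edges weight 1
GRAPH = {'V0': [('V1', 1), ('V3', 1)],
         'V1': [('V0', 1), ('V2', 1), ('V3', 1)],
         'V2': [('V1', 1), ('V3', 2)],
         'V3': [('V0', 1), ('V1', 1), ('V2', 2)]}

def dist_to_V2(src):
    # plain Dijkstra from src
    heap, done = [(0, src)], {}
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done[v] = d
        for u, w in GRAPH[v]:
            if u not in done:
                heapq.heappush(heap, (d + w, u))
    return done['V2']

def h(chem_loc):
    # in the simplified problem only travel with the chem costs anything,
    # and carrying the chem doubles the edge cost
    return 2 * dist_to_V2(chem_loc)

assert [h(v) for v in ('V0', 'V1', 'V2', 'V3')] == [4, 2, 0, 4]
```

This reproduces the h values used in the traces below: 4 for a chem at V0 or V3, 2 at V1, 0 at V2.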

   a) Greedy search (f(n) = h(n)), breaking ties in favor of states where the
      agent is moving with chems, and then in favor of lower-numbered edges.

ANSWER:
Pop S0 (which has f(S0)=h(S0)=4). This is not a goal, and there are 2 possible
edges to drive across, each with 4 possible drive options (with and without chems,
with and without army). So we get 8 new states:

S1 = [V1, V0, V0, T]       h(S1) = 4
S2 = [V1, V1, V0, T]       h(S2) = 2
S3 = [V1, V0, V1, T]       h(S3) = 4
S4 = [V1, V1, V1, T]       h(S4) = 2
S5 = [V3, V0, V0, T]       h(S5) = 4
S6 = [V3, V3, V0, T]       h(S6) = 4
S7 = [V3, V0, V3, T]       h(S7) = 4
S8 = [V3, V3, V3, T]       h(S8) = 4

So now we work according to h value. S2 and S4 are tied, and both were reached by moving with
chems; we did not indicate a tie-breaker for this case. Suppose we elect to break ties in favor of
moving with an army as well. Then we take S4, and since it is not a goal it is expanded. Now it is
possible to move across 3 edges, each with 4 possibilities (well, except for traversing E2 with
chems and no army). So we get 11 states, some of which are duplicates (and one of which is S0).
Rather than enumerating them all (the algorithm DOES generate them all, except perhaps
a copy of S0 which can be cut by duplicate detection), we note that one of the new states is:

SG = [V2, V2, V2, T]       h(SG) = 0

Since the algorithm is greedy, this node (the only one with h=0) is picked in the next expansion, and it is a goal state,
so it is returned. The path is "move with army and chems across E1, move with army and chems across E2". Total
cost is (1+1)*4 = 8. This solution is NOT optimal.

   b) A* search, (f(n) = g(n)+h(n)), breaking ties as above.

ANSWER: begins the same way as above, except uses an f cost that includes the g cost, so first expansion:

S1 = [V1, V0, V0, T]       h(S1) = 4, g(S1) = 1, f(S1) = 5
S2 = [V1, V1, V0, T]       h(S2) = 2, g(S2) = 2, f(S2) = 4
S3 = [V1, V0, V1, T]       h(S3) = 4, g(S3) = 2, f(S3) = 6
S4 = [V1, V1, V1, T]       h(S4) = 2, g(S4) = 4, f(S4) = 6
S5 = [V3, V0, V0, T]       h(S5) = 4, g(S5) = 1, f(S5) = 5
S6 = [V3, V3, V0, T]       h(S6) = 4, g(S6) = 2, f(S6) = 6
S7 = [V3, V0, V3, T]       h(S7) = 4, g(S7) = 2, f(S7) = 6
S8 = [V3, V3, V3, T]       h(S8) = 4, g(S8) = 4, f(S8) = 8

So A* picks S2, since this is the only node with f cost 4.
Here there are 3 possible edges, E1, E2, E3, each with or without chem, except that we
rule out moving across E1 with chem as this will be a copy of S0, and cannot cross E2 with chem.
So we get (second expansion):

S9 = [V0, V1, V0, T]      h(S9) = 4, g(S9) = 2, f(S9) = 6
S10= [V2, V1, V0, T]      h(S10)= 4, g(S10)= 2, f(S10)= 6
S11= [V3, V1, V0, T]      h(S11)= 4, g(S11)= 2, f(S11)= 6
S12= [V3, V3, V0, T]      h(S12)= 4, g(S12)= 3, f(S12)= 7

The best now are S1 and S5 (f cost 5), and we prefer S1 as this is a move across E1,
which is lower numbered than E4. Here the agent is at V1 without chems or army, so has
only 3 possible moves, one of which is a duplicate of S0, so we generate (3rd expansion, as this
is not a goal):

S13= [V2, V0, V0, T]      h(S13)= 4, g(S13)= 2, f(S13)= 6
S14= [V3, V0, V0, T]      h(S14)= 4, g(S14)= 2, f(S14)= 6

So now we try S5, which is not a goal. Trying to expand (4th expansion), we get only duplicates of S0, S13 and S14,
so can drop them, generating nothing new.
So now we have S3, S4, S6, S7, S9, S10, S11, S13, S14, all with f-cost 6.
We prefer the one where the last move is with chems, across E1, thus S4 is preferred.
This is not a goal, so its 11 successors are generated (5th expansion) pruning duplicates. We will skip this process
for brevity.

Eventually, S6 will be picked, and it is not a goal. It has 6 successors, including
several duplicates. One of its successors is:

SG'=[V2,V2,V0,T]        h(SG')=0, g(SG')=6, f(SG')=6

Eventually SG' will be picked, and it is a goal, so the path "move with chems across E4,
move with chems across E5" is returned, with a cost of 6. This is the optimal solution.
Needless to say, a LOT of states were expanded, and a whole lot MORE generated,
so the heuristic was not very effective here.
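The whole A* run can be reproduced mechanically. Below is a compact sketch (our own encoding: a state is the 4-tuple (agent, chem, army, terrorists-still-on-E2), and costs follow the question: edge weight, doubled for carrying chems and doubled again for the army):

```python
import heapq

# edges of question 6 as (endpoint, endpoint, weight); E5 costs 2
EDGES = {'E1': ('V0', 'V1', 1), 'E2': ('V1', 'V2', 1), 'E3': ('V3', 'V1', 1),
         'E4': ('V0', 'V3', 1), 'E5': ('V3', 'V2', 2)}
DIST = {'V0': 2, 'V1': 1, 'V2': 0, 'V3': 2}   # shortest weighted distance to V2

def h(state):
    # simplified problem: only travel with the chem costs, at double edge weight
    return 2 * DIST[state[1]]

def successors(state):
    agent, chem, army, terror = state          # terror: terrorists still on E2?
    for e, (a, b, w) in EDGES.items():
        if agent not in (a, b):
            continue
        dest = b if agent == a else a
        for take_chem in (False, True):
            for take_army in (False, True):
                if take_chem and chem != agent:
                    continue
                if take_army and army != agent:
                    continue
                if take_chem and not take_army and terror and e == 'E2':
                    continue                   # terrorists block chem without army
                cost = w * (2 if take_chem else 1) * (2 if take_army else 1)
                new_terror = terror and not (take_army and not take_chem and e == 'E2')
                yield (dest, dest if take_chem else chem,
                       dest if take_army else army, new_terror), cost

def astar(start):
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, s = heapq.heappop(open_heap)
        if g > best_g.get(s, g):
            continue                           # stale heap entry
        if s[1] == 'V2':                       # goal: chem at V2
            return g
        for t, c in successors(s):
            if g + c < best_g.get(t, float('inf')):
                best_g[t] = g + c
                heapq.heappush(open_heap, (g + c + h(t), g + c, t))

assert astar(('V0', 'V0', 'V0', True)) == 6   # the optimal cost found above
```

The sketch uses duplicate detection via the best-g table rather than the tie-breaking rules of the exercise, so the expansion order may differ from the hand trace, but the returned optimal cost of 6 is the same.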

   c) Simplified RTA* with a limit of 2 expansions per real action, breaking ties as above.

ANSWER: begins with 2 expansions exactly like A*. Now, assuming the same tie-breaking, it picks S1, and as S1 is
not a goal but time is up, it is returned and the agent moves to S1.

Next time search is initiated, from S1. This (expansion 1) generates 3 successors,
copies of S0, S5, and S13. However, the search tree of the previous move was discarded,
so none of these are seen as duplicates. All these nodes have an h(n) of 4 and g(n) of 1,
so according to tie-breaking we pick S0. We expand S0 (expansion 2) similar to first expansion of S0,
except that the g cost is higher by 1 (copies of S1 and S5 dropped):

S2 = [V1, V1, V0, T]       h(S2) = 2, g(S2) = 3, f(S2) = 5
S3 = [V1, V0, V1, T]       h(S3) = 4, g(S3) = 3, f(S3) = 7
S4 = [V1, V1, V1, T]       h(S4) = 2, g(S4) = 5, f(S4) = 7
S6 = [V3, V3, V0, T]       h(S6) = 4, g(S6) = 3, f(S6) = 7
S7 = [V3, V0, V3, T]       h(S7) = 4, g(S7) = 3, f(S7) = 7
S8 = [V3, V3, V3, T]       h(S8) = 4, g(S8) = 5, f(S8) = 9

Now under tie-breaking we pick S2, and the action returned is the first action on the path to S2,
i.e. the move to S0. 

So the agent is back where it started, and under this scheme will loop forever!
(This is why the "real" version of RTA* does not work this way, and considers past g-costs
in order to avoid exactly this type of problem!)


   d) Repeat a-c but using h'(n) = 2*h(n) as the heuristic. Is h'(n) admissible?

ANSWER: h'(n) is not admissible. For example, it overestimates even for S0, for which h'(S0) = 8
while the cost of the optimal solution is 6.

Now this makes no difference for case (a) (greedy search) since multiplying
all values by a positive constant does not change the order of expanded nodes. So no change in the search
in case (a).

For case (b), A*, we have the same set of expanded states (first expansion),
with different h' and f costs:

S1 = [V1, V0, V0, T]       h'(S1) = 8, g(S1) = 1, f(S1) =  9
S2 = [V1, V1, V0, T]       h'(S2) = 4, g(S2) = 2, f(S2) =  6
S3 = [V1, V0, V1, T]       h'(S3) = 8, g(S3) = 2, f(S3) = 10
S4 = [V1, V1, V1, T]       h'(S4) = 4, g(S4) = 4, f(S4) =  8
S5 = [V3, V0, V0, T]       h'(S5) = 8, g(S5) = 1, f(S5) =  9
S6 = [V3, V3, V0, T]       h'(S6) = 8, g(S6) = 2, f(S6) = 10
S7 = [V3, V0, V3, T]       h'(S7) = 8, g(S7) = 2, f(S7) = 10
S8 = [V3, V3, V3, T]       h'(S8) = 8, g(S8) = 4, f(S8) = 12

So again we pick S2, as in A* with h(n), and of course get the same set of nodes
generated (second expansion):

S9 = [V0, V1, V0, T]      h'(S9) = 8, g(S9) = 2, f(S9) = 10
S10= [V2, V1, V0, T]      h'(S10)= 8, g(S10)= 2, f(S10)= 10
S11= [V3, V1, V0, T]      h'(S11)= 8, g(S11)= 2, f(S11)= 10
S12= [V3, V3, V0, T]      h'(S12)= 8, g(S12)= 3, f(S12)= 11

This time S4 is the winner, since S4 is the only state with f(S4)=8 or lower. As in the greedy case,
there are 11 successors, one of which is S0, which is a duplicate and is discarded. But one of them is:

SG = [V2, V2, V2, T]       h'(SG) = 0, g(SG) = 8, f(SG) = 8

The f-cost of all other nodes is greater than 8, so on the next round we pick SG,
and it is returned as a (suboptimal) solution, similar to the greedy case!
(Using h'(n) = 2*h(n) makes the search proceed in a "more greedy" manner, in this case actually
truly greedy.)

For case (c), simplified RTA*, the first 2 expansions are as for A*,
and now the algorithm picks S4 as in A*, but it is not expanded, instead the
action leading to S4 is returned and executed (drive(E1,True,True)) which actually
places us in S4 for the next phase.
Then in the next phase, S4 is expanded, giving us 11 successors that include:
SG = [V2, V2, V2, T]       h'(SG) = 0, g(SG) = 4, f(SG) = 4
Note that g(SG)=4 from the new position at S4. Anyway, this one has the least
f value so it is selected at the next step, and since it is a goal it is returned.
It turns out that in this case, using h'(n)=2*h(n), the agent behavior is
the same under greedy, A*, and simplified RTA*, i.e. the same
suboptimal path through E1 and E2 with chem and army, with a cost of 8.


3) (Game trees and Search):
   Consider a 3-person game (A, B, C) with complete information, where A and C are fully
   cooperating, and where B is semi-cooperative with A and C (and vice-versa), i.e. picks
   the best score for itself, breaking ties in favor of cooperation.
   However, C is a player that plays completely randomly with
   a uniform distribution. Assume that players take turns in the order A, B, C,
   repeatedly, until the game ends.
 
   a) Write an algorithm for optimal play by player A.

ANSWER: this is in principle a simple max-expecti-max algorithm, with a twist that does tie-breaking. The value of a node is
computed as follows (assuming the scores are written as a 3-tuple: A, B, C).

Node-value(node) {
  If terminal then return score.
  If move by A then return score of child in children(node) with highest A(score(child))+C(score(child))
     (if equal, then choose one with higher B(score(child))).
  If move by B then return score of child in children(node) with highest B(score(child))
     (if equal, then choose one with higher A(score(child))+C(score(child))).
  If move is by C then return weighted average of the scores of the children. 
}

At least theoretically. But in fact there is a glitch there, can you find it?

   b) If the game now is partial information: A cannot communicate with C, 
      and C cannot see the moves made by A, but
      A can see all moves made by C, is your algorithm
      still optimal? In what cases, if any, can you run into problems?
ANSWER: No additional problem here, as C is random anyway. (Assuming B can also see the moves by C).

   c) Same as in case b, but now C can see A's moves, but A cannot see the moves made by C.
ANSWER: Now it is extremely problematic, since e.g. the best move by A may depend on the move by C.
      So in fact A will not be able to carry out the theoretically optimal pre-computed plan,
      making the entire algorithm incorrect.

   d) Show an example 3-ply game tree with branching factor 2, and show
      the values computed in this tree in case a.

ANSWER: Here the best is for A to play A1 which will give AC a score of 2+2=4, which is
better than 3+0 with A2. Note that after A2, B will play B2 in order to improve its own score, rather
than to decrease C's score.

MAX A    MAX B    CHANCE  

A1 (2 4 2)  -- B1 (1 2 1)  -- (2 4 2)
    \              \ 
     \               -------- (0 0 0)
      \               
       ------- B2 (2 4 2)  -- (4 8 4)
                    \       
                     -------- (0 0 0)
                      
A2 (3 2 0)  -- B1 (3 2 0)  -- (6 4 0)
    \              \ 
     \               -------- (0 0 0)
      \               
       ------- B2 (3 0 3)  -- (6 0 6)
                    \       
                     -------- (0 0 0)
                      
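The tie-breaking expecti-max of part (a) can be sketched and checked against this tree (a sketch in our own encoding: leaves are score triples, internal nodes are lists of children, and chance nodes are uniform):

```python
# leaves are (A, B, C) score triples; internal nodes are lists of children.
# players cycle A (0), B (1), C (2); C is a uniform chance node.
def node_value(node, player=0):
    if isinstance(node, tuple):                      # terminal: return score
        return node
    vals = [node_value(ch, (player + 1) % 3) for ch in node]
    if player == 2:                                  # C: uniform average
        n = len(vals)
        return tuple(sum(v[i] for v in vals) / n for i in range(3))
    if player == 0:                                  # A: maximize A+C, tie-break on B
        key = lambda v: (v[0] + v[2], v[1])
    else:                                            # B: maximize B, tie-break on A+C
        key = lambda v: (v[1], v[0] + v[2])
    return max(vals, key=key)

# the tree of part (d): A's two moves, then B's two moves, then a chance node
tree = [
    [[(2, 4, 2), (0, 0, 0)], [(4, 8, 4), (0, 0, 0)]],   # A1 -> B1, B2
    [[(6, 4, 0), (0, 0, 0)], [(6, 0, 6), (0, 0, 0)]],   # A2 -> B1, B2
]
assert node_value(tree) == (2, 4, 2)   # A plays A1, B answers B2, as argued above
```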

4)  (Game-tree search - alpha-beta pruning)
   a) Give an example of a 4-ply game tree (branching factor 2)
     where alpha-beta pruning saves NOTHING, and another where the savings
     is maximal. How many nodes are pruned in each case?
ANSWER: A 4-ply game tree with branching factor 2 has 16 terminal nodes. We will
      let the values be integers from 1 through 16. Ordering them so that the worst move
      is examined first will guarantee no pruning. The reverse order will guarantee optimal pruning.
      We will present the best order first (top to bottom below). We will call moves by A: A1 and A2, etc. Bounds are marked,
      e.g. 15+ means the value of this node is 15 or more, exact value unknown. X means nothing
      known of this value (using alpha-beta in this order), and the value is irrelevant.

     MIN       MAX       MIN

A1 11 -- B1 11  -- A1 11 ----- 11
    \         \          ----- 12
     \         --- A2 10 -----  9
      \                  ----- 10  pruned
       - B2 15+ -- A1 15 ----- 15
              \          ----- 16
               --- A2  X ----- 13  pruned
                         ----- 14  pruned
A2  3- -- B1 3- -- A1  3- ----  3
     \        \          -----  4  pruned
      \        --- A2  1- ----  1
       \                 -----  2  pruned
        - B2  X--- A1  X -----  7  pruned
              \          -----  8  pruned
               --- A2  X -----  5  pruned
                         -----  6  pruned

 Simply reverse the order (worst moves examined first) and alpha-beta will prune nothing, i.e. all
 16 terminal nodes are examined. In the order shown above, 9 of the 16 terminal nodes are pruned.
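The pruning counts can be checked by running alpha-beta over the two orderings (a sketch; nested lists are our own encoding of the 4-ply tree, with an implicit MAX root):

```python
from math import inf

def alphabeta(node, is_max, alpha=-inf, beta=inf, count=None):
    """Alpha-beta over a nested-list game tree; count[0] tallies evaluated leaves."""
    if isinstance(node, int):
        count[0] += 1
        return node
    v = -inf if is_max else inf
    for child in node:
        cv = alphabeta(child, not is_max, alpha, beta, count)
        if is_max:
            v = max(v, cv); alpha = max(alpha, v)
        else:
            v = min(v, cv); beta = min(beta, v)
        if alpha >= beta:
            break                       # cutoff: remaining children are pruned
    return v

# the tree above in best ordering (leaves 1..16)
best = [[[[11, 12], [9, 10]], [[15, 16], [13, 14]]],
        [[[ 3,  4], [1,  2]], [[ 7,  8], [ 5,  6]]]]

def deep_reverse(node):
    return node if isinstance(node, int) else [deep_reverse(c) for c in reversed(node)]

c1 = [0]; v1 = alphabeta(best, True, count=c1)
c2 = [0]; v2 = alphabeta(deep_reverse(best), True, count=c2)
assert (v1, c1[0]) == (11, 7)     # best order: only 7 of 16 leaves examined
assert (v2, c2[0]) == (11, 16)    # reversed order: nothing pruned
```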

   b) Suppose that we know ahead of time that the terminal values can only
      be integers between -10 and 10. Is there a case where alpha-beta can save even
      more search than the best case above (show it, or prove that it cannot help).
ANSWER: Possible. Once a node reaches the best achievable value (here 10), its remaining siblings
need not be examined at all, which plain alpha-beta cannot justify. For example, if the bound 10
is reached for top move A1 (X, and the values in pruned subtrees, are arbitrary/unknown):

     MIN       MAX       MIN

A1 10 -- B1 10  -- A1 10 ----- 10
    \         \          ----- 10
     \         --- A2  X -----  9  pruned (B1 already reached the bound 10)
      \                  -----  8  pruned
       - B2 10  -- A1 10 ----- 10
              \          ----- 10
               --- A2  X -----  7  pruned
                         -----  6  pruned
A2  (all 8 leaves pruned: A1 already guarantees the bound 10, so A2 cannot do better)
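One simple way to exploit the known value bounds is to initialize alpha-beta with the window [-10, 10] instead of (-inf, +inf). A sketch (our own tree encoding, with arbitrary values in the pruned subtrees; move A1 guarantees the maximal value 10):

```python
from math import inf

def alphabeta(node, is_max, alpha=-inf, beta=inf, count=None):
    """Alpha-beta over a nested-list game tree; count[0] tallies evaluated leaves."""
    if isinstance(node, int):
        count[0] += 1
        return node
    v = -inf if is_max else inf
    for child in node:
        cv = alphabeta(child, not is_max, alpha, beta, count)
        if is_max:
            v = max(v, cv); alpha = max(alpha, v)
        else:
            v = min(v, cv); beta = min(beta, v)
        if alpha >= beta:
            break
    return v

# a tree where move A1 guarantees the maximal value 10 (implicit MAX root)
tree = [[[[10, 10], [9, 8]], [[10, 10], [7, 6]]],
        [[[ 5,  4], [3, 2]], [[ 1,  0], [0, 1]]]]

plain = [0];   alphabeta(tree, True, count=plain)
bounded = [0]; alphabeta(tree, True, alpha=-10, beta=10, count=bounded)
assert plain[0] == 7      # ordinary alpha-beta still enters the A2 subtrees
assert bounded[0] == 4    # with the [-10, 10] window, they are never touched
```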

   c) Provide an example of a 2-ply + 1 chance node level game tree where
      one can apply an adapted alpha-beta to prune some nodes,
      and a similar example where changing the distribution on the chance node
      edges results in no savings for pruning. 

ANSWER: Can only be done if values are bounded. Assume bounds are [-10, 10].
MAX  CHANCE    MIN  

A1 5  -- C1 10  -- 10
    \  (0.5)  \ 
     \         --- 10 
      \               
       - C2  0  --  0 
        (0.5) \       
               ---  0 
                      
A2 4-  - C1 -2- -- -2
     \  (0.5) \       
      \        --- -4   pruned
       \              
        - C2 X --- 10   pruned 
        (0.5) \       
               ---  6   pruned

That is, the score of A2 cannot be higher than 4, so no point in looking further, MAX
must play A1. However, if the probabilities of the chance node after A2 are (0.1, 0.9) then
it is still possible for the value of A2 to be higher than 5, so cannot prune. 
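The pruning condition at the chance node is simple arithmetic (a sketch; Vmax = 10 is the assumed value bound, and -2 is the bound obtained on C1 after its first leaf):

```python
VMAX = 10   # known upper bound on terminal values

def upper_bound(p_seen, seen_value, vmax=VMAX):
    # best case for A2: the already-seen child contributes at most seen_value,
    # the unseen children contribute at most vmax
    return p_seen * seen_value + (1 - p_seen) * vmax

assert upper_bound(0.5, -2) == 4   # 4 < 5 = val(A1): A2 can be pruned
assert upper_bound(0.1, -2) > 5    # bound is 8.8 > 5: no pruning possible
```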
                     


5) (Propositional logic)
   For each of the following sentences, determine whether they are
   satisfiable, unsatisfiable, and/or valid. For each sentence that
   is satisfiable, determine the number of models (over the propositional
   variables in the sentence). In case b, also trace the run of the DPLL algorithm for satisfiability with this
   formula as input (i.e. explain any recursive calls and cases encountered).
   a) (A and (A -> B) and (B -> C)) -> C
ANSWER: Valid, so 8 models. (As it is equivalent to: (not A or (A and not B) or (B and not C) or C))
   b) (A or not B or C) and (not A or B) and (not A or not B or C) and (A or not B)
ANSWER: Satisfiable, 3 models: A=B=C=True, or A=B=False with C either True or False.
   (The clauses (not A or B) and (A or not B) together force A=B.)
   With DPLL, C appears only as a positive literal, so set C=True, satisfying the first and 3rd clauses.
   Now this cannot be repeated and there is no unit clause. So try first A=True and call recursively.
   This satisfies the last clause and leaves only (False or B). Now can set B=True and return success.
   (Would also work if we chose A=False, but with a different assignment.)
   c) (A or B or C or D or E or F or G)
ANSWER: Satisfiable. Only 1 out of 2^7 assignments is not a model, so 127 models.
   d) (A and B) or (C or not D or E)
ANSWER: Satisfiable. The first part has one way to be satisfied (A=B=True), which makes the sentence true
      for all 8 possible assignments of the second part, so 8 models. The second part is satisfiable in 7 possible
      assignments of the variables C,D,E, irrespective of the 4 possible assignments of A and B, total 28
      models. But these overlap in cases where both parts are true, which is 1*7 = 7 cases. So the total number
      of models is 8+28-7 = 29 (by "inclusion-exclusion").
   e) (A and (A -> B) and (B -> C)) -> not C
ANSWER: Satisfiable. True if C=False (4 models). The left-hand part negated, under C=True, is equivalent to:
      not A or (A and not B) which has 3 more models not overlapping the other 4, so total 7 models.
      The only non-satisfying assignment is A=B=C=True.
   f) not ((A and (A -> B) and (B -> C)) -> C)
ANSWER: Negation of a valid sentence, so not satisfiable.
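All of the model counts above can be verified by brute-force enumeration (a quick check in Python):

```python
from itertools import product

def count_models(n, formula):
    """Count satisfying assignments of a boolean function over n variables."""
    return sum(bool(formula(*vals)) for vals in product((False, True), repeat=n))

imp = lambda p, q: (not p) or q   # material implication

# a) valid: all 8 assignments are models
assert count_models(3, lambda A, B, C: imp(A and imp(A, B) and imp(B, C), C)) == 8
# b) A and B are forced equal, so 3 models
assert count_models(3, lambda A, B, C:
                    (A or not B or C) and (not A or B) and
                    (not A or not B or C) and (A or not B)) == 3
# c) only the all-False assignment fails
assert count_models(7, lambda *vs: any(vs)) == 127
# d) 8 + 28 - 7 = 29 models, by inclusion-exclusion
assert count_models(5, lambda A, B, C, D, E: (A and B) or (C or not D or E)) == 29
# e) only A=B=C=True fails
assert count_models(3, lambda A, B, C: imp(A and imp(A, B) and imp(B, C), not C)) == 7
# f) negation of a valid sentence: unsatisfiable
assert count_models(3, lambda A, B, C: not imp(A and imp(A, B) and imp(B, C), C)) == 0
```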

6) (FOL and situation calculus)
   We need to represent and reason within the scenario defined by assignment 1.
   For simplicity, we will ignore the issue of action costs, assume only one agent, at most one army
   and one chem, and allow only the drive action.
   
   a) Write down the axioms for the behavior of the world using FOL and
      situation calculus.

      Use the predicates:
        Edge(e, v1, v2) ; e is an edge in the graph between v1 and v2.
        Goal(v)         ; Vertex v is a chem dumping (goal) site.
      Also use the fluents (situation-dependent predicates)
        Loc(v,s)        ; Agent is at vertex v in situation s
        Terrorist(e,s)  ; Terrorists at edge e in situation s
        Chem(v,s)       ; Chem at vertex v in situation s.
        Army(v,s)       ; Vertex v has army in situation s

     Constants: as needed, for the edges and vertices.

     Functions: not needed here, except to denote actions:
         drive(e,chem,army)     ; Denotes a move action across edge e (may be ineffective), with or without chem
                                  and with or without army.
     and of course the "Result" function:
         Result(a,s)    ; Denotes the situation resulting from doing action a in
                        ; situation s.
     You may wish to add auxiliary predicates or fluents in order to simplify the axioms. you
     are allowed to do so provided you explain why they are needed.

   Let the facts representing the scenario be:
Edge(E1, V0, V1)
Edge(E2, V1, V2)
Edge(E3, V3, V1)
Edge(E4, V0, V3)
Edge(E5, V3, V2)
Goal(V2)

; with situation S0 fluents being:
Loc(V0, S0)
Terrorist(E2, S0)
Army(V0,S0)
Chem(V0,S0)
ANSWER: It is a good idea to add auxiliary predicates indicating whether an action actually works,
and possibly also one indicating whether the edge indicated in the action is adjacent to the
current location. So: the agent is at an edge in a situation s if there is a vertex at which the
agent is in s, and the edge has this vertex as one of its ends.

1. forall e, s  AtEdge(e, s) iff exists v1, v2  Edge(e, v1, v2) and (Loc(v1,s) or Loc(v2,s))

A drive action across e is legal in situation s if the agent is AtEdge e in s, attempts to carry
chem only if there is one (likewise with an army), and, if the edge has terrorists, only with an
army. This can be done by ONE sentence, by quantifying over the values of chem and army, or by
separate sentences. We will take the "separatist" way.

Can drive across an edge with army if at the edge and the current location has an army:

2a. forall e, s, v  AtEdge(e,s) and Loc(v,s) and Army(v,s) => Legal(drive(e,False,True),s)

Note the constants True and False. These are NOT the same as the truth values for the sentences!
Need to cover the other cases, first with army and chems:

2b. forall e, s, v  AtEdge(e,s) and Loc(v,s) and Army(v,s) and Chem(v,s) => Legal(drive(e,True,True),s)

Then with chems but no army:

2c. forall e, s, v  AtEdge(e,s) and Loc(v,s) and Chem(v,s) and not Terrorist(e,s) => Legal(drive(e,True,False),s)

Then with neither:

2d. forall e, s, v  AtEdge(e,s) and Loc(v,s) => Legal(drive(e,False,False),s)

We also need axioms for when an action is NOT legal, or alternately for what conditions are
implied by an action being legal (e.g. a drive is legal only if AtEdge, etc.). But we do NOT need
them for the proof, or for the search algorithms, so will not state them.

Note, BTW, that the description of S0 is lacking, i.e. it does NOT say that there is no army
elsewhere, that there is only one Loc in a state, that there is no chem elsewhere, and no
terrorists elsewhere, etc. These need to be stated explicitly if desired, or by a "closed world
assumption".
But we will not need them, so let us ignore this.

Now we need successor-state axioms for all fluents (situation-dependent predicates) that can be
changed directly by actions. Terrorists never appear, and remain unless removed by a drive with
army and no chems:

3. forall e, e1, s, army, chem  Terrorist(e, Result(drive(e1,chem,army),s)) iff
   (Terrorist(e,s) and (not e=e1 or not Legal(drive(e1,chem,army),s) or army=False or chem=True))

Note that the above contains equalities, which may be unpleasant if we need this for the proof.
But we will not (at least not in the solution below!).

Location changes by legal drive actions (assumes no self-edges in the graph!):

4. forall v, v1, e, s, army, chem  Loc(v, Result(drive(e,chem,army),s)) iff
   ((Loc(v,s) and not Legal(drive(e,chem,army),s)) or
    (Loc(v1,s) and Legal(drive(e,chem,army),s) and (Edge(e,v,v1) or Edge(e,v1,v))))

Chem changes only by a drive with chem, and even then only for a legal action:

5. forall v, v1, e, s, army, chem  Chem(v, Result(drive(e,chem,army),s)) iff
   ((Chem(v,s) and (not Legal(drive(e,chem,army),s) or chem=False or not Loc(v,s))) or
    (Chem(v1,s) and Legal(drive(e,chem,army),s) and chem=True and Loc(v1,s) and (Edge(e,v,v1) or Edge(e,v1,v))))

Likewise, army changes only by a drive with army:

6. forall v, v1, e, s, army, chem  Army(v, Result(drive(e,chem,army),s)) iff
   ((Army(v,s) and (not Legal(drive(e,chem,army),s) or army=False or not Loc(v,s))) or
    (Army(v1,s) and Legal(drive(e,chem,army),s) and army=True and Loc(v1,s) and (Edge(e,v,v1) or Edge(e,v1,v))))

Looking at the last part of 5 and 6, perhaps we should have defined a predicate CanMoveFrom(v1,v,e,s) as:

Aux. forall v, v1, e, s  CanMoveFrom(v1,v,e,s) iff (Loc(v1,s) and (Edge(e,v,v1) or Edge(e,v1,v)))

but we didn't, so now must suffer. This should be it, except for trivial axioms like False=False,
not False=True, etc.

   b) Now we need to find a sequence of actions that will result in reaching the goal of the
      chem being in V2, and prove a theorem that it will do so.
      Do the following steps:

      A) Convert the knowledge base to conjunctive normal form. (This is very long, so you may
         omit the parts not needed in the proof below.)

ANSWER: As stated above, the result is HORRIBLE, especially for the successor-state axioms. So we
will first SPLIT some of the axioms into parts, and then work only on the parts! For example,

1. forall e, s  AtEdge(e, s) iff exists v1, v2  Edge(e, v1, v2) and (Loc(v1,s) or Loc(v2,s))

is split into:

1a. forall e, s  AtEdge(e, s) => exists v1, v2  Edge(e, v1, v2) and (Loc(v1,s) or Loc(v2,s))
1b. forall e, s  (exists v1, v2  Edge(e, v1, v2) and (Loc(v1,s) or Loc(v2,s))) => AtEdge(e, s)

and we only need 1b, which becomes:

1b. not Edge(e,v1,v2) or (not Loc(v1,s) and not Loc(v2,s)) or AtEdge(e, s)

But now, due to the distributive rule, this must become 2 CNF clauses (note: we will not
standardize apart between clauses; we will do this later, if needed, during resolution. In
principle this should be done during the conversion to CNF, in order to be "pure"):

1b1. not Edge(e,v1,v2) or not Loc(v1,s) or AtEdge(e, s)
1b2. not Edge(e,v1,v2) or not Loc(v2,s) or AtEdge(e, s)

In axiom 2, we use drive only with army and chems, so will convert only 2b:

2b. forall e, s, v  AtEdge(e,s) and Loc(v,s) and Army(v,s) and Chem(v,s) => Legal(drive(e,True,True),s)

This becomes simply:

2b. not AtEdge(e,s) or not Loc(v,s) or not Army(v,s) or not Chem(v,s) or Legal(drive(e,True,True),s)

Axiom 3 we will not need in this version of the solution (see below), so can save on converting
it. However, we WILL need axioms 4, 5, 6, which are horrid. But then we will need only the parts
where these predicates CHANGE, and will only need the REVERSE direction. For example, from axiom 4
we only need:

4a. forall v, v1, e, army, chem
    (Loc(v1,s) and Legal(drive(e,chem,army),s) and (Edge(e,v,v1) or Edge(e,v1,v)))
    => Loc(v, Result(drive(e,chem,army),s))

This becomes:

4a. not Loc(v1,s) or not Legal(drive(e,chem,army),s) or (not Edge(e,v,v1) and not Edge(e,v1,v))
    or Loc(v, Result(drive(e,chem,army),s))

which again becomes 2 clauses due to the distributive rule:

4a1. not Loc(v1,s) or not Legal(drive(e,chem,army),s) or not Edge(e,v,v1) or Loc(v, Result(drive(e,chem,army),s))
4a2. not Loc(v1,s) or not Legal(drive(e,chem,army),s) or not Edge(e,v1,v) or Loc(v, Result(drive(e,chem,army),s))

Likewise, from 5 we need only part of the reverse direction, i.e.:

5a. forall v, v1, e, army, chem
    (Chem(v1,s) and Legal(drive(e,chem,army),s) and chem=True and Loc(v1,s) and (Edge(e,v,v1) or Edge(e,v1,v)))
    => Chem(v, Result(drive(e,chem,army),s))

which again generates 2 clauses:

5a1. not Chem(v1,s) or not Legal(drive(e,chem,army),s) or not chem=True or not Loc(v1,s)
     or not Edge(e,v,v1) or Chem(v, Result(drive(e,chem,army),s))
5a2. not Chem(v1,s) or not Legal(drive(e,chem,army),s) or not chem=True or not Loc(v1,s)
     or not Edge(e,v1,v) or Chem(v, Result(drive(e,chem,army),s))

Note that we COULD have written "not Legal(drive(e,True,army),s)" instead of
"not Legal(drive(e,chem,army),s) or not chem=True" had we split axiom 5 originally - this is much
simpler. But again, we did not, so must suffer.

From axiom 6 we will similarly need only part of the reverse direction:

6a. forall v, v1, e, army, chem
    (Army(v1,s) and Legal(drive(e,chem,army),s) and army=True and Loc(v1,s) and (Edge(e,v,v1) or Edge(e,v1,v)))
    => Army(v, Result(drive(e,chem,army),s))

which again generates 2 clauses:

6a1. not Army(v1,s) or not Legal(drive(e,chem,army),s) or not army=True or not Loc(v1,s)
     or not Edge(e,v,v1) or Army(v, Result(drive(e,chem,army),s))
6a2.
     not Army(v1,s) or not Legal(drive(e,chem,army),s) or not army=True or not Loc(v1,s)
     or not Edge(e,v1,v) or Army(v, Result(drive(e,chem,army),s))

      B) Formally list what we need to prove (a theorem), and state its negation.

ANSWER: We will now formalize a goal state as one having a chem at V2 (should have been "all chems
at V2", but this is complicated, as we did not state which places have no chems!). So we need a
state s where Chem(V2,s) holds. We can achieve this in a lot of ways. The simplest is 2 move
actions with army and chems, say through E1 and E2; then we do not care about terrorists (there
are many other solutions!). That is, we need to prove:

G.  Chem(V2, Result(drive(E2,True,True), Result(drive(E1,True,True),S0)))

And add its negation to the KB (already in CNF):

G'. not Chem(V2, Result(drive(E2,True,True), Result(drive(E1,True,True),S0)))

      C) Use resolution to reach a contradiction, thus proving the theorem. At each step of the
         proof, specify the resolvents and the unifier.

ANSWER: We will state some things for S0 first.

Resolve "Edge(E1, V0, V1)" with 1b1, substitution {e/E1, v1/V0, v2/V1}, to get:
R1. not Loc(V0,s) or AtEdge(E1, s)

Resolve R1 with "Loc(V0,S0)", substitution {s/S0}, to get:
R2. AtEdge(E1, S0)

Resolve R2 with 2b, substitution {e/E1, s/S0}, to get:
R3. not Loc(v,S0) or not Army(v,S0) or not Chem(v,S0) or Legal(drive(E1,True,True),S0)

Resolve R3 with "Loc(V0,S0)", substitution {v/V0}, to get:
R4. not Army(V0,S0) or not Chem(V0,S0) or Legal(drive(E1,True,True),S0)

Resolve R4 with "Army(V0,S0)", empty substitution, to get:
R5. not Chem(V0,S0) or Legal(drive(E1,True,True),S0)

Resolve R5 with "Chem(V0,S0)", empty substitution, to get, finally, something useful:
R6. Legal(drive(E1,True,True),S0)

OK, so now we can prove that driving in S0 gets us somewhere...

Resolve 4a2 with "Loc(V0,S0)", substitution {v1/V0, s/S0}, to get:
R7.
    not Legal(drive(e,chem,army),S0) or not Edge(e,V0,v) or Loc(v, Result(drive(e,chem,army),S0))

Resolve R7 with R6, substitution {e/E1, chem/True, army/True}, to get:
R8. not Edge(E1,V0,v) or Loc(v, Result(drive(E1,True,True),S0))

Resolve R8 with "Edge(E1, V0, V1)", substitution {v/V1}, to get:
R9. Loc(V1, Result(drive(E1,True,True),S0))

Now we are getting somewhere (namely, to V1)... We will skip a number of uninteresting steps.
In a manner similar to the above, resolve 5a2 with the S0 facts (as done in the previous steps) to get:
R20. not True=True or Chem(V1, Result(drive(E1,True,True),S0))

Resolve "True=True" with R20 to get:
R21. Chem(V1, Result(drive(E1,True,True),S0))

Again skipping some uninteresting steps, similarly use 6a2 as above to get:
R30. Army(V1, Result(drive(E1,True,True),S0))

Now, for the rest of the proof, we can use exactly the same scheme as above (the sequence R1 to
R6), using R9, R21, and R30 instead of the S0 facts, together with "Edge(E2, V1, V2)", to show
what happens with the next drive action, except that now we use the state
"Result(drive(E1,True,True),S0)" rather than S0, to show:
R40. Legal(drive(E2,True,True), Result(drive(E1,True,True),S0))

Now we can use 5a2 again, resolving in sequence (note, this is SEVERAL steps of resolution!) with
R21, R40, R9, "Edge(E2, V1, V2)", and "True=True", to get:
R50. Chem(V2, Result(drive(E2,True,True), Result(drive(E1,True,True),S0)))

Finally, resolve R50 with G', empty substitution, to get the empty clause, thereby completing the proof.

      c) Would there be a potential problem in the proof if we did not have "frame axioms" in our
         representation (i.e. stating what did not change), and if so, what?

ANSWER: No problem as is, because we explicitly modify all that needs to be modified in each
action, and do not care about terrorists, as we keep the army.
But, for example, if we did not drive with an army, and passed through E4, E5, we would need to
prove that terrorists do not appear on E5 as a result of the drive across E4. That would not be
possible without frame axioms.

      d) Would we be able to do the proof using only forward chaining? (Justify all answers shortly!)

ANSWER: It looks like we only used Horn clauses and atomic sentences in the proofs, so in this
case forward chaining would work. In fact, it would follow the same path used in our solution.
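As an operational sanity check, the successor-state axioms can be run as a small simulator (a sketch in our own encoding; an illegal drive is a no-op, matching the axioms, and names like `legal`/`drive` mirror the predicates above):

```python
# edges of the scenario; state components mirror the fluents Loc, Chem, Army, Terrorist
EDGES = {'E1': ('V0', 'V1'), 'E2': ('V1', 'V2'), 'E3': ('V3', 'V1'),
         'E4': ('V0', 'V3'), 'E5': ('V3', 'V2')}

def legal(state, e, chem, army):
    v = state['loc']
    if v not in EDGES[e]:
        return False                  # axiom 1: must be AtEdge
    if chem and state['chem'] != v:
        return False                  # can carry the chem only if it is here
    if army and state['army'] != v:
        return False                  # likewise for the army
    if chem and not army and e in state['terror']:
        return False                  # axiom 2c: terrorists block chem without army
    return True

def drive(state, e, chem, army):
    if not legal(state, e, chem, army):
        return state                  # an illegal drive changes nothing
    a, b = EDGES[e]
    dest = b if state['loc'] == a else a
    new = dict(state, loc=dest)       # axiom 4: agent crosses the edge
    if chem:
        new['chem'] = dest            # axiom 5: chem moves along
    if army:
        new['army'] = dest            # axiom 6: army moves along
    if army and not chem:
        new['terror'] = state['terror'] - {e}   # axiom 3: army clears terrorists
    return new

s0 = {'loc': 'V0', 'chem': 'V0', 'army': 'V0', 'terror': frozenset({'E2'})}
s = drive(drive(s0, 'E1', True, True), 'E2', True, True)
assert s['chem'] == 'V2'              # matches theorem G proved above
```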

Deadline: Noon (12:00), Tuesday, December 3, 2013 ( strict deadline).

Submissions are to be solo.