Introduction to Artificial Intelligence

Assignment 3: solution


Homework assignment - simulators, agents, search, games, logic, inference



1)  (agents and environment simulators)
   For each of the following domains, which type of agent is appropriate
   (table lookup, reflex, goal based, or utility based). Also, indicate the
   type of environment:
   a) An agent that plays Whist (or Bridge).
   b) An agent that can play Sudoku.
   c) An autonomous humanoid robot that can win the DARPA robotics challenge (driving a car,
      walking across debris, connecting two pipes). Scoring depends on success in the tasks,
      on execution speed, and on using as little communication with the operator as possible.
   d) An internet shopping agent specializing in cheap vacations.

ANSWER: There is no single correct answer here, but rather a "best practice".
   a) Needs a utility-based agent. The environment is stochastic (card deals), but can be modeled as
      deterministic. It is inherently NOT fully observable. It is also discrete, non-episodic,
      multi-agent, and semi-static (timing matters in competitions).
   b) The easy Sudoku puzzles are such that a reflex agent suffices. The hard ones need search, so goal-based.
      The environment is deterministic, observable, episodic, discrete, static (except in versions that count
      solution time, where it is semi-static), and single-agent.
   c) Must have a utility-based agent. The environment is the hardest in many senses. It is stochastic,
      not fully observable, continuous, dynamic, partially multi-agent (has an operator), and semi-episodic
      (scores for individual episodes, but can benefit from inter-episode learning).
   d) Also needs a utility-based agent. The environment is stochastic, not fully observable, "semi" discrete
      (costs), episodic, dynamic, and multi-agent.

2) (Search):
   Show the run of the following algorithms on the setting of question 6
   below, where the agent starting at V0 needs to reach the goal V2 with minimal cost,
   where costs are as defined in assignment 1.
   Assume that h(n) is the cost to reach the goal state in the simplified problem 
   that assumes that the agent is always carrying exactly the minimal amount of food needed to cross
   an edge before traversing it.

ANSWER: (common part)  the heuristic is essentially the distance in the appropriate graph, with edges
      having the cost w' resulting from having exactly the right amount of food w to cross the edge.
      That is, w'= w*(w+1). So we have w'(E1)=2, w'(E2)=20, w'(E3)=6.
      We did not specify an edge weight for E4, as it has no effect on the best path or on the results of the
      search. However, it MIGHT have an effect on the search itself if the edge weight is small. So we will
      assume w(E4)=5, and thus w'(E4)=30. This heuristic h(n) is admissible.
      The world state is described by agent location, food carried, and food at nodes. Since food never increases
      at a node, only one node makes a difference there: V1. So the state will be described by the tuple:
       (Loc, carrying, food at V1)
      We actually also need to represent whether the agent is alive; we will do so by having it carry a negative
      amount of food in this solution.
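The heuristic values used in the traces below can be checked mechanically. The following sketch (function and variable names are ours, not part of the assignment) builds the simplified graph with edge costs w'(e) = w*(w+1) and runs Dijkstra from the goal, so that h(v) is the distance from v to V2; it reproduces h(V0)=6 and h(V1)=8.

```python
import heapq

# Edges of the graph: (name, endpoint, endpoint, weight w).
# w'(e) = w*(w+1) is the cost of crossing e carrying exactly w food,
# as derived in the answer above; w(E4)=5 is the assumed weight.
EDGES = [("E1", "V0", "V1", 1), ("E2", "V1", "V2", 4),
         ("E3", "V0", "V2", 2), ("E4", "V1", "V3", 5)]

def heuristic_table(goal="V2"):
    """Dijkstra from the goal over the w' costs; h(v) = distance to goal."""
    adj = {}
    for _, u, v, w in EDGES:
        wp = w * (w + 1)
        adj.setdefault(u, []).append((v, wp))
        adj.setdefault(v, []).append((u, wp))
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in adj.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(pq, (d + c, v))
    return dist
```

Calling heuristic_table() gives h(V0)=6 (via E1 then E3: 2+6, cheaper than E3's... note the direct E3 also costs 6) and h(V1)=8 (2+6, cheaper than E2's 20), matching the h values attached to those locations in the traces.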

   a) Greedy search (f(n) = h(n)), breaking ties in favor of states with the
      higher node number.

ANSWER:
     Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=6, f=6
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=6, f=6
      S2:  (V1, 3, 0),   g=2,  h=8, f=8
      S3:  (V2, -1, 3),  g=infinity, h=0, f=0     (agent dead, drop it)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=6, f=6     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=8, f=8     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=0     (agent dead)
     (Alternately, one can disallow such successors, in which case S3, S4, S5, S6 are discarded and not pushed onto the agenda).
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=8, f=8
      S8:  (V0, 2, 0),   g=6, h=6, f=6
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
    Now the best is S8.  It is not a goal and again has 3 successors (noop, V1, V2):
      S11: (V0, 2, 0),   g=7, h=6, f=6
      S12: (V1, 1, 0),   g=9, h=8, f=8
      S13: (V2, 0, 0),   g=12, h=0, f=0
    Note that we have generated a goal state, but it is pushed into the agenda and not returned yet.
    Now the best node S13 is picked from the agenda. 
     S13 is a goal, so it is returned. The solution found is (total cost 12):
        Go to V1 (cost of 2)
        Go to V0 (cost of 4)
        Go to V2 (cost of 6)
    In this case it happens to be the optimal (and only) solution.

   b) A* search, (f(n) = g(n)+h(n)), breaking ties as above.

ANSWER:
     Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=6, f=6
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=6, f=7
      S2:  (V1, 3, 0),   g=2,  h=8, f=10
      S3:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=6, f=infinity     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=8, f=infinity     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     (Alternately, one can disallow such successors, in which case S3, S4, S5, S6 are discarded and not pushed onto the agenda).
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=8, f=11
      S8:  (V0, 2, 0),   g=6, h=6, f=12
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
     Now the best is S7. It has 4 successors, but in two of them the agent is dead, so this time we will ignore them. So we add:
      S11:  (V1, 1, 0),   g=4, h=8, f=12
      S12:  (V0, 1, 0),   g=6, h=6, f=12
     Now we have 3 states with f=12; the tie-breaking rule prefers S11, and again we have 4 successors, in 2 of which the
     agent dies, so we ignore them, and get:
      S13:  (V1, 0, 0),   g=5, h=8, f=13
      S14:  (V0, 0, 0),   g=6, h=6, f=12
     Now we have 3 states with f=12, and the tie-breaking rule does not help. Suppose we pick the one where we have more food, S8.
    It is not a goal and again has 3 successors (noop, V1, V2):
      S15: (V0, 2, 0),   g=7, h=6, f=13
      S16: (V1, 1, 0),   g=9, h=8, f=17
      S17: (V2, 0, 0),   g=12, h=0, f=12
    Note that we have generated a goal state, but it is pushed into the agenda and not returned yet.
    Now we again have several states with f=12, but S17 has the highest location node number, so S17 is picked.
     S17 is a goal, so it is returned. The solution found is (total cost 12):
        Go to V1 (cost of 2)
        Go to V0 (cost of 4)
        Go to V2 (cost of 6)
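Both traces above can be cross-checked with a small best-first search. The move semantics below are reconstructed from the trace (noop costs 1 and consumes 1 unit of food; crossing an edge of weight w requires carrying at least w, costs w*(1+carrying), consumes w food, and picks up all the food at the destination); all names are ours. With weight_g=1 the sketch behaves as A* (f = g+h), with weight_g=0 as greedy best-first (f = h); both find the cost-12 plan (E1, E1, E3), as in the traces.

```python
import heapq

EDGES = {"E1": ("V0", "V1", 1), "E2": ("V1", "V2", 4),
         "E3": ("V0", "V2", 2), "E4": ("V1", "V3", 5)}
# Location part of h(n); w(E4)=5 assumed, so h(V3) = 30 + 8 = 38.
H = {"V0": 6, "V1": 8, "V2": 0, "V3": 38}

def successors(state):
    """Yield (action, cost, next_state). Semantics reconstructed from the trace:
    noop costs 1 and consumes 1 food; crossing an edge of weight w requires
    carrying >= w, costs w*(1+carrying), consumes w food, and picks up all the
    food at the destination (only V1 ever holds food). Lethal moves are simply
    not generated, i.e. "dead" successors are disallowed."""
    loc, carry, food_v1 = state
    if carry >= 1:
        yield "noop", 1, (loc, carry - 1, food_v1)
    for e, (u, v, w) in EDGES.items():
        if loc in (u, v) and carry >= w:
            dest = v if loc == u else u
            pickup = food_v1 if dest == "V1" else 0
            new_food = 0 if dest == "V1" else food_v1
            yield e, w * (1 + carry), (dest, carry - w + pickup, new_food)

def best_first(start=("V0", 1, 3), goal="V2", weight_g=1):
    """weight_g=1 gives A* (f = g+h); weight_g=0 gives greedy (f = h)."""
    pq = [(H[start[0]], 0, start, [])]
    seen = {}
    while pq:
        f, g, s, plan = heapq.heappop(pq)
        if s[0] == goal:
            return g, plan
        if seen.get(s, float("inf")) <= g:
            continue
        seen[s] = g
        for a, c, t in successors(s):
            heapq.heappush(pq, (weight_g * (g + c) + H[t[0]], g + c, t, plan + [a]))
    return None
```

The heap's secondary components (g, then the state tuple) stand in for the assignment's tie-breaking rule, so the expansion order may differ slightly from the hand trace, but the returned plan and its cost of 12 agree with it.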

   c) Simplified RTA* with a limit of 2 expansions per real action, breaking ties as above.

ANSWER:
     Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=6, f=6
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=6, f=7
      S2:  (V1, 3, 0),   g=2,  h=8, f=10
      S3:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=6, f=infinity     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=8, f=infinity     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     Now the only state left is S2, but we have reached the expansion limit, so the appropriate move is made: go to V1.
     Now the search is restarted with (we will keep the current g value as an initial cost to simplify things. If not,
     deduct 2 from all the g costs and f costs below):
      S2:  (V1, 3, 0),   g=2,  h=8, f=10
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=8, f=11
      S8:  (V0, 2, 0),   g=6, h=6, f=12
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
     Now the best is S7. It has 4 successors, but in two of them the agent is dead, so this time we will ignore them. So we add:
      S11:  (V1, 1, 0),   g=4, h=8, f=12
      S12:  (V0, 1, 0),   g=6, h=6, f=12
     The best f-cost is 12. Using the tie-breaking rule, unfortunately we pick S11, but since the expansion limit was reached, we
     select the action "noop" (thereby eventually dooming the agent).
    The search restarts with:
      S7:  (V1, 2, 0),   g=3, h=8, f=11
    We can only pick S7, and have 4 successors, in 2 of which the
    agent dies, so we ignore them, and get:
      S11:  (V1, 1, 0),   g=4, h=8, f=12
      S12:  (V0, 1, 0),   g=6, h=6, f=12
     Using the tie-breaking rule, unfortunately we pick S11 again. Expanding it, we have 4 actions, in 2 of which
    the agent dies, leaving:
      S13:  (V1, 0, 0),   g=5, h=8, f=13
      S14:  (V0, 0, 0),   g=6, h=6, f=12
    We can pick either S12 (meaning "noop") or S14 (meaning "go to V0"). One of them is returned as we have reached the expansion limit.
    In either case the agent will die. Suppose we pick "noop", so the agent starts the next search from S11.
      S11:  (V1, 1, 0),   g=4, h=8, f=12
     Expanding S11, we get again:
      S13:  (V1, 0, 0),   g=5, h=8, f=13
      S14:  (V0, 0, 0),   g=6, h=6, f=12
    Now the best is S14, and in all its successors the agent dies, leaving only S13. This is the expansion limit,
    so we return S13, i.e. the "noop" action again.
     We now start the search from S13. In all its successors the agent dies, so the algorithm finally returns "fail"...
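The deliberation step of simplified RTA* can be sketched as follows (a minimal sketch, ours, reusing the move semantics reconstructed from the traces): expand up to the budget of nodes, then commit to the first action on the path to the best frontier node. With a budget of 2 it reproduces the first committed move above (traverse E1, i.e. go to V1); the later, fatal steps depend on the assignment's exact tie-breaking rule, which is not modeled here.

```python
import heapq

EDGES = {"E1": ("V0", "V1", 1), "E2": ("V1", "V2", 4),
         "E3": ("V0", "V2", 2), "E4": ("V1", "V3", 5)}
H = {"V0": 6, "V1": 8, "V2": 0, "V3": 38}   # w(E4)=5 assumed

def succ(state):
    # Same reconstructed move semantics as in the best-first sketch:
    # noop eats 1 food; an edge of weight w costs w*(1+carrying), eats w food,
    # and picks up all food at the destination (only V1 holds food).
    loc, carry, food_v1 = state
    if carry >= 1:
        yield "noop", 1, (loc, carry - 1, food_v1)
    for e, (u, v, w) in EDGES.items():
        if loc in (u, v) and carry >= w:
            dest = v if loc == u else u
            pickup = food_v1 if dest == "V1" else 0
            yield e, w * (1 + carry), (dest, carry - w + pickup,
                                       0 if dest == "V1" else food_v1)

def rta_step(state, limit=2):
    """One deliberation phase of simplified RTA*: expand up to `limit` nodes
    (f = g + h), then commit to the first action on the path to the best
    frontier node."""
    pq = [(H[state[0]], 0, state, [])]
    for _ in range(limit):
        f, g, s, plan = heapq.heappop(pq)
        if s[0] == "V2":
            return plan[0] if plan else "noop"
        for a, c, t in succ(s):
            heapq.heappush(pq, (g + c + H[t[0]], g + c, t, plan + [a]))
    f, g, s, plan = heapq.heappop(pq)   # best frontier node after the budget
    return plan[0]
```

From the initial state (V0, 1, 3) the two expansions consume S0 and its noop child, leaving the V1 state as the best frontier node, so the committed action is E1.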

   d) Repeat a-c but using h'(n) = 2*h(n) as the heuristic. Is h'(n) admissible?

ANSWER: h'(n) is not admissible. For example, suppose in the initial state you have 2 food. Then the
    goal can be reached for cost 6, while h' of this state is 12.
    Now to trace the search using h'(n).
  
   d.a) Greedy search: Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=12, f=12
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=12, f=12
      S2:  (V1, 3, 0),   g=2,  h=16, f=16
      S3:  (V2, -1, 3),  g=infinity, h=0, f=0     (agent dead, drop it)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=12, f=12     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=16, f=16     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=0     (agent dead)
     (Alternately, one can disallow such successors, in which case S3, S4, S5, S6 are discarded and not pushed onto the agenda).
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=16, f=16
      S8:  (V0, 2, 0),   g=6, h=12, f=12
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
    Now the best is S8.  It is not a goal and again has 3 successors (noop, V1, V2):
      S11: (V0, 2, 0),   g=7, h=12, f=12
      S12: (V1, 1, 0),   g=9, h=16, f=16
      S13: (V2, 0, 0),   g=12, h=0, f=0
    Note that we have generated a goal state, but it is pushed into the agenda and not returned yet.
    Now the best node S13 is picked from the agenda. 
     S13 is a goal, so it is returned. The solution found is (total cost 12):
        Go to V1 (cost of 2)
        Go to V0 (cost of 4)
        Go to V2 (cost of 6)
    In this case it also happens to be the optimal (and only) solution.

   d.b) A* search, (f(n) = g(n)+h'(n)), breaking ties as above.  Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=12, f=12
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=12, f=13
      S2:  (V1, 3, 0),   g=2,  h=16, f=18
      S3:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=12, f=infinity     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=16, f=infinity     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     (Alternetaly, can disallow such successors, in which case S3, S4, S5, S6 are discarded and not pushed onto agenda).
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=16, f=19
      S8:  (V0, 2, 0),   g=6, h=12, f=18
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
     Now the best is S8 (f=18). It is not a goal and again has 3 successors (noop, V1, V2):
      S15: (V0, 2, 0),   g=7, h=12, f=19
      S16: (V1, 1, 0),   g=9, h=16, f=25
      S17: (V2, 0, 0),   g=12, h=0, f=12
    Note that we have generated a goal state, but it is pushed into the agenda and not returned yet.
     Now S17 is picked, and it is a goal, so it is returned. The solution found is (total cost 12):
        Go to V1 (cost of 2)
        Go to V0 (cost of 4)
        Go to V2 (cost of 6)
     Note that compared to using h(n), the more pessimistic h'(n) avoided some expansions,
     such as the expansion of S7. We potentially lose optimality, though in this specific case we lost nothing.

   d.c) Using f(n)=g(n)+h'(n), simplified RTA* with a limit of 2 expansions per real action, breaking ties as above.

     Initial state pushed onto agenda:
      S0:  (V0, 1, 3),   g=0,  h=12, f=12
     Now, retrieve S0, and it is not a goal, and has 3 successors, for noop, move to V1, move to V2
      S1:  (V0, 0, 3),   g=1,  h=12, f=13
      S2:  (V1, 3, 0),   g=2,  h=16, f=18
      S3:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
     Now, retrieve S1, not a goal, same 3 actions, (noop, V1, V2) resulting in:
      S4:  (V0, -1, 3),  g=infinity, h=12, f=infinity     (agent dead)
      S5:  (V1, -1, 0),  g=infinity, h=16, f=infinity     (agent dead)
      S6:  (V2, -1, 3),  g=infinity, h=0, f=infinity     (agent dead)
      Now the only state left is S2; it is picked, but the expansion limit was reached, so the appropriate move is made: go to V1.
     Now the search is restarted with (we will keep the current g value as an initial cost to simplify things. If not,
     deduct 2 from all the g costs and f costs below):
      S2:  (V1, 3, 0),   g=2,  h=16, f=18
     Now we only have S2, not a goal, 4 successors: noop, or go to V0,V2,V3
      S7:  (V1, 2, 0),   g=3, h=16, f=19
      S8:  (V0, 2, 0),   g=6, h=12, f=18
      S9:  (V2, -1, 0),   agent dead
      S10: (V3, -1, 0),   agent dead
     Now the best is S8. It is not a goal and again has 3 successors (noop, V1, V2):
      S15: (V0, 2, 0),   g=7, h=12, f=19
      S16: (V1, 1, 0),   g=9, h=16, f=25
      S17: (V2, 0, 0),   g=12, h=0, f=12
      Now we pick S17, which is a goal state, so it can be returned (we have also reached the expansion limit of 2, but we are done anyway).
     Solution found is (total cost 12):
        Go to V1 (cost of 2)
        Go to V0 (cost of 4)
        Go to V2 (cost of 6)
      In this case, the inadmissible heuristic h'(n) not only sped up the search, but also saved the agent's life...


3) (Game trees):
   Consider a 3-person game (A, B, C) with complete information and deterministic environment.
   Suppose A has 5 possible moves. Provide a game tree example that shows that for
   each of the following set of rules, A will have a different optimal move:
   a) Each agent out for itself, and they cannot communicate.
   b) Paranoid assumption: B and C are against A
   c) B and C may agree on a deal if it is beneficial to both of them
   d) B and C may agree on a deal if it is beneficial to both, but C is
      allowed to violate it (but B does not know that).
   e) A and B may reach a deal, if it is beneficial to both of them

   Explain your answer by showing the computed values in the internal nodes in each case.
   Note that you should probably make the example more concise by allowing
   only ONE choice of action for some of the players in some of the branches.

ANSWER:

Consider the following tree (the score at the end is (A B C) scores, respectively).

A1   -----   B1  ----  C1 ----- (7  2  5)
                       C2 ----- (0  1  2)
             B2  ----  C1 ----- (4  3  9)
                       C2 ----- (4  0 10)
             B3  ----  C1 ----- (4  2  5)
A2   -----   B1  ----  C1 ----- (2  2  5)
                       C2 ----- (0  0  2)
             B2  ----  C1 ----- (4  3  9)
                       C2 ----- (8  0 10)
A3   -----   B1  ----  C1 ----- (6  1  0)
             B3  ----  C1 ----- (4  0  0)
A4   -----   B1  ----  C1 ----- (9  5  0)
             B2  ----  C1 ----- (0 10  0)
A5   -----   B1  ----  C1 ----- (5  0  0)

Note that in some of the cases B and/or C have only one possible move.

Under regime a, the best move for A is A1, because the optimal move for
B will be B1 and that for C will be C1, to get (7 2 5), i.e. a score of 7 for A.
This is the best A can get anywhere  under this regime. 

Under regime b, the best move for A is A5, which gives B and C no options, and results in 5 for A.
Any other choice allows B and C together to make A get less than 5.

Under regime c, if A takes move A1 (or A2) then B and C can make a deal that will benefit both:
B will agree to make move B2 if C agrees to move to C1, improving the score for both B and C.
As a result the score starting at A1 is (4 3 9), so A will only get 4, and as a result
the best move for A is A3 which earns 6 for A. 
(Doing A2 would cause the deal between B and C to again yield (4 3 9), which is also not good for A.)

Under regime d, the best move for A is A2: B and C make the same deal for either A1 or A2. 
But since C will renege on the deal,
(in order to get 10 instead of 9 after the moves A1B2 or A2B2), then it is better for A to take the
move A2, because after B does B2 and C reneges and does C2, the score will be (8 0 10), so A will get 8,
the best available score here.

Finally, under regime e, the best move for A is A4: A can convince B to play B1 after A plays A4, because the
score will then be (9  5  0), the best result for both A and B. B here agrees to give up
on the result of 10 so that A will agree to play A4 (and A gets 9).
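The regime-a and regime-b computations can be verified mechanically. The sketch below (function names are ours) encodes the game tree above and evaluates it both max-n style (each player maximizes its own payoff component, ties broken in favor of the first-listed move, which matches choosing B1 over B3) and under the paranoid assumption (B and C jointly minimize A's score).

```python
# The game tree from the answer: A's move -> B's move -> C's move -> (A, B, C).
TREE = {
    "A1": {"B1": {"C1": (7, 2, 5), "C2": (0, 1, 2)},
           "B2": {"C1": (4, 3, 9), "C2": (4, 0, 10)},
           "B3": {"C1": (4, 2, 5)}},
    "A2": {"B1": {"C1": (2, 2, 5), "C2": (0, 0, 2)},
           "B2": {"C1": (4, 3, 9), "C2": (8, 0, 10)}},
    "A3": {"B1": {"C1": (6, 1, 0)}, "B3": {"C1": (4, 0, 0)}},
    "A4": {"B1": {"C1": (9, 5, 0)}, "B2": {"C1": (0, 10, 0)}},
    "A5": {"B1": {"C1": (5, 0, 0)}},
}

def maxn(tree):
    """Regime a: each player maximizes its own payoff component;
    ties are broken in favor of the first-listed move."""
    best = None
    for a, b_moves in tree.items():
        b_best = None
        for b, c_moves in b_moves.items():
            c_pick = max(c_moves.values(), key=lambda p: p[2])   # C maximizes C
            if b_best is None or c_pick[1] > b_best[1]:          # B maximizes B
                b_best = c_pick
        if best is None or b_best[0] > best[1][0]:               # A maximizes A
            best = (a, b_best)
    return best

def paranoid(tree):
    """Regime b: B and C jointly minimize A's payoff."""
    best = None
    for a, b_moves in tree.items():
        worst = min(p[0] for c_moves in b_moves.values() for p in c_moves.values())
        if best is None or worst > best[1]:
            best = (a, worst)
    return best
```

Here maxn(TREE) yields ("A1", (7, 2, 5)) and paranoid(TREE) yields ("A5", 5), matching regimes a and b above; regimes c-e involve deals and are not captured by either evaluator.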


4)  (Game-tree search - alpha-beta pruning)
   a) Give an example of a 3-ply game tree (branching factor 2)
      where alpha-beta pruning saving is maximal. How many nodes are pruned?

ANSWER: A 3-ply game tree with branching factor 2 has 8 terminal nodes. We will
      let the values be integers from 1 through 8. Ordering them so that the best move
      is examined first guarantees maximal pruning: 3 of the 8 terminal nodes are pruned
      (those marked "pruned" below).
      We present the best order first (top to bottom below). We will call the moves by A: A1 and A2, etc. Bounds are marked,
      e.g. 8+ means the value of this node is 8 or more, exact value unknown. X means nothing is
      known of this value (using alpha-beta in this order), and the value is irrelevant.

MAX     MIN      MAX 

A1 6  -- B1  6  -- A1   6
    \         \
     \         --- A2   5
      \               
       - B2  8+ -- A1   8
              \      
               --- A2   pruned

A2  2- -- B1 2- -- A1   2
     \        \        
      \        --- A2   1
       \               
        - B2  X--- A1   pruned
              \        
               --- A2   pruned
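A standard alpha-beta implementation confirms the count. In this sketch (ours), the tree above is encoded as nested lists, with the three pruned leaves assigned the remaining values 7, 4, 3 of 1..8 (an assumption; their values are irrelevant to the result). It examines only 5 of the 8 terminal nodes, pruning 3.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"), visited=None):
    """Minimax with alpha-beta pruning; `visited` collects the leaves examined,
    in order, so pruning can be observed directly."""
    if visited is None:
        visited = []
    if isinstance(node, int):          # leaf: record and return its value
        visited.append(node)
        return node, visited
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        v, _ = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if alpha >= beta:              # cutoff: remaining siblings are pruned
            break
    return best, visited

# The 3-ply tree from the answer; pruned leaves given the leftover values 7, 4, 3.
TREE = [[[6, 5], [8, 7]], [[2, 1], [4, 3]]]
```

Running alphabeta(TREE, True) returns the minimax value 6 with visited leaves [6, 5, 8, 2, 1]: the leaf 7 is cut off after the 8, and the whole B2 subtree under A2 (leaves 4, 3) is cut off once B1 returns 2 against alpha=6, exactly as in the diagram.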
                      
   b) Suppose that we know ahead of time that the terminal values can only
      be integers between -10 and 10. Is there a case where alpha-beta can save even
      more search than the best case above (show it, or prove that it cannot help).

ANSWER: Once there is a strategy guaranteeing 10 for MAX, everything else can be pruned, e.g.

MAX     MIN      MAX 

A1 10  -- B1 10  -- A1  10
    \         \
     \         --- A2   pruned
      \               
       - B2  10 -- A1   10
              \      
               --- A2   pruned

A2  X  -- B1 X  -- A1   pruned
     \        \        
      \        --- A2   pruned
       \               
        - B2  X--- A1   pruned
              \        
               --- A2   pruned
                      

   c) Provide an example of a 2-ply + 1 chance nodes level game tree where
      one can apply an adapted alpha-beta to prune some nodes.

ANSWER: Can only be done if values are bounded. Assume bounds are [-10, 10].
MAX  CHANCE    MIN  

A1 5  -- C1 10  -- 10
    \  (0.5)  \ 
     \         --- 10 
      \               
       - C2  0  --  0 
        (0.5) \       
               ---  0 
                      
A2 4-  - C1 -2- -- -2
     \  (0.5) \       
      \        --- -4   pruned
       \              
        - C2 X --- 10   pruned 
        (0.5) \       
               ---  6   pruned

That is, the score of A2 cannot be higher than 0.5*(-2) + 0.5*10 = 4, so there is no point in looking
further; MAX must play A1. However, if the probabilities of the chance node after A2 were (0.1, 0.9),
then the bound would be 0.1*(-2) + 0.9*10 = 8.8 > 5, so it would still be possible for the value of A2
to be higher than 5, and we could not prune.
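The pruning argument at the chance node is just arithmetic on bounds. A tiny sketch (ours, assuming terminal values are known to be bounded above by V_MAX = 10):

```python
V_MAX = 10   # assumed known upper bound on terminal values

def chance_upper_bound(resolved, remaining_probs):
    """Upper bound on a chance node's value: `resolved` holds (probability,
    upper bound) pairs for children already examined; every unexamined child
    (probabilities in `remaining_probs`) is bounded above by V_MAX."""
    return (sum(p * v for p, v in resolved)
            + sum(p * V_MAX for p in remaining_probs))

# After move A2, the first chance child (p=0.5) is a MIN node known to be <= -2:
assert chance_upper_bound([(0.5, -2)], [0.5]) == 4.0   # 4 < 5, so A2 is pruned
# With probabilities (0.1, 0.9), the bound exceeds 5, so no pruning is possible:
assert chance_upper_bound([(0.1, -2)], [0.9]) > 5
```

An expectiminimax searcher would compare this bound against the best MAX value found so far (5, from A1) and cut off the A2 subtree exactly when the bound falls below it.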
                     
5) (Propositional logic)
   For each of the following sentences, determine whether they are
   satisfiable, unsatisfiable, and/or valid. For each sentence
   determine the number of models (over the propositional
   variables in the sentence). Bonus question: in case a, also trace the run of the DPLL algorithm for satisfiability with this
   formula as input (i.e. explain any recursive calls and cases encountered).
   a) (A or not B or D) and (not A or B) and (not A or not B or D)
   b) (A or B or C or D or E)
   c) (A and not A) and (Not B)
   d) (A and (A -> B) and (B -> C)) -> not C
   e) not ((A and (A -> B) and (B -> C)) -> C)
   f) A -> not A
   g) (A and (A -> B) and (B -> C)) -> C

ANSWER:
   a) Satisfiable, 4 models: A=B=D=true, and A=false with B and D taking any values except B=true and D=false.
   b) Satisfiable, 31 models (everything except all false).
   c) Unsatisfiable.
   d) Satisfiable, 7 models (all except A=B=C=true)
   e) Unsatisfiable, since it is the negation of a valid sentence (see g).
   f) Satisfiable, one model (A=false).
   g) Valid (a tautology).
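For the bonus, the DPLL run on formula (a) can be illustrated with a minimal DPLL sketch (ours). On this formula there is no unit clause, but D occurs only positively, so the pure-literal rule sets D=true, satisfying clauses 1 and 3; in the remaining clause A occurs only negatively, so A=false, and everything is satisfied with no splitting at all. A brute-force count confirms the 4 models.

```python
from itertools import product

# Formula (a) as clauses of (variable, polarity) literals:
# (A or not B or D) and (not A or B) and (not A or not B or D)
CLAUSES = [[("A", True), ("B", False), ("D", True)],
           [("A", False), ("B", True)],
           [("A", False), ("B", False), ("D", True)]]

def dpll(clauses, assignment=None):
    """Plain DPLL: simplify, then unit propagation, pure-literal rule, splitting."""
    if assignment is None:
        assignment = {}
    simplified = []
    for clause in clauses:
        if any(assignment.get(v) == pol for v, pol in clause):
            continue                      # clause already satisfied
        rest = [(v, pol) for v, pol in clause if v not in assignment]
        if not rest:
            return None                   # empty clause: backtrack
        simplified.append(rest)
    if not simplified:
        return assignment                 # satisfying assignment found
    for clause in simplified:             # unit propagation
        if len(clause) == 1:
            v, pol = clause[0]
            return dpll(clauses, {**assignment, v: pol})
    polarity = {}                         # pure-literal rule
    for clause in simplified:
        for v, pol in clause:
            polarity.setdefault(v, set()).add(pol)
    for v, pols in polarity.items():
        if len(pols) == 1:
            return dpll(clauses, {**assignment, v: pols.pop()})
    v = simplified[0][0][0]               # split on the first unassigned variable
    return (dpll(clauses, {**assignment, v: True})
            or dpll(clauses, {**assignment, v: False}))

def count_models(clauses, variables):
    """Brute-force model count over the given variables."""
    count = 0
    for values in product([False, True], repeat=len(variables)):
        asg = dict(zip(variables, values))
        if all(any(asg[v] == pol for v, pol in c) for c in clauses):
            count += 1
    return count
```

Here dpll(CLAUSES) returns the partial assignment D=true, A=false (B is left free, covering two of the four models), and count_models(CLAUSES, ["A", "B", "D"]) returns 4, matching answer (a).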

6) (FOL and situation calculus)
   We need to represent and reason within the scenario defined by assignment 1.
   For simplicity, we will ignore the issue of action costs, and assume only one refugee agent
   and that Hungarian police do not move.
   
   a) Write down the axioms for the behavior of the world using FOL and
      situation calculus.

      Use the predicates:
        Edge(e, v1, v2) ; e is an edge in the graph between v1 and v2.
        Goal(v)         ; Vertex v is a goal site.
      Also use the fluents (situation-dependent predicates)
        Loc(v,s)        ; Agent is at vertex v in situation s
        Carrying(f,s)   ; Agent is carrying f units of food in situation s.
        Food(v,f, s)    ; There are f units of food at vertex v in situation s.
        Police(v, s)    ; Police presence at vertex v in situation s.

     Constants: as needed, for the edges and vertices, and simple actions (noop).

     Functions: 
         weight(e)       ; Function denoting the weight of edge e
         traverse(e)     ; Denotes a traverse move action across edge e (may be ineffective)
     and of course the "Result" function:
         Result(a,s)    ; Denotes the situation resulting from doing action a in
                        ; situation s.
     You may wish to add auxiliary predicates or fluents in order to simplify the axioms.
     In this assignment you will need arithmetic functions and predicates, be sure to define them formally! 
     You are allowed to do so provided you explain why they are needed.


   Let the facts representing the scenario be:
Edge(E1, V0, V1)
Edge(E2, V1, V2)
Edge(E3, V0, V2)
Edge(E4, V1, V3)
Goal(V2)
weight(E1)=1
weight(E2)=4
weight(E3)=2

; with situation S0 fluents being (note that this is a partial, "closed world assumption" description).
Loc(V0, S0)
Police(V3, S0)
Food(V1,3,S0)
Food(V0,0,S0)
Food(V2,0,S0)
Carrying(1,S0)
ANSWER: We will add predicate Alive(s) denoting that the agent is alive. We will also need the arithmetic predicate geq (greater or equal) and the arithmetic function +. Axioms for Alive are: 1: forall a, s, not Alive(s) => not Alive(Result(a, s)) 2: forall a, s, Alive(s) and OK(a, s) => Alive(Result(a, s)) 3: forall a, s, not OK(a, s) => not Alive(Result(a, s)) 4: Alive(S0) Where OK is an auxialiary predicate to be defined below. Actions are noop (a constant) and traverse(e). Note that syntactically traverse is a function symbol. The rules for OK are 5: all OK for a noop if carrying at least 1 food. 6: OK to traverse if have sufficient food and no police. 5: forall s, f, Alive(s) and Carrying(f, s) and geq(f,1) => OK(noop, s) 6: forall s, e, f, v1, v2, (Edge(e, v1, v2) and not Police(v1, s) and not Police (v2, s) and Alive(s) and Carrying(f, s) and geq(f,weight(e))) => OK(travese(e), s) Now the rules for food consumption for noop: 7: forall s, f, Carrying(f+1, s) and OK(noop, s) => Carrying(f, Result(noop, s)) The latter assues NO FOOD PICKUP and will need to be fixed later! The rules for traversal: 8: forall s, e, v1, v2 Alive(s) and OK(traverse(e),s) and Edge(e, v1, v2) and Loc (v1, s) => Loc(v2, Result(traverse(e), s)) 9: forall s, e, v1, v2 Alive(s) and OK(traverse(e),s) and Edge(e, v1, v2) and Loc (v2, s) => Loc(v1, Result(traverse(e), s)) And for food pickup: 10: forall s, e, v1, v2 Alive(s) and OK(traverse(e),s) and weight(e)= f2 and carrying(f+f2, s) and Edge(e, v1, v2) and Loc (v1, s) and Food(v2, f1, s) => Food(v2, 0, Result(traverse(e), s)) and Carrying(f+f1, Result(travese(e), s)) 11: forall s, e, v1, v2 Alive(s) and OK(traverse(e),s) and weight(e)= f2 and carrying(f+f2, s) and Edge(e, v1, v2) and Loc (v2, s) and Food(v1, f1, s) => Food(v1, 0, Result(traverse(e), s)) and Carrying(f+f1, Result(travese(e), s)) We also need to add facts due to "closed world assumption", e.g. 
that there is NO police at any other node: 12a: not Police(V0, S0) 12b: not Police(V1, S0) 12c: not Police(V2, S0) Note how this is inelegant because all axioms are repeated for each possible (source, destination) order of the vertices! We must also axiomitize illegal actions, but will ignore it as this will not be needed for the proofs. Must also axiomatize "geq" and the "+" function. We can also go for the partial, finite axiomatization by adding the facts, with SOME general axioms such as transitivity of geq (we will not number them): forall x, y, z geq(x,y) and geq(y,z) => geq(x,z) forall x, y x+y => y+x forall x, y x=y => y=x forall x, y, z x=y and y=z => x=z etc. Now some basic arithmetic facts: 0=0 1=1 2=2 3=3 not 0=1 not 1=2 not 2=3 0+1=1 1+2=3 geq(1,0) geq(2,1) geq(3,2) etc. (For people using the integer successor function, you can write: forall x geq(S(x),x) this takes care of the geq predicate, together with the above transitivity axiom for geq) Also need all axioms for equality such as: forall x,y,z geq(x, y) and z=y => geq(x,z) b) Now, we need to find a sequence of actions that will result in reaching the goal from S0, and prove a theorem that it will do so. Do the following steps: A) Convert the knowledge base to conjunctive normal form. (This is very long, so you may omit the parts not needed in the proof below). ANSWER: 1: not needed, so we skip it. 
2: not Alive(s) or not OK(a, s) or Alive(Result(a, s)) 3: not needed, skipped 4: Alive(S0) 5: not needed (as we are not doing noop), skipped 6: not Edge(e, v1, v2) or Police(v1, s) or Police (v2, s) or not Alive(s) or not Carrying(f, s) or not geq(f,weight(e)) or OK(traverse(e), s) 7: not needed, skipped 8: not Alive(s) or not OK(traverse(e),s) or not Edge(e, v1, v2) or not Loc (v1, s) or Loc(v2, Result(traverse(e), s)) 9: not Alive(s) or not OK(traverse(e),s) or not Edge(e, v1, v2) or not Loc (v2, s) or Loc(v1, Result(traverse(e), s)) And for food pickup: 10a: not Alive(s) or not OK(traverse(e),s) or not weight(e)= f2 or not carrying(f+f2, s) or not Edge(e, v1, v2) or not Loc (v1, s) or not Food(v2, f1, s) or Food(v2, 0, Result(traverse(e), s)) 10b: not Alive(s) or not OK(traverse(e),s) or not weight(e)= f2 or not carrying(f+f2, s) or not Edge(e, v1, v2) or not Loc (v1, s) or not Food(v2, f1, s) or Carrying(f+f1, Result(travese(e), s)) 11a: not Alive(s) or not OK(traverse(e),s) or not weight(e)= f2 or not carrying(f+f2, s) or not Edge(e, v1, v2) or not Loc (v2, s) or not Food(v1, f1, s) or Food(v1, 0, Result(traverse(e), s)) 11b: not Alive(s) or not OK(traverse(e),s) or not weight(e)= f2 or not carrying(f+f2, s) or not Edge(e, v1, v2) or not Loc (v2, s) or not Food(v1, f1, s) or Carrying(f+f1, Result(travese(e), s)) B) Formally list what we need to prove (a theorem), and state its negation. ANSWER: As seen in the answer to question 1, the agent needs to traverse E1 once in each direction, and then traverse E3. We want to show that the agent has reached the goal node V2 alive as a result of these actions, i.e. 
we need to prove: Q: Loc(V2, Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0))) and Alive( Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0))) The negation is: Qi: not (Loc(V2, Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0))) and Alive( Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0)))) In order to convert it to CNF, we only need to move the negation inward using De-Morgan, and a single CNF clause: Q': not Loc(V2, Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0))) or not Alive( Result(traverse(E3), Result(traverse(E1), Result(traverseE1), S0))) C) Use resolution to reach a contradiction, thus proving the theorem. At each step of the proof, specify the resolvents and the unifier. ANSWER: There are numerous ways to do it. One example below. Resolve 4 with 8, substitution {s|S0}, to get: 100: not OK(traverse(e),S0) or not Edge(e, v1, v2) or not Loc (v1, S0) or Loc(v2, Result(traverse(e), S0)) Resolve 100 with "Edge(E1, V0, V1)", substitution {e|E1, v1|V0, v2|V1} to get: 101: not OK(traverse(E1),S0) or not Loc (V0, S0) or Loc(V1, Result(traverse(E0), S0)) Resolve 101 with "Loc(V0, S0)", null substitution, to get: 102: not OK(traverse(E1),S0) or Loc(V1, Result(traverse(E0), S0)) Resolve 4 with 6, substitution {s|S0}, to get: 103: not Edge(e, v1, v2) or Police(v1, S0) or Police (v2, S0) or not Carrying(f, S0) or not geq(f,weight(e)) or OK(travese(e), S0) Resolve "Edge(E1, V0, V1)" with 103, substitution {e|E1, v1|V0, v2|V1} to get: 104: Police(V0, S0) or Police (V1, S0) or not Carrying(f, S0) or not geq(f,weight(E1)) or OK(traverse(E1), S0) Resolve 12a with 104, null substitution to get: 105: Police (V1, S0) or not Carrying(f, S0) or not geq(f,weight(E1)) or OK(traverse(E1), S0) Resolve 12b with 105, null substitution, to get: 106: not Carrying(f, S0) or not geq(f,weight(E1)) or OK(traverse(E1), S0) Resolve "Carrying(1,S0)" with 106, substitution {f|1}, to get: 107: not 
geq(1,weight(E1)) or OK(traverse(E1), S0) Now we resolve one of the equality axioms: "not geq(x, y) or not z=y or geq(x,z)" with "Weight(E1)=1", substitution {z|Weight(E1),y|1} to get: 108: not geq(x, 1) or geq(x,Weight(E1)) Now resolve "geq(1,1)" with 108, substitution {x|1}, to get: 109: geq(1,Weight(E1)) Resolve 109 with 107, empty substitution, to get, finally, something useful "The action of traversing E1 in situation S0 is OK": 110: OK(traverse(E1), S0) From which we quickly get, by resolving 110 with 102, nul substitution: 111: Loc(V1, Result(traverse(E0), S0)) And likewise, resolving 110 with 2, substitution {a|traverse(E1), s|S0} we get: 112: not Alive(S0) or Alive(Result(traverse(E1), S0)) Finally, resolving 4 with 112, null substitution, we get: 113: Alive(Result(traverse(E1), S0)) Now to prove the change of carried food and food at V1: Resolve 10b with 4, substitution {s|S0} to get: 114: not OK(traverse(e),S0) or not weight(e)= f2 or not carrying(f+f2, S0) or not Edge(e, v1, v2) or not Loc (v1, S0) or not Food(v2, f1, S0) or Carrying(f+f1, Result(travese(E1), S0)) Resolve 114 with 110, substitution {e|E1}, to get: 115: not weight(E1)= f2 or not carrying(f+f2, S0) or not Edge(E1, v1, v2) or not Loc (v1, S0) or not Food(v2, f1, S0) or Carrying(f+f1, Result(travese(E1), S0)) Resolve 115 with "Weight(E1)=1", substitution {f2|1} to get: 116: carrying(f+1, S0) or not Edge(E1, v1, v2) or not Loc (v1, S0) or not Food(v2, f1, S0) or Carrying(f+f1, Result(travese(E1), S0)) Resolve 116 with "Edge(E1,V0,V1)", substitution {v1|V0, v2|V1} to get: 117: Carrying(f+1, S0) or not Loc (V0, S0) or not Food(V1, f1, S0) or Carrying(f+f1, Result(travese(E1), S0)) Resolve 117 with "Loc(V0, S0)", null substitution, to get: 118: Carrying(f+1, S0) or not Food(V1, f1, S0) or Carrying(f+f1, Result(travese(E1), S0)) Resolve 118 with "Food(V1,3,S0)", substitution {f1|3} to get: 119: Carrying(f+1, S0) or Carrying(f+3, Result(travese(E1), S0)) Resolve addition axiom "0+1=0" with 
equality axiom "not Carrying(x, s) or not y=x or Carrying(y, s)", substitution
{x|0+1, y|1}, to get:
120: not Carrying(1, s) or Carrying(0+1, s)
Resolve "Carrying(1, S0)" with 120, substitution {s|S0}, to get:
121: Carrying(0+1, S0)
Resolve 119 with 121, substitution {f|0}, to get:
122: Carrying(0+3, Result(traverse(E1), S0))
We still need a few steps if we actually want to get the equivalent:
125: Carrying(3, Result(traverse(E1), S0))

Similar steps using 10a will allow us to conclude that the traversal also deletes the
food from V1:
140: Food(V1, 0, Result(traverse(E1), S0))

We have now proved much of the needed information about the situation
Result(traverse(E1), S0). But we still need to show that no police has "magically"
materialized, and that the amount of food at every other node is unchanged (frame
axioms!). Without that we will run into trouble later! We need frame axioms such as:
police are not spawned at a node, for any action. In CNF, this is:

NP: Police(v, s) or not Police(v, Result(a, s))

The food frame axiom should state that the food at a node stays the same except for a
successful traverse action into that node (not shown).

The rest of the proof is similar (so we are skipping many steps here). Using a similar
sequence of steps, we will need to prove that:
150: OK(traverse(E1), Result(traverse(E1), S0))
and that:
160: Alive(Result(traverse(E1), Result(traverse(E1), S0)))
and that:
170: Carrying(2, Result(traverse(E1), Result(traverse(E1), S0)))
as well as the facts that the police have STILL not spawned at V0, V1, V2 and that there
is STILL zero food at these nodes, using the added frame axioms.
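To make the mechanics of the steps above concrete, here is a minimal Python sketch of a single lifted resolution step, computing the unifier and the resolvent for the step that resolved 100 with "Edge(E1, V0, V1)" to obtain 101. The tuple-based encoding of terms and literals is our own illustration, not part of the assignment, and the occurs check is omitted for brevity:

```python
# Lifted binary resolution sketch (illustrative encoding, not the assignment's).
# A term is ('var', name), a constant string, or (functor, arg1, ...).
# A literal is a (positive, atom) pair; a clause is a list of literals.

def walk(t, theta):
    """Follow variable bindings in substitution theta to a representative term."""
    while isinstance(t, tuple) and t[0] == 'var' and t in theta:
        t = theta[t]
    return t

def unify(x, y, theta):
    """Return a most general unifier extending theta, or None on failure."""
    if theta is None:
        return None
    x, y = walk(x, theta), walk(y, theta)
    if x == y:
        return theta
    if isinstance(x, tuple) and x[0] == 'var':
        return {**theta, x: y}
    if isinstance(y, tuple) and y[0] == 'var':
        return {**theta, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
        return theta
    return None

def subst(t, theta):
    """Apply substitution theta to a term."""
    t = walk(t, theta)
    if isinstance(t, tuple) and t[0] != 'var':
        return (t[0],) + tuple(subst(a, theta) for a in t[1:])
    return t

def resolve(c1, c2):
    """Return all binary resolvents of c1 and c2, each with its unifier."""
    out = []
    for i, (s1, a1) in enumerate(c1):
        for j, (s2, a2) in enumerate(c2):
            if s1 != s2:  # complementary signs; try to unify the atoms
                theta = unify(a1, a2, {})
                if theta is not None:
                    rest = [l for k, l in enumerate(c1) if k != i] + \
                           [l for k, l in enumerate(c2) if k != j]
                    out.append(([(s, subst(a, theta)) for (s, a) in rest], theta))
    return out

# The step "resolve 100 with Edge(E1, V0, V1)": unify Edge(e, v1, v2) with the fact.
e, v1, v2 = ('var', 'e'), ('var', 'v1'), ('var', 'v2')
c100 = [(False, ('OK', ('traverse', e), 'S0')),
        (False, ('Edge', e, v1, v2)),
        (False, ('Loc', v1, 'S0')),
        (True,  ('Loc', v2, ('Result', ('traverse', e), 'S0')))]
fact = [(True, ('Edge', 'E1', 'V0', 'V1'))]
resolvent, theta = resolve(c100, fact)[0]
print(theta)      # the unifier binding e, v1, v2 to E1, V0, V1
print(resolvent)  # clause 101, with e, v1, v2 instantiated
```

Running this prints the unifier {e|E1, v1|V0, v2|V1} and the three remaining literals of clause 101, matching the hand derivation above.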
After we have all that, we now need to show:
180: OK(traverse(E3), Result(traverse(E1), Result(traverse(E1), S0)))
And then we can easily prove:
190: Alive(Result(traverse(E3), Result(traverse(E1), Result(traverse(E1), S0))))
200: Loc(V2, Result(traverse(E3), Result(traverse(E1), Result(traverse(E1), S0))))

Finally, we can resolve Q' with 190, and then the result with 200, to get a
contradiction (the empty clause).

c) Would there be a potential problem in the proof if we did not have "frame axioms" in
   our representation (i.e. stating what did not change), and if so, what?

ANSWER: Indeed, without the frame axioms we would not be able to proceed to prove 150,
or in fact anything about the situations beyond the result of the first action.

d) Would we be able to do the proof using only forward chaining?

ANSWER: We used axiom 6 in the proof, and it MUST be used. It is NOT in Horn form, so
neither pure forward chaining nor PURE backward chaining can complete the proof. But in
"impure" backward chaining schemes, such as Prolog, the proof would work due to negation
by failure. Axiom 6 would be written in Prolog as:

   ok(traverse(E), S) :-
       edge(E, V1, V2), alive(S), carrying(F, S), geq(F, weight(E)),
       not police(V1, S), not police(V2, S).

and the "not police()" goals will succeed due to negation by failure. (Note that
variables in Prolog need to start with capital letters, and predicates with lower case.)

Justify all answers shortly!
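The closed-world behavior of "not police()" can be sketched in a few lines of Python, mirroring the Prolog clause. The fact base and helper names below are hypothetical illustrations, not part of the assignment's knowledge base:

```python
# Negation by failure under the closed-world assumption: a ground atom is
# taken to be false unless it appears as an explicit fact. Fact base and
# names are hypothetical, echoing the Prolog rendering of axiom 6.

facts = {('edge', 'E1', 'V0', 'V1'), ('alive', 'S0'), ('carrying', 1, 'S0')}
weights = {'E1': 1}

def holds(atom):
    """A ground atom holds iff it is an explicit fact (closed world)."""
    return atom in facts

def naf(atom):
    """Negation by failure: 'not P' succeeds iff P cannot be proved."""
    return not holds(atom)

def ok_traverse(e, v1, v2, s, f):
    """Python rendering of ok(traverse(E), S) :- edge(...), ..., not police(...)."""
    return (holds(('edge', e, v1, v2)) and holds(('alive', s))
            and holds(('carrying', f, s)) and f >= weights[e]
            and naf(('police', v1, s)) and naf(('police', v2, s)))

print(ok_traverse('E1', 'V0', 'V1', 'S0', 1))  # True: no police facts exist

facts.add(('police', 'V1', 'S0'))
print(ok_traverse('E1', 'V0', 'V1', 'S0', 1))  # False once a police fact appears
```

This is exactly why the Prolog version succeeds where pure chaining on the non-Horn axiom cannot: the absence of a police fact is silently treated as its negation.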

Deadline: Noon (12:00), Monday, December 14, 2015 (strict deadline).

Submissions are to be solo.