Introduction to Artificial Intelligence

Assignment 3


Homework assignment - simulators, agents, search and logic - SOLUTION

1) For each of the following domains, which type of agent is appropriate
(table lookup, reflex, goal based, or utility based):
   a) An agent that can play chess.

      Table lookup is infeasible. A reflex agent might in principle work, but is also
      impractical. Goal-based is the default way to go, but if we also want to consider
      the chess clock timing issue, a utility-based agent is needed.

   b) An agent that plays poker.

      Definitely utility-based... The agent needs to maximize long-term profit, and there
      is no specific binary-valued goal. Table-based may not be good even if feasible, as
      such an agent would be very predictable... Although in principle table-driven or
      reflex + randomization might work, this is also impractical.

   c) An agent that can imitate Saddam Hussein.

      The answer is somewhat political - some may claim "table-driven". But if we
      assume "human-like" behaviour, then a utility-based agent, though G*d only knows
      what the utility function is...

   d) An autonomous robot for guarding the border against infiltrations by hostiles.

      Again, a utility-based agent is needed in order to support reasoning about
      conflicting goals.

   EXPLAIN each choice in about 2 sentences.

2) Represent the following sentences in FOPC - define the vocabulary first.
   a) Every tripod has three equal-length legs.

    predicates:
           Part-of(x, y): x is a part of y
           Tripod(x):     x is a tripod
           Length(x):     a function denoting the length of x
           Leg(x):        x is a leg

     Forall (w)  Tripod(w) =>
       Exists (x, y, z) (Leg(x) and Leg(y) and Leg(z) and
             Part-of(x, w) and Part-of(y, w) and Part-of(z, w) and
            Length(x) = Length(y) and Length(y) = Length(z) and
            not (x=y) and not (x=z) and not (y=z))

     Note: the above does not state "exactly 3 legs". To do that, we also need to add,
     within the scope of the "exists", the following:
          ... and forall (t) (Part-of(t, w) and Leg(t)) =>
              (t=x or t=y or t=z)

   b) All animals are equal, but some are more equal than others.

      This one is strange, a word-play on the meaning of "equal". Clearly
      this is NOT using the standard meaning of equality!
      One meaning might be,  using the predicate:
         Animal(x): x is an animal
         Value(x):  function denoting value of x
         Greater(x, y):      x is greater than y

       forall (x, y) (Animal(x) and Animal(y)) => Value(x) = Value(y), and
         exists (x, y) (Animal(x) and Animal(y) and Greater(Value(x), Value(y)))

      Note that this is not a contradiction because we have no semantics for "Greater",
      but for most INTENDED semantics this would be a contradiction.

   c) All politicians can fool all stupid people some of the time.
      This one is more well defined. Predicates:
        Politician(x): x is a politician
        Can_fool(x, y, t):  x can fool y at time t
        Stupid_person(x):   x is a stupid person
        Time(t):            t is a time object

      forall (x) Politician(x) => 
        (forall (y) Stupid_person (y) => exists (t) Time(t) and Can_fool(x, y, t))

      Still, the above English is ambiguous, it could also mean:
      forall (x) Politician(x) => 
        (exists (t) Time(t) and forall (y) Stupid_person (y) => Can_fool(x, y, t))


   d) You will get this report when hell freezes over.

      Predicates:
         Receive(x, y, t):   x receives y at time t
         Report(x):          x is a report
         Report123:          a constant denoting this report
         Person123:          a constant denoting "you"

      The true meaning of the above is: you will never get this report, i.e.

      Report(Report123) and not exists (t) Receive(Person123, Report123, t)

      But a literal reading requires additional predicates, such as:
        Freeze(x, t):  object x freezes at time t
        Hell1: constant denoting "hell"

      In this case a solution might be:

      Report(Report123) and 
        forall (t) Receive(Person123, Report123, t) => Freeze(Hell1, t)

   e) The town barber shaves every man who does not shave himself.

      There are several ways of representing this. One is using predicates:
         Man(x):  x is a man
         Shaves(x, y): x shaves y
         Barber(x):    x is a barber
         Town_barber(t): function denoting the town barber of town t
         Town1:   constant denoting the town referenced by this sentence.

       Barber(Town_barber(Town1)) and
          forall (x) (Man(x) and not Shaves(x, x)) => Shaves(Town_barber(Town1), x)

3)  In the wumpus world, in situation S0, the robot is facing the wumpus
   (facing_wumpus), has an arrow (has_arrow), and does not have the gold.
    The grab action results in the robot holding the gold (has_gold).
    The shoot action results in the wumpus being dead (not alive(wumpus))
    if the robot is facing the wumpus and has an arrow.

    a) Write down the axioms for the above information for this variant of
       the wumpus world.

       A1: forall (s) has_gold(result(grab, s))         ; Result of grab action

       A2: forall (s) (facing_wumpus(s) and has_arrow(s)) =>
               not alive(wumpus, result(shoot, s))      ; Result of shoot action

       AS: facing_wumpus(S0) and has_arrow(S0) and not has_gold(S0)   ; Description of S0

    b) Can you prove that has_gold(result(grab, result(shoot, S0)))
       with your axioms? If yes, do so. If not, what is missing?

       Yes, a very short proof, as follows:

         In A1, use universal specification, with s replaced by result(shoot, S0),
         to get: has_gold(result(grab, result(shoot, S0)))


    c) Can you prove that not alive(wumpus, result(shoot, result(grab, S0)))
       with your axioms? If yes, do so. If not, what is missing?

       Not without additional frame axioms - since there is no way to prove either
       facing_wumpus(result(grab, S0)) or has_arrow(result(grab, S0)), which are needed
       in order to complete the proof. 

       In this simplified world, no action we know of can change either facing-wumpus or
       has_arrow, and this must be stated explicitly! That is, we need the frame
       axioms (or the equivalent state transition equations):

       A3: forall (s) facing_wumpus(s) => facing_wumpus(result(grab, s))

       A4: forall (s) has_arrow(s) => has_arrow(result(grab, s))

       Now, from A3 and universal specification, replacing s by S0, we get:

       T1: facing_wumpus(S0) => facing_wumpus(result(grab, S0))

       From part of AS and T1 using modus ponens we get:

       T2: facing_wumpus(result(grab, S0))

       Likewise, from A4 and universal specification, replacing s by S0, we get:

       T3: has_arrow(S0) => has_arrow(result(grab, S0))
 
       From part of AS and T3 using modus ponens we get:

       T4: has_arrow(result(grab, S0))

       Now, using universal specification on A2 with s substituted to result(grab, S0) we get:
 
        T5: (facing_wumpus(result(grab, S0)) and has_arrow(result(grab, S0))) =>
                not alive(wumpus, result(shoot, result(grab, S0)))

        Finally, using T2, T4, and T5 with modus ponens we get:

        T6: not alive(wumpus, result(shoot, result(grab, S0)))
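     The derivation above can also be checked procedurally. The following Python sketch
     (an illustration, not part of the assignment) encodes A1, A2, the frame axioms
     A3/A4, and the S0 description as recursive evaluators over nested result(...)
     terms; treating has_gold as unchanged by non-grab actions is an extra frame
     assumption not stated in A1-A4.

```python
# Python sketch of the situation-calculus reasoning above. Situations are
# nested tuples, e.g. result('grab', 'S0') == ('result', 'grab', 'S0').

def result(action, s):
    return ('result', action, s)

def facing_wumpus(s):
    if s == 'S0':
        return True                      # AS
    _, _, prev = s
    return facing_wumpus(prev)           # frame axiom A3 (no action changes it)

def has_arrow(s):
    if s == 'S0':
        return True                      # AS
    _, _, prev = s
    return has_arrow(prev)               # frame axiom A4

def has_gold(s):
    if s == 'S0':
        return False                     # AS
    _, action, prev = s
    if action == 'grab':
        return True                      # A1
    return has_gold(prev)                # extra frame assumption (not in A1-A4)

def wumpus_alive(s):
    if s == 'S0':
        return True
    _, action, prev = s
    if action == 'shoot' and facing_wumpus(prev) and has_arrow(prev):
        return False                     # A2
    return wumpus_alive(prev)

print(has_gold(result('grab', result('shoot', 'S0'))))       # True  (part b)
print(wumpus_alive(result('shoot', result('grab', 'S0'))))   # False (part c)
```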


    d) Convert the knowledge base (after fixes) into CNF.

       A1 simply becomes: 
         A1': has_gold(result(grab, s))

       A2 becomes:
         A2': not facing_wumpus(s1) or not has_arrow(s1) or not alive(wumpus, result(shoot, s1))

       AS becomes 3 separate clauses:

       AS1: facing_wumpus(S0)
       AS2: has_arrow(S0)
       AS3: not has_gold(S0)

       Frame axiom A3 becomes:

       A3': not facing_wumpus(s2) or facing_wumpus(result(grab, s2))

       Frame axiom A4 becomes:

       A4': not has_arrow(s3) or has_arrow(result(grab, s3))

        (note the renaming of variables in order to avoid conflicts)

    e) Perform the proof of b and c formally using resolution refutation proof.

       To prove b, we temporarily assert the negation of b, i.e. add:

       G1: not  has_gold(result(grab, result(shoot, S0)))

       G1 is already a single clause, so we can proceed. 

       Now, resolve G1 with A1', using the substitution {s /  result(shoot, S0)}, this
       results in an empty clause (a contradiction), so we are done!

       The proof of c is a bit longer. Temporarily assert the negation of c, i.e. add:

       G2: alive(wumpus, result(shoot, result(grab, S0)))

       Since G2 is already a single clause, we can proceed. Resolve G2 with A2', with
       substitution {s1 / result(grab, S0)} to get:

       T1':  not facing_wumpus(result(grab, S0)) or  not has_arrow(result(grab, S0))

       Now, resolve T1' with A3' with {s2 / S0} to get:

       T2':  not facing_wumpus(S0) or not has_arrow(result(grab, S0))

       T2' resolves with AS1 with the null substitution to get:

       T3': not has_arrow(result(grab, S0))

        We now resolve T3' with A4', substitution {s3 / S0}, to get:

        T4': not has_arrow(S0)

        Finally, T4' resolves with AS2 with the null substitution to get the empty clause.
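     The refutation for part c can be replayed mechanically. In the Python sketch below
     (illustrative only), the substitutions listed above are applied by hand, so plain
     ground resolution over frozensets of (sign, atom) literals suffices:

```python
# Ground resolution sketch for the refutation of part c. The substitutions
# from the text are applied by hand, so clauses are just frozensets of
# (sign, atom) literals over ground terms.

def resolvents(c1, c2):
    """All clauses obtainable by resolving c1 with c2 on one literal."""
    out = []
    for sign, atom in c1:
        comp = ('-' if sign == '+' else '+', atom)
        if comp in c2:
            out.append((c1 - {(sign, atom)}) | (c2 - {comp}))
    return out

RG = 'result(grab,S0)'   # the ground term result(grab, S0)
A2p = frozenset({('-', f'facing_wumpus({RG})'), ('-', f'has_arrow({RG})'),
                 ('-', f'alive(wumpus,result(shoot,{RG}))')})   # A2' {s1/result(grab,S0)}
A3p = frozenset({('-', 'facing_wumpus(S0)'),
                 ('+', f'facing_wumpus({RG})')})                # A3' {s2/S0}
A4p = frozenset({('-', 'has_arrow(S0)'),
                 ('+', f'has_arrow({RG})')})                    # A4' {s3/S0}
AS1 = frozenset({('+', 'facing_wumpus(S0)')})
AS2 = frozenset({('+', 'has_arrow(S0)')})
G2  = frozenset({('+', f'alive(wumpus,result(shoot,{RG}))')})   # negated goal

T1 = resolvents(G2, A2p)[0]    # not facing_wumpus(RG) or not has_arrow(RG)
T2 = resolvents(T1, A3p)[0]    # not facing_wumpus(S0) or not has_arrow(RG)
T3 = resolvents(T2, AS1)[0]    # not has_arrow(RG)
T4 = resolvents(T3, A4p)[0]    # not has_arrow(S0)
print(resolvents(T4, AS2)[0] == frozenset())   # True: empty clause, refutation done
```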

    Predicates are: has_gold(situation), has_arrow(situation), alive(who, situation)
                    facing_wumpus(situation)
    Actions are: grab, shoot


4) We have a search problem where the cost of each operator is 1. The branching
   factor is bounded by b. We have a heuristic evaluation function h(n) that
   is admissible, and such that h(n) >= h*(n) - 1  (that is, it is never off by more than 1).
   We use h(n) as a search heuristic in the A* algorithm.

   a) Suppose that there is only one solution to the problem. How many nodes will be
      expanded by A* using h(n) above as a heuristic, in the worst case, if
      h*(initial_state) = d  ?

    Observe that for any node n on the optimal solution path, due to the admissibility of h,
    f(n) is at most d (and no less than d-1, since h(n) is off by at most 1).

   The number of expansions is d (not including the final goal node).
    Proof: consider a node n that is not on the solution path. Such a node has
    h*(n) = infinity, since it does not lead to a solution, and since h is off by at
    most 1, h(n) must be infinite as well.
    Thus, only nodes along the (only) path to the solution are expanded.

   b) How does the above change if there is more than one solution, but any other solution
      has path cost at least 2 greater than d?

   Again, the number of expansions is d (not including the final goal node).
   Proof:  consider a node n that is not on the optimal solution path. Again, h(n)
   must be infinite, and these nodes will not be expanded. For all nodes n in
   a NON-optimal solution path, it must be the case that f(n) is at least d+1.
   Therefore, these nodes will not be expanded at all, and the answer is as before.

   c) Same as in (b), but there is also ONE solution with cost d+1.
     
   In this case, nodes in solution paths with cost d+1 can have an f(n)=d,
   but NOT the goal node of the alternate solution. Therefore, the number of expansions
   is 2*d-1 in the worst case (the initial state must be in both paths).


5) Consider a 3-person complete information game between players A, B, and C,
   playing in turn (first A, then B, then C, and then the game ends).
   The game tree is defined as follows:

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
   A2   B1   C1     (90, 0, 80)
             C2     (0, 20, 90)
   A3   B1   C1     (90, 0, 20)
             C2     (80, 30, 0)
        B2   C1     (80, 0, 40)
             C2     (0, 0, 0)
        B3   C1     (30, 20, 50)
   A4   B1   C1     (0, 90, 50)
             C2     (20, 70, 70)
        B2   C1     (100, 100, 0)
             C2     (95, 95, 1)
   
   Assume that the scores theoretically achievable by each player are
   all between 0 and  100.
   a) What is the best play for A, assuming that each player is out to improve
      its own score, and there is no communication between players?

        Simply use the MAX-MAX-MAX algorithm to evaluate the nodes of
        the tree:

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
        B1 =  (100, 5, 20)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
        B2 = (20, 0, 50)
    A1 = (100, 5, 20)

     Note that at this point A knows that choosing A1 will give a score
     of 100 for A, so there is no point looking elsewhere!     
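      The MAX-MAX-MAX computation can be spelled out as a short Python sketch
      (illustrative; the tree is transcribed from the table above, with player
      indices 0, 1, 2 standing for A, B, C):

```python
# MAX-MAX-MAX evaluation: each player maximizes its own component of the
# score vector. Tree transcribed from the table above; indices 0, 1, 2 = A, B, C.

tree = {
  'A1': {'B1': {'C1': (100, 5, 20), 'C2': (50, 10, 0)},
         'B2': {'C1': (10, 20, 30), 'C2': (20, 0, 50)}},
  'A2': {'B1': {'C1': (90, 0, 80), 'C2': (0, 20, 90)}},
  'A3': {'B1': {'C1': (90, 0, 20), 'C2': (80, 30, 0)},
         'B2': {'C1': (80, 0, 40), 'C2': (0, 0, 0)},
         'B3': {'C1': (30, 20, 50)}},
  'A4': {'B1': {'C1': (0, 90, 50), 'C2': (20, 70, 70)},
         'B2': {'C1': (100, 100, 0), 'C2': (95, 95, 1)}},
}

def value(node, player):
    if isinstance(node, tuple):          # leaf: (A, B, C) score vector
        return node
    return max((value(child, player + 1) for child in node.values()),
               key=lambda v: v[player])  # player maximizes its own component

best = max(tree, key=lambda move: value(tree[move], 1)[0])
print(best, value(tree[best], 1))   # A1 (100, 5, 20)
```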

 b) If players can reach agreements, will this make any difference
      (answer separately for each different type of agreement)?

       If B and C can reach an agreement, then in A1 they can both get better
       scores than (5, 20). The agreement will be for B to move B2 and for
       C to move C1, resulting in (10, 20, 30). Since neither B nor C can do
       as well without the agreement, it is to the advantage of BOTH (note that
       for B to move B2 with no agreement is a serious error, as then
       C will want to move C2 and B will get 0!).  As a result, A should not
       move A1. If B and C can reach agreements, it is better for A to move
       A4, where there is no agreement possible between B and C that will
       improve both their scores, and the result will be (95, 95, 1),
       because B will move B2 to get 95 or 100, and C will move C2 to get 1
       rather than 0.

      Note that other 2-player agreements are useless for A, as with no
      agreements A gets 100. However, if there are competing agreements,
      the issue gets complicated, and beyond the scope of the exercise!

    c) How will the answer change if A is paranoid (i.e. believes that the other
       players will act to minimize A's score)?

      If A is paranoid, it can treat B-C as a super opponent that has a
      combined move. The best choice is then A1, because that is the only
      branch where A is guaranteed a score better than 0.

   d) Suppose that C always plays randomly with uniform probability. What
      is the best action by A now?

      This time, we can simply use EXPECTI-MINI-MAX to evaluate the tree,
      to get: 

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
         B1 = (75, 7.5, 10)           (average due to random move by C)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
         B2 = (15, 10, 40)
    A1 = (15, 10, 40)
   A2   B1   C1     (90, 0, 80)
             C2     (0, 20, 90)
         B1 = (45, 10, 85)
    A2 =  (45, 10, 85)
   A3   B1   C1     (90, 0, 20)
             C2     (80, 30, 0)
         B1 = (85, 15, 10)
        B2   C1     (80, 0, 40)
             C2     (0, 0, 0)
         B2 = (40, 0, 20)
        B3   C1     (30, 20, 50)
         B3 = (30, 20, 50)
     A3 = (30, 20, 50)
   A4   B1   C1     (0, 90, 50)
             C2     (20, 70, 70)
         B1 = (10, 80, 60)
        B2   C1     (100, 100, 0)
             C2     (95, 95, 1)
         B2 = (97.5, 97.5, 0.5)
    A4 = (97.5, 97.5, 0.5)

The best move for A is A4, which gives A an expected score of 97.5
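For completeness, here is the same style of Python sketch for the expecti-mini-max
evaluation (illustrative): A and B still maximize their own component, while a C
node is valued as the uniform average of its children.

```python
# Expecti-mini-max evaluation for (d): A (index 0) and B (index 1) maximize
# their own component; C plays uniformly at random, so a C node averages its
# children. Tree transcribed from the table above.

tree = {
  'A1': {'B1': {'C1': (100, 5, 20), 'C2': (50, 10, 0)},
         'B2': {'C1': (10, 20, 30), 'C2': (20, 0, 50)}},
  'A2': {'B1': {'C1': (90, 0, 80), 'C2': (0, 20, 90)}},
  'A3': {'B1': {'C1': (90, 0, 20), 'C2': (80, 30, 0)},
         'B2': {'C1': (80, 0, 40), 'C2': (0, 0, 0)},
         'B3': {'C1': (30, 20, 50)}},
  'A4': {'B1': {'C1': (0, 90, 50), 'C2': (20, 70, 70)},
         'B2': {'C1': (100, 100, 0), 'C2': (95, 95, 1)}},
}

def value(node, player):
    if isinstance(node, tuple):          # leaf: (A, B, C) score vector
        return node
    kids = [value(child, player + 1) for child in node.values()]
    if player == 2:                      # C: uniform average over its moves
        return tuple(sum(v[i] for v in kids) / len(kids) for i in range(3))
    return max(kids, key=lambda v: v[player])

vals = {move: value(tree[move], 1) for move in tree}
best = max(vals, key=lambda move: vals[move][0])
print(best, vals[best])   # A4 (97.5, 97.5, 0.5)
```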


   e) Will using alpha-beta pruning above (modified alpha-beta in case d)
      save any search in the search algorithm for the best move? What will
      be the savings in each case? Assume
      (and show) the best-case ordering for the search.

      For the original problem, the ordering is already optimal.

      When agreements can be made between B and C it appears that no
      pruning is possible (But in a more general case it may be).

      No pruning seems possible here for the expecti-mini-max case, either. 

   f) How would you treat the problem if B could not observe the move
      made by A?

       The answer here is "it depends". How does B model A? If B assumes that
       A moves in "MAX-MAX-MAX" fashion, then the result for the tree
       given is the same. However, if A has two equally good moves, B is
       in trouble. B can assume A chooses randomly between ties, or by some
       other method (e.g. lexicographic order) - here we are fully into the realm of
       multi-agent reasoning under uncertainty - which we do not address in this course.


6) A rational agent has a choice of algorithms to apply for a search problem.
   Suppose the agent needs to minimize total operating costs,
   which are given as:  1000 * G + S, where G is the cost of the path taken
   by the agent, and S is the number of search steps used by the search
   algorithm.
   a)  Assuming that the agent knows the above, and reasons to optimize
       the cost, what type of agent is this (reactive? goal-based? etc.).

      A utility-based agent that employs META-REASONING.

   b) Assume that the agent has 3 choices:
      1) Use default actions without search, and use an immediately available
         solution with cost 10000.
      2) Use A*, where it has an admissible heuristic that is usually
         wrong by a factor of 2.
      3) Use RTA*, with the same heuristic, and a limit of 100 search steps.

   We assume that the branching factor is 2, and that the agent knows
   (correctly) that there exists a solution at depth 10. (Assume operator
   cost of 1, and  that the search space is acyclic).

     Cost for option 1 is easy: 1000 * 10000 + 0 = 10000000.

     Option 2 is harder to figure out. With operator cost 1 and branching
     factor 2, even BFS will generate about 2^10 = 1024 nodes. A* should do better,
     and will find the optimal solution for a total cost of less than
     1000 * 10 + 1024 = 11024, so A* is clearly better here IF the agent knows
     that there is a solution at depth 10 (and in fact there may be one closer!)

     Option 3 is even harder: how likely is it that RTA* will find the right
     step? Note that even a path-cost error of 1 adds 1000, offsetting all the
     gains from lowering search time. So option 3 is worse than 2, though still
     better than option 1.

    Observe that changing search space characteristics, or even changing the
    factor between cost of search and cost of path, can change the optimal
    choice of algorithm.
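    The comparison can be made concrete with a little arithmetic (Python sketch;
    the node counts are the rough estimates from the text, and the RTA* figure
    assumes a single misstep adding 1 to the path cost):

```python
# Total operating cost under the model cost = 1000 * G + S,
# where G is the path cost and S the number of search steps.

def total_cost(G, S):
    return 1000 * G + S

opt1 = total_cost(10000, 0)    # default action, no search
opt2 = total_cost(10, 1024)    # A*: optimal depth-10 path, ~2^10 = 1024 search steps
opt3 = total_cost(11, 100)     # RTA*: only 100 steps, but one misstep adds 1000

print(opt1, opt2, opt3)        # 10000000 11024 11100
```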

   (Note: there may be insufficient information to answer b, but the point
    is to try to answer, and to figure out what's missing, if anything, and
    what further assumptions may be necessary).


7) Consider the following search problem (also known as the WEIGHTED ABDUCTION
   problem, which is NP-hard). You have a list KB of weighted
   propositional Horn clauses, i.e. clauses of the form:
          X1 and X2 and ... and Xn  implies Y
   where each clause C has a weight w(C). For some clauses, n=0, i.e.
   nothing is required to imply Y. A proof is defined as a set of clauses P
   (from KB), such that if C is in P, then every proposition in the body of C
   must also be the head of some clause C' in P. In addition, proof P
   is a proof of X if X is the head of some clause C' in P.
   The weight (or cost) of a proof P is
   the sum of the costs of all the clauses in P. The problem is: given clauses KB and
   a proposition Y, find the proof P (subset of KB) for Y that has the least cost.

 a) Write the above problem as a formal search problem. Note that the
    path cost has to reflect the proof cost.

Answer:
   Initial state: Empty set of clauses P, proposition Y to prove.
   Final state:   Set of clauses P obeying above conditions.
   Path cost:     Sum of weights of clauses in P.
   Operators:     Add a clause from KB to P.
      (Improved version of operators - limit operators to select:
          a.  only clauses NOT already in P, and
          b.  only clauses whose head is either the goal proposition, or a
                proposition which appears in the body of a clause in P but
                not in the head of any clause in P)


 b) If we represent the state in a node as a set of clauses P (that is
    not necessarily a proof), we consider the following heuristic:
          h(P) = sum of the costs of all clauses in P
    Now, we run the A* algorithm on this problem, using the above h as
    a heuristic. Is the algorithm guaranteed to find the least-cost proof?
     PROVE your answer.

Answer:  Here h(P) is exactly the sum of the costs in P, i.e. h(n) = g(n), which
         is NOT optimistic, and we cannot claim the heuristic is admissible.
         However, we have: f(n) = g(n) + h(n) = 2g(n).
         Since this is a constant factor of g(n), the node ordering in
         search is EXACTLY the same as if we had f(n) = g(n) = g(n) + 0.
         Thus, the search with A* behaves AS IF we had h(n) = 0 for all
         n. Now, h(n) = 0 is admissible (exact for goal, optimistic
         for everything else), thus A* is guaranteed to find the optimal
         answer. However, this heuristic is NOT very useful, since
         A* with h(n)=0 behaves like uninformed uniform cost search.

    (As discussed in class: clue - what would happen if we used as a heuristic
     h(P) = 0 for all P? How would the search be different, if at all?)

 c) This sub-question also serves as an example for the above. Suppose that
     the given clauses in the knowledge base (KB) are:

         C1:  A and B implies X         (with cost 5)
         C2:  B and C implies X         (with cost 2)
         C3:  D implies B               (with cost 3)
         C4:  D implies C               (with cost 1)
         C5:  E implies A               (with cost 14)
         C6:  E implies D               (with cost 1)
         C7:  E                         (with cost 2)
         C8:  Y implies X               (with cost 1)
         C9:  Y                         (with cost 30)

    Note that here, the set of clauses P = {C8, C9} is a proof of X,
    because X is at the head of rule C8, and Y is at the head of rule C9,
    and there are no other propositions mentioned in P. The cost of P is 31.

      Show all the steps of the search using A* on the above problem, where
   we need to find the least-cost proof for X.

Answer: we will specify in the state only the DELTA, i.e. the clause added
    from the parent, in order to save space (and time!). Also, we will assume
    that our set of operators is the IMPROVED version above (the simple
    version is also correct, but will require a lot more search!)

     Initial state S0 = empty set of clauses, cost 0 (pushed before loop).

    In first loop - get S0 from agenda (unsupported: X),
     put:
        S1: added C8, cost 1
        S2: added C1, cost 5
        S3: added C2, cost 2

      agenda now (state:cost): S1:1, S3:2, S2:5
 
       (Note: we write state costs as if h(n) = 0, since we noted above that
        using h(n) = g(n) orders the search exactly the same way)
 
   In 2nd loop, get S1 (= C8) - (unsupported: Y) put:
        S4: added C9, total cost 31

      agenda now: S3:2, S2:5, S4:31

    In 3rd loop, get S3 (= C2) - (unsupported: B, C) put:
        S5: added C3, cost 3     (to support B; total cost 5)
        S6: added C4, cost 1     (to support C; total cost 3)

       agenda now: S6:3, S2:5, S5:5, S4:31

    In 4th loop, get S6  (unsupported: D, B).
        S7: added C3, cost 3     (to support B; total cost 6)
        S8: added C6, cost 1     (to support D; total cost 4)

       agenda now: S8:4, S2:5, S5:5, S7:6, S4:31

    In 5th loop, get S8  (unsupported: B, E).
       S9: added C3, cost 3     (to support B; total cost 7)
       S10: added C7, cost 2    (to support E; total cost 6)

       agenda now: S2:5, S5:5, S10:6, S7:6, S9:7, S4:31

Etc. Algorithm stops when the optimal goal state:  (C2, C3, C4, C6, C7), total cost 9
is retrieved from the agenda.
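The whole search can also be run mechanically. The Python sketch below (illustrative)
uses the observation from (b) that f(n) = 2g(n) orders nodes exactly like uniform-cost
search, and implements the improved operator set; the clause encoding
name -> (body, head, weight) is an assumption of this sketch.

```python
# Uniform-cost search over sets of clauses, equivalent to A* with h(n) = g(n)
# (see part b). Successors of P: add any clause not in P whose head is a
# currently unsupported proposition (the improved operator set).
import heapq
from itertools import count

KB = {'C1': (('A', 'B'), 'X', 5), 'C2': (('B', 'C'), 'X', 2),
      'C3': (('D',), 'B', 3),     'C4': (('D',), 'C', 1),
      'C5': (('E',), 'A', 14),    'C6': (('E',), 'D', 1),
      'C7': ((), 'E', 2),         'C8': (('Y',), 'X', 1),
      'C9': ((), 'Y', 30)}

def least_cost_proof(KB, goal):
    tie = count()                            # tie-breaker: never compare frozensets
    agenda = [(0, next(tie), frozenset())]   # (g, tie, set of clause names)
    seen = set()
    while agenda:
        g, _, P = heapq.heappop(agenda)
        if P in seen:
            continue
        seen.add(P)
        heads = {KB[c][1] for c in P}
        bodies = {prop for c in P for prop in KB[c][0]}
        unsupported = (bodies | {goal}) - heads
        if not unsupported:                  # P is a proof of goal
            return g, set(P)
        for name, (body, head, w) in KB.items():
            if name not in P and head in unsupported:
                heapq.heappush(agenda, (g + w, next(tie), P | {name}))
    return None

g, proof = least_cost_proof(KB, 'X')
print(g, sorted(proof))   # 9 ['C2', 'C3', 'C4', 'C6', 'C7']
```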

 d) Consider the following heuristic h'(P):
          h'(P) = h(greedy_extension(P, KB) - P)
    where greedy_extension(P, KB) is computed as follows:
      greedy_extension(P, KB) {
         P' = P;
         while(P' is not a proof and P' not = KB) {
             Find an arbitrary C in P', and Z in the body of C, such that
               Z does not appear as the head of any rule in P'. (Intuitively:
               some Z that is not "justified" in P'.)
             From all the given rules in KB, find the lowest-cost
               rule C' that has Z as its head, and add C' to P'.
             If there is no such rule, return KB.
         }
         return P'
       }

      Is the above heuristic computable in polynomial time? Is it admissible?
     Prove your answers!
         
Answer: the heuristic is computable in polynomial time, since finding Z at
    each step takes time polynomial in size of P', and so does searching KB
    for the clause C'. Number of passes through loop is at most equal
    to number of propositions, so total runtime of heuristic is polynomial
    in size of KB and P'. Size of P' is also at most quadratic in number
    of propositions.

     This heuristic is NOT admissible. For example, consider the following KB,
    when looking for least-cost proof of X:
       C1: Y for a cost of 10
       C2: A implies Y for a cost of 1
       C3: A for a cost of 100
       C4: X for a cost of 20
        C5: Y implies X - cost 1

     For the state S1 = (C5), the heuristic will return P' - P = {C2, C3}
     and a heuristic cost of 101, which is pessimistic, resulting in
     preference for the state S2 = (C4), costing 20. However, in fact S1
     has a child S3 = (C1, C5), which is a goal state, and only costs 11.
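     A quick numeric check of this example (Python sketch, illustrative; the
     clause encoding name -> (body, head, weight) is an assumption of the sketch):

```python
# greedy_extension on the counterexample KB: from {C5} it justifies Y with
# the cheapest rule C2 (cost 1), which drags in C3 (cost 100), so h' = 101.

KB = {'C1': ((), 'Y', 10),    'C2': (('A',), 'Y', 1),
      'C3': ((), 'A', 100),   'C4': ((), 'X', 20),
      'C5': (('Y',), 'X', 1)}

def greedy_extension(P):
    Pp = set(P)
    while True:
        heads = {KB[c][1] for c in Pp}
        unjustified = {z for c in Pp for z in KB[c][0] if z not in heads}
        if not unjustified:                          # Pp is a proof
            return Pp
        z = unjustified.pop()
        rules = [c for c, (body, head, w) in KB.items() if head == z]
        if not rules:
            return set(KB)
        Pp.add(min(rules, key=lambda c: KB[c][2]))   # cheapest rule with head z

def h_prime(P):
    return sum(KB[c][2] for c in greedy_extension(P) - set(P))

print(h_prime({'C5'}))   # 101
```

     The true least-cost completion of {C5} is C1 at cost 10, so h' overestimates
     here, confirming that it is not admissible.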