Introduction to Artificial Intelligence

Assignment 3 - solutions


Homework assignment - simulators, agents, search, games, propositional logic

1) Consider the following search problem (also known as
   LEAST-COST PROOF, which is NP-hard).
   You have a list KB of clauses F1, ..., Fk. Each Fi is of the form:

    l1 or l2 or ... lm

   where the l's are literals, i.e. a propositional variable or its negation.
   You need to apply the operators:  a) resolution b) simplification
   in order to prove a propositional statement (by contradiction).

   Assume that applying resolution costs 2, and
   simplification costs 1 (simplification means: in a clause that contains
   2 or more instances of the same literal, remove one of them). Note that only
   ONE literal is removed at each step (e.g. reducing 3 copies of the same
   literal to 1 copy takes 2 steps).

The problem is: given clauses KB and a sentence Y, find the proof P
that has the least cost. For example, we have the KB consisting of the
clauses:
    C1:  ~P or Q
    C2:  ~R or Q
    C3:  ~Q

and need to prove the sentence Y:   ~P or ~R

(hint - before the search, you need to add a transformed Y into your KB)

 a) Write the above problem as a formal search problem. Note that the
    path cost has to reflect the proof cost.

     Initial state: original KB with set of clauses representing "not Y"
                    added.
     Goal test:     is there an empty clause in the current KB?
     Operators:      apply either simplification to one clause (removing
                    the longer version from the KB), or add to KB a clause
                    resulting from resolution of 2 clauses in the KB
                    (keeping the original clauses!)
     Path cost:     Sum of costs of operators applied to reach the current KB
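     As a rough sketch (not part of the original assignment), the operators
     above can be coded directly. Clauses are represented here as tuples of
     literal strings (tuples rather than sets, so duplicate literals survive
     and simplification is a real step); all names are illustrative.

```python
# Sketch of the search-space operators for the least-cost-proof problem.
# A literal is a string like "P" or "~P"; a clause is a tuple of literals.

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2, lit):
    """Resolve c1 (containing lit) against c2 (containing ~lit); cost 2."""
    r1 = list(c1); r1.remove(lit)
    r2 = list(c2); r2.remove(negate(lit))
    return tuple(r1 + r2)

def simplify(clause, lit):
    """Remove ONE duplicate occurrence of lit from clause; cost 1."""
    c = list(clause); c.remove(lit)
    return tuple(c)

def is_goal(kb):
    """Goal test: the empty clause is present."""
    return any(len(c) == 0 for c in kb)

def successors(kb):
    """Yield (new_KB, step_cost) pairs: a resolvent is ADDED to the KB,
    while a simplified clause REPLACES its longer original."""
    for i, c1 in enumerate(kb):
        for c2 in kb[i + 1:]:
            for lit in set(c1):
                if negate(lit) in c2:
                    yield kb + [resolve(c1, c2, lit)], 2
        for lit in set(c1):
            if c1.count(lit) >= 2:
                yield [c for c in kb if c is not c1] + [simplify(c1, lit)], 1
```

     Any uniform-cost or A* search over these successors, summing the step
     costs, implements the formal problem above.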

 b) Consider the heuristic h(n) = number of literals in the shortest
    clause. Is h(n) an admissible heuristic? Prove it.

    Yes, because a KB with an empty clause would give h(KB)=0.
    Also, all operators cost at least 1. Now:
      Simplification, as described above, decreases the length of a clause by 1.
      Resolution takes 2 clauses, and the resulting clause is at least as
      long as the longer of the two, minus 1.
    Since no operator can decrease the length of the shortest clause by more
    than 1, and each operator costs at least 1, the proof must cost at least
    as much as the length of the shortest clause. Thus, the heuristic
    is admissible.

 c) Find another admissible heuristic h1 that performs better than h,
    for at least SOME cases (and provide an example or prove that it dominates
    h).

      Resolution can only decrease the length of the longer clause if one
      of the clauses is a unit clause. Therefore, if there is no unit clause
      in the KB and no clause with duplicate literals, adding 1 to the length
      of the shortest clause still results in an admissible heuristic. Since
      h1 equals h whenever that condition fails and exceeds h by 1 otherwise,
      h1 dominates h, so no example is needed.
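      The two heuristics can be written down concretely; this is a sketch
      under the tuple-of-literals clause representation, with the names
      h and h1 following the text:

```python
def h(kb):
    """Admissible heuristic: length of the shortest clause in the KB."""
    return min(len(c) for c in kb)

def h1(kb):
    """h plus 1 when the KB has no unit clause and no duplicated literal:
    in that case no single resolution or simplification step can make
    the shortest clause any shorter, per the argument in the text."""
    base = min(len(c) for c in kb)
    if base == 0:                  # empty clause present: at the goal
        return 0
    no_unit = all(len(c) != 1 for c in kb)
    no_dup = all(len(set(c)) == len(c) for c in kb)
    return base + 1 if (no_unit and no_dup) else base
```

      By construction h1(KB) >= h(KB) for every KB, which is the dominance
      claim above.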
      
 d) Find a non-admissible heuristic h2 and an example where it performs better
    than h and h1.

    Multiplying an admissible heuristic by a large enough constant K will
    make it inadmissible. For example, as K approaches infinity,
    A* behaves like greedy best-first search. The performance will thus
    be better whenever greedy search works better than A*. An example
    is where there are many proofs of equal length, so that no move makes
    apparent initial headway. In this case, greedy will find a solution
    fast (sometimes even an optimal solution), while A* can take a very
    long time! A short (and rather stupid-looking) example follows:

      Initial KB (including negation of Y): ~A, A, B v B, C v C, D v D

    From the initial KB, we can apply resolution to the first 2 clauses
    to get the empty clause, or simplify clause 3, 4, or 5.
    Now, using A* with either h or h1, the f-cost of all the next states
    is 2. So A* can choose to expand any of the nodes, and can choose
    incorrectly, doing a lot of unnecessary search (even though the optimal
    goal is already on the agenda!).
    But using the inadmissible heuristic h2 = K*h, the state containing the
    empty clause has an h-cost of 0 and an f-cost of 2 as before, while the
    rest of the states have an h-cost of K, which is much greater than 1,
    so their f-costs are much larger than 2. Thus A* with the inadmissible
    heuristic expands the goal state next and finishes.
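    The tie (and how the inflated heuristic breaks it) can be checked
    numerically; a sketch, with K = 1000 standing in for any "large
    enough" constant:

```python
# The five initial clauses of the example (negation of Y included).
initial = [("~A",), ("A",), ("B", "B"), ("C", "C"), ("D", "D")]

def h(kb):
    """Length of the shortest clause (0 once the empty clause appears)."""
    return min(len(c) for c in kb)

K = 1000  # arbitrary "large enough" inflation factor for h2 = K * h

goal_kb = initial + [()]                                     # resolve ~A, A: g = 2
simp_kb = [("~A",), ("A",), ("B",), ("C", "C"), ("D", "D")]  # one simplify: g = 1

f_goal = 2 + h(goal_kb)        # 2 + 0 = 2
f_simp = 1 + h(simp_kb)        # 1 + 1 = 2: a tie, A* may pick either
f2_goal = 2 + K * h(goal_kb)   # still 2
f2_simp = 1 + K * h(simp_kb)   # 1001: the goal is clearly expanded first
```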

 e) Would keeping track of repeated states help in the search algorithm? If so,
    give an example.

    Definitely, because in very many cases states are multiply reachable.
    For example, in the KB of part d above, simplifying B v B and then
    C v C reaches exactly the same KB, at the same cost, as simplifying
    C v C and then B v B; without duplicate detection, both orderings
    stay on the agenda and are both expanded.

 f) Can the algorithm be made more efficient in SPACE requirements, if
    we need to find ANY proof, rather than the shortest proof? How, and
    how much more efficient, asymptotically? 

    YES, in a BIG WAY. Note that any addition to the KB cannot damage
    soundness, so there is no need whatsoever to backtrack: once we
    derive a clause, we know it follows from the KB. Note that this is much
    stronger than keeping track of repeated states, and costs much less in
    terms of memory, as we only need to keep one copy of the KB!
    This can be an exponential factor better in terms of memory, and even
    in terms of runtime.
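    A minimal saturation loop illustrates the point: one monotonically
    growing clause set, no agenda of alternative KBs and no backtracking.
    Here clauses are frozensets, which makes simplification implicit;
    that is acceptable since we no longer care about proof cost. The
    test KB is the example from part 1 (clauses C1-C3 plus P and R,
    the negation of Y):

```python
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def saturate(kb):
    """Grow ONE clause set by resolution until the empty clause appears
    (proof found) or no new clause can be derived (no proof exists)."""
    kb = {frozenset(c) for c in kb}    # sets: duplicate literals vanish
    while frozenset() not in kb:
        new = set()
        for c1 in kb:
            for c2 in kb:
                for lit in c1:
                    if negate(lit) in c2:
                        new.add((c1 - {lit}) | (c2 - {negate(lit)}))
        if new <= kb:
            return False               # saturated without a proof
        kb |= new
    return True                        # empty clause derived
```

    Memory is bounded by the size of the final clause set, instead of by
    the number of distinct KBs on the search frontier.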


2) Consider a 3-person complete information game between players A, B, and C,
   playing in turn (first A, then B, then C, and then the game ends).
   The game tree is defined as follows:

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
   A2   B1   C1     (90, 0, 80)
             C2     (0, 20, 90)
   A3   B1   C1     (90, 0, 20)
             C2     (80, 30, 0)
        B2   C1     (80, 0, 40)
             C2     (0, 0, 0)
        B3   C1     (30, 20, 50)
   A4   B1   C1     (0, 90, 50)
             C2     (20, 70, 70)
        B2   C1     (100, 100, 0)
             C2     (95, 95, 1)
   
   Assume that the scores theoretically achievable by each player are
   all between 0 and  100.
   a) What is the best play for A, assuming that each player is out to improve
      its own score, and there is no communication between players?

        Simply use the MAX-MAX-MAX algorithm to evaluate the nodes of
        the tree:

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
         B1 =  (100, 5, 20)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
         B2 = (20, 0, 50)
    A1 = (100, 5, 20)

     Note that at this point A knows that choosing A1 will give a score
     of 100 for A, which is the maximum achievable, so there is no point
     in looking elsewhere!
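     The evaluation can be sketched as a max-n ("MAX-MAX-MAX") pass over
     the full tree; the nested-dict encoding and the function names here
     are illustrative, with leaves being (A, B, C) score tuples copied
     from the table above:

```python
tree = {
    "A1": {"B1": {"C1": (100, 5, 20), "C2": (50, 10, 0)},
           "B2": {"C1": (10, 20, 30), "C2": (20, 0, 50)}},
    "A2": {"B1": {"C1": (90, 0, 80), "C2": (0, 20, 90)}},
    "A3": {"B1": {"C1": (90, 0, 20), "C2": (80, 30, 0)},
           "B2": {"C1": (80, 0, 40), "C2": (0, 0, 0)},
           "B3": {"C1": (30, 20, 50)}},
    "A4": {"B1": {"C1": (0, 90, 50), "C2": (20, 70, 70)},
           "B2": {"C1": (100, 100, 0), "C2": (95, 95, 1)}},
}

def maxn(node, player):
    """MAX-MAX-MAX (max-n): each player maximizes its own component."""
    if isinstance(node, tuple):        # leaf: a score vector
        return node
    vals = [maxn(child, player + 1) for child in node.values()]
    return max(vals, key=lambda v: v[player])

best_move = max(tree, key=lambda m: maxn(tree[m], 1)[0])
# best_move == "A1"; the value of the root is (100, 5, 20)
```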

   b) If players can reach agreements, will this make any difference
      (answer separately for each different type of agreements).

      If B and C can reach an agreement, then in A1 they can both get a
      better score than (5, 20). The agreement will be for B to move B2 and
      for C to move C1, resulting in (10, 20, 30). Since neither B nor C can
      do as well without the agreement, it is to the advantage of BOTH (note
      that for B to move B2 with no agreement is a serious error, as then
      C will want to move C2 and B will get 0!). As a result, A should not
      move A1. If B and C can reach agreements, it is better for A to move
      A4, where no agreement between B and C can improve both their scores,
      and the result will be (95, 95, 1): B will move B2 to get 95 rather
      than 70, and C will move C2 to get 1 rather than 0.

      Note that other 2-player agreements are useless for A, as with no
      agreements A gets 100. However, if there are competing agreements,
      the issue gets complicated, and beyond the scope of the exercise!


   c) How will the answer change if A is paranoid (i.e. believes that the
      other players will act to minimize A's score)?

      If A is paranoid, it can treat B-C as a single super-opponent with a
      combined move. The best choice is then A1, which guarantees A a score
      of at least 10; in every other branch the opponents can drive A's
      score down to 0.
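      The paranoid value of each A move is just the minimum A-score over
      the leaves below it; a quick check, with the leaf values copied from
      the table above:

```python
# A-scores at the leaves under each of A's moves, from the game table.
a_leaf_scores = {
    "A1": [100, 50, 10, 20],
    "A2": [90, 0],
    "A3": [90, 80, 80, 0, 30],
    "A4": [0, 20, 100, 95],
}

# Paranoid value: B-C jointly steer play toward the leaf worst for A.
guaranteed = {move: min(scores) for move, scores in a_leaf_scores.items()}
best_move = max(guaranteed, key=guaranteed.get)
# guaranteed == {"A1": 10, "A2": 0, "A3": 0, "A4": 0}; best_move == "A1"
```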

   d) Suppose that B always plays randomly with uniform probability. What
      is the best action by A now?

      This time, we can simply use EXPECTIMAX (A and C still maximize their
      own scores, while B is a chance node with uniform probabilities) to
      evaluate the tree, and get:

   A    B    C      scores
----------------------------
   A1   B1   C1     (100, 5, 20)
             C2     (50, 10, 0)
         B1 =  (100, 5, 20)
        B2   C1     (10, 20, 30)
             C2     (20, 0, 50)
         B2 = (20, 0, 50)
    A1 = (60, 2.5, 35)     (Average of B1 and B2)
   A2   B1   C1     (90, 0, 80)
             C2     (0, 20, 90)
         B1 = (0, 20, 90)
    A2 = (0, 20, 90)     (B has only one move)
   A3   B1   C1     (90, 0, 20)
             C2     (80, 30, 0)
         B1 = (90, 0, 20)
        B2   C1     (80, 0, 40)
             C2     (0, 0, 0)
         B2 = (80, 0, 40)
        B3   C1     (30, 20, 50)
         B3 = (30, 20, 50)
    A3 = (67, 7, 37)     (Approximate average of B1, B2, B3 (rounded))
   A4   B1   C1     (0, 90, 50)
             C2     (20, 70, 70)
         B1 = (20, 70, 70)
        B2   C1     (100, 100, 0)
             C2     (95, 95, 1)
         B2 = (95, 95, 1)
    A4 = (57.5, 82.5, 35.5)  (Average of B1, B2)

     The best move for A is A3, which gives A an expected value of about
     67 (exactly 200/3).
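     The same nested-dict tree used for part a can be evaluated with B
     replaced by a chance node; a sketch, whose averages come out as in
     the hand computation above (the table rounds 200/3 to 67):

```python
tree = {
    "A1": {"B1": {"C1": (100, 5, 20), "C2": (50, 10, 0)},
           "B2": {"C1": (10, 20, 30), "C2": (20, 0, 50)}},
    "A2": {"B1": {"C1": (90, 0, 80), "C2": (0, 20, 90)}},
    "A3": {"B1": {"C1": (90, 0, 20), "C2": (80, 30, 0)},
           "B2": {"C1": (80, 0, 40), "C2": (0, 0, 0)},
           "B3": {"C1": (30, 20, 50)}},
    "A4": {"B1": {"C1": (0, 90, 50), "C2": (20, 70, 70)},
           "B2": {"C1": (100, 100, 0), "C2": (95, 95, 1)}},
}

def value(node, player):
    """A (player 0) and C (player 2) maximize their own component;
    B (player 1) is a uniform chance node."""
    if isinstance(node, tuple):
        return node
    vals = [value(child, player + 1) for child in node.values()]
    if player == 1:                    # B: expected value of the children
        return tuple(sum(v[i] for v in vals) / len(vals) for i in range(3))
    return max(vals, key=lambda v: v[player])

best_move = max(tree, key=lambda m: value(tree[m], 1)[0])
# best_move == "A3", with expected A-value 200/3 (about 66.7)
```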
   
   e) Will using alpha-beta pruning above (modified alpha-beta in case d)
      save any search in the search algorithm for the best move? What will
      be the savings in each case? Assume
      (and show) the best-case ordering for the search.

      For the original problem (MAX-MAX-MAX), the given ordering is already
      optimal: A1 is examined first and yields A the maximum possible score
      of 100, so the remaining A branches need not be searched.

      When agreements can be made between B and C, it appears that no
      pruning is possible (but in a more general case it may be).
 
      For the expectimax case, as currently ordered, only A4 B2
      and its children are pruned. That is because even if A gets 100 in
      that branch, A's expected value for A4 will be no more than
      (20 + 100)/2 = 60, and A already knows it can get about 67 in A3.
      If A3 were first, we could also prune, for the same reason, the
      branch A1 B1 (assuming A1 B2 is expanded before A1 B1).
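      The pruning bound is a one-line computation, assuming (as stated in
      the problem) that all scores are capped at 100 and that B's two
      moves under A4 are equally likely:

```python
a3_value = 200 / 3            # A's expected value for A3, computed in part d
seen_b1 = 20                  # A's value under A4-B1, already evaluated
bound = (seen_b1 + 100) / 2   # best case for the unseen A4-B2 subtree
# bound == 60.0 < 200/3, so the A4-B2 subtree can be pruned
```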


   f) How would you treat the problem if C could not observe the move
      made by B?

      The answer here is "it depends": how does C model B? If C assumes that
      B moves in "MAX-MAX-MAX" fashion, then the result for the given tree
      is the same. However, if B has two equally good moves, C is
      in trouble. C can assume B chooses randomly among ties, or by some
      other method (e.g. lexicographic ordering); here we are fully in the
      realm of multi-agent decision making under uncertainty, which we do
      not address in this course.

3) A rational agent has a choice of algorithms to apply for a search problem.
   Suppose the agent needs to minimize total operating costs,
   which are given as:  1000 * G + S, where G is the cost of the path taken
   by the agent, and S is the number of search steps used by the search
   algorithm.
   a)  Assuming that the agent knows the above, and reasons to optimize
       the cost, what type of agent is this (reactive? goal-based? etc.).

       Utility-based agent that employs META-REASONING.

   b) Assume that the agent has 3 choices:
      1) Use default actions without search, and use an immediately available
         solution with cost 1000.
      2) Use A*, where it has an admissible heuristic that is usually
         wrong by a factor of 2.
      3) Use RTA*, with the same heuristic, and a limit of K search steps.
   We assume that the branching factor is 2, and that the agent knows
   (correctly) that there exists a solution at depth 10. (Assume operator
   cost of 1, and  that the search space is acyclic).

    Cost for option 1 is easy: 1000 * 1000 + 0 = 1,000,000.

    Option 2 is harder to figure out. With operator cost 1 and branching
    factor 2, even BFS will generate about 2^10 = 1024 nodes in reaching
    depth 10. A* should do better, and will find the optimal solution for a
    total cost of less than 1000 * 10 + 1024 = 11024, so A* is clearly
    better here IF the agent knows that there is a solution at depth 10
    (and in fact there may be one closer!)

    Option 3 is even harder: how likely is RTA* to find the right
    step? Note that even a path error of 1 costs an extra 1000, which
    offsets essentially all the gains from reduced search time. So option 3
    is worse than option 2, though still better than option 1.

    Observe that changing search space characteristics, or even changing the
    factor between cost of search and cost of path, can change the optimal
    choice of algorithm.
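    The comparison can be made concrete under the stated assumptions
    (the figures below are upper bounds under the cost model, not exact
    costs):

```python
# Total cost model from the problem statement: 1000 * G + S.
def total_cost(G, S):
    return 1000 * G + S

# Option 1: default action with path cost 1000, no search steps.
opt1 = total_cost(1000, 0)        # 1,000,000

# Option 2: A* finds the optimal depth-10 path (G = 10); with branching
# factor 2 it generates at most about 2**10 = 1024 nodes.
opt2 = total_cost(10, 2 ** 10)    # 11,024

# Option 3 (RTA*): each unit of extra path cost adds 1000 to the total,
# roughly as much as ALL the search steps A* could possibly save here,
# so any misstep wipes out the benefit of the bounded lookahead.
extra_per_path_error = 1000
```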

   (Note: there may be insufficient information to answer b, but the point
    is to try to answer, and to figure out what's missing, if anything, and
    what further assumptions may be necessary).