Introduction to Artificial Intelligence

Assignment 6


Written assignment: probabilistic inference and learning

Justify all answers!

1) We have the following distribution over the binary variables A, B, C, given as a full table:

   A     B     C     probability
  -------------------------------
   F     F     F       0.105
   F     F     T       0.315
   F     T     F       0.07
   F     T     T       0.21
   T     F     F       0.135
   T     F     T       0.045
   T     T     F       0.09
   T     T     T       0.03

a)   Construct the Bayes network that most compactly describes the distribution (i.e. one that has
     the smallest number of arcs). 

b) Is the answer unique?
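
   As a sanity check, the candidate independencies behind your network can be verified
   numerically from the joint table; a small Python sketch along these lines (the helper
   names are illustrative, and the printed tests are examples rather than an exhaustive list):

      # Sketch: test (conditional) independencies numerically from the full joint table.
      from itertools import product

      # Joint distribution P(A, B, C) as given above, keyed by (A, B, C) truth values.
      joint = {
          (False, False, False): 0.105, (False, False, True): 0.315,
          (False, True,  False): 0.07,  (False, True,  True):  0.21,
          (True,  False, False): 0.135, (True,  False, True):  0.045,
          (True,  True,  False): 0.09,  (True,  True,  True):  0.03,
      }

      def marginal(positions):
          """Marginal distribution over the variable positions (0=A, 1=B, 2=C)."""
          out = {}
          for assignment, p in joint.items():
              key = tuple(assignment[i] for i in positions)
              out[key] = out.get(key, 0.0) + p
          return out

      def independent(x, y, given=()):
          """True iff variable x is independent of y given the variables in `given`."""
          pxyz = marginal((x, y) + tuple(given))
          pxz  = marginal((x,) + tuple(given))
          pyz  = marginal((y,) + tuple(given))
          pz   = marginal(tuple(given)) if given else {(): 1.0}
          for vals in product([False, True], repeat=2 + len(given)):
              xv, yv, zv = vals[0], vals[1], vals[2:]
              # P(x,y|z) = P(x|z)P(y|z)  <=>  P(x,y,z)P(z) = P(x,z)P(y,z)
              if abs(pxyz[(xv, yv) + zv] * pz[zv] - pxz[(xv,) + zv] * pyz[(yv,) + zv]) > 1e-9:
                  return False
          return True

      A, B, C = 0, 1, 2
      print("A indep B?         ", independent(A, B))
      print("A indep C?         ", independent(A, C))
      print("A indep C given B? ", independent(A, C, (B,)))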

2) Consider the following variables and their immediate causes (parents):

     variable         parents
-------------------------------
        A               none
        B               none
        C               A B
        D               A B
        E               C
        F               D

    a) Is this network a poly-tree?
    b) Is this network directed-path singly connected?
    c) Determine the truth of the following independence
       statements (using d-separation):
       1)  I({A}, {E} | {})
       2)  I({A}, {E} | {C})
       3)  I({A}, {B} | {})
       4)  I({A}, {B} | {F})
       5)  I({A}, {E, F} | {C})
       6)  I({A}, {F} | {C, D})
    d) Suppose that P(A) = 0.8, P(E|C) = 0.2, P(E|~C) = 0.4, P(B) = 0.1, P(C|A,B) = 0.3, P(C|~A,B) = 0.9

       Find P(A|B, C, E)
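
       A hand computation of (d) can be checked by enumeration over the relevant part of the
       network; a rough Python sketch, assuming the query means all evidence variables are true
       (D and F do not affect this query, and the CPT entries for B=false are never needed):

          # Sketch: P(A | B=true, C=true, E=true) by enumeration over the sub-network A, B -> C -> E.
          P_A = 0.8
          P_B = 0.1
          P_E_given_C  = {True: 0.2, False: 0.4}                   # P(E=true | C)
          P_C_given_AB = {(True, True): 0.3, (False, True): 0.9}   # P(C=true | A, B=true)

          def joint(a, b=True, c=True, e=True):
              """P(A=a, B=b, C=c, E=e); D and F are summed out implicitly (they are barren here)."""
              pa = P_A if a else 1 - P_A
              pb = P_B if b else 1 - P_B
              pc = P_C_given_AB[(a, b)] if c else 1 - P_C_given_AB[(a, b)]
              pe = P_E_given_C[c] if e else 1 - P_E_given_C[c]
              return pa * pb * pc * pe

          print("P(A=true | B=true, C=true, E=true) =", joint(True) / (joint(True) + joint(False)))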

3) You need to construct a 4-input neural network whose output is 1 exactly when
   an odd number of its inputs are 1, and 0 otherwise
   (assume the inputs are in {0, 1}).
   a) Can this be done without hidden units?
   b) Show a network using threshold elements that performs
      the required computation (specify the weights, too!)
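
   Here a "threshold element" means a unit that outputs 1 iff the weighted sum of its inputs
   reaches its threshold; a minimal Python sketch of one such unit (the AND/OR settings below
   only illustrate the element itself, they are not the answer to (b)):

      # Sketch: a single linear threshold unit, as assumed in question 3.
      def threshold_unit(weights, threshold, inputs):
          """Output 1 iff the weighted sum of the inputs reaches the threshold."""
          total = sum(w * x for w, x in zip(weights, inputs))
          return 1 if total >= threshold else 0

      # Illustrative settings (not the parity network): 2-input AND and OR.
      print(threshold_unit([1, 1], 2, [1, 1]))   # AND(1, 1) -> 1
      print(threshold_unit([1, 1], 1, [0, 1]))   # OR(0, 1)  -> 1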

4) a) Using the perceptron learning rule (the "delta" rule), show the steps
      of learning the 3-input MAJORITY function with a single perceptron
      and a step activation function with threshold 1. Initially all
      weights are 0.7, and the learning rate is 0.5.

   b) Show (by example) that the order of presenting the examples can
      change the number of training episodes.
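
   For reference, the update used in (a) is w_i <- w_i + alpha * (target - output) * x_i, applied
   example by example; a rough Python sketch of a single training episode, assuming the threshold
   stays fixed at 1 and one possible presentation order (episodes are repeated until a full pass
   changes no weight):

      # Sketch: one episode of the perceptron ("delta") rule on 3-input MAJORITY,
      # weights initialised to 0.7, learning rate 0.5, fixed threshold 1.
      from itertools import product

      def majority(bits):
          return 1 if sum(bits) >= 2 else 0

      weights = [0.7, 0.7, 0.7]
      alpha, threshold = 0.5, 1.0
      examples = [(x, majority(x)) for x in product([0, 1], repeat=3)]   # one possible ordering

      for x, target in examples:
          output = 1 if sum(w * xi for w, xi in zip(weights, x)) >= threshold else 0
          error = target - output
          weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
          print(x, "target", target, "output", output, "weights", [round(w, 2) for w in weights])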


5)  A project leader in the IMG4 consortium would like to construct a
    decision procedure based on results from hiring employees in the earlier KITE
    consortium. The decision procedure will be based on a decision tree, using information
    in the case database. Attributes for each employee are:
    degree level (A),  industrial work experience (B), self-motivation (C), grade in AI (D).
    The target attribute is the "decision", describing how good the employee is for the
    project.

  a) Construct by hand (using greedy search with the best-information-gain
     heuristic) a consistent decision tree for the following table:

     input variables               decision
   A      B      C      D
--------------------------------------------
   2      T      H      95             good
   2      T      H      80             good
   2      F      H      95             good
   1      F      H      95             good
   1      T      L      70             bad
   1      F      L      70             bad
   1      T      M      95             average
   1      T      H      80             average

  b) Is a more compact (fewer internal nodes) decision tree possible for the above table?
     If no - prove it. If yes, which tree?
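
  The information-gain heuristic in (a) picks, at each node, the attribute that maximizes
  Gain(S, X) = H(S) - sum_v (|S_v|/|S|) * H(S_v); a small Python sketch that computes the
  root-level gains for the table above (attribute indices follow the column order A, B, C, D):

     # Sketch: entropy and information gain for the root split of question 5(a).
     from math import log2
     from collections import Counter

     # Rows copied from the table above: (A, B, C, D, decision).
     rows = [
         (2, 'T', 'H', 95, 'good'),    (2, 'T', 'H', 80, 'good'),
         (2, 'F', 'H', 95, 'good'),    (1, 'F', 'H', 95, 'good'),
         (1, 'T', 'L', 70, 'bad'),     (1, 'F', 'L', 70, 'bad'),
         (1, 'T', 'M', 95, 'average'), (1, 'T', 'H', 80, 'average'),
     ]

     def entropy(examples):
         counts = Counter(r[-1] for r in examples)
         return -sum(c / len(examples) * log2(c / len(examples)) for c in counts.values())

     def gain(examples, attr):
         remainder = 0.0
         for value in {r[attr] for r in examples}:
             subset = [r for r in examples if r[attr] == value]
             remainder += len(subset) / len(examples) * entropy(subset)
         return entropy(examples) - remainder

     for name, attr in zip("ABCD", range(4)):
         print("Gain(%s) = %.3f" % (name, gain(rows, attr)))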


6)  Consider the following investment scenario, occurring in a certain middle-eastern country.
    We will assume that you are risk-neutral, i.e. that
    your utility function is linear in the amount of money that your assets are worth.
 
    You have 1,000,000 NIS you wish to invest, with the following options: 
       1) Buy a house and rent it out for a net gain of 40,000 NIS per year (after taxes, etc.)
       2) Buy government bonds at 5% yearly interest, out of which you need to pay 15% capital
          gains tax. However, there is talk that the capital gains tax will be raised to 30% very
          soon; suppose that you credit this rumor with a 0.5 probability of being accurate
          (and that if false, the capital gains tax will remain unchanged).

     a) Which of the above investments should you make as a rational agent?

     b) Suppose that a "friend" working at the treasury department can give you a "hint"
        on whether the capital gains tax will actually be raised. Given that your intended investment
        is for exactly one year, what is the maximum that you, as a rational agent, should be willing
        to pay this employee, given that 1) he is perfectly reliable, and 2) moral considerations are
        not part of your utility function?
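
     Since the agent is risk-neutral, (a) reduces to comparing expected yearly gains, and (b) is
     the value of perfect information: the expected gain when the best option is chosen after the
     hint, minus the best expected gain without it. A rough Python sketch of this arithmetic, under
     the reading that the tax change, if it happens, applies to the whole year of the investment:

        # Sketch: expected gains and value of perfect information for question 6.
        principal = 1_000_000
        p_raised = 0.5                                # probability the rumor is accurate

        def bond_gain(tax_rate):
            """Net yearly gain from bonds: 5% interest minus capital gains tax."""
            return principal * 0.05 * (1 - tax_rate)

        house_gain = 40_000
        scenarios = {                                 # scenario -> (probability, gain of each option)
            'tax stays 15%':     (1 - p_raised, {'house': house_gain, 'bonds': bond_gain(0.15)}),
            'tax raised to 30%': (p_raised,     {'house': house_gain, 'bonds': bond_gain(0.30)}),
        }

        # a) expected gain of each option without further information
        ev = {opt: sum(p * gains[opt] for p, gains in scenarios.values()) for opt in ('house', 'bonds')}
        print("expected yearly gains:", ev)

        # b) expected gain when a perfectly reliable hint reveals the scenario before choosing
        ev_with_hint = sum(p * max(gains.values()) for p, gains in scenarios.values())
        print("value of the hint:", ev_with_hint - max(ev.values()))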


Deadline: Tuesday, June 14, 2005, at 10AM.

Note: submission (as well as WORK) on this exercise is SOLO (that is, NOT in pairs or other groups of size > 1)

Below are additional exercises on the required material. These exercises are optional, but recommended.

a) Consider a world with the following operators:

   Op(Action: pickup(x)
      Precond: Location(Robot, y) and Location(x, y) and not Holding(x)      /* Exists y such that... */
      Effect:  Holding(x))

   Op(Action: go(from, to)
      Precond: Location(Robot, from)
      Effect:  Location(Robot, to))

The semantics of the Location predicate is such that an object can only be at one location.
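
For concreteness, the two operator schemas can also be written as STRIPS-style records; a rough
Python encoding (the literal strings simply mirror the text above; the POP machinery itself is
not shown):

    # Sketch: the operator schemas above as data, with negative literals written as "not ...".
    from dataclasses import dataclass

    @dataclass
    class Operator:
        name: str
        precond: list   # literals that must hold before the action
        effect: list    # literals made true (or false, via "not ...") by the action

    pickup = Operator(
        name='pickup(x)',
        precond=['Location(Robot, y)', 'Location(x, y)', 'not Holding(x)'],   # for some y
        effect=['Holding(x)'],
    )

    go = Operator(
        name='go(from, to)',
        precond=['Location(Robot, from)'],
        effect=['Location(Robot, to)'],   # and, by the Location semantics, not Location(Robot, from)
    )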

a1) The initial state is: Location(Robot, Room1) and not Holding(Box1) and Location(Box1, Room2)

 The final state should have: Holding(Box1)

 Show the steps for solving this planning problem using partial-order planning (POP).

a2) Add an operator that supports the robot carrying the box, and show how a plan for achieving
    Location(Box1, Room1) is developed from the initial state using POP.


b) Consider the following MDP, where scoring is total value of rewards obtained in 3 steps.
   The world has 4 locations in linear sequence, as follows:  F1 I F2 P
   Actions are: R attempts to move right, and L attempts to move left. Locations F1 and F2 contain
   flags, which have values of 1 and 10, respectively. A robot visiting a location with a flag
   gets a reward equal to the flag value, and then the flag disappears. P is an absorbing state,
   with a reward of -100.

   Transition probabilities are as follows: motion succeeds with probability 0.8, but with probability
   0.2 the robot either stays in the same location or moves in the opposite direction (if possible),
   with probability 0.1 each (0.2 to stay in place if the robot is at an end of the sequence).

   b1) Formalize the above problem as an MDP, i.e. what are the states, the transition probability
       function, and the reward function?

   b2) Find the optimal policy.
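
    A hand-computed answer to b2 can be checked with finite-horizon value iteration:
    V_0(s) = 0 and V_k(s) = max_a sum_s' T(s,a,s') * (R(s') + V_{k-1}(s')), assuming the reward
    is collected on entering a state (which matches the flag description above). A generic Python
    sketch follows; encoding the states of this particular problem is part of b1 and is left open:

       # Sketch: finite-horizon value iteration for a generic MDP.
       def value_iteration(states, actions, T, R, horizon):
           """
           states  : iterable of hashable states
           actions : iterable of actions
           T(s, a) : list of (probability, next_state) pairs
           R(s)    : reward obtained on entering state s
           Returns (V, policy): V[k][s] is the best expected total reward with k steps
           to go; policy[k][s] is an action achieving it.
           """
           V = [{s: 0.0 for s in states}]
           policy = []
           for _ in range(horizon):
               Vk, pk = {}, {}
               for s in states:
                   best_value, best_action = float('-inf'), None
                   for a in actions:
                       q = sum(p * (R(s2) + V[-1][s2]) for p, s2 in T(s, a))
                       if q > best_value:
                           best_value, best_action = q, a
                   Vk[s], pk[s] = best_value, best_action
               V.append(Vk)
               policy.append(pk)
           return V, policy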