Introduction to Artificial Intelligence

Assignment 6


Written assignment: probabilistic inference and learning

Justify all answers!

1) We have the following joint distribution over the binary variables A, B, C, given as a full table:

   A     B     C     probability
  -------------------------------
   F     F     F       0.125
   F     F     T       0.125
   F     T     F       0.125
   F     T     T       0.125
   T     F     F       0.02
   T     F     T       0.08
   T     T     F       0.08
   T     T     T       0.32

a)   Construct a Bayes network that most compactly describes the distribution (i.e., one that has
     the smallest number of arcs).

b) Is the answer unique?
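
   The most compact structure can be read off from which (conditional)
   independencies the table satisfies. Below is a minimal Python sketch,
   assuming only the table above, for checking candidate independencies
   numerically (shown here for B and C, unconditionally and given A):

      # Joint distribution P(A, B, C) from the table above.
      P = {
          (False, False, False): 0.125, (False, False, True): 0.125,
          (False, True,  False): 0.125, (False, True,  True): 0.125,
          (True,  False, False): 0.02,  (True,  False, True): 0.08,
          (True,  True,  False): 0.08,  (True,  True,  True): 0.32,
      }

      def marginal(P, keep):
          # Sum out all variables except those at the given index positions.
          out = {}
          for assignment, p in P.items():
              key = tuple(assignment[i] for i in keep)
              out[key] = out.get(key, 0.0) + p
          return out

      def independent(P, i, j, tol=1e-9):
          # True iff the variables at positions i and j are independent.
          pij = marginal(P, (i, j))
          pi, pj = marginal(P, (i,)), marginal(P, (j,))
          return all(abs(pij[(a, b)] - pi[(a,)] * pj[(b,)]) < tol
                     for (a, b) in pij)

      print("I(B, C):", independent(P, 1, 2))
      for a in (False, True):
          # Condition on A=a and renormalize, then test B vs. C again.
          z = sum(p for k, p in P.items() if k[0] == a)
          cond = {k: p / z for k, p in P.items() if k[0] == a}
          print("I(B, C | A=%s):" % a, independent(cond, 1, 2))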

2) Consider the following variables and their immediate causes (parents):

     variable         parents
-------------------------------
        A               none
        B               A
        C               B D E
        D               A
        E               D
        F               E

    a) Is this network a poly-tree?
    b) Determine the truth of the following independence
       statements (using d-separation):
       1)  I({A}, {E} | {})
       2)  I({A}, {E} | {C})
       3)  I({A}, {E} | {D})
       4)  I({D}, {B} | {})
       5)  I({B}, {D} | {A})
       6)  I({B}, {D} | {A, C})
       7)  I({A}, {F} | {E})
    c) Assume that the nodes are binary-valued.
       Suppose that P(A=true) = 0.8, P(D=true|A=true) = 0.5, P(D=true|A=false) = 0.1,
       P(E=true|D=true) = 0.5, P(E=true|D=false) = 0.2

       Find P(A=true | E=true, F=true)

       Hint: you may want to preprocess the network to eliminate irrelevant nodes
       before doing any computation (a sketch along these lines appears below).
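
    One way to follow the hint: B and C are barren (unobserved, with no
    observed descendants), and F's only parent is E, so given E the evidence
    F=true adds nothing about A; the query then reduces to P(A=true | E=true)
    on the chain A -> D -> E. A minimal sketch of that computation, using only
    the CPTs given above (for checking your hand derivation):

        # CPTs from part (c): P(A), P(D|A), P(E|D), indexed by truth values.
        P_A = {True: 0.8, False: 0.2}
        P_D = {True: {True: 0.5, False: 0.5},   # P(D=d | A=a) as P_D[a][d]
               False: {True: 0.1, False: 0.9}}
        P_E = {True: {True: 0.5, False: 0.5},   # P(E=e | D=d) as P_E[d][e]
               False: {True: 0.2, False: 0.8}}

        def joint_with_e_true(a):
            # P(A=a, E=true), summing out D.
            return sum(P_A[a] * P_D[a][d] * P_E[d][True] for d in (True, False))

        num = joint_with_e_true(True)
        print("P(A=true | E=true) =", num / (num + joint_with_e_true(False)))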

3) You need to construct a 4-input neural network whose output is 1 exactly
   when an even number of its inputs are 1, and 0 otherwise
   (assume inputs are in {0, 1}).
   a) Can this be done without hidden units?
   b) Show a network using threshold elements that performs
      the required computation (specify the weights, too!)
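
   Not a solution, but a small test harness you could use to check a candidate
   design: it evaluates a two-layer network of linear threshold units on all
   16 input vectors and compares against the parity-even target (here 0 ones
   counts as even). All weights and thresholds below are placeholders, NOT a
   working parity network -- substitute your own:

      from itertools import product

      def unit(weights, threshold, inputs):
          # Linear threshold unit: fires (1) iff sum_i w_i * x_i >= threshold.
          return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

      def network(x, hidden, output):
          # Hidden layer of threshold units, then one threshold output unit.
          h = [unit(w, t, x) for (w, t) in hidden]
          out_w, out_t = output
          return unit(out_w, out_t, h)

      # Placeholder parameters -- replace with your design.
      hidden = [([1, 1, 1, 1], k) for k in (1, 2, 3, 4)]
      output = ([1, -1, 1, -1], 1)

      for x in product((0, 1), repeat=4):
          target = 1 if sum(x) % 2 == 0 else 0
          print(x, "network:", network(x, hidden, output), "target:", target)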

4) a) Using the perceptron learning rule (the "delta" rule), show the steps
      of learning the 3-input AND function with a single perceptron
      and a step activation function with threshold 1. Initially all
      weights are 0.9, and the learning rate is 0.5.
      Note: in this case use the constant 1 instead of the derivative
      of the g() function.

   b) Show (by example) that the order of presenting the examples can
      change the number of training episodes.
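
   A sketch for checking the hand-worked trace in part (a): one perceptron
   with the fixed threshold 1 from the problem statement (no bias weight is
   learned -- adjust if your course convention differs), initial weights 0.9,
   learning rate 0.5, and g' replaced by the constant 1 as the note says.
   The loop is bounded rather than open-ended, since, as part (b) suggests,
   the presentation order affects how the run unfolds.

      from itertools import product

      alpha = 0.5                        # learning rate
      theta = 1.0                        # fixed step-activation threshold
      w = [0.9, 0.9, 0.9]                # initial weights
      examples = [(x, int(all(x))) for x in product((0, 1), repeat=3)]  # AND

      def out(w, x):
          # Step activation: 1 iff the weighted sum reaches the threshold.
          return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

      for epoch in range(1, 21):
          errors = 0
          for x, target in examples:
              y = out(w, x)
              if y != target:
                  errors += 1
                  # Delta rule, with g' taken as the constant 1.
                  w = [wi + alpha * (target - y) * xi for wi, xi in zip(w, x)]
                  print("epoch %d: x=%s t=%d out=%d -> w=%s" % (epoch, x, target, y, w))
          if errors == 0:
              print("converged after %d epochs, w = %s" % (epoch, w))
              break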


5)  A project leader in the DT Prosero project would like to construct a
    decision procedure based on results from hiring employees in the IMG4
    consortium. The decision procedure will be based on a decision tree, using information
    in the case database. Attributes for each employee are:
    degree level (A),  Java programming ability (B), self-motivation (C), grade in AI (D).
    The target attribute is the "decision", describing the amount of money to
    pay the employee.

  a) Construct by hand (using greedy search with the best-information-gain
     heuristic) a consistent decision tree for the following table:

    A      B      C      D      decision
  ------------------------------------------
    1      F      H      95         8
    1      T      L      70         4
    1      F      L      70         4
    1      T      M      95         6
    1      T      H      80         6
    2      T      H      95         8
    2      T      H      80         8
    2      F      H      95         8
  b) Is a more compact (fewer internal nodes) decision tree possible for the above table?
     If not, prove it; if yes, which tree?
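
  A sketch for checking the root-level information-gain computations (all
  attributes treated as discrete, including the numeric grade D; rows and
  column names follow the table above). The same function applied to a subset
  of rows gives the gains at deeper nodes:

      from math import log2
      from collections import Counter

      # Rows are (A, B, C, D, decision), copied from the table.
      rows = [
          (1, 'F', 'H', 95, 8), (1, 'T', 'L', 70, 4), (1, 'F', 'L', 70, 4),
          (1, 'T', 'M', 95, 6), (1, 'T', 'H', 80, 6), (2, 'T', 'H', 95, 8),
          (2, 'T', 'H', 80, 8), (2, 'F', 'H', 95, 8),
      ]

      def entropy(labels):
          n = len(labels)
          return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

      def info_gain(rows, attr):
          # Gain of splitting on column index attr (0..3) w.r.t. the decision.
          labels = [r[-1] for r in rows]
          remainder = 0.0
          for v in set(r[attr] for r in rows):
              subset = [r[-1] for r in rows if r[attr] == v]
              remainder += len(subset) / len(rows) * entropy(subset)
          return entropy(labels) - remainder

      for i, name in enumerate("ABCD"):
          print("gain(%s) = %.4f" % (name, info_gain(rows, i)))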


6)  Consider the following taxation scenario, occurring in a certain middle-eastern country.
    We will assume that you are risk-neutral, i.e. that
    your utility function is linear in the amount of money that your assets are worth.
    Assume also that moral considerations are not part of your utility function.
 
    Your accountant has notified you that your taxes for FY 2005-2006 amount to 1,000,000 NIS.
    You have the following options:
       1) Declare the correct taxes verbatim, and pay the full amount.
       2) Attempt to use a tax shelter, in which case, if you succeed, you will only need to
          pay 100,000 NIS. However, a senior tax-collection evaluator will then assess
          your forms and either accept your claim or refuse it, in which case you will also
          pay fines, and your total taxes will be 2,000,000 NIS.

    There are 4 possible evaluators, equally likely to be on your case:

    Evaluator A will accept your claim with probability 0.5

    Evaluator B will accept your claim with probability 0.2; in the remaining 80% of
       cases, B can be convinced that your claim is justifiable with probability 0.9, if
       100,000 NIS (of your money) is redirected to better causes
       (such as helping out with a certain project of evaluator B's cousin).

    Evaluators C and D will never accept your claim.

     a) Which of the above taxation options should you prefer as a rational agent?

     b) Suppose that a "friend" of the accountant working at the treasury department can give you a "hint"
        about which evaluator will be on your case, if you grease his palm.
        How much is this information worth, given that your "friend" is perfectly reliable in this respect?
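
     A sketch of the expected-cost bookkeeping for both parts, under one
     reasonable reading of the scenario: a risk-neutral agent minimizes
     expected payment, and the 100,000 NIS redirected for evaluator B is paid
     on top of whatever taxes result (whether redirecting it is worthwhile is
     itself a decision, hence the min below). Verify the interpretation and
     the resulting figures by hand:

        FULL, SHELTER, FINED, BRIBE = 1000000, 100000, 2000000, 100000  # NIS

        def cost_B():
            # Accepts outright w.p. 0.2; otherwise you may redirect BRIBE,
            # after which B is convinced w.p. 0.9.
            with_bribe = 0.9 * (SHELTER + BRIBE) + 0.1 * (FINED + BRIBE)
            return 0.2 * SHELTER + 0.8 * min(with_bribe, FINED)

        # Expected payment under option 2, per evaluator (all equally likely).
        per_eval = {'A': 0.5 * SHELTER + 0.5 * FINED,
                    'B': cost_B(),
                    'C': FINED,
                    'D': FINED}
        option2 = sum(per_eval.values()) / 4
        print("option 1:", FULL, "  option 2:", option2)

        # Part (b): a perfectly reliable hint lets you pick the better of the
        # two options separately for each evaluator.
        informed = sum(min(FULL, c) for c in per_eval.values()) / 4
        print("value of the hint:", min(FULL, option2) - informed)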



7) Consider the following MDP, where the score is the total reward obtained in 3 steps.
   The world has 4 locations in a linear sequence, as follows:  F1 I F2 P
   Actions are: R attempts to move right, and L attempts to move left. Locations F1 and F2 contain
   flags, with values 1 and 10, respectively. A robot visiting a location with a flag
   gets a reward equal to the flag value, and then the flag disappears. P is an absorbing state,
   with a reward of -100.

   The transition model: a move succeeds with probability 0.8, but with probability
   0.2 the robot instead stays in the same location or moves in the opposite direction
   (if possible), with probability 0.1 each (0.2 to stay in place if at the end of the sequence).

   a) Formalize the above problem as an MDP, i.e. what are the states, the transition probability
      function, and the reward function?

   b) Find the optimal policy.
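
   Once part (a) pins down the states (location, flags remaining) and the
   transition/reward model, part (b) can be cross-checked with a short
   finite-horizon expectimax / value-iteration sweep. The sketch below is
   generic: T(s, a) is assumed to return a list of (probability, next_state,
   reward) triples built from your model in part (a); nothing here is
   specific to the flag world.

      ACTIONS = ('L', 'R')

      def best_value(s, steps_left, T):
          # Optimal expected total reward from s with the given horizon.
          if steps_left == 0:
              return 0.0
          return max(sum(p * (r + best_value(s2, steps_left - 1, T))
                         for p, s2, r in T(s, a))
                     for a in ACTIONS)

      def best_action(s, steps_left, T):
          # The action attaining the optimum above (the policy at s).
          return max(ACTIONS,
                     key=lambda a: sum(p * (r + best_value(s2, steps_left - 1, T))
                                       for p, s2, r in T(s, a)))

      # Usage with your own (hypothetical) model and start state:
      #   print(best_action(start_state, 3, T))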

Deadline: Wednesday, January 17, 2007, at 12 noon.

Note: submission (as well as WORK) on this exercise is SOLO (that is, NOT in pairs or other groups of size > 1)