Robotics (202-25161) - Spring 2002

Theoretical Exercise 2


Goals

Exercising material in chapters 6-9 of the Planning and Control book. This includes stochastic control, planning under uncertainty, meta-reasoning, and learning.

Questions

  1. An aircraft is moving (exactly) parallel to the x axis. At time t=0 a GPS measurement stated its (x, y, z) coordinates as (0, 0, 100), (all coordinates in meters) and at time t=10 another GPS measurement gave the coordinates (1010, -10, 105). According to the inertial navigation system, the aircraft velocity is 100 meters per second, with a standard deviation of 10 meters per second. GPS measurements have a standard deviation of 10 in x and y and a standard deviation of 5 in z. Assume that the errors along the axes are independent. What is the optimal estimate for aircraft location at t=10 and t=20, and the standard deviations for the estimate?
  2. Continue to compute the optimal policy for 3 and 4 steps, for the 6-state soccer MDP problem given in class.
  3. Write fuzzy control rules for the inverted pendulum control problem, where there are 5 possible actions: do nothing, push left or push right, each of the latter can be either a "normal" push or a "hard" push. The observed state variables are pendulum angle (using fuzzy sets: perpendicular, SLIGHTLY LEFT, LEFT, A LOT LEFT, OK, SLIGHTLY RIGHT, RIGHT, A LOT RIGHT), and pendulum angular speed (NONE, LEFT, FAST LEFT, RIGHT, FAST RIGHT). Desired condition is 0 angle (i.e. fuzzy set OK). Define the relevant fuzzy sets (characteristic function) and a set of fuzzy control rules for achieving this task.
  4. Complete computation of the optimal action-sequence fr the 3-state STL problem example given in class.
  5. Consider the folowing simplified game of poker. You and one opponent get dealt a hand. The opponent makes a bet. You now have 2 actions: fold, call (for simplicity, we shall ignore "raise"). The pot contains 50 NIS, and you need to pay another 10 NIS to call. You have a pair of Jacks. Now, assume that you judge that the probability that the opponent has bluffed is 0.2 (in which case - if you call - you win), otherwise, the opponent will have BETTER than a pair of Jacks with probability 0.9. Do you call or fold, if:
  6. You have 3 employees, which have the following (certain) performance profiles. Employee A gives no results at all initially, but after 10 hours provides a comes up with a product worth 100 NIS, and thereafter goes to sleep forever. Employee B produces products worth 5 NIS per hour, until 10 hours of work, at which point gains only 2 NIS per hour. Employee C provides results worth 10 NIS per hour, but goes to sleep forever after 8 hours. You have the following deadlines: for employee A's product - 15 hours, for employee B and C products - 20 hours. You can only employ one person at any given time. What is the optimal employment schedule? Can you use the back-to-front algorithm? How can you produce the optimum, given that the only non-concave profiles are step-functions, and that their number is bounded (and small)?
  7. You need to learn the function OR in a perceptron. Start with weights all zero and a threshold of 1 (assume it cannot be changed), and show how the delta rule is used to train this network, with a learning rate 0f 0.4. Now, suppose you need to learn the "equals" predicate (result is 1 only if the inputs are equal). Can this be done with a single perceptron? Can this be done with more than 1 layer of perceptrons?
  8. Use K-means clustering (manually) to cluster the following set of points in 2 dimensions: (0, 0), (0, 2), (0, 4), (3, 0). Show how an initial choice for the cluster center seeds and order of points considered in the algorithm changes the obtained results.

Due to the short time remaining this is NOT a mandatory exercise. However, it is best to do it anyway before Tuesday, June 18 as preparation for the second midterm exam.