Robotics (202-25161) - Spring 2002
Theoretical Exercise 2
Goals
Exercising material in chapters 6-9 of the Planning and Control book.
This includes stochastic control, planning under
uncertainty, meta-reasoning, and learning.
Questions
- An aircraft is moving (exactly) parallel to the x axis. At time
t=0 a GPS measurement stated its (x, y, z) coordinates as (0, 0, 100),
(all coordinates in meters)
and at time t=10 another GPS measurement gave the coordinates
(1010, -10, 105). According to the inertial navigation system,
the aircraft velocity is 100 meters per second, with a standard deviation of
10 meters per second. GPS measurements have a standard deviation of 10 in
x and y and a standard deviation of 5 in z. Assume that the errors along
the axes are independent.
What is the optimal estimate for aircraft location at t=10 and t=20,
and the standard deviations for the estimate?
- Continue to compute the optimal policy for 3 and 4 steps, for the 6-state
soccer MDP problem given in class.
- Write fuzzy control rules for the inverted pendulum control problem,
where there are 5 possible actions: do nothing, push left or push right,
each of the latter can be either a "normal" push or a "hard" push.
The observed state variables are pendulum angle
(using fuzzy sets: perpendicular, SLIGHTLY LEFT, LEFT, A LOT LEFT, OK,
SLIGHTLY RIGHT, RIGHT, A LOT RIGHT),
and pendulum angular speed (NONE, LEFT, FAST LEFT, RIGHT, FAST RIGHT).
Desired condition is 0 angle (i.e. fuzzy set OK).
Define the relevant fuzzy sets (characteristic function) and a set of
fuzzy control rules for achieving this task.
- Complete computation of the optimal action-sequence fr
the 3-state STL problem example given in class.
- Consider the folowing simplified game of poker. You and one opponent
get dealt a hand. The opponent makes a bet. You now have 2 actions:
fold, call (for simplicity, we shall ignore "raise"). The pot contains
50 NIS, and you need to pay another 10 NIS to call. You have a pair of Jacks.
Now, assume that you
judge that the probability that the opponent has bluffed is 0.2
(in which case - if you call - you win), otherwise,
the opponent will have BETTER than a pair of Jacks with probability 0.9.
Do you call or fold, if:
- This is the only hand you will ever play against this opponent.
- You expect to play 1 more game against this opponent, and are sure to
be AGAIN in the exact same situation, except that if you called, the
probability that the opponent will bluff again changes depending on the
outcome - if she bluffed, she is prone to bluff and will bluff again
with probability 0.4. Otherwise, probability of bluff goes down to 0.1.
- You have 3 employees, which have the following (certain)
performance profiles. Employee A gives no results at all initially, but after
10 hours provides a comes up with a product worth 100 NIS, and thereafter
goes to sleep forever.
Employee B produces products worth 5 NIS per
hour, until 10 hours of work, at which point gains only 2
NIS per hour. Employee C provides results worth 10 NIS per hour, but
goes to sleep forever after 8 hours. You have the following deadlines:
for employee A's product - 15 hours, for employee B and C products - 20
hours. You can only employ one person at any given time. What is the optimal
employment schedule? Can you use the back-to-front algorithm? How can you
produce the optimum, given that the only non-concave profiles are
step-functions, and that their number is bounded (and small)?
- You need to learn the function OR in a perceptron. Start with weights
all zero and a threshold of 1 (assume it cannot be changed),
and show how the delta rule is used to train this network,
with a learning rate 0f 0.4.
Now, suppose you need to learn the "equals" predicate (result is 1 only
if the inputs are equal). Can this be done with a single perceptron?
Can this be done with more than 1 layer of perceptrons?
- Use K-means clustering (manually)
to cluster the following set of points in 2
dimensions: (0, 0), (0, 2), (0, 4), (3, 0). Show how an initial choice
for the cluster center seeds and order of points considered in the algorithm
changes the obtained results.
Due to the short time remaining this is NOT a mandatory
exercise. However, it is best to do it anyway before Tuesday,
June 18 as preparation for the second midterm exam.