Introduction to Artificial Inteligence

Assignment 5


Programming assignment - Decision-making under uncertainty

Syrian Traveler Problem (bonus assignment)

Goals

Sequential decision making under uncertainty using belief-state MDP for decision-making: the Syrian traveller problem. (This is an obscure variant of the Canadian treveler problem.)

Syrian Traveller Problem - Domain Description

The domain description is silmilar to that described in assignment 1, except that now we do not know the locations of the terrorists, similar to assignment 4. For simplicity, however, we will assume that the terrorists appear independently, with a known given probability, at each edge. They are revealed with certainty when an agent reaches a vertex incident on the edge.

Thus, in the current problem to be solved, we are given a weighted undirected graph, where each edge has a known probability of having terrorists. The probabilities of terrorists are mutually independent. The agent's only action are driving between vertices, with or without an army escort, and with or without chems, as before. Traversal costs are as defined in assignment 1. Also for simplicity, we will assume only one chem at vertex s, only one goal at vertex t, and at most 1 army. The problem is to find a policy that brings the chem from s to t, that has minimal expected cost.

The graph can be provided in a manner similar to previous assignments, for example:

#V 4    ; number of vertices n in graph (from 1 to n)

#E 1 2 W3 T0.1   ; Edge from vertex 1 to vertex 2, weight 3, terrorist probability 0.1
#E 2 3 W2 T0     ; Edge from vertex 2 to vertex 3, weight 2, terrorist probability 0
#E 3 4 W3 T0.3   ; Edge from vertex 3 to vertex 4, weight 3, terrorist probability 0.3
#E 2 4 W4 T0.3   ; Edge from vertex 2 to vertex 4, weight 4, terrorist probability 0.3
#E 1 4 W100 T 0  ; Edge from vertex 1 to vertex 4, weight 100, terrorist prob. 0

#A 2             ; Army at vertex 1

The start, chem, and goal vertices, should be determined via some form of user input (a file or querying the user). For example, in the above graph the start vertex could be 1, and the goal vertex could be 4.

Solution method

The Canadian traveller problem is known to be PSPACE-complete (and it is likely that the Syrian traveler problem is also PCSPACE complete) so you will be required to solve only very small instances. We will require that the entire belief space be stored in memory explicitly, and thus impose a limit of at most 10 edges with possible terrorists in the graph. Your program should initialize belief space value functions and use a form of value iteration (discussed in class) to compute the value function for the belief states. Maintain the optimal action during the value iteration for each belief state, so that you have the optimal policy at convergence.

Requirements

Your program should read the data, including the parameters (start, chem, and goal vertices). You should construct the policy, and present it in some way. Provide at least the following types of output:

  1. A full printout of the value of each belief-state, and the optimal action in that belief state, if it exists. (Print something that indicates so if this state is irregular, e.g. if it is unreachable).
  2. Run a sequence of simulations. That is, generate a graph instance (ice locations) according to the ice distributions, and run the agent through this graph based on the (optimal) policy computed by your algorithm. Display the graph instance and sequence of actions. Allow the user to run additional simulations, for a newly generated instance in each case.

Deliverables

  1. Source code and executable files of programs.
  2. Explanation of the method employed in your algorithm.
  3. Non-trivial example runs on at least 2 scenarios, including the input and output.
  4. Submit makefile and short description on how to run your program.

Deadline: January 20, 2014