# Introduction to Artificial Intelligence

## Environment simulator and agents for the Canadian traveler problem

In this first exercise you are asked to implement an environment simulator that runs a variant of the Canadian traveller problem (CTP). Then, you will implement some agents that live in the environment and evaluate their performance.

In the Canadian traveller problem, we are given a weigted graph, and the goal is to travel as cheaply as possible from a given start vertex to a given goal vertex. However, unlike standard shortest path in graph problems, which have known solutions (e.g. the Dijkstra algorithm), here the problem is that some of the edges (an edge being an abstraction of a road in the real world) may be blocked by snow or ice, this being the Canadian traveller problem.

In the open research problem version of CTP, the agent can only tell which edges are blocked when visiting an incedent vertex. This leads to a problem of shortest path under uncertainty, with which we are not ready to deal yet. Instead, we will introduce a variant with certain knowledge but also add ways to unblock paths, making this an interesting search problem.

### CTP Environment

The environment consists of a weighted graph. Each edge can be either blocked by a given amount of ice, or clear. Some vertices are known to contain a supply of salt. The environment can contain one or more agents, each with a known starting location, and a required goal location. For each agent, there are state variables designating it current location, and the amount of salt it is carrying, initially zero.

An agent can apply 2 types of action: move (as in the standard CTP), and load salt. The latter is a parametrized action, one must specify S, how much salt to load. The result of this action is that the amount of salt that the agent is carrying is increased by S, or by the amount of salt in the current vertex, whichever is less, up to a maximum capacity of salt that an agent can carry (an optional parameter). The results of a move (from a current vertex V to a specific adjacent vertex U) action is as follows. If the edge E to be traversed is clear, the agent reaches U. If E is blocked, and the agent carries a sufficient amount of salt S (at least equal to the amount of ice on E), the required amount of salt is deposited on the ice as the agent traverses E, and again the agent reaches U. Side effects of this action are that E is now clear, and the amount of salt the agent carries is reduced by S. However, if E is blocked but the agent does not have sufficient salt, the action fails, and the agent remains at V.

The simulator should keep track of score, i.e. the number of actions done by each agent, and the total weight of edges traversed by the agent. An agent carrying S salt units is penalized by having its cost increased by a factor that is a function of S. For example, the cost of traversal could be multiplied by (1+S). So while carrying salt can be useful, it is also costly. For a traversal action that deposits salt, we will use the above formula, assuming the amount of salt S the agent holds just before the action. We will also assume that a failed action costs as much as if the action had succeeded, so as to discourage the agents from making stupid move attempts.

### Implementation part I: simulator + simple agents

Initially you will implement the environment simulator, and several simple (non-AI) agents. The environment simulator should start up by reading the graph from a file, in a format of your choice. We suggest using a simple adgencency list in an ASCII file, that initially specifies the number of vertices. For example (comments beginning with a semicolon):

```#V 4    ; number of vertices n in graph (from 1 to n)

#E 1 2 W1 C   ; Edge from vertex 1 to vertex 2, weight 1, clear
#E 2 3 W1 C   ; Edge from vertex 2 to vertex 3, weight 1,clear
#E 3 4 W1 I1  ; Edge from vertex 2 to vertex 3, weight 1, icy (one unit of ice)
#E 2 3 W1 C   ; Edge from vertex 2 to vertex 3, weight 1, clear
#E 2 4 W5 C   ; Edge from vertex 2 to vertex 4, weight 5, clear

#S1 2    ; Two units of salt at vertex 1
```

The simulator should query the user about the number of agents and what agent program to use for each of them, from a list defined below. Initialization parameters for each agent (initial and goal position) are also to be queried from the user.

After the above initialization, the simulator should run each agent in turn, performing the actions retured by the agents, and update the world accordingly. Additionally, the simulator should be capable of displaying the state of the world after each step, with the appropriate state of the agents and their score. A simple screen display in ASCII is sufficient (no bonuses for fancy graphics - this is not the point in this course!).

Each agent program (a function) works as follows. The agent is called by the simulator, together with a set of observations. The agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state (for example, a computed optimal path, or anything else desired) if needed. In this assignment, the agents can observe the entire state of the world.

You should implement the following agents:

1. A human agent, i.e. read the next move from the user, and return it to the simulator.
2. A ice remover automaton. This agent will work as follows: find the closest salt, go and pick it all up, then as long as it is still carrying a sufficient amount of salt, go to the nearest vertex with appropriate blocked edge and traverse it (thus clearing it), go and load more salt if needed, until no other actions are indicated, in which case it stops. Ties are broken in favor of always going for lower-numbered vertices.
3. A greedy agent, that works as follows: the agent should compute the shortest currently unblocked path to its target, and traverse the first edge in that direction.

At this stage, you should run the environment with two agents participating in each run: ice remover automaton, and one other agent that can be chosen by the user. Your program should display and record the scores. In particular, you should run the greedy agent with various initial configurations, and various initial locations of the oice remover automaton, Also, test your environment with several agents in the same run, to see that it works correctly. You would be advised to do a good job here w.r.t. program extensibility, modularity, etc. much of this code may be used for some of the following assignments as well.

### Implementation part II: search agents

Now, after chapter 4, you will be implementing intelligent agents (this is part 2 of the assignment) that need to act in this environment, assuming that it is acting alone. You will be implementing a "search" agent as defined below. All the algorithms will use a heuristic evaluation function of your choice. We will be assuming for simplicity that the load action can only have a parameter of 1 or 2 units of salt, and that the agent's maximum capacity is 3 units.

1. A greedy search agent, that picks the move with best immediate heuristic value.
2. An agent using A* search, with the same heuristic.
3. An agent using a simplified version of real-time A*.

The performance measure will be composed of two parts: S, the agent's score, and T, the number of search expansion steps performed by the search algorithm. The performance of an agent will be:

```   P = f * S + T
```

Clearly, a better agent will have P as small as possible. The parameter f is a weight constant. You should compare the performance of the three agents against the completely passive opponent for the following values of f: 1, 100, 10000. Note that the higher the f parameter, the more important it is to expend computational resources in order to get a better score!

Bonus version: construct a search agent as above, but in addition allow a known ice remover automaton to also acting in the environment. Your search agent needs to take this into account. Observe that although this seems to be a multi-agent problem, the fact that the ice remover is perfectly predictable makes this in essence a single agent search problem.

## Deliverables

The program and code sent to the grader, by e-mail or otherwise as specified by the grader, a printout of the code and results. A document stating the heuristic function you used and the rationale for selecting this function. Set up a time for frontal grading checking of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program...

Due date for part 1 (recommended): November 2, 2009.

For the complete assignment: November 16, 2009.