Introduction to Artificial Intelligence

Assignment 1


Environment simulator and agents for the Harassed Citizen Problem

In this first exercise you are asked to implement an environment simulator that runs a path optimization problem with resources and locks. Then, you will implement some agents that live in the environment and evaluate their performance.

We are given a weighted graph, and the goal is to travel as cheaply as possible from a given start vertex to a given goal vertex. However, unlike standard shortest-path problems in graphs, which have well-known efficient solution methods (e.g. Dijkstra's algorithm), here some of the vertices are blocked, and can only be unblocked if the agent is carrying an appropriate resource (key), which can be found in other vertices of the graph (which may themselves be blocked). This is a problem encountered in many real-world settings, such as when you try to construct housing: you need to purchase land, obtain building permits, etc. But to do that you must obtain proof that you have hired (or are) a licensed contractor, you must have sufficient funds, etc., and must pass through various steps of the bureaucracy, obtaining permits in order to obtain further permits.

Harassed Citizen problem environment

The environment consists of a weighted undirected graph. Each vertex can contain resources (keys), and may or may not be blocked. Resources (keys) and blockages (locks) all have IDs: a key is intended to act on a lock with the same ID. An agent at a vertex automatically picks up all the keys at that vertex just before starting the next move. If a resource limit is enforced, the keys with the smallest IDs are picked first. Entering a locked vertex is only possible when the agent has the appropriate key, at which point the lock and key both disappear.
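
For concreteness only, here is a minimal Python sketch of one possible representation of vertex contents and of the automatic pickup rule; the class and function names are our own illustration (not part of the required design), and it assumes the agent object carries a set of key IDs it holds:

    class Vertex:
        def __init__(self):
            self.keys = set()    # IDs of keys lying at this vertex
            self.locks = set()   # IDs of locks blocking this vertex (empty set = unblocked)

    def pick_up_keys(agent, vertex, limit=None):
        # The agent collects keys automatically just before its next move.
        available = sorted(vertex.keys)
        if limit is not None:                      # optional resource limit
            room = max(0, limit - len(agent.keys))
            available = available[:room]           # keys with the smallest IDs are picked first
        agent.keys.update(available)
        vertex.keys.difference_update(available)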

An agent can only do no-op (costing 1) or traverse actions. The cost of a traverse action is equal to the edge weight. A traverse action succeeds if the target vertex is unblocked (or becomes unblocked because the agent holds the correct key(s)), and fails otherwise. For a failed action, the agent stays at the same location but still pays the cost of the traversal.
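
A traverse action might then be applied roughly as follows; this is only a sketch, and it assumes (one possible reading of the rules) that every lock on the target vertex must be matched by a key the agent holds, with graph.edge_weight and graph.vertex as hypothetical accessors:

    def apply_traverse(agent, graph, target):
        # The edge cost is paid whether or not the move succeeds.
        agent.total_cost += graph.edge_weight(agent.location, target)
        vertex = graph.vertex(target)
        if vertex.locks and not vertex.locks <= agent.keys:
            return False                        # still blocked: the agent stays where it is
        agent.keys -= vertex.locks              # matching keys and locks both disappear
        vertex.locks.clear()
        agent.location = target
        return True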

The simulator should keep track of the score, i.e. the number of actions done by each agent, and the total cost incurred by the agent for traversing edges, etc.

Implementation part I: simulator + simple agents

Initially you will implement the environment simulator, and several simple (non-AI) agents. The environment simulator should start up by reading the graph from a file, in a format of your choice. We suggest using a simple adjacency list in an ASCII file that first specifies the number of vertices. For example (comments begin with a semicolon):

#V 4    ; number of vertices n in graph (from 1 to n)

#E 1 2 W1                 ; Edge from vertex 1 to vertex 2, weight 1
#E 3 4 W1                 ; Edge from vertex 3 to vertex 4, weight 1
#E 2 3 W1                 ; Edge from vertex 2 to vertex 3, weight 1
#E 1 3 W4                 ; Edge from vertex 1 to vertex 3, weight 4
#E 2 4 W5                 ; Edge from vertex 2 to vertex 4, weight 5
#V 1 K 1 Form1            ; Vertex 1 initially contains a key with ID 1. Last word is an optional NAME of resource/key
#V 1 K 2 BribeMoney       ; Vertex 1 initially contains a key with ID 2.
#V 2 L 1 FileForm         ; Vertex 2 is blocked, requires key ID 1 to unblock. Last word is optional NAME of block
#V 2 L 2 HungryOfficial   ; Vertex 2 is blocked, requires key ID 2 to unblock.
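
As an illustration of how little machinery this format needs, here is a rough parser sketch; it assumes exactly the suggested format above and silently ignores the optional names, so adapt it to whatever format you actually choose:

    def load_world(path):
        # Parse the suggested ASCII format: "#V n", "#E u v Wn", "#V v K id [name]", "#V v L id [name]".
        num_vertices, edges, keys, locks = 0, {}, {}, {}
        with open(path) as f:
            for raw in f:
                tokens = raw.split(';', 1)[0].split()     # drop comments and whitespace
                if not tokens:
                    continue
                if tokens[0] == '#V' and len(tokens) == 2:
                    num_vertices = int(tokens[1])
                elif tokens[0] == '#E':
                    u, v, w = int(tokens[1]), int(tokens[2]), int(tokens[3].lstrip('W'))
                    edges[(u, v)] = edges[(v, u)] = w     # undirected edge
                elif tokens[0] == '#V' and tokens[2] == 'K':
                    keys.setdefault(int(tokens[1]), set()).add(int(tokens[3]))
                elif tokens[0] == '#V' and tokens[2] == 'L':
                    locks.setdefault(int(tokens[1]), set()).add(int(tokens[3]))
        return num_vertices, edges, keys, locks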

The simulator should query the user about the number of agents and what agent program to use for each of them, from a list defined below. Initialization parameters for each agent (initial and goal position) are also to be queried from the user.

After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents and updating the world accordingly. Additionally, the simulator should be capable of displaying the state of the world after each step, with the appropriate state of the agents and their score. A simple screen display in ASCII is sufficient (no bonuses for fancy graphics - this is not the point in this course!).
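
The top-level simulator loop might therefore look something like the sketch below; the names (world.apply, display_ascii, agent.program, etc.) are placeholders for whatever interface you design, not a requirement:

    def run_simulation(world, agents, max_steps=100):
        for step in range(max_steps):
            for agent in agents:
                pick_up_keys(agent, world.vertex(agent.location))   # automatic pickup before the move
                action = agent.program(world)       # fully observable: pass the whole state
                world.apply(agent, action)          # no-op costs 1, traverse costs the edge weight
            display_ascii(world, agents)            # simple ASCII display of state and scores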

Each agent program (a function) works as follows. The agent is called by the simulator, together with a set of observations. The agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state (for example, a computed optimal path, or anything else desired) if needed. In this assignment, the agents can observe the entire state of the world.
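
One way (of many) to realize this contract is an agent object whose program method receives the full world state and returns an action, while internal state such as a precomputed path lives on the object; this base class is only a suggestion:

    class Agent:
        def __init__(self, start, goal):
            self.location, self.goal = start, goal
            self.keys = set()                  # keys currently held
            self.total_cost = 0                # score component: cost of actions so far
            self.plan = []                     # internal state, e.g. a computed path

        def program(self, world):
            # Return the next action, e.g. 'noop' or ('traverse', target_vertex).
            raise NotImplementedError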

You should implement the following agents:

  1. A human agent, i.e. print the state, read the next move from the user, and return it to the simulator. This is used for debugging and evaluating the program.
  2. A greedy agent, that works as follows: the agent should compute the shortest currently unblocked path to its goal, and try to follow it. If there is no such path, do no-op. (A sketch of one way to compute such a path appears after this list.)
  3. A greedy key collector agent, that works as follows: the agent should compute the shortest unblocked path (counting each edge as 1, ignoring the weights) to the nearest key location, and traverse the first edge on that path. Prefer the lowest-numbered node in case of ties. If there is no such path, do no-op.
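
For item 2, the "shortest currently unblocked path" can be computed with Dijkstra's algorithm restricted to unblocked vertices; item 3 is the same idea with every edge counted as 1 (i.e. BFS) and key locations as goals, breaking ties toward the lowest-numbered node. A rough sketch of the former, ignoring any keys the agent already holds and using a hypothetical world.neighbors accessor:

    import heapq

    def shortest_unblocked_path(world, source, goal):
        # Dijkstra over currently unblocked vertices; returns a vertex list or None.
        dist, prev = {source: 0}, {}
        frontier = [(0, source)]
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:                                   # reconstruct source -> goal path
                path = [u]
                while path[-1] != source:
                    path.append(prev[path[-1]])
                return path[::-1]
            if d > dist.get(u, float('inf')):
                continue
            for v, w in world.neighbors(u):                 # (neighbor, edge weight) pairs
                if world.vertex(v).locks:                   # skip blocked vertices
                    continue
                if d + w < dist.get(v, float('inf')):
                    dist[v], prev[v] = d + w, u
                    heapq.heappush(frontier, (d + w, v))
        return None                                         # no unblocked path: do no-op

The greedy agent would then traverse the first edge of the returned path, or do no-op when None is returned.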

At this stage, you should run the environment with three agents participating in each run: one greedy agent, one greedy key collector agent, and one other agent that can be chosen by the user. Your program should display and record the scores. In particular, you should run the greedy agent with various initial configurations. Also, test your environment with several agents in the same run, to see that it works correctly. You would be well advised to do a good job here w.r.t. program extensibility, modularity, etc., as much of this code may be used for some of the following assignments as well.

Implementation part II: search agents

Now, after chapter 4, you will be implementing intelligent agents (this is part 2 of the assignment) that need to act in this environment. Each agent should assume that it is acting alone, regardless of whether or not this is actually the case. You will be implementing a "search" agent as defined below. All the algorithms will use a heuristic evaluation function of your choice.

  1. A greedy search agent, that picks the move with best immediate heuristic value.
  2. An agent using A* search, with the same heuristic (see the sketch after this list).
  3. An agent using a simplified version of real-time A*.
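
As a rough illustration of item 2 (and not a prescribed design), A* here is best run over search states that record the agent's location together with the keys it holds and the locks still in place, since traversals change the world; the heuristic h is whatever admissible estimate you choose. The sketch also counts expansions, which feed into T below, and assumes states are hashable:

    import heapq, itertools

    def astar(start_state, is_goal, successors, h):
        # Generic A*: successors(state) yields (action, next_state, step_cost) triples.
        tie = itertools.count()                 # tie-breaker so states never get compared directly
        frontier = [(h(start_state), next(tie), 0, start_state, [])]
        best_g = {start_state: 0}
        expansions = 0
        while frontier:
            f, _, g, state, plan = heapq.heappop(frontier)
            if is_goal(state):
                return plan, expansions         # plan is the action list; expansions feeds into T
            expansions += 1
            for action, next_state, cost in successors(state):
                g2 = g + cost
                if g2 < best_g.get(next_state, float('inf')):
                    best_g[next_state] = g2
                    heapq.heappush(frontier,
                                   (g2 + h(next_state), next(tie), g2, next_state, plan + [action]))
        return None, expansions                 # no plan found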

The performance measure will be composed of two parts: S, the agent's score, and T, the number of search expansion steps performed by the search algorithm. The performance of an agent will be:

   P = f * S + T

Clearly, a better agent will have P as small as possible. The parameter f is a weight constant. You should compare the performance of the three agents (each acting alone) for the following values of f: 1, 100, 10000. Note that the higher the f parameter, the more important it is to expend computational resources in order to get a better score!
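
For example (with entirely made-up numbers), suppose agent A reaches its goal with score S = 20 after T = 5000 expansions, while agent B reaches S = 15 only after T = 50000 expansions. Then

   f = 1:     P_A = 1*20 + 5000 = 5020          P_B = 1*15 + 50000 = 50015
   f = 10000: P_A = 10000*20 + 5000 = 205000    P_B = 10000*15 + 50000 = 200000

so the heavier search only pays off for the larger value of f.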

Bonus version: construct a search agent as above, but in addition allow one greedy key collector agent to also act in the environment. Your search agent needs to take this into account. Observe that although this seems to be a multi-agent problem, the fact that key collector agents are perfectly predictable makes this in essence a single-agent search problem.

Additional bonus - theoretical: What is the computational complexity of the Harassed Citizen Problem (single agent)? Can you prove that it is NP-hard? Or is it in P? If the latter, can you design an algorithm that solves the problem in polynomial time?

Deliverables

The program and code should be sent to the grader, by e-mail or otherwise as specified by the grader, along with a printout of the code and results. Also submit a document stating the heuristic function you used and the rationale for selecting it. Set up a time for frontal grading of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program...

Due date for part 1 (recommended, not enforced): Thursday, November 10.

For the complete assignment: Thursday, November 24.