Introduction to Artificial Intelligence

Assignment 1


Environment simulator and agents for simplified Unreal Tournament (TM)

In this first exercise you are asked to implement an environment simulator that runs a simplified discrete version of the Unreal Tournament game. Then, you will implement some agents that live in the environment and evaluate their performance.

The game is played on a 2-dimensional finite grid, where each location may contain objects, and agents move among grid locations, much like the version of the wumpus world in the Russell and Norvig Intro to AI book (chapter 6 in the 1st edition, chapter 7 in the 2nd edition). However, the rules of the game and the objects, as well as the goals, are different and are explained below.

Game Environment

Each grid location (square) can be either an obstacle, a patch of ice, or a free square. Assume that the entire perimeter consists of obstacles. Free squares and icy patches are available for agent motion, but may contain objects. The types of objects in the game are flags and special objects (such as weapons, ammunition, etc.). We will ignore the special objects in this assignment. Each flag has an associated reward value: an agent stepping on a flag earns the reward (essentially "picking up" the flag, which then cannot be picked up again). An icy patch is a square onto which an agent can move, but where it cannot change its motion direction or stop unless it hits a wall.
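
For concreteness, one possible internal representation of the grid and flags is sketched below in Python (a sketch only; the names World, FREE, ICE, and OBSTACLE are our own choices, and the characters reuse the map file format suggested later):

    FREE, ICE, OBSTACLE = ' ', '.', '#'   # characters match the suggested map file format

    class World:
        def __init__(self, grid, flags):
            self.grid = grid          # list of row strings, one character per square
            self.flags = dict(flags)  # (row, col) -> reward value; entry removed when picked up

        def passable(self, row, col):
            # Free squares and icy patches are available for agent motion.
            return self.grid[row][col] in (FREE, ICE)

        def pick_up_flag(self, row, col):
            # Return the reward of a flag at (row, col) and remove it; 0 if there is none.
            return self.flags.pop((row, col), 0)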

An agent can perform the following actions: move up, move down, move right, and move left. The agent's score decreases by 1 for each action it performs. The agent's state is its grid position, its last motion direction, and its state of health. An action always succeeds, unless the agent attempts to move into a wall or into a location occupied by another agent, or unless the agent is on an icy patch. An agent on an icy patch continues to move in the same direction it moved in the last turn, unless it hits a wall or was not moving in the last turn. Assume that hitting a wall occurs in the turn after arriving at the icy patch. We will assume that the game is turn based.

Some of the agents are armed sentries. An agent moving into the field of fire of a sentry dies, and becomes a ghost (it can still move, but cannot affect the state of the environment). The field of fire is assumed to be any square in the same row or column as the sentry, unless occluded by an obstacle.
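
As an illustration of the occlusion rule, a possible field-of-fire test is sketched below (assuming the World representation above; in_field_of_fire is our own name, and how you treat the sentry's own square is up to you):

    def in_field_of_fire(world, sentry_pos, target_pos):
        # True if target_pos lies in the same row or column as the sentry and
        # no obstacle lies strictly between the two squares.
        sr, sc = sentry_pos
        tr, tc = target_pos
        if (sr, sc) == (tr, tc):
            return False
        if sr == tr:
            lo, hi = sorted((sc, tc))
            between = world.grid[sr][lo + 1:hi]
        elif sc == tc:
            lo, hi = sorted((sr, tr))
            between = [world.grid[r][sc] for r in range(lo + 1, hi)]
        else:
            return False
        return OBSTACLE not in between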

Implementation Steps

Initially you will implement the environment simulator, and several simple (non-AI) agents. The environment simulator should start up by reading a map of the world from a file, in a format of your choice. We suggest the following simple format: an ASCII file containing the world map, each row on a separate line, with a blank representing free space and other characters representing obstacles, icy patches, and flags. See the example map below, with # representing an obstacle, "." representing an icy patch, and F representing a flag.

##################
#   F         ####
###    ####      #
# ###.## ....#####
#      F ....    #
#         #   F  #
#         #      #
##################
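
A minimal loader for this format might look as follows (a sketch building on the World representation above; load_map is our own name, and since the format does not encode flag values, a uniform reward of 1 is assumed here and should be adapted as needed):

    def load_map(path, flag_reward=1):
        # '#' = obstacle, '.' = icy patch, 'F' = flag, ' ' = free square.
        grid, flags = [], {}
        with open(path) as f:
            for r, line in enumerate(f.read().splitlines()):
                row = []
                for c, ch in enumerate(line):
                    if ch == 'F':
                        flags[(r, c)] = flag_reward   # flag value is not encoded in the file
                        ch = FREE                     # a flag sits on an otherwise free square
                    row.append(ch)
                grid.append(''.join(row))
        return World(grid, flags)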

The simulator should query the user for the number of agents and which agent program to use for each of them, from the list defined below. Initialization parameters for each agent (its position, and for the dumb automaton also its initial motion direction) are also to be queried from the user.

After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents, and update the world accordingly. Additionally, the simulator should be capable of displaying the world map after each step, with the appropriate state of the agents and their score (here, the total value of flags collected minus the total number of actions performed). A simple screen display in ASCII is sufficient (no bonuses for fancy graphics - this is not the point of this course! But you may wish to use the graphics interface provided). To simplify this task, you may assume that the width of the map is at most 70, and its height is at most 30.
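
One possible shape for the main loop and ASCII display is sketched below (render, run, and world.apply are our own names; apply stands for whatever world-update routine you write for movement, ice sliding, flag pickup, and sentry fire, and the agent calling convention assumed here is spelled out after the next paragraph):

    def render(world, agents):
        # Overlay the remaining flags and the agents on top of the static map.
        rows = [list(r) for r in world.grid]
        for (r, c) in world.flags:
            rows[r][c] = 'F'
        for i, ag in enumerate(agents):
            r, c = ag.pos
            rows[r][c] = 'g' if ag.is_ghost else str(i)
        return '\n'.join(''.join(row) for row in rows)

    def run(world, agents, steps):
        for t in range(steps):
            for ag in agents:
                action = ag.program(world)   # agents observe the entire world state
                world.apply(ag, action)      # movement, ice, flag pickup, sentry fire
                ag.score -= 1                # each action costs 1; flag rewards are added in apply
            print(render(world, agents))
            print('scores:', [ag.score for ag in agents])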

Each agent program (a function) works as follows. The simulator calls the agent with a set of observations, and the agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state (for example, a computed optimal path, or anything else desired) if needed. In this assignment, the agents can observe the entire state of the world.
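
A possible calling convention, matching the run loop sketched above (the Agent class and the move constants are our own choices, not prescribed by the assignment):

    UP, DOWN, LEFT, RIGHT = 'up', 'down', 'left', 'right'

    class Agent:
        # The simulator calls program() once per turn with the full world state
        # and expects one of the four moves back. Subclasses may keep any
        # internal state they like (e.g. a computed optimal path).
        def __init__(self, pos, direction=None):
            self.pos = pos
            self.direction = direction   # last motion direction, needed on ice
            self.score = 0
            self.is_ghost = False

        def program(self, world):
            raise NotImplementedError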

You should implement the following agents:

  1. A human agent, i.e. one that reads the next move from the user and returns it to the simulator.
  2. A dumb automaton agent. This agent will work as follows: the agent starts out moving left, until it hits an obstacle. Then it starts moving right until it hits an obstacle, and repeats this indefinitely.
  3. A dumb greedy agent, that works as follows: the agent should find the goal (flag) with the smallest Manhattan distance from the current position, and make a move in that direction (see the sketch after this list).
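
For item 3, one possible (deliberately dumb) implementation, assuming the Agent class and move constants sketched above:

    class DumbGreedyAgent(Agent):
        # Heads toward the flag with the smallest Manhattan distance, ignoring
        # obstacles, ice, and the sentry entirely (hence "dumb").
        def program(self, world):
            r, c = self.pos
            if not world.flags:
                return UP   # arbitrary move once no flags are left
            fr, fc = min(world.flags, key=lambda f: abs(f[0] - r) + abs(f[1] - c))
            if fr < r:
                return UP
            if fr > r:
                return DOWN
            return LEFT if fc < c else RIGHT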

At this stage, you should run the environment with two agents participating in each run: one dumb automaton acting as a sentry, and one other agent that can be chosen by the user. Your program should display and record the scores. In particular, you should run the dumb greedy agent with various initial configurations and flag locations, and various initial locations of the opposing dumb automaton agent. Also, test your environment with several agents in the same run, to see that it works correctly.

Later on, after chapter 4, you will be implementing intelligent agents (this is part 2 of the assignment) that need to act in this environment, assuming that it is populated by one dumb automaton sentry and one "search" agent as defined below. All the algorithms will use a heuristic evaluation function of your choice. Note that although your agent supposedly competes against an opposing agent, the fact that the opposing agent is a dumb automaton allows your agent to model the opposition simply as part of a static and deterministic environment!

  1. A greedy search agent, that picks the move with the best immediate heuristic value.
  2. An agent using A* search, with the same heuristic (see the sketch after this list).
  3. An agent using real-time A*.
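
For item 2, a compact A* skeleton over grid positions is sketched below, using the passable test from the World sketch above; it deliberately ignores ice sliding and the sentry, treats every move as cost 1, and leaves the heuristic to you. The expansion count it returns corresponds to the T in the performance measure described next.

    import heapq

    def astar(world, start, goal, heuristic):
        # Returns (path, expansions): positions from start to goal, plus the
        # number of node expansions performed.
        frontier = [(heuristic(start, goal), 0, start, [start])]
        best_g = {start: 0}
        expansions = 0
        while frontier:
            f, g, pos, path = heapq.heappop(frontier)
            if pos == goal:
                return path, expansions
            expansions += 1
            r, c = pos
            # The obstacle perimeter guarantees that neighbours stay inside the map.
            for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if not world.passable(*nxt):
                    continue
                ng = g + 1   # each move costs one point of score
                if ng < best_g.get(nxt, float('inf')):
                    best_g[nxt] = ng
                    heapq.heappush(frontier,
                                   (ng + heuristic(nxt, goal), ng, nxt, path + [nxt]))
        return None, expansions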

The performance measure will be composed of two parts: S, the agent's score, and T, the number of search expansion steps performed by the search algorithm. The performance of an agent will be:

   P = f * S - T

Clearly, a better agent will have P as large as possible. The parameter f is a weight constant. You should compare the performance of the three agents against the completely passive opponent for the following values of f: 1, 100, 10000. Note that the higher the f parameter, the more important it is to expend computational resources in order to get a better score!
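
As a quick illustration of the trade-off (the numbers below are made up, purely for illustration):

    # Hypothetical figures: agent A scores S = 10 using T = 500 expansions;
    # agent B scores S = 15 but needs T = 40000 expansions.
    for f in (1, 100, 10000):
        print(f, f * 10 - 500, f * 15 - 40000)
    # f = 1:     A = -490,   B = -39985  -> the frugal searcher wins
    # f = 100:   A = 500,    B = -38500  -> still the frugal searcher
    # f = 10000: A = 99500,  B = 110000  -> now the better score pays for the extra search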

Deliverables

The program and code should be sent to the grader, by e-mail or otherwise as specified by the grader, together with a printout of the code and results. Also submit a document stating the heuristic function you used and the rationale for selecting it. Finally, set up a time for frontal grading of the delivered assignment, at which both members of each team must demonstrate at least some familiarity with their program...

Due date for part 1 (recommended): Dec. 1, 2008.

For the complete assignment: Dec. 14, 2008.