Introduction to Artificial Intelligence

Assignment 2

Adversarial games for agents in the simplified Unreal Tournament (TM)

In the second exercise you will use the environment simulator from the first assignment, the discrete version of the Unreal Tournament game, as a platform for implementing intelligent adversarial agents. The environment is the same as before, except that the opposing agent is now an intelligent agent rather than a dumb automaton.

Game Environment

As before, each grid location (square) can be an obstacle, an icy patch, or a free square. An agent can perform the following actions: move up, move down, move left, and move right. The agent state consists of its grid position, its last motion, and its life status (alive or ghost). A motion action always succeeds, unless the agent attempts to move into a wall or into a location occupied by another agent, or is on an icy patch and attempts a motion not in the same direction as its last motion. An agent on an icy patch can try to move in a different direction, succeeding with probability p, an input parameter between 0 and 1; otherwise it slips and continues moving in the same direction.

We also add a shoot action, which hits the opponent if the shooting agent has a clear line of sight to it. Each agent has only k charges (that is, it can execute the shoot action at most k times). Note the change from the previous assignment, where an agent was killed when it was in the field of fire WITHOUT any shoot action. A shot uses up a bullet, obviously, whether successful or not. A successful shot kills the opposing agent, which then becomes a ghost. Ghosts have no further effect on the state of the environment: the opposing agent can move through them, they cannot collect additional flags, and their shots have no effect. Whether they are displayed or not is your choice. The game still continues until all flags are collected or a time limit is reached.
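The icy-patch rule above can be sketched as follows. This is a minimal illustration, not a required interface: the function names, the (row, col) grid encoding, and the injectable random source are all assumptions made here for clarity.

```python
import random

# Directions as (row, col) deltas; a hypothetical encoding of the grid.
DIRS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def resolve_move(last_dir, action, on_ice, p, rng=random.random):
    """Return the direction actually taken.

    On an icy patch, a request to move in a direction other than
    last_dir succeeds with probability p; otherwise the agent slips
    and continues in last_dir. Off the ice, the request is honored."""
    if on_ice and action != last_dir and rng() >= p:
        return last_dir                  # slipped: keep sliding the same way
    return action                        # normal move, or a successful turn

def apply_move(pos, direction):
    """Apply a direction to a (row, col) position.
    Wall and occupancy checks are omitted in this sketch."""
    dr, dc = DIRS[direction]
    return (pos[0] + dr, pos[1] + dc)
```

Passing `rng` explicitly makes the slip rule deterministic in tests while `random.random` is used in actual play.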

Additional clarifying remarks:

The types of objects in the game are flags and special objects (such as weapons, ammunition, etc.). The basic assignment again ignores the special objects. Each flag has an associated reward value: an agent stepping on a flag earns the reward (essentially "picking up" the flag, which then cannot be picked up again). The goal of the game is to pick up flags with as high a total value as possible. There is no score modifier for getting killed or for killing an agent. Dead agents still keep the score for the flags they collected, and cannot be "robbed" by the other agent.

Implementation Steps

The simulator should query the user about the above parameters (k and p), the type of game (see below), as well as other initialization parameters for each agent (type of agent, position, and direction).

After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents and updating the world accordingly. Additionally, the simulator should be capable of displaying the world map after each step, with the appropriate state of the agents and their scores. There are two types of agent programs: human (i.e., input read from the keyboard) and game-tree search agent.
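The turn-taking loop described above might look like the sketch below. The world and agent interfaces here (`flags`, `choose`, `execute`, `score`) are illustrative assumptions, not a prescribed API; any representation with equivalent operations works.

```python
# A minimal sketch of the simulator's main loop, assuming dict-based
# world and agent records. Replace with your own classes as needed.

def run_game(world, agents, max_turns=100, display=False):
    """Run the agents in turn until all flags are collected or the
    per-agent move limit is reached, then return the final scores."""
    for _ in range(max_turns):                 # at most max_turns moves each
        for agent in agents:
            if not world["flags"]:             # game over: no flags left
                return [a["score"] for a in agents]
            action = agent["choose"](world)    # human input or tree search
            world["execute"](world, agent, action)
            if display:
                print(world)                   # map, agent states, scores
    return [a["score"] for a in agents]
```

Checking the termination condition before every individual move (rather than once per round) ensures the game stops as soon as the last flag is taken.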

Each agent program (a function) works as follows. The agent is called by the simulator with a set of observations, and returns a move to be carried out in the current world state. The agent is allowed to keep internal state if needed. In this assignment, the agents can observe the entire state of the world. You should support the following types of games:

  1. A deterministic zero-sum game. In this version, the probability p of not slipping on the ice is 0. The score of an agent is the sum of the values of the flags it collected, minus the sum of the values of the flags collected by the other agent. ("Optimal" agent using mini-max, with alpha-beta pruning.)
  2. A deterministic non-zero-sum game, the same as above except that an agent's score is just the sum of the flag values it collects on its own. ("Optimal" agent using maxi-max.)
  3. A non-deterministic zero-sum game with score as in the first game, but with p strictly between 0 and 1. ("Optimal" agent using expecti-mini-max.)
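For game 1, a generic alpha-beta pruning skeleton is sketched below. The state interface (`moves`, `result`, `terminal`, `utility`) is an assumption made for illustration; adapt it to your world representation, and at the cutoff depth substitute your heuristic evaluation for `utility`.

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing, game):
    """Return the mini-max value of state, pruning with [alpha, beta]."""
    if game.terminal(state) or depth == 0:
        return game.utility(state)        # e.g., own flags minus opponent's
    if maximizing:
        value = -math.inf
        for move in game.moves(state):
            child = game.result(state, move)
            value = max(value, alphabeta(child, depth - 1,
                                         alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:             # beta cutoff: MIN avoids this node
                break
        return value
    value = math.inf
    for move in game.moves(state):
        child = game.result(state, move)
        value = min(value, alphabeta(child, depth - 1,
                                     alpha, beta, True, game))
        beta = min(beta, value)
        if beta <= alpha:                 # alpha cutoff: MAX avoids this node
            break
    return value
```

For game 2 the two players maximize independent scores (maxi-max), and for game 3 the ice squares introduce chance nodes whose values are probability-weighted averages (expecti-mini-max); both are small variations on this skeleton.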

Assume that the game ends (and scores are counted) when there are no more flags, or after 100 moves by each agent, whichever comes first. Since the game tree will usually be too big to reach a terminal position in the search, you should also implement a cutoff, and a heuristic static evaluation function for each game. You may use the same heuristic for all games, if you think this is justified.
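One possible shape for such a static evaluation is sketched below: the current score difference plus a distance-discounted estimate of the value of the remaining flags. This is purely a hypothetical illustration of the idea, not a prescribed heuristic; you must design and justify your own.

```python
def evaluate(my_score, opp_score, my_pos, opp_pos, flags, zero_sum=True):
    """Heuristic value of a cutoff state.

    flags: dict mapping (row, col) -> value of each uncollected flag.
    Each flag's value is discounted by the Manhattan distance to it,
    as a crude proxy for how reachable it is."""
    def potential(pos):
        return sum(v / (1 + abs(pos[0] - r) + abs(pos[1] - c))
                   for (r, c), v in flags.items())
    mine = my_score + potential(my_pos)
    if not zero_sum:
        return mine                       # game 2: own score only
    return mine - (opp_score + potential(opp_pos))
```

Note that Manhattan distance ignores obstacles and ice, so this estimate can be optimistic; a shortest-path distance would be more accurate at a higher cost per node.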


The program and code should be sent to the grader, by e-mail or otherwise as specified by the grader, together with a printout of the code and results. You need to show example scenarios where the optimal behavior differs for the three kinds of games (you will need to make the example scenarios very small in order to be able to reach terminal states in the search). Also include a description of your heuristic evaluation functions and their rationale. Set up a time for frontal grading of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program.

Due date: December 29, 2008.