In the second exercise you will be using the environment simulator from the first assignment, the discrete version of the Unreal Tournament game, as a platform for implementing intelligent adversarial agents. The environment is the same as before, except that now the opposing agent is an intelligent agent, rather than the dumb automaton.
As before, each grid location (square) can be either an obstacle or a free square. An agent can perform the following actions: move forward, turn right, and turn left. The agent state is its grid position and the direction it is facing (N, E, W, or S). A turn is 90 degrees. A motion action always succeeds, unless the agent attempts to move into a wall or into a location occupied by another agent. We also add a shoot action, which succeeds with probability p if the shooting agent is facing its opponent with a clear line of sight (p is a parameter between 0 and 1). Each agent has only k stun-gun charges (that is, it can execute the shoot action at most k times), and a shot uses up a charge whether successful or not. A successful shot stuns the opposing agent, which then becomes incapable of any actions (except thinking and executing "null" actions, which do not change the environment) for K turns (a user-provided parameter). Note that a stunned agent may still block a passage, and may even win the game! The game continues until all flags are collected or a time limit is reached.
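To make the transition rules concrete, here is a minimal sketch of how one agent's action might be resolved. Everything in it (the AgentState record, the helper names, the action strings) is an illustrative assumption, not a required interface:

    import random
    from dataclasses import dataclass

    # Hypothetical representation: the grid is a set of blocked squares
    # (obstacles plus the surrounding wall); each agent is a small record.
    @dataclass
    class AgentState:
        pos: tuple        # (row, col)
        facing: str       # "N", "E", "S" or "W"
        charges: int      # remaining stun-gun charges (k at the start)
        stunned: int = 0  # turns left in the stunned state

    DIRS  = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
    LEFT  = {"N": "W", "W": "S", "S": "E", "E": "N"}
    RIGHT = {v: k for k, v in LEFT.items()}

    def line_of_sight(me, other, blocked):
        """True iff `other` is straight ahead of `me` with nothing in between.
        Assumes the map border squares are in `blocked`, so the walk ends."""
        dr, dc = DIRS[me.facing]
        r, c = me.pos
        while True:
            r, c = r + dr, c + dc
            if (r, c) == other.pos:
                return True
            if (r, c) in blocked:
                return False

    def apply_action(me, other, action, blocked, p, K):
        """Resolve one action for `me`, mutating the agent records in place."""
        if me.stunned > 0:                        # a stunned agent can only "null"
            me.stunned -= 1
            return
        if action == "turn-left":
            me.facing = LEFT[me.facing]
        elif action == "turn-right":
            me.facing = RIGHT[me.facing]
        elif action == "forward":
            dr, dc = DIRS[me.facing]
            target = (me.pos[0] + dr, me.pos[1] + dc)
            if target not in blocked and target != other.pos:
                me.pos = target                   # otherwise the move has no effect
        elif action == "shoot" and me.charges > 0:
            me.charges -= 1                       # the charge is spent, hit or miss
            if line_of_sight(me, other, blocked) and random.random() < p:
                other.stunned = K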
The types of objects in the game are flags and special objects (such as weapons, ammunition, etc.). The basic assignment again ignores the special objects. As before, each flag has an associated reward value: an agent stepping on a flag earns the reward (essentially "picking up" the flag, which then cannot be picked up again). The goal of the game is to pick up flags with as high a total value as possible. There is no score modifier for getting stunned or for shooting an agent. Stunned agents keep the score for the flags they have collected, which cannot be "robbed" by the other agent.
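The flag rule fits naturally into the same transition: when an agent ends up on a square holding a flag, the flag's value is added to that agent's score and the flag disappears. A possible sketch, again with illustrative names:

    def collect_flag(agent_id, agent_pos, flags, scores):
        """`flags` maps square -> reward value; `scores` maps agent id -> total."""
        if agent_pos in flags:
            scores[agent_id] += flags.pop(agent_pos)  # each flag is picked up once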
The simulator should query the user about the above parameters (k, K, and p), the type of game (see below), as well as other initialization parameters for each agent (type of agent, position, and direction).
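One way to gather these parameters, sketched here with hypothetical names and prompts (the actual user interface is up to you):

    from dataclasses import dataclass

    @dataclass
    class GameConfig:
        k: int          # stun-gun charges per agent
        K: int          # turns an agent stays stunned after a successful shot
        p: float        # probability that a shot with clear line of sight hits
        game_type: str  # one of the game types (see below)

    def read_config():
        """Query the user for the global game parameters."""
        return GameConfig(
            k=int(input("stun-gun charges k: ")),
            K=int(input("stun duration K (turns): ")),
            p=float(input("shot success probability p (0..1): ")),
            game_type=input("game type: "),
        )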
After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents and updating the world accordingly. Additionally, the simulator should be capable of displaying the world map after each step, with the appropriate state of the agents and their scores. There are two types of agent programs: human (i.e. read input from the keyboard) and game-tree search agent.
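A possible shape for the main loop and the two agent programs is sketched below; `world.observe`, `world.apply`, `world.display`, and `game_tree_move` stand for code you would write yourself, so treat this as an assumed skeleton rather than a required interface:

    class HumanAgent:
        def act(self, observation):
            # read the next move from the keyboard
            return input("move (forward/turn-left/turn-right/shoot/null): ")

    class GameTreeAgent:
        def __init__(self, depth):
            self.depth = depth   # search cutoff; keeping internal state is allowed
        def act(self, observation):
            return game_tree_move(observation, self.depth)  # see the search sketch below

    def run_game(world, agents, config, max_moves=100):
        """Alternate between the agents, apply their moves, and show the board."""
        for _ in range(max_moves):
            for agent_id, agent in enumerate(agents):
                if world.game_over():                  # all flags collected
                    return world.scores
                action = agent.act(world.observe())    # the whole state is observable
                world.apply(agent_id, action, config)
                world.display()                        # map, agent states, scores
        return world.scores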
Each agent program (a function) works as follows. The agent is called by the simulator with a set of observations. The agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state if needed. In this assignment, the agents can observe the entire state of the world. You should support the following types of games:
Assume that the game ends (and scores are counted) when there are no more flags, or after 100 moves by each agent, whichever comes first. Since the game tree will usually be too big to reach a terminal position during the search, you should also implement a heuristic static evaluation function for each game (possibly the same heuristic for all of them).
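For the adversarial (zero-sum) case, a depth-limited minimax with a heuristic cutoff could look like the sketch below. `legal_moves`, `result`, `terminal`, and `evaluate` are placeholders for your own state representation; `evaluate` is where the heuristic goes (e.g. current score difference plus an estimate of reachable flag value). Strictly speaking, the stochastic shoot action calls for chance nodes (expectiminimax), and alpha-beta pruning is also worth adding, but both are omitted here for brevity.

    def game_tree_move(state, depth):
        """Pick the move maximizing the minimax value at the given depth cutoff."""
        return max(legal_moves(state),
                   key=lambda m: min_value(result(state, m), depth - 1))

    def max_value(state, depth):
        if terminal(state) or depth == 0:
            return evaluate(state)       # heuristic static evaluation function
        return max(min_value(result(state, m), depth - 1) for m in legal_moves(state))

    def min_value(state, depth):
        if terminal(state) or depth == 0:
            return evaluate(state)
        return min(max_value(result(state, m), depth - 1) for m in legal_moves(state))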
You should turn in: the program and code, sent to the grader by e-mail or otherwise as specified by the grader; a printout of the code and results; and a description of your heuristic evaluation functions and their rationale. Additionally, set up a time for frontal grading of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program.
Due date: May 1.