Introduction to Artificial Intelligence

Assignment 2


Adversarial games for agents in the simplified Unreal Tournament (TM)

In the second exercise you will be using the environment simulator from the first assignment, the discrete version of the Unreal Tournament game, as a platform for implementing intelligent adversarial agents. The environment is the same as before, except that now there are two opposing agents operating in the environment.

Game Environment

As before, each grid location (square) can be either an obstacle or a free square. An agent can perform the following actions: move forward, turn right, and turn left. The agent's score decreases by 1 for each action it performs. The agent's state is its grid position and the direction it is facing (N/E/W/S); a turn is 90 degrees. A motion action always succeeds, unless the agent attempts to move into a wall or into a location occupied by another agent. We also add a shoot action, available to an agent facing its opponent with a clear line of sight, which succeeds with probability p/d (where d is the distance between the agents and p is a parameter between 0 and 1). Each agent has only k bullets (that is, it can execute the shoot action at most k times). A shot uses up a bullet whether successful or not. A successful shot kills the opposing agent, which then becomes a "ghost": it cannot collect any more flags, takes up no physical space (for example, it can no longer block a passage), and its only action is a "null" action that does not change the environment. However, the game still continues until all flags are collected or a time limit is reached.
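For example, the shot resolution described above could be simulated as follows (a minimal sketch; the function name and the use of grid distance for d are assumptions, not a required interface):

    import random

    def shot_succeeds(p, d, rng=random):
        # A shot by an agent facing its opponent with a clear line of
        # sight hits with probability p/d (d >= 1, in grid squares).
        return rng.random() < p / d

So at distance 4 with p = 0.8, a shot kills with probability 0.2; the bullet is spent either way.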

The types of objects in the game are flags and special objects (such as weapons, ammunition, etc.). The basic assignment again ignores the special objects. As before, each flag has an associated reward value: an agent stepping on a flag earns the reward (essentially "picking up" the flag, which then cannot be picked up again). The goal of the game is to pick up flags with as high a total value as possible. There is no score modifier for getting killed or for killing an agent: the dead agent keeps the score for the flags it collected before death, and cannot be "robbed" by the other agent...
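In code, the flag bookkeeping above might look like this (a sketch with hypothetical names, assuming flags maps grid positions to reward values):

    def collect_flag(agent, flags):
        # Stepping on a flag square earns its reward exactly once;
        # the flag is then removed and cannot be picked up again.
        if agent.alive and agent.position in flags:
            agent.score += flags.pop(agent.position)

A "ghost" agent never collects, so its score stays at whatever it earned before death.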

Implementation Steps

The simulator should query the user for the above parameters (k and p), the type of game (see below), as well as other initialization parameters for each agent (type of agent, position, and direction).

After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents and updating the world accordingly. Additionally, the simulator should be capable of displaying the world map after each step, with the appropriate state of the agents and their scores. There are two types of agent programs: human (i.e., reading input from the keyboard) and game-tree search agent.
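A minimal sketch of this initialization and main loop, with hypothetical world/agent methods standing in for your own interfaces:

    def init_parameters():
        # Query the user for the global game parameters described above.
        k = int(input("bullets per agent k: "))
        p = float(input("hit parameter p (between 0 and 1): "))
        game_type = int(input("game type (1, 2, or 3): "))
        return k, p, game_type

    def run_game(world, agents, move_limit=100):
        for step in range(move_limit):          # at most 100 moves per agent
            for agent in agents:
                action = agent.act(world)       # full state is observable
                world.apply(agent, action)      # move/turn/shoot/null; each action costs 1
            world.display()                     # map, agent states, scores
            if world.no_flags_left():
                break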

Each agent program (a function) works as follows. The simulator calls the agent with a set of observations, and the agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state if needed. In this assignment, the agents can observe the entire state of the world. You should support the following types of games:
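Both agent types can then share a single interface; a sketch (class and method names are assumptions, not a required API):

    class HumanAgent:
        def act(self, world):
            # Read the next move from the keyboard.
            return input("action (forward/left/right/shoot/null): ")

    class GameTreeAgent:
        def __init__(self, search):
            self.search = search                # keeping internal state is allowed
        def act(self, world):
            # Choose a move by searching the game tree from the current
            # world state, using one of the searches listed below.
            return self.search(world)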

  1. A deterministic zero-sum game. In this version there are no bullets or shooting, and the score of an agent is the sum of all its collected flag values minus the sum of the flag values collected by the other agent. (Optimal agent using mini-max with alpha-beta pruning; sketches of the searches appear after this list.)
  2. A deterministic non-zero-sum game, the same as above except that an agent's score is just the sum of the flag values it collects on its own. (Optimal agent using maxi-max and alpha pruning.)
  3. A non-deterministic zero-sum game with the score as in the first game, but with k > 0 bullets for each agent. (Optimal agent using expecti-mini-max.)
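The three searches differ mainly in how values are backed up the tree. Below is a minimal sketch of mini-max with alpha-beta pruning for game 1, assuming (hypothetically) that a state object offers terminal() and successors(), and that evaluate is the heuristic static evaluation function discussed in the next paragraph. For game 2, each agent maximizes its own component of a score pair instead, so the deep beta cutoffs of the zero-sum case are not sound and only shallow (alpha-style) pruning on the agent's own bound remains.

    def alphabeta(state, depth, alpha, beta, maximizing):
        # Fail-hard alpha-beta: value of state for the maximizing agent.
        if depth == 0 or state.terminal():
            return evaluate(state)
        if maximizing:
            value = float("-inf")
            for child in state.successors():
                value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break                       # beta cutoff
            return value
        else:
            value = float("inf")
            for child in state.successors():
                value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
                beta = min(beta, value)
                if beta <= alpha:
                    break                       # alpha cutoff
            return value

For game 3, a shoot action is evaluated as a chance node: its value is the probability-weighted average of the hit and miss outcomes, using the hit probability p/d defined above. A sketch with the same hypothetical state interface:

    def expectiminimax(state, depth, maximizing):
        if depth == 0 or state.terminal():
            return evaluate(state)
        values = []
        for move in state.legal_moves():
            if move == "shoot":                 # chance node: hit or miss
                q = state.p / state.distance_to_opponent()
                values.append(q * expectiminimax(state.hit_result(), depth - 1, not maximizing)
                              + (1 - q) * expectiminimax(state.miss_result(), depth - 1, not maximizing))
            else:                               # deterministic move
                values.append(expectiminimax(state.result(move), depth - 1, not maximizing))
        return max(values) if maximizing else min(values)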

Assume that the game ends (and scores are counted) when there are no more flags, or after 100 moves by each agent, whichever comes first. Since the game tree will usually be too big to reach a terminal position during the search, you should also implement a heuristic static evaluation function for each game (possibly the same heuristic for all three).
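One possible heuristic (an illustration only; you should design and justify your own) values each remaining flag by its reward discounted by distance, crediting the agent likely to reach it first. Distances here are Manhattan distances ignoring obstacles, and the state accessors are hypothetical:

    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def evaluate(state):
        # Hypothetical accessors: state.flags maps positions to reward values;
        # state.my_pos / state.opp_pos are grid positions; state.my_score /
        # state.opp_score are the scores accumulated so far.
        value = state.my_score - state.opp_score    # zero-sum variant
        for pos, reward in state.flags.items():
            d_me = manhattan(state.my_pos, pos)
            d_opp = manhattan(state.opp_pos, pos)
            if d_me <= d_opp:
                value += reward / (1 + d_me)        # likely ours, discounted
            else:
                value -= reward / (1 + d_opp)       # likely the opponent's
        return value

For the non-zero-sum game, the same idea applies to each agent's own score separately.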

Deliverables

The program and code, sent to the grader by e-mail or otherwise as specified by the grader; a printout of the code and results; and a description of your heuristic evaluation functions and their rationale. Set up a time for frontal grading of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program...

Due date: May 1. (The due date will be postponed a few days due to the midterm exam.)