In the second exercise you will be using the environment simulator from the first assignment, the discrete version of the Unreal Tournament game, as a platform for implementing intelligent adversarial agents. The environment is the same as before, except that now the opposing agent is an intelligent agent, rather than the dumb automaton.
As before, each grid location (square) can be either an obstacle or a free square. An agent can perform the following actions: move forward, turn right, and turn left. The agent state is its grid position and the direction it is facing (N, E, W, or S). A turn is 90 degrees. A motion action always succeeds, unless the agent attempts to move into a wall or into a location occupied by another agent. We also add a shoot action, which succeeds with probability p if the shooting agent is facing its opponent with a clear line of sight (p is a parameter between 0 and 1). Each agent has only k stun-gun charges (that is, it can execute the shoot action at most k times), and a shot uses up a charge whether successful or not. A successful shot stuns the opposing agent, which then becomes incapable of any actions (except thinking and executing "null" actions, which do not change the environment) for K turns (a user-provided parameter). Note that a stunned agent may still block a passage, and may even win the game! The game continues until all flags are collected or a time limit is reached.
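To make the transition rules concrete, here is a minimal sketch of how one agent's action might be resolved. Everything in it (the AgentState record, the helper names, the action strings) is an illustrative assumption, not a required interface:

    import random
    from dataclasses import dataclass

    # Hypothetical representation: the grid is a set of blocked squares
    # (obstacles plus the surrounding wall); each agent is a small record.
    @dataclass
    class AgentState:
        pos: tuple        # (row, col)
        facing: str       # "N", "E", "S" or "W"
        charges: int      # remaining stun-gun charges (k at the start)
        stunned: int = 0  # turns left in the stunned state

    DIRS  = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
    LEFT  = {"N": "W", "W": "S", "S": "E", "E": "N"}
    RIGHT = {v: k for k, v in LEFT.items()}

    def line_of_sight(me, other, blocked):
        """True iff `other` is straight ahead of `me` with nothing in between.
        Assumes the map border squares are in `blocked`, so the walk ends."""
        dr, dc = DIRS[me.facing]
        r, c = me.pos
        while True:
            r, c = r + dr, c + dc
            if (r, c) == other.pos:
                return True
            if (r, c) in blocked:
                return False

    def apply_action(me, other, action, blocked, p, K):
        """Resolve one action for `me`, mutating the agent records in place."""
        if me.stunned > 0:                        # a stunned agent can only "null"
            me.stunned -= 1
            return
        if action == "turn-left":
            me.facing = LEFT[me.facing]
        elif action == "turn-right":
            me.facing = RIGHT[me.facing]
        elif action == "forward":
            dr, dc = DIRS[me.facing]
            target = (me.pos[0] + dr, me.pos[1] + dc)
            if target not in blocked and target != other.pos:
                me.pos = target                   # otherwise the move has no effect
        elif action == "shoot" and me.charges > 0:
            me.charges -= 1                       # the charge is spent, hit or miss
            if line_of_sight(me, other, blocked) and random.random() < p:
                other.stunned = K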
The types of objects in the game are flags and special objects (such as weapons, ammunition, etc.). The basic assignment again ignores the special objects. As before, each flag has an associated reward value: an agent stepping on a flag earns the reward (essentially "picking up" the flag, which then cannot be picked up again). The goal of the game is to pick up flags with as high a total value as possible. There is no score modifier for getting stunned or for shooting an agent. Stunned agents keep the score for the flags they have collected, which cannot be "robbed" by the other agent.
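The flag rule fits naturally into the same transition: when an agent ends up on a square holding a flag, the flag's value is added to that agent's score and the flag disappears. A possible sketch, again with illustrative names:

    def collect_flag(agent_id, agent_pos, flags, scores):
        """`flags` maps square -> reward value; `scores` maps agent id -> total."""
        if agent_pos in flags:
            scores[agent_id] += flags.pop(agent_pos)  # each flag is picked up once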
The simulator should query the user about the above parameters (k, K, and p), the type of game (see below), as well as other initialization parameters for each agent (type of agent, position, and direction).
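One way to gather these parameters, sketched here with hypothetical names and prompts (the actual user interface is up to you):

    from dataclasses import dataclass

    @dataclass
    class GameConfig:
        k: int          # stun-gun charges per agent
        K: int          # turns an agent stays stunned after a successful shot
        p: float        # probability that a shot with clear line of sight hits
        game_type: str  # one of the game types (see below)

    def read_config():
        """Query the user for the global game parameters."""
        return GameConfig(
            k=int(input("stun-gun charges k: ")),
            K=int(input("stun duration K (turns): ")),
            p=float(input("shot success probability p (0..1): ")),
            game_type=input("game type: "),
        )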
After the above initialization, the simulator should run each agent in turn, performing the actions returned by the agents and updating the world accordingly. Additionally, the simulator should be capable of displaying the world map after each step, with the appropriate state of the agents and their scores. There are two types of agent programs: human (i.e. read input from the keyboard) and game-tree search agent.
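A possible shape for the main loop and the two agent programs is sketched below; `world.observe`, `world.apply`, `world.display`, and `game_tree_move` stand for code you would write yourself, so treat this as an assumed skeleton rather than a required interface:

    class HumanAgent:
        def act(self, observation):
            # read the next move from the keyboard
            return input("move (forward/turn-left/turn-right/shoot/null): ")

    class GameTreeAgent:
        def __init__(self, depth):
            self.depth = depth   # search cutoff; keeping internal state is allowed
        def act(self, observation):
            return game_tree_move(observation, self.depth)  # see the search sketch below

    def run_game(world, agents, config, max_moves=100):
        """Alternate between the agents, apply their moves, and show the board."""
        for _ in range(max_moves):
            for agent_id, agent in enumerate(agents):
                if world.game_over():                  # all flags collected
                    return world.scores
                action = agent.act(world.observe())    # the whole state is observable
                world.apply(agent_id, action, config)
                world.display()                        # map, agent states, scores
        return world.scores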
Each agent program (a function) works as follows. The agent is called by the simulator with a set of observations. The agent returns a move to be carried out in the current world state. The agent is allowed to keep an internal state if needed. In this assignment, the agents can observe the entire state of the world. You should support the following types of games:
Assume that the game ends (and scores are counted) when there are no more flags, or after 100 moves by each agent, whichever comes first. Since the game tree will usually be too big to reach a terminal position during the search, you should also implement a heuristic static evaluation function for each game (possibly the same heuristic for all of them).
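For the adversarial (zero-sum) case, a depth-limited minimax with a heuristic cutoff could look like the sketch below. `legal_moves`, `result`, `terminal`, and `evaluate` are placeholders for your own state representation; `evaluate` is where the heuristic goes (e.g. current score difference plus an estimate of reachable flag value). Strictly speaking, the stochastic shoot action calls for chance nodes (expectiminimax), and alpha-beta pruning is also worth adding, but both are omitted here for brevity.

    def game_tree_move(state, depth):
        """Pick the move maximizing the minimax value at the given depth cutoff."""
        return max(legal_moves(state),
                   key=lambda m: min_value(result(state, m), depth - 1))

    def max_value(state, depth):
        if terminal(state) or depth == 0:
            return evaluate(state)       # heuristic static evaluation function
        return max(min_value(result(state, m), depth - 1) for m in legal_moves(state))

    def min_value(state, depth):
        if terminal(state) or depth == 0:
            return evaluate(state)
        return min(max_value(result(state, m), depth - 1) for m in legal_moves(state))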
You should turn in: the program and code, sent to the grader by e-mail or otherwise as specified by the grader; a printout of the code and results; and a description of your heuristic evaluation functions and their rationale. Additionally, set up a time for frontal grading of the delivered assignment, in which both members of each team must demonstrate at least some familiarity with their program.
Due date: May 1.