# Introduction to Artificial Inteligence

## Homework assignment - simulators, agents, search, games, logic, inference

```
1) (FOL and situation calculus)
We need to represent and reason within the scenario defined by assignment 1.
For simplicity, we will ignore the issue of transit times and flooding.

a) Write down the axioms for the behavior of the world using FOL and
situation calculus. Assume only one agent.

Use the predicates:
Edge(e, v1, v2) ; e is an edge in the graph between v1 and v2.
Goal(ag, v)     ; Agent ag's goal is vertex v.
Agent(ag)       ; ag is an agent.
Car(c)          ; c is a car.
Also use the fluents (situation-dependent predicates)
InCar(ag, c, s) ; Agent ag is in car c in situation s.
Loc(o,v,s)      ; Object o is at vertex v in situation s

Constants: as needed, including constants for the agents and cars:
Agents A1, A2 ... and cars C1, C2 ...

Functions: not needed here, except to denote actions:
drive(e)        ; Denotes a move action across edge e (may be ineffective).
(ANSWER: we will fix this to be "drive(ag, e)")
switch(ag, c)   ; Denotes agent ag attempting to switch to car c.
and of course the "Result" function:
Result(a,s)    ; Denotes the situation resulting from doing action a in
; situation s.
You may wish to add auxiliary predicates or fluents in order to simplify the axioms. you
are allowed to do so provided you explain why they are needed.

We will add an auxiliary predicate:
OK(action, s)
meaning that action is successful in situation s
and has the intended effect. If this predicate does not hold, the action will have no effect.
This should shortent the successor-state axioms. The axioms for OK are:

"drive one succeeeds if the agent is in a car at a node incident on e":
1. forall e, s, ag   OK(drive(ag, e),s)  <=>  Agent(ag) and
exists u, v, c   Car(c) and InCar(ag,c,s) and
Edge(e,u,v) and (Loc(ag,u,s) or Loc(ag,v,s))
"switch succeeds if the agent is in the same location as the car":
2. forall c, s, ag   OK(switch(ag,c))   <=>   Agent(ag) and Car(c) and
exists v  Loc(ag,v,s) and Loc(c,v,s)

Now the successor-state axiom for Incar() is "In a car iff successfully
switched to this car, or already was in this car and action was not successful,
or action was not to switch to another car":

3. forall a, s, c, ag   InCar(ag,c,Result(a,s))  <=>
(a=switch(ag,c) and OK(a,s))  or
(InCar(ag,c,s) and (not OK(a,s) or not exists c1 a=switch(ag,c1)))

The successor-state axiom for Loc() is "In a location if drove there
successfully along an incident edge, or was there and no successful
drive action took us elsewhere" (need to consider both cases: object is a car,
and object is an agent):

4. forall a, s, v, o   Loc(o,v,Result(a,s)   <=>
(exists ag, e, u  a=drive(ag,e) and OK(a,s) and Agent(ag) and
Loc(o,u) and (Edge(e,u,v) or Edge(e,v,u)) and
((Car(o) and InCar(ag,o,s)) or o=ag))  or
Loc(o,v,s) and (not OK(a,s) or exists c  a=switch(o,c)

Note that these are rather long AND include the dreaded equality, so
better to split them depending on the action (a good idea since there
are only 2 types of action), to get, for the InCar() predicate:

3a.   forall s, c, ag   InCar(ag,c,Result(switch(ag,c),s))  <=>
OK(switch(ag,c),s) or InCar(ag,c,s)

3b.   forall s, c, c1, ag   not c=c1 =>
InCar(ag,c,Result(switch(ag,c1),s)) <=> InCar(ag,c,s) and not OK(switch(ag,c1),s)

3c.   forall s, c, ag, e   InCar(ag,c,Result(drive(ag,e),s)) <=>  InCar(ag,c,s)

And for the Loc() predicate:

4a.   forall ag, s, v, o, c   Loc(o,v,Result(switch(ag,c),s)  <=> Loc(o,v,s)

4b.   forall ag, s, v, e   Loc(ag,v,Result(drive(ag,e),s)  <=>
(exist u  Loc(ag,u,s) and (Edge(e,v,u) or Edge(e,u,v)) and OK(drive(ag,e),s)) or
(Loc(ag,v,s) and not OK(drive(ag,e),s)))

4c.   forall ag, s, v, o, e   Loc(o,v,Result(drive(ag,e),s)  <=>
(exist u  Loc(ag,u,s) and Incar(ag,o,s) and (Edge(e,v,u) or Edge(e,u,v)) and OK(drive(ag,e),s)) or
(Loc(o,v,s) and (not OK(drive(ag,e),s) or not InCar(ag,c,s)))

Add the facts representing the following scenario in situation S0.
Edge(E1, Vstart, V1)
Edge(E2, V1, Vend)
Car(C1)
Agent(A1)
Goal(A1, Vend)

; Situation S0
Loc(A1, Vstart, S0)
Loc(C1, Vstart)
Loc(C2, V1)
InCar(A1, C1, S0)

b)  Now, we need to find a sequence of actions that will result in reaching the goal,
of agent A1 being in Vend,
and prove a theorem that it will do so. Do the following steps:

A) Convert the knowledge base to conjunctive normal form.

ANSWER: We will first separate <=> into 2 different sentences, to simplify the work.
So for axiom 1:
1. forall e, s, ag   OK(drive(ag, e),s)  <=>  Agent(ag) and
exists u, v, c   Car(c) and InCar(ag,c,s) and
Edge(e,u,v) and (Loc(ag,u,s) or Loc(ag,v,s))

We first do:
1a. forall e, s, ag   OK(drive(ag, e),s)  =>  Agent(ag) and
exists u, v, c   Car(c) and InCar(ag,c,s) and
Edge(e,u,v) and (Loc(ag,u,s) or Loc(ag,v,s))

We have 3 existential variables in the scope of 3 universals, so need 3 Skolem
functions, say C1_of(), U1_of(), and V1_of(), to get (removing universal quantifiers):

1a. not OK(drive(ag, e),s) or
Agent(ag) and Car(C1_of(ag,e,s)) and InCar(ag,C1_of(ag,e,s),s) and
Edge(e,U1_of(ag,e,s),V1_of(ag,e,s)) and (Loc(ag,U1_of(ag,e,s),s) or Loc(ag,VC1_of(ag,e,s),s))

Now we distribute AND over OR to get the CNF clauses:

1a1.   not OK(drive(ag, e),s) or Agent(ag)
1a2.   not OK(drive(ag, e),s) or Car(C1_of(ag,e,s))
1a3.   not OK(drive(ag, e),s) or InCar(ag,C1_of(ag,e,s),s)
1a4.   not OK(drive(ag, e),s) or Edge(e,U1_of(ag,e,s),V1_of(ag,e,s))
1a5.   not OK(drive(ag, e),s) or Loc(ag,U1_of(ag,e,s),s) or Loc(ag,VC1_of(ag,e,s),s)

Now the other direction:
1b. forall e, s, ag   OK(drive(ag, e),s)  <=  Agent(ag) and
exists u, v, c   Car(c) and InCar(ag,c,s) and
Edge(e,u,v) and (Loc(ag,u,s) or Loc(ag,v,s))

Since the existentials are in the antecedents, they are negated and are actually universal
quantifiers. So we get:

1b. OK(drive(ag, e),s)  or  not Agent(ag) or not Car(c) or not InCar(ag,c,s) or
not Edge(e,u,v) or (not Loc(ag,u,s) and not Loc(ag,v,s))

Applying distribution we get CNF clauses:

1b1. OK(drive(ag, e),s)  or  not Agent(ag) or not Car(c) or not InCar(ag,c,s) or
not Edge(e,u,v) or not Loc(ag,u,s)

1b2. OK(drive(ag, e),s)  or  not Agent(ag) or not Car(c) or not InCar(ag,c,s) or
not Edge(e,u,v) or not Loc(ag,v,s)

Sentence 2, likewise we split the <=> into 2 sentences. We get:
2a. forall c, s, ag   OK(switch(ag,c))   =>   Agent(ag) and Car(c) and
exists v  Loc(ag,v,s) and Loc(c,v,s)

2b. forall c, s, ag   OK(switch(ag,c))   <=   Agent(ag) and Car(c) and
exists v  Loc(ag,v,s) and Loc(c,v,s)

For 2a we have 1 existential on v in scope of universals c, s, ag, replace v by
Skolem function V2_of(c,s,ag), and get:

2a.  not OK(switch(ag,c))  or (Agent(ag) and Car(c) and Loc(ag,v,s) and Loc(c,v,s))

Which distributes into CNF clauses:

2a1: not OK(switch(ag,c))  or Agent(ag)
2a1: not OK(switch(ag,c))  or Car(c)
2a1: not OK(switch(ag,c))  or Loc(ag, V2_of(c,s,ag),s)
2a1: not OK(switch(ag,c))  or Loc(c, V2_of(c,s,ag),s))

In 2b again the negation of exists v makes it effectively universally quantified, so we get CNF clause:

2b: OK(switch(ag,c)) or  not Agent(ag) or not Car(c) or not Loc(ag,v,s) or not Loc(c,v,s)

Now for 3a, we get CN clauses:

3a1.  not InCar(ag,c,Result(switch(ag,c),s)) or OK(switch(ag,c),s) or InCar(ag,c,s)
3a2.  InCar(ag,c,Result(switch(ag,c),s)) or not OK(switch(ag,c),s)
3a3.   InCar(ag,c,Result(switch(ag,c),s)) or not InCar(ag,c,s)

In 3b, we first get:

3b.   forall s, c, c1, ag  c=c1 or
(InCar(ag,c,Result(switch(ag,c1),s)) <=> InCar(ag,c,s) and not OK(switch(ag,c1),s))

Which will split into CNF clauses:

3b1:   c=c1 or InCar(ag,c,Result(switch(ag,c1),s)) or not InCar(ag,c,s) or OK(switch(ag,c1),s))
3b2:   c=c1 or not InCar(ag,c,Result(switch(ag,c1),s)) or InCar(ag,c,s)
3b3:   c=c1 or not InCar(ag,c,Result(switch(ag,c1),s)) or not OK(switch(ag,c1),s))

Finally 3c is split into 2, generating CNF clauses:

3c1.   not InCar(ag,c,Result(drive(ag,e),s)) or InCar(ag,c,s)
3c2.   InCar(ag,c,Result(drive(ag,e),s)) or not InCar(ag,c,s)

Now to convert the Loc() predicate axioms, starting with 4a which splits into 2:

4a1.   not Loc(o,v,Result(switch(ag,c),s)  or Loc(o,v,s)
4a2.   Loc(o,v,Result(switch(ag,c),s) or not Loc(o,v,s)

Before converting 4b we first split:

4b.   forall ag, s, v, e   Loc(ag,v,Result(drive(ag,e),s)  =>
(exist u  Loc(ag,u,s) and (Edge(e,v,u) or Edge(e,u,v)) and OK(drive(ag,e),s)) or
(Loc(ag,v,s) and not OK(drive(ag,e),s)))

4b'.   forall ag, s, v, e   Loc(ag,v,Result(drive(ag,e),s)  <=
(exist u  Loc(ag,u,s) and (Edge(e,v,u) or Edge(e,u,v)) and OK(drive(ag,e),s)) or
(Loc(ag,v,s) and not OK(drive(ag,e),s)))

Now 4b has existential u in scope of universally quantified ag,s,v,e, so introduce Skolem
function U2_of(ag,s,v,e) to get:

4b.    not Loc(ag,v,Result(drive(ag,e),s))  or
((Loc(ag,U2_of(ag,s,v,e),s) and
(Edge(e,v,U2_of(ag,s,v,e)) or Edge(e,U2_of(ag,s,v,e),v)) and
OK(drive(ag,e),s)) or
(Loc(ag,v,s) and not OK(drive(ag,e),s)))

Distributing, we get the CNF clauses:

4b1.  not Loc(ag,v,Result(drive(ag,e),s)) or Loc(ag,U2_of(ag,s,v,e),s) or Loc(ag,v,s)
4b2.  not Loc(ag,v,Result(drive(ag,e),s)) or Loc(ag,U2_of(ag,s,v,e),s) or not OK(drive(ag,e),s)
4b3.  not Loc(ag,v,Result(drive(ag,e),s)) or Edge(e,v,U2_of(ag,s,v,e)) or Edge(e,U2_of(ag,s,v,e),v) or Loc(ag,v,s)
4b4.  not Loc(ag,v,Result(drive(ag,e),s)) or Edge(e,v,U2_of(ag,s,v,e)) or Edge(e,U2_of(ag,s,v,e),v) or not OK(drive(ag,e),s)
4b5.  not Loc(ag,v,Result(drive(ag,e),s)) or OK(drive(ag,e),s) or Loc(ag,v,s)
4b6.  not Loc(ag,v,Result(drive(ag,e),s)) or OK(drive(ag,e),s) or not OK(drive(ag,e),s)
(the last one is useless, and can be dropped!)

In 4b' due to the negation u is universally quantified, so we get:
4b'.  Loc(ag,v,Result(drive(ag,e),s)  or (not Loc(ag,u) or (not Edge(e,v,u) and not Edge(e,u,v)) or not OK(drive(ag,e),s)) and
(not Loc(ag,v,s) or OK(drive(ag,e),s)))

This distributes into:
4b'1.  Loc(ag,v,Result(drive(ag,e),s)  or not Loc(ag,u,s) or not Edge(e,v,u) or not OK(drive(ag,e),s)

4b'2.  Loc(ag,v,Result(drive(ag,e),s)  or not Loc(ag,u,s) or not Edge(e,u,v) or not OK(drive(ag,e),s))

4b'3.  Loc(ag,v,Result(drive(ag,e),s)  or not Loc(ag,v,s) or OK(drive(ag,e),s)

Finally, 4c has the same structure as 4b, though somewhat longer, but we will not do it here as
it is not needed for the proof.

B) Formally list what we need to prove, and state its negation.

ANSWER: The sequence of actions: drive(A1,E1), drive(A1,E2) will reach the goal.
We will need to show formally that in the resulting situation the location of A1 is at a goal
location. Formally, we need to show:
Q.     exists v  Goal(A1,v) and Loc(A1,v,Result(drive(A1,E2),Result(drive(A1,E1),S0)))

Negating Q we get:
Q'.     not exists v  Goal(A1,v) and Loc(A1,v,Result(drive(A1,E2),Result(drive(A1,E1),S0)))

Moving negation inwards we have:
Q'.     forall v  not Goal(A1,v) or not Loc(A1,v,Result(drive(A1,E2),Result(drive(A1,E1),S0)))

which results in the CNF clause:
Q'.     not Goal(A1,v) or not Loc(A1,v,Result(drive(A1,E2),Result(drive(A1,E1),S0)))

C) Use resolution to reach a contradiction, thus proving the theorem.

try to be as efficient as possible. We will start with the negated query.
Resolve Q' with the "Goal(A1,Vend)" fact, unifier {v/Vend} to get:
10. not Loc(A1,Vend,Result(drive(A1,E2),Result(drive(A1,E1),S0)))

Resolve 10 with 4b'2, unifier {ag/A1, e/E2, v/Vend, s/Result(drive(A1,E2),S0)} to get:
11. not Loc(A1,u,Result(drive(A1,E2),S0)) or not Edge(E2,u,Vend) or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

Resolve 11 with the "Edge(E2,V1,Vend)" fact, unifier {u/V1} to get:
12. not Loc(A1,V1,Result(drive(A1,E2),S0)) or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

Resolve 12 with 4b'2, unifier {ag/A1, e/E1, v/V1, s/S0} to get:
13. not Loc(A1,Vstart,S0) or not Edge(E1,u,V1) or not OK(drive(A1,E1),S0) or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

Resolve 13 with the "Edge(E1,Vstart,V1)" fact, unifier {u/Vstart} to get:
14. not Loc(A1,Vstart,S0) or not OK(drive(A1,E1),S0) or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

Resove 14 with the "Loc(A1, Vstart, S0)" fact, unifier {}, to get:
15. not OK(drive(A1,E1),S0) or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

We seem to be "almost done", all we need to do is to show that the actions are OK.
As it turns out, not so almost...
Resolve 15 with 1b1 (on the first "OK"), unifier {ag/A1,e/E1,s/S0} to get:
16. not Agent(A1) or not Car(c) or not InCar(A1,c,S0) or not Edge(E1,u,v) or not Loc(A1,u,S0)
or not OK(drive(A1,E2),Result(drive(A1,E1),S0))

For conciseness, we will now do several steps, only listing the end result. Each step
removes one term from the clause:
Resolve 16 with the fact "Agent(A1)", unifier {}.
Then resolve the resulting clause with the fact "InCar(A1,C1,S0)", unifier {c/C1}.
Then resolve the resulting sentence with the fact "Car(C1)", unifier {}.
Then resolve the resulting clause with the fact "Loc(A1,Vstart,S0)", unifier {u/Vstart}.
Then resolve the resulting clause ith the fact "Edge(E1,Vstart,V1)", unifier {v/V1}, to get:
17.   not OK(drive(A1,E2),Result(drive(A1,E1),S0))

So now we only need to prove that the second action is OK. Unfortunately this is more complicated.
Resolve 17 with 1b1, unifier {ag/A1,e/E2,s/Result(drive(A1,E1),S0)} to get:
18. not Agent(A1) or not Car(c) or not InCar(A1,c,Result(drive(A1,E1),S0)) or
not Edge(E2,u,v) or not Loc(ag,u,Result(drive(A1,E1),S0))

Again for conciseness we will do several steps:
Resolve 18 with the fact "Agent(A1)", unifier {}.
Then resolve the resulting sentence with the fact "Car(C1)", unifier {c/C1}.
Then resolve the resulting clause ith the fact "Edge(E2,V1,Vend)", unifier {u/V1,v/Vend}, to get:
19.   not InCar(A1,C1,Result(drive(A1,E1),S0)) or not Loc(ag,V1,Result(drive(A1,E1),S0))

Now what remains to be proved is that the agent is both at V1 and in a car, after the first action.
Resolve 19 with 3c2 on the FIRST Incar() term of 3c2, unifier {ag/A1,c/C1,e/E1,s/S0} to get:
20.    not InCar(A1,C1,S0) or not Loc(ag,V1,Result(drive(A1,E1),S0))

Resolve 20 with fact "InCar(A1,C1,S0)", unifier {} to get:
21.    not Loc(A1,V1,Result(drive(A1,E1),S0))

Resolve 21 with 4b'2, unifier {ag/A1,v/V1,e/E1,s/S0} to get:
22.    not Loc(A1,u,S0) or not Edge(E1,u,V1) or not OK(drive(A1,E1),S0))

Resolving 22 first with the fact "Loc(A1,Vstart,S0)", unifier {u/Vstart},
and then the resulting clause with the fact "Edge(E1,Vstart,V1)", unifier {} we get:
23.     not OK(drive(A1,E1),S0))

This one was already done, essentially, but formally:
Resolve 13 with 1b1, unifier {ag/A1,e/E1,s/S0} to get:
24. not Agent(A1) or not Car(c) or not InCar(A1,c,S0) or not Edge(E1,u,v) or not Loc(A1,u,S0)

For conciseness, we will now do several steps, only listing the end result. Each step
removes one term from the clause:
Resolve 14 with the fact "Agent(A1)", unifier {}.
Then resolve the resulting clause with the fact "InCar(A1,C1,S0)", unifier {c/C1}.
Then resolve the resulting sentence with the fact "Car(C1)", unifier {}.
Then resolve the resulting clause with the fact "Loc(A1,Vstart,S0)", unifier {u/Vstart}.
Then resolve the resulting clause ith the fact "Edge(E1,Vstart,V1)", unifier {v/V1},
to get the empty clause FINALLY DONE!

c) What would be a potential problem in the proof if we did not
have "frame axioms" in our representation (i.e. stating what
did not change)?

We would not have been able to show: InCar(A1,C1,Result(drive(A1,E1),S0)), because this is not
a direct result of drive(), rather something that is NOT CHANGED by drive().

d) Would we be able to do the proof using only forward chaining?

As we used only Horn clauses in the proof, it remains to be seen whether we can come
up with appropriate substitutions. It seems that in this case we can, although need to
go through with the proof to be certain.

2)  (agents and environment simulators)
For each of the following domains, which type of agent is appropriate
(table lookup, reflex, goal based, or utility based). Also, indicate the
type of environment:
a) An agent that plays Poker.
b) An agent that can play Spider solitaire.
c) An autonomous robot that can win the DARPA challenge (driving in
a city).
d) An internet coordination agent (sets meetings between humans).
e) An agent that can solve a Soduku puzzle.
f) An agent that plays Go.

a) Poker needs a utility-based agent. Environment is stochastic, inaccessible, discrete, non-episodic,
static, and multi-agent.
b) Spider solitare can be goal-based or utility based. Environment is deterministic, inaccessible, discrete,
episodic, static, and single-agent.
c) DARPA challenge: Utility-based agent.
Environment is stochastic, inaccessible, continuous, non-episodic, dynamic, and multi-agent.
d) Internet coordination: Utility-based agent.
Environment is stochastic, inaccessible, semi-continuous, non-episodic, dynamic, and multi-agent.
e) Soduku: in general, need goal based. However, most Soduku puzzles require no search and
a reflex agent based on rules suffices. Environment is deterministic, accessible, discrete, episodic,
static, and single-agent.
f) Go: goal-based or utility based. Environment is deterministic, accessible, discrete, (parially) episodic,
static, and multi-agent.

3) (Propositional logic)
For each of the following sentences, determine whether they are
satisfiable, unsatisfiable, and/or valid. For each sentence that
is satisfiable, determine the number of models (over the propositional
variables in the sentence).
a) (A and B and C) or not D
b) (not (A and ((not A) or B) and ((not B) or C))) or C
c) (A and (not A)) and (B and not B)
d) (A and not B) and (C or not D)
e) not A -> A
f) (A -> not A) or (not A -> A)

a) Satisfiable 8+1=9 models.
b) Valid. Same as:  (A and (A=>B) and (B=>C)) => C
c) Unsatisfiable.
d) Satisfiable, 3 models
e) Satisfiable, 1 model
f) Valid

4) (Search):
Show the run of the following algorithms on the setting of question 1
above, where the agent needs to reach Vend as quickly as possible. The speed of C1 is 50, and
that of C2 is 100. Length of E1 and E2 are 100 each, and the time to switch is
0.2.   Assume that h(n) uses simple Dijsktra in the graph and assumes
that the agent is in the fastest available car.

We have 1 agent and 2 cars. Thus the state can be represented as a tuple (Agent, Car1, Car2, InCar) locations
and car ID, respectively. For conciseness we will call Vstart "V0", Vend will be "Vg", and
agent not in a car will be denoted by 0 as the Incar value.

Initial state is thus: (V0, V0, V1, C1), i.e. the agent is in car C1, in location V0, and so is car C1.
Car C2 is in V1. The fastest car is C2, so h(n) will assume speed is 100 and divide the
distance by 100.

a) Greedy search (f(n) = h(n)), breaking ties in favor of states where the
agent is in a faster car.

Initialization: Initial state S0 = (V0, V0, V1, C1), g=0, h=2, put onto agenda. Steps now:
1. Get node with S0, not a goal so expand. Only possibility is drive(E1), cost is 2, which gives:
S1 = (V1, V1, V1, C1), g=2, h=1

2. Get node with S1, not a goal so expand. 3 possibilities: drive(E1), drive(E2), switch(C2),
resulting in, respectively:
S2 = (V0, V0, V1, C1), g=4, h=2
S3 = (Vg, Vg, V1, C1), g=4, h=0
S4 = (V1, V1, V1, C2), g=2.2, h=1

3. Since this is greedy search, it picks S3, which is a goal, and we are done
(path found: drive(E1), drive(E2), cost 4)

b) A* search, (f(n) = g(n)+h(n)),  breaking ties in favor of states where the
agent is in a faster car.

Same as above up to step 2, inclusive. But now:
3. Picks minimal g+h=3, which is S4, not a goal. From here can: drive(E1), drive(E2), switch(C1), to get:
S2 = (V0, V0, V1, C1), g=4, h=2    ; from earlier
S3 = (Vg, Vg, V1, C1), g=4, h=0    ; from earlier
S5 = (V0, V1, V0, C2), g=3.2, h=2
S6 = (Vg, V1, Vg, C2), g=3.2, h=0
S7 = (V1, V1, V1, C1), g=2.4, h=1

4. Now we have S6  with g+h=3.2 as a single best node.
This is a goal state, so we are done, generating the path: drive(E1),switch(C2),drive(E2)
with a path cost of 3.2

c) A* search, (f(n) = g(n)+h(n)),  breaking ties in favor of states where the
agent is in a slower car.

Same as in c, because no ties needed to be broken.

d) Repeat a-c but using h'(n) = 3*h(n) as the heuristic. Is h'(n) admissible?

Using greedy search, same behaviour here as with h(n) as h'(n) is
a linear scaling of h(n) and sorting is according to h'(n).

Using A*: Initial state S0 = (V0, V0, V1, C1), g=0, h=6, put onto agenda. Steps now:
1. Get node with S0, not a goal so expand. Only possibility is drive(E1), cost is 2, which gives:
S1 = (V1, V1, V1, C1), g=2, h'=3

2. Get node with S1, not a goal so expand. 3 possibilities: drive(E1), drive(E2), switch(C2),
resulting in, respectively:
S2 = (V0, V0, V1, C1), g=4, h'=6
S3 = (Vg, Vg, V1, C1), g=4, h'=0
S4 = (V1, V1, V1, C2), g=2.2, h'=3

So here we pick S3, which is a goal, and the resulting path is drive(E1), drive(E2)
with a cost of 4, clearly suboptimal. Same behavior with algorithm c.

5) (Game trees and Search):
Consider a 4-person game with complete information, where the score is
computed for PAIRS that are defined in advance. That is, the players are
A, B, C, D, and AC are a pair, BD are the other pair. The score for
each member a pair are the same, and is NEGATIVE of the scores for the
other pair. For example, if A scores 1, C also scores 1, B scores -1
and D scores -1. However, B is a player that plays completely randomly with
a known distribution. Assume that players take turns in the order A, B, C, D,
repeatedly, until the game ends.

a) Write an algorithm for optimal play by player A.
at the top level A examines every possible moves, and calls the
following eval algorithm with the state resulting from each of them (with turn=B), and just selects the move with
the maximal score:

eval(state, turn) {
if(terminal(state)) then return value(state);
if(turn==B) then loop over s in successors of state,
return expectation(eval(s,C));  /* expectation according to known distribution */
if(turn==C) then loop over s in successors of state,
return max(eval(s,D));
if(turn==D) then loop over s in successors of state,
return min(eval(s,A));
loop over s in successors of state, /* turn is A */
return max(eval(s,B));
}

b) If the game now is partial information: A cannot communicate with C,
and C cannot see the moves made A, but
still optimal? In what cases, if any, can you run into problems?

Although A can see what C does, the fact that C does not see the moves by A means
that the game is imperfect information. So C will not know the best action to take
in order to optimize the score by the pair AC, and the standard algorithm will fail.

But C can ASSUME that A has taken the action that will lead to the best score, modifying
the algorithm accordingly, by mimicking A's computtion. This will again allow for perfect pay.
The remaining problem is if A and C encounter equally good moves, which again prevents
predicting the move to be taken. This can be solved by a "locker room agreement" (a convention
declared before the game) on how ties will be broken by A and C.

c) Show an example 4-ply game tree with branching factor 2, and show
the values computed in this tree case a.

AVG       MAX       MIN

A1 12.5- B1 10---- C1  0 ----- D1  100
\         \          ----- D2  0
\         --- C2 10 ----- D1  10
\                  ----- D2  20
- B2 15---- C1 -1 ----- D1  -1
\          ----- D2  0
--- C2 15 ----- D1  100
----- D2  15
A2 45.5-- B1 1---- C1  1 ----- D1  1
\        \          ----- D2  15
\        --- C2 -9 ----- D1  -9
\                 ----- D2  0
B2 90---- C1 10 ----- D1  100
\          ----- D2  10
--- C2 -3 ----- D1  10
----- D2  -3

A will prefer A2, with an expected value of 45.5 for A and C.

6)  (Game-tree search - alpha-beta pruning)
a) Give an example of a 4-ply game tree (branching factor 2)
where alpha-beta pruning saves NOTHING, and another where the savings
is maximal. How many nodes are pruned in each case?

ANSWER: first the example with the best pruning:

MIN       MAX       MIN

A1 90 -- B1 90  -- A1 90 ----- B1  100
\         \          ----- B2  90
\         --- A2 10- ---- B1  10
\                  ----- B2  40  (pruned, MAX will never let this be reached))
- B2 95+ -- A1 95+ ---- B1  100
\          ----- B2  95
--- A2    ----- B1  10 (pruned, no need to check A2 and sub-branches)
----- B2  20 (pruned)
A2 80- - B1 80- -- A1 80- ---- B1  80
\        \          ----- B2  70   (pruned)
\        --- A2 65- ---- B1  65
\                 ----- B2  0    (pruned)
B2   ---- A1    ----- B1  -80  (Pruned sub-tree, as A2 top level is inferior anyway)
\            ----- B2  -90  (pruned)
--- A2    ----- B1  -100 (pruned)
----- B2  -10  (pruned)

Re-ordering the branches from bottom to top decreases the savings considerably,
and may even result in no pruning at all (may need changing some of the numbers).

b) Suppose that we know ahead of time that the terminal values can only
be 0 or plus 1 or minus 1. Is there a case where alpha-beta can save even
more search than the best case above (show it, or prove that it cannot help).

In this case, if the first main sub-tree has a value of 1, there is no
point in looking anywhere else, so the rest of the tree is pruned.

MIN       MAX       MIN

A1 1 --- B1 1 ---- A1  1 ----- B1  1
\         \          ----- B2  1
\         --- A2   (pruned, A is assured to get 1 above)
\
- B2 1 ---- A1  1 ----- B1  1
\          ----- B2  1
--- A2  (pruned, A is assured to get 1 above)

A2  1--- B1   ---- A1    ----- B1  1     (Entire subtree here pruned, as A can get 1 with move A1)
\        \          ----- B2  1
\        --- A2    ----- B1  -1
\                 ----- B2  0
B2   ---- A1    ----- B1  0
\          ----- B2  0
--- A2    ----- B1  1
----- B2  -1

c) Provide an example of a 2-ply + 1 chance node level game tree where
alpha-beta saves as much search as the maximum possible in a deterministic game,
and a similar example where changing the distribution on the chance node
edges results in no savings for pruning.

ANSWER: can only prune if values are bounded.