So far
Artificial Intelligence: A Modern Approach
Stuart Russell and Peter Norvig
Prentice Hall, 2nd ed.
Chapter 1: AI taxonomy
Chapter 2: agents
Chapter 3: uninformed search
Chapter 4: informed search
From now on
Artificial Intelligence: A Modern Approach
Chapter 4.
Chapter 6: adversarial search
Network part
Learning (maybe from the same textbook)
Game AI techniques
Outline
Ch 4. informed search
Online search
Ch 6. adversarial search
Optimal decisions
α-β pruning
Imperfect, real-time decisions
Online DFS
function ONLINE-DFS-AGENT(s′) returns an action
  inputs: s′, a percept identifying the current state
  static: result, a table of the next state, indexed by action and state, initially empty
          unexplored, a stack that lists, for each visited state, the actions not yet tried
          unbacktracked, a stack that lists, for each visited state, the predecessor states
            to which the agent has not yet backtracked
          s, a, the previous state and action, initially null

  if GOAL-TEST(s′) then return stop
  if s′ is a new state then unexplored[s′] ← ACTIONS(s′)
  if s is not null then do
      result[a, s] ← s′
      add s to the front of unbacktracked[s′]
  if unexplored[s′] is empty then
      if unbacktracked[s′] is empty then return stop
      else a ← an action b such that result[b, s′] = POP(unbacktracked[s′])
  else a ← POP(unexplored[s′])
  s ← s′
  return a
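The pseudocode above can be sketched in Python. The `actions` callback (mapping a state to its legal actions) and the grid environment used below are illustrative assumptions, not part of the slides; `stop` is returned as a plain string.

```python
class OnlineDFSAgent:
    """A sketch of ONLINE-DFS-AGENT: an online agent that explores by
    depth-first search, remembering results and backtracking when stuck."""

    def __init__(self, goal, actions):
        self.goal = goal            # goal state for GOAL-TEST
        self.actions = actions      # state -> list of legal actions (assumed callback)
        self.result = {}            # result[(a, s)] = observed next state
        self.unexplored = {}        # state -> actions not yet tried (stack)
        self.unbacktracked = {}     # state -> predecessor states not yet backtracked to
        self.s = None               # previous state
        self.a = None               # previous action

    def __call__(self, s1):
        """Given the percept s1 (the current state), return the next action."""
        if s1 == self.goal:
            return 'stop'
        if s1 not in self.unexplored:                    # s1 is a new state
            self.unexplored[s1] = list(self.actions(s1))
        if self.s is not None:
            self.result[(self.a, self.s)] = s1
            self.unbacktracked.setdefault(s1, []).insert(0, self.s)
        if not self.unexplored[s1]:
            if not self.unbacktracked.get(s1):
                return 'stop'
            # backtrack: choose an action b whose known result is the most
            # recent predecessor (all actions of s1 have been tried by now)
            prev = self.unbacktracked[s1].pop(0)
            self.a = next(b for b in self.actions(s1)
                          if self.result.get((b, s1)) == prev)
        else:
            self.a = self.unexplored[s1].pop()
        self.s = s1
        return self.a
```

The agent only chooses actions; the environment executes them and feeds back the resulting state, which is what makes the search "online".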
s′ = (1,1)
GOAL-TEST((1,1))? s′ ≠ G, thus false
s′ is new → UX[(1,1)] ← {RIGHT, UP}
s is null? True (initially)
UX[(1,1)] empty? False
a ← POP(UX[(1,1)]) = UP
s ← (1,1)
return a
s′ = (1,2)
GOAL-TEST((1,2))? s′ ≠ G, thus false
s′ is new → UX[(1,2)] ← {DOWN}
s is null? False (s = (1,1))
result[UP, (1,1)] ← (1,2)
UB[(1,2)] = {(1,1)}
UX[(1,2)] empty? False
a ← DOWN, s ← (1,2)
return a
s′ = (1,1)
GOAL-TEST((1,1))? s′ ≠ G, thus false
s is null? False (s = (1,2))
result[DOWN, (1,2)] ← (1,1)
UB[(1,1)] = {(1,2)}
UX[(1,1)] empty? False
a ← RIGHT, s ← (1,1)
return a
s′ = (2,1)
GOAL-TEST((2,1))? s′ ≠ G, thus false
s′ is new → UX[(2,1)] ← ACTIONS((2,1))
s is null? False (s = (1,1))
result[RIGHT, (1,1)] ← (2,1)
UB[(2,1)] = {(1,1)}
UX[(2,1)] empty? False
a ← LEFT, s ← (2,1)
return a
s′ = (1,1)
GOAL-TEST((1,1))? s′ ≠ G, thus false
s is null? False (s = (2,1))
result[LEFT, (2,1)] ← (1,1)
UB[(1,1)] = {(2,1), (1,2)}
UX[(1,1)] empty? True
UB[(1,1)] empty? False
a ← RIGHT (the action b with result[b, (1,1)] = POP(UB[(1,1)]) = (2,1)), s ← (1,1)
return a
And so on
Online DFS
[Figure: the maze, with the current position of the agent marked]
Outline
Ch 4. informed search
Ch 6. adversarial search
Optimal decisions
α-β pruning
Imperfect, real-time decisions
* Environments with very many agents are best viewed as economies rather than games
Game formalization
Initial state
A successor function: returns a list of (move, state) pairs
A terminal test: identifies terminal states
Game tree
The state space
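The components of the formalization above can be captured as a minimal interface. The abstract class and the tiny "subtraction game" below are illustrative assumptions, not from the slides:

```python
class Game:
    """Abstract game: the components of the formalization above."""
    def initial_state(self): raise NotImplementedError
    def successors(self, state): raise NotImplementedError  # list of (move, state) pairs
    def terminal_test(self, state): raise NotImplementedError
    def utility(self, state): raise NotImplementedError     # payoff at a terminal state

class SubtractionGame(Game):
    """Hypothetical tiny game: players alternately remove 1 or 2 counters
    from a pile; whoever takes the last counter wins.
    State is (counters_left, player_to_move), players +1 (MAX) and -1 (MIN)."""
    def initial_state(self):
        return (4, +1)
    def successors(self, state):
        n, player = state
        return [(take, (n - take, -player)) for take in (1, 2) if take <= n]
    def terminal_test(self, state):
        return state[0] == 0
    def utility(self, state):
        # the player who just moved (the opposite of player_to_move) wins
        return -state[1]
```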
Minimax
Perfect play for deterministic games: optimal strategy
Idea: choose move to position with highest minimax value
= best achievable payoff against best play
E.g., 2-ply game: only two half-moves
Minimax algorithm
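A sketch of minimax as plain recursion. The `successors`, `terminal`, and `utility` callbacks follow the game formalization earlier (successors yields (move, state) pairs); their concrete forms in the test are illustrative assumptions.

```python
def minimax_value(state, successors, terminal, utility, maximizing):
    """Return the minimax value of `state` (MAX maximizes, MIN minimizes)."""
    if terminal(state):
        return utility(state)
    values = [minimax_value(s, successors, terminal, utility, not maximizing)
              for _, s in successors(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(state, successors, terminal, utility):
    """Choose the move leading to the successor with the highest minimax value,
    i.e. the best achievable payoff against best play."""
    return max(successors(state),
               key=lambda ms: minimax_value(ms[1], successors, terminal,
                                            utility, maximizing=False))[0]
```

On the classic 2-ply example tree (MIN nodes with leaves 3/12/8, 2/4/6, 14/5/2) this yields value 3 and best move a1.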
Revisit example
Alpha-Beta Example
Do DF-search until first leaf
Range of possible values: [−∞, +∞]

Node ranges as successive leaves are visited (interval labels from the example figure):
[−∞, +∞]
[−∞, +∞]
[−∞, +∞]
[−∞, 3]
[−∞, +∞]
[−∞, 3]
[3, 3]
[3, 3]
[−∞, 2]
[3, 14]
[3, 3]
[−∞, 2]
[−∞, 14]
[3, 5]
[3, 3]
[−∞, 2]
[−∞, 5]
[3, 3]
[3, 3]
[−∞, 2]
[2, 2]
[3, 3]
[3, 3]
[−∞, 2]
[2, 2]
Properties of α-β
Pruning does not affect the final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = O(b^(m/2))
→ doubles the solvable depth of search

Why is it called α-β?
α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX
If v is worse than α, MAX will avoid it
→ prune that branch
(β is defined similarly: the best choice found so far for MIN)
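The pruning rule above can be sketched in code: α and β are passed down the recursion, and a branch is cut as soon as its value falls outside [α, β]. The callback interface (successors yielding (move, state) pairs) is an assumption carried over from the game formalization; the test tree is illustrative.

```python
def alphabeta(state, successors, terminal, utility,
              alpha=float('-inf'), beta=float('inf'), maximizing=True):
    """Minimax with α-β pruning; returns the same value as plain minimax.
    alpha = best value found so far for MAX along the path,
    beta  = best value found so far for MIN along the path."""
    if terminal(state):
        return utility(state)
    if maximizing:
        v = float('-inf')
        for _, s in successors(state):
            v = max(v, alphabeta(s, successors, terminal, utility,
                                 alpha, beta, False))
            if v >= beta:          # MIN will avoid this node: prune the rest
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = float('inf')
        for _, s in successors(state):
            v = min(v, alphabeta(s, successors, terminal, utility,
                                 alpha, beta, True))
            if v <= alpha:         # MAX will avoid this node: prune the rest
                return v
            beta = min(beta, v)
        return v
```

On the 2-ply example tree, the second MIN node is abandoned after its first leaf (value 2 ≤ α = 3), exactly as in the worked example.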
Resource limits
In reality, imperfect and real-time decisions are
required
Suppose we have 100 secs, explore 10^4 nodes/sec
→ 10^6 nodes per move
Standard approach:
cutoff test:
e.g., depth limit
evaluation function
= estimated desirability of position
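The standard approach can be sketched as a depth-limited minimax: the terminal test is replaced by a cutoff test and the utility by an evaluation function. The callback names and the fixed estimates in the test are illustrative assumptions.

```python
def h_minimax(state, depth, successors, cutoff, evaluate, maximizing=True):
    """Minimax with the terminal test replaced by cutoff(state, depth)
    (e.g., a depth limit) and utility replaced by evaluate(state),
    an estimate of the desirability of the position."""
    if cutoff(state, depth):
        return evaluate(state)
    values = [h_minimax(s, depth + 1, successors, cutoff, evaluate,
                        not maximizing)
              for _, s in successors(state)]
    return max(values) if maximizing else min(values)
```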
Evaluation functions
For chess, typically linear weighted sum of features
Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)
e.g., w1 = 9 for queen, w2 = 5 for rook, wn = 1 for pawn
f1(s) = (number of white queens) − (number of black queens), etc.
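A minimal sketch of this linear weighted sum for chess material. The queen/rook/pawn weights come from the slide; the knight and bishop weights of 3 are the usual convention and an assumption here, as is the piece-count representation of the position.

```python
# Weights w_i per piece type (Q/R from the slide; B/N = 3 is the common convention)
WEIGHTS = {'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}

def material_eval(white_counts, black_counts):
    """Eval(s) = sum_i w_i * f_i(s), where f_i(s) is
    (# white pieces of type i) - (# black pieces of type i)."""
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in WEIGHTS.items())
```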
EXPECTIMINIMAX(n) =
  UTILITY(n)                                       if n is a terminal node
  max_{s ∈ Successors(n)} EXPECTIMINIMAX(s)        if n is a MAX node
  min_{s ∈ Successors(n)} EXPECTIMINIMAX(s)        if n is a MIN node
  Σ_{s ∈ Successors(n)} P(s) · EXPECTIMINIMAX(s)   if n is a chance node
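The recursive definition above translates directly to code. The dict-based node representation below is an illustrative assumption:

```python
def expectiminimax(node):
    """node is a hypothetical dict with a 'type' key:
    'terminal' nodes carry 'value'; 'max'/'min' nodes carry 'children'
    (a list of nodes); 'chance' nodes carry (probability, child) pairs."""
    kind = node['type']
    if kind == 'terminal':
        return node['value']
    if kind == 'max':
        return max(expectiminimax(c) for c in node['children'])
    if kind == 'min':
        return min(expectiminimax(c) for c in node['children'])
    # chance node: probability-weighted average over successors
    return sum(p * expectiminimax(c) for p, c in node['children'])
```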
In the left tree, move A1 is best; in the right tree, A2 is best.
Outcome of evaluation function (hence the agent behavior) may change
when values are scaled differently.
Behavior is preserved only by a positive linear transformation of EVAL.
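A quick numeric check of this claim, with illustrative leaf values: at a chance node, a positive linear transformation a·x + b (a > 0) of the leaf values leaves the best move unchanged, while a monotone but nonlinear transformation (here, squaring nonnegative values) can flip it.

```python
def expected_value(leaves):
    """Value of a chance node: probability-weighted average of its leaves."""
    return sum(p * v for p, v in leaves)

def best_move(a1_leaves, a2_leaves):
    """Pick the move whose chance node has the higher expected value."""
    return 'A1' if expected_value(a1_leaves) >= expected_value(a2_leaves) else 'A2'

# Hypothetical leaf values: A1 averages 2.5, A2 averages 2.45, so A1 wins.
a1 = [(0.5, 2.0), (0.5, 3.0)]
a2 = [(0.5, 1.0), (0.5, 3.9)]
```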