Agents
as percepts.)
A rational agent is one that does the right thing, that is, every entry in the table
for the agent function is filled out correctly. A performance measure embodies
the criterion for success of an agent's behaviour. Autonomy is the extent to
which an agent can act without relying on prior knowledge from its designer.
For example, we might not want to favour a car-cleaner that's extremely fast in
the first hour and then sits around reading, over one that works consistently.
Environments
function MODEL-BASED-REFLEX-AGENT(percept) returns next_action
static state;
static rules;
state = update(state, percept);
rule = match(state, rules);
next_action = find_action(rule);
state = update(state, next_action);
return next_action;
Simple reflex agents are the simplest kind of agent. These agents act on the
current percept only, as opposed to the rest of the percept history. Eg: if
car-in-front-is-braking then initiate-braking. These kinds of agents only work
if the environment is fully observable.
Model-based reflex agents maintain internal state to track aspects of the world
that are not evident on the current percept.
Utility-based agents -they try to maximize their expected utility (expected
happiness).
Learning agents -as Turing suggested, this is the easier way of creating
complex AI. These agents have a learning element, a performance
element (action performing), and a problem generator.
Solving Problems by Searching
There are methods that an agent can use to select actions in environments that
are deterministic, observable, static and completely known. In such cases the
agent can construct sequences of actions that achieve its goals; this process is
called search.
● Before an agent can start searching for solutions, it must formulate a goal
and then use the goal to formulate a problem.
● A problem consists of four parts: the initial state, a set of actions, a goal
test function, and a path cost function. The environment of the problem is
represented by a state space. A path through the state space from the initial
state to a goal state is a solution.
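The four parts listed above can be sketched as a small container; the number-line toy problem below is an illustrative assumption, not from the notes.

```python
# A hedged sketch of the four-part problem definition: initial state,
# actions, goal test, and path (step) cost.
class Problem:
    def __init__(self, initial, actions, goal_test, step_cost):
        self.initial = initial        # initial state
        self.actions = actions        # state -> list of (action, next_state)
        self.goal_test = goal_test    # state -> bool
        self.step_cost = step_cost    # (state, action, next_state) -> cost

# Example: walk from 0 to 3 on the integers; a path 0,1,2,3 is a solution.
line = Problem(
    initial=0,
    actions=lambda s: [("+1", s + 1), ("-1", s - 1)],
    goal_test=lambda s: s == 3,
    step_cost=lambda s, a, s2: 1,
)
print(line.goal_test(3))  # True
```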
● When the environment is partially observable, the agent can apply search
algorithms in the space of belief states, or sets of possible states that the agent
might be in. In some cases, a single solution sequence can be constructed; in
most other cases, the agent needs a contingency plan, to handle unknown
circumstances that may arise.
Uniform-cost search differs in that it always expands the node with the lowest
path cost first.
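A minimal uniform-cost search sketch follows; the graph format (a dict of neighbour-to-cost maps) and the example graph are illustrative assumptions.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Always expand the frontier node with the lowest path cost g(n)."""
    frontier = [(0, start, [start])]        # (path cost, node, path)
    best = {start: 0}
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nbr, cost in graph.get(node, {}).items():
            g2 = g + cost
            if g2 < best.get(nbr, float("inf")):  # cheaper route to nbr
                best[nbr] = g2
                heapq.heappush(frontier, (g2, nbr, path + [nbr]))
    return None

graph = {"A": {"B": 1, "C": 5}, "B": {"C": 1}, "C": {}}
print(uniform_cost_search(graph, "A", "C"))  # (2, ['A', 'B', 'C'])
```

Note that the direct edge A→C (cost 5) is correctly passed over in favour of the cheaper path through B.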
A heuristic function, usually denoted h(n), is one that estimates the cost of the
best path from any node n to a goal. If n is a goal then h(n) = 0.
A* search uses the path cost g(n) and also the heuristic function h(n) by forming
f(n) = g(n) + h(n)
where g(n) is the cost of the path from the start node to n,
and h(n) is the estimated cost of the cheapest path from n to a goal.
Definition: an admissible heuristic h(n) is one that never overestimates the cost of
the best path from n to a goal.
Iterative deepening search uses depth-first search with a limit on depth that is
gradually increased.
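This can be sketched as a depth-limited DFS run with increasing limits; the successor function, the tiny example tree, and the depth cap are illustrative assumptions.

```python
def depth_limited(node, goal, successors, limit):
    """Depth-first search that gives up below the depth limit."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        result = depth_limited(child, goal, successors, limit - 1)
        if result is not None:
            return [node] + result
    return None

def iterative_deepening(start, goal, successors, max_depth=50):
    """Retry depth-limited search with limits 0, 1, 2, ..."""
    for limit in range(max_depth + 1):
        result = depth_limited(start, goal, successors, limit)
        if result is not None:
            return result
    return None

tree = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(iterative_deepening("A", "D", lambda n: tree.get(n, [])))  # ['A', 'B', 'D']
```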
● RBFS and SMA* are robust, optimal search algorithms that use limited
amounts of memory; given enough time, they can solve problems that A* cannot
solve because it runs out of memory.
● Exploration problems arise when the agent has no idea about the states
and actions of its environment. For safely explorable environments, online
search agents can build a map and find a goal if one exists. Updating heuristic
estimates from experience provides an effective method to escape from local
minima.
Games
● A game may be defined by the initial state (how the board is set up), the
legal actions in each state, a terminal test (which says when the game is over),
and a utility function that applies to terminal states.
● Usually, it is not feasible to consider the whole game tree (even with alpha-
beta), so we need to cut the search off at some point and apply an evaluation
function that gives an estimate of the utility of a state.
● Programs can match or beat the best human players in checkers, Othello,
and backgammon, and are close behind in bridge. A program has beaten the
world chess champion in one exhibition match. Programs remain at the amateur
level in Go.
Lecture notes:
CSPs standardise the manner in which states and goal tests are represented.
The form of the goal test can tell us about the structure of the problem.
Initial state:
In addition:
Planning
● Decision trees can represent all Boolean functions. The information gain
heuristic provides an efficient method for finding a simple, consistent decision
tree.
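The information gain heuristic mentioned above can be sketched numerically; the example counts (a 6+/6- parent split into a pure branch and a mixed branch) are an illustrative assumption.

```python
import math

def entropy(pos, neg):
    """Entropy (in bits) of a Boolean sample with pos/neg example counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(parent, splits):
    """parent and each split are (pos, neg) counts; gain = H(parent) - remainder."""
    total = sum(p + n for p, n in splits)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(*parent) - remainder

# Splitting 6+/6- examples into a pure (4, 0) branch and a mixed (2, 6) branch.
print(round(information_gain((6, 6), [(4, 0), (2, 6)]), 3))  # 0.459
```

The attribute with the highest gain is chosen as the next split when growing the tree.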
Knowledge in Learning
● ILP can be done with a top-down approach of refining a very general rule, or
through a bottom-up approach of inverting the deductive process.
● ILP methods generate new predicates with which concise new theories can
be expressed and show promise as general-purpose scientific theory formation
systems.
Breadth-first search is complete and optimal, but has exponential cost both in
terms of space and time. Depth-first search is neither complete nor optimal; it
has exponential time complexity yet linear space complexity. Iterative
deepening search is complete, optimal for unit step costs, and has exponential
time complexity and linear space complexity.
The term backtracking search is used for a depth-first search that chooses
values for one variable at a time and backtracks when a variable has no legal
values left to assign.
In pseudo-code:
function BACKTRACK(assignment, csp) returns a solution or failure
    if assignment is complete then return assignment
    var = select an unassigned variable
    for each value in the domain of var:
        if value is consistent with assignment:
            add {var = value} to assignment
            result = BACKTRACK(assignment, csp)
            if result is not failure then return result
            remove {var = value} from assignment
    return failure
The degree heuristic attempts to reduce the branching factor on future choices
by selecting the variable that is involved in the largest number of constraints on
other unassigned variables.
Once a variable has been selected, the algorithm must decide on the order in
which to examine its values. The least-constraining-value heuristic can be
effective in some cases; it prefers the value that rules out the fewest choices for
the neighbouring variables in the constraint graph. In general, the heuristic is
trying to leave the maximum flexibility for subsequent variable assignments.
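The backtracking scheme above can be sketched on a small map-colouring CSP; the three-region example and the plain (unheuristic) variable and value ordering are illustrative assumptions.

```python
def backtrack(assignment, variables, domains, neighbours):
    """Assign one variable at a time; backtrack when no legal value is left."""
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Consistent if no already-assigned neighbour shares the value.
        if all(assignment.get(n) != value for n in neighbours[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbours)
            if result is not None:
                return result
            del assignment[var]          # undo and try the next value
    return None                          # no legal value left: failure

variables = ["WA", "NT", "SA"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbours = {"WA": ["NT", "SA"], "NT": ["WA", "SA"], "SA": ["WA", "NT"]}
print(backtrack({}, variables, domains, neighbours))
# {'WA': 'red', 'NT': 'green', 'SA': 'blue'}
```

Adding the degree and least-constraining-value heuristics would only change how `var` and the value order are chosen, not the backtracking skeleton.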
h(n) is monotonic if f(n) = g(n)+h(n) never decreases along a path from the root.
"A heuristic h(n) is monotonic if, for every node n and every successor n' of n
generated by any action a, the estimated cost of reaching the goal from n is no
greater than the step cost of getting to n' plus the estimated cost of reaching the
goal from n':
h(n) ≤ c(n, a, n') + h(n')."
Greedy best first search tries to expand the node that is closest to the goal (ie
f(n)=h(n) ).
A* search however evaluates nodes by combining g(n), the cost to reach the
node, and h(n), the estimated cost of the cheapest path from n to the goal:
f(n) = g(n) + h(n).
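A minimal A* sketch follows; the graph format and the heuristic values (chosen to be admissible for this toy graph) are illustrative assumptions.

```python
import heapq

def a_star(graph, h, start, goal):
    """Expand nodes in order of f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nbr, cost in graph.get(node, {}).items():
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h[nbr], g2, nbr, path + [nbr]))
    return None

graph = {"S": {"A": 1, "B": 4}, "A": {"G": 3}, "B": {"G": 1}, "G": {}}
h = {"S": 3, "A": 2, "B": 1, "G": 0}   # admissible: never overestimates
print(a_star(graph, h, "S", "G"))      # (4, ['S', 'A', 'G'])
```

With h(n) = 0 everywhere this degenerates to uniform-cost search; with g(n) dropped it becomes greedy best-first search.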
Proof A* is optimal: suppose a suboptimal goal Goal2 appears on the frontier,
with f2 = g(Goal2) > C*, the cost of the optimal solution, and let n be an
unexpanded node on an optimal path to the optimal goal.
Now say Goal2 is chosen for expansion before n. This means that
f(n) ≥ f2 > C*.
But since h is admissible, f(n) = g(n) + h(n) ≤ C*, a contradiction. So n must be
expanded before Goal2, and A* never returns a suboptimal goal.
A* is also complete, provided that:
● There is a finite, positive constant c such that each operator has cost at
least c.
● Every node has only finitely many successors.
The search expands nodes according to increasing f(n). So the only way it can
fail to find a goal is if there are infinitely many nodes with f(n) < f(Goal). Given
the two conditions above this cannot happen, since it would require either a
node with infinitely many successors, or a path with an infinite number of nodes
but a finite path cost.
Given a game tree, the optimal strategy can be determined by examining the
minimax value of each node, MINIMAX-VALUE(n). The minimax value of a
node is the utility (for MAX, ie current player) of being in the corresponding state,
assuming that both players play optimally from there to the end of the game. The
minimax decision is the optimal choice for MAX because it leads to the
successor with the highest minimax value. The minimax algorithm computes
the minimax decision from the current state. It uses a simple recursive
computation of the minimax values of each successor state, directly
implementing the defining equations. The recursion proceeds all the way down to
the leaves of the tree, and then the minimax values are backed up through the
tree as the recursion unwinds.
If the maximum depth of the tree is m, and there are b legal moves at each point,
then the time complexity of the minimax algorithm is O(b^m). The space complexity
is O(bm) for an algorithm that generates all successors at once, or O(m) for an
algorithm that generates successors one at a time.
For real games the time cost is impractical, so a number of time saving
techniques are implemented. Whilst we can't eliminate the exponent from the
computational complexity, we can effectively cut it in half with alpha-beta pruning.
Consider a node n somewhere in the tree, such that Player has a choice of
moving to that node. If Player has a better choice m either at the parent node of n
or at any choice point further up, then n will never be reached in actual play. So
once we have found out enough about n (by examining its descendants) to reach
this conclusion, we can prune it.
Alpha beta-pruning gets its name from the following two parameters that describe
bounds on the backed-up values that appear anywhere along the path:
● α = the value of the best (ie highest value) choice we have found so far at
any choice point along the path for MAX.
● β = the value of the best (ie lowest value) choice we have found so far at
any choice point along the path for MIN.
Alpha-beta search updates the values of α and β as it goes along and prunes
the remaining branches at a node as soon as the value of the current node is
known to be worse than the current α or β value for MAX or MIN, respectively.
Complexity theory:
NP:
It may be hard to find a solution, but a candidate solution is relatively small, and
once you have guessed one it is easy to verify whether it is actually a solution.
Formally, a language L is in NP if there is a polynomial-time "verification
algorithm" V such that x ∈ L if and only if there exists a certificate y with
V(x, y) = 1.
We have L_ISO, L_k-COL, L_HAM ∈ NP.
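For L_k-COL this verification is easy to make concrete: given a graph and a candidate colouring (the certificate), checking validity takes polynomial time even though finding a colouring may be hard. The edge-list format and the triangle example are illustrative assumptions.

```python
def verify_k_colouring(edges, colouring, k):
    """Polynomial-time verifier: is `colouring` a proper k-colouring?"""
    if len(set(colouring.values())) > k:       # too many colours used
        return False
    # Proper: no edge joins two vertices of the same colour.
    return all(colouring[u] != colouring[v] for u, v in edges)

edges = [("a", "b"), ("b", "c"), ("a", "c")]   # a triangle
print(verify_k_colouring(edges, {"a": 1, "b": 2, "c": 3}, 3))  # True
print(verify_k_colouring(edges, {"a": 1, "b": 1, "c": 2}, 3))  # False
```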