
● Artificial Intelligence

Requirements

● Knowledge representation to store what it knows or hears

● Automated reasoning to use the stored information to answer questions and to draw new conclusions

● Machine learning to adapt to new circumstances and to detect and extrapolate patterns

Agents

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. We use the term percept to refer to the agent's perceptual inputs at any given instant. An agent's percept sequence is the complete history of everything the agent has ever perceived. The mathematical agent function maps any given percept sequence to an action, and is implemented by an agent program.

Example: email spam filter.

Percepts: the textual content of individual email messages. (A more sophisticated program might also take images or other attachments as percepts.)

Actions: send to the inbox, delete, or ask for advice.

Goals: remove spam while allowing valid email to be read.

Environment: an email program.

A rational agent is one that does the right thing; that is, every entry in the table for the agent function is filled out correctly. A performance measure embodies the criterion for success of an agent's behaviour. Autonomy is the extent to which an agent can act without relying on prior knowledge from its designer.

Definition of a rational agent:


"For each possible percept sequence, a rational agent should select an action
that is expected to maximize its performance measure, given the evidence
provided by the percept sequence and whatever built-in knowledge the agent
has"

● In general, we will be interested in success over the long term.

For example, we might not want to favour a car-cleaner that is extremely fast in the first hour and then sits around reading, over one that works consistently.

● We are generally interested in expected performance, because usually agents are not omniscient: they don't infallibly know the outcomes of their actions.

As a general rule, it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave.

In designing an agent, the first step is to specify the task environment, which is defined by PEAS (Performance measure, Environment, Actuators, Sensors).

Environments

● An environment is fully observable if the agent's sensors give it access to the complete state of the environment at each point in time.

● If the next state of the environment is completely determined by the current state and the action executed by the agent, then we say the environment is deterministic.

● In an episodic task environment, the agent's experience is divided into atomic episodes.

● An environment is static if it does not change while the agent is deliberating.

● Discrete environment (time, actions), eg chess, as opposed to taxi driving.


● Single-agent environment, eg a crossword, as opposed to chess (a multi-agent environment).

Keeping track of the environment:

action agent(percept)

  static state;

  static rules;

  state = update(state, percept);

  rule = match(state, rules);

  next_action = find_action(rule);

  state = update(state, next_action);

  return next_action;
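
As a concrete illustration, here is a minimal Python sketch of the same loop; the update, match and find_action helpers, and the agent's internal state, are assumed to be supplied by the designer (the names are illustrative, not from the course text):

class ModelBasedReflexAgent:
    # Minimal sketch of an agent that keeps internal state (a model of the world).
    def __init__(self, initial_state, rules, update, match, find_action):
        self.state = initial_state        # internal model of the world
        self.rules = rules                # condition-action rules
        self.update = update              # revises the model given a percept or an action
        self.match = match                # picks the rule whose condition matches the state
        self.find_action = find_action    # extracts the action part of a rule

    def __call__(self, percept):
        self.state = self.update(self.state, percept)      # fold the new percept into the model
        rule = self.match(self.state, self.rules)
        next_action = self.find_action(rule)
        self.state = self.update(self.state, next_action)  # record the chosen action's expected effect
        return next_action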

Simple reflex agents are the simplest kind of agent. These agents act on the current percept only, as opposed to the rest of the percept history. Eg: if car-in-front-is-braking then initiate-braking. This kind of agent only works well if the environment is fully observable.

Model-based reflex agents maintain internal state to track aspects of the world that are not evident in the current percept.

Goal-based agents act to achieve their goals.

Utility-based agents try to maximize their expected utility ("happiness").

Learning agents - as Turing suggested, this is the easier way of creating complex AI. These agents have a learning element, a performance element (which selects actions), and a problem generator.

Solving Problems by Searching

There are methods that an agent can use to select actions in environments that
are deterministic, observable, static and completely known. In such cases the
agent can construct sequences of actions that achieve its goals; this process is
called search.

● Before an agent can start searching for solutions, it must formulate a goal
and then use the goal to formulate a problem.

● A problem consists of four parts: the initial state, a set of actions, a goal
test function, and a path cost function. The environment of the problem is
represented by a state space. A path through the state space from the initial
state to a goal state is a solution.

● A single, general TREE-SEARCH algorithm can be used to solve any problem; specific variants of the algorithm embody different strategies.

● Search algorithms are judged on the basis of completeness, optimality, time complexity, and space complexity. Complexity depends on b, the branching factor in the state space, and d, the depth of the shallowest solution.

● Breadth-first search selects the shallowest unexpanded node in the search tree for expansion. It is complete, optimal for unit step costs, and has time and space complexity of O(b^d). The space complexity makes it impractical in most cases. Uniform-cost search is similar to breadth-first search but expands the node with lowest path cost, g(n). It is complete and optimal if the cost of each step exceeds some positive bound ε.

● Depth-first search selects the deepest unexpanded node in the search tree for expansion. It is neither complete nor optimal and has time complexity of O(b^m) and space complexity of O(bm), where m is the maximum depth of any path in the state space.

● Depth-limited search imposes a fixed depth limit on a depth-first search.

● Iterative deepening search calls depth-limited search with increasing limits until a goal is found. It is complete, optimal for unit step costs, and has time complexity of O(b^d) and space complexity of O(bd). (A sketch of this search appears after this list.)

● Bidirectional search can enormously reduce time complexity, but it is not always applicable and may require too much space.

● When the state space is a graph rather than a tree, it can pay off to check for repeated states in the search tree. The GRAPH-SEARCH algorithm eliminates all duplicate states.

● When the environment is partially observable, the agent can apply search
algorithms in the space of belief states, or sets of possible states that the agent
might be in. In some cases, a single solution sequence can be constructed; in
most other cases, the agent needs a contingency plan, to handle unknown
circumstances that may arise.
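
The iterative deepening bullet above can be made concrete with a short sketch. This assumes a problem object exposing initial_state, is_goal(state) and successors(state); these names are illustrative, not from the course text, and the sketch returns a path rather than distinguishing a depth cutoff from genuine failure:

def depth_limited_search(problem, state, limit):
    # Depth-first search that gives up below the depth limit; returns a path or None.
    if problem.is_goal(state):
        return [state]
    if limit == 0:
        return None
    for child in problem.successors(state):
        result = depth_limited_search(problem, child, limit - 1)
        if result is not None:
            return [state] + result
    return None

def iterative_deepening_search(problem, max_depth=50):
    # Run depth-limited search with limits 0, 1, 2, ... until a goal is found.
    for limit in range(max_depth + 1):
        result = depth_limited_search(problem, problem.initial_state, limit)
        if result is not None:
            return result
    return None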

Notes from lectures:

The total cost = path cost + search cost.

● Uninformed or blind search is applicable when we can only distinguish goal states from non-goal states. Methods are distinguished by the order in which nodes in the search tree are expanded. These methods include: breadth-first, uniform cost, depth-first, depth-limited, iterative deepening, bidirectional.

● Informed or heuristic search is applied if we have some knowledge of the path cost or the number of steps between the current state and a goal. These methods include: best-first, greedy, A*, iterative deepening A* (IDA*), SMA*.

Uniform-cost search differs in that it always expands the node with the lowest path cost first.

A heuristic function, usually denoted h(n), is one that estimates the cost of the best path from any node to a goal. If n is a goal then h(n) = 0.

A* search combines the good points of:

greedy search - by making use of h(n)

uniform-cost search - by being optimal and complete.

It uses path cost g(n) and also the heuristic function h(n) by forming
f(n) = g(n) + h(n)

where

g(n) = cost of path to n

and

h(n) = estimated cost of the best path from n to a goal

So: f(n) is the estimated cost of a path through n.

Definition: an admissible heuristic h(n) is one that never overestimates the cost of the best path from n to a goal.

Iterative deepening search uses depth-first search with a limit on depth that is gradually increased.

IDA* does the same thing with a limit on f-cost.

It is complete and optimal under the same conditions as A*.

It only requires space proportional to the longest path.

The time taken depends on the number of values h can take.
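
A minimal A* sketch along these lines, assuming a problem object with hashable states, initial_state, is_goal(state) and successors(state) yielding (child, step_cost) pairs, plus a heuristic function h(state) (all of these names are illustrative):

import heapq
from itertools import count

def a_star_search(problem, h):
    # Frontier entries are (f, tiebreak, g, state, path) with f = g + h(state).
    start = problem.initial_state
    tiebreak = count()                                # avoids comparing states on ties
    frontier = [(h(start), next(tiebreak), 0, start, [start])]
    best_g = {start: 0}                               # cheapest known path cost per state
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if problem.is_goal(state):
            return path, g
        if g > best_g.get(state, float("inf")):
            continue                                  # stale frontier entry
        for child, step_cost in problem.successors(state):
            g2 = g + step_cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h(child), next(tiebreak), g2, child, path + [child]))
    return None

With an admissible (and, for this graph version, consistent) h, the first goal popped from the frontier is optimal.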

Informed Search and Exploration

This section examines the application of heuristics to reduce search costs. Optimality comes at a stiff price in terms of search cost, even with good heuristics.

● Best-first search is just GRAPH-SEARCH where the minimum-cost unexpanded nodes (according to some measure) are selected for expansion. Best-first algorithms typically use a heuristic function h(n) that estimates the cost of a solution from n.

● Greedy best-first search expands nodes with minimal h(n). It is not optimal, but is often efficient.

● A* search expands nodes with minimal f(n) = g(n) + h(n). A* is complete and optimal, provided that we guarantee that h(n) is admissible (for TREE-SEARCH) or consistent (for GRAPH-SEARCH). The space complexity of A* is still prohibitive.

● The performance of heuristic search algorithms depends on the quality of the heuristic function. Good heuristics can sometimes be constructed by relaxing the problem definition, by precomputing solution costs for subproblems in a pattern database, or by learning from experience with the problem class.

● RBFS and SMA* are robust, optimal search algorithms that use limited
amounts of memory; given enough time, they can solve problems that A* cannot
solve because it runs out of memory.

● Local search methods such as hill climbing operate on complete-state formulations, keeping only a small number of nodes in memory. Several stochastic algorithms have been developed, including simulated annealing, which returns optimal solutions when given an appropriate cooling schedule. Many local search methods can also be used to solve problems in continuous spaces. (A hill-climbing sketch appears after this list.)

● A genetic algorithm is a stochastic hill-climbing search in which a large population of states is maintained. New states are generated by mutation and by crossover, which combines pairs of states from the population.

● Exploration problems arise when the agent has no idea about the states
and actions of its environment. For safely explorable environments, online
search agents can build a map and find a goal if one exists. Updating heuristic
estimates from experience provides an effective method to escape from local
minima.
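
A minimal sketch of steepest-ascent hill climbing, assuming hypothetical neighbours(state) and value(state) functions supplied by the problem (names are illustrative):

def hill_climbing(initial_state, neighbours, value):
    # Repeatedly move to the best neighbour; stop at a (possibly local) maximum.
    current = initial_state
    while True:
        candidates = neighbours(current)
        if not candidates:
            return current
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current                  # no neighbour improves on the current state
        current = best

Simulated annealing differs mainly in sometimes accepting a worse neighbour, with a probability that decreases according to the cooling schedule.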

Constraint Satisfaction Problems

● Constraint satisfaction problems consist of variables with constraints on them. Many important real-world problems can be described as CSPs. The structure of a CSP can be represented by its constraint graph.

● Backtracking search, a form of depth-first search, is commonly used for solving CSPs.

● The minimum remaining values and degree heuristics are domain-independent methods for deciding which variable to choose next in a backtracking search. The least-constraining-value heuristic helps in ordering the variable values.

● By propagating the consequences of the partial assignments that it constructs, the backtracking algorithm can greatly reduce the branching factor of the problem. Forward checking is the simplest method for doing this. Arc consistency enforcement is a more powerful technique, but can be more expensive to run.

● Backtracking occurs when no legal assignment can be found for a variable. Conflict-directed backjumping backtracks directly to the source of the problem.

● Local search using the min-conflicts heuristic has been applied to constraint satisfaction problems with great success.

● The complexity of solving a CSP is strongly related to the structure of its constraint graph. Tree-structured problems can be solved in linear time. Cutset conditioning can reduce a general CSP to a tree-structured one and is very efficient if a small cutset can be found. Tree decomposition techniques transform the CSP into a tree of subproblems and are efficient if the tree width of the constraint graph is small.

Games

● A game may be defined by the initial state (how the board is set up), the
legal actions in each state, a terminal test (which says when the game is over),
and a utility function that applies to terminal states.

● In two-player zero-sum games with perfect information, the minimax algorithm can select optimal moves using a depth-first enumeration of the game tree.

● The alpha-beta search algorithm computes the same optimal move as minimax, but achieves much greater efficiency by eliminating subtrees that are provably irrelevant.

● Usually, it is not feasible to consider the whole game tree (even with alpha-
beta), so we need to cut the search off at some point and apply an evaluation
function that gives an estimate of the utility of a state.

● Games of chance can be handled by an extension to the minimax algorithm that evaluates a chance node by taking the average utility of all its child nodes, weighted by the probability of each child.

● Optimal play in games of imperfect information, such as bridge, requires reasoning about the current and future belief states of each player. A simple approximation can be obtained by averaging the value of an action over each possible configuration of missing information.

● Programs can match or beat the best human players in checkers, Othello, and backgammon, and are close behind in bridge. A program has beaten the world chess champion in one exhibition match. Programs remain at the amateur level in Go.

Lecture notes:

CSPs standardise the manner in which states and goal tests are represented.

As a result we can devise general purpose algorithms and heuristics.

The form of the goal test can tell us about the structure of the problem.

Consequently it is possible to introduce techniques for decomposing problems.

We can also try to understand the relationship between the structure of a problem and the difficulty of solving it.

Clearly a CSP can be formulated as a search problem in the familiar sense:

Initial state: no variables are assigned.

Successor function: assigns value(s) to currently unassigned variable(s) provided constraints are not violated.

Goal: reached if all variables are assigned.

Path cost: constant cost per step.

In addition:

The tree is limited in depth (to the number of variables), so depth-first search is usable.

It is fairly easy to see that a CSP can be given an incremental formulation as a standard search problem as follows:
● Initial state: the empty assignment {}, in which all variables are unassigned.

● Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with previously assigned variables.

● Goal test: the current assignment is complete.

● Path cost: a constant cost (eg 1) for every step.

Every solution must be a complete assignment and therefore appears at depth n if there are n variables. Furthermore, the search tree extends only to depth n. For these reasons, depth-first search algorithms are popular for CSPs.
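
As an illustration of this formulation, a toy map-colouring CSP can be written down directly in Python (the instance and names below are invented for illustration):

# A toy map-colouring CSP: adjacent regions must get different colours.
variables = ["WA", "NT", "SA", "Q"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbours = {"WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
              "SA": ["WA", "NT", "Q"], "Q": ["NT", "SA"]}

def consistent(var, value, assignment):
    # The only constraints are binary: no two neighbouring regions share a colour.
    return all(assignment.get(n) != value for n in neighbours[var])

# Initial state: the empty assignment {}.
# Successor function: extend the assignment with var = value when consistent(var, value, assignment).
# Goal test: len(assignment) == len(variables).
# Path cost: 1 per assignment step.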

Planning

● Planning systems are problem-solving algorithms that operate on explicit propositional (or first-order) representations of states and actions. These representations make possible the derivation of effective heuristics and the development of powerful and flexible algorithms for solving problems.

● The STRIPS language describes actions in terms of their preconditions and effects and describes the initial and goal states as conjunctions of positive literals. The ADL language relaxes some of these constraints, allowing disjunction, negation and quantifiers. (A sketch of a STRIPS-style action appears after this list.)

● State-space search can operate in the forward direction (progression) or backward direction (regression). Effective heuristics can be derived by making a subgoal independence assumption and by various relaxations of the planning problem.

● Partial-order planning (POP) algorithms explore the space of plans without committing to a totally ordered sequence of actions. They work back from the goal, adding actions to the plan to achieve each subgoal. They are particularly effective on problems amenable to a divide-and-conquer approach.

● A planning graph can be constructed incrementally, starting from the initial state. Each layer contains a superset of all the literals or actions that could occur at that time step and encodes mutual exclusion, or mutex, relations among literals or actions that cannot co-occur. Planning graphs yield useful heuristics for state-space and partial-order planners and can be used directly in the GRAPHPLAN algorithm.

● The SATPLAN algorithm translates a planning problem into propositional axioms and applies a satisfiability algorithm to find a model that corresponds to a valid plan. Several different propositional representations have been developed, with varying degrees of compactness and efficiency.

● Each of the major approaches to planning has its adherents, and there is as yet no consensus on which is best. Competition and cross-fertilization among the approaches have resulted in significant gains in efficiency for planning systems.
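
To make the STRIPS bullet concrete, here is a minimal sketch of one way a STRIPS-style action could be represented, with states as sets of ground literals; the class and the blocks-world action are illustrative, not the course's notation:

from dataclasses import dataclass

@dataclass(frozen=True)
class StripsAction:
    # Applicable when all preconditions hold in the state; applying it deletes
    # the delete-list literals and adds the add-list literals.
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.delete_list) | self.add_list

# Illustrative blocks-world action: move block A from the table onto block B.
move_a_onto_b = StripsAction(
    name="Move(A, Table, B)",
    preconditions=frozenset({"Clear(A)", "Clear(B)", "On(A, Table)"}),
    add_list=frozenset({"On(A, B)"}),
    delete_list=frozenset({"Clear(B)", "On(A, Table)"}),
)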

Learning from Observation

● Learning takes many forms, depending on the nature of the performance element, the component to be improved, and the available feedback.

● If the available feedback, either from a teacher or from the environment, provides the correct value for the examples, the learning problem is called supervised learning. The task, also called inductive learning, is then to learn a function from examples of its inputs and outputs. Learning a discrete-valued function is called classification; learning a continuous function is called regression.

● Inductive learning involves finding a consistent hypothesis that agrees with the examples. Ockham's razor suggests choosing the simplest consistent hypothesis. The difficulty of this task depends on the chosen representation.

● Decision trees can represent all Boolean functions. The information gain heuristic provides an efficient method for finding a simple, consistent decision tree. (A sketch of the information gain calculation appears after this list.)

● The performance of a learning algorithm is measured by the learning curve, which shows the prediction accuracy on the test set as a function of the training set size.

● Ensemble methods such as boosting often perform better than individual methods.

● Computational learning theory analyses the sample complexity and computational complexity of inductive learning. There is a trade-off between the expressiveness of the hypothesis language and the ease of learning.
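
A minimal sketch of the entropy and information gain calculation behind the decision-tree heuristic; examples are assumed to be dictionaries mapping attribute names to values, and the function names are illustrative:

import math
from collections import Counter

def entropy(labels):
    # H = -sum over classes of p * log2(p), where p is the class proportion.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, label="class"):
    # Gain(A) = H(parent) - sum over values v of (|S_v| / |S|) * H(S_v),
    # where S_v is the subset of examples with attribute A = v.
    parent = entropy([e[label] for e in examples])
    remainder = 0.0
    for v in {e[attribute] for e in examples}:
        subset = [e[label] for e in examples if e[attribute] == v]
        remainder += (len(subset) / len(examples)) * entropy(subset)
    return parent - remainder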

Knowledge in Learning

● The use of prior knowledge leads to a picture of cumulative learning, in which learning agents improve their learning ability by eliminating otherwise consistent hypotheses and by "filling in" the explanations of examples, thereby allowing for shorter hypotheses. These contributions often result in faster learning from fewer examples.

● Understanding the different logical roles played by prior knowledge, as expressed by entailment constraints, helps to define a variety of learning techniques.

● Explanation-based learning (EBL) extracts general rules from single examples by explaining the examples and generalizing the explanation. It provides a deductive method for turning first-principles knowledge into useful, efficient, special-purpose expertise.

● Relevance-based learning (RBL) uses prior knowledge in the form of determinations to identify the relevant attributes, thereby generating a reduced hypothesis space and speeding up learning. RBL also allows deductive generalizations from single examples.

● Knowledge-based inductive learning (KBIL) finds inductive hypotheses that explain sets of observations with the help of background knowledge.

● Inductive logic programming (ILP) techniques perform KBIL on knowledge that is expressed in first-order logic. ILP methods can learn relational knowledge that is not expressible in attribute-based systems.


● ILP can be done with a top-down approach of refining a very general rule or through a bottom-up approach of inverting the deductive process.

● ILP methods generate new predicates with which concise new theories can
be expressed and show promise as general-purpose scientific theory formation
systems.

Boolean CSPs include as special cases some NP-complete problems, such as 3SAT. In the worst case, therefore, we cannot expect to solve finite-domain CSPs in less than exponential time. In most practical applications, however, general-purpose CSP algorithms can solve problems orders of magnitude larger than those solvable via the general-purpose (non-heuristic) searches described below.


Breadth-first search is complete and optimal, but has exponential cost both in terms of space and time. Depth-first search is neither complete nor optimal, but has exponential time complexity yet linear space complexity. Iterative deepening search is complete, optimal for unit step costs, and has exponential time complexity and linear space complexity.

Informed or heuristic search is applied if we have some knowledge of the path cost or the number of steps between the current state and a goal, and whilst more intelligent these methods can still fare poorly in terms of performance. Greedy search is neither optimal nor complete, and has exponential time and space complexity; it can, however, be very effective provided you have a good heuristic function. A* search combines the good points of greedy search (by making good use of h(n)) and of uniform-cost search (complete and optimal). Whilst being optimally efficient (ie no other optimal algorithm that works by constructing paths from the root can guarantee to examine fewer nodes), it still has exponential time and space complexity. IDA* has only linear space complexity but still has exponential time complexity.

The term backtracking search is used for a depth-first search that chooses
values for one variable at a time and backtracks when a variable has no legal
values left to assign.

In pseudo-code:

function BACKTRACKING-SEARCH(csp) returns a solution, or failure

  return RECURSIVE-BACKTRACKING({}, csp)

function RECURSIVE-BACKTRACKING(assignment, csp) returns a solution, or failure

  if assignment is complete then return assignment

  var <-- SELECT-UNASSIGNED-VARIABLE(VARIABLES[csp], assignment, csp)

  for each value in ORDER-DOMAIN-VALUES(var, assignment, csp) do

    if value is consistent with assignment according to CONSTRAINTS[csp] then

      add { var = value } to assignment

      result <-- RECURSIVE-BACKTRACKING(assignment, csp)

      if result != failure then return result

      remove { var = value } from assignment

  return failure

This simple version of backtracking uses chronological backtracking; a better way would be to go back to one of the set of variables that caused the failure (with a conflict set).

By default, SELECT-UNASSIGNED-VARIABLE in the pseudo-code above simply selects the next unassigned variable in the order given by the list VARIABLES[csp]. This static variable ordering seldom results in the most efficient search. A better way would be to assign the variable with the fewest remaining legal values next. For example, at a certain stage there may be only one possible value that a variable could take; it would make sense to assign that value now and not have to worry about it later. This idea is called the minimum remaining values (MRV) heuristic.

The degree heuristic attempts to reduce the branching factor on future choices
by selecting the variable that is involved in the largest number of constraints on
other unassigned variables.

Once a variable has been selected, the algorithm must decide on the order in
which to examine its values. The least-constraining-value heuristic can be
effective in some cases; it prefers the value that rules out the fewest choices for
the neighbouring variables in the constraint graph. In general, the heuristic is
trying to leave the maximum flexibility for subsequent variable assignments.
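
Putting these pieces together, here is a minimal backtracking search with the MRV heuristic, written against the toy map-colouring CSP sketched earlier (it reuses the illustrative variables, domains and consistent names from that sketch):

def legal_values(var, assignment, domains):
    # Values of var that do not conflict with the current partial assignment.
    return [v for v in domains[var] if consistent(var, v, assignment)]

def select_unassigned_variable(assignment, variables, domains):
    # MRV heuristic: choose the unassigned variable with the fewest legal values left.
    unassigned = [v for v in variables if v not in assignment]
    return min(unassigned, key=lambda var: len(legal_values(var, assignment, domains)))

def backtracking_search(assignment, variables, domains):
    if len(assignment) == len(variables):
        return assignment                              # goal test: assignment is complete
    var = select_unassigned_variable(assignment, variables, domains)
    for value in legal_values(var, assignment, domains):
        assignment[var] = value
        result = backtracking_search(assignment, variables, domains)
        if result is not None:
            return result
        del assignment[var]                            # undo and backtrack
    return None

# Usage: solution = backtracking_search({}, variables, domains)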

An admissible heuristic h is a function that gives a guaranteed lower bound on the distance from any node u to the destination t.

● Natural example: straight-line distance (at maximum speed) to t.

h(n) is monotonic if f(n) = g(n)+h(n) never decreases along a path from the root.

● Almost all admissible heuristics are monotonic


● h(n) is monotonic iff it obeys the triangle inequality

Or rephrased (from Course Text...)

"A heuristic h(n) is monotonic if, for every node n and every successor n' or n
generated by any action a, the estimated cost of reaching the goal from n is no
greater than the step cost of getting to n' plus the estimated cost of reaching the
goal from n':

h(n) < c(n,a,n') + h(n)

(This is a form of the general triangle inequality)."

Best-first search is an instance of the general TREE-SEARCH algorithm in which a node is selected for expansion based on an evaluation function f(n) = estimated cost of the cheapest path from node n to a goal node.

Greedy best first search tries to expand the node that is closest to the goal (ie
f(n)=h(n) ).

A* search however evaluates nodes by combining g(n), the cost to reach the
node, and h(n), the estimated cost of the cheapest path from n to the goal:

f(n) = g(n) + h(n) or in words

f(n) = estimated cost of the cheapest solution through n.

A* is optimal if h(n) is an admissible heuristic.

Proof A* is optimal:

Let Goalopt be an optimal goal state with

f(Goalopt) = g(Goalopt) = fopt

Let Goal2 be a suboptimal goal state with

f(Goal2) = g(Goal2) = f2 > fopt


We need to demonstrate that the search can never select Goal2 (a suboptimal
goal state)

Let n be a leaf node on an optimal path to Goalopt. So

fopt >= f(n)

because h is admissible and we're assuming it's also monotonic.

Now say Goal2 is chosen for expansion before n. This means that

f(n) >= f2

so we've established that

fopt >= f2 = g(Goal2).

But this contradicts the assumption that Goal2 is suboptimal (f2 > fopt). Contradiction!

A* search is also complete provided:

● The graph has finite branching factor;

● There is a finite, positive constant c such that each operator has cost at
least c.

The search expands nodes according to increasing f(n). So the only way it can
fail to find a goal is if there are infinitely many nodes with f(n) < f(Goal).

There are two ways this can happen:

● There is a node with an infinite number of descendants

● There is a path with an infinite number of nodes but a finite path cost.
Given a game tree, the optimal strategy can be determined by examining the
minimax value of each node, MINIMAX-VALUE(n). The minimax value of a
node is the utility (for MAX, ie current player) of being in the corresponding state,
assuming that both players play optimally from there to the end of the game. The
minimax decision is the optimal choice for MAX because it leads to the
successor with the highest minimax value. The minimax algorithm computes
the minimax decision from the current state. It uses a simple recursive
computation of the minimax values of each successor state, directly
implementing the defining equations. The recursion proceeds all the way down to
the leaves of the tree, and then the minimax values are backed up through the
tree as the recursion unwinds.

The minimax algorithm performs a complete depth-first exploration of the game tree.

If the maximum depth of the tree is m, and there are b legal moves at each point, then the time complexity of the minimax algorithm is O(b^m). The space complexity is O(bm) for an algorithm that generates all successors at once, or O(m) for an algorithm that generates successors one at a time.
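
A minimal recursive minimax sketch, assuming a game object with is_terminal(state), utility(state) (the utility for MAX) and successors(state); the names are illustrative:

def minimax_value(game, state, maximizing):
    # Depth-first computation of the minimax value of state, from MAX's point of view.
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax_value(game, s, not maximizing) for s in game.successors(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(game, state):
    # MAX moves to the successor with the highest backed-up minimax value.
    return max(game.successors(state),
               key=lambda s: minimax_value(game, s, maximizing=False))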

For real games the time cost is impractical, so a number of time saving
techniques are implemented. Whilst we can't eliminate the exponent from the
computational complexity, we can effectively cut it in half with alpha-beta pruning.

Consider a node n somewhere in the tree, such that Player has a choice of
moving to that node. If Player has a better choice m either at the parent node of n
or at any choice point further up, then n will never be reached in actual play. So
once we have found out enough about n (by examining its descendants) to reach
this conclusion, we can prune it.

Alpha beta-pruning gets its name from the following two parameters that describe
bounds on the backed-up values that appear anywhere along the path:

● α = the value of the best (ie highest value) choice we have found so far at
any choice point along the path for MAX.

● β = the value of the best (ie lowest value) choice we have found so far at
any choice point along the path for MIN.

Alpha-beta search updates the values of α and β as it goes along and prunes the remaining branches at a node as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.
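
A minimal alpha-beta sketch using the same assumed game interface as the minimax sketch above (again, the names are illustrative):

def alpha_beta_value(game, state, alpha, beta, maximizing):
    # Minimax value with pruning; alpha and beta bound the values MAX and MIN can force.
    if game.is_terminal(state):
        return game.utility(state)
    if maximizing:
        value = float("-inf")
        for s in game.successors(state):
            value = max(value, alpha_beta_value(game, s, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                       # beta cut-off: MIN already has a better option above
        return value
    value = float("inf")
    for s in game.successors(state):
        value = min(value, alpha_beta_value(game, s, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break                           # alpha cut-off: MAX already has a better option above
    return value

def alpha_beta_decision(game, state):
    return max(game.successors(state),
               key=lambda s: alpha_beta_value(game, s, float("-inf"), float("inf"), False))
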
Complexity theory:

Lk-COL, LSAT, LISO and LHAM are contained in LSpace and computable in exponential time.

NP:

It may be hard to find a solution, but a possible solution is relatively small, and once you have guessed one it is easy to verify whether it is actually a solution.

More precisely, the class NP ⊆ P({0, 1}*) (for "nondeterministic polynomial time", roughly "polynomial time using guessing") is defined as the set of all languages L ⊆ {0, 1}* (aka problems) such that there exists a "verification algorithm" V, running in polynomial time, which takes as input a pair (x, w) ∈ {0, 1}* × {0, 1}*, where x is a "problem" and w is a possible "witness" for x ∈ L, and outputs 0 or 1 (where in case V(x, w) = 1 it follows that x ∈ L), and there exists a polynomial p such that for all x ∈ {0, 1}* we have

x ∈ L if and only if there is a w ∈ {0, 1}* with |w| ≤ p(|x|) and V(x, w) = 1.

We have

LISO, Lk-COL, LHAM ∈ NP.
