Artificial Intelligence

Chapter 1
Introduction to Artificial Intelligence

What is AI? What are its applications?

Goals of AI
- Intelligence in machines
- Automation
- To solve problems
- Act like a human
- Think like a human
- Act rationally
- Think rationally

Challenges of AI
- Uncertainty
- Efficiency
- Beliefs

Applications
- Business
- Education
- Medicine
- Entertainment

Applications of AI
- Science and engineering
- Robotics
- Simulation
- Decision support systems (DSS)
- Game playing
- Searching
- Natural language processing
- BEMESS

Artificial Intelligence
Intelligence is the ability to apply knowledge to solve problems. Humans use their intelligence to solve problems, so scientists thought of providing intelligence to machines so that a machine can act like a human; this branch of computer science is called artificial intelligence.

There are mainly four schools of thought which define artificial intelligence as:
- AI is the branch of computer science for developing machines with intelligence which can act like humans.
- AI is the branch of computer science for developing machines with intelligence which can act rationally.
- AI is the branch of computer science for developing machines with intelligence which can think like humans.
- AI is the branch of computer science for developing machines with intelligence which can think rationally.

Hence, simply put, AI can be defined as the branch of computer science for developing machines which can act like a human, think like a human, act rationally and think rationally to solve a given problem.

Interrogation Test
- Ask for help online, where either a server (machine) or a person may reply.
- If the reply seems to come from a person, the interrogation test is successful, else it is unsuccessful.
Chapter 2
Artificial Agent
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.

Eg: Human agent
Perceptors: skin, ears, eyes, etc.
Actuators: limbs, mouth, etc.

Robotic agent
Perceptors: sensors, cameras, antennas, etc.
Actuators: arms, legs, head, etc.

An agent can be expressed in a block diagram as:

Fig: Block Diagram of an Agent

An agent perceives the environment through sensors, and the agent function maps the percept information, together with the percept history, to an action,
i.e. F: P* → A
The agent program runs on the physical architecture to produce F. Hence, an agent is the combined form of agent architecture and agent program.

A rational agent selects an action, given a percept sequence, that is expected to maximize its performance measure, based on the evidence provided by the percept sequence and whatever built-in knowledge the agent has. Rationality is distinct from omniscience (all-knowing, with infinite knowledge).

Types of agent:
Agents can be classified into five types. They are:

a) Table Driven Agent

Fig: Table Driven Agent

b) Simple Reflex Agent

Fig: Simple Reflex Agent

- Fast but too simple
- No memory
- Fails if the environment is partially observable

c) Model Based Reflex Agent e) Learning agent

Fig: Model Based Reflex Agent

- Can work with partial information


- Unclear what to do without clear
goal
It can be divided into four conceptual
elements:
d) Goal based agent i. Learning element: is responsible for
making improvements
ii. Performance element: is
responsible for external actions.
iii. Critic elements: is used by learning
elements that has the agent is doing
iv. Problem generator: is responsible
for suggesting actions that will lead
to new and information experience

Fig: Goal based agent.

- Doesn’t learn
It keeps track of the world state as well as
of goal it tries to achieve and choose an
action that will (eventually) lead to the
achievement of its goal.


Task Environment
A task environment is specified by four parameters: the Performance measure, Environment, Actuators and Sensors of the agent, i.e. PEAS.

Let us consider an example of an automated taxi driver as an artificial agent. Then,
• Performance measure
- Speed, punctuality, handling, safety

• Environment
- Road, traffic light, pedestrian,
vehicle.

• Actuators
- Steering wheels, wipers, headlight,
and horn.

• Sensors
- Radar, camera, fog sensor, fuel
sensor.


Chapter 3
Problem Solving

A problem is defined by its elements and their relations; the state space, initial state, goal state, rules and actions are its elements.

A problem space is an abstract space which encompasses all valid states that can be generated by the application of any combination of operators on any combination of objects. The problem space may contain one or more solutions. It can be a tree or a graph.

Search refers to the search for a solution in the problem space. Search proceeds with different types of search control strategies.

Problem solving is the process of generating a solution from observed data; it is characterized by a set of goals, a set of objects and a set of operations.

Problem Formulation (How to solve a problem?)
- Define the problem precisely (input, goal, initial state)
- Analyze the problem
- Isolate and represent the knowledge necessary to solve the problem
- Choose the best problem-solving method and apply it to the particular problem

Example:
Measuring Problem: measure 4L using a 5L and a 3L jug.

Let the 3L jug be represented by x.
Let the 5L jug be represented by y.

So, a state is (x,y).
Goal state is (3,1) or (0,4) or (2,2) or (1,3) or (3,4) etc.

Operators
S.N.  Rule                    Remark
1     (x,y) → (0,y)           Empties the 3L jug
2     (x,y) → (x,0)           Empties the 5L jug
3     (x,y) → (x−p, y+p)      Pour p L from x to y
4     (x,y) → (x+p, y−p)      Pour p L from y to x
5     (0,y) → (3,y)           Fill the 3L jug
6     (x,0) → (x,5)           Fill the 5L jug

The state space is shown below:
(0,0) (1,0) (2,0) (3,0) (4,0)
(0,1) (1,1) (2,1) (3,1) (4,1)
(0,2) (1,2) (2,2) (3,2) (4,2)
(0,3) (1,3) (2,3) (3,3) (4,3)
(0,4) (1,4) (2,4) (3,4) (4,4)
(0,5) (1,5) (2,5) (3,5) (4,5)

Solution Space:
(0,0) → (0,5) → (3,2) → (0,2) → (2,0) → (2,5) → (3,4) → (0,4)
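A minimal Python sketch of how this state space can be enumerated with breadth-first search; the goal test (4 L in either jug or in total) is an assumption, since the notes list several acceptable goal states.

```python
from collections import deque

def water_jug_bfs(cap_x=3, cap_y=5, target=4):
    """Breadth-first search over (x, y) states of the 3L/5L jug problem."""
    start = (0, 0)
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if x == target or y == target or x + y == target:   # assumed goal test
            return path
        successors = [
            (0, y), (x, 0),                                   # empty a jug
            (cap_x, y), (x, cap_y),                           # fill a jug
            (max(0, x - (cap_y - y)), min(cap_y, y + x)),     # pour x -> y
            (min(cap_x, x + y), max(0, y - (cap_x - x))),     # pour y -> x
        ]
        for s in successors:
            if s not in visited:
                visited.add(s)
                frontier.append((s, path + [s]))
    return None

print(water_jug_bfs())   # e.g. one shortest path ending with 4 litres measured
```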


Types of Problems

a. Single state problem
Environment: deterministic and accessible
- The agent knows everything about the world and can calculate the optimal action sequence to reach the goal state.
Example: playing chess.

b. Multiple state problem
Environment: deterministic and inaccessible
- The agent does not know the exact state and assumes a state while working towards the goal state.
Example: walking in a dark room.

c. Contingency problem
Environment: non-deterministic and inaccessible
- The agent must use sensors during execution; the solution is a tree or policy, and search and execution are often interleaved.
Example: a new skater in an arena.

d. Exploration problem
Environment: unknown state space
- The agent discovers and learns about the environment while taking actions.
Example: a maze game.

Characteristics of Search Strategies (Performance Measuring Parameters)
- Completeness (does it always find a solution if one exists?)
- Time complexity
- Space complexity
- Optimality (does it guarantee the least-cost solution?)

Parameters used for space and time complexity
- b: maximum branching factor
- d: depth of the least-cost solution
- m: maximum depth of the search tree

Searching Strategies

Uninformed Search
- Also called blind search, as it uses no information about the likely "direction" of the goal node.
- Breadth-First Search, Depth-First Search, Uniform Cost Search, Iterative Deepening Search etc. are uninformed search methods.

                    BFS              DFS
Approach            FIFO             LIFO
Memory              Higher           Lower than BFS
Complete            Yes              No
Space complexity    O(b^(d+1))       O(bm)
Time complexity     O(b^(d+1))       O(b^m)
Optimal             Yes              No
Search path         A-B-C-D-E-F-G    A-B-D-E-C-F-G


Breadth-First Search
BFS uses a FIFO queue and expands the shallowest unexpanded node. This search method is complete if b is finite.
- Time complexity: 1 + b + b^2 + b^3 + … + b^d + (b^(d+1) − b) = O(b^(d+1))
- Space complexity: O(b^(d+1)), as every node is kept in memory.
- Optimal: yes, if cost = 1 per step.
- Space is the bigger problem.

Breadth-first search is the uninformed search strategy in which every node at a given depth is explored before moving on to the next level. As it eventually visits every node up to a given depth, it is guaranteed to be complete. In this search the path cost is a non-decreasing function of the depth of the node, since nodes are explored in depth order. Let us assume that the expansion of the tree checks 1000 nodes/sec, each node needs 100 bytes, and b = 10.

Depth   Nodes    Time       Memory
0       1        1 ms       100 B
2       111      0.1 sec    11 KB
4       11111    11 sec     1 MB
6       10^6     18 min     111 MB
8       10^8     31 hours   11 GB

From this empirical analysis, the time and space complexity of BFS can be given by O(b^d), where d is the depth of the tree at which the optimal solution is found.

Example: the search path is A-B-C-D-E-F-G.

Step 1: Fringe [A]. Is A the goal?

Step 2: Fringe [B, C]. Is B the goal?


Step 3: Fringe [C, D, E]. Is C the goal?

Step 4: Fringe [D, E, F, G]. Is D the goal?

Step 5: Fringe [E, F, G]. Is E the goal?

Step 6: Fringe [F, G]. Is F the goal?

Step 7: Fringe [G]. Is G the goal?

The path is: A-B-C-D-E-F-G
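A minimal Python sketch of BFS with a FIFO fringe; the exact shape of the example tree is an assumption taken from the A–G trace above.

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: FIFO fringe, shallowest node expanded first.
    `graph` maps a node to its list of children."""
    fringe = deque([[start]])          # queue of paths
    while fringe:
        path = fringe.popleft()        # FIFO: take the shallowest path
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            fringe.append(path + [child])
    return None

tree = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G']}
print(bfs(tree, 'A', 'G'))   # ['A', 'C', 'G']; nodes visited in order A-B-C-D-E-F-G
```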


Depth-First Search

DFS uses a LIFO stack and expands the deepest unexpanded node. In this method, the successors are kept at the front of the fringe.
- Complete: No, as it fails in infinite-depth spaces and in spaces with loops; Yes in a finite space.
- Time complexity: O(b^m); terrible if m is much larger than d, but if solutions are dense it may be faster than BFS.
- Space complexity: O(bm), i.e. linear space.
- Optimal: No.

Depth-first search is the uninformed search strategy in which one path is explored all the way to the bottom before backing up to another child at a higher level. It is not guaranteed to be complete, as it might get lost following an infinite path, and it does not guarantee optimality, since it may find a deeper solution first. It is a more aggressive, risk-taking search in which one path is chosen and all others are ignored until the end of the chosen path is reached. If there is no end to the chosen path, the process continues forever, which is why this search is incomplete. Its time complexity is O(b^m), where m is the level at which it finds a solution.

Example: the search path is A-B-D-E-C-F-G.

Step 1: Fringe [A]. Is A the goal?

Step 2: Fringe [B, C]. Is B the goal?


Step 3: Fringe [D, E, C]. Is D the goal?

Step 4: Fringe [E, C]. Is E the goal?

Step 5: Fringe [C]. Is C the goal?

Step 6: Fringe [F, G]. Is F the goal?

Step 7: Fringe [G]. Is G the goal?

The path is: A-B-D-E-C-F-G
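A matching Python sketch of DFS with a LIFO fringe, using the same assumed tree as the BFS sketch above.

```python
def dfs(graph, start, goal):
    """Depth-first search: LIFO fringe, newest successors expanded first."""
    fringe = [[start]]                 # list used as a stack of paths
    while fringe:
        path = fringe.pop()            # LIFO: take the deepest path
        node = path[-1]
        if node == goal:
            return path
        # push children in reverse so the left-most child is expanded first
        for child in reversed(graph.get(node, [])):
            fringe.append(path + [child])
    return None

tree = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G']}
print(dfs(tree, 'A', 'G'))   # ['A', 'C', 'G']; nodes visited in order A-B-D-E-C-F-G
```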


Informed Search
- Also termed heuristic search, as it uses information about the domain to head in the general direction of the goal node.
- Best First Search, A*, Hill Climbing etc. are informed search methods.

Best First Search (BeFS)
BeFS is also called Greedy Best First Search. It uses a heuristic function h(n), normally interpreted as the estimated cost (e.g. straight-line distance) from node n to the goal. For this approach
f(n) = h(n)
- Complete: No, as one can get stuck in a loop, e.g. Lasi → Dahi → Lasi → Dahi
- Time complexity: O(b^m), but a good heuristic can give dramatic improvement
- Space complexity: O(b^m), as it keeps all nodes in memory
- Optimal: No

This search tries to expand the node that is closest to the goal, which is likely to lead to a solution quickly. Thus, it evaluates nodes using the heuristic function f(n) = h(n). At each step, best-first search sorts the queue according to the heuristic function.

Best-first search operates on a given problem with an evaluation function and returns a solution sequence. Its inputs are a problem, an evaluation function and a queuing function (which orders nodes by their evaluation, as in general search).

Example: let us consider the tree in the figure, with the heuristic values:

State   Heuristic h(n)
A       366
B       374
C       329
D       244
E       253
F       178
G       193
H       98
I       0

Step 1:

Step 2:
Since h(n) of node E is less than the h(n) of nodes B and C, we choose node E,
i.e. h(n_E) < h(n_B) and h(n_E) < h(n_C).


Step 3:
Since h(n) of node F is smaller than that of node G, we choose node F.

Step 4:
Since there is only one node reachable from node F, we choose node H.

Step 5:
Since node I is the only node reachable from node H, we choose node I.

Now,
Total path cost
Σh(n) = h(n_E) + h(n_F) + h(n_H) + h(n_I)
      = 300 + 220 + 120 + 0
      = 640

Total distance cost
Σg(n) = g(n_E) + g(n_F) + g(n_H) + g(n_I)
      = 110 + 232 + 242 + 252
      = 836

Total cost of search
Σg(n) + Σh(n) = 836 + 640 = 1476

The path is: A → E → F → H → I
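A small Python sketch of greedy best-first search ordered by h(n) only; the heuristic values come from the table above, while the tree shape is an assumption since the original figure is not reproduced here.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Greedy best-first search: the fringe is a priority queue ordered by f(n) = h(n)."""
    fringe = [(h[start], [start])]
    visited = set()
    while fringe:
        _, path = heapq.heappop(fringe)      # node with the smallest heuristic value
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for child in graph.get(node, []):
            heapq.heappush(fringe, (h[child], path + [child]))
    return None

h = {'A': 366, 'B': 374, 'C': 329, 'E': 253, 'F': 178, 'H': 98, 'I': 0}
tree = {'A': ['B', 'C', 'E'], 'E': ['F'], 'F': ['H'], 'H': ['I']}
print(greedy_best_first(tree, h, 'A', 'I'))   # ['A', 'E', 'F', 'H', 'I']
```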


Uniform Cost Search
In this search approach, the fringe queue is ordered by path cost g(n). It is equivalent to BFS if all step costs are equal.
- Complete: yes, if step cost ≥ ε
- Time complexity: O(b^ceil(C*/ε)), where C* is the cost of the optimal solution
- Space complexity: O(b^ceil(C*/ε))
- Optimal: yes

              BFS           UCS                DFS
Complete      Yes           Yes                No
Time          O(b^(d+1))    O(b^ceil(C*/ε))    O(b^m)
Space         O(b^(d+1))    O(b^ceil(C*/ε))    O(bm)
Optimal       Yes           Yes                No

              Depth Limited   Iterative Deepening   Bidirectional
Complete      No              Yes                   Yes
Time          O(b^l)          O(b^d)                O(b^(d/2))
Space         O(b^l)          O(b^d)                O(b^(d/2))
Optimal       No              Yes                   Yes

A* Search
It avoids expanding paths that are already expensive. It uses the heuristic function as well as the node-to-node distance cost, so for this approach
f(n) = g(n) + h(n)
- Complete: yes (unless there are infinitely many nodes with f ≤ f(G))
- Time complexity: exponential
- Space complexity: keeps all nodes in memory
- Optimal: yes

A* search combines the features of uniform cost search (complete, optimal, but inefficient) with those of best-first search (incomplete, non-optimal, but efficient). The queue is sorted by the estimated cost of a path, given by
f(x) = g(x) + h(x)
If the heuristic function never overestimates the distance to the goal, it is said to be admissible, and if h is admissible then f(x) never overestimates the actual cost of the best solution through x.

Example:


Step 1:

Step 2:
Since f(n_E) is less than the f(n) of nodes B and C, we choose node E,
i.e. f(n_E) < f(n_B) and f(n_E) < f(n_C).

Step 3:
Since f(n_G) is less than f(n_F), we choose node G,
i.e. f(n_G) < f(n_F).

Step 4:

Now,
Total path cost
Σh(n) = 300 + 240 + 0 = 540

Total distance cost
Σg(n) = 110 + 200 + 300 = 610

Total cost of search
Σh(n) + Σg(n) = 540 + 610 = 1150

The path is:

A* search is complete as long as the branching factor is always finite. Its time and space complexity is O(b^m) in the worst case, since all generated nodes must be maintained and sorted. However, with a good heuristic it can find the optimal solution for many problems in reasonable time.
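A minimal Python sketch of A* with f(n) = g(n) + h(n). The step costs and heuristic values below are hypothetical, chosen only to illustrate the mechanics; the numbers in the worked example above come from a figure that is not reproduced here.

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: fringe ordered by f = g + h, with g the path cost so far.
    `graph` maps a node to a list of (child, step_cost) pairs."""
    fringe = [(h[start], 0, [start])]          # (f, g, path)
    best_g = {start: 0}
    while fringe:
        f, g, path = heapq.heappop(fringe)
        node = path[-1]
        if node == goal:
            return path, g
        for child, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(child, float('inf')):
                best_g[child] = new_g
                heapq.heappush(fringe, (new_g + h[child], new_g, path + [child]))
    return None

graph = {'A': [('B', 140), ('C', 118), ('E', 110)],
         'E': [('F', 200), ('G', 150)],
         'G': [('I', 300)]}
h = {'A': 366, 'B': 374, 'C': 329, 'E': 300, 'F': 240, 'G': 240, 'I': 0}
print(a_star(graph, h, 'A', 'I'))   # (['A', 'E', 'G', 'I'], 560) with these made-up costs
```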


Hill Climbing

Basically, DFS with a measure of quality assigned to each node in the tree can be regarded as hill climbing search. It can be compared to a blind man climbing a hill, since he cannot see the peak of the hill.

Disadvantages
- Foothill problem: the search might think it is heading for the highest point but only reaches a local maximum and gets stuck in the middle of the search.
- Plateau problem
- Ridge problem

Simulated Annealing Search
It escapes local maxima by allowing some "bad" moves but gradually decreases their frequency. If the temperature T decreases slowly enough, the simulated annealing search will find a global optimum with probability approaching 1. It is widely used in VLSI layout.

Genetic Algorithm (GA)
In this approach, ideas from biological genetics are used to solve the problem. The steps can be given as:
- Start with k randomly generated states (the population).
- A state is represented as a string over a finite alphabet (often a string of 0s and 1s).
- An evaluation function (fitness function) assigns higher values to better states.
- A successor state is generated by combining two parent states.
- The next generation of states is produced by selection, crossover and mutation.

Eg (8-queens):
initial population → fitness function → selection → crossover → mutation

Fitness function: number of non-attacking pairs of queens
(min = 0, max = 8 × 7 / 2 = 28)

24 / (24 + 23 + 20 + 11) = 31%
23 / (24 + 23 + 20 + 11) = 29%
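A minimal Python sketch tying the two ideas together: the 8-queens fitness function described above (non-attacking pairs, maximum 28) used as the quality measure for a steepest-ascent hill climber. The one-queen-per-column representation is an assumption.

```python
import random

def non_attacking_pairs(state):
    """Fitness for 8-queens: number of non-attacking pairs (max 8*7/2 = 28).
    state[c] is the row of the queen in column c."""
    n = len(state)
    attacking = 0
    for i in range(n):
        for j in range(i + 1, n):
            if state[i] == state[j] or abs(state[i] - state[j]) == j - i:
                attacking += 1
    return n * (n - 1) // 2 - attacking

def hill_climb(state):
    """Steepest-ascent hill climbing: move one queen within its column to the best row;
    stop at a local maximum (the foothill problem in the notes)."""
    while True:
        best, best_fit = state, non_attacking_pairs(state)
        for col in range(len(state)):
            for row in range(len(state)):
                neighbour = state[:col] + (row,) + state[col + 1:]
                fit = non_attacking_pairs(neighbour)
                if fit > best_fit:
                    best, best_fit = neighbour, fit
        if best == state:          # no neighbour is better: local (or global) maximum
            return state, best_fit
        state = best

start = tuple(random.randrange(8) for _ in range(8))
print(hill_climb(start))           # fitness 28 means a solution; less means it got stuck
```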


Chapter 4
Adversarial Search

Crypto-Arithmetic Problem

   NINA
 + SING
 ------
  AGAIN

Soln:
Variables used (V) = {N, I, A, S, G}
Digits available (D) = {0, 1, 2, 3, …, 9}
Carries possible (C) = {0, 1}
(Since only two digits are added in each column, the possible carry is 0 or 1.)

Generating the algebraic equations:
A + G = N + 10C1            ----------- (1)
C1 + N + N = I + 10C2       ----------- (2)
C2 + I + I = A + 10C3       ----------- (3)
C3 + N + S = G + 10C4       ----------- (4)
C4 = A

Since the last carry is A, i.e. C4, the initial digit of a number is normally not zero, and the possible carries for the given expression are 0 and 1, the value of A must be 1.
Therefore, C4 = A = 1.

Now, substituting the value of C4 in equation (4), we get
C3 + N + S = G + 10*1
or, C3 + N + S = G + 10
This expresses that the sum N + S + C3 must generate a carry, so their values range from 5 to 9.
Let us take S = 9 and C3 = 0.
Then the equation becomes
0 + N + 9 = G + 10
or, N = G + 10 − 9
or, N = G + 1
Now, let the value of G be 4, since the values of N and S range from 5 to 9.
Therefore, N = 4 + 1 = 5.

Again, substituting the values of A and C3 in equation (3), we get
C2 + I + I = A + 10*0
or, C2 + 2I = 1 + 0
or, C2 + 2I = 1
Since the sum of the carry and 2I does not generate a carry, the value of I could range from 0 to 4, but it must make the result equal to A, i.e. 1. I cannot be 1, 2, 3 or 4, because each of those gives a result other than 1. The only remaining value in the range is 0, so we take I = 0.

Substituting the value of I into the equation, we get
C2 + 2*0 = 1
Therefore, C2 = 1.

Again, substituting the values of I, N and C2 in equation (2), we get
C1 + 5 + 5 = 0 + 10*1
or, C1 = 10 − 10
Therefore, C1 = 0.

Again, substituting the values of A, G, N and C1 in equation (1), we get
1 + 4 = 5 + 10*0
i.e. 5 = 5, which is true.
So our expected values are valid.

Hence, the required values of the variables are
{N, I, A, S, G} = {5, 0, 1, 9, 4}.
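A short Python sketch of a brute-force solver for puzzles of this kind: it tries digit assignments for the distinct letters and keeps one where the addend words sum to the result word, with leading letters non-zero. The helper name and its interface are my own, not part of the notes.

```python
from itertools import permutations

def solve_cryptarithm(words, result):
    """Brute-force cryptarithmetic solver for sums such as NINA + SING = AGAIN."""
    letters = sorted(set(''.join(words) + result))
    assert len(letters) <= 10
    first_letters = {w[0] for w in words + [result]}
    for digits in permutations(range(10), len(letters)):
        env = dict(zip(letters, digits))
        if any(env[l] == 0 for l in first_letters):   # no leading zeros
            continue
        value = lambda w: int(''.join(str(env[c]) for c in w))
        if sum(value(w) for w in words) == value(result):
            return env
    return None

print(solve_cryptarithm(['NINA', 'SING'], 'AGAIN'))
# one satisfying assignment, e.g. {'A': 1, 'G': 4, 'I': 0, 'N': 5, 'S': 9}
```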


   SEND
 + MORE
 ------
  MONEY

Given,
Variables (V) = {S, E, N, D, M, O, R, Y}
Domain (D) = {0, 1, 2, 3, …, 9}
Carry (C) = {0, 1}

    c4 c3 c2 c1
     S  E  N  D
  +  M  O  R  E
  -------------
  M  O  N  E  Y

Generating the algebraic equations:
D + E = Y + 10C1           ------- (1)
C1 + N + R = E + 10C2      ------- (2)
C2 + E + O = N + 10C3      ------- (3)
C3 + S + M = O + 10C4      ------- (4)
C4 = M

Since the last carry is M, i.e. C4, and the initial digit of a number is normally not zero, the only possible carry for the given expression is 1. So the value of M must be 1.
Therefore, C4 = M = 1.

Now, substituting the values of C4 and M in equation (4), we get
C3 + S + 1 = O + 10*1
or, C3 + S + 1 = O + 10
This expresses that the sum C3 + S must generate a carry, so their values range from 5 to 9.
Let us take S = 9 and C3 = 0.
Then the equation becomes
0 + 9 + 1 = O + 10
or, O = 10 − 10
or, O = 0

Again, substituting the values of C3 and O in equation (3), we get
C2 + E + 0 = N + 10*0
or, C2 + E + 0 = N
This expresses that the sum C2 + E does not generate a carry.
Let us take E = 5 and C2 = 1.
So, E = 5 and C2 = 1.
Then the equation becomes
1 + 5 + 0 = N
or, N = 6

Again, substituting the values of C2 and N in equation (2), we get
C1 + 6 + R = 5 + 10*1
or, C1 + 6 + R = 5 + 10
This expresses that the sum C1 + R must generate a carry.
Let us take C1 = 1.
Then the equation becomes
1 + 6 + R = 5 + 10
or, R = 15 − 7
or, R = 8

Again, substituting the values of C1 and E in equation (1), we get
D + 5 = Y + 10*1
or, D + 5 = Y + 10
This expresses that the sum D + E must generate a carry, so their values range from 5 to 9.
Let us take D = 7.
Then the equation becomes
7 + 5 = Y + 10
or, Y = 2

So, our expected values are valid.
Hence, the required values of the variables are
{S, E, N, D, M, O, R, Y} = {9, 5, 6, 7, 1, 0, 8, 2}.


Similar puzzles for practice:

AS + A = MOM              TO + GO = OUT             SHE + THE = BEST
LYNNE + LOOKS = SLEEPY    EAT + THAT = APPLE        SEND + MORE = MONEY
USED + SEX = WORDS        I + DID = TOO             LEO + LEE = ALL
SEEM + MEAN = TEAMS       PLEASE + MAKE = OFFERS    TESS + SEES = ELLEN
THY + HAY = MYTH          SEEN + SOME = BONES       A + FAT = ASS
THIS + SIZE = SHORT       NOTICE + NICE = PRICES
IS + THIS = HERE          HERE + SHE = COMES        MEMO + FROM = HOMER
NINA + SING = AGAIN       TRIED + RIDE = STEER      WHAT + THAT = HERE
SO + SO = TOO             FOOD + FAD = DIETS        WAIT + ALL = GIFTS
AT + EAST + WEST = SOUTH  NO + GUN + NO = HUNT      MAKE + A + CAKE = EMMA
NINE + FINE = WIVES       US + AS = ALL             DOWN + WWW = ERROR
SYSTEM - DEEMED = SENSE   SNIP - NIPS = PINS
DAYS + TOO = SHORT        BASE + BALL = GAMES       ED + DI = DID
TED + HAS + GOOD = TASTE  GEE + SEE + HE = GOES     TAKE + A + CAKE = KATE
TUT + TUT = RAT           GOOD + DOG = OREO         LEAH + LOVES = RUSSIA
DONALD + GERALD = ROBERT  COUPLE + COUPLE = QUARTET
DI + IS = ILL             OOOH + FOOD = FIGHT       NOT + THIS = YOUTH
BUM + BUM + BUM = DUD     HE + SHE + SEE = THIS     OLD + OLD + OLD = GOOD
LEW + WILL = ABLE         OH + GO = GEE             COCA + COLA = OASIS
CRACK + HACK = ERROR      STARS + RATE = TREAT      PEAR + APPLE = GRAPE

NO        NEVER       SHHH
+NO       -DRIVE      -S
LATE      RIDE        ZZZ
LABEL     TAKE        STORE

WOW + WHAT = TOOT         TAKE + THAT = SHEET       CROSS + ROADS = DANGER
ALL + SEAL = BALES        HER + SHARE = THESE       AND + NAME = BRANDS
DAN + NAN = NORA          SPOT + A = GHOST          MEET + MOST = TEENS
BARREL + BROOMS = SHOVELS MOSES + MEETS = SALOME


WHO + IS + THIS = IDIOT   THE + TEN + MEN = MEET    HE + SEES + THE = LIGHT
ON + MONTY + PYTHON = SPIRIT                        SEE + SEND + TEN = THERE



Game Playing
Games can be deterministic, can involve two or more players with perfect information, and can have turn-taking or simultaneous actions. We want an algorithm for calculating a strategy which recommends a move in each state.

A deterministic game can be mathematically characterized as G(S, P, A, f, R), where
States: S (the game starts at s0)
Players: P = {1, …, N}
Actions: A (may depend on the player or state)
Transition function: f : S × A → S
Terminal utilities/rewards: R
Note: a solution for a player is a policy mapping states to actions.

8 1 6
3 5 7
4 9 2

1 15 14 4
12 6 7 9
8 10 11 5
13 3 2 16

17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 2
11 18 25 2 9

1 35 34 33 32 6
30 8 28 27 11 25
24 33 15 16 20 19
18 17 21 22 14 13
12 26 16 9 29 7
31 5 4 3 2 36

30 39 48 1 10 19 28
38 47 7 9 18 27 29
46 6 8 17 26 35 37
5 14 16 25 34 36 45
13 15 24 33 42 44 4
21 23 32 41 43 3 12
22 31 40 49 2 11 20


Game Tree


Generate Game Tree

Sub-tree


Good Move


Min-Max Algorithm

Perfect play for deterministic environments with perfect information. Basic idea: choose the move with the highest minimax value.

Algorithm
1. Generate the game tree completely.
2. Determine the utility of each terminal state.
3. Propagate the utility values upward in the tree by applying the MIN and MAX operators on the nodes at the current level.
4. At the root node, use the minimax decision to select the move with the maximum (of the minimum) utility values.
Note: steps (2) and (3) of the algorithm assume that the opponent will play perfectly.

Mini-Max Tree


Step 1:

Step 2:

Step 3:
i.e. 3 is the minimum, so we take 3.

Step 4:

Step 5:

Step 6:
i.e. 2 is the minimum, so we take 2.


Step 7:

Step 8:

Step 9:

Step 10:

Step 11:

Step 12:

Minimax value = max[min(3,12,8), min(2,5,7), min(14,5,2)]
              = max[3, 2, 2]
              = 3

Properties
- Optimal against a perfect player
- Time complexity: O(b^m)
- Space complexity: O(bm)

Example:
For chess, b ≈ 32 and m ≈ 100, so an exact solution is completely infeasible.

Resource Limits
- Cannot search all the way to the leaves
- Use depth-limited search instead
- The guarantee of optimal play is lost
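A minimal Python sketch of plain minimax on the small example tree above (the node names are made up; the leaf utilities are the ones used in the worked trace).

```python
def minimax(node, maximizing, tree, utility):
    """Minimax over a game tree given as a dict of children; leaves are scored by `utility`."""
    children = tree.get(node)
    if not children:                       # terminal state
        return utility[node]
    values = [minimax(c, not maximizing, tree, utility) for c in children]
    return max(values) if maximizing else min(values)

tree = {'A': ['B', 'C', 'D'],
        'B': ['b1', 'b2', 'b3'], 'C': ['c1', 'c2', 'c3'], 'D': ['d1', 'd2', 'd3']}
utility = {'b1': 3, 'b2': 12, 'b3': 8,
           'c1': 2, 'c2': 5, 'c3': 7,
           'd1': 14, 'd2': 5, 'd3': 2}
print(minimax('A', True, tree, utility))   # max(min(3,12,8), min(2,5,7), min(14,5,2)) = 3
```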


α-β Pruning
The problem with minimax search is that the number of game states it has to examine is exponential in the depth of the tree. We cannot eliminate the exponent, but we can effectively cut it in half by borrowing the idea of pruning in order to eliminate large parts of the tree from consideration; that technique is α-β pruning. When applied to a standard minimax tree it returns the same move as minimax would, but it prunes away branches that cannot possibly influence the final decision.

Here α is the best value found so far for MAX along the path to the root and β is the best value found so far for MIN; as soon as a node is known to be worse than the current α (for MAX) or β (for MIN), its remaining branches are pruned.

Example:

Step 1:

Step 2:

Step 3:
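The same minimax sketch extended with α-β cut-offs; it reuses the hypothetical tree and utilities from the minimax sketch and returns the same root value while skipping branches that cannot change the decision.

```python
def alphabeta(node, maximizing, tree, utility,
              alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: prune once alpha >= beta."""
    children = tree.get(node)
    if not children:
        return utility[node]
    if maximizing:
        value = float('-inf')
        for c in children:
            value = max(value, alphabeta(c, False, tree, utility, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:        # MIN already has a better option elsewhere: prune
                break
        return value
    else:
        value = float('inf')
        for c in children:
            value = min(value, alphabeta(c, True, tree, utility, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:        # MAX already has a better option elsewhere: prune
                break
        return value

tree = {'A': ['B', 'C', 'D'],
        'B': ['b1', 'b2', 'b3'], 'C': ['c1', 'c2', 'c3'], 'D': ['d1', 'd2', 'd3']}
utility = {'b1': 3, 'b2': 12, 'b3': 8, 'c1': 2, 'c2': 5, 'c3': 7,
           'd1': 14, 'd2': 5, 'd3': 2}
print(alphabeta('A', True, tree, utility))   # 3, same as minimax
```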


Chapter 5
Knowledge Representation

Fig: Different types of Knowledge

Fig: Components of AI

Examples of representing sentences in predicate logic:

- John likes strawberry
  likes(John, strawberry)
- Chicken is a food
  food(chicken)
- Fido is a dog
  dog(Fido)
- All dogs are animals
  ∀x (dog(x) → animal(x))
- All animals will die
  ∀x (animal(x) → die(x))
- Someone likes ice-cream
  ∃x (likes(x, icecream))
- All girls like ice-cream
  ∀x (girl(x) → likes(x, icecream))
- Roses are red
  ∀x (rose(x) → red(x))
- Somebody is loyal to someone
  ∃x ∃y loyalto(x, y)
- John likes every kind of food
  ∀x (food(x) → likes(John, x))


Knowledge Base
A knowledge base is a set of sentences in a formal language. The declarative approach to building an agent is to tell it what it needs to know; the agent then asks itself what to do, and the answers should follow from the knowledge base. Knowledge representation (KR) is one of the vital components of an AI system, and it can be expressed in a figure as:

Fig: AI cycle

The perception component helps the system get information from its environment. The system must then form a meaningful and useful representation of the percept information. The KB may be static, or it may be coupled with a learning component, i.e. adaptive, drawing patterns from the perceived data. KR and reasoning are tightly coupled components. Knowledge is defined as the understanding of a subject area of a given domain. Knowledge can be classified as:

a. Procedural knowledge
- Describes how to do things and provides a set of directions on how to perform a task.
Eg: how to make coffee.

b. Declarative knowledge
- Describes objects instead of processes, and what the situation is about.
Eg: it is windy today.

c. Meta knowledge
- Knowledge about knowledge, i.e. an understanding of which knowledge in the given domain matters for drawing conclusions.
Eg: the knowledge that blood pressure is more important for diagnosing a medical condition than skin colour.

d. Heuristic knowledge
- Also referred to as a "rule of thumb"; it is empirical as opposed to deterministic.
Eg: if I start seeing shops, I am close to the market.

e. Structural knowledge
- Describes structures and their components and the relationships between them, i.e. how they are connected with each other.
Eg: how the various parts of a computer fit together.

Fig: Types of Knowledge

There are various methods to represent knowledge, and choosing a proper representation method is important, as it helps in reasoning; knowledge is the power for drawing conclusions in a given domain.


Facts are the building blocks / atomic units of knowledge and represent declarative knowledge. A proposition is considered a fact; it is a statement which has a truth value, i.e. either true or false but not both.
Eg: P: It is raining.

Facts can be classified as:

a. Single-valued or multi-valued facts
- Each fact or attribute can take one or more than one value at the same time.
Eg: a person may have one hair colour but many cars.

b. Uncertain facts
- Represent uncertain information.
Eg: it will probably rain today.

c. Fuzzy facts
- Ambiguous facts, used to represent unclear descriptions; they use certainty-factor values to quantify the degree of truth.
Eg: I am heavy / light.

d. Object-Attribute-Value (O-A-V) triplet facts
- Composed of three parts, i.e. object, attribute and value, which are used to assert a specific property of an object.
Eg: Ram's eye colour is black.
    (O)   (A)        (V)

Rules are the next form of KR; they relate some known information to other information that can be concluded or inferred to be true. A rule consists of two components:
1. Antecedent / premise / IF part
2. Consequent / conclusion / THEN part
Eg: if it is raining then I will not go to college.

Compound rules
Eg: if it is raining and I have an umbrella then I will go home.

Rules can be classified as:
1. Relationship rules
- Used to express a direct occurrence relationship between two events.
Eg: if the nib is broken, then the pen is not working.

2. Recommendation rules
- Offer a recommendation on the basis of some known information.
Eg: if it is raining then bring an umbrella.

3. Directive rules
- Like recommendation rules, but they offer a specific line of action as opposed to the "advice" of a recommendation.
Eg: if it is raining and you don't have an umbrella then wait until the rain stops.

4. Variable rules
- Used when the same type of rule is to be applied to multiple objects.
Eg (with variables):
If X is a student
AND X's GPA > 3.8
Then place X on the merit list.
They are also called pattern-matching rules, as rules are matched with known facts and different possibilities for the variables are tested to determine the truth of the facts.

5. Uncertain rules
- Introduce uncertain facts into the system.
Eg: if you have never passed the exam then you may pass this time.


6. Meta rules
- Describe how to use other rules.
Eg: if you are coughing and you have chest congestion then use the set of respiratory disease rules.

Logic
Propositional logic and predicate calculus are forms of formal logic dealing with propositions.

Propositional Logic
- A proposition is a statement of fact; propositions are assigned symbolic variables whose truth values may be determined.
Eg: P: "It is raining" is false.
- A number of propositions may be logically related using the logical connectives (∨, ∧, ¬, →, ↔) to form compound propositions.

p  q  p∧q  p∨q  p→q  p↔q
F  F   F    F    T    T
F  T   F    T    T    F
T  F   F    T    F    F
T  T   T    T    T    T

Limitations of propositional logic
- Can only represent knowledge as complete sentences
- Can't analyse the internal structure of a sentence
- Absence of quantifiers
- No framework for proving statements
Eg: All kings are men.
    All men are mortal.
    Conclusion: all kings are mortal.
(This inference cannot be captured within propositional logic.)

Tautology
A tautology is a statement whose truth value is always true for every combination of values of the variables in the given domain.
Eg: ¬A ∨ A, (p → q) ∨ (q → p)

Contradiction
A contradiction is a statement whose truth value is always false for every combination of values of the variables in the given domain.
Eg: A ∧ ¬A, ¬(A → B) ∧ (A → B)

Contingency
A contingency is a statement whose truth value is a mixture of true and false over the values of the variables in the given domain.
Eg: (A → B), ¬(A → B)

Predicate Calculus
It is an extension of propositional logic which allows the structure of facts and sentences to be defined and expressed.
Eg: king(Birendra)

Predicate calculus also helps to express the relationships of sub-statement units and provides a mechanism for proving statements, which gives it greater representational power.

Quantifiers
Predicate calculus can use quantifiers in statements to express something about some objects or all objects within a given domain. The universal (∀) and existential (∃) quantifiers are used for expressing such statements.


Universal quantifier
- Symbol ∀, read as 'for all' / 'for every'.
- Valid if (P1 ∧ P2 ∧ P3 ∧ … ∧ Pn) is true, where 1, 2, …, n are the values of the given domain.
- The statement is true for all variables in the given domain.
Eg: ∀x (x + x = 2x)

Existential quantifier
- Symbol ∃, read as 'for some' / 'there exists'.
- Valid if (P1 ∨ P2 ∨ P3 ∨ … ∨ Pn) is true, where 1, 2, …, n are the values of the given domain.
- The statement is true for at least one value in the given domain.

First Order Predicate Logic (FOPL)
It is the simplest form of predicate logic, which uses symbols such as constants (used to name specific objects or properties, e.g. Ram, Green), predicates (a fact or proposition is divided into a predicate and its arguments, e.g. likes(Ram, Gita)), variables (used to represent general classes of objects or properties, e.g. likes(x, y)) and formulae (which combine predicates and quantifiers to represent information).

Eg:
Male(Ram)
Male(Hari)
Parent(Ram, Hari)
Father(x, y) :- Parent(x, y), Male(x)
x, y are variables; Ram, Hari are constants.

Exercise: express the following sentences in FOPL.
1. Ram is a boy.
2. Rita is the mother of Ram.
3. Fido is a dog.
4. Apple is a food.
5. All animals will die.
6. Dogs are animals.
7. No mango is blue.
8. Every girl likes ice cream.
9. Some boys like monkeys.
10. John likes all kinds of food.
11. Everybody is loyal to someone.
12. All cats have a tail and whiskers.
13. Steve only likes easy courses.
14. Science courses are hard.
15. All kings are mortal.
16. Everybody who loves animals is loved by someone.


Inference Rules

Name                        Rule
Modus Ponens                α → β,  α          ∴ β
Modus Tollens               α → β,  ¬β         ∴ ¬α
And introduction            α,  β              ∴ α ∧ β
And elimination             α ∧ β              ∴ α,  β
Resolution                  α ∨ β,  ¬β ∨ γ     ∴ α ∨ γ
Universal instantiation     ∀x (α(x) → β(x))   ∴ α(c) → β(c) for any constant c
Existential instantiation   ∃x (α(x) → β(x))   ∴ α(c) → β(c) for some new constant c

Logical Equivalence
Two or more predicates are logically equivalent if their truth values are the same for every combination of values of their variables.
Eg:
a. ¬(¬A) ≡ A
b. A ∨ B ≡ B ∨ A
c. A ∧ B ≡ B ∧ A
d. A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
e. A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)
f. ¬(A ∨ B) ≡ ¬A ∧ ¬B
g. ¬(A ∧ B) ≡ ¬A ∨ ¬B
h. A → B ≡ ¬A ∨ B
i. A → B ≡ ¬B → ¬A
j. (¬A ∧ B) ∨ ¬B ≡ ¬A ∨ ¬B

CNF (Conjunctive Normal Form)
- Connecting the given predicates in the form of a conjunction of disjunctions.
Eg:
a. ¬(A ∨ B) = ¬A ∧ ¬B
b. A ∨ (B ∧ C) = (A ∨ B) ∧ (A ∨ C)

Proof by Deduction

Fido is a dog.
All dogs are animals.
All animals will die.
Prove: Fido will die.

1. Fido is a dog: dog(Fido)
2. All dogs are animals: ∀x (dog(x) → animal(x))
3. All animals will die: ∀x (animal(x) → die(x))
4. Conclusion: die(Fido)

By the table method:
S.N.  Statement                     Reason
1     dog(Fido)                     Given
2     ∀x (dog(x) → animal(x))       Given
3     ∀x (animal(x) → die(x))       Given
4     dog(Fido) → animal(Fido)      Universal instantiation of statement (2), Fido ∈ x
5     animal(Fido) → die(Fido)      Universal instantiation of statement (3)
6     animal(Fido)                  Applying Modus Ponens to (1) and (4)
7     die(Fido)                     Applying Modus Ponens to (5) and (6)

Converting the predicates into clauses:
1. dog(Fido)
   C1: dog(Fido)
2. ∀x (dog(x) → animal(x)),
   i.e. dog(Fido) → animal(Fido), where Fido ∈ x
   C2: ¬dog(Fido) ∨ animal(Fido)


3. ∀x (animal(x) → die(x)),
   i.e. animal(Fido) → die(Fido), where Fido ∈ x
   C3: ¬animal(Fido) ∨ die(Fido)

Applying resolution to clauses C1 and C2:
   dog(Fido)
   ¬dog(Fido) ∨ animal(Fido)
   C4: animal(Fido)

Again, applying resolution to clauses C4 and C3:
   animal(Fido)
   ¬animal(Fido) ∨ die(Fido)
   die(Fido)

Proof by resolution refutation:
All clauses are the same as above, up to C4.
4. Negate the conclusion, i.e. ¬die(Fido)
   C4: ¬die(Fido)
By resolution refutation, a contradiction (the empty clause) is derived.
Hence, by the refutation method, it is proved that Fido will die.

Facts
Steve only likes easy courses. Science courses are hard. All courses in the basket weaving department are easy. BK301 is a basket weaving course.
Use resolution to answer the question "What course would Steve like?"

Hypotheses
∀x (easy(x) → likes(Steve, x))
∀x (science(x) → ¬easy(x)), i.e. ¬science(x) ∨ ¬easy(x)
∀x (basketweaving(x) → easy(x)), i.e. ¬basketweaving(x) ∨ easy(x)
basketweaving(BK301)

Resolution
¬easy(x) ∨ likes(Steve, x)
¬science(x) ∨ ¬easy(x)
¬basketweaving(x) ∨ easy(x)
basketweaving(BK301)
⟹ easy(BK301)
⟹ likes(Steve, BK301)


- Anyone passing his history exam and winning the lottery is happy. Anyone who studies or is lucky can pass all exams. John did not study but he is lucky. Anyone who is lucky wins the lottery. Prove that John is happy.

1. ∀x (pass(x, history) ∧ win(x, lottery) → happy(x))
2. ∀x ∀y ((study(x) ∨ lucky(x)) → pass(x, y))
3. ¬study(John) ∧ lucky(John)
4. ∀x (lucky(x) → win(x, lottery))

Converting to clause form:
C1: ¬pass(x, history) ∨ ¬win(x, lottery) ∨ happy(x)
C2: ¬study(x) ∨ pass(x, y)
C3: ¬lucky(x) ∨ pass(x, y)
C4: ¬study(John)
C5: lucky(John)
C6: ¬lucky(x) ∨ win(x, lottery)

Resolution:
From C5 and C6: win(John, lottery)
From C5 and C3: pass(John, y), in particular pass(John, history)
From pass(John, history), win(John, lottery) and C1: happy(John)

i.e. happy(John), as required.

- All people who are not poor and are smart are happy. Those people who read are not stupid (i.e. are smart). John can read and is wealthy. Happy people have exciting lives. Draw a conclusion.

Given,
1. ∀x (¬poor(x) ∧ smart(x) → happy(x))
2. ∀x (reads(x) → smart(x))
3. reads(John) ∧ wealthy(John)
4. ∀x (happy(x) → exciting_life(x))

From (2) and (3):
5. smart(John) ∧ wealthy(John)
From (1) and (5), taking wealthy as ¬poor:
6. happy(John)   (the same step can be carried out formally by resolution)
From (6) and (4):
7. exciting_life(John)

Conclusion: John has an exciting life.


- John likes all kinds of food. Apples are food. Chicken is food. Anything anyone eats and isn't killed by is food. Bill eats peanuts and is still alive. Sue eats everything Bill eats. Prove "John likes peanuts" using resolution.

- Every child loves Santa. Everyone who loves Santa is loved by every reindeer. Rudolph is a reindeer and Rudolph has a red nose. Anything which has a red nose is weird or is a clown. No reindeer is a clown. Scrooge doesn't love anything which is weird. Conclusion: "Scrooge is not a child".

- Anyone whom Mary loves is a football star. Any student who doesn't pass doesn't play. John is a student. Any student who doesn't study doesn't pass. Anyone who doesn't play is not a football star. Conclusion: "If John doesn't study, Mary doesn't love John".

Semantic Nets and Frames
Semantic nets are graphs with nodes representing objects and arcs representing relationships between objects. Various types of relationships may be defined using semantic nets; "is-a" (inheritance) and "has-a" (ownership) are the two most used relationship types.

Fig: Vehicle Semantic Net

Semantic nets are computationally expensive at runtime, as we need to traverse the network to answer a question. In the worst case, we may need to traverse the entire network and then discover that the requested information doesn't exist. They cannot fully model human associative memory, as the human brain consists of a huge number of neurons (about 10^10) which are interlinked with each other. Semantic nets are also logically inadequate, as they have no equivalent of the quantifiers ∃ and ∀.

Eg:
A person is a mammal.
Bimal Gharti Magar is a person.
A person has a nose.


Bimal Gharti Magar is in the Nepali team.

The uniform colour of Bimal Gharti Magar is blue/red.

Combining all of the above information, we get the complete semantic net.

Frames are data structures for representing stereotypical knowledge about a concept or object, i.e. a frame is a collection of attributes and associated values that describe some entity in the real world. Frames are similar to schemas in a DBMS. They are developed from semantic nets and can be used to encode knowledge and support reasoning. Frames consist of various components, which are termed slots.

Frame Name: Student          Frame Name: Course
Properties:                  Properties:
  Age: 19                      C-Name: …
  GPA: 4.0                     Credit Hr: …
  Ranking: 1                   C-ID: BEG471CO
Fig: Student Frame           Fig: Course Frame

Frame Name: Student   --Register-->   Frame Name: Course
Properties:                           Properties:
  Age: 19                               C-Name: …
  GPA: 4.0                              Credit Hr: …
  Ranking: 1                            C-ID: BEG471CO
Fig: Student and Course frames with a relationship


o A person is a mammal. Mammals are those who have mammary glands. John is a person and he has 2 legs and 2 hands. John plays in the English team and he wears a uniform which is white/blue in colour. John has a motorcycle. The brand of the motorcycle is Ducati. He visits London on the Ducati.

A person is a mammal.

Mammals are those who have mammary glands.

John is a person and he has 2 legs and 2 hands.

John plays in the English team and he wears a uniform which is white/blue in colour.

John has a motorcycle.

The brand of the motorcycle is Ducati.

He visits London on the Ducati.

Combining all of the above information, we get the complete frame representation.


Chapter 6
Machine Learning / Learning

Concept of Learning
One of the most often heard criticisms of AI is that machines cannot be called intelligent until they are able to learn to do new things and to adapt to new conditions, rather than simply doing as they are told to do. So there can be little doubt that the ability to adapt to new surroundings and to solve new problems is a vital characteristic of an intelligent agent.

Definition
Learning denotes changes in a system that are adaptive in the sense that they enable the system to do the same task, or tasks drawn from the same population, more efficiently the next time. Learning covers a large range of phenomena: at one end of the spectrum is skill refinement, and at the other end lies knowledge acquisition.

Skill Refinement
People get better at many tasks simply by practicing and repeating the same task multiple times.
Eg: if you play chess or football, you will play the next move better.

Knowledge Acquisition
As we have seen, many AI programs draw heavily on knowledge as their source of power, and knowledge is acquired through experience. Knowledge acquisition includes many activities, such as storing the computed information and utilizing the information for the task at hand.

Types of Learning
1. Learning by analogy
2. Rote learning
3. Learning by example
4. Explanation based learning
5. Learning from the advice/influence of others
6. Reinforcement learning

1. Learning by Analogy
Analogy is a powerful inference tool for a learning system.
Let us consider the following sentence:
  Boys are like monkeys.
- One of the properties of monkeys is that they show mischievous activities.
- We realize that boys show mischievous activities and disturb others.

Humans often solve problems by making analogies to things they already know or understand how to do. This process is more complex than storing macro-operators, because the old problem may be different from the new problem. The difficulty comes in determining what things are similar and what are not.
Two methods of analogical problem solving have been studied in AI:
a. Transformational analogy
b. Derivational analogy

Transformational Analogy
Suppose you are asked to prove a theorem in plane geometry. You might look for a previous theorem that is very similar and "copy" its proof, making substitutions when necessary. The idea is to transform a solution of a previous problem into a solution for the current problem; the figure below shows the process.


Fig: Transformational Analogy

Eg:
Old proof:
AC = BD                Given
CD = CD                Reflexive
AC + CD = CD + BD      Additive
AD = CB                Transitive

New proof:
∠ABD = ∠EBC                    Given
∠DBE = ∠EBD                    Reflexive
∠ABD + ∠DBE = ∠EBC + ∠EBD      Additive
∠ABE = ∠DBC                    Transitive

Derivational Analogy
Transformational analogy does not look at how the old problem was solved; it only looks at the final solution. The detail of a problem-solving episode is called its derivation. Analogical learning that takes these histories into account is called derivational analogy. It is a necessary component in the transfer of skills for complex decisions.

Fig: Derivational Analogy

Eg: you have coded an efficient sorting routine in Pascal and then you are asked to recode the routine in LISP. A line-by-line translation is not appropriate, but you will reuse the major structural and control decisions you made when you constructed the Pascal program.

Old problem:
BO = OA                        Radii
∠OAB = ∠OBA                    Base ∠s of an isosceles ∆
∠AOC = ∠OAB + ∠OBA             Sum of the 2 interior ∠s of a ∆ = exterior ∠
∠AOC = ∠OBA + ∠OBA             Since ∠OAB = ∠OBA
∠AOC = 2∠OBA
Derived: the central angle is twice the inscribed angle.


New problem:
∠BOC = 180                     Straight ∠
∠BOC = 2∠BAC                   Central ∠ is twice the inscribed ∠
2∠BAC = 180
∠BAC = 90
Derived: the angle standing on a diameter is a right angle (90°).

2. Rote Learning
It avoids understanding the inner complexity of the subject and instead focuses on memorizing the material so that it can be recalled by the learner exactly the way it was read or heard. Learning something by repeating it over and over again, saying it and trying to remember how to say it, does not help us to understand; it only helps us to remember, like when we learn a poem or a song. That is rote learning.

3. Learning by Example (Inductive Learning)
A process of learning in which the system tries to induce a general rule from a set of observed instances. The learning method extracts rules and patterns out of massive data sets. The learning process belongs to supervised learning; it does classification and constructs class definitions, which is called induction or concept learning.

Let us consider that we are trying to learn a Boolean function (all inputs and outputs are binary).

Eg: there are 2^d possible ways to write d binary values, so with d inputs the training set has at most 2^d instances, where each of these instances can be labelled 0 or 1. Therefore, there are 2^(2^d) possible Boolean functions of d inputs. Each distinct training example removes half of the remaining hypotheses, namely those whose guesses are wrong for that example. This is one way to interpret inductive learning.

If the given training set contains only a small part of all possible instances, as it generally does, then the solution is not unique. After seeing N examples, there remain 2^(2^d − N) possible functions. This is an example of an ill-posed problem, where the data by itself is not sufficient to find a unique solution; unless we see all possible instances, the data alone is not sufficient for inductive learning to find a unique solution.

The set of assumptions which are made to make learning possible is called the inductive bias of the learning algorithm; it restricts the set of hypotheses that the learning algorithm considers when it is learning and guides the algorithm to prefer one hypothesis over another. It is a necessary prerequisite for learning, because inductive learning is an ill-posed problem.
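A tiny Python sketch of the counting argument above for d = 2: enumerate the 2^(2^d) Boolean functions and watch each training example eliminate the hypotheses that disagree with it. The two example data points are made up for illustration.

```python
from itertools import product

d = 2
inputs = list(product([0, 1], repeat=d))                   # the 2^d possible instances
functions = list(product([0, 1], repeat=len(inputs)))      # the 2^(2^d) Boolean functions
print(len(functions))                                      # 16 for d = 2

# Each training example (x, label) removes the hypotheses that disagree with it.
training = [((0, 0), 0), ((1, 1), 1)]
consistent = [f for f in functions
              if all(f[inputs.index(x)] == y for x, y in training)]
print(len(consistent))                                     # 2^(2^d - N) = 4 remaining
```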


Inductive Bias of a Learning Algorithm
The inductive bias is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered. In machine learning, the aim is to construct algorithms that are able to learn to predict a certain target output. To achieve this, the learning algorithm is presented with some training examples that demonstrate the intended relation of input and output values. The learner is then supposed to approximate the correct output, even for examples that have not been shown during training. Without any additional assumptions this cannot be solved exactly, since unseen situations might have arbitrary output values. The kind of necessary assumptions about the nature of the target function are subsumed in the phrase inductive bias.

Training example: an example of the form <x, f(x)>.
Target function (target concept): the true function f of x.
Hypothesis: a proposed function h believed to be similar to f.
Hypothesis space: the space of all hypotheses that can be output by the learning algorithm.

A classic example of an inductive bias is Occam's razor (the principle of parsimony or economy used in logic and problem solving: among competing hypotheses, the one with the fewest assumptions should be selected; one should prefer simpler theories until simplicity can be traded for greater explanatory power; the simplest available theory need not be the most accurate). It assumes that the simplest consistent hypothesis about the target function is actually the best. Here, consistent means that the learned hypothesis yields the correct outputs for all the examples that are given.

Example 1: Maximum conditional independence
If the hypothesis can be cast in a Bayesian framework, try to maximize conditional independence.

Example 2: Minimum description length
When forming a hypothesis, attempt to minimize the length of the description of the hypothesis.

Example 3: Minimum cross-validation error
When trying to choose among hypotheses, select the hypothesis with the lowest cross-validation error.

Example 4: Minimum features
Unless there is good evidence that a feature is useful, it should be deleted.

4. Explanation Based Learning (EBL)
Humans appear to learn quite a lot from examples; this is accomplished by examining a particular situation and relating it to background knowledge in the form of known general principles. This method of learning is EBL. EBL abstracts a general concept from a particular training example, i.e. it is a method for defining a general concept on the basis of a specific training example. It analyses the specific training example in terms of domain knowledge and the goal concept, which results in an explanation structure that explains why the training example is an instance of the goal concept; the explanation structure is then used as the basis for generating the general concept.

Fig: EBL Architecture


Fig: EBL System Schematic

5. Learning from the Advice of Others
It basically includes features similar to learning by analogy and learning by example; the difference is that there may or may not be guidance present while performing the learning activities. It can be supervised or unsupervised learning.

Supervised Learning
It is the learning process in which information about the inputs (the data) and the outputs to be produced is provided, under the supervision of an advisor or supervisor. This type of learning is basically performed to develop complex systems. In this method, the model defines the effect one set of observations, called inputs, has on another set of observations, called outputs. It may include mediating variables between the inputs and outputs. It tries to find the connection between the two sets of observations, and the difficulty of the learning task increases exponentially with the number of steps between the two sets.

Observation (input) → Observation (output)

Unsupervised Learning
In this approach, all observations are assumed to be caused by latent variables, i.e. the observations are assumed to be at the end of the causal chain. Supervision is not needed as long as inputs are available, but if some of the input values are missing, it is not possible to infer anything about the outputs.

Latent variable → Observation

6. Reinforcement Learning
It allows machines and software agents to automatically determine the ideal behaviour within a specific context, in order to maximize their performance. Simple reward feedback is required for the agent to learn its behaviour; this is called the reinforcement signal. It helps the machine/agent learn its behaviour based on feedback from the environment, which can be learned once and for all, or kept adapting as time goes by. If the problem is handled with care, it can converge to the global optimum which maximizes the reward.

Disadvantages
- Memory expensive, as the values of each state must be stored
- Limited perception; it is often impossible to fully determine the current state


Chapter 7
Reasoning

Reasoning is the process of deriving logical conclusions from given facts; it works with knowledge, facts and problem-solving strategies.

Types:
1. Deductive Reasoning
It is based on deducing new information from logically related known information. A deductive argument offers assertions that lead automatically to a conclusion.
Eg: If there is dry wood, oxygen and a spark, there will be fire.
    We deduce: there will be fire.
    All kings are mortal. Birendra is a king.
    We deduce: Birendra is mortal.

2. Inductive Reasoning
It is based on forming, or inducing, a generalization from a limited set of observations, and is based on experience.
Eg:
Observation: all the crows that I have seen in my life are black.
Conclusion: all crows are black.

3. Abductive Reasoning
Deduction is exact in the sense that deductions follow in a logically provable way from the axioms. Abduction is a form of deduction that allows for plausible inference, i.e. the conclusion might be wrong.
Eg: She carries an umbrella if it is raining.
Axiom: she is carrying an umbrella.
Conclusion: it is raining.
Here, she might or might not be carrying the umbrella because it is raining.

4. Analogical Reasoning
It works by drawing analogies between two situations, looking for similarities and differences.
Eg: when you say riding a bicycle is just like riding a scooter, by analogy there must be similarities in the riding but differences in their characteristics.

5. Common Sense Reasoning
It is an informal form of reasoning that uses rules gained through experience, or rules of thumb. It operates on heuristic knowledge.
Eg: if I am seeing shops, I am near the market.

6. Non-Monotonic Reasoning
It is used when the facts of the case are likely to change after some time.
Eg: if the wind blows, the curtains sway. When the wind stops, the curtains should sway no longer. However, if we used monotonic reasoning this wouldn't happen; the fact that the curtains are swaying would be retained even after the wind stopped blowing.
In non-monotonic reasoning, we have a truth maintenance system which keeps track of what caused a fact to become true; if the cause is removed, the fact is removed as well.


Uncertainty in Reasoning
It is remarkable that a science which began with the consideration of games of chance should have become the most important object of human knowledge. The most important questions of life are, for the most part, really only problems of probability. The theory of probabilities is at bottom nothing but common sense reduced to calculus.

All of the time, agents are forced to make decisions based on incomplete information. Even when an agent senses the world to find out more information, it rarely finds out the exact state of the world.

Eg:
- A robot doesn't know exactly where an object is.
- A doctor doesn't know exactly what is wrong with a patient.

When intelligent agents must make decisions, they have to use whatever information they have. Reasoning under uncertainty means determining what is true in the world based on observations of the world. It is used, for example, in neural networks (NN) as a basis for acting under uncertainty, where the agent must decide what action to take even though it cannot precisely predict the outcome of its actions. It includes probability; it shows how to represent the world by making appropriate independence assumptions, and how to reason with such representations.

Probability of an event
P(A) = n_A / S, where n_A is the number of occurrences of A and S is the size of the sample space.

Conditional Probability
Probability of A given B:
P(A|B) = P(A ∧ B) / P(B)

Let P(A ∧ B) = 0.2 and P(B) = 0.3.
So, P(A|B) = 0.2 / 0.3 ≈ 0.66.
Therefore, the probability of occurrence of A when B is given is 0.66.

Independence
A is independent of B if
P(A|B) = P(A),
which indicates that event B has no influence on event A.

Joint Probability
It completely specifies all beliefs in a problem domain. It is an n-dimensional table with a probability in each cell for that state occurring. It is expressed as
P(X1, X2, X3, …, Xn)
and instantiated as
P(x1, x2, …, xn).

Eg: P(toothache, cavity)

            Toothache   ¬Toothache
Cavity        0.04        0.06
¬Cavity       0.01        0.89

Probability Distribution
The distribution of the probability of occurrence of the events, e.g. shown as a histogram:
P(A) = 0.1, P(B) = 0.4, P(C) = 0.4, P(D) = 0.1
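A small Python sketch of reading a conditional probability, such as P(cavity | toothache), off the joint table above; the dictionary keys are just illustrative names.

```python
# Joint distribution over (Cavity, Toothache) from the table above.
joint = {('cavity', 'toothache'): 0.04, ('cavity', 'no_toothache'): 0.06,
         ('no_cavity', 'toothache'): 0.01, ('no_cavity', 'no_toothache'): 0.89}

p_toothache = sum(p for (c, t), p in joint.items() if t == 'toothache')   # marginal
p_cavity_and_toothache = joint[('cavity', 'toothache')]
print(p_cavity_and_toothache / p_toothache)   # P(cavity | toothache) = 0.04 / 0.05 = 0.8
```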


Fig: Histogram of the probabilities over the events A, B, C, D (0.1, 0.4, 0.4, 0.1).

Also, ∑ Pi = 1, and the probability of any event must lie between 0 and 1, or be equal to 0 or 1.

Bayes theorem
Bayes theorem can be expressed as
P(A|B) = P(B|A) · P(A) / P(B)
i.e., conditioned on additional evidence E,
P(A|B, E) = P(B|A, E) · P(A|E) / P(B|E)
where E represents the additional evidence.

Statistical reasoning
There are several techniques that can be used to augment knowledge representation techniques with statistical measures that describe levels of evidence & belief. An important goal for many problem-solving systems is to collect evidence as the system goes along & to modify its behavior on the basis of that evidence; for this we need a statistical theory of evidence. Bayesian statistics is such a theory, which stresses conditional probability as the fundamental notion.

Bayesian Network (BN)
A BN, also known as a belief network, is a graphical structure used to represent knowledge about an uncertain domain, where each node in the graph represents a random variable while an edge between nodes represents a probabilistic dependency among the corresponding random variables. The conditional dependencies in the graph are estimated by using known statistical and computational methods.

Eg:
Let us consider an example of a certain area of Bkt where earthquakes & fire can be detected and trigger the alarm, and let Saroj & Sanskriti be the call operators. Its graph can be represented as:

Fig: Bayesian network in which Fire and Earthquake are parents of Alarm, and Alarm is the parent of the calls by Saroj and Sanskriti.

Let us consider a probability for each (the conditional probability tables):

Fig: Conditional probability tables for the network.

Retrieving a probability from the conditional distribution:
P(x1, x2, ..., xn) = ∏ (i = 1 to n) P(xi | Parents(xi))
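To make the product formula concrete, the sketch below evaluates the factored joint for the alarm example. Only the probabilities used in the worked computation that follows are taken from the notes (P(¬F) = 0.999, P(¬E) = 0.998, P(A|¬F,¬E) = 0.001, P(C|A) = 0.9, P(D|A) = 0.7); the remaining CPT entries are hypothetical placeholders, since the original table figure is not reproduced here.

```python
# Bayesian network for the alarm example: F, E -> A -> {C, D}
# P(f, e, a, c, d) = P(f) * P(e) * P(a | f, e) * P(c | a) * P(d | a)

P_F = 0.001                 # P(Fire): assumed from P(¬F) = 0.999 in the worked example
P_E = 0.002                 # P(Earthquake): assumed from P(¬E) = 0.998
P_A_given = {               # P(Alarm | Fire, Earthquake); only the (False, False) entry
    (False, False): 0.001,  # appears in the notes, the other entries are placeholders
    (False, True):  0.29,
    (True,  False): 0.94,
    (True,  True):  0.95,
}
P_C_given_A = {True: 0.9, False: 0.05}   # P(Saroj calls | Alarm); False entry is a placeholder
P_D_given_A = {True: 0.7, False: 0.01}   # P(Sanskriti calls | Alarm); False entry is a placeholder

def p(value, prob_true):
    """Probability that a Boolean variable takes `value`, given P(variable is True)."""
    return prob_true if value else 1.0 - prob_true

def joint(f, e, a, c, d):
    """Chain-rule factorisation: product of each variable given its parents."""
    return (p(f, P_F) * p(e, P_E)
            * p(a, P_A_given[(f, e)])
            * p(c, P_C_given_A[a])
            * p(d, P_D_given_A[a]))

# P(C & D & A & ¬F & ¬E), the quantity computed by hand below
print(joint(f=False, e=False, a=True, c=True, d=True))
# 0.999 * 0.998 * 0.001 * 0.9 * 0.7, about 0.000628
```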
Using the network above:
P(C & D & A & ¬F & ¬E)
= P(C|A) · P(D|A) · P(A|¬F, ¬E) · P(¬F) · P(¬E)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998 ≈ 0.00063
Similarly, one can evaluate, for example, P(C & D & A & F & ¬E) and P(¬C & D & A & ¬F & E).

Example: In a clinic, 10% of patients entering have liver disease and 5% of patients are alcoholics. Of the patients diagnosed with liver disease, 7% are alcoholics. Find the probability that a patient who is an alcoholic has liver disease.

Soln:
10% of patients entering have liver disease: P(L) = 0.1
5% of patients are alcoholics: P(A) = 0.05
7% of patients diagnosed with liver disease are alcoholics: P(A|L) = 0.07
Probability that an alcoholic patient has liver disease: P(L|A) = ?
By Bayes theorem,
P(L|A) = P(A|L) · P(L) / P(A) = (0.07 × 0.1) / 0.05 = 0.14

Eg: In a particular clinic, 10% of patients are prescribed narcotic pain killers. Overall, 5% of the clinic's patients are addicted to narcotics. Among the addicts of the clinic, 8% have been prescribed narcotics. If a patient is prescribed pain pills, what is the probability that they will become an addict?

Given the following statistics, find the probability that a woman has cancer if she has a positive mammogram result:
- 1% of women over 50 have breast cancer.
- 90% of women who have breast cancer test positive on the mammogram test.
- 8% of women will have a false positive.
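A small helper makes the Bayes-rule arithmetic above easy to check; the second call applies the same formula to the narcotics question, under the reading that the 8% figure is P(prescribed | addict):

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Worked example above: P(liver | alcoholic) = P(alcoholic | liver) * P(liver) / P(alcoholic)
print(bayes(0.07, 0.10, 0.05))   # 0.14

# Narcotics question, reading "among the addicts, 8% have been prescribed narcotics"
# as P(prescribed | addict) = 0.08, with P(addict) = 0.05 and P(prescribed) = 0.10:
print(bayes(0.08, 0.05, 0.10))   # 0.04, i.e. about a 4% chance of addiction
```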
Case Based Reasoning:
Case based reasoning (CBR) is a process of solving new problems based on the solutions of similar past problems.
Example 1: An auto mechanic fixes an engine by recalling another car that exhibited similar symptoms; in doing so he is using case based reasoning.
Example 2: A lawyer who advocates a particular outcome in a trial based on legal precedents, or a judge who creates case law, is using case based reasoning.
Example 3: An engineer copying working elements of nature is treating nature as a database of solutions to problems.

CBR is a prominent kind of analogy making. It is not only a powerful method for computer reasoning but also a pervasive behavior in everyday human problem solving, since much reasoning is based on past cases personally experienced. CBR has been formalized for purposes of computer reasoning as a four step process:
1. Retrieve
2. Reuse
3. Revise
4. Retain

Retrieve:
Given a target problem, retrieve from memory cases relevant to solving it. A case consists of a problem, its solution and annotations about how the solution was derived.
Example: Sabin wants to prepare a cappuccino. Being a novice, the most relevant experience he can recall is one in which he successfully made coffee.

Reuse:
Map the solution from the previous case to the target problem, which may involve adapting the solution as needed to fit the new situation. In the above example, Sabin must adapt his retrieved solution to include how much the coffee is to be stirred.

Revise:
Having mapped the previous solution to the target situation, test the new solution in the real world and, if necessary, revise it. Suppose Sabin adapts his coffee solution by stirring the coffee with milk. After mixing, he discovers that the cappuccino is not made well, so he revises the procedure and stirs the mixture more carefully.

Retain:
After the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory. Sabin accordingly records his new-found procedure for making cappuccino, thereby enriching his set of stored experiences and better preparing him for making cappuccino in the future.
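In practice the Retrieve step is often a nearest-neighbour search over the stored cases. A minimal sketch, with an invented feature encoding and case library (none of these names or values are from the original notes):

```python
# Minimal case-based "retrieve": find the stored case most similar to the target problem.
# The feature encoding and the two stored cases below are illustrative placeholders.
cases = [
    {"problem": {"hot": 1, "espresso": 1, "milk": 0}, "solution": "brew ground coffee"},
    {"problem": {"hot": 1, "espresso": 0, "milk": 0}, "solution": "steep tea leaves"},
]

def similarity(a, b):
    """Count matching attributes (a crude similarity measure)."""
    return sum(1 for k in a if a.get(k) == b.get(k))

def retrieve(target):
    """Return the stored case whose problem description is most similar to the target."""
    return max(cases, key=lambda case: similarity(case["problem"], target))

target = {"hot": 1, "espresso": 1, "milk": 1}   # the cappuccino problem
best = retrieve(target)
print(best["solution"])   # the coffee-making case is retrieved; Reuse/Revise then adapt it
```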
Chapter 8
Expert System

Expert System:
Expert systems were the commercially most successful domain in AI; they mimic experts in a given field. They are also called rule based systems, in which the expert's expertise is built into the program through a collection of rules which perform the desired function at the same level as a human expert. In accordance with Durkin, "An expert system is a computer program designed to model the problem solving ability of a human expert."

Examples of Expert Systems

i. DENDRAL
It was one of the pioneering expert systems, developed at Stanford for NASA to perform chemical analysis of Martian soil for a space mission. It mainly focused on determining the molecular structure of the soil. It fits Durkin's definition of an ES.

ii. MYCIN
It was developed at Stanford to support physicians in diagnosing and treating patients with particular blood diseases. It was tested in 1982; at that time the number of patients exceeded the physicians available, so MYCIN stood in for physicians when they were unavailable.

iii. R1/XCON
Also termed XCON, it was developed by Digital Equipment Corporation to support computer configuration (of the VAX family of minicomputers). A similar example is "Drilling Advisor", which helps in the configuration of drilling equipment and saves the driller from damage; it was used by French oil operations and saved millions at that time.

Roles of Expert System
- Can replace the human expert
- Can assist the human expert

Application of Expert System
1. Control applications
2. Design tools
3. Diagnosis and prescription
4. Instruction and simulation
5. Planning and production
6. Entertainment

Factors                 Human Expert             Expert System
Availability            Limited                  Always
Geographic location     Locally available        Anywhere
Safety consideration    Irreplaceable            Can be replaced
Durability              Depends on individual    Non-perishable
Performance             Variable                 High
Speed                   Variable                 High
Cost                    High                     Low
Learning ability        Variable / High          Low
Explanation             Variable                 Exact
Fig: Structure of Expert System.

Components of Expert System
1. Domain expert
2. Knowledge engineer
3. System engineer
4. Knowledge base
5. Inference engine
6. Working memory
7. User interface and user

1. Domain Expert
Domain experts are persons who have the skills and knowledge to solve a specific problem in a manner superior to others. An expert should have knowledge of the given domain, good communication skills, availability and readiness to cooperate.

2. Knowledge Engineer
A knowledge engineer is a person who designs, builds and tests an expert system. He/she plays a key role in identifying, acquiring and encoding knowledge.

3. End User
All people who use the ES are end users or users.

4. Knowledge Base
It contains the domain knowledge, i.e. facts, rules, concepts and relationships. The strength of an expert system lies in the richness of its knowledge, so knowledge engineers are accountable for designing the knowledge base. As knowledge engineer, the designer must overcome the knowledge acquisition bottleneck and find an effective way to get information from the expert and encode it in the knowledge base, using knowledge representation techniques.

5. Working Memory
It contains the problem facts that are discovered during a session. The facts about the situation presented by the user are stored in working memory, and using the knowledge stored in the KB, new information is inferred and also added to working memory.
Eg: Presented facts:
Male(Ram)
Father(Ram, Shyam)
Father(Ram, Hari)
The presented facts are used to generate new information or knowledge, which is added to working memory. Let the information generated be Brother(Shyam, Hari).
So, the content of WM is:
Male(Ram)
Father(Ram, Shyam)
Father(Ram, Hari)
New fact: Brother(Shyam, Hari)

6. Inference Engine
It is the chief part of an ES, which matches the facts contained in WM with the domain knowledge represented in the KB to draw conclusions about the problem. It works with the KB & WM, where the known facts are present in WM. The inference engine matches the given facts against rule premises; if a match is found, it fires the conclusion of the rule, i.e. adds the conclusion to WM.
7. Inference Mechanism
The inference engine uses some sort of mechanism to add new facts to WM. Different sequences of matching are possible, and there can be multiple strategies for inferring new information depending upon the goal. There are two types of inference mechanism:
i. Forward Chaining
ii. Backward Chaining

Forward Chaining
It is an inference strategy that begins with a set of known facts, derives new facts using rules whose premises match the known facts, and continues the process until a goal state is reached or until no further rules have premises that match the known or derived facts.

Approach
- Add the facts to WM.
- Take each rule in turn and check whether any of its premises match the facts in the WM.
- If a match is found for all premises of a rule, place the conclusion of the rule in WM.
- Repeat until no more facts can be added. Each repetition is called a pass.

Eg:
Rule 1: If father(x, y) and father(x, z), Then sibling(y, z)
Rule 2: If father(x, y), Then paytuition(x, y)
Rule 3: If father(x, y), Then likes(x, y)
Case facts: Father(Ram, Shyam), Father(Ram, Hari)

Solution:
KB (Knowledge Base): Rules 1-3 above.
WM (Working Memory): initially empty; the case facts are added first.

1st Pass
Rule 1 (If father(x, y) and father(x, z), Then sibling(y, z)) fires.
WM: Father(Ram, Shyam), Father(Ram, Hari), Sibling(Shyam, Hari)

2nd Pass
Rule 2 (If father(x, y), Then paytuition(x, y)) fires.
WM also contains: paytuition(Ram, Shyam), paytuition(Ram, Hari)

3rd Pass
Rule 3 (If father(x, y), Then likes(x, y)) fires.
WM also contains: likes(Ram, Shyam), likes(Ram, Hari)

More examples:
1.
Rule 1: If father(x, y) and father(y, z), Then grandfather(x, z)
Rule 2: If father(x, y) and father(x, z), Then sibling(y, z)
Rule 3: If grandfather(x, y), Then grandchild(y, x)
Facts: Father(Ram, Hari), Father(Ram, Shyam), Father(Hari, Muna)
2.
Rule 1: If the patient has a deep cough
        And we suspect an infection
        Then the patient has pneumonia
Rule 2: If the patient's temp. is above 100°F
        Then the patient has fever
Rule 3: If the patient has been sick for over a fortnight
        And the patient has fever
        Then we suspect an infection

Facts:
Patient's temp. is 103°F.
Patient has been sick for over a month.
Patient has violent coughing fits.

Solution:
KB (Knowledge Base): Rules 1-3 above.
WM (Working Memory): Patient's temp. is 103°F; patient has been sick for over a month; patient has violent coughing fits.

1st Pass
Rule 1 is checked first: If the patient has a deep cough and we suspect an infection, then the patient has pneumonia. Its premises are not yet all present in WM, so it does not fire, and the chaining continues with the remaining rules.
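A minimal sketch of the forward-chaining loop described above, using the father/sibling rules from the first example. Unlike the hand-worked solution, this version applies every rule on each pass, but it ends with the same working memory:

```python
# Forward chaining over simple binary predicates, following the pass-based approach above.
# Facts and rule conclusions are stored as (predicate, arg1, arg2) tuples in working memory.

facts = {("father", "Ram", "Shyam"), ("father", "Ram", "Hari")}

def rule_sibling(wm):
    """Rule 1: If father(x, y) and father(x, z) and y != z, then sibling(y, z)."""
    return {("sibling", y, z)
            for (p1, x1, y) in wm if p1 == "father"
            for (p2, x2, z) in wm if p2 == "father" and x1 == x2 and y != z}

def rule_paytuition(wm):
    """Rule 2: If father(x, y), then paytuition(x, y)."""
    return {("paytuition", x, y) for (p, x, y) in wm if p == "father"}

def rule_likes(wm):
    """Rule 3: If father(x, y), then likes(x, y)."""
    return {("likes", x, y) for (p, x, y) in wm if p == "father"}

rules = [rule_sibling, rule_paytuition, rule_likes]

wm = set(facts)
while True:                                   # repeat until no rule adds a new fact
    new = set().union(*(r(wm) for r in rules)) - wm
    if not new:
        break
    wm |= new

print(sorted(wm))
# working memory now also contains sibling(Shyam, Hari), sibling(Hari, Shyam),
# paytuition(Ram, Shyam), paytuition(Ram, Hari), likes(Ram, Shyam), likes(Ram, Hari)
```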
Backward Chaining
It is an inference strategy that works backward from a hypothesis to a proof. In this method, we begin with a hypothesis about what the situation might be and prove it using the given facts. In backward chaining terminology, the hypothesis to prove is called the goal.

Approach
1. Start with the goal.
2. The goal may be in WM initially, so check and finish if it is found.
3. If not, search for the goal in the THEN part of the rules; such a rule is termed the goal rule.
4. Check whether the goal rule's premises are listed in the WM.
5. Premises not listed become sub-goals to prove.
6. The process continues in a recursive fashion until a premise is found that is not supported by any rule, i.e. it cannot be concluded by any rule. Such a premise is called a primitive.
7. When a primitive is found, ask the user for information about it. Backtrack and use this information to prove the sub-goals and subsequently the goal.

Consider the previous example of the patient, where the goal is "the patient has pneumonia".

Solution:
KB (Knowledge Base): Rules 1-3 of the previous example.
WM (Working Memory): initially empty; sub-goals are added as the passes proceed.

1st Pass
Goal: The patient has pneumonia.
Rule 1 concludes the goal, so its premises become sub-goals and are added to WM:
- Patient has a deep cough.
- We suspect an infection.

2nd Pass
Sub-goal: Patient has a deep cough.
This sub-goal is not the conclusion of any of the given rules, so it is a primitive.

3rd Pass
Sub-goal: We suspect an infection.
Rule 3 concludes it, so its premises become sub-goals and are added to WM:
- Patient has been sick for over a fortnight.
- Patient has fever.
4th Pass
Sub-goal: Patient has been sick for over a fortnight.
This sub-goal is not the conclusion of any of the given rules, so it is a primitive.

5th Pass
Sub-goal: Patient has fever.
Rule 2 concludes it, so its premise becomes a sub-goal and is added to WM:
- The patient's temp. is above 100°F.

6th Pass
Sub-goal: The patient's temp. is above 100°F.
This sub-goal is not the conclusion of any of the given rules, so it is a primitive.

Since no further sub-goals can be expanded after the 6th pass, the primitives are asked of the user and we backtrack through the sub-goals.

7th Pass
Ask the user about the primitives:
- Patient has a deep cough.
- Patient has been sick for over a fortnight.
- The patient's temp. is above 100°F.

Conclusion
The goal "the patient has pneumonia" is proved, with WM containing:
- Patient has a deep cough.
- Patient has been sick for over a fortnight.
- The patient's temp. is above 100°F.
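A compact sketch of the same goal-driven procedure. Rules are stored as (premises, conclusion) pairs; premises that no rule concludes are primitives and are looked up in a set standing in for the user's answers:

```python
# Backward chaining on the pneumonia example: start from the goal and expand sub-goals.
rules = [
    ({"patient has deep cough", "we suspect an infection"}, "patient has pneumonia"),
    ({"patient's temp is above 100F"},                      "patient has fever"),
    ({"patient has been sick for over a fortnight",
      "patient has fever"},                                 "we suspect an infection"),
]

# Primitive findings that would be obtained by asking the user (from the facts above).
known = {"patient has deep cough",
         "patient has been sick for over a fortnight",
         "patient's temp is above 100F"}

def prove(goal, depth=0):
    """Prove `goal`: either it is a known primitive, or some rule concludes it and all
    of that rule's premises can be proved recursively."""
    print("  " * depth + "goal: " + goal)
    if goal in known:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(prove(p, depth + 1) for p in premises):
            return True
    return False

print(prove("patient has pneumonia"))   # True, after expanding the same sub-goals as above
```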
Chapter 9
Neural Network

The human brain consists of an enormous number of neurons or nerve cells, which are interconnected with each other through dendrites. Dendrites receive inputs from multiple neighbouring neurons; the inputs are summed up by the soma of the neuron, and the axon acts as a transmitter, carrying the output through a point termed a synapse.
Dendrites are branch-like structures present in a neuron that extend from the cell body or soma & act as the input unit of a neuron. The soma or cell body of a neuron contains the nucleus & other structures that support chemical processing and the production of neurotransmitters; it sums up the inputs. The axon is a single fiber which carries information away from the soma to the synaptic sites of other neurons. A synapse is a point of connection between two neurons, or between a neuron & a muscle or gland. Electrochemical communication between neurons takes place at these points.

A neural net (NN) is an artificial representation of the human brain that tries to simulate its learning process. An ANN is often called a Neural Network or Neural Net. Traditionally, the term neural network referred to a network of biological neurons in the nervous system that processes & transmits information. An ANN is a mathematical or computational model for information processing based on a connectionist approach to computation. ANNs are made of interconnected artificial neurons, which may share some properties of biological neurons. So, it is possible to realize an artificial neural net which exhibits some properties of the biological neuron.

Conventional computers are good at fast arithmetic & do what the programmer's programs ask them to do. They are not good at interacting with noisy data or data from the environment, at massive parallelism, fault tolerance & adapting to circumstances. Von Neumann machines are based on the processing/memory abstraction of human information processing. NNs, by contrast, are based on the parallel architecture of biological brains.

The smallest unit of an ANN is the perceptron, which was developed by McCulloch & Pitts, & a single perceptron can also act as a simple neural net. A perceptron receives a number of inputs, each weighted by a connection that has a strength, i.e. a weight; these correspond to synaptic efficiency in a biological neuron. Each perceptron also has a single threshold value. The weighted sum of the inputs is formed & the threshold is subtracted to compose the activation of the neuron. The activation signal is passed through an activation function to produce the output of the neuron.

Fig: A single Perceptron

Activation Function
If ∑ wi xi + bias > threshold value (θ),
then output = 1, i.e. fire;
else output = 0, i.e. do not fire.
The bias is used for shifting the decision line up & down.
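The firing rule can be written as a one-line function. A small sketch (the example weights anticipate the OR-gate values derived by Hebbian learning later in this chapter):

```python
# Threshold activation of a single perceptron: fire (1) if the weighted sum plus bias
# exceeds the threshold, otherwise do not fire (0).
def perceptron(inputs, weights, bias, threshold=0.0):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > threshold else 0

# Example: weights and bias of the OR gate obtained by Hebbian training below.
print(perceptron([1, 1],   weights=[2, 2], bias=2))   # 1  (fire)
print(perceptron([-1, -1], weights=[2, 2], bias=2))   # 0  (do not fire)
```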
Linearly Separable
If, when the input space is represented with respect to the output space of a class or problem, the classes can be separated or classified with a straight line, then the problem is called linearly separable. Eg: OR gate, AND gate, NOT gate etc.

OR Gate
A  B  F
0  0  0
0  1  1
1  0  1
1  1  1

AND Gate
A  B  F
0  0  0
0  1  0
1  0  0
1  1  1

NOT Gate
A  F
0  1
1  0

Non-Linearly Separable
If, when the input space is represented w.r.t. the output space of a problem, the classes can't be separated or classified with a straight line, then the problem is called non-linearly separable.

XOR Gate
A  B  F
0  0  0
0  1  1
1  0  1
1  1  0

Multilayered Perceptron
The XOR is non-linearly separable, i.e. two or more lines are required to separate the data without incorporating errors. So the multilayered perceptron provides an acceptable solution for such problems by encompassing one or more hidden layers, where each neuron in the hidden layer is responsible for a different separating line.
XOR = A'B + B'A

Fig: Multilayer network for XOR.

Hebbian Learning Neural Network
Updating function:
wi(new) = wi(old) + xi · t    where t is the target value
b(new) = b(old) + t           where b is the bias value

OR Gate (bipolar)
b   x1  x2   t
1    1   1   1
1   -1   1   1
1    1  -1   1
1   -1  -1  -1

Let the neural net be: (figure: a single perceptron with inputs x1, x2, bias b and weights w1, w2)
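Before tracing the updates by hand, here is a short sketch of the same rule in code; one sweep through the bipolar OR-gate rows reproduces the weights obtained in the steps that follow:

```python
# Hebbian learning for the bipolar OR gate: w_i(new) = w_i(old) + x_i * t, b(new) = b(old) + t
training = [  # (x1, x2, target) with bipolar encoding, as in the truth table above
    ( 1,  1,  1),
    (-1,  1,  1),
    ( 1, -1,  1),
    (-1, -1, -1),
]

w1, w2, b = 0.0, 0.0, 0.0
for x1, x2, t in training:
    w1 += x1 * t
    w2 += x2 * t
    b  += t

print(w1, w2, b)   # 2.0 2.0 2.0, the weights obtained in the 4th step below

# Verification: fire iff b + sum(w_i * x_i) > 0
for x1, x2, t in training:
    net = b + w1 * x1 + w2 * x2
    print((x1, x2), net, 1 if net > 0 else -1, "target:", t)
```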
1st Step
Put w1 = w2 = 0, b = 0 & θ = 0.
Now, supply the first input from the truth table, i.e. (1, 1, 1) & target = 1.
Then,
w1(new) = w1(old) + x1·t = 0 + 1 = 1
w2(new) = w2(old) + x2·t = 0 + 1 = 1
b(new) = b(old) + t = 0 + 1 = 1

2nd Step
Put the values of the weights obtained.
Now, supply the second input from the truth table, i.e. (1, -1, 1) & target = 1.
w1(new) = 1 + (-1)·1 = 0
w2(new) = 1 + 1·1 = 2
b(new) = 1 + 1 = 2

3rd Step
Put the values of the weights obtained.
Now, supply the third input from the truth table, i.e. (1, 1, -1) & target = 1.
w1(new) = 0 + 1·1 = 1
w2(new) = 2 + (-1)·1 = 1
b(new) = 2 + 1 = 3

4th Step
Put the values of the weights obtained.
Now, supply the fourth input from the truth table, i.e. (1, -1, -1) & target = -1.
w1(new) = 1 + (-1)·(-1) = 2
w2(new) = 1 + (-1)·(-1) = 2
b(new) = 3 + (-1) = 2

Verification
b   x1  x2   t   b + ∑ wi xi                    Remarks
1    1   1   1   2 + (2·1 + 2·1) = 6            6 > 0, so fire
1   -1   1   1   2 + (2·(-1) + 2·1) = 2         2 > 0, so fire
1    1  -1   1   2 + (2·1 + 2·(-1)) = 2         2 > 0, so fire
1   -1  -1  -1   2 + (2·(-1) + 2·(-1)) = -2     -2 < 0, so do not fire
AND Gate (bipolar)
b   x1  x2   t
1    1   1   1
1   -1   1  -1
1    1  -1  -1
1   -1  -1  -1

Let the neural net be: (figure: a single perceptron with inputs x1, x2, bias b and weights w1, w2)

1st Step
Put w1 = w2 = 0, b = 0 & θ = 0.
Now, supply the first input from the truth table, i.e. (1, 1, 1) & target = 1.
Then,
w1(new) = w1(old) + x1·t = 0 + 1 = 1
w2(new) = w2(old) + x2·t = 0 + 1 = 1
b(new) = b(old) + t = 0 + 1 = 1

2nd Step
Put the values of the weights obtained.
Now, supply the second input from the truth table, i.e. (1, -1, 1) & target = -1.
w1(new) = 1 + (-1)·(-1) = 2
w2(new) = 1 + 1·(-1) = 0
b(new) = 1 + (-1) = 0

3rd Step
Put the values of the weights obtained.
Now, supply the third input from the truth table, i.e. (1, 1, -1) & target = -1.
w1(new) = 2 + 1·(-1) = 1
w2(new) = 0 + (-1)·(-1) = 1
b(new) = 0 + (-1) = -1

4th Step
Put the values of the weights obtained.
Now, supply the fourth input from the truth table, i.e. (1, -1, -1) & target = -1.
w1(new) = 1 + (-1)·(-1) = 2
w2(new) = 1 + (-1)·(-1) = 2
b(new) = -1 + (-1) = -2

Verification
b   x1  x2   t   b + ∑ wi xi                    Remarks
1    1   1   1   -2 + (2·1 + 2·1) = 2           2 > 0, so fire
1   -1   1  -1   -2 + (2·(-1) + 2·1) = -2       -2 < 0, so do not fire
1    1  -1  -1   -2 + (2·1 + 2·(-1)) = -2       -2 < 0, so do not fire
1   -1  -1  -1   -2 + (2·(-1) + 2·(-1)) = -6    -6 < 0, so do not fire
NOT Gate
b   x   t
1   1  -1
1  -1   1

1st Step
Put w = 0, b = 0 & θ = 0.
Now, supply the first input from the truth table, i.e. (1, 1) & target = -1.
Then,
w(new) = w(old) + x·t = 0 + 1·(-1) = -1
b(new) = b(old) + t = 0 + (-1) = -1

2nd Step
Put the values of the weights obtained.
Now, supply the second input from the truth table, i.e. (1, -1) & target = 1.
w(new) = -1 + (-1)·1 = -2
b(new) = -1 + 1 = 0

Verification
b   x   t   b + ∑ wi xi            Remarks
1   1  -1   0 + (-2)·1 = -2        -2 < 0, so do not fire
1  -1   1   0 + (-2)·(-1) = 2      2 > 0, so fire

Learning Methods in Neural Nets

Supervised Learning
In the supervised learning method, we are given a set of example input/output pairs and must learn a rule that can do the job of predicting the output associated with a new input.
Back propagation is one of the methods included in supervised learning for neural nets.

Algorithm
Step 1: Randomize the weights {ws} to small random values that can be either positive or negative.
Step 2: Select a training instance t, i.e. the vector {xk(t)}, k = 1 ... Nin, from the training set.
Step 3: Apply the network input vector to the network inputs.
Step 4: Calculate the network output vector {zk(t)}, k = 1 ... Nout.
Step 5: Calculate the errors for each output k, k = 1 ... Nout: the difference between the desired output & the network output.
Step 6: Calculate the necessary updates for the weights Δws in a way that minimizes the error.
Step 7: Adjust the weights of the network by Δws.
Step 8: Repeat the steps for each instance in the training set until the error for the entire system is acceptably low.
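A minimal sketch of steps 1-8 for a tiny network. The layer sizes (2-2-1), sigmoid activation, learning rate and XOR training set are assumptions made for illustration, not taken from the notes:

```python
import math, random

# Minimal backpropagation sketch: 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output.
random.seed(1)
ALPHA = 0.5
W_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden weights (incl. bias)
W_o = [random.uniform(-1, 1) for _ in range(3)]                      # output weights (incl. bias)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in W_h]
    o = sigmoid(W_o[0] * h[0] + W_o[1] * h[1] + W_o[2])
    return h, o

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR

for epoch in range(20000):                      # step 8: repeat over the training set
    for x, t in data:                           # step 2: select a training instance
        h, o = forward(x)                       # steps 3-4: compute the network output
        delta_o = (t - o) * o * (1 - o)         # step 5: output error term
        deltas_h = [delta_o * W_o[j] * h[j] * (1 - h[j]) for j in range(2)]  # hidden error terms
        # steps 6-7: weight updates that reduce the error
        W_o = [W_o[0] + ALPHA * delta_o * h[0],
               W_o[1] + ALPHA * delta_o * h[1],
               W_o[2] + ALPHA * delta_o]
        for j in range(2):
            W_h[j][0] += ALPHA * deltas_h[j] * x[0]
            W_h[j][1] += ALPHA * deltas_h[j] * x[1]
            W_h[j][2] += ALPHA * deltas_h[j]

# The outputs should approach the XOR targets 0, 1, 1, 0; with an unlucky random
# initialisation a 2-unit hidden layer can get stuck, in which case re-run with another seed.
for x, t in data:
    print(x, round(forward(x)[1], 2), "target:", t)
```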
Unsupervised Learning
In this learning method, we are given a set of examples with no labelling and must group them into sets (clusters). A self-organizing neural net is an example of an unsupervised learning neural net; it performs clustering, vector quantization, function approximation & Kohonen maps.

Algorithm
Step 1: Each node's weights are initialized.
Step 2: A data input from the training data is chosen at random & presented to the cluster lattice.
Step 3: Every cluster centre is examined to calculate whose weights are most like the input vector. The winning node is called the Best Matching Unit (BMU).
Step 4: The radius of the neighbourhood of the BMU is calculated, & nodes within the radius are deemed to be inside the BMU's neighbourhood.
Step 5: Each neighbourhood node's weights are adjusted to make them more like the input vector.
Step 6: Repeat the steps for N iterations.

Neural Network Development / Planning

Fig: Neural network development / planning steps (define the problem, prepare the data, select the algorithm, train the network).

ADALINE (Adaptive Linear Element) Learning
Input to the network:
yin = b + ∑ wi xi
New weights:
b(new) = b(old) + α(t − yin)
w(new) = w(old) + α(t − yin) xi

Eg: Define the problem: construction of an AND gate. Data selection, preparation and pre-processing:
Truth table for the AND gate including bias (bipolar):
b   x1  x2   t
1    1   1   1
1   -1   1  -1
1    1  -1  -1
1   -1  -1  -1

Let the AND network be: (figure: a single ADALINE unit with inputs x1, x2, bias b and weights w1, w2)

Select the best suited algorithm:
Let us consider the ADALINE learning algorithm.
So, the input to the network is
yin = b + ∑ xi wi
New weights:
b(new) = b(old) + α(t − yin)
w(new) = w(old) + α(t − yin) xi
where 0.1 < α < 1.0; α is called the learning rate.
Initially, let b = 0.1, w1 = 0.2, w2 = 0.3 and α = 0.1.
Train the network

Phase I:
Apply the 1st input (1, 1) with target = 1.
yin = 0.1 + (0.2 × 1) + (0.3 × 1) = 0.6
New weights:
b(new) = 0.1 + 0.1(1 − 0.6) = 0.14
w1(new) = 0.2 + 0.1(1 − 0.6)·1 = 0.24
w2(new) = 0.3 + 0.1(1 − 0.6)·1 = 0.34
Maximum change is 0.04.

Phase II:
Apply the values in the network.
Now apply the 2nd input (-1, 1) with target = -1.
yin = 0.14 + (0.24 × (-1)) + (0.34 × 1) = 0.24
New weights:
b(new) = 0.14 + 0.1(−1 − 0.24) = 0.016
w1(new) = 0.24 + 0.1(−1 − 0.24)·(−1) = 0.364
w2(new) = 0.34 + 0.1(−1 − 0.24)·1 = 0.216
Maximum change is 0.124.

Phase III:
Apply the values in the network.
Now apply the 3rd input (1, -1) with target = -1.
yin = 0.016 + (1 × 0.364) + ((−1) × 0.216) = 0.164
New weights:
b(new) = 0.016 + 0.1(−1 − 0.164) = −0.1004
w1(new) = 0.364 + 0.1(−1 − 0.164)·1 = 0.2476
w2(new) = 0.216 + 0.1(−1 − 0.164)·(−1) = 0.3324
Maximum change is 0.1164.

Phase IV:
Apply the values in the network.
Now apply the 4th input (-1, -1) with target = -1.
yin = −0.1004 + (0.2476 × (−1)) + (0.3324 × (−1)) = −0.6804
New weights:
b(new) = −0.1004 + 0.1(−1 + 0.6804) = −0.13236
w1(new) = 0.2476 + 0.1(−1 + 0.6804)·(−1) = 0.27956
w2(new) = 0.3324 + 0.1(−1 + 0.6804)·(−1) = 0.36436
Maximum change is 0.03196.

After Phase IV
Continue from Phase I until the maximum change is less than 0.01. After 500 to 600 iterations we get b = −0.5, w1 = 0.5, w2 = 0.5, i.e. the values of b, w1, w2 of the Hebbian AND network divided by 4.
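One sweep of the delta rule in code reproduces Phases I-IV above (the learning rate and initial weights are the ones chosen in the notes):

```python
# ADALINE (delta rule) updates for the bipolar AND gate, reproducing Phases I-IV above.
ALPHA = 0.1
b, w1, w2 = 0.1, 0.2, 0.3                    # initial values chosen in the notes
data = [( 1,  1,  1),                        # (x1, x2, target), bipolar encoding
        (-1,  1, -1),
        ( 1, -1, -1),
        (-1, -1, -1)]

for x1, x2, t in data:                       # one sweep = Phases I to IV
    y_in = b + w1 * x1 + w2 * x2             # net input
    err  = t - y_in
    b  += ALPHA * err                        # b(new) = b(old) + a(t - y_in)
    w1 += ALPHA * err * x1                   # w(new) = w(old) + a(t - y_in) x_i
    w2 += ALPHA * err * x2
    print(round(b, 5), round(w1, 5), round(w2, 5))
# The four printed rows reproduce the weights of Phases I-IV (the last row is
# -0.13236, 0.27956, 0.36436). Repeating such sweeps, as the notes do for a few
# hundred iterations, drives the weights toward b = -0.5, w1 = w2 = 0.5.
```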