
Inferential reasoning

Hridaya Kandel

hridayakandel@gmail.com

AI Lecturer

BScCSIT Vth

Hridaya Kandel 1

Objective

Students should understand the importance of knowledge

representation

Students should understand the use of formal logic as a knowledge

representation language

The student should be familiar with the concepts of logic such as

syntax, semantics, validity, models, entailment.

Students should be able to represent a natural language description as statements in logic and deduce new sentences by applying inference rules.

Students should learn in detail about resolution techniques

Knowledge-based Agents

Intelligent agents should have capacity for:

Perceiving, that is, acquiring information from environment,

Knowledge Representation, that is, representing its understanding of

the world,

Reasoning, that is, inferring the implications of what it knows and of

the choices it has, and

Acting, that is, choosing what it wants to do and carrying it out.

Representation: How are the things stored?

Reasoning: How is the knowledge used?

To solve a problem

To generate more knowledge


Knowledge-based Agents

Representation of knowledge and the reasoning process are

central to the entire field of artificial intelligence.

Useful mostly in partially observable environments: the agent can combine its knowledge with current percepts to infer hidden aspects of the current state prior to selecting actions.

Example: a physician diagnoses a patient

that is, infers a disease state that is not directly observable

Some of the knowledge used is in the form of rules learned from textbooks and teachers, and some is in the form of patterns of association that the physician may not be able to consciously describe.

If it is inside the physician's head, it counts as knowledge.

Knowledge-based Agents

Understanding natural language also requires inferring hidden

state, namely, the intention of the speaker.

Example:

When we hear "John saw the diamond through the window and coveted it", we know "it" refers to the diamond and not the window.

When we hear "John threw the brick through the window and broke it", we know "it" refers to the window.

Reasoning allows us to cope with the virtually infinite variety of utterances (vocal expressions) using a finite store of commonsense knowledge.

Problem-solving agents have difficulty with this kind of ambiguity because their representation of contingency problems is inherently exponential.

Knowledge-based Agents

Knowledge-based Agents are flexible

They are able to accept new tasks in the form of explicitly described

goals,

they can achieve competence quickly by being told or learning new

knowledge about the environment, and

they can adapt to changes in the environment by updating the relevant

knowledge.


Knowledge-based Agents

The central component of a knowledge-based agent is its

knowledge base, or KB. Informally, a knowledge base is a set

of sentences (not English sentences).

Knowledge base = set of sentences in a formal language

Each sentence is expressed in a language called a knowledge

representation language. A language used to express

knowledge about the world

Declarative approach: the language is designed to make it easy to express knowledge about the world it is being used for.

Procedural approach: encodes desired behaviors directly in program code.

Knowledge-based Agents

New sentences are added to the knowledge base, and queries are made on the knowledge base. These two tasks are performed using two generic functions:

TELL: add new sentences (facts) to the KB

Tell it what it needs to know

ASK: query what is known from the KB

Ask what to do next

When the agent draws a conclusion from available information, it is guaranteed to be correct if the available information is correct.

Wumpus World

Performance measure

gold +1000, death -1000

-1 per step, -10 for using the arrow

Environment

Squares adjacent to wumpus are smelly

Squares adjacent to pit are breezy

Glitter iff gold is in the same square

Shooting kills wumpus if you are facing it

Shooting uses up the only arrow

Grabbing picks up gold if in same square

Releasing drops the gold in same square

Sensors: Stench, Breeze, Glitter, Bump, Scream

Actuators: Left turn, Right turn, Forward, Grab, Release, Shoot

The percepts will be given to the agent in the form of a list of five symbols; for example, if there is a stench and a breeze, but no glitter, bump, or scream, the agent will receive the percept [Stench, Breeze, None, None, None].

Wumpus World

Deterministic? Yes: outcomes are exactly specified.

Episodic? No: sequential at the level of actions.

Static? Yes: the wumpus and pits do not move.

Discrete? Yes.

Single-agent? Yes: the wumpus is essentially a natural feature.

Exploring Wumpus World

We will mark down what we know: initially the agent (A) is in [1,1], and we know that that square is OK.

It then gets the first percept (remember, the order is [Stench, Breeze, Glitter, Bump, Scream]); the first percept has these all null: [None, None, None, None, None].

The agent will move only into a square it knows to be OK.


Exploring Wumpus World

In initial state, there are no percepts (the sequence is none, none, none, none,

none) and therefore it can be inferred that the neighboring squares are safe (OK).


Exploring Wumpus World

The agent decides to move up and feels a breeze. What does this mean?

Exploring Wumpus World

That there is a pit in one of the two neighboring squares. Better not to move there, since it knows there is a safer move if it backs up.


Exploring Wumpus World

A pretty difficult inference! Note the difficulty: the inference of where the pit is depends on the lack of a percept (no breeze in [2,1]) and on percepts gathered over time.


Exploring Wumpus World

In each case where the agent draws a conclusion from the available information, that conclusion is guaranteed to be correct if the available information is correct. This is a fundamental property of logical reasoning.

Logic in general

Knowledge Bases consist of sentences

Logics are formal languages for representing information such that

conclusions can be drawn

Syntax defines how symbols can be put together to form the sentences in

the language

E.g., in ordinary arithmetic: x + y = 4 is a well-formed sentence, whereas x2y+= is not.

Semantics define the "meaning" of sentences;

Semantics allows you to relate the symbols in the logic to the domain you're trying to model.

The semantics of the language defines the truth of each sentence with

respect to each possible world.

x + y =4 is true in a world where x is 2 and y is 2, but false in a world

where x is 1 and y is 1.

When we need to be precise, we will use the term model in place of

possible world.
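As a tiny illustration of these definitions, a possible world (model) can be represented as an assignment of values to symbols, and a sentence as something that is true or false with respect to that assignment. This is only a sketch; the representation (dicts and functions) is my own, not from the slides:

```python
# A possible world assigns values to symbols; semantics defines the truth
# of each sentence with respect to each possible world.

def holds(sentence, world):
    """Evaluate a sentence (a boolean function of a world) in a world."""
    return sentence(world)

# The sentence "x + y = 4" from the slides:
sentence = lambda w: w["x"] + w["y"] == 4

print(holds(sentence, {"x": 2, "y": 2}))  # True in the world x=2, y=2
print(holds(sentence, {"x": 1, "y": 1}))  # False in the world x=1, y=1
```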


Logic in general

Entailment means that one thing follows logically from

another.

In mathematical notation, we write α ⊨ β to mean that the sentence α entails the sentence β.

Formally: in every model in which α is true, β is also true. Another way to say this: if α is true, then β must be true.

Informally, the truth of β is "contained in" the truth of α.

E.g., in any model where x + y = 4, such as the model in which x is 2 and y is 2, it is the case that 4 = x + y.

We will see shortly that a knowledge base can be considered a statement, and we often talk of a knowledge base entailing a sentence.

Logic in general

KB ⊨ α

Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true.

E.g., a KB containing "Nepal won" and "India won" entails "Either Nepal won or India won".

E.g., x + y = 4 entails 4 = x + y.

Models are formally structured worlds with respect to which truth can be evaluated.

"m is a model of a sentence α" means that sentence α is true in model m.

M(α) is the set of all models of α. Then:

KB ⊨ α iff M(KB) ⊆ M(α)
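The definition KB ⊨ α iff M(KB) ⊆ M(α) can be checked directly by enumerating models. A minimal sketch, assuming sentences are represented as Boolean functions of a model (my representation, not the slides'):

```python
from itertools import product

def models_of(sentence, symbols):
    """Return the set of models (tuples of truth values) satisfying sentence."""
    return {vals for vals in product([False, True], repeat=len(symbols))
            if sentence(dict(zip(symbols, vals)))}

def entails(kb, alpha, symbols):
    """KB |= alpha  iff  M(KB) is a subset of M(alpha)."""
    return models_of(kb, symbols) <= models_of(alpha, symbols)

# "Nepal won" and "India won" entails "Nepal won or India won":
syms = ["NepalWon", "IndiaWon"]
kb = lambda m: m["NepalWon"] and m["IndiaWon"]
alpha = lambda m: m["NepalWon"] or m["IndiaWon"]
print(entails(kb, alpha, syms))  # True
print(entails(alpha, kb, syms))  # False: the disjunction does not entail the conjunction
```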

Logic in general

Entailment in Wumpus-World

Consider a situation:

The agent has detected nothing in [1,1] and a breeze in [2,1].

The agent is interested in whether the adjacent squares [1,2], [2,2],

and [3,1] contain pits.

Each of the three squares might or might not contain a pit, so (for the purposes of this example) there are 2³ = 8 possible models.

Logic in general

The KB is false in models that contradict what the agent knows; for example, the KB is false in any model in which [1,2] contains a pit, because there is no breeze in [1,1]. There are in fact just three models in which the KB is true, and these are shown as a subset of the models in the figure.

Logic in general

Now let us consider two possible conclusions:

α1 = "There is no pit in [1,2]."
α2 = "There is no pit in [2,2]."

We have marked the models of α1 in the figure. By inspection, we see the following: in every model in which KB is true, α1 is also true.

Hence, KB ⊨ α1: there is no pit in [1,2].

Logic in general

For the conclusion α2 = "There is no pit in [2,2]": by inspection, we see that in some models in which KB is true, α2 is false.

Hence, KB ⊭ α2: the agent cannot conclude that there is no pit in [2,2].

The preceding example not only illustrates entailment, but also shows how the definition of entailment can be applied to derive conclusions, that is, to carry out logical inference.

Logic in general

Model checking: enumerate all possible models to check that α is true in all models in which KB is true.

Inference is the process of deriving a specific sentence from a KB (where the sentence must be entailed by the KB).

KB ⊢i α: sentence α can be derived from KB by procedure i.

The KB is a haystack, entailment is the needle, and inference is finding it.

Logic in general

An inference algorithm that derives only entailed sentences is called sound or truth-preserving.

Soundness

Procedure i is sound if, whenever KB ⊢i α, it is also true that KB ⊨ α: only entailed sentences are derived.

Completeness

Procedure i is complete if, whenever KB ⊨ α, it is also true that KB ⊢i α.

Consequence: if KB is true in the real world, then any sentence derived from KB by a sound inference procedure is also true in the real world.

Propositional Logic

Propositional logic is the simplest logic.

Also Known As Boolean Logic

The syntax of propositional logic defines the allowable sentences.

Proposition symbols P1, P2, etc. are sentences.

Atomic sentence: consists of a single proposition symbol, which is True or False.

Complex sentence: a sentence constructed from simpler sentences using parentheses and logical connectives:

¬ (not): negation
∧ (and): conjunction
∨ (or): disjunction
⇒ (implies): implication (premise ⇒ conclusion)
⇔ (if and only if): biconditional

Propositional Logic


Truth table for connectives:

P     Q     ¬P    P∧Q   P∨Q   P⇒Q   P⇔Q
False False True  False False True  True
False True  True  False True  True  False
True  False False False True  False False
True  True  False True  True  True  True
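The connective truth table can be reproduced in code; the helper names implies and iff below are my own:

```python
from itertools import product

# Generate the truth table for the five connectives.
# Note the standard convention: False => anything is True.

def implies(p, q):
    return (not p) or q

def iff(p, q):
    return p == q

rows = []
for p, q in product([False, True], repeat=2):
    rows.append((p, q, not p, p and q, p or q, implies(p, q), iff(p, q)))

for row in rows:
    print(row)
```

The first row, (False, False, True, False, False, True, True), matches the first row of the table above.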

Propositional Logic

Formal grammar for propositional logic can be given as below


Propositional Logic

A simple KB : Wumpus World

For simplicity: we deal only with pits.

Choose a vocabulary of proposition symbols. For each i, j:

Let Pi,j be True if there is a pit in [i,j].
Let Bi,j be True if there is a breeze in [i,j].

The KB contains the following rules:

There is no pit in [1,1]:
R1: ¬P1,1

A square is breezy if and only if there is a pit in a neighboring square (for simplicity, only the relevant squares):
R2: B1,1 ⇔ (P1,2 ∨ P2,1)
R3: B2,1 ⇔ (P1,1 ∨ P2,2 ∨ P3,1)

The breeze percepts for the first two squares visited in the specific world the agent is in:
R4: ¬B1,1
R5: B2,1
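Rules R1 through R5 can be checked by enumerating all truth assignments over the seven symbols, matching the counts in the truth-table figure. A sketch (symbol names like B11 are my encoding):

```python
from itertools import product

# Model-check the pit KB R1-R5 over all 2^7 = 128 truth assignments.
symbols = ["B11", "B21", "P11", "P12", "P21", "P22", "P31"]

def kb(m):
    r1 = not m["P11"]                                   # R1: no pit in [1,1]
    r2 = m["B11"] == (m["P12"] or m["P21"])             # R2: B1,1 <=> (P1,2 v P2,1)
    r3 = m["B21"] == (m["P11"] or m["P22"] or m["P31"]) # R3
    r4 = not m["B11"]                                   # R4: no breeze in [1,1]
    r5 = m["B21"]                                       # R5: breeze in [2,1]
    return r1 and r2 and r3 and r4 and r5

models = [dict(zip(symbols, vals))
          for vals in product([False, True], repeat=len(symbols))]
kb_models = [m for m in models if kb(m)]

print(len(models))                           # 128 rows in the truth table
print(len(kb_models))                        # KB is true in just 3 of them
print(all(not m["P12"] for m in kb_models))  # no pit in [1,2] in every KB model
print(any(m["P22"] for m in kb_models))      # but [2,2] may contain a pit
```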

Propositional Logic

Figure: A truth table constructed for the knowledge base as discussed. KB is true if R1 through R5 are true, which occurs in just 3 of the 128 rows. In all 3 rows, P1,2 is false, so there is no pit in [1,2]. On the other hand, there might (or might not) be a pit in [2,2].

KB example


Propositional Logic: Equivalence


Validity, Satisfiability, Unsatisfiability

Valid sentences are also known as tautologies.
e.g. True, A ∨ ¬A, A ⇒ A, (A ∧ (A ⇒ B)) ⇒ B

Validity is connected to inference via the Deduction Theorem:
KB ⊨ α if and only if (KB ⇒ α) is valid.

A sentence is satisfiable if it is true in some model.
e.g. A ∨ B, C

A sentence is unsatisfiable if it is true in no model.
e.g. A ∧ ¬A

Satisfiability is connected to inference via the following:
KB ⊨ α iff (KB ∧ ¬α) is unsatisfiable.
This is proof by contradiction.
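Both connections (the Deduction Theorem and proof by contradiction) can be demonstrated by brute-force enumeration. A sketch with illustrative sentences of my choosing:

```python
from itertools import product

def satisfiable(sentence, symbols):
    """True iff the sentence is true in at least one model."""
    return any(sentence(dict(zip(symbols, vals)))
               for vals in product([False, True], repeat=len(symbols)))

def valid(sentence, symbols):
    """True iff the sentence is true in every model (a tautology)."""
    return all(sentence(dict(zip(symbols, vals)))
               for vals in product([False, True], repeat=len(symbols)))

print(valid(lambda m: m["A"] or not m["A"], ["A"]))        # A v ~A: tautology
print(satisfiable(lambda m: m["A"] and not m["A"], ["A"])) # A ^ ~A: unsatisfiable

syms = ["A", "B"]
kb    = lambda m: m["A"] and m["B"]
alpha = lambda m: m["A"] or m["B"]

# Deduction theorem: KB |= alpha iff (KB => alpha) is valid.
print(valid(lambda m: (not kb(m)) or alpha(m), syms))       # True

# Refutation: KB |= alpha iff (KB ^ ~alpha) is unsatisfiable.
print(satisfiable(lambda m: kb(m) and not alpha(m), syms))  # False
```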

Exercise

Consider the proposition symbols A, B, C, and D. How many models are there for the following sentences?

a) (A ∧ B) ∨ (B ∧ D)

b) A ∨ B

c) A B C


Reasoning Patterns

Inference Rules

Patterns of inference that can be applied to derive chains of conclusions that lead to the desired goal.

Proof rules (or inference rules) show us, given true statements, how to generate further true statements.

Example: p ∨ ¬p is an axiom of propositional logic.

We use the symbol ⊢, denoting "is provable" or "is true". We write A1, ..., An ⊢ B to state that B follows from (or is provable from) A1, ..., An, given some set of inference rules.

Inference Rules

Modus Ponens

This well-known proof rule is called modus ponens: in general, given α ⇒ β and α, β can be inferred.

E.g., given (WumpusAhead ∧ WumpusAlive) ⇒ Shoot and (WumpusAhead ∧ WumpusAlive), Shoot can be inferred.

Inference Rules

AND ()-elimination

From a conjunction, any of the conjuncts can be inferred: from A ∧ B, infer A (or B).

This can be read: if A and B hold (or are provable, or true), then A must also hold.

E.g., from (WumpusAhead ∧ WumpusAlive), WumpusAlive can be inferred.

Inference Rules

OR ()-introduction

Another proof rule, known as ∨-introduction: from A, infer A ∨ B.

This can be read: if A holds (or is provable, or true), then A ∨ B must also hold.

All of the logical equivalences can be used as inference rules

Note:

All of the logical equivalences can be used as inference rules.

A sequence of applications of inference rules is called a proof. Finding proofs is exactly like finding solutions to search problems.

Monotonicity says that the set of entailed sentences can only increase as information is added to the knowledge base: if we have a proof, adding information to the KB will not invalidate the proof.

Inference Rules: Example

From r ⇒ s and s ⇒ p, can we prove r ⇒ p, i.e. show r ⇒ s, s ⇒ p ⊢ r ⇒ p?

Resolution

We have argued that the inference rules covered so far are sound, but we have not

discussed the question of completeness for the inference algorithms that use them.

Resolution is a proof method for classical propositional and first-order logic.

Resolution allows a complete inference mechanism (search-based) using only one rule of inference, namely resolution itself.

The (propositional) resolution rule is as follows: from A ∨ p and B ∨ ¬p, infer A ∨ B.

A ∨ p and B ∨ ¬p are called the parents of the resolvent A ∨ B; p and ¬p are called complementary literals.
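The resolution rule can be sketched on clauses represented as sets of literals (a common representation; the string encoding with "-" marking negation is my choice, not the slides'):

```python
# Propositional resolution: clauses are frozensets of literals;
# a literal is a string, with "-" marking negation.

def negate(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (one per complementary pair)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

# (A v p) and (B v ~p) resolve to (A v B):
print(resolve(frozenset({"A", "p"}), frozenset({"B", "-p"})))
```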

Resolution

Unit Resolution

The unit resolution rule takes a clause (a disjunction of literals) and a literal, and produces a new clause. A single literal is also called a unit clause.

The generalized resolution rule takes two clauses of any length and produces a new clause, as below.

Example:


Resolution

The resolution method involves translation to a normal form (CNF). To prove a fact P, repeatedly apply resolution until either:

No new clauses can be added (KB does not entail P), or
The empty clause is derived (KB does entail P).

This is proof by contradiction: if we prove that KB ∧ ¬P derives a contradiction (the empty clause), and we know KB is true, then ¬P must be false, so P must be true!

To apply resolution mechanically, facts need to be in Conjunctive Normal Form (CNF).

A sentence is in Conjunctive Normal Form (CNF) if it is a conjunction of clauses, each clause being a disjunction of literals.

Example:

Resolution Method

Example

Show by resolution that the following set of clauses is unsatisfiable.

The sets of clauses are already in CNF.


Resolution Method: example

First convert B1,1 ⇔ (P1,2 ∨ P2,1) to CNF:

1. Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α):
(B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1)

2. Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β:
(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬(P1,2 ∨ P2,1) ∨ B1,1)

3. Move ¬ inwards using De Morgan's rules and double-negation:
(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ ((¬P1,2 ∧ ¬P2,1) ∨ B1,1)

4. Apply distributivity of ∨ over ∧:
(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬P1,2 ∨ B1,1) ∧ (¬P2,1 ∨ B1,1)

Resolution Method: Example

Conclusion: there is no pit in [1,2], i.e. α = ¬P1,2.

Proof by contradiction: show that KB ∧ ¬α is unsatisfiable.

We have:
KB = (B1,1 ⇔ (P1,2 ∨ P2,1)) ∧ ¬B1,1
α = ¬P1,2
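This refutation can be carried out mechanically: add the negated conclusion (P1,2) to the CNF clauses of the KB and resolve until the empty clause appears. A sketch, with clauses as sets of string literals ("-" marks negation; the encoding is mine):

```python
def negate(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1 in clauses:
            for c2 in clauses:
                if c1 == c2:
                    continue
                for r in resolve(c1, c2):
                    if not r:
                        return True     # empty clause: contradiction found
                    new.add(r)
        if new <= clauses:
            return False                # no new clauses: no contradiction
        clauses |= new

# KB in CNF: (~B11 v P12 v P21) ^ (~P12 v B11) ^ (~P21 v B11) ^ ~B11,
# plus the negated conclusion ~alpha = P12:
cnf = [frozenset({"-B11", "P12", "P21"}),
       frozenset({"-P12", "B11"}),
       frozenset({"-P21", "B11"}),
       frozenset({"-B11"}),
       frozenset({"P12"})]
print(refutes(cnf))  # True: the empty clause is derived, so KB entails ~P12
```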

Resolution Method: Exercise

Use the resolution algorithm to solve the following problem: show that KB entails A.

Evaluation : Resolution

Resolution is sound

Because the resolution rule is true in all cases

Resolution is complete

Provided a complete search method is used to find the proof, if a proof exists it will be found.

Note: you must know what you're trying to prove in order to prove it!

Resolution is exponential

The number of clauses that must be searched grows exponentially.

Horn clause

Real-world knowledge bases often contain only clauses of a

restricted kind called Horn clauses.

A Horn clause is a disjunction of literals of which at most one is positive.

The positive literal is called the head and the negative literals form the body of the clause.

For example, the clause (¬L1,1 ∨ ¬Breeze ∨ B1,1) is a Horn clause, whereas (¬B1,1 ∨ P1,2 ∨ P2,1) is not.

Importance:

Horn clauses form the basis of forward and backward chaining.

Deciding entailment with Horn clauses is linear in the size of the knowledge base.

A Horn clause can be written as an implication whose premise is a conjunction of positive literals and whose conclusion is a single positive literal.

Fig: Example of Horn clauses

Note: the Prolog language is based on Horn clauses.

ANDOR graphs

In AND-OR graphs, multiple links joined by an arc indicate a conjunction: every link must be proved. Multiple links without an arc indicate a disjunction: any link can be proved.

Reasoning with Horn Clauses

Forward Chaining

For each new piece of data, generate all new facts, until

the desired fact is generated

Data-directed reasoning

Backward Chaining

To prove the goal, find a clause that contains the goal as

its head, and prove the body recursively

(Backtrack when you chose the wrong clause)

Goal-directed reasoning


Forward Chaining

Fire any rule whose premises are satisfied in the KB, and add its conclusion to the KB, until the query is found.
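Forward chaining for propositional Horn clauses can be sketched as below. The rule set is a standard illustrative example (facts A and B known, query Q), not taken from these slides' figures:

```python
from collections import deque

def forward_chain(rules, facts, query):
    """Fire rules whose premises are all known until the query is derived.

    rules: list of (premises, conclusion) pairs; facts: iterable of atoms.
    """
    known = set(facts)
    agenda = deque(facts)
    # Count of premises not yet satisfied for each rule.
    unsatisfied = {i: len(prem) for i, (prem, _) in enumerate(rules)}
    while agenda:
        p = agenda.popleft()
        if p == query:
            return True
        for i, (prem, concl) in enumerate(rules):
            if p in prem:
                unsatisfied[i] -= 1
                if unsatisfied[i] == 0 and concl not in known:
                    known.add(concl)      # fire the rule: add its conclusion
                    agenda.append(concl)
    return query in known

rules = [({"P"}, "Q"), ({"L", "M"}, "P"), ({"B", "L"}, "M"),
         ({"A", "P"}, "L"), ({"A", "B"}, "L")]
print(forward_chain(rules, ["A", "B"], "Q"))  # True
```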


Backward Chaining

Idea: work backwards from the query q:

To prove q by BC,

Check if q is known already, or

Prove by BC all premises of some rule concluding q

Avoid loops: check whether the new subgoal is already on the goal stack.

Avoid repeated work: check whether the new subgoal has already been proved true, or has already failed.
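Backward chaining over propositional Horn clauses, with a goal stack implementing the loop check described above (rules as (premises, conclusion) pairs; the example rules are illustrative, not from the slides' figures):

```python
def backward_chain(rules, facts, goal, stack=frozenset()):
    """Prove goal by finding a rule whose head is goal and proving its body."""
    if goal in facts:
        return True
    if goal in stack:            # subgoal already on the goal stack: avoid looping
        return False
    for premises, conclusion in rules:
        if conclusion == goal:
            if all(backward_chain(rules, facts, p, stack | {goal})
                   for p in premises):
                return True      # backtracks to the next rule on failure
    return False

rules = [({"P"}, "Q"), ({"L", "M"}, "P"), ({"B", "L"}, "M"),
         ({"A", "P"}, "L"), ({"A", "B"}, "L")]
print(backward_chain(rules, {"A", "B"}, "Q"))  # True
```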


Translation Guide


Pros and Cons of PL

Propositional logic is declarative

Propositional logic allows partial/disjunctive/negated

information

(unlike most data structures and databases)

Propositional logic is compositional: the meaning of B1,1 ∧ P1,2 is derived from the meaning of B1,1 and of P1,2.

Meaning in propositional logic is context-independent

(unlike natural language, where meaning depends on context)

Propositional logic has very limited expressive power

(unlike natural language)

E.g., cannot say "pits cause breezes in adjacent squares" except by writing one sentence for each square.

Logics in General

Ontological Commitment: what exists in the world (TRUTH).

PL: facts that hold or do not hold.
FOL: objects, with relations between them that hold or do not hold.

Epistemological Commitment: what an agent believes about facts (BELIEF).

First-order logic

Whereas propositional logic assumes the world contains facts,

first-order logic (like natural language) assumes the world

contains

Objects, which are things with individual identities

Properties of objects that distinguish them from other objects

Relations that hold among sets of objects

Functions, which are a subset of relations where there is only one value for

any given input

Examples:

Objects: Students, lectures, companies, cars ...

Relations: Brother-of, bigger-than, outside, part-of, has-color, occurs-after,

owns, visits, precedes, ...

Properties: blue, oval, even, large, ...

Functions: father-of, best-friend, second-half, one-more-than ...


Models for FOL: Graphical Example


Syntax of FOL: Basic elements

Constant Symbols: which represent individuals in the world

Stand for objects

e.g., KingJohn, 2, UCI,...

Predicate Symbols : which map individuals to truth values

Stand for relations

E.g., Brother(Richard, John), greater_than(3,2)...

Function Symbols : which map individuals to individuals

Stand for functions

E.g., Sqrt(4), LeftLegOf(John),...

Variables: x, y, a, b, ...

Connectives: ¬, ∧, ∨, ⇒, ⇔

Equality: =

Quantifiers: ∀, ∃

Syntax of FOL: BNF

FOPL: Sentences

A term (denoting a real-world individual) is a constant symbol, a variable symbol, or an n-place function applied to n terms: x and f(x1, ..., xn) are terms, where each xi is a term. A term with no variables is a ground term.

An atomic sentence is a predicate symbol applied to n terms.

Complex sentences are built from simpler sentences with logical connectives: ¬P, P∧Q, P∨Q, P⇒Q, P⇔Q, where P and Q are sentences.

A sentence is closed if all variables are bound by universal or existential quantifiers. (∀x)P(x,y) has x bound as a universally quantified variable, but y is free.

Atomic sentence

Atomic sentences state facts using terms and predicate symbols

P(x,y) interpreted as x is P of y

Examples:

LargerThan(2,3) is false.

Brother_of(Mary,Pete) is false.

Married(Father(Richard), Mother(John)) could be true or false

Brother(Pete) refers to John (Pete's brother) and is neither true nor false.

Brother_of(Pete, Brother(Pete)) is True.

Complex Sentence

(Figure: an example sentence annotated with its objects, binary relation, binary function, property, and connectives.)

Universal Quantification

∀ <variables> <sentence>

Allows us to make statements about all objects that have certain properties.

(∀x)P(x) means that P holds for all values of x in the domain associated with that variable.

E.g.:
∀x dolphin(x) ⇒ mammal(x)
∀x King(x) ⇒ Person(x)
∀x Person(x) ⇒ HasHead(x)
∀i Integer(i) ⇒ Integer(plus(i,1))

(∀x) student(x) ⇒ smart(x) means "All students are smart."

Universal quantification is rarely used to make blanket statements about every individual in the world:

(∀x) student(x) ∧ smart(x) means "Everyone in the world is a student and is smart."

∀x King(x) ∧ Person(x) is not correct!

Existential Quantification

∃ <variables> <sentence>

(∃x)P(x) means that P holds for some value of x in the domain associated with that variable.

Permits one to make a statement about some object without naming it.

E.g.:
(∃x) mammal(x) ∧ lays-eggs(x)
∃x King(x)
∃x Lives_in(John, Castle(x))
∃i Integer(i) ∧ GreaterThan(i,0)

a) Existential quantifiers are usually used with ∧ to specify a list of properties about an individual:

(∃x) student(x) ∧ smart(x) means "There is a student who is smart."

b) A common mistake is to represent this English sentence as the FOL sentence:

(∃x) student(x) ⇒ smart(x)

But what happens when there is a person who is not a student? The implication is then vacuously true for that person, so the sentence is satisfied even if no student is smart.

FOPL: Example

Let's consider the following objects: Richard the Lionheart, King of England from 1189 to 1199; his younger brother, the evil King John, who ruled from 1199 to 1215; the left legs of Richard and John; and a crown.

The domain of the model is the set of all these objects (objects are also called domain elements).

Symbols: Symbols are the syntactic elements of FOPL.

Constant symbols : Stands for object. Eg. Richard and John

Predicate symbols: Stands for relation. Eg. Brother, onHead, Person, King, Crown.

Function Symbol : Stands for functions. Eg LeftLeg.

Atomic sentence and Complex sentence (provide Example)

Quantified Sentence (provide Example)

Interpretation: specify exactly which object, relation, and function are referred to

by respective symbols.

"Richard" refers to Richard the Lionheart and "John" refers to the evil King John.

"Brother" refers to the brotherhood relation.

"LeftLeg" refers to the left-leg function.

Quantifier Scope

More complex sentences can be expressed with nested quantifiers.

Like nested variable scopes in a programming language

Like nested ANDs and ORs in a logical sentence

Switching the order of universal quantifiers does not change the meaning:
(∀x)(∀y)P(x,y) ⇔ (∀y)(∀x)P(x,y)

Similarly, you can switch the order of existential quantifiers:
(∃x)(∃y)P(x,y) ⇔ (∃y)(∃x)P(x,y)

Switching the order of universals and existentials does change the meaning:

Everyone loves someone: (∀x)(∃y) loves(x,y)
For everyone (all x) there is someone (exists y) whom they love.
There might be a different y for each x (y is inside the scope of x).

Someone is loved by everyone: (∃y)(∀x) loves(x,y)
There is someone (exists y) whom everyone loves (all x).
Every x loves the same y (x is inside the scope of y).

This is clearer with parentheses: ∃y ( ∀x Loves(x,y) )

Connection of Quantifier

The two quantifiers are actually intimately connected with each other, through negation.

Asserting that all x have property P is the same as asserting that there does not exist any x that doesn't have property P.

E.g., when one says that everyone dislikes carrots, one is also saying that there does not exist someone who likes them, and vice versa:

∀x ¬Likes(x, Carrot) is equivalent to ¬∃x Likes(x, Carrot)

Note:
∀ is a conjunction over the universe of objects.
∃ is a disjunction over the universe of objects.
Thus, De Morgan's rules can be applied:

∀x ¬P ≡ ¬∃x P        ¬P ∧ ¬Q ≡ ¬(P ∨ Q)
¬∀x P ≡ ∃x ¬P        ¬(P ∧ Q) ≡ ¬P ∨ ¬Q
∀x P ≡ ¬∃x ¬P        P ∧ Q ≡ ¬(¬P ∨ ¬Q)
∃x P ≡ ¬∀x ¬P        P ∨ Q ≡ ¬(¬P ∧ ¬Q)

Translating English to FOL

Every gardener likes the sun.
∀x gardener(x) ⇒ likes(x, Sun)

You can fool some of the people all of the time.
∃x ∀t (person(x) ∧ time(t)) ⇒ can-fool(x,t)

You can fool all of the people some of the time.
∀x ∃t (person(x) ⇒ time(t) ∧ can-fool(x,t))
Equivalent: ∀x (person(x) ⇒ ∃t (time(t) ∧ can-fool(x,t)))

All purple mushrooms are poisonous.
∀x (mushroom(x) ∧ purple(x)) ⇒ poisonous(x)

No purple mushroom is poisonous.
¬∃x purple(x) ∧ mushroom(x) ∧ poisonous(x)
Equivalent: ∀x (mushroom(x) ∧ purple(x)) ⇒ ¬poisonous(x)

There are exactly two purple mushrooms.
∃x ∃y mushroom(x) ∧ purple(x) ∧ mushroom(y) ∧ purple(y) ∧ ¬(x=y) ∧ ∀z (mushroom(z) ∧ purple(z)) ⇒ ((x=z) ∨ (y=z))

Clinton is not tall.
¬tall(Clinton)

X is above Y iff X is directly on top of Y or there is a pile of one or more other objects directly on top of one another starting with X and ending with Y.
∀x ∀y above(x,y) ⇔ (on(x,y) ∨ ∃z (on(x,z) ∧ above(z,y)))

Inference in FOL

The inference rules for propositional logic (Modus Ponens, And-Elimination, And-Introduction, Or-Introduction, and Resolution) hold for first-order logic. Some additional inference rules are required to handle first-order sentences with quantifiers:

Universal Elimination / Universal Instantiation (UI)
Existential Elimination / Existential Instantiation (EI)
Existential Introduction / Existential Generalization

These rules are more complex, as variables have to be substituted by particular individuals.

SUBST(θ, α) denotes the result of applying the substitution (or binding list) θ to the sentence α.

SUBST({v/g}, α) means the result of substituting the ground term g for the variable v in sentence α; θ = {v/g}.

(θ is also called a unifier later, the result of unification.)

E.g. SUBST({x/Sam, y/Pam}, Likes(x,y)) = Likes(Sam, Pam)

Universal instantiation

Universal Elimination / Universal Instantiation (UI)

If (∀x) P(x) is true, then P(C) is true, where C is any constant in the domain of x.

For any sentence α, variable v, and ground term g: from ∀v α, infer SUBST({v/g}, α).

The variable symbol can be replaced by any ground term, i.e., any constant symbol or function symbol applied to ground terms only.

For example, from ∀x Likes(x, IceCream), we can use the substitution {x/Ben} and infer Likes(Ben, IceCream).

More examples: ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields:
King(John) ∧ Greedy(John) ⇒ Evil(John), with {x/John}
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard), with {x/Richard}
King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John)), with {x/Father(John)}

Note:
Universal instantiation can be applied several times to add new sentences; the new KB is logically equivalent to the old.

Existential instantiation

Existential Elimination / Existential Instantiation (EI)

From (∃x) P(x), infer P(C1), where C1 is a new constant symbol, called a Skolem constant.

For any sentence α, variable v, and constant symbol k that does not appear elsewhere in the knowledge base: from ∃v α, infer SUBST({v/k}, α).

Note that the variable is replaced by a brand new constant that does not occur in

this or any other sentence in the Knowledge Base. In other words, we don't want to

accidentally draw other inferences about it by introducing the constant. All we

know is there must be some constant that makes this true, so we can introduce a

brand new one to stand in for that (unknown) constant.

Note:

Existential instantiation can be applied once to replace the existential sentence: the new KB

is not equivalent to the old, but is satisfiable iff the old KB was satisfiable.


Existential generalization

Existential Introduction/ Existential generalization

If P(c) is true, then (x) P(x) is inferred

For any sentence , variable v that does not occur in , and ground term g that

does occur in :

v SUBST({g/v}, )

Example

eats(Ziggy, IceCream) (x) eats(Ziggy, x)

All instances of the given constant symbol are replaced by the new variable

symbol

Note that the variable symbol cannot already exist anywhere in the expression

Hridaya Kandel 98

Reduction to propositional form

FOL → Propositional Logic (PL) (some call this reduction to propositional form, or propositionalization).

Existential and universal instantiation allow us to propositionalize any FOL sentence or KB:

EI produces one instantiation per existentially quantified (EQ) sentence.

UI produces a whole set of instantiated sentences per universally quantified (UQ) sentence.

Example:

Suppose the KB contains the following: Instantiating the universal sentence in all

x King(x) Greedy(x) Evil(x) possible ways, we have:

Father(x) King(John) Greedy(John) Evil(John)

King(John) King(Richard) Greedy(Richard) Evil(Richard)

Greedy(John) King(John)

Brother(Richard,John) Greedy(John)

Brother(Richard,John)

King(John), Greedy(John), Evil(John), King(Richard), etc


Reduction to propositional form

Problems with propositionalization: it works if α is entailed, but loops if α is not entailed. With function symbols, there are infinitely many ground terms.

Example KB:

    ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
    Father(x) (we assume this is a function)
    King(John)
    Greedy(Richard)
    Brother(Richard,John)

First depth of ground terms:

    Father(John)
    Father(Richard)

Instantiations:

    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
    King(Father(Richard)) ∧ Greedy(Father(Richard)) ⇒ Evil(Father(Richard))

This continues with Father(Father(John)), and so on.

Propositionalization generates lots of irrelevant sentences, so inference may be very inefficient. In the previous example it seems obvious that Evil(John) is entailed, but propositionalization produces lots of facts such as Greedy(Richard) that are irrelevant. With p k-ary predicates and n constants, there are p·n^k instantiations.

Unification and Lifting

A solution to the problems of propositionalization is to do inference directly with FOL sentences.

A key component of all first-order inference algorithms is unification. Unification is a "pattern matching" procedure that takes two atomic sentences, called literals, as input, and returns "failure" if they do not match, or a substitution list θ (theta) if they do match.

The Unify algorithm takes two sentences p and q and returns a unifier if one exists:

    Unify(p, q) = θ  where  Subst(θ, p) = Subst(θ, q)

Example:

    p = Knows(John, x)
    q = Knows(John, Jane)
    Unify(p, q) = θ = {x/Jane}

Most of the propositional inference rules can be lifted to FOL inference rules with the help of unification ("lifted" meaning transformed from propositional to first-order form):

Generalized Modus Ponens = lifted Modus Ponens

Backward chaining, forward chaining, and resolution also have lifted forms, which we will see later.

Unification examples

Simple example: query = Knows(John, x), i.e., who does John know?

Given the KB (i.e., all the sentences q):

    p                  q                       θ
    Knows(John,x)      Knows(y,OJ)             {x/OJ, y/John}
    Knows(John,x)      Knows(y,Mother(y))      {y/John, x/Mother(John)}
    Knows(John,x)      Knows(x,Elizabeth)      fail

The last unification fails only because x cannot take the values John and Elizabeth at the same time. The problem is due to the use of the same variable x in both sentences.

The solution: rename x to y (or any other variable) in Knows(x, Elizabeth). Knows(x, Elizabeth) is changed to Knows(y, Elizabeth), which still means the same thing.

This is called standardizing apart: it eliminates the overlap of variables.

Unification Complication

A problem may arise in unification when a variable would have to take two values, as seen in the example before. Standardizing apart eliminates the overlap of variables. Process: rename variables so that variables bound by different quantifiers have unique names in each sentence in the KB.

For example, if the KB contains:

    ∀x Apple(x) ⇒ Fruit(x)
    ∀x Spider(x) ⇒ Arachnid(x)

then after standardizing apart the KB becomes:

    ∀x Apple(x) ⇒ Fruit(x)
    ∀y Spider(y) ⇒ Arachnid(y)

There is one more complication with unification: we said that UNIFY should return a substitution that makes the two arguments look the same, but there could be more than one such unifier.

Example: to unify Knows(John,x) and Knows(y,z), UNIFY could return two possible unifications:

    {y/John, x/z}, which gives Knows(John, z), or
    {y/John, x/John, z/John}, which gives Knows(John, John).

So the unification algorithm is required to return the (unique) most general unifier (MGU); for the example above, MGU = {y/John, x/z}. There is a single most general unifier that is unique up to renaming of variables.

Generalized Modus Ponens (GMP)

This is a general (lifted) inference rule for FOL that does not require instantiation.

For atomic sentences pi, pi′, and q, where there is a substitution θ such that Subst(θ, pi′) = Subst(θ, pi) for all i:

    p1′, p2′, …, pn′,  (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
    ──────────────────────────────────────────
                  Subst(θ, q)

Example (for the KB described before):

    King(John), Greedy(John), ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
    ───────────────────────────────────────────────────────────
                          Evil(John)

    p1′ is King(John)      p1 is King(x)
    p2′ is Greedy(John)    p2 is Greedy(x)
    θ is {x/John}          q is Evil(x)
    Subst(θ, q) is Evil(John)

GMP is sound: it only derives sentences that are logically entailed (proof in the book). GMP is complete (derives all entailed sentences) for a first-order KB in Horn clause format.

Inference in FOL

Forward chaining
  Uses GMP to add new atomic sentences
  Useful for systems that make inferences as information streams in
  Requires the KB to be in the form of first-order definite clauses

Backward chaining
  Works backwards from a query to try to construct a proof
  Can suffer from repeated states and incompleteness
  Useful for query-driven inference

Resolution-based inference (FOL)
  Refutation-complete for general KBs
  Can be used to confirm or refute a sentence p (but not to generate all entailed sentences)
  Requires the FOL KB to be reduced to CNF
  Uses a generalized version of the propositional Resolution inference rule

Note that all of these methods are generalizations of their propositional equivalents (the rules are lifted with unification).

Inference in FOL: Example

The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American.

… it is a crime for an American to sell weapons to hostile nations:
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
Nono has some missiles (∃x Owns(Nono,x) ∧ Missile(x), after EI):
    Owns(Nono,M1) and Missile(M1)
All of Nono's missiles were sold to it by Colonel West:
    Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West,x,Nono)
Missiles are weapons:
    Missile(x) ⇒ Weapon(x)
An enemy of America counts as hostile:
    Enemy(x,America) ⇒ Hostile(x)
West is American:
    American(West)
Nono is an enemy of America:
    Enemy(Nono,America)
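The forward-chaining idea on this KB can be sketched in a few lines of Python (an illustrative toy, not the full FOL-FC-ASK algorithm): repeatedly match rule premises against known ground facts and add the instantiated conclusions until nothing new appears.

```python
# Facts and definite-clause rules from the crime example.
# Variables are lowercase strings; facts are ground tuples.
facts = {('Owns', 'Nono', 'M1'), ('Missile', 'M1'),
         ('American', 'West'), ('Enemy', 'Nono', 'America')}
rules = [  # (premises, conclusion)
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'), ('Hostile', 'z')],
     ('Criminal', 'x')),
    ([('Missile', 'x'), ('Owns', 'Nono', 'x')], ('Sells', 'West', 'x', 'Nono')),
    ([('Missile', 'x')], ('Weapon', 'x')),
    ([('Enemy', 'x', 'America')], ('Hostile', 'x')),
]

def is_var(t):
    return isinstance(t, str) and t[0].islower()

def match(pattern, fact, theta):
    """Extend theta so that pattern matches the ground fact, or return None."""
    if len(pattern) != len(fact):
        return None
    for p, f in zip(pattern, fact):
        if is_var(p):
            if theta.get(p, f) != f:     # conflicting binding
                return None
            theta = {**theta, p: f}
        elif p != f:
            return None
    return theta

def forward_chain(facts, rules):
    """Add instantiated conclusions until a fixed point is reached."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            thetas = [{}]
            for prem in premises:        # satisfy premises left to right
                thetas = [t2 for t in thetas for fact in facts
                          if (t2 := match(prem, fact, t)) is not None]
            for theta in thetas:
                new = tuple(theta.get(a, a) for a in conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

derived = forward_chain(set(facts), rules)
print(('Criminal', 'West') in derived)   # True
```

The first pass adds Sells(West,M1,Nono), Weapon(M1), and Hostile(Nono); the second pass then fires the crime rule with θ = {x/West, y/M1, z/Nono} and derives Criminal(West).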

Forward Chaining

(the step-by-step forward-chaining proof tree for this example was shown as a sequence of figures, omitted here)

Backward Chaining

(the step-by-step backward-chaining proof tree for this example was shown as a sequence of figures, omitted here)

Resolution in FOL

Full first-order version:

    l1 ∨ ⋯ ∨ lk,    m1 ∨ ⋯ ∨ mn
    ─────────────────────────────────────────────────────────────────────
    Subst(θ, l1 ∨ ⋯ ∨ li−1 ∨ li+1 ∨ ⋯ ∨ lk ∨ m1 ∨ ⋯ ∨ mj−1 ∨ mj+1 ∨ ⋯ ∨ mn)

where Unify(li, ¬mj) = θ, i.e., (li, mj) are complementary literals. The two clauses are assumed to be standardized apart so that they share no variables.

For example:

    ¬Rich(x) ∨ Unhappy(x),    Rich(Ken)
    ────────────────────────────────────
              Unhappy(Ken)

with θ = {x/Ken}.

Applied to clauses in CNF, resolution is complete for FOL.

Resolution refutation

The general technique is to add the negation of the sentence to be proven to the KB and see if this leads to a contradiction. Idea: if the KB becomes inconsistent with the addition of the negated sentence, then the original sentence must be true. This is called resolution refutation (also called RRS). The procedure is complete for FOL.

Algorithm:

Converting FOL sentences to CNF

1. Eliminate biconditionals and implications.
2. Reduce the scope of ¬: move ¬ inwards.
3. Standardize variables apart: each quantifier should use a different variable name.
4. Skolemize — a more general form of existential instantiation: each existential variable is replaced by a Skolem function of the enclosing universally quantified variables.
5. Drop all universal quantifiers (it's all right to do so now).
6. Distribute ∨ over ∧.
7. Make each conjunct a separate clause.
8. Standardize the variables apart again if required.

More on Skolemization:

If an existentially quantified variable is in the scope of universally quantified variables, then the existentially quantified variable must be a function of those other variables. We introduce a new, unique function called a Skolem function.

Converting FOL sentences to CNF: Example

Original sentence — anyone who likes all animals is loved by someone:

    ∀x [∀y Animal(y) ⇒ Likes(x,y)] ⇒ [∃y Loves(y,x)]

1. Eliminate implications:

    ∀x [¬∀y ¬Animal(y) ∨ Likes(x,y)] ∨ [∃y Loves(y,x)]

2. Move ¬ inwards (recall: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p):

    ∀x [∃y ¬(¬Animal(y) ∨ Likes(x,y))] ∨ [∃y Loves(y,x)]
    ∀x [∃y ¬¬Animal(y) ∧ ¬Likes(x,y)] ∨ [∃y Loves(y,x)]
    ∀x [∃y Animal(y) ∧ ¬Likes(x,y)] ∨ [∃y Loves(y,x)]

Either there is some animal that x doesn't like, or, if that is not the case, someone loves x.

Converting FOL sentences to CNF

3. Standardize variables — each quantifier should use a different one:

    ∀x [∃y Animal(y) ∧ ¬Likes(x,y)] ∨ [∃z Loves(z,x)]

4. Skolemize — a more general form of existential instantiation. Simply replacing the existential variables with constants would give

    ∀x [Animal(A) ∧ ¬Likes(x,A)] ∨ Loves(B,x)

which says that everybody fails to love one particular animal A or is loved by one particular person B — not what the original sentence means. Compare facts such as Animal(Cat), Likes(Marry, Cat), Loves(John, Marry), Likes(Cathy, Cat), Loves(Tom, Cathy): different people may dislike different animals and be loved by different people. Instead, each existential variable is replaced by a Skolem function of the enclosing universally quantified variables:

    ∀x [Animal(F(x)) ∧ ¬Likes(x,F(x))] ∨ Loves(G(x),x)

Converting FOL sentences to CNF

5. Drop universal quantifiers:

    [Animal(F(x)) ∧ ¬Likes(x,F(x))] ∨ Loves(G(x),x)

(all remaining variables are assumed to be universally quantified)

6. Distribute ∨ over ∧:

    [Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Likes(x,F(x)) ∨ Loves(G(x),x)]

The original sentence is now in CNF form. We can apply the same steps to all sentences in the KB to convert them into CNF, then derive the empty clause to show that the query is entailed by the KB.
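The distribution step (6) is mechanical and can be sketched as a small recursive rewrite. This toy version (an illustration, not a full CNF converter) treats literals as opaque strings and formulas as ('or', a, b) / ('and', a, b) nodes:

```python
def distribute(f):
    """Push ∨ below ∧: (p∧q)∨r → (p∨r)∧(q∨r), and symmetrically."""
    if isinstance(f, str):          # literal: nothing to do
        return f
    op, a, b = f
    a, b = distribute(a), distribute(b)
    if op == 'or':
        if isinstance(a, tuple) and a[0] == 'and':   # (p∧q)∨r
            return ('and', distribute(('or', a[1], b)),
                           distribute(('or', a[2], b)))
        if isinstance(b, tuple) and b[0] == 'and':   # p∨(q∧r)
            return ('and', distribute(('or', a, b[1])),
                           distribute(('or', a, b[2])))
    return (op, a, b)

# The formula from step 5: [Animal(F(x)) ∧ ¬Likes(x,F(x))] ∨ Loves(G(x),x)
f = ('or', ('and', 'Animal(F(x))', '~Likes(x,F(x))'), 'Loves(G(x),x)')
print(distribute(f))
# ('and', ('or', 'Animal(F(x))', 'Loves(G(x),x)'),
#         ('or', '~Likes(x,F(x))', 'Loves(G(x),x)'))
```

The two conjuncts of the result are exactly the two CNF clauses obtained in step 6.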

FOL Resolution

FOL Resolution: Example 2

KB:
a) Everyone who loves all animals is loved by someone.
b) Anyone who kills animals is loved by no one.
c) Jack loves all animals.
d) Either Curiosity or Jack killed the cat, who is named Tuna.

Query: Did Curiosity kill the cat?

Inference procedure:
1. Express the sentences in FOL.
2. Eliminate the existential quantifiers.
3. Convert to CNF form and add the negated query.

(the FOL sentences, their CNF clauses, and the resolution proof tree for this example were shown as figures, omitted here)

Reasoning

We have already covered reasoning in symbolic logic. Reasoning is the act of deriving a conclusion from certain premises using a given methodology (inference is the methodology). Reasoning is a process of thinking; reasoning is logically arguing; reasoning is drawing inferences.

When a system is required to do something that it has not been explicitly told how to do, it must reason: it must figure out what it needs to know from what it already knows.

Example:
If we know: Robins are birds. All birds have wings.
Then if we ask: Do robins have wings?
To answer this question, some reasoning must go on.

Reasoning vs. inference?

Uncertainty

The world is an uncertain place; often our knowledge is imperfect, which causes uncertainty. Reasoning must therefore be able to operate under uncertainty: AI systems must have the ability to reason under conditions of uncertainty. Agents almost never have access to the whole truth about their environment, so they must act under uncertainty.

For example, an agent wants to drive someone to the airport to catch a plane. He makes a plan, say A90: leaving home 90 minutes before the flight departs and driving at a reasonable speed. Even though the airport is only about 15 miles away, the agent cannot be certain that plan A90 will get him there on time.

Approaches to Reasoning

There are three different approaches to reasoning under uncertainty:
  Symbolic reasoning (facts)
  Statistical reasoning (degree of belief)
  Fuzzy logic reasoning (degree of truth)

Uncertainty

Symbolic versus statistical reasoning: in symbolic reasoning, every statement is taken to be True, False, or neither true nor false. Some symbolic methods also had problems with incomplete knowledge and with contradictions in the knowledge.

Statistical reasoning

In the logic-based (symbolic) approaches, we have assumed that everything is either believed false or believed true. However, it is often useful to represent the fact that we believe something is probably true, or true with probability (say) 0.55. This is useful for dealing with problems where there is randomness and unpredictability (such as in games of chance) and also for dealing with problems where we could, if we had sufficient information, work out exactly what is true. To do all this in a principled way requires techniques for probabilistic reasoning.

Handling Uncertainty

Let us take an example of medical diagnosis. Consider a first-order logic rule for dental diagnosis:

    ∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity)

The problem is that this rule is wrong. Not all patients with toothaches have cavities; some of them have gum disease, an abscess, or one of several other problems. To make the rule true, we would have to add an almost unlimited list of possible causes:

    ∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity) ∨ Disease(p, GumDisease) ∨ …

We could instead turn this into a causal rule, which is also not true:

    ∀p Disease(p, Cavity) ⇒ Symptom(p, Toothache)

Trying to use first-order logic to cope with a domain like medical diagnosis thus fails for three main reasons:

Laziness: it is too much work to list the complete set of antecedents or consequents needed to ensure an exceptionless rule, and too hard to use such rules.

Theoretical ignorance: medical science has no complete theory for the domain.

Practical ignorance: even if we know all the rules, we might be uncertain about a particular patient because not all the necessary tests have been or can be run.

Handling Uncertainty

In domains such as medicine and other judgmental domains — law, business, design, automobile repair, gardening, dating, and so on — the agent's knowledge can at best provide only a degree of belief (plausibility) in the relevant sentences. Probability theory is a tool for dealing with degrees of belief; it assigns a numerical degree of belief between 0 and 1 to each sentence.

Decision under uncertainty

Take again the example of an agent who wants to drive someone to the airport to catch a plane. Suppose the agent believes the following:

    P(A25 gets me there on time | …) = 0.04
    P(A90 gets me there on time | …) = 0.70
    P(A120 gets me there on time | …) = 0.95
    P(A1440 gets me there on time | …) = 0.9999

(remember: if you reach the airport early, you have to wait at the airport). In this case, which action should the agent choose? It depends on the agent's preferences for missing the flight versus time spent waiting at the airport, etc. Utility theory is used to represent and infer preferences, so:

    Decision theory = probability theory + utility theory

Probability Theory: syntax

We need a formal language for representing and reasoning with uncertain knowledge. Degrees of belief are always applied to propositions of a language (propositional logic and FOPL). The basic element of the language is the random variable, which can be thought of as referring to a "part" of the world whose "status" is initially unknown. For example, Cavity might refer to whether my lower left wisdom tooth has a cavity.

Random variables play a role similar to that of CSP variables in constraint satisfaction problems and that of proposition symbols in propositional logic. Each random variable has a domain of values that it can take on; e.g., Cavity might take values {true, false}. (Mostly, random variable names are capitalized and value names are lowercase.)

Random variables are typically divided into three kinds, depending on the type of the domain:

a) Boolean random variables, such as Cavity, have the domain {true, false}. We will often abbreviate a proposition such as Cavity = true simply by the lowercase name cavity; similarly, Cavity = false would be abbreviated by ¬cavity.

b) Discrete random variables, which include Boolean random variables as a special case, take on values from a countable domain. For example, the domain of Weather might be {sunny, rainy, cloudy, snow}. The values in the domain must be mutually exclusive and exhaustive. Where no confusion arises, we will use, for example, snow as an abbreviation for Weather = snow.

c) Continuous random variables take on values from the real numbers. The domain can be either the entire real line or some subset such as the interval [0,1].

Probability Theory: syntax

Elementary propositions, such as Cavity = true and Toothache = false, can be combined to form complex propositions using all the standard logical connectives. For example, (Cavity = true ∧ Toothache = false) is a proposition to which one may ascribe a degree of belief; it can also be written as (cavity ∧ ¬toothache).

Atomic event: a complete specification of the state of the world about which the agent is uncertain. E.g., if the world consists of only two Boolean random variables, Cavity and Toothache, then there are 4 distinct atomic events:

    Cavity = false ∧ Toothache = false    (¬cavity ∧ ¬toothache)
    Cavity = false ∧ Toothache = true     (¬cavity ∧ toothache)
    Cavity = true  ∧ Toothache = false    (cavity ∧ ¬toothache)
    Cavity = true  ∧ Toothache = true     (cavity ∧ toothache)

Atomic events are mutually exclusive and exhaustive.

Probability Theory: syntax

Unconditional or prior probability

The unconditional or prior probability associated with a proposition a is the degree of belief accorded to it in the absence of any other information; it is written as P(a). For example, if the prior probability that I have a cavity is 0.1, then we would write P(Cavity = true) = 0.1, or P(cavity) = 0.1.

If Weather is a discrete random variable with the following probability for each state:

a) P(Weather = sunny) = 0.7
b) P(Weather = rain) = 0.2
c) P(Weather = cloudy) = 0.08
d) P(Weather = snow) = 0.02

we simply write P(Weather) = ⟨0.7, 0.2, 0.08, 0.02⟩. This statement defines a prior probability distribution for the random variable Weather.

We will also use expressions such as P(Weather, Cavity) to denote the probabilities of all combinations of the values of a set of random variables. In that case, P(Weather, Cavity) can be represented by a 4 × 2 table of probabilities. This is called the joint probability distribution of Weather and Cavity.

Probability Theory: syntax

Conditional or posterior probability

Once the agent has obtained some evidence concerning the previously unknown random variables making up the domain, prior probabilities are no longer applicable. Instead, we use conditional or posterior probabilities. The notation used is P(a|b), where a and b are any propositions; this is read as "the probability of a, given that all we know is b."

For example, P(Cavity | Toothache) = 0.8 indicates that if a patient is observed to have a toothache and no other information is yet available, then the probability of the patient's having a cavity will be 0.8.

The conditional probability of the occurrence of A if event B occurs is:

    P(A|B) = P(A ∧ B) / P(B)

This can also be written as:

    P(A ∧ B) = P(A|B) · P(B)

which is called the product rule.

Axioms of probability: semantics

1. All probabilities are between 0 and 1: for any proposition A, 0 ≤ P(A) ≤ 1.
2. Necessarily true (i.e., valid) propositions have probability 1, and necessarily false (i.e., unsatisfiable) propositions have probability 0: P(true) = 1 and P(false) = 0.
3. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)

Inference: Joint Probability Distribution

Example: joint probability distribution for P(Cavity, Toothache):

                 Toothache    ¬Toothache
    Cavity         0.04          0.06
    ¬Cavity        0.01          0.89

P(cavity ∨ toothache) = 0.04 + 0.01 + 0.06 = 0.11
P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache) = 0.04 / 0.05 = 0.8

Note: P(A ∨ B) = P(A) + P(B) − P(A ∧ B).

Inference: Joint Probability Distribution

Example: a domain consisting of just the three Boolean variables Toothache, Cavity, and Catch (the dentist's nasty steel probe catches in my tooth). The full joint distribution is a 2 × 2 × 2 table:

                    Toothache              ¬Toothache
                 Catch    ¬Catch        Catch    ¬Catch
    Cavity       0.108     0.012        0.072     0.008
    ¬Cavity      0.016     0.064        0.144     0.576

P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
                      = [P(cavity ∧ toothache ∧ catch) + P(cavity ∧ toothache ∧ ¬catch)] / P(toothache)
                      = (0.108 + 0.012) / (0.108 + 0.012 + 0.016 + 0.064) = 0.12 / 0.2 = 0.6
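Inference by enumerating the full joint distribution can be sketched directly from the table above (a toy illustration, not a general inference engine):

```python
# Full joint distribution of (Cavity, Toothache, Catch) from the 2x2x2 table.
joint = {  # (cavity, toothache, catch) -> probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def p(event):
    """Sum the probabilities of the atomic events consistent with `event`,
    a dict mapping variable index (0=Cavity, 1=Toothache, 2=Catch) to a value."""
    return sum(pr for world, pr in joint.items()
               if all(world[i] == v for i, v in event.items()))

p_tooth = p({1: True})                                  # P(toothache) = 0.2
p_cavity_and_tooth = p({0: True, 1: True})              # 0.12
p_cavity_or_tooth = p({0: True}) + p_tooth - p_cavity_and_tooth   # 0.28
print(round(p_cavity_and_tooth / p_tooth, 2))           # P(cavity | toothache) = 0.6
```

Summing out the hidden variable Catch happens automatically here, because `p` adds up every atomic event that matches the query.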

Bayes' Theorem

The Bayesian view of probability is related to degree of belief: it is a measure of the plausibility of an event given incomplete knowledge. Bayes' theorem is also known as Bayes' rule or Bayes' law, and its use is called Bayesian reasoning.

The probability of an event A conditional on another event B, P(A|B), is generally different from the probability of B conditional on A, P(B|A). There is a relationship between the two, and Bayes' theorem is the statement of that relationship: it is a way to calculate P(B|A) from knowledge of P(A|B). Using the product rule P(A ∧ B) = P(A|B) · P(B) = P(B|A) · P(A) (as discussed in class), we get:

    P(B|A) = P(A|B) · P(B) / P(A)

Bayes' Theorem: Useful

Bayes' rule is useful in practice because there are many cases where we have good probability estimates for three of the four probabilities involved, and therefore can compute the fourth one.

It is often useful for diagnosis. If X are (observed) effects and Y are (hidden) causes:
  we may have a model for how causes lead to effects, P(X | Y);
  we may also have prior beliefs (based on experience) about the frequency of occurrence of the causes, P(Y);
  which allows us to reason abductively from effects to causes: P(Y | X).

Bayes' rule example

Suppose we know that:
  a stiff neck is a symptom in 50% of meningitis cases,
  meningitis (m) occurs in 1/50,000 patients,
  a stiff neck (s) occurs in 1/20 patients.

Then, given P(s|m) = 0.5, P(m) = 1/50000, P(s) = 1/20:

    P(m|s) = P(s|m) · P(m) / P(s)
           = (0.5 × 1/50000) / (1/20) = 0.0002

So we expect one in 5000 patients with a stiff neck to have meningitis.
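The meningitis calculation is a one-liner once Bayes' rule is written as a function (a trivial sketch, included to make the arithmetic checkable):

```python
def bayes(p_e_given_h, p_h, p_e):
    """P(H|E) = P(E|H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

# P(m|s) from P(s|m)=0.5, P(m)=1/50000, P(s)=1/20
p_m_given_s = bayes(0.5, 1 / 50000, 1 / 20)
print(round(p_m_given_s, 6))   # 0.0002, i.e. 1 in 5000
```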

Bayes' Theorem: Useful

In doing an expert task, such as medical diagnosis, the goal is to determine identifications (diseases) given observations (symptoms). Bayes' theorem provides such a relationship. It has gained importance recently due to advances in efficiency: more computational power is available, and better methods exist.

Advantages:
  sound theoretical foundation
  well-defined semantics for decision making

Problems:
  requires large amounts of probability data and sufficient sample sizes
  subjective evidence may not be reliable
  the assumption that pieces of evidence are independent is often not valid
  the relationship between hypothesis and evidence is reduced to a number
  explanations for the user are difficult
  high computational overhead

Issues with Probabilities

Often we don't have the data:
  we just don't have enough observations;
  the data can't readily be reduced to numbers or frequencies.

Human estimates of probabilities are notoriously inaccurate; in particular, they often add up to more than 1.

Probability doesn't always match human reasoning well. Since P(¬x) = 1 − P(x), having a stiff neck is strong (0.9998!) evidence that you don't have meningitis — true, but counterintuitive.

Bayesian networks

Also called:
  Bayesian belief networks (BBNs) or simply belief networks,
  probabilistic networks,
  causal probabilistic networks (CPNs) or simply causal networks,
  knowledge maps,
  and, in statistics, graphical models.

They give a short specification of a conditional probability distribution: many random variables are conditionally independent, which simplifies computations. The graphical representation is a Directed Acyclic Graph (DAG) of causal relationships among random variables, which allows inferences based on the network structure.

Bayesian networks

The full specification of a BN:

1. A set of random variables makes up the nodes of the network. Variables may be discrete or continuous.
2. A set of directed links or arrows connects pairs of nodes. If there is an arrow from node X to node Y, X is said to be a parent of Y.
3. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node.
4. The graph has no directed cycles (and hence is a directed acyclic graph, or DAG).

Simple example: a Bayesian network in which Weather is independent of the other three variables, and Toothache and Catch are conditionally independent given Cavity (figure omitted).

Bayesian networks: Example

Consider the following domain. You have a new burglar alarm installed at home. It is fairly reliable at detecting a burglary, but also responds on occasion to minor earthquakes. You also have two neighbors, John and Mary, who have promised to call you at work when they hear the alarm. John nearly always calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm and calls then, too. Mary, on the other hand, likes rather loud music and often misses the alarm altogether. Given the evidence of who has or has not called, we would like to estimate the probability of a burglary.

A Bayesian network for this domain is shown on the next slide. The network structure shows that burglary and earthquakes directly affect the probability of the alarm going off, but whether John and Mary call depends only on the alarm. The network thus represents our assumptions that they do not perceive burglaries directly, they do not notice minor earthquakes, and they do not confer before calling. The conditional distributions are shown as conditional probability tables, or CPTs.

Bayesian networks: Example

A typical Bayesian network, showing both the topology and the conditional probability tables (CPTs). In the CPTs, the letters B, E, A, J, and M stand for Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls, respectively (figure omitted).

Example: what is the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both John and Mary call?

    P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) · P(m|a) · P(a|¬b ∧ ¬e) · P(¬b) · P(¬e)
                            = 0.9 × 0.7 × 0.001 × 0.999 × 0.998 ≈ 0.00062

Thank You!

