UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 1 - Introduction to Artificial Intelligence

Definition
There is no single agreed definition of Artificial Intelligence (AI). Informally, it is the boundary between what people can do and what computers cannot yet do. Definitions of AI fall along two main dimensions:
- Human approach: covers all systems that act or think like humans
- Rationality approach: covers all systems that act or think rationally
See Chapter 1 of the text for more detail.

Foundations of Artificial Intelligence
The field of Artificial Intelligence was formally founded in 1956; however, it has existed in some form since Aristotle, who used logic as an instrument for studying thought. Since then the field has developed by incorporating ideas and views from other fields:
1. Philosophy (428 BC - present) has given it theories of reasoning and learning and the operations of the mind
2. Mathematics (c. 800 - present) gave the formal theories of logic, probability, decision making, and computation
3. Economics (1776 - present) gave the ideas of preferred outcomes and decision making based on satisfaction
4. Neuroscience (1861 - present) gave theories of the brain and its operations
5. Psychology (1879 - present) has given the tools with which to investigate the human mind and a scientific language within which to express the resulting theories
6. Computer Engineering (1940 - present) gave the tools with which to make AI a reality
7. Control Theory and Cybernetics (1948 - present) gave the ideas of how an artificial thing can operate under its own control
8. Linguistics (1957 - present) has given theories of the structure and meaning of language

Prepared by Felix Akinladejo & Janett Williams, August 2006

Areas of Artificial Intelligence
- Perception
  - Machine Vision
  - Speech Understanding
  - Touch (haptic or tactile) Sensation
- Robotics
- Natural Language Processing
  - Natural Language Understanding
  - Language Generation
  - Machine Translation
- Planning
  - Autonomous Planning & Scheduling
  - Logistics Planning
- Expert Systems
- Machine Learning
- Autonomous Control
- Theorem Proving
- Symbolic Mathematics
- Game Playing
- Diagnostics

Tutorial
1. Define in your own words:
   a. Intelligence
   b. Artificial Intelligence
   c. Agent
2. Are reflex actions (such as moving your hand away from a hot stove) rational? Are they intelligent?
3. Are there any things in everyday life that have been influenced by Artificial Intelligence? Justify.

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURES 2 & 3 - Intelligent Agents

Rational agents are central to the approach to Artificial Intelligence.

What is an Agent?
An agent is anything that perceives its environment through sensors and acts on that environment through actuators.

Examples
1. Human agent - sensors: eyes, ears, nose, etc.; actuators: hands, legs, mouth, etc.
2. Robots (robotic agents or hardware robots) - sensors: cameras, infrared range finders; actuators: various motors
3. Softbots (software agents or software robots) - sensors and actuators: encoded bit strings

Rational Agent
A rational agent is an agent that does the right thing (something that causes success). How does one know if there is success? By a performance measure: criteria, established by some authority outside the environment, that determine success. A performance measure must be taken over the long run.
Example
An agent may work hard for the first two hours of the day and do nothing for the rest of the day, while another agent does a consistent, small amount of work throughout the day. The second will be rewarded less if performance is measured only over the first two hours.

You cannot blame an agent for failing at what it could not perceive. Hence doing the right thing does not necessarily mean the agent must do the perfect thing.

Omniscient Agent
An omniscient agent knows the actual outcome of its actions and can act accordingly. This is impossible in reality.

What makes something rational?
Rationality depends on:
1. the performance measure that defines the degree of success
2. the percept sequence to date (everything perceived so far)
3. prior knowledge of the environment (e.g. the clock in Jamaica vs. London)
4. the actions that the agent can perform (a human agent cannot climb into the air)

Ideal Rational Agent
An ideal rational agent should do whatever action is expected to maximize its performance measure, on the basis of the evidence provided by each possible percept sequence and whatever built-in knowledge the agent possesses. Do not confuse rationality with omniscience.

Autonomous Agent
- An agent whose behavior is determined by its own experience (although this experience is usually a product of initially built-in knowledge).
- If an agent depends solely on its built-in knowledge, it lacks autonomy.

Types of Rational Agents
1. Problem Solving Agents
2. Search Agents
3. Learning Agents
4. Logical Agents

Building Agents - Task Environments
The first step is to understand the task environment (the problem to which the agent is the solution).
To do this, PEAS (Performance, Environment, Actuators, Sensors) is used for the design.

Example - Taxi Driver
Performance measures: safe, fast, legal, comfortable trip; maximizes profits
Environment: roads, other traffic, pedestrians, customers
Actuators: steer, accelerate, brake, talk to passenger, etc.
Sensors: cameras, speedometer, Global Positioning System, sonar, microphone or keyboard, etc.

An agent must be coupled with an environment. There are different kinds of agents and environments. Actions are done by agents on the environment, which in turn provides percepts to the agents. Different types of environments affect the design of agents.

Properties of environments
- Fully observable vs. partially observable
  The agent's sensors give it access to all aspects that are relevant to the choice of an action within the environment. A fully observable environment is convenient because the agent need not maintain any internal state to keep track of the world.
- Deterministic vs. stochastic
  The next state is completely determined by the current state and the actions selected by the agent. Note that a complex environment can be inaccessible and non-deterministic. If the environment is deterministic except for the actions of other agents, it is strategic.
- Episodic vs. sequential
  The agent's experience is divided into 'episodes'. The quality of an action depends on just the episode itself; subsequent episodes do not depend on what actions occurred in previous episodes. Hence an episodic environment is simple, as the agent does not need to think ahead.
- Static vs. dynamic
  If the environment can change while the agent is deliberating, it is dynamic; otherwise it is static. The passage of time is crucial. Semidynamic: the environment does not change with the passage of time, but the agent's performance score does.
- Discrete vs. continuous
  Discrete: a limited number of distinct, clearly defined percepts and actions, e.g. chess playing is discrete (a fixed number of possible moves on each turn) while taxi driving is continuous.
- Single agent vs. multiagent
  How many agents work in this environment?

Structure of Intelligent Agents
The job of Artificial Intelligence is to design the agent program, a function that implements the agent mapping from percepts to actions. This runs on the architecture (a computing device with physical sensors and actuators).

Agent = architecture + program

Issues to consider:
- Goals may conflict and thus require trade-offs.
- The more restricted the environment, the easier the design problem (Jamaica drives on the left, the USA drives on the right).
- The designer needs to build a program to implement the mapping from percepts to actions.

Types of agent programs

Simple Reflex Agents
Based on condition-action rules, e.g. if the car in front is braking, then initiate braking. The percept defines the current situation. A rule is sought whose condition matches the current situation, and the action associated with that rule is performed.

Model-Based Agents
The agent needs to keep some internal state in order to choose an action, as the current percept may not be enough to make the correct decision. This is because sensors do not provide access to the complete state of the world. (Some red lights from outside, not necessarily due to braking, should be distinguishable from the red braking lights of the car.) The internal state needs updating using:
- information on how the world evolves independently of the agent
- information on how the agent's actions affect the world

Goal-Based Agents
Along with the current state description, an agent must have a goal in order to decide what action to perform. The agent must choose at a junction where to go depending on its goals, even though it knows the current state of all the roads it could turn onto. A goal is a desirable situation, e.g. the passenger's destination.
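The condition-action cycle of a simple reflex agent can be sketched in Python. The rule set and percept keys below are invented for illustration and are not part of the notes:

```python
# Minimal sketch of a simple reflex agent (illustrative rules only).
# Each rule pairs a condition (a predicate on the percept) with an action.
RULES = [
    (lambda p: p.get("car_in_front_braking"), "initiate-braking"),
    (lambda p: p.get("traffic_light") == "red", "stop"),
]

def simple_reflex_agent(percept):
    """Return the action of the first rule whose condition matches the percept."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return "no-op"  # default when no rule matches
```

For example, `simple_reflex_agent({"car_in_front_braking": True})` returns `"initiate-braking"`. Note that the agent consults only the current percept; it keeps no internal state, which is exactly the limitation the model-based agent addresses.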
Achieving the goal may be simple (turn right and get there) or tricky (requiring sequences of twists and turns). Search and planning may come into play: they are the branches of AI that help find action sequences that achieve the agent's goal. Goal-based agents differ from reflex agents in that they consider the future: what happens if I take this action? Hence they are more flexible. A goal-based agent can adapt to new behavior that would have required substantial reprogramming in a condition-action agent.

Utility-Based Agents
Knowing the goal alone is not sufficient; knowing the best alternative for achieving the goal is more desirable. (Getting to Sovereign Center is a goal, but how do you get there so as to save your gas?) The goal must be useful. This quality of being useful is called utility. Utility maps a state to a real number that describes the degree of happiness. It allows rational decisions between two states where a goal alone would have trouble. When there are conflicting goals, a utility function can make rational decisions. Note that a goal enables an agent to pick an action right away if the action satisfies the goal. In a sense, it is possible to translate a utility function into a set of goals, and goal-based agents that work on these sets will behave as utility-based agents.

Tutorial
1. What is the difference between a performance measure and a utility function?
2. Define in your own words the following terms: agent, agent function, agent program, rationality, autonomy, reflex agent, model-based agent, goal-based agent, utility-based agent, learning agent.
3. Apply PEAS to any automaton of your choice.

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

Problem Solving Agents

What is a Problem?
A problem is a goal and a set of means of achieving the goal.
To build a system to solve a problem, the following are needed:
1. Define the problem precisely
2. Analyze the problem
3. Isolate and represent the task knowledge that is necessary to solve the problem
4. Choose the best problem-solving technique and apply it

Problem Characteristics
The following should be answered when a problem is being analyzed:
- Is the problem decomposable into a set of (nearly) independent smaller or easier subproblems?
- Can solution steps be ignored, or at least undone, if they prove unwise?
- Is the problem's universe predictable?
- Is a good solution to the problem obvious without comparison to all other possible solutions?
- Is the desired solution a state of the world or a path to a state?
- Is a large amount of knowledge absolutely required to solve the problem, or is knowledge important only to constrain the search?

Problem Solving
- Given precise definitions of problems, it is relatively easy to construct a search process for finding solutions.
- The first step in problem solving is goal formulation (the crucial goal must be identified among other desirable alternatives).
- Problem formulation: deciding what actions and states to consider. It follows goal formulation.
- Search: an agent with several immediate options of unknown value can decide what to do by first examining the different possible sequences of actions that lead to states of known value, and then choosing the best one. Search is the process of looking for such a sequence of actions.
- Solution: an action sequence that accomplishes the goal.
- Execution: carrying out the solution (the action sequence identified as best for achieving the goal).
Prepared by Felix Akinladejo & Janett Williams, August 2003

In agent design, we usually: formulate, search, execute.

Problem Formulation
There are four essentially different types of problems:
- Single-state problems
- Multiple-state problems
- Contingency problems
- Exploration problems

Structure of problems and solutions - single-state problem
- Initial state: the state the agent knows itself to be in.
- A set of possible actions available to the agent, denoted by a successor function S, e.g. S(x) returns the set of states reachable from x by carrying out any single action.
- State space of the problem: the set of all states reachable from the initial state by any sequence of actions.
- Path in the state space: any sequence of actions leading from one state to another.
- Goal test: a condition to determine whether a single state is a goal state. Note that a goal can be explicit (an enumerated set of states) or abstract, e.g. checkmate: trapping the opponent's king so that any subsequent move leads to its capture. Goals might also prefer paths that are less costly.
- Path cost function: a function that assigns a cost to a path. The cost is the sum of the costs of the individual actions along the path, denoted by g.

All of the above defines a (single-state) problem.
- The datatype can be represented as:
  datatype problem
  components: initial-state, successor-function, goal-test, path-cost-function
- An instance of this datatype forms the input to a search algorithm, which returns a solution.

In summary, goal-based agents consider future actions rather than just condition-action rules.

Problem: a goal and a set of means of achieving it.
Search: exploring what the means can do within their limitations, adversaries, etc.
Goal: a set of world states (goal objectives must be minimized and based on current states to be manageable).
Actions: cause transitions between world states. Agents must search for the best transitions to achieve the goal state.
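The single-state problem datatype described above (initial state, successor function, goal test, path cost) can be sketched in Python. The small route graph used as the instance is invented for illustration:

```python
# Minimal sketch of the single-state problem datatype:
# initial-state, successor-function, goal-test, path-cost-function.
class Problem:
    def __init__(self, initial_state, successors, goal_states, step_cost=1):
        self.initial_state = initial_state
        self.successors = successors      # S(x): states reachable from x
        self.goal_states = goal_states
        self.step_cost = step_cost

    def successor(self, state):
        """S(x): the set of states reachable from x by a single action."""
        return self.successors.get(state, set())

    def goal_test(self, state):
        """Explicit goal test: membership in an enumerated set of states."""
        return state in self.goal_states

    def path_cost(self, path):
        """g: the sum of the (here uniform) costs of the actions on the path."""
        return self.step_cost * (len(path) - 1)

# Invented instance: states are town names, successors are road links.
p = Problem("A", {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": set()}, {"D"})
```

An instance such as `p` is exactly what a search algorithm takes as input; e.g. `p.path_cost(["A", "B", "D"])` is 2 under the uniform step cost.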
Problem-solving agents decide what to do by finding sequences of actions that lead to desirable states. A search algorithm takes a problem as input and returns a solution in the form of an action sequence. Once this solution is found, the action sequence can be executed.

Solution: a path from the initial state to a state that satisfies the goal test.

Generally, a problem-solving agent:
- formulates a goal
- searches for a solution
- executes actions to get to a new goal (and then removes that step from the sequence if the new goal is not the final optimal goal)

Tutorial
1. Have the tutor explain the Missionaries and Cannibals problem. Determine the type of problem and define its structure.

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 4 - Problem Solving Agents using Uninformed Searches

Problem-solving performance measurement
The effectiveness of a search can be measured by:
- Does it find a solution at all?
- Does it find a good solution (one with a low path cost; a path cost is the sum of the costs of the individual actions along the path, represented by g)?
- What is the search cost associated with the time and memory required to find a solution?

Note that: total cost of the search = path cost (online) + search cost (offline).
- For problems with very small state spaces, it is easy to find the solution with the lowest path cost.
- For large, complicated problems, a tradeoff is needed between searching for a very long time to get an optimal solution, and searching for a shorter time to get a solution with a slightly larger path cost.
- The agent must decide what resources to devote to search and what resources to devote to execution accordingly. This is an issue of resource allocation.

Problem states: world state space and actions vs. search state space and actions.
The search states and actions must be abstracted from the world state space and actions, as there will be many things in the world space and actions that are irrelevant to the goal of the problem at hand. This process of removing details is called abstraction.

Class example:
Goal: drive from Arad to Bucharest.
Search: there are many alternative routes.
Best solution: Arad to Sibiu to Fagaras to Bucharest (using the number of steps as the cost function; path cost = 3).

Searching for Solutions
- We find a solution by a search through the state space.
- The process involves maintaining and extending a set of partial solution sequences and keeping track of them using suitable data structures.
- We must expand a state to generate new state(s), or else we will never move away from the start state and will end in the same start state.
- Whenever we expand to more than one state, a choice must be made: search for the best state.
- The choice of which state to expand first is determined by the search strategy.
- The search process must be seen as a tree structure.
- The root of the search tree is a search node corresponding to the initial state.
- The leaf nodes are nodes with no successors, either because they have not been expanded yet or because they had null children after expansion.

Data structures for search trees
There are various ways to represent trees. A node is a data structure with the following properties:
- the state to which the node corresponds
- the parent node
- the operator that generated the node
- the depth of the node (the number of nodes on the path from the root to this node)
- the path cost from the initial state to the node

Note that nodes have depths and parents, but states do not. It is even possible for two different nodes to contain the same state.

Search Strategy
We must find the right search strategy for a problem. Search strategies can be evaluated on:
- Completeness: does it guarantee to find a solution when there is one?
- Time complexity: how long does it take to find a solution?
- Space complexity: how much memory does it need to perform the search?
- Optimality: does it find the highest-quality solution when there are several different solutions?

Uninformed search
- Uninformed means no information about the number of steps or path cost from the current state to the goal state.
- Such strategies can only distinguish a goal state from a non-goal state.
- Hence they are called blind searches.
- Informed search, or heuristic search, possesses such information. Informed search has some problem-specific knowledge, which makes it more efficient; hence it is called heuristic search.
- Uninformed search is less effective than informed search.

1. Breadth-first search
The root node is expanded first, then all the nodes generated by the root node are expanded next, and then their successors, and so on; i.e. all nodes at depth d in the search tree are expanded before nodes at depth d + 1. It finds the shallowest goal state, which may not, however, be the least-cost solution for a general path cost function.

2. Uniform cost search
Uniform cost search modifies breadth-first search by always expanding the lowest-cost node rather than the lowest-depth node. To find the lowest-cost path, the cost of the path must be non-decreasing as we go along the path. This way uniform-cost search can find the cheapest path without exploring the whole search tree. If there are negative path costs, then the whole tree must be expanded if we want the least-cost solution, as a large negative value can drastically reduce a path's cost.

3. Depth-first search
This strategy always expands the node at the deepest level of the tree, until the search hits a dead end, i.e. a non-goal node with no expansion. Only when this happens does the search go back and expand nodes at shallower levels.

4. Depth-limited search
This imposes a cutoff on the maximum depth of a path. An operator keeps track of the depth. E.g. the problem before has 20 states, hence there must be at most 19 steps to get to the goal.
So we can have a solution like: if you are in city A and have travelled a path of length less than 19, then generate a new state in city B with a path length that is one greater. We are guaranteed to find a solution, but not necessarily the shortest one. Hence depth-limited search is complete but not optimal. Its time and space complexity are similar to those of depth-first search.

5. Iterative deepening search
The breadth-first and depth-first methods can be combined to yield this better search method, which can also be called depth-first search with iterative deepening. The search performs depth-first search iteratively, increasing the depth bound by 1 at each iteration. If no solution is found by searching to the depth bound, then a new depth-first search is performed with a higher bound. No information is retained between iterations. Because of the problem of picking a good depth limit (the diameter of the state space; this is 9 in our example, as any city can be reached from any other city in at most 9 steps), iterative deepening search avoids choosing the best depth limit by trying all possible depth limits: first depth 0, then depth 1, depth 2, and so on.

6. Bi-directional search
This search proceeds simultaneously both forward from the initial state and backwards from the goal, and stops when the two searches meet in the middle. The forward and backward searches each have to go only half way. Several issues need to be addressed before this search can be implemented.

Tutorial
1. Consider the search problem defined by:
   State space: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
   Initial state: 1
   Goal states: 5 and 14
   Successor function:
   1 -> {2, 3}    [meaning that state 1 has two successors: 2 and 3]
   2 -> {4, 5, 6}
   3 -> {7}
   4 -> {8, 9}
   5 -> {10}
   6 -> {11, 12, 13}
   7 -> {14, 15}
   8 -> {}        [meaning 8 has no successor]
   9 -> {}
   10 -> {}
   11 -> {}
   12 -> {}
   13 -> {}
   14 -> {}
   15 -> {}
   Give a possible sequence of states generated by a search algorithm using:
   1. Breadth-first search
   2. Depth-first search (use preorder, inorder, and postorder)
   3. Iterative deepening
   Assume that the algorithm checks whether a node N is a goal node when it decides to generate its successors.

2. Assume that the search algorithm uses a best-first search. Let the cost of each step (or arc) in the search tree be 1, and the heuristic function be h(N), where:
   - h(N) is an arbitrary positive number if the state associated with N is 4, 6, 7, 9, 10, 11, 12, 13, or 15;
   - h(N) = 1, 0.5, 1.5, 0, 0.5, and 0 if the state associated with N is 1, 2, 3, 5, 7, and 14, respectively.
   Is the function h admissible? Why?

3. Using the following 8-puzzle problem, construct a breadth-first search for the shortest solution. Show all states, making allowance for repeated states.

   Start State     Goal State
   5 4 _           1 2 3
   6 1 8           8 _ 4
   7 3 2           7 6 5

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 5 - Problem Solving Agents Using Informed Searches

Informed search methods
- The idea here is that information about the state space can prevent search algorithms from blundering about in the dark.
- Informed searches use problem-specific knowledge to find solutions more efficiently.
- This knowledge is usually provided by an evaluation function that returns a number purporting to describe the desirability (or lack thereof) of expanding a node.

1. Best-first search
Here the nodes are ordered so that the one with the best evaluation is expanded first. It aims to find low-cost solutions.
Best-first search typically uses some estimated measure of the cost of the solution and tries to minimize it. It differs from uniform cost search because it directs the search toward the goal rather than just expanding a low path cost. It incorporates some estimate of the cost of the path from a state to the closest goal state. It comes in two flavours: (a) expand the node closest to the goal; (b) expand the node on the least-cost solution path.

a. Greedy search (algorithm)
A form of best-first search that aims to minimize the estimated cost to reach a goal. It expands first the node whose state is judged to be closest to the goal state. This estimate is calculated by a function called a heuristic function (note that the cost of reaching a goal cannot be determined exactly, hence it is estimated by a heuristic function). Best-first search that uses a heuristic function to choose which node to expand is called greedy search. The algorithm is called greedy because it expands the node closest to the goal state first, thereby cutting the search cost considerably, but the total cost of the solution path found may not be optimal. Note that if n = goal, then h(n) = 0.

b. A*
The greedy search strategy in (a) above is neither optimal nor complete; it only cuts the search cost considerably. Recall that uniform cost search minimizes the cost of the path: it is optimal and complete but not efficient, e.g. if the condition of non-decreasing path cost is not met, the algorithm must evaluate the whole search tree before knowing the optimal (path cost) solution.

Algorithm A: f(n) = g(n) + h(n). f(n) is called the f-cost. Therefore, if we are interested in the cheapest solution, we can just aim for the node with the lowest value of f. Algorithm A is both complete and optimal provided that the h function abides by a simple restriction, i.e. it should not overestimate. Given this simple restriction on h, search with f(n) = g(n) + h(n) is both complete and optimal.
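A minimal sketch of searching with f(n) = g(n) + h(n). The graph, step costs, and heuristic values below are invented for illustration; the h values never overestimate the true cost to the goal:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Always expand the node with the lowest f = g + h."""
    frontier = [(h(start), 0, start, [start])]  # (f, g, state, path)
    best_g = {start: 0}                         # cheapest g found per state
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for nxt, cost in neighbors.get(state, []):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")

# Invented example graph: adjacency lists with step costs, plus a
# non-overestimating heuristic toward goal "D".
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)]}
h_vals = {"A": 3, "B": 2, "C": 1, "D": 0}
path, cost = a_star("A", "D", graph, lambda s: h_vals[s])
```

Here the search returns the path A, B, C, D with cost 3, rather than the greedier direct edge B to D of cost 5, because f steers expansion toward cheap total cost, not merely toward the goal.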
If this restriction is violated, the search may behave greedily (i.e. like the greedy algorithm). The restriction on the h function is that it should never overestimate the cost to reach the goal. h functions that abide by this rule are called admissible heuristics.

Hill Climbing
This is simply a loop that continually moves in the direction of increasing value. There is no search tree to maintain, so the node need only record the state and its evaluation.

Algorithm
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise continue with the initial state as the current state.
2. Loop until a solution is found or until there are no new operators left to be applied in the current state:
   a. Select an operator that has not yet been applied to the current state and apply it to produce a new state.
   b. Evaluate the new state:
      i. If it is a goal state, then return it and quit.
      ii. If it is not a goal state but it is better than the current state, then make it the current state.
      iii. If it is not better than the current state, then continue in the loop.

Problems:
1. Local maxima: a peak that is lower than the highest peak in the state space. Once it is reached, the algorithm will halt even though the solution may be far from satisfactory.
2. Plateaux: an area of the state space where the evaluation function is essentially flat. The search will conduct a random walk.
3. Ridges

Simulated Annealing
This exploits an analogy between the way in which a metal cools and freezes into a minimum-energy crystalline structure (the annealing process) and the search for a minimum in a more general system. It employs a random search which not only accepts changes that decrease the objective function f, but also some changes that increase it. The latter are accepted with a probability

   p = exp(-δf / T)

where δf is the increase in f and T is a control parameter, which by analogy with the original application is known as the system "temperature", irrespective of the objective function involved.

Tutorial
1. Apply hill climbing to the following problem. Consider the blocks world problem shown below. The operators are: pick up one block and put it on the table; pick up one block and put it on another block. The heuristic function is: add one point for every block that is resting on the thing it is supposed to be resting on; subtract one point for every block that is sitting on the wrong thing.

   Initial State   Goal State
        A               H
        H               G
        G               F
        F               E
        E               D
        D               C
        C               B
        B               A

2. Using the following heuristic function, apply hill climbing again to the problem above: for each block that has the correct support structure (i.e. the complete structure underneath it is exactly as it should be), add one point for every block in the support structure. For each block that has an incorrect support structure, subtract one point for every block in the existing support structure.

3. Using the map of Romania in the text, calculate the best-first searches for the route from Oradea to Bucharest.

4. Consider the following map (not drawn to scale) and straight-line distance table:

   Straight-line distance to M
   A 51   E 4?   I ?
   B 50   F 14   J 32
   C ?    G 33   K 41
   D 28   H 43   L 56
   (entries marked ? are illegible in this copy)

   Using the A* algorithm, work out a route from town A to town M. Provide the search tree for your solution, showing the order in which the nodes were expanded and the cost at each node. You should not re-visit states previously visited. Finally, state the route you would take and the cost of that route.

5. The straight-line distance heuristic used above is known to be an admissible heuristic. What does this mean and why is it important?
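The hill-climbing loop from this lecture can be sketched as follows. The neighbour and evaluation functions are invented one-dimensional stand-ins, not part of the notes:

```python
# Minimal hill-climbing sketch: keep moving to the best neighbour until
# no neighbour improves on the current state (a peak, possibly only local).
def hill_climb(state, neighbours, value):
    current = state
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current  # no better successor: halt (may be a local maximum)
        current = best

# Invented example: maximize -(x - 3)^2 over integer steps of +/-1.
result = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
```

Starting from 0, the loop climbs 1, 2, 3 and halts at 3, where neither neighbour improves the evaluation. On a function with several peaks, the same loop would halt at whichever peak is uphill from the start state, which is precisely the local-maximum problem listed above.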
UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 6 - Knowledge Representation

Terms
Knowledge representation: the study of how knowledge about the world can be represented and what kinds of reasoning can be done with that knowledge.
Knowledge representation formalism: see knowledge representation language.
Knowledge representation language: a formalism used to represent knowledge. Also known as a knowledge representation formalism.
Knowledge base: the collection of knowledge used by the system.

What is knowledge? How should knowledge be represented? Philosophers have discussed these questions for thousands of years. Knowledge representation is one of the major research areas of cognitive science and psychology. With the development of computer science, knowledge representation has become an inseparable part of Artificial Intelligence (AI). Most research in AI to date has been based on the assumption that in any AI program there is a separate module which represents the information that the program has about the world. As a result, a number of so-called knowledge representation formalisms have been developed for representing this kind of information. Knowledge representation is used almost everywhere. E.g. in medical informatics, knowledge representation is related to various topics such as decision support systems, medical vocabularies, data coding and transfer, guideline and protocol development, database design, and electronic patient records.

Knowledge Representation Language
A knowledge representation language has two aspects, namely a syntactic and an inferential aspect. The syntactic or notational aspect concerns the way in which one stores information in an explicit format. The inferential aspect concerns the way in which the explicitly stored information can be used to derive information that is implicit in it.
A knowledge representation language has four levels, namely the implementational, logical, epistemological, and conceptual levels.

The major concern at the implementational level is that it be possible to build a computer program underlying a knowledge representation language. From the notational point of view, the emphasis is on data structures for representing the knowledge, while from the inferential point of view the major concerns are discovering and implementing algorithms that draw the desired inferences.

At the logical level we focus on the logical properties of a knowledge representation language. From the syntactic aspect, the meanings of expressions and the expressive power of the formalism are the major concerns. From the inferential aspect, the logical properties of the formalism, for example the soundness of the inference procedure, are the major concern.

At the third level of a knowledge representation formalism, the epistemological level, the major concerns are the knowledge-structuring primitives that are needed for a satisfactory knowledge representation language and the types of inference strategy that should be made available. Whereas the epistemological level is concerned with the types of knowledge-structuring primitives that are needed, the conceptual level concerns itself with the actual primitives that should be included in a knowledge representation language.

The most popular formalisms include production rules, scripts, semantic networks, and frames.

Production Rules (Expert Systems)
Production rules are one of the most popular and widely used knowledge representation languages. Early expert systems used production rules as their main knowledge representation language. For example MYCIN, which is also considered one of the first research works in medical informatics, has production rules as its knowledge representation language. A production rule system consists of three components: working memory, rule base, and interpreter.

Prepared by Janett Williams, August 2006
The working memory contains the information that the system has gained about the problem thus far. The rule base contains information that applies to all the problems that the system may be asked to solve. The interpreter solves the control problem, i.e., it decides which rule to execute on each select-execute cycle.

Production rules as a knowledge representation language have the following advantages:
- Naturalness of expression
- Modularity
- Restricted syntax

Disadvantages of production rules as a knowledge representation language include:
- Inefficiency
- Limited expressiveness

Scripts
A script is a structure that describes a stereotyped sequence of events in a particular context. It consists of slots. Associated with each slot may be some information about what kinds of values it may contain, as well as a default value to be used if no other information is available. Scripts are useful because in the real world there are patterns to the occurrence of events.

Advantages
- Can predict events that have not explicitly been observed
- Provide a way of building a single coherent interpretation from a collection of observations
- Focus attention on unusual events

Disadvantages
- Less general structure than frames
- Not suitable for representing all kinds of knowledge

Semantic Nets
The major idea is that the meaning of a concept comes from its relationships to other concepts, and that the information is stored by interconnecting nodes with labelled arcs.

Representation in a Semantic Net
The physical attributes of a person can be represented in a semantic net. These values can also be represented in logic as:
isa(person, mammal)
instance(Mike-Hall, person)
team(Mike-Hall, Cardiff)
We have already seen how conventional predicates such as lecturer(dave) can be written as instance(dave, lecturer). Recall that isa and instance represent inheritance and are popular in many knowledge representation schemes.
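The three-component architecture just described (working memory, rule base, interpreter with a select-execute cycle) can be sketched in a few lines of Python. The rule format and the conflict-resolution strategy (fire the first applicable rule) are illustrative assumptions, not the behaviour of any particular shell:

```python
# Minimal production-rule interpreter: working memory, rule base, interpreter.
# Rules are (set-of-antecedents, consequent) pairs over simple string facts.

def run(rule_base, working_memory):
    """Repeatedly select and execute an applicable rule (the interpreter's
    select-execute cycle) until no rule can add anything new."""
    wm = set(working_memory)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rule_base:
            if set(antecedents) <= wm and consequent not in wm:
                wm.add(consequent)   # execute: assert the consequent
                changed = True
                break                # conflict resolution: first match wins
    return wm

# Hypothetical rules, purely for illustration.
rules = [({"fever", "rash"}, "suspect measles"),
         ({"suspect measles"}, "order blood test")]
print(run(rules, {"fever", "rash"}))
```

Note how the rule base is problem-independent while the working memory holds the facts of the case at hand, exactly the division of labour described above.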
But we have a problem: how can we have predicates of more than two places in semantic nets? E.g. score(Cardiff, Llanelli, 23-6).

Solution:
- Create new nodes to represent new objects either contained or alluded to in the knowledge (game and fixture in the current example).
- Relate information to nodes and fill up slots (see figure below).

[Figure: a semantic net in which a reified game node links Cardiff, Llanelli and the score 23-6.]

As a more complex example consider the sentence: John gave Mary the book. Here we have several aspects of an event.

Frame-based Representation Languages
Frames are structures that represent knowledge about a limited aspect of the world. Like the concepts in many of the semantic network representations, frames are descriptions of objects. The descriptions in a frame are called slots. Since frames were introduced in the 1970s, many knowledge representation languages have been developed based on this concept. In medical informatics, some high-quality medical vocabularies, for example MED, developed at the Department of Medical Informatics of Columbia University, use frames as their knowledge representation language.

Advantages of frame-based representation languages include:
- The domain knowledge model is reflected directly
- Support for default reasoning
- Efficiency
- Support for procedural knowledge

Disadvantages of frame-based representation languages include:
- Lack of semantics
- Expressive limitations

Logic-based Knowledge Representation (First-Order Logic)
Logic can be defined as the study of correct inference, of what follows from what. A logic usually consists of a syntax, a semantics and a proof theory. The syntax of a logic defines a formal language of the logic. The semantics of a logic specifies the meanings of the well-formed expressions of the logical language. The proof theory of a logic provides a purely formal specification of the notion of correct inference.
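The reification trick above can be made concrete by storing the net as a set of labelled arcs and giving the three-place score fact its own node. The node name `game1` and the slot names are illustrative choices, not part of any standard:

```python
# A semantic net as a set of labelled arcs (node, relation, node).
# The three-place fact score(Cardiff, Llanelli, 23-6) is reified as a
# 'game1' node with one arc per slot (node and slot names are invented).

net = {
    ("person", "isa", "mammal"),
    ("Mike-Hall", "instance", "person"),
    ("Mike-Hall", "team", "Cardiff"),
    ("game1", "instance", "game"),
    ("game1", "home-team", "Cardiff"),
    ("game1", "away-team", "Llanelli"),
    ("game1", "score", "23-6"),
}

def slots(net, node):
    """Collect the labelled arcs leaving a node, as a slot -> filler map."""
    return {rel: val for (n, rel, val) in net if n == node}

print(slots(net, "game1"))
```

Every relation in the net is now binary, yet the original three-place fact is fully recoverable from the reified node's slots.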
As a knowledge representation language, logic has the following advantages:
- A well-defined semantics
- Expressiveness

Disadvantages of logic as a knowledge representation language are:
- Inefficiency
- Undecidability
- Inability to express procedural knowledge
- Inability to do default reasoning

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE
LECTURE 7

Logical Agents
A statement can be defined as a declarative sentence, or part of a sentence, that is capable of having a truth value, such as being true or false. So, for example, the following are statements:
Portia Simpson-Miller is the first female Prime Minister of Jamaica.
Paris is the capital of France.
Everyone born on Monday has purple hair.

First-order logic, comprising the propositional and predicate calculi, is the most widely studied and used.

Propositional Logic
Propositional logic, also known as sentential logic, is the branch of logic that studies ways of combining or altering statements or propositions to form more complicated statements or propositions. Joining two simpler propositions with the word "and" is one common way of combining statements. When two statements are joined together with "and", the complex statement formed by them is true if and only if both the component statements are true. Because of this, an argument of the following form is logically valid:
Paris is the capital of France and Paris has a population of over two million.
Therefore, Paris has a population of over two million.
Propositional logic largely involves studying logical connectives such as the words "and" and "or" and the rules determining the truth values of the propositions they are used to join, as well as what these rules mean for the validity of arguments, and such logical relationships between statements as being consistent or inconsistent with one another, as well as logical properties of propositions, such as being tautologically true, being contingent, and being self-contradictory. The connectives are typically ¬, ∧, ∨, and →.

NOTE: the truth table is the basic semantic tool of propositional logic.

Axioms (or their schemata) and rules of inference define a proof theory, and various equivalent proof theories of propositional calculus can be devised. The following list of axiom schemata of propositional calculus is from Kleene (Mathematical Logic. New York: Dover, 2002):

(1) F → (G → F)
(2) (F → G) → ((F → (G → H)) → (F → H))
(3) F → (G → (F ∧ G))
(4) F → (F ∨ G)
(5) G → (F ∨ G)
(6) (F ∧ G) → F
(7) (F ∧ G) → G
(8) (F → H) → ((G → H) → ((F ∨ G) → H))
(9) (F → G) → ((F → ¬G) → ¬F)
(10) ¬¬F → F

In each schema, F, G, H can be replaced by any sentential formula.

Modus Ponens
The following rule, called Modus Ponens, is the sole rule of inference:

F, F → G ⊢ G

This rule states that if each of F and F → G is either an axiom or a theorem formally deduced from axioms by application of inference rules, then G is also a formal theorem. Other rules are derived from Modus Ponens and then used in formal proofs to make proofs shorter and more understandable. These rules serve to directly introduce or eliminate connectives. Modus Ponens is basically →-elimination, and the deduction theorem is →-introduction.

Sample introduction rules include:
F, G ⊢ F ∧ G        F ⊢ G ∨ F

Sample elimination rules include:
F ∧ G ⊢ F        F ∧ G ⊢ G

Proof theories based on Modus Ponens are called Hilbert-type, whereas those based on introduction and elimination rules as postulated rules are called Gentzen-type. All formal theorems in propositional calculus are tautologies, and all tautologies are formally provable.
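The link between tautologies and theorems can be checked mechanically with a truth table. This sketch enumerates every assignment for a formula encoded as a Python predicate over booleans (the encoding is an illustrative choice):

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is false only when p is true and q false."""
    return (not p) or q

def is_tautology(formula, arity):
    """Check a formula on every row of its truth table."""
    return all(formula(*row) for row in product([True, False], repeat=arity))

# Axiom schema (1): F -> (G -> F) is a tautology.
assert is_tautology(lambda f, g: implies(f, implies(g, f)), 2)

# Modus Ponens is truth-preserving: (F and (F -> G)) -> G.
assert is_tautology(lambda f, g: implies(f and implies(f, g), g), 2)

# A contingent formula is not a tautology: F -> G.
assert not is_tautology(lambda f, g: implies(f, g), 2)
print("all checks passed")
```

Since propositional logic is decidable, this brute-force check settles tautologyhood for any formula, which is exactly why truth tables can "discover theorems" as the notes say.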
Therefore, proofs can be used to discover tautologies in propositional calculus, and truth tables can be used to discover theorems in propositional calculus. One can formulate propositional logic using just the NAND operator. The history of that can be found in Wolfram (2002, p. 1151). The shortest such axiom is the Wolfram axiom.

Predicate Logic
A predicate is a feature of language which you can use to make a statement about something, e.g. to attribute a property to that thing. If you say "Peter is tall", then you have applied to Peter the predicate "is tall". We might also say that you have predicated tallness of Peter or attributed tallness to Peter. A predicate may be thought of as a kind of function which applies to individuals (which would not usually themselves be propositions) and yields a proposition. Predicates are therefore sometimes known as propositional functions. Analysing the predicate structure of sentences permits us to make use of the internal structure of atomic sentences, and to understand the structure of arguments which cannot be accounted for by propositional logic alone.

Predicates (or relations):
- Are operators which yield atomic sentences.
- Operate on things other than sentences.
- Are therefore not truth-functional operators.
- Yield atomic sentences whose truth can be determined knowing only the identity of the things to which the predicate is applied (i.e. they are extensional).

The term relation is typically used of a predicate which is applied to more than one thing, e.g. "greater than", which is applied to two things to make a comparison, but it can also be used for predicates taking one or zero things. The number of "things" involved (as arguments) is called the arity of the predicate or relation.

Objects are things with individual identities, e.g. people, homes, numbers, Peter.
Relations are descriptions of or actions on objects, e.g.
brother of, bigger than, has colour, owns.
Functions are relations in which there is only one "value" for a given "input", e.g. father of, best friend.
Properties are used to distinguish objects from each other, e.g. red, prime, fat.
Any fact can be thought of in terms of objects and properties or relations, e.g. "One plus two equals three":
Objects: one, two, three, one plus two
Relation: equals
Function: plus

Forward Chaining
Forward chaining starts with the available data and uses inference rules to extract more data (from an end user, for example) until a goal is reached. An inference engine using forward chaining searches the inference rules until it finds one whose If clause is known to be true. When found, it can conclude, or infer, the Then clause, resulting in the addition of new information to its dataset. Inference engines will often cycle through this process until the goal is reached.

For example, suppose that the goal is to conclude the color of my pet Fritz, given that he croaks and eats flies, and that the rule base contains the following two rules:
If Fritz croaks and eats flies, then Fritz is a frog.
If Fritz is a frog, then Fritz is green.

The given facts (that Fritz croaks and eats flies) would first be added to the knowledge base, and the rule base is searched for a rule whose antecedent matches these facts. This is true of the first rule, so the conclusion (that Fritz is a frog) is also added to the knowledge base, and the rule base is searched again. This time the second rule's antecedent matches our new conclusion, so we add to our knowledge base the further conclusion that Fritz is green. Nothing more can be inferred from this information, but we have now accomplished our goal of determining the color of Fritz. Forward-chaining inference is often called data driven, in contrast to backward-chaining inference, which is referred to as goal-driven reasoning.
This data-driven, bottom-up approach of forward chaining is commonly used in expert systems, such as CLIPS. One of the advantages of forward chaining over backward chaining is that the reception of new data can trigger new inferences, which makes the engine better suited to dynamic situations in which conditions are likely to change.

Backward Chaining
Backward chaining starts with a list of goals (or a hypothesis) and works backwards to see if there are data available that will support any of these goals. An inference engine using backward chaining would search the inference rules until it finds one which has a Then clause that matches a desired goal. If the If clause of that inference rule is not known to be true, then it is added to the list of goals (in order for your goal to be confirmed you must also provide data that confirms this new rule). For example, suppose that the goal is to conclude the color of my pet Fritz, given that he croaks and eats flies, and that the rule base contains the following two rules:
If Fritz croaks and eats flies, then Fritz is a frog.
If Fritz is a frog, then Fritz is green.

This rule base would be searched and the second rule would be selected, because its conclusion (the Then clause) matches the goal (that Fritz is green). It is not yet known that Fritz is a frog, so the If statement is added to the goal list (in order for Fritz to be green, he must be a frog). The rule base is searched again and this time the first rule is selected, because its Then clause matches the new goal that was just added to the list (whether Fritz is a frog). The If clause (Fritz croaks and eats flies) is known to be true, and therefore the goal that Fritz is a frog can be concluded (Fritz croaks and eats flies, so he must be a frog; Fritz is a frog, so he must be green). Because the list of goals determines which rules are selected and used, this method is called goal driven, in contrast to data-driven forward-chaining inference.
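Both Fritz traces can be reproduced with a small sketch. The rule encoding and the naive search strategies are illustrative simplifications of what a real engine such as CLIPS or Prolog does:

```python
# Rules as (set-of-antecedents, consequent) pairs over string facts.
RULES = [({"Fritz croaks", "Fritz eats flies"}, "Fritz is a frog"),
         ({"Fritz is a frog"}, "Fritz is green")]

def forward_chain(facts):
    """Data driven: fire any rule whose antecedents all hold, until fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for ants, cons in RULES:
            if ants <= facts and cons not in facts:
                facts.add(cons)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Goal driven: a goal holds if it is a known fact, or if some rule
    concludes it and all of that rule's antecedents can in turn be proved."""
    if goal in facts:
        return True
    return any(cons == goal and all(backward_chain(a, facts) for a in ants)
               for ants, cons in RULES)

facts = {"Fritz croaks", "Fritz eats flies"}
print(forward_chain(facts))                     # infers frog, then green
print(backward_chain("Fritz is green", facts))  # prints True
```

Forward chaining derives everything the data supports; backward chaining touches only the rules relevant to the queried goal, which is why Prolog-style query answering works this way.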
This goal-driven, top-down approach is often employed by expert systems. The programming language Prolog supports backward chaining.

Tutorial
1. Using the information below, answer the question "Did Marcus hate Caesar?" using predicate logic.
Marcus was a man.
Marcus was a Pompeian.
All Pompeians were Romans.
Caesar was a ruler.
All Romans were either loyal to Caesar or hated him.
Everyone is loyal to someone.
People only try to assassinate rulers they are not loyal to.
Marcus tried to assassinate Caesar.

2. Consider the following sentences:
John likes all kinds of food.
Apples are food.
Chicken is food.
Anything anyone eats and isn't killed by is food.
Bill eats peanuts and is still alive.
Sue eats everything Bill eats.
Translate these sentences into FOPL. Prove that John likes peanuts using backward chaining.

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE
LECTURE 8

Expert Systems - Introduction

Definition
"A computer program that represents and reasons with knowledge of some specialist subject with a view to solving problems or giving advice." Peter Jackson, 1999
- An expert system uses domain-specific knowledge to provide 'expert quality' performance in a problem domain.
- The knowledge is from human domain experts, and the system emulates their methodology and performance.
- It focuses on a narrow set of problems (specialist).
- Expert systems are generally different from cognitive modelling programmes in that they currently cannot learn from their own experience. They make decisions based on heuristic reasoning.
- General-purpose problem solvers versus domain-specific expert systems:
  - The general-purpose approach was unsuccessful because it attempted to solve a large class of problems.
  - Special-purpose (domain-specific) systems were successful because of the narrowness of the application area.

Typical Tasks of Expert Systems
- The interpretation of data, e.g. sonar signals
- Diagnosis of malfunctions, e.g. equipment faults or human diseases
- Structural analysis of complex objects, e.g. chemical compounds
- Configuration of complex objects, e.g. computer systems
- Planning sequences of action, e.g. performance by a robot

Examples of Problem Areas
Category            Problem addressed                          Areas
1. Diagnosis        Infer malfunctions from observations       Medical, Engineering
2. Repair/Debug     Recommendations for correcting problems    Machinery, Equipment
3. Design           Configuring systems under constraints      Engineering
4. Prediction       Infer likely consequences of situations    Financial, Weather forecasting
5. Planning         Develop guidelines to achieve goals        Scheduling, Financial
6. Interpretation   Infer system properties from data          Scientific, Security
7. Control/Monitor  Compare observations to standards          Production, Operation
8. Instruction      Diagnose and correct weaknesses            Education, Training

Characteristics of an Expert System
- Simulates human reasoning about a problem domain, rather than simulating the domain itself
- Performs reasoning over representations of human knowledge, in addition to doing numerical calculations or data retrieval
- Solves problems by heuristic or approximate methods which, unlike algorithmic solutions, are not guaranteed to succeed

Differences from Other AI Programs
- Deals with subject matter of realistic complexity that normally requires a considerable amount of human expertise
- Must exhibit high performance in terms of speed and reliability in order to be a useful tool
- Must be capable of explaining and justifying solutions or recommendations in order to convince the user that its reasoning is in fact correct
Benefits of Expert Systems
- Dissemination of scarce expertise (CATS-1)
- Increased productivity (XCON)
- Increased quality/consistency (help desks)
- Integration of several expert opinions (MYCIN)
- Ability to work under uncertainty
- It is easier to update a knowledge base than to change procedural programs

Limitations of Expert Systems
- Restricted to narrow domains
- Experts may not be able to articulate the problem-solving process (so building the ES may become difficult)
- Experts do not always agree on approaches (this can make ES evaluation difficult)
- Expertise is hard to extract from humans
- Knowledge engineers are rare and expensive
- ES do not always reach the correct conclusion (validation is difficult)

Building an ES
To build an acceptable expert system a knowledge engineer has to do the following:
1. acquire the desired knowledge
2. represent this knowledge in a knowledge base
3. have some form of controlled reasoning
4. let the program explain the solution(s) derived

Knowledge Acquisition
"This is the transfer and transformation of potential problem-solving expertise from some knowledge source to a program." Buchanan, 1983

Difficulties in Knowledge Acquisition
- Specialists' own jargon
- Principles underlying domains of interest not well understood
- Difficulties in eliciting ways to handle domain knowledge
- Need for general knowledge over domain knowledge

Knowledge Representation
- How is information stored? What associations can be made?
- Logical vs biological
- Syntactic standardization (see example)
- Formal expressions (syntax & semantics - ch. 3)
- Sound (accurate) and unambiguous symbolic representation (non-numeric)
- Criteria for assessment:
  - logical adequacy (sound and complete inference)
  - heuristic power (efficiency vs optimality)
  - notational convenience (descriptive/declarative)
- Coding conventions:
  - production rules (if-then)
  - structured objects (defstruct)
  - logic programs (propositional, FOL, ...)
Example: (∀x)(canine(x) → mammal(x))

Control & Reasoning
- Metaknowledge: knowledge about what? when? how? in terms of knowledge availability and accessibility
- Requires planning and control
- Search techniques and state space
- Control regime: a strategy for applying knowledge in a systematic way

Explanation Facility
- Log of the chain of reasoning
- Justifies the solution
- Accountability
- For whom? Users, knowledge engineers, experts, programmers, managers
- Evaluation / general assessment of the tool

Limitations to Building
- Complex sensory-motor skills
- Commonsense reasoning (ch. 3)
- The knowledge bottleneck
- Domain dependencies
- Potential trade-offs for efficiency's sake (heuristics): optimality / soundness / completeness

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE
LECTURE 9

Uncertainty

Uncertain Knowledge and Reasoning
- Many real-life situations involve complex problem domains, which require drawing useful conclusions from incomplete and imprecise data using unsound reasoning.
- Predicate calculus only works on correct premises and sound rules. There is a need to take care of reasoning under probabilities.

Reasons
- Many real-life variables may be unknown
- Data may be incomplete
- Too complicated to cope with
- Too many conditions

Probability theory provides the basis for the treatment of systems that reason under uncertainty. One representation language under this system is the Bayesian Belief Network.

Bayes' Rule
Bayes' rule is the foundation of the Bayesian Belief Network. This method provides the following equation in order to compute the probability of a hypothesis Hi, given the probabilities with which the evidence follows from the hypotheses.
P(Hi | E) = P(E | Hi) × P(Hi) / Σk=1..n P(E | Hk) × P(Hk)        (Luger & Stubblefield, 1998)

where
P(Hi | E) = the probability that Hi is true given evidence E
P(Hi) = the probability that Hi is true over the whole population
P(E | Hi) = the probability of observing evidence E when Hi is true
n = the number of possible hypotheses

This equation is known as Bayes' rule and, as simple as it is, underlies all modern Artificial Intelligence systems for probabilistic inference (Russell & Norvig, 1995). This simple form is hinged on the assumptions that we must know all the probabilities on the relationships of the evidence with the various hypotheses, as well as the probabilistic relationships among the pieces of evidence. Also, all relationships between evidence and hypotheses, i.e. P(E | Hi), must be independent.

This method is justifiable, as there are many cases in real-life situations where we do have probability estimates of some conditions over causal relationships in order to derive our goals. In general, however, especially in domains like medicine, the assumption that all relationships between evidence and hypotheses be independent is more difficult to establish. According to Russell and Norvig (1995), diagnostic knowledge is often more tenuous than causal knowledge. For example, a sudden epidemic of meningitis will cause its unconditional (prior) probability to go up. A doctor who has been diagnosing based on the statistical value before the epidemic will have no clue how to update the value. However, if diagnosis is based on conditional (posterior) probability, it will be obvious to the doctor that his diagnosis based on the evidence should now rise due to the increase in the probability of the evidence (the number of people having meningitis per population). Note that the causal effect will not be affected by this rise in the prior probability due to the epidemic, i.e.
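Bayes' rule can be applied directly once the priors P(Hi) and likelihoods P(E | Hi) are tabulated. The numbers below are invented purely for illustration:

```python
def bayes(priors, likelihoods):
    """Posterior P(Hi | E) for each hypothesis, via Bayes' rule:
    P(Hi | E) = P(E | Hi) P(Hi) / sum_k P(E | Hk) P(Hk)."""
    numerators = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(numerators)          # the normalizing denominator
    return [n / total for n in numerators]

# Three mutually exclusive hypotheses with invented priors and likelihoods.
priors      = [0.6, 0.3, 0.1]   # P(H1), P(H2), P(H3)
likelihoods = [0.1, 0.5, 0.9]   # P(E | H1), P(E | H2), P(E | H3)

posteriors = bayes(priors, likelihoods)
print(posteriors)   # posteriors sum to 1; H2 is now the most probable
```

Note how the evidence overturns the prior ordering: H1 starts as the most probable hypothesis, but after observing E the posterior favours H2.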
Meningitis will cause the symptom the usual way whether there is an epidemic or not.

In some cases relative likelihood is sufficient for decision making; but when two probabilities yield radically different utilities for various actions, according to Russell and Norvig (1995), exact values are needed to make rational decisions. Hence the normalized version of Bayes' rule, given as:

P(A | B) = P(B | A) P(A) / (P(B | A) P(A) + P(B | ¬A) P(¬A))

This equation treats 1/P(B) as a normalizing constant that allows the conditional terms to sum to 1. Thus, in return for assessing the conditional probability P(B | ¬A), Russell and Norvig (1995) argued that we can avoid assessing P(B) and still obtain exact probabilities from Bayes' rule.

When combining pieces of evidence, each with its own conditional probabilities, we can compute their joint probabilities using Bayes' method as:

P(A | B ∧ C) = P(B ∧ C | A) P(A) / P(B ∧ C)

We need to know the conditional probabilities of the pair (B ∧ C) given A, which can grow exponentially. This can be avoided by Bayesian updating, which incorporates evidence one piece at a time (rather than multiple pieces), modifying the previously held belief in the unknown variable. I.e., beginning with B, we have:

P(A | B) = P(A) P(B | A) / P(B)
When C is observed, we apply Bayes' rule with B as the constant conditioning context, i.e.:

P(A | B ∧ C) = P(A | B) P(C | B ∧ A) / P(C | B)
             = P(A) P(B | A) P(C | B ∧ A) / (P(B) P(C | B))

As each new piece of evidence is observed, the belief in the unknown variable is multiplied by a factor that depends on the new evidence. The problem here is finding a value for the numerator P(C | B ∧ A), as the multiplication factor depends not just on the new evidence but also on the evidence already obtained. However, given conditional independence, we can simplify the equation for updating as:

P(A | B ∧ C) = P(A) P(B | A) P(C | A) / (P(B) P(C | B))

The term P(C | B), which can cause problems, will eventually go away by elimination. This simplified form of Bayes' updating only works if the conditional independence holds for A, B and C. Conditional independence information is crucial to making probabilistic systems work effectively and can also be used in multi-valued cases.

The demand from Bayes' rule that probabilities be up to date if conclusions are to be reliable requires extensive data collection and verification that is usually not feasible in many problem domains. Otherwise, it would have provided us with a sound mathematical solution to probabilistic reasoning. As a substitute, heuristic approaches are generally used in most expert system domains. Applying heuristics helps to simplify the complexity of the hypothesis/evidence relationships. The domain is decomposed into clusters of evidence and their relationships to hypotheses. Invariances are discovered and given names that play an important intermediate role in the search for hypotheses for an observed set of evidence, which ultimately helps us to arrive at a valuable structure for data collection and hypothesis generation. The simple form of Bayes' rule defines the dependencies within the hypothesis/evidence set, for the use of rule-based inferencing and other abductive methods.

The Bayesian Belief Network
The complexity encountered in applying the full Bayes' rule to realistic problems is very prohibitive. A belief network relaxes several constraints of the full Bayes' rule and uses heuristic approaches that make the management of uncertainty more tractable (Luger & Stubblefield, 1998). This is hinged on the premises that:

a. The modularity of a problem domain may allow us to relax many of the dependence/independence constraints required by the full Bayes theorem. That is, it is sufficient to select only the local phenomena we know will interact in a reasoning situation and obtain probability measures that reflect only these clusters of events.
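The claim that updating one piece of evidence at a time gives the same answer as conditioning on both pieces at once, provided B and C are conditionally independent given A, can be verified numerically. All probabilities below are invented for illustration:

```python
# Invented numbers: a binary cause A with two symptoms B and C that are
# conditionally independent given A (illustrative, not from real data).
p_a = 0.2
p_b_given = {True: 0.9, False: 0.3}   # P(B | A)
p_c_given = {True: 0.8, False: 0.1}   # P(C | A)

def joint(a, b, c):
    """P(A=a, B=b, C=c), assuming B and C independent given A."""
    pa = p_a if a else 1 - p_a
    pb = p_b_given[a] if b else 1 - p_b_given[a]
    pc = p_c_given[a] if c else 1 - p_c_given[a]
    return pa * pb * pc

# Direct conditioning: P(A | B, C) = P(A, B, C) / P(B, C)
p_bc = sum(joint(a, True, True) for a in (True, False))
direct = joint(True, True, True) / p_bc

# Sequential updating: P(A|B ^ C) = P(A) P(B|A) P(C|A) / (P(B) P(C|B))
p_b = sum(joint(a, True, c) for a in (True, False) for c in (True, False))
p_a_given_b = p_a * p_b_given[True] / p_b      # first update, on B alone
p_c_given_b = p_bc / p_b                       # P(C | B)
sequential = p_a_given_b * p_c_given[True] / p_c_given_b

print(direct, sequential)   # the two values agree
```

The agreement holds only because the model was built with the conditional independence assumption; with correlated symptoms the sequential shortcut would be wrong, which is the point the text makes.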
b. The nodes of a belief network are linked by conditional probabilities, e.g. the link between nodes A and B is represented as A → B (C), meaning that evidence A's support for belief B carries some causal influence measure C.

c. Implicitly coherent patterns of reasoning may be reflected as paths through cause/symptom relationships, which will be reflected in the network by the use of different possible arguments.

We know that P(A | B) and P(B | A) are not the same. Therefore, to create a belief network, we must select the path our reasoning will take so that no reasoning path is circular. A directed acyclic graph is used to accomplish this. It is directed in order to reflect how events influence the likelihood of each other, and acyclic so that we have no circular reasoning (for example, a likelihood of a copper deposit cannot support a likelihood of a copper deposit in a mineral exploration domain; that would be circular reasoning). The presence or absence of data supporting nodes in the network both shapes and limits the argument path, thereby constraining or simplifying reasoning.

The evidence at a node in a belief network can affect an argument by defining the d-separation of symptom/cause relationships. Consider two nodes A and B in a Bayesian Belief Network. They are d-separated if, for all paths between A and B, there is an intermediate node V such that either A and B are connected serially through V and the state of V is known, or the connection is converging and neither node V nor any of its children have evidence.

Reasoning paths change in a belief network as different pieces of evidence are discovered. The probabilistic models in real-life applications will typically consist of thousands of components; therefore, it is always important to identify the subset of the model that includes only those elements of the domain needed for a particular query or problem at hand.
Each piece of evidence instantiated in the network reduces the number of uncertain variables (nodes) and possibly decouples parts of the search graph, thereby reducing the computational complexity of inference. Also, lack of evidence for some nodes can screen off other variables from the problem domain. Once other variables can be made independent of the variables (nodes) of interest, the network can be greatly reduced, leading to search simplification and lower computational complexity. This is in contrast to the belief of the strict Bayes theorem that more evidence demands a larger need for statistical relations and a broader network to search.

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE
LECTURE 10

Bayesian Belief Networks

Belief networks (also known as Bayesian networks, Bayes networks and causal probabilistic networks) provide a method to represent relationships between propositions or variables, even if the relationships involve uncertainty, unpredictability or imprecision. They may be learned automatically from data files, created by an expert, or developed by a combination of the two. They capture knowledge in a modular form that can be transported from one situation to another; it is a form people can understand, and one which allows a clear visualization of the relationships involved.

A Bayesian network is a model for representing uncertainty in our knowledge. Uncertainty arises in a variety of situations, such as:
- uncertainty in the experts themselves concerning their own knowledge,
- uncertainty inherent in the domain being modeled,
- uncertainty in the knowledge engineer trying to translate the knowledge, and just plain
- uncertainty as to the accuracy and actual availability of knowledge.
Bayesian networks use probability theory to manage uncertainty by explicitly representing the conditional dependencies between the different knowledge components. This provides an intuitive graphical visualization of the knowledge, including the interactions among the various sources of uncertainty.

In probabilistic reasoning, random variables (abbreviated r.v.s) are used to represent events and/or objects in the world. By making various instantiations to these r.v.s, we can model the current state of the world. Thus, this will involve computing joint probabilities of the given r.v.s. Unfortunately, the task is nearly impossible without additional information concerning relationships between the r.v.s. In the worst case, we would need the probabilities of every instantiation combination, which is combinatorially explosive.

On the other hand, consider the chain rule:

P(A1, A2, A3, A4, A5) = P(A1 | A2, A3, A4, A5) P(A2 | A3, A4, A5) P(A3 | A4, A5) P(A4 | A5) P(A5)

Bayesian networks take this process further by making the important observation that certain r.v. pairs may become uncorrelated once information concerning some other r.v.(s) is known. More precisely, we may have the following independence condition:

P(A | C1, ..., Cn, U) = P(A | C1, ..., Cn)

for some collection of r.v.s U. Intuitively, we can interpret this as saying that A is determined by C1, ..., Cn regardless of U. Combined with the chain rule, these conditional independencies allow us to replace the terms in the chain rule with smaller conditionals. Thus, instead of explicitly keeping the joint probabilities, all we need are smaller conditional probability tables which we can then use to compute the joint probabilities.

What we have is a directed acyclic graph of r.v. relationships. Directed arcs between r.v.s represent conditional dependencies. When all the parents of a given r.v. A are instantiated, that r.v.
is said to be conditionally independent of the remaining r.v.s which are not descendants of A, given its parents.

For example, let's consider the following story: Mary walks outside and finds that the street and lawn are wet. She concludes that it has just rained recently. Furthermore, she decides that she does not need to water her climbing roses. Assume that Mary used the following set of rules:

rain or sprinklers -> street = wet
rain or sprinklers -> lawn = wet
lawn = wet -> soil = moist
soil = moist -> roses = okay

We can directly transform these into a graph. Now, by considering each variable as an r.v. with possible states of {true, false}, we can construct conditional probability tables for each r.v. which reflect our knowledge of the world (see Figure 1).

Let's compute the joint probability of the world where the roses are okay, the soil is dry, the lawn is wet, the street is wet, the sprinklers are off and it is raining:

P(sprinklers = F, rain = T, street = wet, lawn = wet, soil = dry, roses = okay)
= P(roses = okay | soil = dry) * P(soil = dry | lawn = wet) * P(lawn = wet | rain = T, sprinklers = F) * P(street = wet | rain = T, sprinklers = F) * P(sprinklers = F) * P(rain = T)

Substituting the appropriate numbers from the tables, we get 0.2 * 0.1 * 1.0 * 1.0 * 0.6 * 0.7 = 0.0084 as the probability of this scenario.

Figure 1. Mary's Bayesian network, with a conditional probability table at each node; the legible entries include P(rain = T) = 0.7 and P(roses = okay | soil = moist) = 0.7.

Conditional tables at each node must contain all possible combinations of assignments. For space purposes, we give a reduced collection; we can compute the missing information by taking 1 minus the appropriate probabilities given.

There are two types of computations performed with Bayesian networks: belief updating and belief revision. Belief updating concerns the computation of probabilities over random variables, while belief revision concerns finding the maximally probable global assignment.
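The joint-probability computation above can be checked directly in code. This is a minimal sketch that hard-codes only the table entries quoted in the text (a full implementation would need every table in Figure 1); the second value, 0.2646, is the joint probability of the same assignment with the soil moist instead of dry, which reappears below as the most-probable explanation.

```python
# Minimal sketch: joint probabilities in Mary's network, using only the
# conditional-probability entries recoverable from the text and Figure 1.
p_rain = 0.7                     # P(rain = T)
p_no_sprinklers = 0.6            # P(sprinklers = F)
p_street_wet = 1.0               # P(street = wet | rain = T, sprinklers = F)
p_lawn_wet = 1.0                 # P(lawn = wet | rain = T, sprinklers = F)
p_soil_dry = 0.1                 # P(soil = dry | lawn = wet)
p_soil_moist = 1.0 - p_soil_dry  # P(soil = moist | lawn = wet) = 0.9
p_roses_ok_dry = 0.2             # P(roses = okay | soil = dry)
p_roses_ok_moist = 0.7           # P(roses = okay | soil = moist)

# Chain-rule factorisation of the scenario in the text (soil = dry):
p_dry_world = (p_roses_ok_dry * p_soil_dry * p_lawn_wet
               * p_street_wet * p_no_sprinklers * p_rain)

# The same assignment with soil = moist (roses still okay):
p_moist_world = (p_roses_ok_moist * p_soil_moist * p_lawn_wet
                 * p_street_wet * p_no_sprinklers * p_rain)

print(p_dry_world)    # approximately 0.0084
print(p_moist_world)  # approximately 0.2646
```

Note how each factor conditions a node only on its parents; that locality is exactly what the conditional independencies buy us.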
Belief revision can be used for modeling explanatory/diagnostic tasks. Basically, some evidence or observation is given to us, and our task is to come up with a set of hypotheses that together constitute the most satisfactory explanation/interpretation of the evidence at hand. This process has also been considered abductive reasoning in one form or another. More formally, if W is the set of all r.v.s in our given Bayesian network and e is our given evidence (i.e., e represents a set of instantiations made on a subset of W), any complete instantiation of all the r.v.s in W which is consistent with e will be called an explanation or interpretation of e. Our problem is to find an explanation w* such that

P(w* | e) = max over all w of P(w | e).

w* is called the "most-probable explanation." Note that to compute the most-probable explanation for e, it is sufficient to determine the complete assignment consistent with e whose joint probability is maximal; in this case, P(e) is simply a constant factor. Intuitively, we can think of the non-evidence r.v.s in W as possible hypotheses for e.

Belief updating, on the other hand, is interested only in the marginal probabilities of a subset of r.v.s given the evidence. Typically, it is used to determine the best instantiation of a single r.v. given the evidence.

Example - Belief Revision
Let the evidence e be the observation that the roses are okay. Our goal is now to determine the assignment to all the r.v.s which maximizes P(w | e). We only need to consider assignments where the r.v. roses is set to okay and maximize P(w). The best solution then becomes:

P(sprinklers = F, rain = T, street = wet, lawn = wet, soil = moist, roses = okay) = 0.2646

Example - Belief Updating
Let the evidence e be the observation that the roses are okay, and let the condition of our lawn be our focus. Our goal is now to determine the probability that our lawn is either wet or dry given the observation.
The solution then becomes the pair of marginals P(lawn = dry | roses = okay) and P(lawn = wet | roses = okay), obtained by summing the joint probability over all complete assignments consistent with each case and normalizing by P(roses = okay).

Although performing belief revision and updating (even by approximating methods) has been shown to be NP-hard, there exist special network topologies, such as polytrees, for which certain algorithms perform well. Various approaches to reasoning with Bayesian networks include A* search, stochastic simulation, integer programming, and message passing.

One main domain in which Bayesian belief networks have been successful is the medical sciences. Belief networks have been used extensively to deliver correct medical diagnoses and even recommend treatments for ambiguous symptoms. Other areas of application include user modeling and user interfaces, natural language interpretation, planning, vision, robotics, data mining, and other domains as addressed in the March 1995 special issue of the journal Communications of the ACM.

The following presents an analysis of two successful products based on Bayesian belief networks - the HEPAR project (Bobrowski, 1992) and TETRAD II of Carnegie Mellon University.

The HEPAR project (Bobrowski, 1992; Onisko, Druzdzel and Wasyluk, 1997) is an application of a graphical probabilistic model to the problem of diagnosis of liver disorders, conducted by the Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. This project was undertaken in collaboration with physicians at the Medical Center of Postgraduate Education. The system includes a database of patient records from the Gastroenterological Clinic of the Institute of Food and Feeding in Warsaw, thoroughly maintained and extended with new cases. The database consisted of 570 patient records, each of which is described by 119 features (binary, denoting the presence or absence of a feature, or continuous, expressing the value of a feature) and one of 16 classes of liver disorders (Onisko, Druzdzel and Wasyluk, 1997).
The features can be divided conceptually into three groups - symptoms and findings volunteered by the patient, objective evidence observed by the physician, and results of laboratory tests. Initial work of Onisko, Druzdzel and Wasyluk (1997) reduced the number of features from the 119 encoded in the database to 40 by eliminating those features that had many missing values. They elicited the structure of dependencies among these 40 features relying on a human expert's opinion as to which features have the highest diagnostic value. Subsequently, the parameters of the expert-constructed network, i.e. the prior and conditional probabilities of the total of 41 variables, were learned from the HEPAR data. To evaluate the classification accuracy of the model, a standard test was performed in which a fraction of the database was used to learn the network parameters and the remainder of the records was used to test the network's predictions. Onisko, Druzdzel and Wasyluk (1997) reported that in over 36% of the cases, the most likely disorder was the correct diagnosis. In over 74% of the cases, the correct diagnosis was among the first four most likely disorders. One unique feature of this model is the provision to query the system with partial observations, a rare feature in classification systems. This decision support system has been welcomed as a useful interactive diagnostic and training tool.

TETRAD II is a causal discovery program developed in Carnegie Mellon University's Department of Philosophy. It is an application to a database containing information on 204 US colleges, collected by US News and World Report magazine for the purpose of college ranking. This work focuses on possible causes of low freshman retention in US colleges.
TETRAD II finds that student retention is directly related to the average test scores and high school class standing of the incoming freshmen (Druzdzel and Glymour, 1994). Simple linear regression applied to test scores, class standing and retention data showed that test scores and class standing explain 52.6% of the variance in freshman retention rate and 62.5% of the variance in graduation rate (test scores alone explain 50.5% and 62.0% respectively). This work then predicts one of the most effective ways of improving student retention in an individual college, which is by increasing the college's selectivity. According to Druzdzel and Glymour (1994), high selectivity will lead to higher quality of the incoming students and, effectively, to a higher retention rate.

Other useful results from the literature are:

On-line student modeling for coached problem solving using Bayesian networks, which describes the student modeling component of Andes, an Intelligent Tutoring System for Newtonian physics. Andes uses a Bayesian network to do long-term knowledge assessment, plan recognition and prediction of students' actions during problem solving.

Qualitative Verbal Explanations in Bayesian Belief Networks by Marek J. Druzdzel, which concerns the application of belief networks in systems that interact directly with human users. These systems require effective user interfaces to bridge the gap between probabilistic models and human intuitive approaches to modeling uncertainty.
The work consists of several methods for automatic generation of qualitative verbal explanations in systems based on the Bayesian belief network model.

Prepared by Janett Williams & Felix Akinladejo, August 2006

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 11
Agents that Learn

Learning can be described as a relatively permanent change that occurs in behaviour as a result of experience. This ability must be part of any system that would claim to possess general intelligence. An unchanging intellect would be a contradiction in our world of symbols and interpretation. In real life, people want computers that can adapt to problems and improve their performance when given feedback. These desires led to a relatively new field - machine learning.

Machine learning
Machine learning is an area of artificial intelligence involving the development of techniques that allow computers to "learn". More specifically, machine learning is a method for creating computer programs by the analysis of data sets, rather than by the intuition of engineers. Machine learning algorithms are organized into a taxonomy based on the desired outcome of the algorithm. Common algorithm types include:

• supervised learning --- where the algorithm generates a function that maps inputs to desired outputs.
• unsupervised learning --- where the algorithm generates a model for a set of inputs.
• reinforcement learning --- where the algorithm learns a policy of how to act given an observation of the world.
• learning to learn --- where the algorithm learns its own inductive bias based on previous experience.
The analysis of machine learning algorithms is a branch of statistics known as learning theory.

Areas in Machine Learning
• Decision-theoretic techniques, which study general-purpose learning systems that start with little initial structure.
• Symbolic concept-oriented learning, which arose from studies in psychology and human learning. It uses symbolic and logical representations in place of numerical representations.
• Knowledge-intensive learning, which investigates learning methods based upon knowledge-rich systems.

Learning Strategies
i. Learning by taking advice
ii. Learning in problem solving
iii. Learning from examples
iv. Explanation-based learning
v. Discovery

Methods
Machine learning depends on how much a set of specific examples is condensed to a few governing properties. The form of these properties and how they are interpreted depend largely on the methods of data analysis used. All methods, however, have something "human" about them: they try to mimic biological or psychological properties of humans. They also involve a "search" through a large, complicated space of solutions.

Biological
These methods are based on:
i. A network of connectivity originally mimicking the synaptic connections in a brain - the connectionist model
ii. DNA-coded concepts inheritable through generations - genetic algorithms

Psychological
These methods try to mimic how a human would analyse data. This means that the concepts that are fed to the methods, how the methods manipulate them, and what is given back are more "human" - inductive learning.

Connectionist Model
There is one naturally occurring model when building an intelligent machine - the human brain. The idea to simulate the functioning of the brain on the computer was thus born. This new model was not meant to duplicate the operation of the brain, but rather to receive inspiration from known facts about how the brain works.
It is characterized by having:
i. A large number of very simple neuron-like processing elements, called units, with a large number of weighted connections between them. The weights on the connections encode the knowledge of the network.
ii. Each unit having a state/activity level that is determined by the input received from other units.
iii. Highly parallel, distributed control.
iv. An emphasis on learning internal representations automatically.

This model led to the development of neural networks.

Overview of neural networks
A neural net is an artificial representation of the human brain that tries to simulate its learning process. The brain consists of more than a billion neural cells that process information. Each cell is like a processor, and all cells interact for parallel processing.

Figure: Structure of a neural cell (neuron) in the human brain, showing the cell body and dendrites.

Information is transported between neurons as electrical stimulation along the dendrites. This biological structure and its functions are simplified in order to simulate neural nets. There are many different types of neural nets, but they all have similar components: neural nets consist of neurons and connections between them. The connections carry weights (the threshold in the brain), which modulate the electrical information (activating or inhibiting, as in the brain).

Types of Network Structure

Figure: Structure of a neuron in a neural net.

Recurrent Structure
These networks have internal states stored in the activation levels of the units. This means that computation is not orderly: the links can form arbitrary topologies, and the networks can become unstable, oscillate, or exhibit chaotic behaviour. Learning is difficult because it takes a long time to compute a stable output when given some input values. On the other hand, such networks can implement complex agent designs and can model systems with state.
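A single unit of the kind described in points i and ii can be sketched in a few lines of Python. This is an illustrative sketch only: the sigmoid squashing function and the example weights are common choices, not taken from the notes.

```python
import math

# One neuron-like unit: its activity level is a function of the weighted
# inputs received from other units. The sigmoid is an illustrative choice
# of squashing function.
def unit_activation(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# A layer of units working in parallel: each unit sees the same inputs but
# has its own weights, and the network's "knowledge" lives in those weights.
def layer_activations(inputs, weight_rows, biases):
    return [unit_activation(inputs, row, b)
            for row, b in zip(weight_rows, biases)]

# Example: two inputs feeding two units (weights chosen arbitrarily).
outputs = layer_activations([1.0, 0.0],
                            [[2.0, -1.0], [0.5, 0.5]],
                            [0.0, -0.25])
```

Training would then consist of adjusting the weights and biases until the layer's outputs match the desired ones, which is the subject of the learning algorithms mentioned below.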
These networks call for the use of symmetric weights.

Feed-Forward Structure
The links in these networks are unidirectional and there are no cycles; technically speaking, a feed-forward network is a directed acyclic graph. They compute a function of the input values that depends on the weight settings - they have no internal state other than the weights themselves. These networks use the data in a training set to induce learning.

To develop a neural network product, follow these guidelines:
- Collect data
- Separate data into training sets and test sets
- Define a network structure
- Select a learning algorithm
- Set parameter values
- Transform data into network inputs
- Train the network until the desired accuracy is achieved
- Test the network

Usefulness of Neural Networks
Neural networks are being constructed to solve problems that cannot be solved using conventional algorithms. Different domains of application are:
- pattern association
- pattern classification
- regularity detection
- image processing
- speech analysis
- optimization problems
- robot steering
- processing of inaccurate or incomplete inputs
- quality assurance
- stock market forecasting
- etc.

There are many different neural net types, each with special properties; hence each problem domain has its own net type.

Prepared by Janett Williams, August 2006

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 12
Natural Language - Agents that Communicate

Introduction
Speech acts allow agents to communicate useful information to each other. Agents can:
• Inform each other
• Query each other
• Answer questions
• Request or command each other
• Promise to do things
• Acknowledge requests
• Share experiences and feelings

The act of communicating any of the above speech acts can be broken down into a few basic steps.
From the speaker's perspective, first the intention to communicate occurs, then the words are generated, which in turn are finally spoken. From the receiver's position, the receiver perceives, analyzes, disambiguates, and finally incorporates what the speaker has spoken.

The generation of sentences is based on the grammar rules that are being used. The grammar first introduced in the first half of Chapter 22 of the text for the creation of statements can be augmented to take into account the semantics of sentences, in order to generate proper English sentences. The grammar can also be extended to include command and acknowledgement speech acts.

The analysis of a sentence by the listener is accomplished by parsing the sentence to get the syntactic interpretation, which leads to the semantic interpretation. Semantic interpretation converts the linguistic syntax into logical forms useful for knowledge representation within the agent. Intermediate "quasi-logical" forms (QLF) make the transition between syntactic and semantic interpretation easier. Ambiguities can arise from the analysis of a sentence; the disambiguation process attempts to eliminate ambiguities by looking at the accepted models of interpretation and their probabilities.

Parsing
Parsing is, in the first instance, a syntactic activity. The words in the sentence are recognized as being of a particular type (called parts of speech).
Examples of parts of speech are:
Nouns: cat, dog, program
Pronouns: she, he, who
Verbs: read, write, am
Adjectives: red, green, large
Adverbs: slowly, longingly, carefully
Prepositions: in, at, beside
Determiners: a, the
Conjunctions: and, or

The words, once recognized, can be combined into phrases:
• Noun phrases: the fat dog; Saint Joseph's University; Bill Clinton
• Verb phrases: sleeps; loves ice cream; hit the dog with a stick
• Prepositional phrases: beside the lake; in the park
• Relative clauses: that chased the mouse that ate the malt that lay in the house that Jack built

The sentence "John loves Mary" is parsed as

sentence( noun_phrase("John"),
          verb_phrase( verb("loves"),
                       noun_phrase("Mary") ) )

Semantics
Using a lexicon (which you already needed for the parts of speech of words), the denotations of words and the parse can be translated into statements about the internal model of the world, so that the sentence "John loves Mary" could be represented by

loves("John", "Mary")

Definite Clause Grammars (DCGs)
Grammars use production rules that describe how to construct complex objects from simpler ones (and, conversely, how to break up complex things into simpler ones). Rather than give a precise definition of a grammar at this point, we give an example of a definite clause grammar which parses many simple English sentences.

Sentence --> Noun_Phrase, Verb_Phrase
Noun_Phrase --> Determiner, Noun
Verb_Phrase --> Verb
Verb_Phrase --> Verb, Noun_Phrase
Noun --> cat
Noun --> mouse
Noun --> dream
Determiner --> the
Determiner --> a
Verb --> sleeps
Verb --> sleep
Verb --> chases

One can see that this grammar parses sentences such as

The mouse sleeps.
The cat chases a mouse.

Unfortunately it also parses

The mouse sleep.

which is ungrammatical, and

The mouse chases a cat.
The dream sleeps.

the first of which is implausible and the second of which, while poetic, is possibly meaningless. What can we do about this?
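The over-generation is easy to see by implementing the unaugmented grammar directly. Below is a minimal recursive-descent recognizer in Python (an illustrative sketch; the lexicon follows the DCG above, with "mouse" included since the example sentences use it):

```python
# Lexicon for the unaugmented grammar: word -> part of speech.
LEXICON = {
    "Det":  {"the", "a"},
    "Noun": {"cat", "mouse", "dream"},
    "Verb": {"sleeps", "sleep", "chases"},
}

def is_cat(word, cat):
    return word in LEXICON[cat]

def parse_np(words, i):
    # Noun_Phrase --> Determiner, Noun
    if i + 1 < len(words) and is_cat(words[i], "Det") and is_cat(words[i + 1], "Noun"):
        return i + 2          # position after the noun phrase
    return None

def parse_vp(words, i):
    # Verb_Phrase --> Verb | Verb, Noun_Phrase
    if i < len(words) and is_cat(words[i], "Verb"):
        j = parse_np(words, i + 1)
        return j if j is not None else i + 1
    return None

def accepts(sentence):
    # Sentence --> Noun_Phrase, Verb_Phrase (must consume all words)
    words = sentence.lower().split()
    j = parse_np(words, 0)
    return j is not None and parse_vp(words, j) == len(words)
```

Running `accepts("the mouse sleep")` returns True, confirming that the bare grammar enforces no number agreement at all; that is exactly the gap the augmented rules close.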
Well, the ungrammatical sentences can be handled by augmented DCGs.

Augmented DCGs
Once again we illustrate the idea with an example, in which we modify the earlier DCG to cope with the two problem sentences.

Sentence(Type, Number) --> Noun_Phrase(Number), Verb_Phrase(Type, Number)
Noun_Phrase(Number) --> Determiner(Number), Noun(Number)
Verb_Phrase(intransitive, Number) --> Verb(intransitive, Number)
Verb_Phrase(transitive, Number) --> Verb(transitive, Number), Noun_Phrase(Any)
Noun(sing) --> cat
Noun(plur) --> cats
Noun(sing) --> mouse
Noun(plur) --> mice
Noun(sing) --> dream
Determiner(_) --> the
Determiner(sing) --> a
Verb(intransitive, sing) --> sleeps
Verb(intransitive, plur) --> sleep
Verb(transitive, sing) --> chases

We still get "the dream sleeps" and "a mouse chases the cats", but some of the nonsense is gone. (Note that if we introduce pronouns we will have a problem with "I sleeps", but you can guess that one uses person as well as number to deal with that.)

Semantics
How do we deal with nonsense sentences such as the famous Chomsky sentence?

Curious green dreams sleep furiously.

The first step is to assign a semantics to each sentence. With this in hand, we might be able to determine, on the basis of some knowledge that the system has been given, that dreams are not coloured and don't sleep. Once again we can augment the DCG, but this time in a different way.

Sentence(intransitive, Number, Semantics) -->
    Noun_Phrase(Number, SemN),
    Verb(intransitive, Number, SemV),
    {Semantics = SemV(SemN)}
Sentence(transitive, Number, Semantics) -->
    Noun_Phrase(Number, SemN1),
    Verb(transitive, Number, SemV),
    Noun_Phrase(_, SemN2),
    {Semantics = SemV(SemN1, SemN2)}.
Noun_Phrase(Number, Semantics) -->
    Determiner(Number, SemD),
    Noun(Number, SemN),
    {Semantics = SemD(SemN)}
Noun(sing, cat) --> cat
Noun(plur, cats) --> cats
Noun(sing, mouse) --> mouse
Noun(plur, mice) --> mice
Noun(sing, dream) --> dream
Determiner(Number, the) --> the
Determiner(sing, a) --> a
Verb(intransitive, sing, sleep) --> sleeps
Verb(intransitive, plur, sleep) --> sleep
Verb(transitive, sing, chase) --> chases

Putting it all together
If you think this looks a little like Prolog, you are on the right track; in fact most Prologs support DCGs. If we were to assemble all the pieces above, or something like them, we could translate sentences to their semantics represented as logical expressions and then try to prove them.

Tutorial
(i) Consider the following sentence:

Someone walked slowly to the Safeway

and the following set of context-free rewrite rules, which give the grammatical categories of the individual words of the sentence:

Pronoun -> "someone"
V -> "walked"
Adv -> "slowly"
Prep -> "to"
Det -> "the"
Noun -> "Safeway"

Which of the following three sets of rewrite rules, when added to the above rules, yield context-free grammars that can generate the above sentence? Justify.

A:
S -> NP VP
NP -> Pronoun
NP -> Det Noun
NP -> Noun
VP -> VP PP
VP -> VP Adv Adv
VP -> V
PP -> Prep NP

B:
S -> NP VP
NP -> Pronoun
NP -> Noun
NP -> Det NP
VP -> V Vmod
Vmod -> Adv Vmod
Vmod -> Adv
Adv -> PP
PP -> Prep NP

C:
S -> NP VP
NP -> Pronoun
NP -> Det Noun
NP -> Noun
VP -> V Adv
Adv -> Adv Adv
Adv -> PP
PP -> Prep NP

(ii) Write down at least one other English sentence generated by Grammar (B) above. It should be significantly different from the above sentence, and should be at least six words long.
Do not use any of the words from the above sentence; instead add grammatical rules of your own using the same form given for the sentence. State this form.

(iii) Construct a parse tree for the sentence developed in (ii).

Prepared by Janett Williams, August 2006

UNIVERSITY OF TECHNOLOGY, JAMAICA
FACULTY OF ENGINEERING & COMPUTING
SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
ARTIFICIAL INTELLIGENCE

LECTURE 13
AI - The Present & Future

AI -- The Present
Definition: AI is the science of making machines imitate human thinking and behavior.

Common Applications of AI in Business
• EXPERT SYSTEMS
• NEURAL NETWORKS
• GENETIC ALGORITHMS
• INTELLIGENT AGENTS

70% of the top 500 companies use AI as part of decision support.

Expert Systems
• A rules-based system that attempts to duplicate human reasoning and logical deduction.
• Captures expertise from a human expert and applies it to a problem.

Components of an Expert System
• KNOWLEDGE BASE - stores the domain expertise.
• INFERENCE ENGINE - processes the domain facts to reach a conclusion.

Neural Networks
• A matrix of simple processors that work together to solve certain types of problems.
• A neural network can be 'trained' to solve certain problems through many repetitions - these often involve some kind of pattern recognition.

Some applications of neural networks:
• Distinguishing different chemical compounds
• Detecting anomalies in human tissue that may signify disease
• Reading handwriting
• Detecting fraud in credit card use

Genetic Algorithms
• Programs that mimic evolutionary, 'survival-of-the-fittest' processes to generate increasingly better solutions to a problem.
• Genetic algorithms produce several generations of solutions, choosing the best of the current set for each new generation.
• Genetic algorithms involve a program re-writing parts of its own code in a form of learning.
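The generate-select-recombine loop described above can be sketched in a few lines. Everything here is illustrative: the fitness function (counting 1-bits, the classic "one-max" toy problem), the population size, the mutation rate and all other parameter values are assumptions, not from the notes.

```python
import random

# Toy genetic algorithm: each generation, keep the fitter half of the
# population ("survival of the fittest") and refill it with mutated
# one-point crossovers of the surviving parents.
def evolve(fitness, length=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)    # best of the current set first
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)     # one-point crossover
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Illustrative fitness: number of 1-bits in the string.
best = evolve(fitness=sum)
```

Because the fitter half survives intact each generation, the best solution found never gets worse; crossover and mutation supply the variation from which better solutions are selected.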
Intelligent Agents
• Agents are programs that can do repetitive work for you.
• An intelligent agent can learn about you and about the task it is performing.

An intelligent agent can perform tasks like:
• Acting as a personal electronic assistant to collect, send, and prioritize electronic information such as e-mail.
• Finding and retrieving information from a database.
• Finding and retrieving information across networks.

Data Mining
• Normally the structure of a database determines the patterns that can be elicited from the data.
• Data mining involves joining dissimilar databases (with different structures) and using AI techniques to try to detect new patterns that may have emerged.

AI -- The Future?
What About Real Machine Intelligence?
• Do AI processes model human cognitive processes?
• Is 'sentiency' a result of the mechanism? Or is something 'extra' needed?
• What are the criteria for intelligence? How would we recognize it?

We had better find answers to these questions, and quickly!

Three factors that will lead to real AI:
• Continued development of CAD/CAM, expert system and genetic engineering programming techniques.
• Ever faster computer processors, memory and communications.
• Reverse engineering of the human brain.

What About BioInformatics?

Prepared by Janett Williams, August 2006