
CS7637 KBAI Spring 2017: Project 2 Reflection (jgeorge84) 1

Project 2 Reflection

Joseph George (jgeorge84)

Georgia Tech

Author Note

OMSCS: 7637 KBAI Spring 2017 Project 2 Reflection


Project 2 Reflection

Project 1 approach

The approach in Project 1 was only verbal. The verbal approach already had some attributes

and ontology as part of the problem set. This allowed us to find the differences between the

attributes in the verbal description of the problem, build a semantic network and then arrive at the

correct answer. The agent uses the verbal description of the problem to identify the number of

objects in the source and destination for row 1 of a 2x2 matrix.

1. If the shape is the same in both A and B, it sets the transform attribute

transformed-shape to "same".

2. If there is one object in both the source and the destination and no objects are

missing, it sets the transform attribute Deleted to 0.

3. If the angle attribute indicates a rotation, it calculates the rotation.

4. By iterating over the other attributes, it identifies which ones have changed and adds

them to the transform dictionary.

This transformation is used to create the Semantic Network and is stored with the original

representation of A and B in a RavensSemanticNetwork object.

A similar transform is produced between C and each of the answer choices using the same algorithm.

The transformation between C and each of the answer choices is compared with the original

transformation. The answer choice whose transformation matches the original is picked as the correct answer.
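
The verbal procedure above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `create_transform` and `pick_answer`, the attribute keys `shape` and `angle`, and the use of a single object per figure are all assumptions made for the example.

```python
from typing import Dict, Optional

Attributes = Dict[str, str]  # assumed form, e.g. {"shape": "square", "angle": "45"}


def create_transform(src: Attributes, dst: Optional[Attributes]) -> Dict[str, object]:
    """Describe how one object changes between two figures (hypothetical helper)."""
    if dst is None:  # the object has no counterpart in the destination figure
        return {"Deleted": 1}
    transform: Dict[str, object] = {"Deleted": 0}
    # Step 1: record whether the shape is unchanged.
    transform["transformed-shape"] = (
        "same" if src.get("shape") == dst.get("shape") else dst.get("shape")
    )
    # Step 3: derive the rotation from the angle attribute when both have one.
    if "angle" in src and "angle" in dst:
        transform["Rotation"] = (int(dst["angle"]) - int(src["angle"])) % 360
    # Step 4: record every other attribute whose value changed.
    for key in sorted(set(src) | set(dst)):
        if key in ("shape", "angle"):
            continue
        if src.get(key) != dst.get(key):
            transform[key] = (src.get(key), dst.get(key))
    return transform


def pick_answer(a: Attributes, b: Attributes, c: Attributes,
                choices: Dict[int, Attributes]) -> Optional[int]:
    """Return the first choice whose C-to-choice transform equals the A-to-B transform."""
    target = create_transform(a, b)
    for number, choice in choices.items():
        if create_transform(c, choice) == target:
            return number
    return None
```

Comparing whole dictionaries for equality keeps the sketch short; an agent that must tolerate partial matches would need a scoring function instead.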

The main challenge with the verbal approach in Project 1 was interpreting

attribute values that were themselves verbal, such as

Size: medium

Should we arbitrarily assign numerical values to the values of size? Using the verbal

approach allowed us to calculate the number of objects and which ones were deleted in subsequent

figures easily.

Changes in problem set and Agent Reasoning and Representation

The problem set itself in Project 1 had fewer computations because it was a 2x2 matrix.

The transformations in Project 1 were calculated along the horizontal and vertical axes but not the

diagonal axes. The number of answer choices was also smaller - 6 rather than 8 in the current

problem set. In contrast, for project 2, the RPM is a 3x3 matrix where the transformations have to

be computed along the horizontal, vertical and diagonal axes, validated by comparing with other

rows and columns and the answer choices before choosing the correct answer.

The agent for this phase uses a visual approach for reasoning. The purely visual approach

was chosen to verify the approach and strategy to solve Ravens Progressive Matrices (RPM) in

subsequent phases. One approach is to extract information from the figures, build a semantic

network and apply the transformations to choose the correct answer. Another approach is the

Gestalt method where the representation of the figures is not extracted but they are visually

compared. Another approach is a hybrid approach which uses the Gestalt method for solving some

and extracts some features to solve the others and as a last resort takes a brute force number of

pixels comparison approach when all else fails. This strategy mirrors human cognition:

we take a first pass at solving problems using superficial similarity

and deep dive to extract and compare particular features only when necessary.
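
The hybrid strategy described above amounts to trying methods in order of cost and falling back only when the earlier ones give no answer. The sketch below illustrates that control flow under several assumptions: figures are binary grids, `pixel_count_fallback` and `expected_dark_pixels` are hypothetical names, and returning -1 for "no answer" is an assumed convention rather than something stated in this report.

```python
from typing import Callable, Dict, List, Optional, Sequence

Grid = List[List[int]]  # assumed encoding: 1 = dark pixel, 0 = light pixel
Strategy = Callable[[Dict], Optional[int]]


def dark_pixels(grid: Grid) -> int:
    """Count the dark pixels in a binary grid."""
    return sum(sum(row) for row in grid)


def pixel_count_fallback(problem: Dict) -> Optional[int]:
    """Last resort: pick the choice whose dark-pixel count is closest to an estimate."""
    target = problem["expected_dark_pixels"]  # hypothetical estimate from the given figures
    choices: Dict[int, Grid] = problem["choices"]
    if not choices:
        return None
    return min(choices, key=lambda n: abs(dark_pixels(choices[n]) - target))


def solve(problem: Dict, strategies: Sequence[Strategy]) -> int:
    """Try each strategy in order; the first one that commits to an answer wins."""
    for strategy in strategies:
        answer = strategy(problem)
        if answer is not None:
            return answer
    return -1  # assumed convention for "no answer"
```

A Gestalt-style comparison and a feature-extraction step would be passed to `solve` ahead of `pixel_count_fallback`, each returning `None` when it cannot decide.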

This approach worked for 70% of the Basic and 60% of the Test and Challenge

problems in Problem Set C, such as the elementary one shown below:


Figure 1 Basic Problem C-01 from assignment

However, the agent fails on slightly more complicated problems in which objects move along the x or

y axes.

Figure 2 Basic problem C-07 from assignment

Two classes are used: RavensTransform holds the transformation for an object within a Ravens

figure, and RavensSemanticNetwork holds the transformation for each object in a RavensFigure.

A transformation step attribute was also added to the RavensTransform class to identify the

transformation.

The identified attributes are:

Original-shape:
Transformed-shape:
Position:
Rotation:
Reflection:
Deleted:
StartCol:
StartRow:
Width:
Height:

The last four attributes are used to extract shapes from the RavensFigure. The shape

however is not identified in this phase (and it is probably not necessary to identify the shape) as

long as the correspondence between the shapes in figures can be identified. The current algorithm

extracts shapes that are filled. For shapes that are not filled, the inverse of the figure is taken to

extract the shape.
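
One way to obtain StartCol, StartRow, Width and Height for each filled shape is sketched below. The report does not say how shapes are separated, so the 4-connected flood fill used here is a stand-in technique chosen for illustration, and the binary grid encoding and the name `extract_shapes` are assumptions.

```python
from collections import deque
from typing import Dict, List

Grid = List[List[int]]  # assumed encoding: 1 = dark pixel, 0 = light pixel


def extract_shapes(grid: Grid) -> List[Dict[str, int]]:
    """Return one bounding box per 4-connected group of dark pixels."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    shapes: List[Dict[str, int]] = []
    for r in range(height):
        for c in range(width):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited dark pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                y, x = queue.popleft()
                min_r, max_r = min(min_r, y), max(max_r, y)
                min_c, max_c = min(min_c, x), max(max_c, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            shapes.append({
                "StartCol": min_c,
                "StartRow": min_r,
                "Width": max_c - min_c + 1,
                "Height": max_r - min_r + 1,
            })
    return shapes
```

Because only the bounding boxes are returned, the shape itself is never named, which matches the observation that correspondence matters more than identification.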

Backward compatibility was not a significant consideration for the design of the visual

reasoning because the fundamental approach had changed. The program was run against tests in

Basic, Test and Challenge Problem Set B to ensure that the same agent could solve them. The

biggest challenges faced were extracting shapes and finding correspondence of the shapes among

the figures.

Agent capabilities and relation to KBAI

The agent uses the concept of Semantic Networks to solve the problems. Semantic

networks represent the objects as nodes and the relationship between the objects as links. Labels

on the links provide description of the relationships. In the case of a 3 x 3 Ravens Progressive

Matrix, A, B, C, D, E, F, G, H can be represented using a Semantic Network consisting of nodes,

links and labels. The labels also identify the transformation that occurs for each object within a

Ravens Figure. For each answer choice #, we represent H:# and use the knowledge

representation (Semantic Network) from the previous transformations (T1 to T5) to choose the correct

answer for #. This method of problem solving is called Represent and Reason: the problem is

represented using Semantic Networks and the correct answer is identified by reasoning over the

representation.
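
A semantic network of this kind can be held in a very small data structure: figures are nodes, and each link carries a label describing the transformation. The sketch below is only an illustration; the class name `SemanticNetwork`, its methods, and the label contents are hypothetical and are not the RavensSemanticNetwork class from the project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Label = Dict[str, object]  # assumed form, e.g. {"Position": "right", "Deleted": 0}


@dataclass
class SemanticNetwork:
    """Figures are nodes; each (source, destination) link carries a label."""
    links: Dict[Tuple[str, str], Label] = field(default_factory=dict)

    def add_link(self, src: str, dst: str, label: Label) -> None:
        self.links[(src, dst)] = label

    def matching_choices(self, reference: Tuple[str, str],
                         candidates: Dict[str, Label]) -> List[str]:
        """Return the choices whose H:# label equals the label on a known link."""
        expected = self.links[reference]
        return [name for name, label in candidates.items() if label == expected]
```

Reasoning over the representation then reduces to looking up a known link, such as G to H, and keeping the answer choices whose H:# label agrees with it.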

The agent uses the verbal description of the problem to identify the number of objects in

the source and destination for row 1 of a 2x2 matrix. Let us take Fig 2 as an example:

1. It extracts the shapes from A and B and forms a correspondence based on the shape

and the coordinates. The circle is named Shape1. A logicalXor indicates that the

shape has moved between A and B. The Euclidean distance between the two coordinates is

calculated.

2. It performs Step 1 for B to C. This provides the complete horizontal transformation.

3. It performs Step 1 for A to D and then D to G, providing the complete vertical

transformation. The horizontal transformation has a higher weight than the vertical

transformation.

4. It performs Step 1 for A to E, identifying the diagonal transformation, which has the

least weight.

5. It then compares H with the answer choices and scores each one using the

transformations in Steps 1 to 4.

6. The answer choice with the highest score is set as the answer.
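
The steps above can be sketched as follows, assuming binary grids for the figures. The report only states that horizontal outweighs vertical and that diagonal weighs least, so the numeric weights, the tolerance, the use of centroids as "coordinates", and all function names here are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

Grid = List[List[int]]  # assumed encoding: 1 = dark pixel, 0 = light pixel

# Assumed values; only the ordering horizontal > vertical > diagonal comes from the text.
WEIGHTS = {"horizontal": 3.0, "vertical": 2.0, "diagonal": 1.0}


def logical_xor(a: Grid, b: Grid) -> Grid:
    """Pixelwise XOR; any 1 in the result means the two figures differ there."""
    return [[pa ^ pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def centroid(grid: Grid) -> Tuple[float, float]:
    """Mean (x, y) position of the dark pixels."""
    points = [(x, y) for y, row in enumerate(grid) for x, p in enumerate(row) if p]
    if not points:
        raise ValueError("grid contains no dark pixels")
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def displacement(a: Grid, b: Grid) -> float:
    """Euclidean distance between the centroids of two figures."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return math.hypot(bx - ax, by - ay)


def score_choice(expected: Dict[str, float], observed: Dict[str, float],
                 tolerance: float = 0.5) -> float:
    """Add the weight of every axis on which the observed displacement matches."""
    return sum(
        weight
        for axis, weight in WEIGHTS.items()
        if axis in expected and axis in observed
        and abs(expected[axis] - observed[axis]) <= tolerance
    )
```

With these weights, a choice that agrees only on the horizontal axis still outranks one that agrees only on the vertical or diagonal axis, which is the ordering the steps describe.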

Refactoring for phase 3 includes correcting the overfits and moving createTransform and

createSemanticNetwork to separate classes/modules.


Figure 3 Agent structure


Agent performance and limitations

The current version of the agent has difficulty with positional attributes and has given me some

insights into human cognition. As humans, we are able to identify the different objects and

immediately see the difference. Take Basic Problem C-12 as an example.

Figure 4 Basic Problem C-12

As humans, we can see that the correct answer is 8. My agent fails in this case because it

is unable to identify cases where H and the answer choice are the same. It appears to pick the

first answer that has four white squares without taking the orientation of the black squares into

consideration.

Evaluation of agent's results

Currently, the agent handles cases where there is an affine transformation. It has trouble

identifying the solution for cases where there are overlaps. Part of the reason could be that the

algorithm for extracting shapes is not robust enough. A contour tracing algorithm might work better

in this case.

Figure 5 Challenge Problem C-09

When there are three objects, it is difficult to find the correspondence, especially when two of

them are the same type.


Human cognition and agent

The behavior of the agent models human cognition to a large degree. The figure below from

Carpenter, Just and Shell 1990 gives a representation of human cognition based on experiments

done by the researchers at CMU.

Figure 6 Carpenter, Just, Shell 1990

The four boxes represent the four stages in the agent. Perceptual analysis is replaced by encoding

the verbal representation and finding correspondences. The createTransform method is analogous

to row-wise rule induction and generalization, while compareSemanticNetwork

compares the original transformation (the transform across the first row) with the generated

transformation between H and each of the answer choices. Of course, all of the objects and their

representations are held in the working memory of the program.

Apart from the structure of the agent being similar to human cognition, there are similarities to

how the agent behaves because we have modeled it on our understanding of human cognition. That

said, it is remarkable how quickly the brain is able to visually identify the similarities and

differences between the figures in RPM. Using different weighted systems for the cognition will

allow agents to identify similarities or differences among images that may not be very evident to

a human observer.

References

Goel, Ashok. (2015). Geometry, Drawings, Visual Thinking and Imagery: Towards a Visual

Turing Test of Machine Intelligence, Proceedings of the 29th Association for the

Advancement of Artificial Intelligence Conference Workshop on Beyond the Turing Test.

Austin, Texas.

Raven's Progressive Matrices, Wikipedia, retrieved on Jan 20, 2017 from

https://en.wikipedia.org/wiki/Raven's_Progressive_Matrices.

Carpenter, P., Just, M., and Shell, P. (1990). What one intelligence test measures: a theoretical

account of the processing in the Raven Progressive Matrix Test. Psychological Review, 97(3),

404-431.

Winston, P. H. (1993). Artificial Intelligence. Addison Wesley.
