Xia Lin
Dagobert Soergel
Gary Marchionini
weights of the winning node and its neighboring nodes. The winning node is selected based on the Euclidean distance of an input vector and the weight vectors in the N-dimensional space. Let

    x(t) = ( x1(t), x2(t), ..., xN(t) )

[Fig. 1 (not reproduced): the input nodes and the grid.]

The computational algorithm of the feature map consists of two basic procedures: selecting a winning node and updating the winning node.

1 A neighborhood may be defined, for example, as all nodes within a square, a diamond or a circle around the winning node.
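As a concrete illustration of winner selection (our own sketch, not code from the paper; the function and variable names are ours), the node closest to an input vector under the Euclidean distance can be found as:

```python
import numpy as np

def winning_node(x, weights):
    """Return the index of the node whose weight vector is closest
    (by Euclidean distance) to the input vector x.

    x       : (N,) input vector
    weights : (n_nodes, N) weight vectors, one per grid node
    """
    distances = np.linalg.norm(weights - x, axis=1)
    return int(np.argmin(distances))

# Toy check: with three nodes in a 2-dimensional input space,
# the node whose weights are nearest to x must win.
w = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(winning_node(np.array([0.9, 0.1]), w))  # → 1
```

Note that `argmin` breaks ties by taking the lowest node index; any consistent tie-breaking rule would do.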
geographic neighborhoods: at time t, α(t) is a small constant within a given neighborhood, and 0 elsewhere. A more recent version of the feature map adapts the Gaussian function to describe the neighborhood and the α(t). One of the success stories of current neural network approaches is to apply nonlinear, continuous functions such as the sigmoid function and the Gaussian function to the learning process. The Gaussian function is supposed to describe a more natural mapping so as to help the algorithm converge in a more stable manner. The application described in this article adopts this version of the algorithm and uses the gain term:

    α_t(k, s) = A1 * exp(-t/A2) * exp(-t*d(k,s)/A3)    (3)

where d(k,s) is the Euclidean distance between the node k and the winning node s in the two-dimensional grid. A1, A2 and A3 are three parameters. In the formula, the first Gaussian function controls the weight update speed and the second Gaussian function defines the neighborhood shrinkage. Thus, α_t(k, s) unifies the gain term and the neighborhood definition. We emphasize that the gain term depends on the time t and the distance of the node k from the winning node s.

It has been demonstrated that the feature map learning algorithm can perform relatively well in noise (Lippmann, 1987); hence its application potential is enormous. According to Kohonen (1990), the feature map learning algorithm has been applied to practical problems in various areas such as speech recognition, robot arm control, industrial process control, optimization problems and analysis of semantic information. Early examples and applications of feature maps mostly demonstrated that the feature map can preserve metric relationships and the topology of input patterns. The application of the feature map to cognitive information processing (Ritter & Kohonen, 1989) particularly offers the possibility "to create in an unsupervised process topographical representations of semantic, nonmetric relationships implicit in linguistic data" (p. 243). Following this vein, a self-organizing semantic map for literature is constructed.

3. A SELF-ORGANIZING SEMANTIC MAP FOR AI LITERATURE

Kohonen's self-organizing map algorithm is applied to a collection of documents from the LISA database. The collection includes all 140 titles indexed by the descriptor "Artificial Intelligence" in SilverPlatter's CD-ROM version of the LISA database (Jan. 1969 - March 1990). An inverted index of the 140 titles was produced. After excluding some stop words, the most frequently occurring words (information, artificial, intelligence) and the least frequently occurring words (those appearing no more than 3 times), 25 words (more exactly, 25 roots, as, for example, librarian, library and libraries are counted as one) were retained; they were used as the set of indexing words for this collection (Fig. 2). These 25 words and 140 titles form a 140x25 matrix of documents vs. indexing words, where each row is a document vector and each column corresponds to a word. A document vector contains 1 in a given column if the corresponding word occurs in the document title and 0 otherwise.

[Fig. 2 (matrix not reproduced): statistical data of the 25 indexing terms. The diagonal numbers are the frequencies of each word; the off-diagonal numbers are co-occurrence data.]

The document vectors are used as input to train a feature map of 25 features and 140 nodes.2 Following Kohonen's algorithm:
- each feature corresponds to a selected word;
- each document is an input vector;
- each node on the map is associated with a vector of weights, which are assigned small random values at the beginning of the training;
- during the training process, a document is randomly selected and the node closest to it in N-dimensional vector distance is chosen; the weight of that node and the weights of its neighboring nodes are adjusted accordingly;
- the training process proceeds iteratively for a certain number of training cycles; it stops when the map converges.

When the training process is completed, each document is mapped to a grid node.

2 140 nodes are chosen because the 10 by 14 rectangle seems to fit the screen best. It is a coincidence that the number equals the number of titles used here.

Fig. 3 is the semantic map of documents obtained after 2500 training cycles. The map contains very rich information. The numbers on the map indicate the number of documents mapped to their corresponding nodes. These numbers collectively reveal the distribution of the documents on the map.

The map is divided into concept areas (more precisely, word areas). The area to which a node belongs is determined as follows: compare the node to every unit vector (containing only a single word), and assign to the node the closest unit vector (or the word it represents). Because of the cooperative feature of the neighboring nodes in the map, the areas are assured of continuity.
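The training procedure described above, combined with the Gaussian gain term of Eq. (3), might be sketched as follows. This is our own illustration: the binary document matrix is randomly generated stand-in data, and the values chosen for A1, A2 and A3 are assumptions, since they are not given here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 140x25 binary matrix of document vectors
# (1 if the word occurs in the title, 0 otherwise).
docs = (rng.random((140, 25)) < 0.15).astype(float)

rows, cols = 10, 14                              # the 10-by-14 grid
n_nodes, n_feat = rows * cols, docs.shape[1]
weights = rng.random((n_nodes, n_feat)) * 0.1    # small random weights

# Grid coordinates of each node, used for the 2-D distance d(k, s).
coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)

# Illustrative values for the three parameters of Eq. (3) (assumed).
A1, A2, A3 = 0.5, 2000.0, 500.0

def gain(t, s):
    """Eq. (3): alpha_t(k, s) for every node k, given winning node s."""
    d = np.linalg.norm(coords - coords[s], axis=1)  # distance on the grid
    return A1 * np.exp(-t / A2) * np.exp(-t * d / A3)

for t in range(2500):                            # 2500 training cycles
    x = docs[rng.integers(len(docs))]            # random document
    s = int(np.argmin(np.linalg.norm(weights - x, axis=1)))  # winner
    alpha = gain(t, s)
    weights += alpha[:, None] * (x - weights)    # move nodes toward x

# Finally, each document is mapped to its closest grid node.
mapped = [int(np.argmin(np.linalg.norm(weights - d, axis=1))) for d in docs]
```

At t = 0 the second exponential equals 1 everywhere, so the whole map moves; as t grows, both the gain and the effective neighborhood shrink, which is exactly the unification of gain term and neighborhood definition that Eq. (3) expresses.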
264
The areas are also labeled as follows: compare each unit vector to every node and label the winning node with the word corresponding to the unit vector. When two words fall into the same area, the two areas are merged. As we can see, this happens when the two words often co-occur, representing a term (see, for instance, the areas of "natural language" and "machine learning" on the map).

The size of the areas corresponds to the frequencies of occurrence of the words. For example, the largest area in the map is "(expert) system", which, by frequency counts, appears most often in the collection (Fig. 3). "Retriev" appears second most frequently in the collection, so its area is the second largest. However, as the mapping is a nonlinear one, the sizes and the frequencies do not have a linear relationship. In fact, there is a tendency to make the rich richer and the poor poorer; that is, the large areas look even larger, and the small ones sometimes simply disappear.

The relationship between frequencies of word co-occurrence and neighboring areas is also apparent. "Intelligent" is next to "system" and "retriev" because it more often co-occurs with these two terms than with any others; "search" is next to "librari" and "system" for the same reason.

Fig. 3. A self-organizing semantic map of AI literature. 140 documents from the LISA database are used as input to produce the map. The areas on the map are automatically generated; their relative positions, neighbors, and sizes are determined by the input data. The numbers on the map represent the number of documents mapped to each node.

Other properties of the map are particularly related to the nature of the document space. Firstly, it seems that the final weight patterns could be a good associative indexing scheme for the collection. This scheme assigns different weights to words in a title and adds some other words that are highly associated with the words in the title. For example, for the title "On the use of knowledge-based processing in automatic text retrieval", the initial index (the input data) is an equal weight indexing:

    knowledge (1.0), process (1.0), retrieval (1.0) and use (1.0).

The trained semantic map maps the title to a node with unequal weights:

    retrieval (.99), knowledge (.7), process (.4) and use (.4).

Some other words not used in the title are also picked up:

    learning (.18), online (.12), system (.08), comput (.07) and problem (.04).

This indexing scheme suggests that retrieval should be the most important concept for this document (for clustering purposes), followed by the concepts of knowledge, process and use, in their order of importance. Learning can be another major concept in the document, although it does not appear in the title. Learning is picked up because it often co-occurs with retrieval and knowledge in this collection. Similar results have been reported by Mozer (1984) and Belew (1986), although they applied different connectionist models to information retrieval. Mozer called this property "compensation for inconsistency and incompleteness in the indexing of documents."

The second property is related to descriptors in a thesaurus. Descriptors are subject indicators assigned to documents by humans. The meaning of a descriptor often varies from database to database and from one field to another. For example, what does the descriptor "Artificial Intelligence" mean in the LISA database? One may explain it by one's own knowledge about the descriptor, or one may use the definition provided by the LISA database creators. The self-organizing semantic map seems to provide another explanation for the descriptor, that is, to define the descriptor from the documents indexed by it. Using words representing the areas and neighboring areas on the map, we may construct the following list:

Artificial Intelligence
(terms formed by combining words mapped onto one area on the map)
- expert system
- machine learning
- natural language processing
- ...
(terms formed by combining words in neighboring areas on the map)
- intelligent system
- intelligent retrieval
- knowledge retrieval
- knowledge system
- computerized search
- ...
(terms formed by three neighboring areas)
- online search applications
- ...

The list describes to some extent the content of the descriptor "Artificial Intelligence" as used in this database. Thus, the list can be used as a complement to the given definition of the descriptor so as to provide the user a different view of the descriptor.
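The associative-indexing property described above, reading unequal word weights off the winning node of a trained map, can be sketched as follows. This is our own illustration: the vocabulary and the trained weights are random stand-ins, not the actual LISA map.

```python
import numpy as np

# Hypothetical trained state: 140 nodes, 25 indexing words.
words = [f"word{i}" for i in range(25)]     # stand-in vocabulary
rng = np.random.default_rng(1)
weights = rng.random((140, 25))             # stand-in trained weights

def associative_index(doc_vec, weights, words, top=5):
    """Map a binary title vector to its winning node and return that
    node's largest weights as an unequal-weight index for the document."""
    s = int(np.argmin(np.linalg.norm(weights - doc_vec, axis=1)))
    order = np.argsort(weights[s])[::-1][:top]
    return [(words[j], round(float(weights[s, j]), 2)) for j in order]

doc = np.zeros(25)
doc[[3, 7, 11]] = 1.0                       # a title using three words
print(associative_index(doc, weights, words))
```

With real trained weights, the highest-weighted words would typically include the title words with unequal weights, plus co-occurring words the title never mentions, as in the "knowledge-based processing" example above.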
Alternatively, rather than construct the list, we may provide the map to the user so that the user himself can look for possible combinations of words to form terms relevant to his request. This allows flexibility and encourages the user to search for terms that best describe his request.

Finally, several interesting points arise when comparing the self-organizing semantic map with Doyle's semantic road map. 1) Doyle's map was basically a demonstration, rather than a […] browsing and selection of documents from the document space. […]

4. A PROTOTYPE SYSTEM

A prototype system in HyperCard has been designed using a semantic map as an interface. The system is designed to accept downloaded bibliographical data and automatically add links among the data for easy browsing. The first card of the system contains a self-organizing semantic map trained by the […]

[Screen image (not reproduced): a prototype card showing the semantic map beside a record with TITLE "Development of the CODER system: a testbed for artificial intelligence methods in information retrieval", SOURCE "Information-Processing-and-Management, 341-366, illus.", and KEYWORDS including Artificial-intelligence, Information-retrieval, Computerised-subject-indexing and Subject-indexing.]
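As an illustration of the kind of automatic linking the prototype performs on downloaded bibliographical data (our own sketch; the record fields and helper names are hypothetical, not from the HyperCard prototype), word links can be built by indexing every title word to the cards that contain it:

```python
from collections import defaultdict

# Hypothetical downloaded records; the field names are illustrative.
records = [
    {"title": "Expert systems in online searching", "source": "LISA"},
    {"title": "Knowledge based systems for retrieval", "source": "LISA"},
    {"title": "Natural language searching online", "source": "LISA"},
]

# Word links: index every word to the cards containing it, so that
# clicking a word can jump to another card with the same word.
word_links = defaultdict(set)
for card, rec in enumerate(records):
    for word in rec["title"].lower().split():
        word_links[word].add(card)

def jump_targets(card, word):
    """Cards reachable from `card` by clicking `word` (excluding itself)."""
    return sorted(word_links[word.lower()] - {card})

print(jump_targets(0, "online"))  # → [2]
```

Item links and index links could be built the same way, keyed on whole field values (authors, titles, sources, descriptors) instead of single words.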
The map becomes a more efficient navigation tool when it is used together with other links provided by the system. There are four types of links in the system. There are word links such that one can click on any word in the card to jump to a card with the same word. There are item links such that by highlighting and selecting an item (an item is a string of words; it can be one or several authors, a complete title or a partial title, any part of a source, one or several descriptors, etc.), one can jump to a card with the same item. There are index links that the user can use to go back and forth between full bibliographic records and the alphabetical lists of authors, titles, sources, or keywords. There are also associative links that function on document citations (they are not available for this set of data since the downloaded materials do not include citations of each document). All these links are available on every full record card. Thus, a typical search in the system works like this: a user first clicks a node on the map. The system displays a set of titles. The user then selects a title from the selection window and browses through several cards following word links, item links, index links or their combinations. If he feels disoriented, he can choose to go back to the map. He can then study the markers indicating the first and the last card viewed, and draw a rectangle somewhere between the two markers. This time it is likely that titles related to his interest will appear in the selection window.

The system is intended as a front-end to an online bibliographic retrieval system. The design goal is to conceptualize an information retrieval approach which uses traditional search techniques as information filters and the semantic map as a browsing aid to support ordering, linking, and browsing information gathered by the filters. One of the fundamental issues of information retrieval is searching for compromises between precision and recall. It is generally desirable to have high precision and high recall, although in reality, increasing precision often means decreasing recall and vice versa. Developing retrieval techniques that will yield high recall and high precision is desirable. A more productive approach, however, seems to be to enhance post-processing of the retrieved set, such as providing links and semantic maps to the retrieved results of a query. If such value-adding processes allow the user to easily identify relevant documents from a large retrieved set, queries that produce low-precision/high-recall results will become more acceptable. For example, the query "artificial intelligence" is not considered a useful query in the current LISA database without combining it with other queries. By downloading the retrieved set of the query to this system, however, we have provided the user an environment in which he may feel comfortable to find any documents from the set of documents retrieved by the simple, high recall-oriented query "artificial intelligence".

5. FUTURE RESEARCH

Development of artificial neural network technology is expected to have a great impact on information retrieval (Doszkocs, Reggia & Lin, 1990). This article introduces a neural network's self-organizing learning algorithm and demonstrates its potential use for information retrieval. The self-organizing semantic map produced by the algorithm has several promising features:

- It maps a high-dimensional document space to a two-dimensional map while maintaining document inter-relationships as faithfully as possible.
- It visualizes the underlying structure of a document space by showing geographic areas of major concepts in the document space and the document distribution over the concept areas. Properties related to geographic concepts such as areas, locations, sizes and neighbors all reflect the statistical patterns of the document space.
- It results in an indexing scheme (weights) that shows the potential to compensate for incompleteness of a simple document representation (title word indexing).

Potentials of the self-organizing semantic map are not limited to these features. Further research along the main themes brought up by the semantic map will shed light on the design of an interface that will make underlying documents visible to the user.

Dimension reduction. The main theme of the semantic map is dimension reduction. It is clear that the document space is a high-dimensional space. Ways to reduce the high dimensionality are not new (Sammon, 1969; Small & Griffith, 1974; Siedlecki, 1988). A recent example of a dimension reduction technique is latent semantic indexing (Furnas et al., 1988). Latent semantic indexing uses the estimation of latent structure to construct a semantic space that has a much smaller dimensionality than that of the original document-term vectors. In the semantic space, terms and documents that are closely associated are placed near one another. Thus, the semantic space helps define relevant neighborhoods for information retrieval. An immediate connection of the two dimension reduction techniques, latent semantic indexing and the self-organizing semantic map described here, is using the semantic map on top of latent semantic indexing. While latent semantic indexing reduces the dimension of a document space to a manageable range around 100, the semantic map extends the result to a two-dimensional map for a visual display. Latent semantic indexing applies to a document space before any retrieval is attempted. The semantic map applies to the semantic space after a query is received and thus provides various views on the semantic space.

A question related to the dimension issue is what the dimensions for the document space are. The semantic map reported here is based on automated generation of dimensions from word stem frequencies as appearing in titles. Alternatively, the document space can be defined using assigned descriptors as dimensions. Applying the self-organizing learning algorithm to such a space may produce a more meaningful map as there is some intellectual work involved in descriptor assignment before the algorithm is applied.

There is another way to define a document space, that is, the extended vector model (Salton & McGill, 1983). One of the
advantages of neural network algorithms is the ability to aggregate various information in a meaningful way (Doszkocs et al., 1990). When we apply the learning algorithm to the extended vector model that describes document relationships by words, descriptors, and citation links, we are likely to improve further the quality of the semantic map.

Information visualization. The second main theme of the semantic map is to visualize information for information retrieval. There is increasing interest in information visualization for information retrieval. Crouch (1986) provided an overview of the graphical techniques used for visual display for information retrieval. The techniques he reviewed included clustering, multidimensional scaling, iconic representation, and his own component scale drawing. As he pointed out, most of these techniques were not appropriate in an interactive environment. The self-organizing semantic map can be considered as another technique for visual display of information for information retrieval. It has several advantages over the techniques mentioned above. It has the same objective as those described by Crouch in terms of representing the document space two-dimensionally. Moreover, it seems to provide richer information on the display. It creates a visual effect through geographic features of the areas on the map and preserves certain clustering and hierarchical structures of the document space. It keeps its simplicity even when the number of documents is increased. Most importantly, it has the potential to be used interactively, as we can expect that the map will be generated instantly in a parallel computing environment.

One of the central issues concerning visual display for information retrieval involves "creating the ability to spark understanding, insight, imagination, and creativity through the use of graphic representations and arrangements" (Veith, 1988, p. 213). The prototype system demonstrates that the semantic map may help create such an ability, although this needs to be tested experimentally. One test will be using the prototype system in a teaching environment. For example, we can ask students in a class that requires a term paper to search and download 200-300 document citations related to their term papers into the system. Each student can have one system that contains his/her own search results. The students are then asked to consult the system frequently when they are writing the papers throughout the semester. In the end, we can examine whether the semantic map has any influence on how the paper is organized, what documents are cited, and where the cited documents appear in the semantic map.

[…] algorithms. Rather than beginning with maps generated by the system, we can consider maps generated by people. We have designed an experiment to let subjects generate similar semantic maps. Subjects are given the same titles that we used to train the self-organizing semantic map and a large (table-size) grid. Each title is on a small card which can be placed, up to three together, in a node on the grid. The task given to subjects is to put the cards on the grid based on their perceived document similarities. It is emphasized that titles can be put on any locations of the grid, and that relative distances among documents are more important than the locations. Subjects are told that the purpose of such a map is to make browsing and selection of documents from the map easier. From the results of several pilot subjects, we found both similarities and differences between the maps generated by the algorithm and by subjects. There were also some similarities between the processes of map generation. The full experiment is currently underway and the analysis of the results will be reported in Lin (in progress).

A further comparison of the maps generated by the algorithm and by people is to use the maps in a retrieval setting. We have planned another experiment where subjects will be randomly assigned to one map or another to perform the same retrieval task. The task will require quick browsing and selection of documents from the map. This experiment will examine how subjects will take advantage of the semantic map to perform the retrieval task. It will test the usefulness and comparability of the human-generated maps and the self-organizing semantic map. The results of this experiment will also be reported in Lin (in progress). We believe that through a series of investigations, we should better understand the construction and the properties of the self-organizing semantic map, by which we can produce an interface to make underlying information visible to the user.

Acknowledgments. James Reggia and Tamas Doszkocs made useful comments and suggestions on an early version of this paper (by the first author). Their suggestions and insights are gratefully appreciated.
REFERENCES

Belew, R. K. (1986). Adaptive information retrieval: machine learning in associative networks (Doctoral dissertation, University of Michigan, 1986).

Carpenter, G. A., & Grossberg, S. (1986). Neural dynamics of category learning and recognition: attention, memory consolidation, and amnesia. In S. Grossberg (Ed.), Adaptive Brain (pp. 238-286). New York: Elsevier.

Crouch, D. B. (1986). The visual display of information in an information retrieval environment. SIGIR '86, 58-67.

Doszkocs, T., Reggia, J., & Lin, X. (1990). Connectionist models and information retrieval. Annual Review of Information Science and Technology, 25, 208-260.

Doyle, L. B. (1961). Semantic road maps for literature searchers. Journal of the ACM, 8, 553-578.

Furnas, G. W., Deerwester, S., Dumais, S. T., Landauer, T. K., Harshman, R. A., Streeter, L. A., & Lochbaum, K. E. (1988). Information retrieval using a singular value decomposition model of latent semantic structure. In 11th International Conference on R&D in Information Retrieval (pp. 465-480). Grenoble, France: ACM Press.

Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59-69.

Kohonen, T. (1989). Self-organization and associative memory (3rd ed.). Berlin: Springer-Verlag.

Kohonen, T. (1990). Some practical aspects of self-organizing maps. In IJCNN-90-WASH-DC, January 15-19, 1990, Washington, D.C., Volume II, Application Track (pp. 253-256). Hillsdale, NJ: Lawrence Erlbaum Associates.

Kwok, K. L. (1990). Application of neural network to information retrieval. In IJCNN-90-WASH-DC, International Joint Conference on Neural Networks, January 15-19, 1990, Washington, D.C., Volume II, Application Track (pp. 623-626). Hillsdale, NJ: Lawrence Erlbaum Associates.

Lin, X. (in progress). Self-organizing semantic maps as graphical interfaces for information retrieval. Doctoral dissertation, University of Maryland at College Park.

Lippmann, R. P. (1987). An introduction to computing with neural nets. IEEE ASSP Magazine, April 1987, 4-22.

Mann, R., & Haykin, S. (1990). A parallel implementation of Kohonen feature maps on the Warp systolic computer. In IJCNN-90-WASH-DC, January 15-19, 1990, Washington, D.C., Volume II, Application Track (pp. 84-87). Hillsdale, NJ: Lawrence Erlbaum Associates.

Personnaz, L., Guyon, I., & Dreyfus, G. (1986). Neural network design for efficient information retrieval. In E. Bienenstock et al. (Eds.), Disordered Systems and Biological Organization. Berlin: Springer-Verlag.

Ritter, H., & Kohonen, T. (1989). Self-organizing semantic maps. Biological Cybernetics, 61, 241-254.

Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536.

Salton, G., & McGill, M. J. (1983). Introduction to modern information retrieval. New York, NY: McGraw-Hill.

Sammon, J. W. (1969). A nonlinear mapping for data structure analysis. IEEE Transactions on Computers, 18(5), 401-409.

Shepherd, G. (1979). Synaptic organization of the brain. England: Oxford University Press.

Siedlecki, W., Siedlecki, K., & Sklansky, J. (1988). An overview of mapping techniques for exploratory pattern analysis. Pattern Recognition, 21(5), 411-429.

Small, H., & Griffith, B. C. (1974). The structure of scientific literatures I: identifying and graphing specialties. Science Studies, 4, 17-40.

Veith, R. H. (1988). Visual information systems: the power of graphics and video. Boston, MA: G. K. Hall & Co.

White, H. D., & Griffith, B. C. (1980). Author cocitation: a literature measure of intellectual structure. Journal of the American Society for Information Science, 31(3), 163-171.