
Neural Networks 22 (2009) 247–257

Contents lists available at ScienceDirect

Neural Networks
journal homepage: www.elsevier.com/locate/neunet

2009 Special Issue

Language and cognition


Leonid Perlovsky ∗
Harvard University, SEAS, Cambridge, USA
The AF Research Laboratory, Sensors Directorate, Hanscom AFB, USA

Article history: Received 5 January 2009; received in revised form 9 March 2009; accepted 16 March 2009.

Keywords: Language; Cognition; Emotions; Knowledge instinct; Dynamic logic; Mind; Heterarchy; Hierarchy

Abstract: What is the role of language in cognition? Do we think with words, or do we use words to communicate decisions already made? The paper briefly reviews ideas in this area since the 1950s. Then we discuss mechanisms of cognition, recent neuroscience experiments, and corresponding mathematical models. These models are interpreted in terms of a biological drive for cognition. Based on the Grossberg–Levine theory of drives and emotions, we identify specific emotions associated with the need for cognition. We demonstrate an engineering application of the developed technique, which significantly improves detection of patterns in noise over the previous state of the art. The developed mathematical models are extended toward language. Then we consider possible brain–mind mechanisms of interaction between language and cognition. A mathematical analysis imposes restrictions on possible mechanisms. The proposed model resolves some long-standing language–cognition issues: how the mind learns correct associations between words and objects among an astronomical number of possible associations; why kids can talk about almost everything but cannot act like adults, and what exactly the brain–mind differences are; and why animals do not talk and think like people. Recent brain imaging experiments indicate support for the proposed model. We discuss future theoretical and experimental research.

Published by Elsevier Ltd

∗ Tel.: +1 781 377 1728; fax: +1 781 377 8984. E-mail address: leonid@deas.harvard.edu.
doi:10.1016/j.neunet.2009.03.007

1. Nativism, cognitivism, evolutionism

Complex innate mechanisms of the mind were not appreciated in the first half of the last century. The thinking of mathematicians and the intuitions of psychologists and linguists were dominated by logic. The mechanisms of logic considered were not much different for language and cognition; both were based on logical statements and rules. Even the fundamental Gödelian theory (Gödel, 1931/1994), establishing the deficiency of logic, did not move thinking about the mind away from logic.

Contemporary linguistic interest in the mind's mechanisms of language was initiated in the 1950s by Chomsky (1965). He identified the first mysteries about language that science had to resolve. "Poverty of stimulus" addressed the fact that the tremendous amount of knowledge needed to speak and understand language is learned by every child around the world, even in the absence of formal training. It seemed obvious to Chomsky that surrounding language cultures do not carry enough information for a child to learn language, unless specific language-learning mechanisms are inborn in the mind of every human being. This inborn mechanism should be specific enough for learning complex language grammars and still flexible enough so that a child of any ethnicity from any part of the world would learn whichever language is spoken around, even if he or she is raised on the other side of the globe. Chomsky called this inborn learning mechanism Universal Grammar and set out to discover its mechanisms. He emphasized the importance of syntax and thought that language learning is independent of cognition. This approach to language, based on innate mechanisms, is called nativism.

Chomsky and his school initially used the available mathematics of logical rules, similar to the rule systems of artificial intelligence. In 1981, Chomsky (1981) proposed a new mathematical paradigm in linguistics, rules and parameters. This was similar to model-based systems emerging in mathematical studies of cognition. Universal properties of language grammars were supposed to be modeled by parametric rules or models, and specific characteristics of the grammar of a particular language were fixed by parameters, which every kid could learn from limited exposure to the surrounding language. Another fundamental change in Chomsky's ideas (Chomsky, 1995) was called the minimalist program. It aimed at simplifying the rule structure of the mind's mechanism of language. Language was modeled in closer interaction with other mind mechanisms, closer to meaning, but stopped at an interface between language and meaning. Chomsky's linguistics still assumes that meanings appear independently from language. Logic is the main mathematical modeling mechanism.

Many linguists disagreed with the separation between language and cognition in Chomsky's theories. Cognitive linguistics emerged in the 1970s to unify language and cognition, and explain

creation of meanings. Cognitive linguistics rejected Chomsky's idea of a special module in the mind devoted to language. The knowledge of language is no different from the rest of cognition and is based on conceptual mechanisms; it is embodied and situated in the environment. Related research on construction grammar argues that language is not compositional: not all phrases are constructed from words using the same syntax rules while maintaining the same meanings; metaphors are good examples (Croft & Cruse, 2004; Evans & Green, 2006; Ungerer & Schmid, 2006). Cognitive linguistics so far has not led to a computational linguistic theory explaining how meanings are created, and its formal apparatus is dominated by logic.

Evolutionary linguistics emphasized that language evolved together with meanings. A fundamental property of language is that it is transferred from generation to generation, and language mechanisms are shaped by this process (Christiansen & Kirby, 2003; Hurford, 2008). Evolutionary linguistics, by simulating societies of communicating agents (Brighton, Smith, & Kirby, 2005), demonstrated the emergence of a compositional language.

2. Cognition, dynamic logic, and the knowledge instinct

Consider a seemingly simple experiment. Close your eyes and imagine an object in front of you. The imagined image is vague, not as crisp and clear as with opened eyes. As we open our eyes, the object becomes crisp and clear. It seems to occur momentarily, but actually it takes 1/5th of a second. This is a very long time for neural brain mechanisms – hundreds of thousands of neural interactions. Let us also note: with opened eyes we are not conscious of the initially vague imagination, nor of the entire 1/5th of a second; we are conscious only of the end of this process: a crisp, clear object in front of our eyes. The explanation of this experiment became simple after many years of research found out what goes on in the brain during this 1/5th of a second.

2.1. Instincts, emotions, concepts

Explaining this experiment requires us to consider mechanisms of concepts, instincts, and emotions. We perceive and understand the world around us due to the mechanism of concepts. Concepts are like internal models of objects and situations; this analogy is quite literal: e.g., during visual perception of an object, a concept-model of the object stored in memory projects an image (top-down signals) onto the visual cortex, which is matched there to an image projected from the retina (bottom-up signals; this simplified description will be refined later; see Grossberg (1988)).

The mechanism of concepts evolved for instinct satisfaction. The word instinct is not used currently in the psychological literature; the reason is that the notion of instinct was mixed up with instinctual behavior and other not very useful ideas. We use the word instinct to denote a simple inborn, non-adaptive mechanism described in Grossberg and Levine (1987). Instinct is a mechanism of an internal "sensor", which measures vital body parameters, such as blood pressure, and indicates to the brain when these parameters are out of safe range. This simplified description will be sufficient for our purposes; more details can be found in Gnadt and Grossberg (2008) and Grossberg and Seidman (2006) and the references therein. We have dozens of such sensors, measuring the sugar level in blood, body temperature, pressure at various parts, etc.

According to instinctual–emotional theory (Grossberg & Levine, 1987), communicating satisfaction or dissatisfaction of instinctual needs from instinctual parts of the brain to decision-making parts of the brain is performed by emotional neural signals. The word emotion refers to several neural mechanisms in the brain (Juslin & Västfjäll, 2008); in this paper we always refer to the mechanism connecting conceptual and instinctual brain regions. Perception and understanding of concept-models corresponding to objects or situations that can potentially satisfy an instinctual need receive preferential attention and processing resources in the mind.

Projection of top-down signals from a model to the visual cortex primes visual neurons, making them more receptive to matching bottom-up signals. This projection produces the imagination that we perceive with closed eyes, as in the closed–open eye experiment. Conscious perception occurs, as mentioned, after top-down and bottom-up signals match. The process of matching for a while presented difficulties to mathematical modeling, as discussed below.

2.2. Combinatorial complexity, logic, and dynamic logic

Perception and cognition abilities of computers still cannot compete with those of kids and animals. Most algorithms and neural networks suggested since the 1950s for modeling perception and cognition, as discussed in Perlovsky (2006a), faced the difficulty of combinatorial complexity (CC). Rule systems of artificial intelligence grew in complexity in the presence of variability: rules became contingent on other rules, and rule systems faced CC. Algorithms and neural networks designed for learning have to be trained to understand not only individual objects but also combinations of objects, and thus faced CC of training. Fuzzy systems required a fuzziness level to be set appropriately in different parts of the system; moreover, degrees of fuzziness vary in time, and an attempt to select efficient levels of fuzziness would itself lead to CC.

These CC difficulties were related to Gödelian limitations of logic; they were manifestations of logic inconsistency in finite systems (Perlovsky, 2000). Even approaches designed specifically to overcome logic limitations, such as fuzzy logic and neural networks, encountered logical steps in their operations: neural networks are trained using logical procedures (e.g. "this is a chair"), and fuzzy systems required a logical selection of the degree of fuzziness.

To overcome the limitations of logic, dynamic logic was proposed (Perlovsky, 2000, 2006a; Perlovsky & McManus, 1991). In the next section we summarize the mathematical description of dynamic logic; here we describe it conceptually. Whereas logic works with statements (e.g. "this is a chair"), dynamic logic is a process from vague to crisp, from vague statements, decisions, and plans to crisp ones. It can be viewed as fuzzy logic that automatically sets a degree of fuzziness corresponding to the accuracy of the learned models. Dynamic logic corresponds to the open–close eye experiment: initial states of models are vague. This experiment was recently performed in much more detail using brain imaging. Bar et al. (2006) used functional Magnetic Resonance Imaging (fMRI) to obtain high spatial resolution of processes in the brain, which they combined with magneto-encephalography (MEG), measurements of the magnetic field next to the head, which provided high temporal resolution of brain activity. Combining these two techniques, the experimenters were able to achieve high resolution of cognitive processes in space and time. Bar et al. concentrated on three brain areas: early visual cortex, the object recognition area (fusiform gyrus), and the object information semantic processing area (OFC). They demonstrated that the OFC is activated 130 ms after the visual cortex, but 50 ms before the object recognition area. This suggests that the OFC represents the cortical source of top-down facilitation in visual object recognition. This top-down facilitation was unconscious. In addition, they demonstrated that the imagined image generated by top-down signals, facilitated from the OFC to the cortex, is vague, similar to the closed–open eye experiment. Conscious perception of an object occurs when vague projections become crisp and match the crisp and clear image from the retina, and the object recognition area is activated.
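The vague-to-crisp process just described can be illustrated numerically. The sketch below is our own minimal illustration (not code from the paper), anticipating the Gaussian similarity measures of Section 2.3: two concept-models start with large variance (maximal vagueness), association variables apportion each bottom-up signal among the models, and each update sharpens the models, shrinking their variances as the fit improves.

```python
import math
import random

random.seed(0)
# Synthetic 1-D "retina": two objects hidden among uniform clutter.
data = ([random.gauss(-2.0, 0.3) for _ in range(50)] +
        [random.gauss(3.0, 0.3) for _ in range(50)] +
        [random.uniform(-6.0, 6.0) for _ in range(30)])

def gauss(x, m, s):
    """Gaussian partial similarity of signal x to a model with mean m, width s."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

# Two concept-models: arbitrary distinct initial means; large initial
# variance encodes the initial vagueness of knowledge.
means, sigmas, rates = [-4.0, 4.0], [5.0, 5.0], [0.5, 0.5]

for _ in range(100):
    # Association variables f(h|n): normalized similarities of datum n to model h.
    f = []
    for x in data:
        l = [rates[h] * gauss(x, means[h], sigmas[h]) for h in range(2)]
        tot = sum(l) + 1e-300
        f.append([lh / tot for lh in l])
    # Parameter updates (the stationary, fixed-point form of the dynamics);
    # vagueness (sigma) shrinks automatically as the models fit the data.
    for h in range(2):
        nh = sum(fn[h] for fn in f)
        means[h] = sum(fn[h] * x for fn, x in zip(f, data)) / nh
        sigmas[h] = max(0.05, math.sqrt(
            sum(fn[h] * (x - means[h]) ** 2 for fn, x in zip(f, data)) / nh))
        rates[h] = nh / len(data)

print("means: ", [round(m, 2) for m in means])
print("sigmas:", [round(s, 2) for s in sigmas])
```

Starting from nearly uniform association variables, the two models lock onto the two hidden objects and their widths collapse to a small fraction of the initial vagueness. A full modeling-fields system would also include a uniform clutter model and the activation and deactivation of dormant models discussed in Section 2.3.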
2.3. The knowledge instinct and neural modeling field theory

The process of matching concept-models in memory to bottom-up signals coming from sensory organs is necessary for perception; otherwise an organism will not be able to perceive its surroundings and will not be able to survive. Therefore humans and higher animals have an inborn drive to fit top-down and bottom-up signals. We call this mechanism the instinct for knowledge (Perlovsky, 2006a). This mechanism is similar to other instincts in that our mind has a sensor-like mechanism that measures a similarity between top-down and bottom-up signals, between concept-models and sensory percepts. Brain areas participating in the knowledge instinct were discussed in Levine and Perlovsky (2008). As discussed in that publication, biologists have considered similar mechanisms since the 1950s; without a mathematical formulation, however, its fundamental role in cognition was difficult to discern. All learning algorithms have some model of this instinct, maximizing correspondence between sensory input and an algorithm's internal structure (knowledge in a wide sense). According to the Grossberg and Levine (1987) instinct–emotion theory, satisfaction or dissatisfaction of every instinct is communicated to other brain areas by emotional neural signals. We feel these emotional signals as harmony or disharmony between our knowledge-models and the world. At lower layers of everyday object recognition these emotions are usually below the level of consciousness; at higher layers of abstract and general concepts this feeling of harmony or disharmony can be strong; as discussed in Perlovsky (2006b), it is a foundation of our higher mental abilities. We now summarize a mathematical theory combining the discussed mechanisms of cognition as interaction between top-down and bottom-up signals at a single layer in a multi-layer heterarchical system, following Perlovsky (2006a).

Neurons are enumerated by index n = 1, ..., N. These neurons receive bottom-up input signals, X(n), from lower layers in the processing heterarchy. The word heterarchy is used by many neural and cognitive scientists to designate that the mind is organized in an approximate hierarchy; this hierarchy is not exact, cross-layer interactions are abundant (Grossberg, 1988); we will use hierarchy for simplicity. X(n) is a field of bottom-up neuronal synapse activations, coming from neurons at a lower layer. Top-down, or priming, signals to these neurons are sent by concept-models, M_h(S_h, n); we enumerate models by index h = 1, ..., H. Each model is characterized by its parameters, S_h. Models represent signals in the following sense. Say, signal X(n) is coming from sensory neurons activated by object h, characterized by parameters S_h. These parameters may include position, orientation, or lighting of an object h. Model M_h(S_h, n) predicts the value X(n) of a signal at neuron n. For example, during visual perception, a neuron n in the visual cortex receives a signal X(n) from the retina and a priming signal M_h(S_h, n) from an object concept-model h. A neuron n is activated if both a bottom-up signal from lower-layer input and a top-down priming signal are strong. Various models compete for evidence in the bottom-up signals, while adapting their parameters for a better match as described below. This is a simplified description of perception. Models M_h specify a field of primed neurons {n}, hence the name for this modeling architecture, modeling fields.

The knowledge instinct maximizes a similarity measure between top-down and bottom-up signals,

L({X}, {M}) = ∏_{n∈N} ∑_{h∈H} r(h) l(n|h).  (1)

Here l(n|h) is a partial similarity of a bottom-up signal in pixel n given that it originated from concept-model h; the functional shape of l(n|h) often can be taken as a Gaussian function of X(n) with mean M_h(S_h, n). Partial similarities are normalized on objects (or concepts) h being definitely present, and coefficients r(h) estimate a probability of them actually being present. Similarity L accounts for all combinations of signals n coming from any model h, hence the huge number of items H^N in Eq. (1); this is a basic reason for the combinatorial complexity of most algorithms. From time to time a system forms a new concept-model, while retaining an old one as well; alternatively, old concepts are sometimes merged or eliminated. This requires a modification of the similarity measure (1); the reason is that more models always result in a better fit between the models and data. Therefore similarity (1) has to be reduced using a "skeptic penalty function", p(N, M), that grows with the number of models M; this growth is steeper for a smaller amount of data N.

The learning instinct demands maximizing the similarity L over model parameters S. Dynamic logic maximizes similarity L while matching the vagueness or fuzziness of similarity measures to the uncertainty of models. It starts with any unknown values of parameters S and defines association variables f(h|n),

f(h|n) = r(h) l(n|h) / ∑_{h'∈H} r(h') l(n|h').  (2)

The dynamic logic determining the Modeling Field (MF) dynamics is given by

df(h|n)/dt = f(h|n) ∑_{h'∈H} [δ_{hh'} − f(h'|n)] [∂ ln l(n|h')/∂M_{h'}] [∂M_{h'}/∂S_{h'}] dS_{h'}/dt,  (3)

dS_h/dt = ∑_{n∈N} f(h|n) [∂ ln l(n|h)/∂M_h] ∂M_h/∂S_h,  (4)

here

δ_{hh'} = 1 if h = h', 0 otherwise.  (5)

Initially, parameter values are not known, and the uncertainty of partial similarities is high (e.g., if l(n|h) is modeled by Gaussian functions, variances are high). So the fuzziness of the association variables is high. In the process of learning, models become more accurate and association variables more crisp, as the value of the similarity increases. The number of models is determined in the learning process. The system always keeps a store of dormant models, which are vague, have low r(h), and do not participate in the parameter fitting; only their parameters r(h) are updated. When r(h) exceeds a threshold, a model is activated; correspondingly, an active model is deactivated when its r(h) falls below the threshold. MF organization is similar to ART (Carpenter & Grossberg, 1987) in that it models interaction between bottom-up and top-down signals. It is different in that it fits all models in parallel.

The dynamic logic process always converges (Perlovsky, 2000); this is proven by demonstrating that at each time step in Eqs. (3) and (4) (as long as the bottom-up signals remain constant), the knowledge instinct (1) increases; thus dynamic logic and the knowledge instinct are mathematically equivalent.

2.4. Perception example

Here we illustrate the developed technique with an example described in Perlovsky (2006a), which demonstrates that the described theory can find patterns below noise about 100 times better, in terms of signal-to-noise ratio, than previous state-of-the-art algorithms. The reason for choosing such an example is to demonstrate, in a relatively simple way, that engineering algorithms based on the mind's cognitive mechanisms significantly exceed the capabilities of ad hoc algorithms (Fig. 1).

As exact pattern shapes are not known and depend on unknown parameters, these parameters should be found by fitting the
Fig. 1. Finding 'smile' and 'frown' patterns in noise, an example of dynamic logic operation: (a) true 'smile' and 'frown' patterns are shown without noise; (b) actual image available for recognition (signal is below noise, signal-to-noise ratio is between 1/2 and 1/4); (c) an initial fuzzy blob-model, the fuzziness corresponds to uncertainty of knowledge; (d) through (h) show improved models at various iteration stages (total of 22 iterations). Between stages (d) and (e) the algorithm tried to fit the data with more than one model and decided that it needs three blob-models to 'understand' the content of the data. There are several types of models: one uniform model describing noise (it is not shown) and a variable number of blob-models and parabolic models, whose number, location, and curvature are estimated from the data. Until about stage (g) the algorithm 'thought' in terms of simple blob-models; at (g) and beyond, the algorithm decided that it needs more complex parabolic models to describe the data. Iterations stopped at (h), when similarity (1) stopped increasing. This example is discussed in more detail in Linnehan et al. (2003).

pattern model to the data. At the same time it is not clear which subset of the data points should be selected for fitting. A previous state-of-the-art algorithm, multiple hypothesis testing (Singer, Sea, & Housewright, 1974), tries various subsets. In difficult cases, all combinations of subsets and models are exhaustively searched, leading to combinatorial complexity. In the current example the searched patterns are shown in Fig. 1(a) without noise, and in Fig. 1(b) with noise, as actually measured. A direct search through all combinations of models and data leads to a complexity of M^N = 10^5000. A search in parameter space yields less complexity. Each pattern is characterized by a 3-parameter parabolic shape plus a 4th parameter for the average signal strength. The image size is 100 × 100 points, and the true number of patterns is 3, which is not known. Therefore, at least 4 patterns should be fit to the data, to decide that 3 patterns fit best. Fitting 4 × 4 = 16 parameters to a 100 × 100 grid by brute-force testing would take about 10^40 to 10^42 operations, still a prohibitive computational complexity.

The models and conditional similarities for this case are described in detail in Linnehan et al. (2003): a uniform model for noise, Gaussian blobs for highly fuzzy, poorly resolved patterns, and parabolic models for the patterns of interest. The number of computer operations in this example was about 10^9. Thus, a problem that was not solvable due to CC becomes solvable using dynamic logic.

In this example dynamic logic performs better than the human visual system. This is understood to be due to the fact that the human visual system is optimized for a different type of images, not for parabolic shapes in noise.

The ability of dynamic logic to extract signals from strong noise and clutter has been used in many applications; we mention here an application to EEG signals (Kozma, Deming, Perlovsky, Levine, & Perlovsky, 2007). Potentially, EEG signals contain information about brain cognitive events; detecting these signals and estimating their parameters could be utilized to allow quadriplegics to move a computer cursor or steer their wheelchairs with their thoughts; or those playing computer games could control actions on the screen with their thoughts. The difficulty is that EEG signals are notoriously noisy. The referenced article describes a dynamic logic algorithm for extracting cognitively related events from EEG.

3. Extension to language

All linguistic theories, as reviewed at the beginning of the paper, are formulated as logical systems, and face combinatorial complexity. This is possibly why computers do not understand human language, and in particular, why Google, Yahoo, and other search engines, while being immensely useful, cause so much frustration to their users. Extension of dynamic logic to language promises to remedy the situation. Here we briefly summarize this extension following Perlovsky (2006c). The challenge in extending dynamic logic to language has been in substituting the derivatives in Eqs. (3) and (4) with equivalent procedures suitable for linguistic constructs, which are essentially discrete, non-differentiable structures. For example, consider the phrase "Leonid sits in a chair". A language learning procedure should be able to figure out that the gist of this phrase, its most essential part, is {sit, chair}; "Leonid" can be substituted by many other nouns, and "in", "a" are even more dispensable. The main idea of dynamic logic is learning sentence structures not by trying all possible combinations of words, but by taking a "derivative" of a phrase with respect to its constituent words. But of course the standard mathematical definition of a derivative is not applicable to this situation in principle. Language constructs are essentially discrete and non-differentiable.

A suitable derivative-like procedure was described in Perlovsky (2006c). Here we summarize it for a word–phrase layer, where bottom-up signals are comprised of words, top-down models are phrases, and these phrase-models are learned without combinatorial complexity. The bottom-up input data, X(n), in this "phrase-layer" MF system are word strings, for simplicity of a fixed length, S: X(n) = {w_{n+1}, w_{n+2}, ..., w_{n+S}}. Here w_n are words from a given dictionary of size K, W = {w_1, w_2, ..., w_K}, and n is the word position in a body of texts. A simple phrase-model often used in computational linguistics is "a bag of words"; that is, a model is a subset of words from the dictionary, without any order or rules of grammar,

M^L_h(S_h, n) = {w_{h,1}, w_{h,2}, ..., w_{h,S}}.  (6)

A superscript L here denotes a language model; the parameters of this model are its words, M^L_h(S_h, n) = S_h = {w_{h,1},

Fig. 2. Hierarchical integrated language–cognition MF system. "Heterarchy" refers to cross-layer connections (not shown) and to the consequence that the hierarchical structure is not logically strict, as may appear from the figure. At each layer in the hierarchy there are integrated language and cognition models (thick arrow). Similarities are integrated as products of language and cognition similarities. Initial models are fuzzy placeholders, so integration of language and cognition is sub-conscious. Association variables depend on both language and cognitive models and signals. Therefore language model learning helps cognitive model learning and vice versa. High-layer abstract cognitive concepts are grounded in abstract language concept similarities.

w_{h,2}, ..., w_{h,S}}. Language learning (traditionally called language acquisition in Chomskyan linguistics) consists in defining the models–concepts–phrases best characterizing the given body of texts in terms of a similarity measure.

Conditional partial similarities between a string of text, X(n), and a model M^L_h could be defined by the proportion of matches between the two sets, X(n) and M^L_h: l(n|h) = |X(n) ∩ M^L_h| / S. Thus, similarity (1) is defined, and it can be maximized over the unknown parameters of the system, {S_h}, that is, over the word contents of phrases. Maximization of this language similarity gives a mathematical formulation of the language instinct (Pinker, 1994). The language instinct mechanism is mathematically similar to the knowledge instinct; the main difference is that the language instinct maximizes similarity between language models and language data. Without dynamic logic, the computational complexity of similarity maximization becomes combinatorial, ∼K^(H·N·S); in other words, unsolvable.

The main idea of a dynamic-logic-like procedure is to start with many long "phrases", just sets of words from a given text corpus, and then, through a series of iterations, reduce them to a limited number of short phrases that represent the text corpus best in terms of the similarity measure. At each iteration, "phrases" are made shorter by deleting the words that least belong to them. This is an equivalent of "linguistic derivatives". To implement this procedure mathematically, a conditional similarity measure (a similarity between one word sequence, X(n), and one model, M^L_h) is defined as follows:

l(n|h) = (2π σ_h²)^(−S/2) exp{−0.5 ∑_s e(n, h, s)² / σ_h²}.  (7)

Here e(n, h, s) is a distance (measured in the number of words) between the middle of the word sequence X(n) and the closest occurrence of the word w_{h,s}; the sum is over words belonging to the phrase-model h. The search for the nearest word is limited to X(n) (S words), and e(n, h, s) falling outside this range can be replaced by (S/2 + 1). The variance, determining the vagueness of this similarity measure, is given similarly to the standard variance estimation in dynamic logic (Perlovsky, 2006a), with association variables f(h|n) defined as in (2),

σ_h² = ∑_n f(h|n) ∑_s e(n, h, s)² / N_h.  (8)

A measure of "belonging" of the word s to phrase h, φ(s|h), is defined as follows:

δ(h, s) = ∑_n f(h|n) e(n, h, s)² / N_h;  (9)

p(h|s) = (2π σ_h²)^(−1/2) exp{−0.5 δ(h, s) / σ_h²};
φ(s|h) = p(h|s) / ∑_{s'∈h} p(h|s').  (10)

Here δ(h, s) is an average distance of the word w_{h,s} from its phrase-model, h. A non-normalized measure of the belonging of word s to phrase h is given by p(h|s). It is used to define the probabilistic-like vague phrase contents, φ(s|h).

This is now used to define the dynamics of the word contents of the phrase-models in the dynamic logic process as follows. Consider a simple example: let us limit the problem to learning phrases of a fixed length, say, 5-word phrases. Start with a large value of S >> 5 (e.g. S = 50) and with arbitrary word contents of the phrases (6). In each iteration, compute Eqs. (7) through (10). Reduce S by 1: in each phrase-model eliminate the word with the minimal φ(s|h),

for each h, find s0 = argmin_s φ(s|h);  (11)

w_{h,s0} is the least "probable" word in model h and it is eliminated on this iteration. S is changed to S − 1. Continue iterations until S reaches the desired value of 5.

The dynamic logic procedure defined in this way results in learning phrase-models and accomplishes the goal of language acquisition without combinatorial complexity. The computational complexity is moderate, ∼N·H·K·S. This overcoming of CC is the major accomplishment of this procedure.

The procedure outlined in this section is general in that it is applicable to all higher layers in the mind hierarchy and to cognitive as well as language models. Similarly to learning here phrase-models composed of words, at higher layers the composition of larger chunks of text from smaller chunks can be learned. Similarly, cognitive models of higher layers, say images composed of objects, can be learned. This procedure provides a mathematical foundation for the perceptual symbol systems described in Barsalou (1999), in particular a mathematical procedure for his "simulators", which, as he suggested, support abilities for abstract concepts, propositions, and productivity, that is, an ability to produce limitless structures from a finite amount of basic components.

Lower-layer models may require continuous parametric models, like laryngeal models of phonemes (Lieberman, 2002). These can be learned from language sounds using procedures similar to Section 2.

An important property of the procedure described in this section, let me repeat, is productivity. This word is used to describe the ability of the human mind to use finite means (words, concepts)

An important property of the procedure described in this section, let me repeat, is productivity. This word is used to describe the ability of the human mind to use finite means (words, concepts) to produce a virtually infinite (combinatorially large) number of more complex structures. In linguistics, productivity has traditionally been associated with recursion (Barsalou, 1999; Chomsky, 1995; Hauser, Chomsky, & Fitch, 2002; Pinker, 1994), while non-combinatorial mathematical procedures for acquisition-learning of this ability have not been demonstrated. Here we demonstrated a mathematical procedure for non-combinatorial learning, which results in combinatorially powerful productivity. Recursion is implemented through the hierarchy. This section considered in detail a two-layer hierarchy, words–phrases; it could be extended in a straightforward way down to morphology in linguistics or feature–object representation in perception, as well as up to higher levels of language and cognition.

4. Cognition and language

Do we use phrases to label situations that we have already understood, or the other way around, do we just talk without understanding any cognitive meanings? It is obvious that different people have different cognitive and linguistic abilities and may tend to different poles in the cognitive–language continuum, while most people are somewhere in the middle, using cognition to help with language, and vice versa. What are the neural mechanisms that enable this flexibility? How do we learn which words and objects come together? If there is no specific language module, as assumed by cognitivist linguists, why do kids learn language by 5 or 7, but do not think like adults?

Virtually nothing is known about the neural mechanisms integrating language and cognition. Here we propose a computational model that can potentially answer the above questions and that is computationally tractable: it does not lead to combinatorial complexity. It is experimentally testable. It also implies relatively simple neural mechanisms, and explains why human language and human cognition are inextricably linked. It suggests that human language and cognition have evolved jointly.

4.1. Dual model

We propose that integration of language and cognition is accomplished by a dual model. Every model in the human mind is not separately cognitive or linguistic, and still cognitive and linguistic contents are separate to a significant extent. Every concept-model Mh has two parts, linguistic MLh and cognitive MCh:

Mh = {MCh, MLh}. (12)

A sensor data stream constantly comes into the mind from all sensory perceptions; every part of this data stream is constantly evaluated and associated with models (12) according to the mechanisms of dynamic logic described above. In a newborn mind both types of models are vague, mostly empty placeholders for future cognitive and language contents. The neural connections between the two types of models are inborn; the mind never has to learn which word goes with which object. As models acquire specific contents in the process of growing up and learning, linguistic and cognitive contents always stay properly connected.

During the first year, infants learn some objects and situations in the surrounding world; this means that cognitive parts of some models at the layer of objects and situations become less vague and acquire a degree of specificity. Language models at the layer of objects and above remain vague. After one year of age, language model adaptation speeds up; language models become less vague and more specific much faster than the corresponding cognitive models. This is especially true about contents of abstract models, which cannot be directly perceived by the senses, such as ‘‘law’’, ‘‘state’’, ‘‘rationality’’. This explains how it is possible that kids by the age of five can talk about most of the contents of the surrounding culture but cannot function like adults: language models are acquired ready-made from the surrounding language, but cognitive models remain vague and gradually acquire concrete contents throughout life. This is the neural mechanism of what is colloquially called ‘‘acquiring experience’’.

Human learning of cognitive models continues throughout the lifetime and is guided by language models. The knowledge instinct drives the human mind to develop more specific and concrete cognitive models by accumulating experience throughout life in correspondence with language models.

4.2. Experimental evidence, answers and questions

As mentioned, experimental evidence for the dual model is almost nil. The first experimental indication has appeared in Franklin et al. (2008). Those researchers demonstrated that categorical perception of color in prelinguistic infants is based in the right brain hemisphere. As language is acquired and access to lexical color codes becomes more automatic, categorical perception of color moves to the left hemisphere (between two and five years), and adults' categorical perception of color is based in the left hemisphere (where language mechanisms are located).

These experiments have provided evidence for neural connections between perception and language, a foundation of the dual model. Possibly it confirms another aspect of the dual model: the crisp and conscious language part of the model hides from our consciousness the vaguer cognitive part of the model. This is similar to what we observed in the closed–open eye experiment: with opened eyes we are not conscious of vague imagination-priming signals.

So, we can answer some of the questions posed at the beginning of the section. Language and cognition are separate and closely related mechanisms of the mind. They evolve jointly in ontological development and learning, and possibly these abilities evolved jointly in evolution—this we address in more detail in the next section. This joint evolution of dual models from vague to more crisp content resolves the puzzle of associationism: there is no need to learn correct associations among a combinatorially large number of possible associations; words and objects are associated all the time while their concrete contents emerge in the mind.

Perception of objects that can be directly perceived by sensing might be to some extent independent from language; nevertheless, as the above experimental data testify, even in these cases language affects what we perceive. In more complex cognition of abstract ideas, which cannot be directly perceived by the senses, we conclude that language parts of models are more crisp and conscious; language models guide the development of the content of cognitive models. Language models also tend to hide cognitive contents from consciousness. It follows that in everyday life most thinking is accomplished by using language models, possibly with little engagement of cognitive contents.

We know that thinking by using cognitive contents is possible, for example when playing chess. Mathematics is another example, but not necessarily a good one, because mathematics uses its own ‘‘language’’ of mathematical notations. Do mathematical notations play a similar role to language in shadowing cognitive contents and thinking? My guess would be that this is the case for a C student of mathematics, but creative thinking in mathematics and in any other endeavor engages cognitive models. Needless to say, this requires special abilities and significant effort. Possibly the fusiform gyrus plays a role in cognition shadowed by language. Possible brain regions involved in the knowledge instinct are discussed in more detail in Levin and Perlovsky (2008). This is a vast field for experimental psychological and neuroimaging investigations.
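The developmental asymmetry described in this section (language models sharpening quickly while cognitive models stay vague for decades) can be caricatured in a few lines. This is a toy illustration of the dual model of Eq. (12), not a model from the paper, and the decay rates are arbitrary:

```python
from dataclasses import dataclass

@dataclass
class DualModel:
    # Eq. (12): M_h = {M_h^C, M_h^L} -- one concept with permanently
    # linked cognitive and linguistic parts; here each part is reduced
    # to a single "vagueness" number (1.0 = empty placeholder).
    name: str
    cognitive_vagueness: float = 1.0
    linguistic_vagueness: float = 1.0

    def grow(self, years, lang_rate=0.5, cog_rate=0.9):
        # Arbitrary rates: language sharpens fast (it is acquired
        # ready-made from the surrounding language), while cognition
        # sharpens slowly by accumulating experience.
        for _ in range(years):
            self.linguistic_vagueness *= lang_rate
            self.cognitive_vagueness *= cog_rate

law = DualModel("law")   # an abstract concept in a newborn mind
law.grow(years=5)
# at age five the word is crisp while the cognitive content stays vague
```

Because the two parts are fields of one object, no association search is ever needed: the crisp word ‘‘law’’ remains connected to its still-vague cognitive counterpart, which keeps adapting throughout life.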
4.3. Dual hierarchy

Previous sections focused mostly on processes at a single layer in the mind hierarchy. Contents of lower layers in the hierarchy are perceptual elements, objects; higher up are relationships among objects, situations, more and more abstract and general model-concepts, etc., and near the top are the most general concepts of the purpose and meaning of life (Levin & Perlovsky, 2008; Perlovsky, 2006a, 2006b). At every layer there is a similarity measure defining the knowledge instinct, models, emotions, and actions, including adaptation of models. An input to each layer is a set of signals X(n). The result of signal processing at a given layer is a set of activated models, or concepts h, recognized in the input signals n; these models, along with the corresponding instinctual signals and emotions, may activate behavioral models and generate behavior at this or lower layers. The activated models also send their activation signals to the next higher layer.

The dual model implies two parallel heterarchies of language and cognition, as illustrated in Fig. 2. This architecture, along with the dynamic logic equations in previous sections, solves an amazing mystery of the human mind, which we are so used to that it has almost never been formulated as requiring an explanation. Models at higher layers are composed of lower layer models (say, scenes are composed of objects). In parallel, language is used to describe scenes linguistically, with sentences composed of words. Word–object relations at lower layers are preserved at higher layers of phrase–scene relations. This holds true across a tremendous number of phrase–situation layer models, using various combinations of the same words from the lower layer. This amazing property of our mind seems so obvious that the nontrivial complexity of the required mechanism has only been noticed once (Deacon, 1997).

Deacon also suggested that the hierarchy sets the human mind apart from the animal world. Every human culture possesses both abilities; and there is no species that has either language or cognition at the human level. Here we discuss mathematical reasons why the hierarchy can only exist as a joint dual hierarchy of language and cognition. The dual hierarchy architecture gives a mathematical reason for this fact. Only at the lower layers in the hierarchy can cognitive models be learned by direct perception of the world. Learning is grounded in ‘‘real’’ objects. At higher levels, however, learning of cognitive models has no ground. In artificial intelligence it was long recognized that learning without grounding could easily go wrong; learned or invented models may correspond to nothing real or useful (Meystel & Albus, 2001).

The mechanism of the dual model sets the human mind apart from the rest of the animal world. Consider an example of a dog learning to bring shoes to a human master on a verbal command. A dog, it seems, can jointly learn language and cognition (a word ‘‘shoes’’ and an object shoes). Does that mean that a dog possesses a dual model? No. The dog can do it because it perceives an object (shoes) in the world. Learning a word ‘‘shoes’’ is grounded in direct perception of the object-sound ‘‘shoes’’. Such direct grounding in sensory signals exists only at the very bottom of the mind hierarchy. At higher layers, cognitive concepts are grounded in language concepts due to the dual models, and language models, as discussed, are grounded in talking with other people and in mutual understanding. Using the dual models, the knowledge instinct drives the mind to acquire cognitive models corresponding to language models (colloquially, ‘‘experience’’).

The fact that the cognitive hierarchy cannot be learned without the language hierarchy is so fundamental and under-appreciated that I will give another explanation in different words. Consider learning situations on top of already learned object perception. When deciding which set of objects constitutes a concept-model that should be learned and remembered, one would encounter a situation such as follows: entering a room, one sees a desk, a chair, books, shelves. . . and a barely seen scratch on a wall. Is this scratch on a wall as important as the other objects? Is the fact that a red book is to the left of a blue one important, and should it be learned as a separate situation-model? In fact there are many more insignificant objects, and infinitely many combinations of them, such as scratches, dust, relative positions, etc., than important objects and combinations of significance for any situation. No one would have enough experience in a lifetime to learn which objects, and in which combinations, are important and which are not for each situation. Only from language do we learn what is typical and important for understanding various situations. Even more complicated is the learning of abstract concepts, which cannot be perceived by the senses directly. Many linguists beginning from Chomsky have under-appreciated this fact and developed linguistic theories assuming that theories of cognitive meaning should be handed down by professors from cognitive departments.

The language hierarchy is acquired by the human mind from the surrounding ready-made language. Learning the language hierarchy at all layers is grounded in communication with other people around; people talk to and understand each other. This provides grounding for language learning. Try to teach a dog to understand the word ‘‘rational’’, or any abstract concept whose meaning is based on several hierarchical layers; this is not possible. It is known that the smartest chimps, after long training, can barely understand a few concepts at the second layer (Savage-Rumbaugh & Lewine, 1994).

4.4. Cognitive linguistics and dynamic logic

The proposed theory shares many assumptions with cognitive linguistics, including the embodied nature of cognition and language. As emphasized by Barsalou (1999), this idea belongs to a tradition that goes back more than two millennia. Section 3 describes just a first step toward a mathematics of cognitive linguistics, which should be extended toward the evolution of semantics and grammar in terms of experiential constructs. This development would utilize accounts developed by cognitive linguistics, and would provide mathematical foundations for these developments.

The interaction between cognition and language proposed here, while being just the first step, resolves long-standing problems of cognitive linguistics. Jackendoff (1983) suggested that

‘‘the meaning of a word can be exhaustively decomposed into finite set of conditions. . . necessary and sufficient. . . ’’.

The very language used in this quote exposes the logical way of thinking, which leads to wrong conclusions. Meanings of words do not reside in other words, but in words' relations to real-world situations. According to the mechanism suggested in this paper, meanings reside in the cognitive parts of the dual models.

Gradually, cognitive linguistics moved away from a strictly compositional view of language. Lakoff (1988) emphasized that abstract concepts used by the mind for understanding the world have metaphorical structure. Metaphors are not just poetic tools, but a mind mechanism for creating new abstract meanings. Lakoff's analysis brought this cultural knowledge of the role of metaphorical thinking within the mainstream of science. There was still a big gap between Lakoff's analysis of metaphors on the one hand and neural and mathematical mechanisms on the other. ‘‘Metaphors We Live By’’ is a metaphorical book (the pun is intended) in that it begs the question: Who is that homunculus in the mind, interpreting the metaphorical theater of the mind? What are the mechanisms of metaphorical thinking? According to the current paper, a metaphor extends an old understanding to a new meaning by extending a vague cognitive model to a crisp
language model and making it vaguer; this is followed by dynamic logic creation of several more specific new language and cognitive models.

In the works of Jackendoff (1983), Langacker (1988), Talmy (1988) and other cognitive linguists it was recognized that dichotomies of meanings (semantic–pragmatic) and dichotomies of hierarchical structures (superordinate–subordinate) were limiting scientific discourse and have to be overcome. Consider the following opinions on meaning creation:

‘‘in a hierarchical structure of meaning determination the superordinate concept is a necessary condition for the subordinate one. . . COLOR is a necessary condition for determining the meaning of RED’’ (Jackendoff, 1983).

‘‘The base of predication is nothing more than. . . domains which the prediction actually invokes and requires’’ (Langacker, 1988).

These judgments illustrate difficulties encountered when attempting to overcome old dichotomies. Logical intuitions guide these judgments and limit their usefulness. Attempts to implement mathematically the mechanisms assumed by these examples would lead to combinatorial complexity. Problems of meaning and hierarchy are still reminiscent of the old chicken-and-egg question: what came first? If superordinate concepts come before subordinate ones, where do they come from? Are we born with the concept COLOR in our minds? If predications invoke domains, where do domains come from? These complex questions with millennial pedigrees are answered mathematically in this paper. Hierarchy and meaning emerge jointly with cognition and language. In evolution and individual learning, superordinate concepts (COLOR) are vaguer, less specific, and less conscious than subordinate ones (RED). RED can be vividly perceived, but COLOR cannot be perceived. RED can be perceived by animals. But the concept COLOR can only emerge in the human mind, due to the joint operation of language and cognition.

Jackendoff in his recent research (2002) concentrated on unifying language and cognition. He developed detailed models for such unification; however, his logical structures face combinatorial complexity.

Lakoff and Johnson (1999) brought within the realm of linguistics an emphasis on embodiment of the mind. The implication that the philosophical tradition will have to be reassessed, however, seems exaggerated. A recent synthesis of computational, cognitive, neural, and philosophical theories of the mind demonstrated the opposite (Perlovsky, 2000). Plato, Aristotle, and Kant, even in specific details about the mind mechanisms, were closer to contemporary computational theories than the 20th century philosophers and mathematicians developing logical formalism and positivism.

Talmy (2000) introduced a notion of open and closed classes of linguistic forms. The open class includes most words, which could be added to language as needed, say, by borrowing from other languages. The closed class includes most grammatical structures, which are fixed for generations and cannot be easily borrowed from other languages. This pointed to an important aspect of the interaction between language and cognition. Forms of the closed class interact with cognitive concepts, which emerged over thousands of years of cultural and language evolution. Thus, for each individual mind and for entire generations, which operate within the constraints of existing grammar, many cognitive concepts are predetermined by language. Talmy identified cognitive concepts affected by closed forms. These forms are more basic for cognition than words and unconsciously influence entire cultures.

Kay (2002) proposed construction grammar (a direction closely associated with cognitive linguistics) to accommodate metaphoric and idiomatic linguistic constructions. Within generative-Chomskyan-type grammar these could only be understood by rejecting the word–phrase distinction and adopting a word–phrase continuum, which is of course impossible as it would lead to combinatorial complexity. These constructions, he suggested, cannot exist in a meaning-free grammar; their construction and understanding require a combination of semantic and linguistic knowledge. Still, existing proposals for construction grammar formalisms are logical and combinatorially complex. The dual model, instead, provides the necessary mechanism—cognitive and linguistic models act jointly.

Another proposal to combine cognition and language is given by Fauconnier and Turner (2008). They proposed what they called double-scope blending, without, however, a specific mechanism to accomplish it. Its function is similar to the dual hierarchy proposed in this paper.

The dual hierarchy supports the cognitive linguistic idea that syntax is not a separate inborn ‘‘box’’ in the mind, but a conceptual mechanism. Specifics of syntax, according to this paper, are encoded in concept-model contents at higher layers of phrases and sentences. It is possible that, given a mechanism of the dual model, the hierarchy would evolve and syntax would be learned from the surrounding language. To what extent does syntax reflect structures in the world that could be directly learned along with language and encoded in cognitive and language models? In addition to the dual model, what other linguistic knowledge must be inborn? What is the role of dynamic logic and the dual model in morphology? It is possible that the dual model is the only mechanism that is required to enable language and to set us apart from animals. Demonstrating this in simulations is a challenge for future research. If this could be demonstrated, Occam's Razor would favor the dual model.

4.5. Evolutionary linguistics and dynamic logic

Evolutionary linguistics emphasizes that language properties evolved in the process of cultural evolution of languages (Christiansen & Kirby, 2003). Only those properties of languages survive that can be passed from generation to generation. Christiansen and Chater (2008) discussed how various linguistic phenomena are explained within the evolutionary framework. Brighton et al. (2005) demonstrated in mathematical simulation that an evolutionary approach can explain the emergence of language compositionality. Compositionality, the language ability to construct words from sounds, phrases from words, etc., is a fundamental language universal, unique to human languages. Brighton et al.'s research is especially elegant in that only the simplest assumptions were required: first, an ability to learn statistically from limited examples which sounds go with which cognitive meanings; and second, the fact that training samples are insufficient and agents have to guess sounds for new meanings, so meanings that are similar in a certain way to old ones are designated by sounds similar in some ways to the old sounds. This has led to compositionality.

Unfortunately, most work in evolutionary linguistics so far has assumed that cognitive meanings already exist. Another detrimental aspect of existing work is its logical computational basis, leading to combinatorial complexity. Dynamic logic, if applied to Brighton et al.'s formulation, leads to non-combinatorial complexity of learning and production. If combined with the dual model, it does not require an assumption that the meanings have already existed; a result is the joint learning of combinatorial language and cognitive meanings (Fontanari, Tikhanoff, Cangelosi, & Perlovsky, 2009). The mathematical formulation in this paper leads to a joint evolution of language and cognitive meanings.
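The Brighton et al. (2005) result described above can be caricatured in a few lines: an agent trained on too few sound–meaning examples must guess signals for unseen meanings by reusing parts of signals for similar known meanings, and a compositional code falls out. The meanings and signals below are invented for illustration:

```python
# Hypothetical two-feature meanings paired with invented two-syllable
# signals; the training lexicon deliberately omits one meaning.
lexicon = {("red", "circle"): "bado",
           ("blue", "circle"): "kido",
           ("red", "square"): "balu"}

def syllables(signal):
    return signal[:2], signal[2:]

def guess(meaning):
    # Generalization pressure: reuse the syllable that co-occurred
    # with each feature of a similar known meaning.
    color, shape = meaning
    first = next(syllables(s)[0] for (c, _), s in lexicon.items() if c == color)
    second = next(syllables(s)[1] for (_, sh), s in lexicon.items() if sh == shape)
    return first + second

new_signal = guess(("blue", "square"))  # an unseen meaning
```

Guessing from insufficient data yields a signal whose first syllable marks color and second marks shape, i.e., a compositional code; this illustrates why only generalizable, compositional mappings survive transmission across generations.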
4.6. Contents of language faculty

Hauser et al. (2002) emphasized that ‘‘language is, fundamentally, a system of sound–meaning connections’’. This connection is accomplished by a language faculty, which generates internal representations and maps them into the sensory–motor interface, and into the conceptual–intentional interface. In this way sound and meaning are connected. They emphasized that the most important property of this mechanism is recursion. However, they did not propose specific mechanisms for how recursion creates representations, nor how it maps representations into the sensory–motor or conceptual–intentional interfaces.

A conclusion of the current paper is that it might not be necessary to postulate recursion as a fundamental property of a language faculty. In terms of the mathematical model of modeling fields and dynamic logic proposed here, recursion is accomplished by the hierarchy: a higher layer generates lower layer models, which accomplish recursive functions. We have demonstrated that the dual model is a necessary condition for the hierarchy of cognitive representations. It also might be a sufficient one. It is expected that the hierarchy is not a separate inborn mechanism; the hierarchy might emerge in the operations of the dual model and dynamic logic in a society of interacting agents with intergenerational communications (along the lines of Brighton et al. (2005)). What exactly are the inborn precursors necessary for the hierarchy's ontological emergence, if any, is a challenge for ongoing research. In any case, by reformulating the property of recursion in terms of a hierarchy, along with demonstrating that a hierarchy requires the dual model, the paper has suggested a new explanation: a single, neurally simple mechanism is unique to human language and cognitive abilities. Initial experimental evidence indicates support for the dual model; still, further experiments elucidating properties of the dual model are needed.

Another conclusion of this paper is that the mechanism mapping between linguistic and cognitive representations is accomplished by the dual models. In previous sections we considered mathematical modeling of the ‘‘conceptual–intentional interface’’ for intentionality given by the knowledge and language instincts; in other words, we considered only intentionalities related to language and knowledge. It would not be principally difficult to add other types of intentional drives following Grossberg and Levine (1987). The current paper has not considered the ‘‘sensory–motor interface’’, which of course is essential for language production and hearing. This can be accomplished by the same mechanism of the dual model, with the addition of behavioral and sensorial models. This task is not trivial; still, it does not present principal mathematical difficulties.

We would also like to challenge an established view that specific vocalization is ‘‘arbitrary in terms of its association with a particular context’’. In animals, voice directly affects ancient emotional centers. In humans these effects are obvious in songs, and still persist in language to a certain extent. It follows that the sound–intentional interface is different from the language–cognition interface modeled by the dual model. The dual model frees language from emotional encumbrances and enables abstract cognitive development to some extent independent from primitive ancient emotions. Arbitrariness of vocalization (even to some extent) could only be a result of a long evolution of vocalizations from primordial sounds (Perlovsky, 2007). Understanding the evolutionary separation of cognition from direct emotional–motivational control and from immediate behavioral connections poses further challenges for future research.

5. Symbols: Perceptual, emotional, amodal

The proposed dual model for language–cognition interaction bears on several aspects of a long-standing debate about the nature of representations of symbols in human symbolic thought (see Barsalou (1999) and the discussions therein). Specific discussion poles centered on perceptual vs. amodal symbols, with distinguished scientific schools on each side. The dual model suggests that the structure of cognitive models may implement the Barsalou outline for perceptual symbols, and their learning mechanism follows the procedure of Section 3. The other side of human thinking, usually associated with amodal symbols, is represented to a significant extent in language models.

Is language an amodal symbol system? To answer this question let us look into the evolution of language from animal vocalizations (Perlovsky, 2004, 2006c, 2008). Animal vocalizations are much less differentiated than humans'; conceptual, emotional, and behavioral contents are not separated, they are unified. Vocalizations are not as voluntary (if at all) as humans'. Evolution of language required separation of conceptual, semantic contents from emotional ones, and from involuntary vocalizations.

In this evolution language loses its original emotionality and acquires an ability to approximate an amodal symbol system. This ability is important for the evolution of high culture, science, and technology. However, an amodal symbol system is not an ‘‘ideal’’ evolutionary goal of language. To have meaning, to support cognition, language must be unified with cognition into the dual hierarchy. As discussed, this involves the knowledge instinct and aesthetic emotions. Language without emotionality loses this ability to unify with cognition. Amodal symbols are logical constructs, and as we discussed throughout the paper, logic is not a fundamental mind mechanism, but only a final result of dynamic logic, and likely involves only a tiny part of the mind's operations.

As language was evolving toward a less emotional, more semantic ability that differentiated the psyche, another part of primordial vocalization was evolving toward a more emotional, less semantic ability that maintained the unity of the psyche. This ability evolved into music (Perlovsky, 2008). Cultures with semantically rich languages also evolved emotionally rich music (Perlovsky, 2006d, 2007). Songs affect us by unifying the semantic contents of lyrics with the emotional contents of music, which are perceived by ancient emotional brain centers. The same mechanism still exists to a lesser degree in languages; all existing languages retain some degree of emotionality in their sounds. Therefore languages, while evolving amodal symbol abilities, still retain their vocal modalities; otherwise they would not be able to support the process of cognitive and cultural evolution. Symbolic structures of the mind are mathematically described by dynamic logic; the mind's symbols are not static categories, but dynamic logic processes. Due to the dual model they combine cognition and language, and can approximate amodal properties. The dual model suggests experimentally testable neural mechanisms combining two modal (perceptual, voice) symbol systems with the amodal ability.

6. Future research

The proposed mechanism of the dual model implies a relatively minimal neural change from the animal to the human mind. It possibly emerged through combined cultural and genetic evolution, and this cultural evolution most likely continues today. Dynamic logic resolves a long-standing mystery of how human language, thinking, and culture could have evolved in a seemingly single big step, too large for an evolutionary mutation, too fast and involving too many advances in language, thinking, and culture, happening almost momentarily around 50,000 years ago (Deacon, 1997; Mithen, 1998). Dynamic logic along with the dual model explains how changes, which seem to involve improbable steps according to logical intuition, actually occur through continuous dynamics. The proposed theory provides a mathematical basis for the concurrent emergence of hierarchical human language and cognition. The dual model is directly testable experimentally. The next steps would further develop this theory in multi-agent simulations, leading to more specific and experimentally testable predictions.

Classic ideas of frames and schemas were demonstrated to face combinatorial complexity, and this is the fundamental reason for their failures despite decades of development. The proposed theory produces conceptual relations, binding, and recursion through the mechanisms of the hierarchy; in addition, it explains how language acquires meanings in connecting with cognition due to the mechanism of the dual model. These predictions are experimentally testable, and several groups of psychologists and neuroscientists are working in these directions.
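The combinatorial-complexity contrast invoked above can be made concrete with back-of-envelope arithmetic; the numbers are arbitrary. Treating every subset of N perceived objects as a candidate frame gives 2^N hypotheses, whereas dynamic logic adapts H vague models against all N signals at once, at a cost on the order of N·H per iteration:

```python
N, H, iterations = 100, 10, 50          # arbitrary illustrative sizes

frame_hypotheses = 2 ** N               # exhaustive subsets of N objects
dynamic_logic_ops = N * H * iterations  # rough per-run operation count
```

For a scene of only 100 objects, the exhaustive frame count exceeds 10^30, while the dynamic-logic estimate stays in the tens of thousands, which is the sense in which the proposed mechanisms avoid combinatorial complexity.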
model differentiated into several significantly independent mech- References


anisms. In animal minds emotion–motivation, conceptual under-
standing, and behavior-voicing have been undifferentiated unity, Bar, M., Kassam, K. S., Ghuman, A. S., Boshyan, J., Schmid, A. M., Dale, A. M., et al.
(2006). Topdown facilitation of visual recognition. Proceedings of the National
their differentiation is a hallmark of human evolution. Was this a Academy of Sciences USA, 103, 449–454.
single step, or could evolutionary anthropology document several Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22,
steps, when different parts of the model differentiated from the 577–660.
primordial whole? Brighton, H., Smith, K., & Kirby, S. (2005). Language as an evolutionary system.
The paper solved several principled mathematical problems that involved combinatorial complexity when using previously considered mechanisms inspired by logical intuition. Initial neuroimaging evidence supports the dynamic logic mechanism discussed in this paper; still, much remains unknown, and a wide field is ready for theoretical and experimental research.
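The combinatorial-complexity point can be made concrete with a toy count (the numbers below are illustrative, not from the paper): if each of N words could in principle label any of N objects, a logic-style matcher that tests crisp hypotheses one by one faces N! candidate one-to-one lexicons.

```python
import math

def candidate_lexicons(n: int) -> int:
    """Count the one-to-one mappings between n words and n objects.

    A matcher that enumerates crisp word-object hypotheses faces this
    factorial growth; dynamic logic is designed to avoid enumerating it.
    """
    return math.factorial(n)

for n in (5, 10, 20, 60):
    print(n, candidate_lexicons(n))
# Already at n = 60 there are about 8.3e81 candidate lexicons,
# more than the estimated number of atoms in the observable universe.
```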
Dynamic logic was experimentally demonstrated for perception of a single object; these experiments should be extended to perception of multiple objects in complex contexts, as well as to higher-level cognition. Possibly, experiments could probe more details of top-down and bottom-up signal interactions.
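The vague-to-crisp process at the core of dynamic logic can be sketched in a few lines: model parameters start vague, data-to-model association weights stay fuzzy while model uncertainty is large, and both sharpen together as uncertainty is annealed down. The sketch below is a minimal one-dimensional illustration of this idea, not the experimental setup; the data values, initial parameters, and annealing schedule are invented for the example.

```python
import math

# Toy signal: two "objects" near 2.0 and 8.0 (all values illustrative).
data = [1.8, 1.9, 2.0, 2.1, 2.2, 7.8, 7.9, 8.0, 8.1, 8.2]

means = [4.0, 6.0]   # vague initial models
sigma = 8.0          # large initial uncertainty: associations start fuzzy

for _ in range(30):
    # Fuzzy association weights f(m|n): normalized model-data similarities.
    weights = []
    for x in data:
        sims = [math.exp(-(x - m) ** 2 / (2.0 * sigma ** 2)) for m in means]
        total = sum(sims)
        weights.append([s / total for s in sims])
    # Re-estimate each model as the weighted mean of the data it explains.
    for k in range(len(means)):
        den = sum(w[k] for w in weights)
        means[k] = sum(w[k] * x for w, x in zip(weights, data)) / den
    # Anneal: shrink uncertainty so associations become progressively crisp.
    sigma = max(0.3, sigma * 0.8)

print(sorted(round(m, 2) for m in means))  # settles near [2.0, 8.0]
```

Each pass costs only (data points × models) similarity evaluations; no enumeration of crisp data-to-model assignments is ever needed.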
The paper challenges the idea that recursion is the main fundamental mechanism setting human language apart from animal abilities. It is proposed instead that recursion is accomplished by the hierarchy. The fundamental mechanism enabling the hierarchy, recursion, and the connection of language and cognition is the dual model.

Detailed experimental evidence is needed for the dual model, especially for the proposed development of cognitive models at higher layers throughout life under the guidance of language models.
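The dual model's claim that word models and object models are learned jointly can also be prototyped computationally, in the spirit of the cross-situational study of Fontanari, Tikhanoff, Cangelosi, and Perlovsky (2009): no single situation identifies which word names which object, but accumulation across situations does. The vocabulary and scenes below are invented for illustration, and simple co-occurrence counting stands in for the neural-modeling-fields machinery.

```python
from collections import defaultdict

# Each situation pairs the words heard with the objects in view; which
# word names which object is never stated (toy data, invented here).
situations = [
    ({"cup", "dog"}, {"CUP", "DOG"}),
    ({"cup", "sun"}, {"CUP", "SUN"}),
    ({"dog", "sun"}, {"DOG", "SUN"}),
    ({"cup", "dog"}, {"CUP", "DOG"}),
]

# Graded word-object associations; repeated situations strengthen the
# correct links faster than the spurious ones.
counts = defaultdict(int)
for words, objects in situations:
    for w in words:
        for o in objects:
            counts[(w, o)] += 1

# Read out the crisp lexicon: each word's most strongly associated object.
lexicon = {w: max(["CUP", "DOG", "SUN"], key=lambda o: counts[(w, o)])
           for w in sorted(["cup", "dog", "sun"])}

print(lexicon)  # {'cup': 'CUP', 'dog': 'DOG', 'sun': 'SUN'}
```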
The paper also challenges the idea of the arbitrariness of vocalization. It is suggested that the significant degree of arbitrariness in current languages is a distal result of millennia of language evolution in the presence of the dual model. Instead of assuming arbitrariness as fundamental, future research should concentrate on its emergence from the primordial fusion of sound and emotion. This paper suggests that the origins of language are contemporaneous with the origins of music: primordial undifferentiated vocalizations split into two types, the more semantic and less emotional mechanisms of language, and the less semantic and more emotional mechanisms of music. Whereas language differentiates consciousness, music unifies conscious and unconscious into a whole self. It is my hope that the wide field of experimental research directions opened by this discussion will be explored (certain experimental programs are already being planned).
Ontological emergence of the hierarchy should be modeled mathematically and demonstrated experimentally; this could be done in conventional psychological experiments and in neuroimaging.

Lower hierarchical layers, below words and objects, should be developed theoretically; or, better to say, dynamic logic should be integrated with ongoing development in this area (see Guenther (2006)).

Mathematical simulations of the proposed mechanisms should be extended to engineering development of Internet search engines with elements of language understanding. The next step would be developing interactive environments, where computers interact among themselves and with people, gradually evolving human-like language and cognitive abilities. Developing intelligent computers and understanding the mind would continue to enrich each other.

To conclude, I would emphasize that the proposed approach of neural modeling fields, dynamic logic, and the dual model is a step toward unification of the basic approaches to knowledge, its evolution, and learning in modern cognitive and neural sciences. These include classic representational approaches based on logic, rule systems, and amodal symbols; statistical pattern recognition and learning; embodied approaches of classic empiricism, situated actions, and cognitive linguistics; evolutionary linguistics; and connectionism and neural networks. This step toward unification outlines a wide field for future theoretical and experimental research.
References

Bar, M., Kassam, K. S., Ghuman, A. S., Boshyan, J., Schmid, A. M., Dale, A. M., et al. (2006). Top-down facilitation of visual recognition. Proceedings of the National Academy of Sciences USA, 103, 449–454.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–660.
Brighton, H., Smith, K., & Kirby, S. (2005). Language as an evolutionary system. Physics of Life Reviews, 2(3), 177–226.
Carpenter, G. A., & Grossberg, S. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54–115.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Chomsky, N. (1981). Principles and parameters in syntactic theory. In N. Hornstein, & D. Lightfoot (Eds.), Explanation in linguistics: The logical problem of language acquisition. London: Longman.
Chomsky, N. (1995). The minimalist program. Cambridge: MIT Press.
Christiansen, M. H., & Chater, N. (2008). Language as shaped by the brain. Behavioral and Brain Sciences, 31(5), 489–509.
Christiansen, M. H., & Kirby, S. (2003). Language evolution. New York: Oxford University Press.
Croft, W., & Cruse, D. A. (2004). Cognitive linguistics. Cambridge: Cambridge University Press.
Deacon, T. W. (1997). The symbolic species: The co-evolution of language and the brain. New York: Norton.
Evans, V., & Green, M. (2006). Cognitive linguistics: An introduction. Edinburgh: Edinburgh University Press.
Fauconnier, G., & Turner, M. (2008). The origin of language as a product of the evolution of modern cognition. In B. Laks, et al. (Eds.), Origin and evolution of languages: Approaches, models, paradigms. London: Equinox.
Fontanari, J. F., Tikhanoff, V., Cangelosi, A., & Perlovsky, L. I. (2009). A cross-situational algorithm for learning a lexicon using neural modeling fields. In International Joint Conference on Neural Networks.
Franklin, A., Drivonikou, G. V., Bevis, L., Davie, I. R. L., Kay, P., & Regier, T. (2008). Categorical perception of color is lateralized to the right hemisphere in infants, but to the left hemisphere in adults. PNAS, 105(9), 3221–3225.
Gnadt, W., & Grossberg, S. (2008). SOVEREIGN: An autonomous neural system for incrementally learning planned action sequences to navigate towards a rewarded goal. Neural Networks, 21, 699–758.
Gödel, K. (1931/1994). In S. Feferman, J. W. Dawson, & S. C. Kleene (Eds.), Collected works, Vol. I: Publications 1929–1936. New York: Oxford University Press.
Grossberg, S. (1988). Neural networks and natural intelligence. Cambridge: MIT Press.
Grossberg, S., & Levine, D. S. (1987). Neural dynamics of attentionally modulated Pavlovian conditioning: Blocking, inter-stimulus interval, and secondary reinforcement. Psychobiology, 15(3), 195–240.
Grossberg, S., & Seidman, D. (2006). Neural dynamics of autistic behaviors: Cognitive, emotional, and timing substrates. Psychological Review, 113, 483–525.
Guenther, F. H. (2006). Cortical interactions underlying the production of speech sounds. Journal of Communication Disorders, 39, 350–365.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569–1579.
Hurford, J. (2008). The evolution of human communication and language. In P. D'Ettorre, & D. Hughes (Eds.), Sociobiology of communication: An interdisciplinary perspective (pp. 249–264). New York: Oxford University Press.
Jackendoff, R. (1983). Semantics and cognition. Cambridge, MA: MIT Press.
Juslin, P. N., & Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and Brain Sciences, 31, 559–575.
Kay, P. (2002). An informal sketch of a formal architecture for construction grammar. Grammars, 5, 1–19.
Kozma, R., Deming, R., & Perlovsky, L. (2007). Estimation of propagating phase transients in EEG data, application of dynamic logic neural modeling approach. In Proc. Int. Joint Conference on Neural Networks, IJCNN'07 (pp. 1602–1606).
Lakoff, G. (1988). Cognitive semantics. In U. Eco, M. Santambrogio, & P. Violi (Eds.), Meaning and mental representations. Indiana University Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge to Western thought. Basic Books.
Langacker, R. W. (1988). Foundations of cognitive grammar. Stanford: Stanford University Press.
Levine, D. S., & Perlovsky, L. I. (2008). Neuroscientific insights on Biblical myths: Simplifying heuristics versus careful thinking: Scientific analysis of millennial spiritual issues. Zygon, Journal of Science and Religion, 43(4), 797–821.
Lieberman, P. (2002). Human language and our reptilian brain. Cambridge, MA: Harvard University Press.
Linnehan, R., Mutz, C., Perlovsky, L. I., Weijers, B., Schindler, J., & Brockett, R. (2003). Detection of patterns below clutter in images. In Int. Conf. on Integration of Knowledge Intensive Multi-Agent Systems, Cambridge, MA.
Meystel, A. M., & Albus, J. S. (2001). Intelligent systems: Architecture, design, and control. New York: Wiley.
Mithen, S. (1998). A creative explosion? Theory of mind, language, and the disembodied mind of the Upper Paleolithic. In S. Mithen (Ed.), Creativity in human evolution and prehistory (pp. 165–191). London and New York: Routledge.
Perlovsky, L. I. (2000). Neural networks and intellect: Using model-based concepts. New York: Oxford University Press.
Perlovsky, L. I. (2004). Integrating language and cognition. IEEE Connections, 2(2), 8–12.
Perlovsky, L. I. (2006a). Toward physics of the mind: Concepts, emotions, consciousness, and symbols. Physics of Life Reviews, 3, 23–55.
Perlovsky, L. I. (2006b). Modeling field theory of higher cognitive functions. In A. Loula, R. Gudwin, & J. Queiroz (Eds.), Artificial cognition systems (pp. 64–105). Hershey, PA: Idea Group.
Perlovsky, L. I. (2006c). Symbols: Integrated cognition and language. In R. Gudwin, & J. Queiroz (Eds.), Semiotics and intelligent systems development (pp. 121–151). Hershey, PA: Idea Group.
Perlovsky, L. I. (2006d). Music–the first principles. Musical Theater; http://www.ceo.spb.ru/libretto/kon_lan/ogl.shtml.
Perlovsky, L. I. (2007). Evolution of languages, consciousness, and cultures. IEEE Computational Intelligence Magazine, 2(3), 25–39.
Perlovsky, L. I. (2008). Music and consciousness. Leonardo, Journal of Arts, Sciences and Technology, 41(4), 420–421.
Perlovsky, L. I., & McManus, M. M. (1991). Maximum likelihood neural networks for sensor fusion and adaptive classification. Neural Networks, 4(1), 89–102.
Pinker, S. (1994). The language instinct: How the mind creates language. New York: William Morrow.
Savage-Rumbaugh, E. S., & Lewin, R. (1994). Kanzi: The ape at the brink of the human mind. New York: J. Wiley.
Singer, R. A., Sea, R. G., & Housewright, R. B. (1974). Derivation and evaluation of improved tracking filters for use in dense multitarget environments. IEEE Transactions on Information Theory, 20, 423–432.
Talmy, L. (1988). The relation of grammar to cognition. In B. Rudzka-Ostyn (Ed.), Topics in cognitive linguistics (pp. 165–205). Amsterdam: Benjamins.
Talmy, L. (2000). Toward a cognitive semantics. Cambridge, MA: MIT Press.
Ungerer, F., & Schmid, H.-J. (2006). An introduction to cognitive linguistics. New York: Pearson.