Modelling the acquisition of English phonology in infancy

MSc dissertation by

Christopher Pidcock
School of Cognitive Science
Division of Informatics
2 Buccleuch Place
Edinburgh, EH8 9LW
UK
chrisp@cogsci.ed.ac.uk

September 20, 2000


Contents

1 Introduction
1.1 The problem of phonological development
1.2 Current perspectives on phonological development
1.2.1 From babbling to speech
2 Models of phonological acquisition
2.1 Early models
2.2 Cognitive and biological theories
2.3 Modelling using contextual information
3 Method
3.1 Developing the Shillcock & Westermann model
3.1.1 Introduction to Latent Semantic Analysis
3.1.2 Using the LSA model
3.2 Model dimensions
3.2.1 Dimensions manipulated during modelling
3.2.2 Constant factors
3.3 Modelling using regression
4 Results
4.1 Modelling using the unbounded CHILDES corpus
4.1.1 Results
4.1.2 Discussion
4.2 Modelling using the bounded CHILDES corpus
4.2.1 Results
4.2.2 Discussion
4.3 Backward and forward models in the CHILDES corpus
4.3.1 Forward models
4.3.2 Backward models
4.3.3 Discussion
4.4 Comparison with the London-Lund corpus
4.4.1 Unbounded speech
4.4.2 Bounded speech
4.4.3 Discussion
5 Discussion
5.1 The relevance of phonological features – physical factors in acquisition
5.1.1.1 The degree feature
5.1.1.2 The labial feature
5.1.1.3 The sonorant feature
5.1.2 Physical factors and models of phonological development
5.2 Distributional features in acquisition
6 Conclusions
6.1 Biological factors
6.2 Statistical measures
7 Extensions

Appendix A Phonological feature representations for the consonants


Appendix B Results of stepwise regressions for all cases

Abstract
Despite the large amount of individual variation, attempts have been made to design models that
account for the acquisition of phonology in English. Comparing the relationship between
different factors in acquisition has usually consisted of qualitative conclusions backed up by
studies on small groups of children. Shillcock & Westermann (1996), however, used a clinically
relevant acquisition order from Grunwell (1985) to allow modelling of the stages of acquisition
using a range of factors. Representative statistics on phonological usage were drawn from the
London-Lund and CHILDES corpora. This paper updates the earlier work of Shillcock &
Westermann, applying a vector-based context model to the corpus data in an attempt to derive
statistics that are useful for modelling acquisition order. These statistics are combined with a
physically motivated phonological feature analysis in order to find the best combination of
biological and environmental factors to model the order of phonological acquisition.

1 Introduction
1.1 The problem of phonological development

The central problem of phonological development runs parallel to the problem of the
development of language itself. Infants somehow progress from the sounds of physical necessity
such as crying and cooing to children capable of producing complex, phonologically and
syntactically well formed utterances. The normal infant entering the world is assumed to draw
on four classes of input that are useful for learning about its local environment:

♦ Innate endowments and biases.
♦ Events in the surrounding environment.
♦ Self-generated perceptuomotor actions.
♦ Feedback from self-generated perceptuomotor actions.

The exact relationship between these factors in development has always been controversial. In
learning phonology, the child may well have certain innate biases present at birth. At the very
least the infant is endowed with a perceptuomotor system and a vocal tract that will eventually
be able to make a large array of sounds. The infant is surrounded by a rich linguistic
environment filled with people generating complex acoustic information. The infant makes
sounds, and the production of early utterances starts a feedback loop by which the child can
modify the sound of his own voice.

Infants learning English phonology show a great deal of individual variation. However,
superimposed on this is a predictable order of acquisition. The exact nature of the interaction
between biological and environmental factors in this process is not understood, with roles
posited for explanations invoking innate language skills, a maturing vocal system and phoneme
frequency. The order in which phonemes are acquired is difficult to ascertain. This is due to
many factors:

“… acquisition varies between children learning the same language; related allophones are acquired at
different times; acquiring a phoneme may not mean that the relevant phonemic contrasts have been
acquired; position of the target phoneme in the syllable, together with other contents of the syllable,
affects pronunciation.” Shillcock & Westermann (1996) p1.

Expected orders of acquisition have been published despite the problems involved. Speculations
have been made about the influences that combine to produce a phonological system with a
predictable acquisition order.

A successful theory of phonological acquisition will have to explain several observed properties
of development:
♦ The nature of the interaction between babbling and early speech.
♦ Predictable orders of development.
♦ Individual variation.
♦ Cross-linguistic variation.
♦ The role of different levels of sound structure: phonemes, syllables, lexical items.

1.2 Current perspectives on phonological development

1.2.1 From babbling to speech

The status of babbling as a ‘linguistic’ stage is under debate. At one extreme is the claim that
babbling is a motoric subsystem entirely insulated from language; at the other is the theory that
babbling grades continuously into speech. Jakobson (1968) attempted to model the order in
which the phonological oppositions of language are acquired. Acquisition was treated as the
building of a phonological system using sets of rules. Jakobson denied that the infant babbling
stage had any impact on the subsequent development of communicative vocal gestures. The
constrained approach of Jakobson has famously failed to account for the observed variation in
language acquisition. However, some researchers continue to see babbling as an independent
subsystem insulated from language learning. Locke (1983) supported the ‘independence
hypothesis’ against the ‘interactional hypothesis’, although Locke has also stressed the wide
range of factors likely to be implicated in phonological development:

“We must figure out how children develop the emotional, social, perceptual, motoric, neural,
cognitive and linguistic capabilities required for the efficient use of language.” (Locke 1995 p278).

Locke and Studdert-Kennedy believe that the question of language development is not to explain
how the external environment trains the child:

“Language is not an object, or even a skill, that lies outside the child and has somehow to be
internalized. Rather it is a mode of action which the child grows because the mode is implicit in the
human developmental system.” (Studdert-Kennedy 1991 p10).

Locke does not believe that linguistic capabilities are implicit in the babbling stage of
production. Consistent with this theory is the observation that babbling is strikingly similar
across very different language environments, even to the extent of surviving congenital deafness
and neonatal brain damage. Locke suggests that a lack of variation implies that phonetic forms
in the infant are constrained by purely anatomical and physical factors. They are hence
insensitive to the language environment and cannot be practice for speech.

As well as denying that babbling can be characterised as ‘speech practice’, Locke (1995) draws
attention to the formulaic nature of early speech, which implies that the first units are not
morphemes; “Formulaic utterances are thought to be holistically perceived and stored, and
irreducible to their syllabic or segmental parts.” p297. Infants do appear to select some words to
add to their lexicon on the basis of holistic similarities. Matasaka (1992) showed that infants
mimic their mothers’ speech in terms of intonation patterns – rising, falling, bell shaped, flat or
complex. This vocal accommodation is taken to be a form of mimicry and is influenced by
holistic rather than segmental properties of the mother’s voice. Kuhl & Meltzoff (1982) also
provided evidence of mimicry. Children of 18-20 weeks watched a film of a woman alternately
vocalising [i] and [a]. The control group saw the same film and articulatory movement, but
heard no formants, instead hearing sounds with similar durations and amplitude contours.
During the interstimulus interval, the infants hearing the intact vowels spontaneously repeated
them at virtually the same rate as they had been presented. The control group did not vocalise.
Speech sounds, in this experiment, triggered echoic responses in very young infants.

Locke believes that the seemingly segmental babbling stage is not syllabic at all – merely
genetically pre-programmed motor play. If babbling is a separate subsystem, as Locke and
Jakobson have implied, then it must not be incorporated into models of phonological acquisition.
If babbling does not grade into early speech, then we must attempt to explain the patterns found
in both babbling and early speech. However, in the phonological acquisition order published by
Grunwell (1985), the phonemes reported to occur first in English appear when the babbling stage
has not yet come to an end. If babbling and speech are separate systems, then they will be
confounded together in Grunwell’s acquisition order. This would be a problematic result for any
attempt to model acquisition order, including the one presented here. Holistic interpretations of
first word acquisition are also problematic for theories of development that concentrate on the
phoneme unit. Holistic theory denies the existence of phonemes in early speech – instead
proposing that the child uses motor gestures for each word. Eventually, phonological capacity
‘crystallises’ out of these motor plans.

If babbling were to grade into speech, it would be difficult to maintain the view that infants
learning phonology begin with highly regular rule-based oppositions (Ingram 1988, Jakobson
1968). Vihman (1996) has pointed out that there is currently little evidence of the proposed
transition from whole word to segmental representations. However, grouping the results of
several studies together, Vihman concludes that there appears to be a

“gradual qualitative shift from a predominance of processes affecting the structure of whole words
(consonant harmony, reduplication, final consonant deletion) to those affecting specific segments or
classes of segments (stopping of fricatives, gliding of liquids).”

However, Vihman et al. (1985) showed that there is continuity between babbling and speech.
The selection of sounds infants use in their first words is in part dependent on the sounds used in
babbling. Vihman et al. found that there was a relationship between the consonants used in
babbling and the speech sounds produced in the child’s first words. The pattern of sounds that
constitutes the input language appears to affect the vocal productions of the infant even at this
pre-linguistic stage. The frequency of phonemes in the input language is usually assumed to have
some effect on the acquisition of sounds. Predictions of the percentage of stops in infant production
from the ambient language are borne out. Evidence has come from cross-linguistic studies, such
as Boysson-Bardies et al. (1992). The proportion of stops in phonological production in four
languages is Swedish > English > Japanese > French. The ranking of infant stop percentages is
also Swedish > English > Japanese > French. This tendency is not restricted to the production of
stops:

“In spite of some common or “universal” tendencies … a selection of articulatory gestures in late
babbling and first words can be seen to arise from phonetic patterns traceable to the linguistic
environment of the child.” p381.

Boysson-Bardies et al. used phonetic segments in their research, although they warn of the
dangers of over-interpreting this technique:

“The infants’ capacity to extract relevant acoustic information does not imply that infants segment the
stimuli into constituent units or categorize them in the same way as adults. However, from the kind of
word representation they have at this age, the infants are able to derive a selection of the main
articulatory gestures that will be necessary to produce (or reproduce) a small relevant basic lexicon.”
p388.

Ingram (1988) showed that the acquisition of word initial /v/ also varies cross linguistically. At
the same time as language is developing, the infant’s perceptual capacities are changing. The ability
to discriminate certain types of contrast that are not in their ambient language diminishes over
the first year of life.

Ingram (1992) has supported the view that distinctive features are used in children’s early
phonological representations as well as holistic representations. He points out that perceptual
capacities are well developed and infants can identify acoustic characteristics with ease.
Perception leads production, and by 1;6 the child can usually produce 50 words but can
recognise approximately 250. This suggests that some form of phonological organisation must
have begun. Infants show awareness of the segmental nature of babbling. Variegated babbling
consists of a repeated consonant combined with different vowels, implying that the segment is a
unit and various combinations are possible within segments. There is more evidence for the
importance of environmental influences: Menn & Stoel-Gammon (1995) point out that hearing
impaired infants are delayed in babbling and in the frequency of supraglottal consonants in their
vocalisations.

There appears to be little motivation for assuming that babbling is a separate subsystem. It is
more difficult to decide whether holistic or segmental representations are preferred by the child –
perhaps because both kinds of representation are present.

2 Models of phonological acquisition
2.1 Early models

Behavioural models, such as Olmstead (1966), are based on perceptual factors – explicitly
disregarding ease of articulation. In contrast, Jakobson’s structuralist approach
saw the child as displaying an unfolding system of oppositions. The development of more
advanced recording techniques and the consequent increase in the amount of data available for
analysis has allowed more rigorous statistical techniques to be used. Stampe’s (1979) ‘natural
phonology’ applied Chomskyan nativist principles to phonology. Automatic phonological rules,
termed phonological processes by Stampe, are universal and innately available. Oppositions are
formulated in terms of rules. The same criticisms can be made of both Jakobson and Stampe.
Rule based formulations can account for the observed variation in
acquisition order only when a vast collection of phonological rules is deployed. Rule based
approaches used in this way fail to capture phonological generalisations that can be
formulated in terms of constraints. This led to the development of optimality theory (OT), a
nonlinear approach that centres on the ranking of phonological constraints to explain
phonological usage. Ingram (1992) follows a rule based approach but adds sensitivity to
frequency via a ‘functional load’ parameter.

2.2 Cognitive and biological theories

Cognitive approaches view the child as an active participant in the learning of phonology.
Avoidance, selection and overgeneralisation are seen as evidence that the child is actively
attempting to solve the problem of phonology acquisition.

Cognitive theories explicitly reject any innate knowledge of phonological categories and
processes. The only innate propensities result from the construction and control of the human
vocal tract and the neural learning capacity of the brain. The child is seen as an active
participant in the learning of phonology: hypotheses are tested, experiments performed and
overgeneralisation is observed. Hence variability in production is explained as experimentation
with the sounds of language.

“… the child’s progress typically involves an early presystematic period of piecemeal learning (the
mastering of special cases, including “phonological idioms”), followed by the discovery (and
overgeneralisation) of patterns.” Vihman (1996) p29.

However, a similar learning pattern has been observed in other areas of language acquisition, such
as the learning of the past tense construction in English. Past tense learning has also been
modelled using neural networks (Plunkett and Marchman 1993).
Connectionist models can also show generalisation behaviours and U-shaped developmental
trajectories without being viewed as taking an ‘active’ part in learning.

The cognitive approach has been criticised. Locke (1983) denies that the child has to be seen as
a dynamic participant in learning. Locke’s biological model can still explain the observed
variation – genetic diversity, expressed through biological differences, can be invoked to explain
the variation in acquisition. MacNeilage and Davis (1990) have also offered a biologically
orientated perspective with their frame/content theory of acquisition. They see the rhythmic
opening and closing of the jaw seen in babbling as the basis for gaining motor control for speech.
Silent jaw-wagging, observed in most infants, would be described as a ‘pure’ or ‘empty’ frame.
The initial ‘content’ of the syllable is also defined in primarily motoric terms. With motor
practice, the infant acquires finer grained control over the articulators. Vihman (1992), in a
cross-linguistic study, found evidence to support the frame/content theory. Vowels and
consonants that frequently occurred together in babbling did tend to have common tongue
positions. This implies that the mandibular oscillation was primarily responsible for the CV
syllables produced. However, Vihman claims that individual differences between children
learning the same language were more salient than any biological constraint. Vihman (1992)
recognises that a combination of factors must come into play:

“Those syllables that are both salient in a given language environment, because of their common
occurrence in adult models, and motorically accessible, because of facilitating “articulatory
neighborhood” effect of consonant-vowel similarity, can serve as entry points into more sophisticated
or motorically complex vocal production.” p414.

Another interesting class of model relies on self-organising principles to guide phonological
development. Lindblom (1992) sees phoneme units as emergents from a holistic word-based
representation. It is assumed that a word is initially stored as the set of motor actions required to
utter it. Over time, the physiology of the brain is designed so that regularities are picked out, and
segments slowly emerge:

“Phonetic gestures are emergents, not primitives of the theory. This is in line with the crystallization
process envisioned by Studdert-Kennedy, but is clearly different from the premises of standard
distinctive feature theory.” p158. Original emphasis.

The words acquired in this model are assigned a motor score which over time, as more patterns
are learned, differentiates into a more segmental representation. “There is no transition from
“holistic” to “analytic” coding because the present account is a “have-your-cake-and-eat-it”
scenario.” Lindblom p159. Segments appear through the “interaction of subsystems”, word
forms are acquired in a holistic manner, but these require different configurations of the
articulators. Out of this interaction comes a self-structured set of representations.

We do not posit separate babbling and phonemic subsystems: the change from babbling to speech
appears to be gradual (see Vihman on vowels cross linguistically). Whether babbling is primarily
motoric remains an open question (see p284 of the handbook of child language). The ‘motor’
features in the model can capture this, which is why such a feature based phonology is useful.

The model presented here is an attempt to produce a quantitative explanation of what Vihman
(1992) means by salient features of the environment and motorically accessible articulations.

2.3 Modelling using contextual information

Shillcock & Westermann (1996) produced a model of phonological acquisition that included
both distributional statistics and phonological features. The subsegmental structure for the
phonological feature analysis was taken from Government Phonology. Each of the 22
consonants modelled was represented by a vector in the resulting 9-dimensional feature space. A phonotactic
range variable was introduced to add distributional information:

“Phonotactic range was defined as the number of different segment bigrams in which a particular
segment participates, above a criteria probability (0.001); for /t/, the bigrams included /t /, /tr/, /t /, /lt/,
/st/, /nt/ and so on.” Shillcock & Westermann (1996) p2.

Additionally, the phonotactic range was partitioned into “forward” (e.g., /t /, /t /) and
“backward” (e.g., /st/, /nt/) components. A phonotactic range over trigrams was also calculated.
The phonotactic range is another possible definition of ‘function in the phonological system’
which Ingram had linked to frequency. Part of the motivation for the model was to test which
definition of functional load is the most useful for predicting acquisition order. The range
statistics were calculated from an idealised phonetic transcription of the London-Lund and
CHILDES corpora. (See Shillcock, Hicks, Cairns, Levy & Chater (in press) for a description of
the orthography to phonology conversion). The corpora were transcribed into the machine-
readable phonetic alphabet (MRPA) which includes signs for 24 consonants and 21 vowels.
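
As an illustration of the phonotactic range definition quoted above, the following Python sketch counts forward, backward and total bigram ranges for each segment. It is not the procedure used by Shillcock & Westermann: the function name `phonotactic_ranges`, the toy segment list, and the reading of the criterion probability as a bigram's share of all bigram tokens are assumptions made for the example, and word boundaries are ignored.

```python
from collections import Counter


def phonotactic_ranges(segments, threshold=0.001):
    """Return {segment: (forward, backward, total)} bigram ranges.

    Assumption: a bigram type is kept when its relative frequency among
    all bigram tokens exceeds `threshold`.
    """
    bigrams = Counter(zip(segments, segments[1:]))
    n_tokens = sum(bigrams.values())
    kept = {bg for bg, count in bigrams.items() if count / n_tokens > threshold}
    ranges = {}
    for seg in set(segments):
        forward = sum(1 for first, _ in kept if first == seg)     # seg + X
        backward = sum(1 for _, second in kept if second == seg)  # X + seg
        total = sum(1 for bg in kept if seg in bg)                # either position
        ranges[seg] = (forward, backward, total)
    return ranges


# Hypothetical toy sequence, not corpus data.
toy = ["s", "t", "a", "n", "t", "a", "s", "t"]
toy_ranges = phonotactic_ranges(toy)
```

For the toy sequence, /t/ takes part in one forward bigram type (/ta/) and two backward types (/st/, /nt/), giving a total range of three.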

Shillcock & Westermann used the order of phonemic acquisition defined by Grunwell (1985) in
the Phonological Assessment of Child Speech (PACS). Grunwell’s acquisition order
concentrates on individual phonemes, giving ages at which segments are used reliably by
learners of English. In general there is a lack of research reporting phonological acquisition
orders. This is in part due to the difficulties involved in obtaining a large enough sample of
longitudinal child speech. Much research on phonology acquisition has focussed on small scale
diary studies rather than general processes. There is also a lot of variation in acquisition order.
In fact, we would be sceptical of a model that appeared to model the Grunwell order too well. A
model explaining 100% of the variance in Grunwell’s acquisition order would be inflexible in
the face of the observed variation between children. However, there are even problems with the
definition of ‘acquisition of a phoneme’. Menn & Stoel-Gammon (1995) point out that
phonemes differ in the time lags between 50% and 90% accuracy in production. Hence it can be
difficult even to define a stage. However, the effectiveness of Grunwell’s PACS account has
been noted (see Vihman, 1996, p218). PACS has also been used in clinical situations, and at the
moment is the most useful target sequence for a model of acquisition.

Shillcock & Westermann conducted multiple linear regression analyses to
discover which variables are most accurate at predicting the order of acquisition. The best 3
variables accounted for 77% of the variance in acquisition order (R²=0.765, F(3,18)=19.48,
p<0.0001). The phonotactic range variable was the second most important of the three, with
segments that have larger ranges being acquired earlier (β=-0.522, p<0.002). Of the two subsegmental structure
variables included in the regression, the most important was “R” – “apicality, coronality, coronal
formant locus”, correlated with later acquisition (β=0.579, p<0.0001). The third most important
variable was “?” – “occlusion, abruptness, alone, glottal stop”, correlated with early acquisition
(β=-0.362, p<0.02). This optimal model was found using the phonological data from the
transcription taken from the CHILDES corpus of speech directed to children up to the age of 28
months. Using the forward phonotactic range and the CHILDES transcription the best model
accounted for 79% of the variance in acquisition order (R²=0.794, F(3,18)=23.09, p<0.0001).
The independent variables follow the same order of importance as before.
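
The reported F statistics can be checked against the reported R² values. The sketch below assumes an ordinary least squares regression with an intercept, n = 22 consonants and k = 3 predictors, and uses the standard relation F = (R²/k) / ((1 − R²)/(n − k − 1)). The function name `f_from_r2` is hypothetical.

```python
def f_from_r2(r2, n_cases, n_predictors):
    """Return (F, df_model, df_residual) implied by R² for OLS with an intercept."""
    df_model = n_predictors
    df_residual = n_cases - n_predictors - 1
    f_stat = (r2 / df_model) / ((1.0 - r2) / df_residual)
    return f_stat, df_model, df_residual


# Assumed: 22 consonants as cases, 3 predictors in each model.
f_range, df1, df2 = f_from_r2(0.765, 22, 3)     # full phonotactic range model
f_forward, _, _ = f_from_r2(0.794, 22, 3)       # forward phonotactic range model
```

Under these assumptions the degrees of freedom come out as (3, 18), and both F values (roughly 19.5 and 23.1) agree with the reported 19.48 and 23.09 to within the rounding of R².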

3 Method

3.1 Developing the Shillcock & Westermann model

The CHILDES and London-Lund corpora were again used to derive statistical corpus measures,
and Grunwell’s (1985) order of acquisition was the dependent variable. However, important
methodological changes were made to the 1996 model. The phonological feature representations
were updated. The representations, which follow conventional phonological feature
theory, have been used previously in modelling phonology in a neural network (Harm &
Seidenberg, 1999). Corpus statistics were calculated using a vector matrix approach.

3.1.1 Introduction to Latent Semantic Analysis

Sophisticated statistical models incorporating contextual information are now commonplace.
The LSA (Latent Semantic Analysis) approach described by Landauer & Dumais (1997) was
initially developed to explain how children rapidly learn the semantics of a large vocabulary of
words. These words appear to be acquired from text with virtually no direct instruction.
Landauer & Dumais summarise their approach:

“...we suggest a very different hypothesis [to that of innate concepts] to explain the mystery of
excessive learning. It rests on the simple notion that some domains of knowledge contain vast
numbers of weak interrelations that, if properly exploited, can greatly amplify learning by a process of
inference.” p211.

The LSA model is merely a learning mechanism, without any inbuilt knowledge of the domain
to which it is applied. LSA has mainly been used to model contextual similarities between
words. Landauer & Dumais showed that the model performed similarly to moderately proficient
readers of English in a test requiring judgements of semantic similarity. The LSA approach has
been usefully applied to the modelling of priming and reaction times in semantic tasks. A model
trained on sufficient examples can even ‘grade’ essays in good agreement with human markers.
However, the model has not been applied to developmental modelling.

There is nothing inherent in the unit ‘word’ that makes it ideal for treatment with LSA. A ‘word’
in LSA is simply a coordinate in a high dimensional vector space. Within this vector space
similarities between words can be calculated from the distances between the vectors. The LSA
model used here was given 45 distinct phonemes as input rather than words. The corpora used
contained around 1.3 million phoneme tokens. The method of input to the model was exactly the
same as for a corpus of words. However, the number of types is limited relative to a corpus of
words. The LSA model was trained as if the language input contained only 45 different words.
The applications for performing the LSA analysis were taken from MacDonald (2000).

3.1.2 Using the LSA model

Initially, the entire corpus of phonemes is passed into the LSA program. A matrix is formed of
cooccurrences of phonemes within a certain predetermined window size. The window size
determines the amount of phonological context from which information can be gained. With a
one phoneme window cooccurrence statistics are only computed from the two phonemes directly
adjacent to the central phoneme.

For example, if the entire corpus were to consist of just one word, ‘picnic’, then the cooccurrence
matrix for the phonetic transcription with a one phoneme window would be simple to calculate:

p ɪ k n ɪ k

e.g. If the central phoneme is /n/, counts are incremented by one for /k/ and /ɪ/.

Table 1 – Cooccurrence counts in a one phoneme window

     p   ɪ   k   n
p    0   1   0   0
ɪ    1   0   2   1
k    0   2   0   1
n    0   1   1   0



Two phoneme window:

e.g. If the central phoneme is again /n/, counts are incremented by two for /ɪ/ and /k/.


Table 2 – Cooccurrence counts in a two phoneme window

     p   ɪ   k   n
p    0   1   1   0
ɪ    1   0   3   2
k    1   3   0   2
n    0   2   2   0
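
The windowed counting described above can be sketched in a few lines. This is a minimal illustration, not the MacDonald (2000) software: it assumes the transcription /pɪknɪk/, simple unweighted counts, and a symmetric window; the function and variable names are hypothetical.

```python
from collections import defaultdict


def cooccurrence_counts(phonemes, window):
    """Count, for each phoneme, how often every other phoneme falls
    within `window` positions on either side of it."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, centre in enumerate(phonemes):
        lo = max(0, i - window)
        hi = min(len(phonemes), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # the central phoneme is not counted against itself
                counts[centre][phonemes[j]] += 1
    return counts


def as_matrix(counts, order):
    """Lay the counts out as rows and columns in a fixed phoneme order."""
    return [[counts[a][b] for b in order] for a in order]


picnic = ["p", "ɪ", "k", "n", "ɪ", "k"]  # assumed transcription of 'picnic'
order = ["p", "ɪ", "k", "n"]

one = as_matrix(cooccurrence_counts(picnic, 1), order)
two = as_matrix(cooccurrence_counts(picnic, 2), order)

for row in one:
    print(row)
```

Under these assumptions the two matrices agree with the counts given in Tables 1 and 2.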

The vectors formed by passing a window over the phonemic corpus are then analysed. A variety
of statistical measures are calculated from the cooccurrence data. The measure most akin to the
phonotactic range variable in Shillcock & Westermann is the sparsity of the phoneme. Sparsity
is calculated by taking the number of zero dimensions for a single phoneme, and dividing by the
total number of dimensions. Hence the fewer contexts the phoneme appears in, the larger the
sparsity score. For the very limited ‘picnic’ corpus above, there are only 4 dimensions for the
four distinct phonemes, whereas the full model has 45 phoneme dimensions. The sparsity values
of the phonemes can be tabulated:



Table 3 – Sparsity and frequency measures over different window sizes

     1 phoneme window        2 phoneme window
     Sparsity   Frequency    Sparsity   Frequency
p    ¾          1            ½          1
ɪ    ¼          2            ¼          2
k    ½          2            ¼          2
n    ½          1            ½          1
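
As a check on the definition, sparsity can be computed directly from the rows of the two cooccurrence tables for the ‘picnic’ corpus. The sketch below assumes the phoneme labels p, ɪ, k, n for the four rows; the names `sparsity`, `one_window` and `two_window` are illustrative only.

```python
from fractions import Fraction


def sparsity(vector):
    """Number of zero dimensions divided by the total number of dimensions."""
    zeros = sum(1 for value in vector if value == 0)
    return Fraction(zeros, len(vector))


# Rows of the one and two phoneme window cooccurrence tables.
one_window = {"p": [0, 1, 0, 0], "ɪ": [1, 0, 2, 1], "k": [0, 2, 0, 1], "n": [0, 1, 1, 0]}
two_window = {"p": [0, 1, 1, 0], "ɪ": [1, 0, 3, 2], "k": [1, 3, 0, 2], "n": [0, 2, 2, 0]}

sparsity_one = {ph: sparsity(vec) for ph, vec in one_window.items()}
sparsity_two = {ph: sparsity(vec) for ph, vec in two_window.items()}

for ph in one_window:
    print(ph, sparsity_one[ph], sparsity_two[ph])
```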

It is clear that as the window size increases, the sparsity scores will become more similar.
Sparsity converges towards zero as the number of zero dimensions falls and the number of
dimensions remains constant. As window size increases it becomes less likely that there will be
any zero dimensions at all. Over a 5 phoneme window only very infrequent phonemes do not occur
in all contexts. Hence as window size increases from one or two phonemes, highly ‘local’
information about the side-by-side cooccurrences of phonemes is lost. The same is true for the
richness score for a phoneme. This score is found by dividing the number of nonzero
dimensions by the sum of all dimension values. The sum of dimension values is usually very
large compared to the count of nonzero dimensions. In these experiments, the maximum number
of nonzero dimensions is 45, whereas the sum of dimension values is the sum total of all the
counts made for a phoneme as the window is passed over the corpus (hence this sum can even be
larger than the actual phoneme frequency). The richness score can also be termed the type/token
ratio. These two measures, which include information gained from the number of different
contexts in which a phoneme appears, contrast with measures defined over the vectors. Such
measures include the euclidean distance and information theory based statistics such as mutual
information, relative entropy and conditional entropy (see table 5). All vector statistics were
calculated relative to the mean corpus vector.
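
The richness score and one of the vector measures can be illustrated on the one phoneme window counts for ‘picnic’. This is a sketch under stated assumptions: the text does not specify how the mean corpus vector is formed, so here it is taken to be the unweighted average of the raw count vectors, and the function names are hypothetical.

```python
import math


def richness(vector):
    """Number of nonzero dimensions divided by the sum of all dimension values."""
    return sum(1 for value in vector if value > 0) / sum(vector)


def distance_to_mean(vectors):
    """Euclidean distance of each vector from the mean vector.
    Assumption: the mean is the unweighted average of the raw count vectors."""
    names = list(vectors)
    dims = len(vectors[names[0]])
    mean = [sum(vectors[n][d] for n in names) / len(names) for d in range(dims)]
    return {n: math.dist(vectors[n], mean) for n in names}


one_window = {"p": [0, 1, 0, 0], "ɪ": [1, 0, 2, 1], "k": [0, 2, 0, 1], "n": [0, 1, 1, 0]}

richness_scores = {ph: richness(vec) for ph, vec in one_window.items()}
distances = distance_to_mean(one_window)

for ph in one_window:
    print(ph, round(richness_scores[ph], 3), round(distances[ph], 3))
```

In a corpus this small the richness values are close to 1; with 1.3 million tokens the denominator dominates, which is why the text describes the score as a type/token ratio.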



3.2 Model dimensions

3.2.1 Dimensions manipulated during modelling

The goodness of fit of models to the acquisition data was tested over several varying dimensions:

a) Speech directed at children (CHILDES) versus adult speech (London-Lund)

Primarily to replicate Shillcock & Westermann, models built from speech directed at children
under 28 months were compared with models built from adult speech. Shillcock & Westermann
found that the use of CHILDES data increased the R² of their model.

b) Bounded versus unbounded speech

As the window is passed over the corpus it can be configured either to ignore or stop at the
boundaries between words. Adding word boundaries to the data prevents ‘inappropriate’
cooccurrences of phonemes from being counted. For example, a child might produce the two
word utterance “cat gone”. A model ignoring word boundaries would score the phonemes /t/ and
/g/ as being cooccurrent. However, the phonemes /t/ and /g/ do not occur next to each other
within words of English speech. Leaving out word boundaries profoundly affects ‘local’
measures such as sparsity or richness: the number of zero (or nonzero)
dimensions becomes similar for most phonemes. As previously discussed, the role of the whole
word unit is an area of controversy. Introducing word boundaries assumes that some form of
lexicalisation has taken place.
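
The effect of word boundaries on the counts can be shown with the “cat gone” example. The sketch below is illustrative only: the transcriptions /kæt/ and /gɒn/ and the function name `count_pairs` are assumptions, and a bounded window is modelled simply by counting within each word separately.

```python
def count_pairs(words, window, respect_boundaries):
    """Count (centre, neighbour) pairs within `window` positions.
    `words` is a list of words, each a list of phonemes."""
    if respect_boundaries:
        segments = words  # the window stops at the edge of each word
    else:
        segments = [[ph for word in words for ph in word]]  # one unbroken stream
    counts = {}
    for segment in segments:
        for i, centre in enumerate(segment):
            lo = max(0, i - window)
            hi = min(len(segment), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    key = (centre, segment[j])
                    counts[key] = counts.get(key, 0) + 1
    return counts


utterance = [["k", "æ", "t"], ["g", "ɒ", "n"]]  # assumed transcription of "cat gone"

unbounded = count_pairs(utterance, 1, respect_boundaries=False)
bounded = count_pairs(utterance, 1, respect_boundaries=True)

print(("t", "g") in unbounded)  # True: counted across the word boundary
print(("t", "g") in bounded)    # False: the boundary blocks the count
```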

c) Window size

The size of the window passed over the corpus was varied between one and five phonemes. The
LSA model calculates various distributional statistics whose values depend on the window size: a
1 phoneme window yields highly ‘local’ information, whereas larger windows yield more
‘general’ statistics. As the window size increases, the potential for the majority of phonemes to
cooccur both with themselves and each other increases. In a large window, calculating more
‘general’ statistics means that potentially useful information encoded between close
phonological neighbours may be lost.

d) Window direction

The input window can be made asymmetric. In this instance counts are only taken from the
part of the window in front of, or behind, the central phoneme.
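
An asymmetric window can be sketched as follows. The text does not define the directions precisely, so this example assumes that ‘forward’ means the phonemes following the central phoneme and ‘backward’ means those preceding it; the transcription /pɪknɪk/ and the function name are likewise assumptions.

```python
def directional_counts(phonemes, window, direction):
    """Count (centre, neighbour) pairs on one side of the central phoneme only.
    Assumption: 'forward' = following phonemes, 'backward' = preceding phonemes."""
    counts = {}
    for i, centre in enumerate(phonemes):
        if direction == "forward":
            neighbours = phonemes[i + 1 : i + 1 + window]
        elif direction == "backward":
            neighbours = phonemes[max(0, i - window) : i]
        else:
            raise ValueError("direction must be 'forward' or 'backward'")
        for other in neighbours:
            counts[(centre, other)] = counts.get((centre, other), 0) + 1
    return counts


picnic = ["p", "ɪ", "k", "n", "ɪ", "k"]  # assumed transcription of 'picnic'

forward = directional_counts(picnic, 1, "forward")
backward = directional_counts(picnic, 1, "backward")

print(forward)
print(backward)
```

Summing the forward and backward counts for a pair recovers the symmetric count, so the two directional models split the same information between them.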

3.2.2 Constant factors.

a) Frequency

For each corpus the frequency of occurrence of phonemes is invariant.

b) Distinctive features of phonemes

The distinctive features used to represent phonology are in the tradition of the Chomsky & Halle
(1968) Sound Pattern of English (SPE) analysis. The English consonants are defined using 11
features (see table 5).

Table 4 – Phonemic acquisition order (after Grunwell 1985).

Acquisition Stage Phoneme


1 m, p, b, w, n, t, d
2  

k, g, h
3 f, s, l, j
4    

5 v, z, r,  

6     



Table 5 – Predictors used in statistical modelling

Distributional statistics                  Phonological features

Frequency and ln(frequency)                Sonorant
Richness and ln(richness)                  Consonantal
Sparsity                                   Voice
Euclidean distance and ln(euclidean)       Nasal
Conditional entropy (CE)                   Degree
Relative entropy (RE)                      Labial
Average mutual information (AMU)           Palatal
                                           Pharyngeal
                                           Round
                                           Tongue
                                           Radical

3.3 Modelling using regression

Distributional scores from the LSA model were calculated for each of the 24 phonemes. These
data were then combined with the constant factors (frequency and phonological features¹). A
clear distinction should be made between the LSA model, which is used to derive scores based
on the phoneme cooccurrence matrix, and the regression model, which is used to discover which
of these factors are the most useful for modelling acquisition order. A stepwise multiple linear
regression was run to discover which features or statistics were implicated most significantly in
the modelling of acquisition order. Stepwise regression is a heuristic technique that adds and
removes variables (such as frequency and sonorant) depending on their ability to model the
variance in the dependent variable (acquisition stage). Variables outside the model are added
sequentially, and on subsequent cycles these variables can be removed if their significance
within the model falls (for example, when a variable is added that combines better with a
variable already present in the model). The result of the regression is a linear model that can be
used to predict the acquisition stage given values for the variables included in the model.
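
The add-and-remove cycle can be sketched in plain Python. This is an illustration of the general technique rather than the statistics package used for the analyses: the entry and removal criteria are not stated in the text, so the F-to-enter and F-to-remove thresholds (4.0 and 3.9) are assumptions, as are the toy data and all function names.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on `columns` plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def stepwise(predictors, y, f_enter=4.0, f_remove=3.9, max_cycles=50):
    """Forward entry by largest partial F, followed by a removal check each cycle.
    The thresholds are illustrative assumptions, not values from the dissertation."""
    selected = []
    for _ in range(max_cycles):
        changed = False
        # Entry step: try each outside variable and keep the best if it clears f_enter.
        base = rss([predictors[s] for s in selected], y)
        df = len(y) - len(selected) - 2  # residual df of the enlarged model
        best, best_f = None, f_enter
        for name in predictors:
            if name in selected:
                continue
            trial = rss([predictors[s] for s in selected + [name]], y)
            f = (base - trial) / (trial / df)
            if f > best_f:
                best, best_f = name, f
        if best is not None:
            selected.append(best)
            changed = True
        # Removal step: drop the weakest variable if its partial F has fallen too low.
        if len(selected) > 1:
            full = rss([predictors[s] for s in selected], y)
            df_full = len(y) - len(selected) - 1
            partial = {}
            for name in selected:
                rest = [predictors[s] for s in selected if s != name]
                partial[name] = (rss(rest, y) - full) / (full / df_full)
            weakest = min(partial, key=partial.get)
            if partial[weakest] < f_remove:
                selected.remove(weakest)
                changed = True
        if not changed:
            break
    return selected


# Toy data: y depends on x1 and x2 but not on x3.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 0, 1, 0, 1, 0, 1, 0],
    "x3": [1, 1, 0, 0, 0, 0, 1, 1],
}
y = [-0.9, 3.9, 2.9, 8.1, 7.1, 11.9, 10.9, 16.1]

chosen = stepwise(predictors, y)
print(chosen)
```

In the analyses reported here the dependent variable would be acquisition stage and the candidate predictors the distributional statistics and phonological features; the toy columns above merely stand in for them.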

¹ See Appendix A for the table of phonological features used.

4 Results

After Shillcock & Westermann (1996), the corpora used in the forming of the regression model
were phonologically transcribed versions of London-Lund and CHILDES. The CHILDES
corpus gave the best fit to the data, hence this corpus was chosen for the initial work on
constructing the model. Shillcock & Westermann modelled 22 of the English consonants. The
model presented here includes all 24 consonants of English². Because a large number of multiple
regression analyses were conducted, reporting every ß score and t-statistic for each
predictor would be unwieldy. Where this information has not been provided, all statistics can be
found in Appendix B.

4.1 Modelling using the unbounded CHILDES corpus

4.1.1 Results

Table 6 – Predictor scores and model significance for the CHILDES corpus, 1 phoneme window,
no word boundaries.

Predictor           ß score      t(20)     Significance

1) ln (richness)     .530565      4.287    .0004
2) Degree           -.417719     -3.294    .0036
3) Labial           -.317222     -2.474    .0224

Adjusted R²         F (3,20)     Significance
.65610              15.626       0.0000

Table 6 shows the best linear model of acquisition order, found using stepwise regression, with
the corresponding ß scores and significance levels of the predictors in the model. The overall
significance of the model is excellent, with p being reported at 0.0000. The natural log of the
richness score accounts for 33% of the variance (R² = 0.33147, F(1,22) = 12.414, p = 0.0019).
The richness measure is the first to be added during the stepwise regression process, followed by
degree and labial. To compare the usefulness of the richness measure to frequency, a second
stepwise regression was run on the same data. This time only frequency and ln (frequency) were
offered as distributional options. The stepwise regression added ln (frequency), followed by
degree and labial. The model explained a similar amount of variance to the richness-based
model, with R² = 0.64938 (F(3,20)=15.199, p=0.0000).

² See table 4 for the order of acquisition of phonemes used in the model.
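
The reported model statistics can be cross-checked against one another. The sketch below assumes that the R² values quoted for the three-predictor models are adjusted R² (as the table heading indicates) and uses the standard relationships between adjusted R², unadjusted R² and the overall F statistic; the function name is illustrative.

```python
def f_from_adjusted_r2(adj_r2, n_cases, n_predictors):
    """Recover the overall F statistic from an adjusted R².
    Uses adj = 1 - (1 - R²)(n - 1)/(n - k - 1) and F = (R²/k) / ((1 - R²)/(n - k - 1))."""
    df_resid = n_cases - n_predictors - 1
    r2 = 1 - (1 - adj_r2) * df_resid / (n_cases - 1)
    return (r2 / n_predictors) / ((1 - r2) / df_resid)


# 24 phonemes, 3 predictors, adjusted R² = .65610 (unbounded CHILDES, 1 phoneme window)
f_unbounded = f_from_adjusted_r2(0.65610, n_cases=24, n_predictors=3)
print(round(f_unbounded, 3))
```

Under that assumption the recovered value agrees with the reported F(3,20) of 15.626 to within rounding, which suggests the quoted R² figures are indeed adjusted values.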

Distributional measures are a useful addition to the physical constraints of the phonological
features. With the statistical vector-based scores removed, a stepwise regression over the
phonological features can model just 39% of the variance. Only the consonantal and degree
features have significant t-values when incorporated within the regression model (R² = 0.39132,
F(2,21)=8.393, p=0.0021). Motivated by the later discovery of its importance³, an attempt was
made to incorporate the sparsity measure into the model as the only distributional measure.
However, the measure was not selected and the stepwise regression incorporated only the
consonantal and degree phonological features.

Shillcock & Westermann (1996) omitted two of the 24 consonants from their model, finding
that 77% of the variance in order of acquisition can be explained. Their statistics were derived
from the CHILDES corpus, with a 1 phoneme window and no word boundaries. In the model
presented here for the same 22 phonemes, the best stepwise model accounted for nearly 78% of
the variance in order of acquisition, a boost of 12% over the 24 phoneme model. The three
predictors included were degree, CE and sonorant (R² =0.77927, F(3,18)=25.712, p=0.0000).
However, this is only a partial attempt at replication as no criterion probability was incorporated
into the model. The 22 phoneme model also fitted the acquisition data much better than the
frequency approach. With frequency and ln (frequency) as the distributional options, 65% of the
variance could be explained by the model. The order of inclusion was found to be: degree, ln
(frequency), labial (R² =0.65020, F(3,18)=14.011, p=0.0001).

When the window size is increased to 2 phonemes (all other factors constant), ln (richness), then
degree, then labial are again picked out by the stepwise regression process. The amount of
variance explained is very slightly worse: R² =0.65193, F(3,20)=15.360, p=0.0000. With the
two phonemes omitted by Shillcock & Westermann removed, the R² is also reduced compared to
the 1 phoneme window, to 0.75753 (F(3,18)=22.870, p=0.0000). Degree, CE and sonorant are
again selected.

³ See ‘Modelling using bounded speech’ below.

The same pattern of incorporation of ln (richness), degree, labial, is again observed for the 3
phoneme window. Again the change of window size appears to make little difference to the
amount of variance that is explained. R² =0.65019, F(3,20)=15.250, p=0.0000. Using Shillcock
& Westermann’s 22 consonants, the R² is 0.69483 (F(3,18)=16.938, p=0.0000), and the same
predictors are incorporated in the same order as before. For window sizes of 4 and 5 phonemes,
the best predictors are ln (richness), degree and labial for all the analyses, including those in
which 22 consonants are modelled. The ß coefficients are highly consistent across the window
sizes (see Appendix B).

4.1.2 Discussion

The ß scores for ln (richness), degree and labial show a consistent relationship with acquisition
order. The natural log of the richness score is associated with later acquisition. This is to be
expected, as richness is calculated by dividing the number of nonzero dimensions by the total
cooccurrence count for the phoneme. Because an unbounded corpus incorporates unexpected
cooccurrences across word boundaries, the number of nonzero dimensions tends towards 45.
Hence there will be little variation due to the number of contexts in which a phoneme is found.
The bulk of the variation in the richness score will therefore be due to the total cooccurrence
count for the phoneme, which in turn is linked to the frequency of the phoneme. This implies
that the richness score for a phoneme in the unbounded corpus will be highly correlated with the
inverse of the frequency of the phoneme. More frequent phonemes tend to be acquired earlier,
therefore ‘richer’ phonemes should be acquired later. This is borne out by the results, as
richness has a ß of 0.53, implicating it in later acquisition⁴. Frequency and richness are also
similar predictors of the variance in order of acquisition: 65% for ln (frequency), degree and
labial versus 66% for ln (richness), degree and labial. Despite this similarity, the stepwise model
picks richness above frequency for all the window sizes, implying that there is some residual
contextually useful information present, albeit at a low level. The other two predictors, degree
and labial, are both associated with earlier acquisition (ß scores are negative).

⁴ The ß score for richness is 0.53 for all window sizes when the input is the unbounded CHILDES corpus.

With the two phonemes omitted by Shillcock & Westermann removed, the sonorant feature
becomes one of the predictors selected by the stepwise regression. As with degree and labial,
phonemes with the highest sonority scores were acquired earlier. CE is only selected within the
22 phoneme models; high CE scores are also correlated with earlier acquisition.

4.2 Modelling using the bounded CHILDES corpus

4.2.1 Results

In the models above, the sparsity variable was a poor predictor because of its uniformity across
the phonemes. When word boundaries are ignored, all but the most infrequent phonemes are
cooccurrent with every other phoneme. The introduction of word boundaries should remove
some of these inappropriate cooccurrence counts.

Table 7 – Predictor scores and model significance for the CHILDES corpus, 1 phoneme window,
word boundaries included.

Predictor           ß score       t(20)     Significance

1) Sparsity          0.592764      5.100    0.0001
2) Degree           -0.671729     -4.676    0.0001
3) Sonorant         -0.314080     -2.186    0.0409

Adjusted R²         F (3,20)      Significance
0.69766             18.69115      0.0000

The sparsity variable alone can predict 42% of the variance in order of acquisition (R² = 0.41638,
ß = 0.665, F(1,22)=17.409, p=0.0004). With 3 predictors, 70% of the variance in order of
acquisition is being modelled. The model is a highly significant predictor of the order of
acquisition. To test this model against frequency, a further stepwise regression included only
frequency, ln (frequency) and the phonological features as predictors. The order of inclusion
was ln (frequency), degree, labial, R² = 0.64938, F(3,20) = 15.19934, p = 0.0000. Again, the
frequency variable is a worse predictor of acquisition order than the context sensitive measure.



Again the model can be compared to Shillcock & Westermann’s findings by removing the two
phonemes that they omitted. The performance of the model was improved, but by a smaller
margin than for the unbounded corpus. The 22 phoneme model accounted for 74% of the
variance, an increase of 4% over the 24 phoneme model (R² = 0.74132, F(3,18) = 21.06018,
p = 0.0000). Included predictors were degree, sparsity and sonorant.

When the window size is increased to 2 phonemes, ln (richness) replaces sparsity as the first
measure to be included, followed by degree and labial. The R² reduces to 0.64738, F(3,20) =
15.07540, p=0.0000. At 3 phonemes, the top three predictors become sparsity, labial and
euclidean distance. R² = 0.68576, F(3,20) = 17.73096, p=0.0000. The same three factors are
incorporated in the same order for the 4 phoneme window model. R² = 0.66675, F(3,20) =
16.33914, p = 0.0000. At 5 phonemes, ln (richness), degree, and labial are the best predictors,
with R² = 0.65247, F(3,20) = 15.39407, p = 0.0000.

4.2.2 Discussion

Although accounting for less of the variance than the best model of Shillcock & Westermann,
this model incorporated all 24 English consonants. Reduction of the input cases to 22 phonemes
increased the model performance by 4%. The model incorporating sparsity was able to explain
5% more of the variance than the ln (frequency) model. The sparsity predictor, with a positive
ß, is associated with later acquisition: phonemes that occur in few contexts are acquired later
than those found in cooccurrence with many other phonemes. This replicates the finding of
Shillcock & Westermann that phonemes with a large phonotactic range were acquired earlier.
Degree is again implicated in earlier acquisition in the 1 phoneme window model. Labial
appears in models with a larger window size, again with negative ß scores. Phonemes with
higher sonority were acquired earlier.

4.3 Backward and forward models in the CHILDES corpus

4.3.1 Forward models



Experiments were also undertaken using asymmetric windows as input to the LSA model. In the
first of these tests, the model only counted cooccurrences in the window in front of the central
phoneme. With a 1 phoneme window, the stepwise regression selected sparsity, degree and
sonorant (as had been the case with the symmetric window). However, the R² was reduced (R² =
0.68949, F(3,20) = 18.02408, p=0.0000). With a 2 phoneme window only sparsity and degree
were significant when within the model, and R² dropped to 0.62534, F(2,21) = 20.19437,
p=0.0000. For the windows of 3-5 phonemes, sparsity, degree and labial were the top 3
predictors, with the R² scores reducing from 0.66744 to 0.64712 to 0.62119.

4.3.2 Backward models

In the backward model, counts were only taken from the segment of the window behind the
central phoneme. With a 1 phoneme window, three predictors were included in the order ln
(frequency), degree, labial. R² = 0.64938, F(3,20)=15.19934, p=0.0000. With a 2 phoneme
window, ln (euclidean), degree and labial were the top predictors, with R² = 0.65788,
F(3,20)=15.743, p=0.0000. For a window of 3 phonemes, the top predictors are sparsity, degree
and ln (euclidean), R² = 0.62667, F(3,20) = 13.86901, p = 0.0000. For the 4 and 5 phoneme
windows, the best predictors are ln (euclidean), degree and labial. R² = 0.66068 for the 4
phoneme window, R² = 0.65619 for the 5 phoneme window (see appendix B for full statistical
data).

4.3.3 Discussion

Shillcock & Westermann found that the forward model was better than the full phonotactic range
variable for predicting acquisition order. That result has not been replicated here. However, the
finding that the forward model is a better predictor of the variance than the backward model does
replicate Shillcock & Westermann.

4.4 Comparison with the London-Lund corpus

4.4.1 Unbounded speech



Within a 1 phoneme window, the top 3 predictors for the unbounded Lund corpus were found to
be degree, richness and ln (euclidean). R² = 0.59004, F(3,20) = 12.03451, p = 0.0000. The same
predictors are selected for 2 phoneme and 3 phoneme windows, the R² scores being 0.61700
and 0.56772 respectively. For the 4 phoneme window, degree, richness and euclidean distance
are the best predictors, with R² = 0.56716. With a 5 phoneme window, degree, richness and ln
(euclidean) are incorporated into the model, with an R² of 0.56526 (see appendix B for full
statistical data).

4.4.2 Bounded speech

The London-Lund corpus represents a selection of adult English speech. For a 1 phoneme
window, the 3 best predictors were sparsity, degree and AMU. R² = 0.65583, F(3,20) =
15.60897, p = 0.0000. The same sequence of predictors is seen for 2 and 3 phoneme windows.
R² = 0.65386 for the 2 phoneme and 0.65269 for the 3 phoneme windows. With a 4 phoneme
window, degree, ln (frequency) and labial are incorporated into the model, with an R² of only
0.58388. Stepwise regression on the 5 phoneme window data selects degree, ln (frequency) and
sparsity, with R² = 0.58909.

4.4.3 Discussion

The best model derived from the adult speech data was found to have an R² of 0.65583, in
contrast to the best model from CHILDES of 0.69766. This replicates the finding in Shillcock
& Westermann that speech directed at children under 28 months was a better predictor of
acquisition order than the adult linguistic environment.



5 Discussion

The linear regression models presented here show that a large proportion of the variance in
acquisition order, as defined by the PACS clinical measure, can be explained using a
combination of distributional and feature-based phonological variables. A combination of the
two types of measure seems essential to form a highly significant model. None of the stepwise
regressions presented above and in appendix B selected solely distributional or feature-based
measures.

5.1 The relevance of phonological features – physical factors in acquisition.

Only 3 of the 11 phonological features were included in the regression models. By far the most
common were degree and labial, occurring either alone or together in most of the calculated
models. Sonority is also implicated, most notably in the bounded CHILDES corpus with a 1
phoneme wide input window. In this case sonorant was the third variable to be added to the
model, with 70% of the variance in order of acquisition attributable to the 3 variables.

5.1.1.1 The degree feature

The degree feature is often termed continuant in the phonology literature. Continuant is a
manner feature. Phonemes that are +continuant are defined as those in which air flows
continually through the oral cavity during production. Stops and affricates that block air flow
are –continuant. However, in the degree feature description, a positive score indicates closure
(as in /p/, /b/, /g/) and a zero or negative score indicates that air continues to flow (e.g. /f/, /j/, /l/).
The ß scores found in the modelling trials show that degree is negatively correlated with
acquisition, and hence closure of the oral cavity is likely to be present in those phonemes that are
acquired early.

5.1.1.2 The labial feature



Phonemes that are +labial involve movement of one or both of the lips, such as /p/, /m/, /w/.
Labial movements are also correlated with early acquisition. Phonemes that are –labial include
/t/, /d/ and /l/.

5.1.1.3 The sonorant feature

The sonorant feature is represented in the feature matrix as a hierarchy. In a binary system,
+sonorant phonemes are those in which air has free passage through the vocal tract. Sonority
differs from the continuant feature because free air flow in the vocal tract can be due to a lack of
restriction in the oral or nasal tracts. The sonority hierarchy can also be considered as
representing a gross measure of ‘closeness to vowel’. In the feature representation used here, /w/
and /j/ are the most sonorous, and are often termed semi-vowels. The least sonorous sounds are
also the least similar to vowels – stops such as /p/, /d/ and /g/.

5.1.2 Physical factors and models of phonological development

The discovery that high values of degree, labial, and sonorant are all implicated in early
acquisition is consistent with current theories of phonological development that involve a
biological component. The degree measure, reflecting closure of the oral tract, would be
expected to be an ‘early’ feature due to the nature of infant control over sound production.
Initially infants have little control over their articulators, and hence producing an oral stop would
be a characteristic of the easiest phonemes to produce. With little control over the tongue, an
oral stop can be produced simply by moving the jaw. The importance of this feature is
consistent with an approach such as that of MacNeilage and Davis (1990), who contend that the
rhythmic jaw movements in babbling are part of the basis for acquisition of a more complicated
phonology.

The sonority feature has a similar physical explanation, although in this case it is free air flow in
the vocal tract that is found in phonemes that occur early. It is easiest to take the view that
sonority is a measure of the closeness of the phoneme to being classed as a vowel. A range of
vowel sounds are amongst the first sounds the infant produces. Even in babbling, where the
consonants interspersing the vowels are highly constrained, a wide variety of vowel sounds are
produced. In a biological theory where practice improves control over sounds, the vowels would
be amongst the most practised by the infant. In this case it would be expected that consonantal
phonemes closest to vowels would be acquired earlier.

The prominence of the labial feature can also be linked to babbling, although the motion of the
lips in babbling may well be part of the general closure caused by the rhythmic movement of the
jaw. Importantly, however, the labial feature is a visually salient cue for the infant learning
speech sounds. Babies from an early age imitate facial expressions, hence imitating the lip
movement of care givers would also help the infant master phonemes requiring use of the lips.
Lip movement would therefore be characteristic of early phonemic acquisition, an observation
borne out by the data.

Unlike in the Shillcock & Westermann model, movement of the tip and blade of the tongue was
not found to be a significant predictor of later acquisition. However, the other Government
Phonology feature implicated in the Shillcock & Westermann model was “ʔ” – “occlusion,
abruptness, alone, glottal stop” (p2). This feature is similar to the closure implied in the degree
feature above. Phonemes with this feature were also found to be acquired earlier.

5.2 Distributional features in acquisition.

A role for frequency in phonological acquisition has often been suggested. Frequent phonemes
are in general acquired earlier⁵. Ingram (1992) concedes that some form of linguistic (as
opposed to biological) factor is required to describe acquisition. This is termed ‘functional load’:

“That is, the more words a child acquires with a particular sound, the more likely it is that the sound
will be produced”. p428.

The functional load factor is obviously linked to frequency, but in the modelling trials above,
frequency appears less as a predictor than contextually sensitive measures such as sparsity and
richness. In the model constructed using the bounded CHILDES corpus, with a 1 phoneme
window, the stepwise regression selected sparsity, degree and sonorant. With these predictors
70% of the variance in the order of acquisition could be modelled. Forcing frequency into the
equation, the best model accounted for 65% of the variance. For the model constructed in the
same manner with an unbounded corpus, the ln (richness) variable was only very slightly better
than ln (frequency). However, for all the window sizes the ln (richness) measure was selected
for the model. With an unbounded corpus and a high window size, it would be expected that
virtually all phonemes would cooccur within the same window at some point. However, even
with an unbounded corpus and a window size of 5 phonemes, ln (richness) is selected above ln
(frequency). Even at this level, some information about cooccurrences of very infrequent
phonemes may be encoded in the statistic, and be helpful enough to ensure that richness is
selected over frequency.

Sparsity was the distributional factor incorporated into the most stepwise multiple regression
models, especially, although not exclusively, where the window size was small. Sparsity is also
the dominant predictor in the best model of the acquisition of all 24 phonemes (in combination
with degree and sonorant). This replicates the finding of Shillcock & Westermann that the
phonotactic range variable was the best distributional measure.

The window size parameter was varied for all the different classes of model. The best model,
accounting for 70% of the variance in order of acquisition, was found at window size of 1
phoneme. In general, the proportion of variance explained by the regression models reduced as
the window size increased. The introduction of an asymmetric window did not improve the
models, although the forward model was found to account for more of the variance than the
backward model.

The addition of word boundaries over which the model window could not pass improved the
proportion of variance explained. The use of bounded speech also allowed features such as
sparsity to become relevant due to the absence of inappropriate cooccurrences across word
boundaries. Although Shillcock & Westermann did not give special status to word boundaries,
they did include a criterion probability for the inclusion of a phoneme bigram into the model. In
general this would be expected to have a similar effect to the addition of word boundaries - one
would expect that inappropriate cooccurrences would be of low frequency and hence removed
from consideration. An approach based on word boundaries is to be preferred, as the child is still
allowed to be sensitive to low frequency cooccurrences within words. Even at higher window
sizes, with bounded speech, context based predictors such as sparsity can be selected for the
regression model. It is likely, however, that the effective width of a 5 phoneme window within
the bounded CHILDES corpus is less than 5 phonemes. If the average word length is less than
11 phonemes (which it is certain to be) then counts cannot be collected at the extremes of the
window. To study the effects of window size on the sparsity predictor, a simple one-way
ANOVA was calculated. Data were taken from the bounded CHILDES corpus. Sparsity scores
were the dependent measure and window size was the effect to be studied. For the sparsity
measure the window size effect was significant (F(4,115) = 12.63, p < 0.0001). However, the
window size did not appear to affect the ln (richness) measure (F(4,115) = 1.09, p = 0.365).

5
For the CHILDES corpus, in fact, ln (frequency) and acquisition order were found to be correlated with r = -0.597,
p = 0.001.
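As a minimal sketch of the one-way ANOVA described above, the following Python computes the F statistic and its degrees of freedom for scores grouped by window size. The function name `one_way_anova` and the toy scores are illustrative assumptions, not the dissertation's data. Assuming 24 scores at each of 5 window sizes, the degrees of freedom come out as (4, 115), the same as in the reported F(4,115).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        group_mean = mean(group)
        # Between-group part: spread of the group mean around the grand mean.
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        # Within-group part: spread of scores around their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in group)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical sparsity scores: 24 phonemes at each of 5 window sizes.
toy_scores = [[0.10 * w + 0.01 * i for i in range(24)] for w in range(1, 6)]
f_stat, df_between, df_within = one_way_anova(toy_scores)
print(f"F({df_between},{df_within}) = {f_stat:.2f}")
```

The toy scores are constructed so that the group means differ; real sparsity scores would be read from the model output for each window size.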

Statistics derived from the CHILDES corpus are a better predictor than those derived from
London-Lund. This is another piece of evidence for the interaction between distributional
factors in the environmental input and the stage at which phonemes are acquired. This result,
combined with the incorporation of at least one distributional measure in each of the stepwise
regressions, supports the view that distributional sensitivity is an important part of the learning of
phonology.

6 Conclusions
6.1 Biological factors
The corpus analysis presented here offers two distinct perspectives on phonological acquisition.
From the biological perspective, the presence of the phonological features degree, labial and
sonorant is evidence of the influence of movement and control on acquisition. All of these
features, associated with early acquisition, are in some way ‘easy’ for the infant to master.
Closure of the oral tract, characteristic of the degree feature, does not require precise movements
of the articulators (at least initially). The sonority hierarchy is linked to vowel proximity. The
consonants most similar to vowels are in general acquired earlier. This could be due to similar
sounds already being present in the babbling. Boysson-Bardies et al. (1992) have shown that
there are cross-linguistic differences even at the babbling stage. Hence there is evidence for
some kind of perceptual and motoric matching process even at the babbling stage. The gaining
of finer control over the babbling process would allow the most vowel-like phonemes to be more
easily incorporated into the child’s phonological repertoire at a later time. The hypothesised
perceptual-motoric matching process as discussed with respect to vowels implies that the
auditory perceptual stream is important. However, with the prominence of the labial feature in
the acquisition models, we can posit a role for a visual component in this matching process. The
labial feature is the only feature discriminable by sight alone, and appears to be a useful cue to
children learning to produce phonemes. In fact, the labial consonants are the only ones to show
cross-linguistic variation in the first year of life (Vihman 1996).

We do not wish to posit separate systems for babble and speech. Such a distinction would lead
to problems defining the first phonemes to be acquired. An analogy could be drawn between
woodwork and speech. Just as motor repetitions are not practice for woodwork, babbling does
not have to be seen as practice for speech. However, without early motor practice, both carpentry
and language would be out of the question. The effects of the environment are clear in both
situations. The ability of an infant to pick up and manipulate certain objects will depend in some
way on previous objects in the environment that have been played with. The linguistic
environment of the child also provides perceptual and motor feedback. The child uses whatever
cues it can, such as motion of the lips, but can also draw on practised vocalisations (vowels) and
relatively easy movements to produce closure. The distinction here blurs between the biological
and the cognitive, as the child is seen neither as a maturing biological machine nor as a little
scientist testing out phonological hypotheses. The argument for holistic representations over
segmental representations in early speech is often bolstered by the observation that ‘whole word’
processes tend to be replaced by segmental processes as the child learns to speak. Holistic
processes such as reduplication and final consonant deletion give way to segmental processes
like the gliding of liquids and the stopping of fricatives. However, this developmental trajectory
may simply be a consequence of the infant’s greater and greater command over vocalisations.

6.2 Statistical measures


The distributional statistics presented here do play an important role in modelling the order of
acquisition. The superiority of the sparsity and richness measures over frequency replicates
Shillcock & Westermann’s finding that phonotactic range is a better predictor of acquisition
order than frequency. In these analyses, the natural log of the frequency proved to be a better
predictor than frequency itself. It appears that gross measures such as phoneme frequency are
less important than measures that define the phoneme in terms of its context and pattern of
occurrence with other phonemes. When input to the stepwise regression, it is the measures that
incorporate the contextual sparsity (or richness) of the phoneme that are better predictors of the
order of acquisition than frequency.

Frequency confounds abound, however. Although the sparsity measure accounts for more of the
variance in acquisition order than does frequency, it is in no way insulated from it. Descriptive
statistics calculated in advance of the multiple regression reveal that sparsity is correlated at
around –0.7 with frequency (p < 0.001). High frequency phonemes also tend to occur in the
widest range of contexts. The ln (frequency) and ln (richness) scores are even more highly
correlated. Despite this confound, the finding that the frequency of the phoneme unit is not the
best distributional predictor of acquisition order does stand.
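The confound can be illustrated with a small sketch. It assumes, purely for illustration, that a phoneme's contextual range is the number of distinct phonemes adjacent to it in a toy transcription. This is a richness-like stand-in and not the dissertation's definition of sparsity or richness; the helper names and the toy sequence are hypothetical.

```python
import math
from collections import Counter, defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def frequency_and_context_range(phonemes):
    """Count each phoneme and the distinct neighbours it occurs next to."""
    freq = Counter(phonemes)
    neighbours = defaultdict(set)
    for left, right in zip(phonemes, phonemes[1:]):
        neighbours[left].add(right)
        neighbours[right].add(left)
    return freq, {p: len(neighbours[p]) for p in freq}


# Hypothetical toy transcription, one letter per phoneme.
toy = list("datadabatakatamasanazada")
freq, context_range = frequency_and_context_range(toy)
symbols = sorted(freq)
r = pearson_r([math.log(freq[p]) for p in symbols],
              [context_range[p] for p in symbols])
print(f"r(ln frequency, context range) = {r:.2f}")
```

Even in this tiny example the frequent symbol is also the one with the widest range of neighbours, so the two measures cannot be treated as independent predictors.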

Criticisms have been made of the nature of representation in early language acquisition. The
phoneme unit may be less important than more holistic or syllabic forms. Although the model
presented here takes phonemes as a base unit, the nature of the statistical information derived
from these base units is defined over groups of phonemes within a certain window. The
distributional statistics calculated from cooccurrences within a moving window do not place the
same amount of importance on the phoneme unit as a measure such as frequency.

The finding that acquisition order is best modelled by incorporating a context-sensitive
distributional measure should not be surprising. The LSA model is one possible solution to the
problem of semantic knowledge acquisition and relies on context to derive many different
predictions of human behaviour – such as reaction times and semantic similarity judgements.
Contexts are calculated over words. In the experiments presented here, however, the context
based relationship is on a much narrower level, between phonemes within a 1 phoneme window.
One potential reason for this discrepancy could be the nature of the learning processes in the
child. Elman (1996) showed that to train a neural network on grammatical agreement, the initial
‘window size’ for the input to the model had to be low. As the model learned the window size
could be increased. If the initial input window was too large the model failed to learn the task.
It is possible that context based learning mechanisms are important in learning in the human
brain. A small input window in infancy could allow the developing child to order and arrange
their perceptual inputs effectively. The approach taken indicates that context sensitive learning
is useful for the acquisition of both semantics and phonology. In semantics the context can help
a reader or listener understand a new word. In phonology sensitivity to context might help the
infant pick out those phonemes that are most communicatively useful. Intriguingly, recent work
by Shillcock and others (personal communication) indicates that there is a relationship between
semantics and phonology in the English lexicon. The words with the highest correlations are
mainly those with the most communicative value. A (very hypothetical) topographic account is
compatible with models of semantics and phonology using high-dimensional vector spaces.

Around 30% of the variance in order of acquisition remains to be explained. However, as
discussed earlier, a model predicting 100% of Grunwell’s PACS acquisition order would be too
constrained, over-fitting the much more noisy relationship observed in individual children. One
source of variance would undoubtedly be physical and motor differences. This is especially
noticeable when comparing different ways of producing the same phonetic utterance. The
phonemes /d/ or /g/ could be produced using just the tongue tip or the whole tongue pressed
against the palate. The infant often does not have any physical cues to help decide on the correct
position of the tongue in various phonemes. Only auditory cues can help the child match the /d/
in the environment with the /d/ or /d/-like sound which is produced. This lack of a combination
of cues could be an important source of individual variation. Parental input language also affects
acquisition. It would be interesting to study exactly how fine-grained the relevant distinctions in
linguistic input are; different accents may even influence order of acquisition.

7 Extensions

The LSA model also allows for calculations of vector similarity (used for estimating semantic
similarity) to be performed on pairs of phonemes. If a vector based account is useful for
predicting acquisition, then it may also be useful for predicting phonological similarities. An
initial test was performed to see whether the contextual similarity of a pair of phonemes affected
the age at which they were acquired. It might be expected that highly similar phonemes would
be difficult to discriminate, and hence acquired later. All correlations found in this initial test
were less than 0.12 (in the expected direction). Further work would be required to test the
significance and usefulness of this result. Presumably biological factors, modelled in terms of
similarity of feature based accounts, would have to be incorporated. However, combining
distributional and physical data might lead to a model with testable predictions about the
similarity of phonemes in context. Human subjects could be tested by asking them to perform
judgements of phonological similarity, for example whether /pub/ is closer to /cub/ or /rub/ in
the way the word sounds.
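A minimal sketch of the vector similarity calculation follows, assuming cosine similarity (the measure commonly used with LSA) over context count vectors built from a 1 phoneme window within words. The toy words, the helper names and the choice of cosine are assumptions for illustration; the dissertation's own vectors are not reproduced here.

```python
import math
from collections import Counter


def context_vectors(words):
    """Map each phoneme to counts of its immediate neighbours within words."""
    vectors = {}
    for word in words:
        for i, phoneme in enumerate(word):
            counts = vectors.setdefault(phoneme, Counter())
            if i > 0:
                counts[word[i - 1]] += 1
            if i < len(word) - 1:
                counts[word[i + 1]] += 1
    return vectors


def cosine_similarity(u, v):
    """Cosine between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy lexicon, one letter per phoneme.
toy_words = ["pat", "bat", "pit", "bit", "tap", "tab"]
vectors = context_vectors(toy_words)
print(round(cosine_similarity(vectors["p"], vectors["b"]), 3))
```

In this toy lexicon "p" and "b" occur in identical contexts, so their similarity is maximal; under the hypothesis in the text, such a pair would be expected to be harder to discriminate.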

The role of the information theoretic measures might also be considered in more depth
elsewhere. In the models presented here, however, measures like the Euclidean distance and
relative entropy were in general less useful for modelling acquisition order than simple measures
based on context and frequency.
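For reference, the two information theoretic measures named above can be sketched as follows. Which pair of distributions is compared for each phoneme is not restated in this section, so comparing a phoneme's context distribution with a reference distribution is an assumption here, as are the function names and the toy values.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


# Hypothetical context distribution for one phoneme and a reference distribution.
phoneme_contexts = [0.5, 0.3, 0.2]
reference = [0.4, 0.4, 0.2]
print(euclidean_distance(phoneme_contexts, reference))
print(relative_entropy(phoneme_contexts, reference))
```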

Model modifications

Other measures of articulatory features could also be incorporated into the model. Lindblom
(1992) uses an ‘articulatory cost’ measure to assign a level of difficulty to the production of a
certain phoneme.

In the corpora used there are no markers for the end of an utterance. Hence we cannot take
account of sentence boundaries, only word boundaries. Utterance boundaries might be a better
unit over which to allow the window to pass.
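The effect of boundaries on the moving window can be sketched as below. The function treats each unit (a word, or an utterance if such markers were available) as a separate sequence that the window cannot cross. The name `cooccurrence_counts`, its parameters and the toy input are illustrative assumptions rather than the dissertation's implementation.

```python
from collections import Counter


def cooccurrence_counts(units, window=1, bounded=True):
    """Count ordered (target, context) phoneme pairs within a symmetric window.

    units   -- list of phoneme sequences (words or utterances)
    window  -- number of positions counted on each side of the target
    bounded -- if True, the window may not cross from one unit into the next
    """
    sequences = units if bounded else [[p for unit in units for p in unit]]
    counts = Counter()
    for seq in sequences:
        for i, target in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(target, seq[j])] += 1
    return counts


# Hypothetical two-word input, one letter per phoneme.
words = [list("kat"), list("dog")]
bounded_counts = cooccurrence_counts(words, window=1, bounded=True)
unbounded_counts = cooccurrence_counts(words, window=1, bounded=False)
print(bounded_counts[("t", "d")], unbounded_counts[("t", "d")])
```

With short units a wide window is truncated at the unit edges, which is one way to see why the effective window width in a bounded corpus falls below its nominal width.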

This modelling technique should also be tested cross-linguistically. At the moment, however,
there is a lack of sufficient transcribed phonological corpora that can be read into the model.
Orders of acquisition are difficult to find for English consonants, and similar problems would
be found in other languages.

Clinical implications

The finding of the model that context is more important than frequency may have clinical
implications. Children with phonological disorders may be more usefully trained to master the
language by presenting problem phonemes in as many different contexts as possible (rather than
simply emphasising the learning of words in which the phoneme is present).

References

Bernhardt, B.H., & Stemberger, J.P. (1998) The handbook of phonological development from the
perspective of constraint-based nonlinear phonology. San Diego: Academic Press.

Boysson-Bardies, B. de, Vihman, M.M., Roug-Hellichius, L., Durand, C., Landberg, I. & Arao,
F. (1992) Material evidence of infant selection from target language: a cross linguistic phonetic
study. In C.A. Ferguson, L. Menn & C. Stoel-Gammon (eds.), Phonological Development:
models, research, implications. Maryland: York Press.

Chomsky, N. & Halle, M. (1968) The sound pattern of English. New York: Harper & Row

Davenport, M. & Hannahs, S.J. (1998) Introducing phonetics and phonology. London: Arnold.

Elman, J.L. (1993). Learning and development in neural networks: the importance of starting
small. Cognition 48, 71-99.

Elman, J.L. (1996). Rethinking Innateness.

Grunwell, P. (1985). Phonological Assessment of Child Speech. Windsor: NFER – Nelson.

Harm, M.W. & Seidenberg M.S. (1999) Phonology, reading acquisition, and dyslexia: insights
from connectionist models. Psychological Review 106 491-528.

Hayes, B. (1999) Phonological acquisition in Optimality Theory: the early stages, in Matthew
Gordon, ed., Papers in Phonology 2, UCLA Working Papers in Linguistics 1, 167-206.
[Submitted for formal publication in René Kager and Wim Zonneveld, eds., Fixing Priorities:
Constraints in Phonological Acquisition, to be published by Cambridge University Press.]

Ingram, D. (1988). The acquisition of word initial /v/. Language and Speech, 31, 77-85.

Ingram, D. (1992) Early phonological acquisition: a cross linguistic perspective. In C.A.
Ferguson, L. Menn & C. Stoel-Gammon (eds.), Phonological Development: models, research,
implications. Maryland: York Press.

Jakobson, R. (1968). Child language, Aphasia and phonological universals. The Hague:
Mouton.

Kuhl, P.K. & Meltzoff, A.N. (1982) The bimodal perception of speech in infancy. Science 218,
1138-41.

Landauer, T.K. & Dumais, S.T. (1997) A solution to Plato’s problem: The Latent Semantic
Analysis theory of acquisition, induction, and representation of knowledge. Psychological
Review, 104, 211-240.

Lindblom, B. (1992) Phonological units as adaptive emergents of lexical development. In C.A.
Ferguson, L. Menn & C. Stoel-Gammon (eds.), Phonological Development: models, research,
implications. Maryland: York Press.

Locke, J.L. (1983) Phonological acquisition and change. New York: Academic Press.

Locke, J.L. (1995) Development of the capacity for spoken language. In Fletcher, P, &
MacWhinney, B. (eds.) The handbook of child language. Oxford: Blackwell

Locke, J.L. & Pearson, D.M. (1992) Vocal learning and the emergence of phonological capacity.
In C.A. Ferguson, L. Menn & C. Stoel-Gammon (eds.), Phonological Development: models,
research, implications. Maryland: York Press.

MacDonald, S. (2000) Unpublished PhD thesis at the University of Edinburgh.

Masataka, N. (1992). Pitch characteristics of Japanese maternal speech to infants. Journal of
Child Language 19, 213-23.

Menn, L. & Stoel-Gammon, C. (1995) Phonological development. In Fletcher, P. &
MacWhinney, B. (eds.) The handbook of child language. Oxford: Blackwell.

Olmstead, D. (1966) A theory of the child’s learning of phonology. Language 42, 531-5.

Plunkett, K. & Marchman, V. (1993) From rote learning to system building: acquiring verb
morphology in children and connectionist nets. Journal of Child Language 20, 43-60.

Stampe, D. (1979) A dissertation on natural phonology. NY: Garland.

Shillcock, R.C. & Westermann, G. (1996) The role of phonotactic range in the order of
acquisition of English consonants. In Proceedings of the Fifth Symposium of the International
Clinical Phonetics and Linguistics Association.

Shillcock, R.C., Hicks, J., Cairns, P., Levy, J., & Chater, N. (under revision). A statistical
analysis of an idealised phonological transcription of the London-Lund corpus. Journal of
Computer Speech and Language

Studdert-Kennedy, M. (1991) Language development from an evolutionary perspective. In N.
Krasnegor, D. Rumbaugh, R. Schiefelbusch & M. Studdert-Kennedy (eds.) Language
acquisition: biological and behavioural determinants. Hillsdale NJ: Erlbaum.

Vihman, M.M. (1992) Early syllables and the construction of phonology. In C.A. Ferguson, L.
Menn & C. Stoel-Gammon (eds.), Phonological Development: models, research, implications.
Maryland: York Press.

Vihman, M.M. (1996) Phonological development: the origins of language in the child.
Cambridge, Mass: Blackwell

Appendix A - Phonological feature representations for the consonants.

Symbol    Sonorant  Consonantal  Voice  Nasal  Degree  Labial  Palatal  Pharyngeal  Round  Tongue  Radical
+               -1            1      0     -1       1       1        0          -1      1       0        0
,               -1            1      0     -1       1      -1        1          -1     -1       1        0
-               -1            1      0     -1       1      -1       -1          -1     -1      -1        0
.               -1            1     -1     -1       1      -1       -1          -1     -1      -1        0
/               -1            1     -1     -1       1       1        0          -1      1       0        0
0               -1            1     -1     -1       1      -1        1          -1     -1       1        0
1             -0.8            1     -1     -1       1      -1        0          -1     -1       0        0
3 4           -0.8            1      0     -1       1      -1        0          -1     -1       0        0
5             -0.5            1      0     -1       0      -1        1          -1     -1       0        0
6             -0.5            1     -1     -1       0      -1        1          -1      1       0        0
7             -0.5            1      0     -1       0      -1       -1           1     -1      -1       -1
8             -0.5            1     -1     -1       0      -1        1          -1     -1       1        0
9             -0.5            1     -1     -1       0      -1        0          -1     -1       0        0
:             -0.5            1     -1     -1       0      -1        1          -1     -1       0        0
;             -0.5            1      0     -1       0      -1        1          -1      1       0        0
<             -0.5            1      0     -1       0      -1        1          -1     -1       1        0
=             -0.5            1      0     -1       0      -1        0          -1     -1       0        0
>                0            0      1      1       1       1        0          -1      1       0        0
?                0            0      1      1       1      -1        1          -1     -1       1        0
@                0            0      1      1       1      -1       -1          -1     -1      -1        0
A              0.5            0      1      0      -1      -1        1          -1     -1       1        0
B              0.5            0      1      0      -1      -1       -1           1      1      -1       -1
C              0.8            0      1      0       0       1       -1          -1      1      -1        0
D              0.8            0      1      0       0      -1        0          -1     -1       0        1

Appendix B - Results of stepwise regressions for all cases
CHILDES corpus - no word boundaries

2 phoneme window

Multiple R .83506
R Square .69733
Adjusted R Square .65193
Standard Error 1.07918

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.66560 17.88853
Residual 20 23.29274 1.16464

F = 15.35975 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.171275       .360777   -.414220   -3.247   .0040
LABIAL            -.761976       .309898   -.317163   -2.459   .0232
XLNRICH            .792045       .187126    .527338    4.233   .0004
(Constant)        8.975619      1.488574               6.030   .0000
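
As a consistency check on how the summary statistics in these listings relate to one another, the sketch below recomputes R Square, Adjusted R Square and F from the analysis of variance figures for the 2 phoneme window model above. It assumes the standard formulas and n = 24 cases (regression DF 3 plus residual DF 20 plus 1); the function names are illustrative.

```python
def r_squared(ss_regression, ss_residual):
    """Proportion of variance explained by the regression."""
    return ss_regression / (ss_regression + ss_residual)


def adjusted_r_squared(r2, n_cases, n_predictors):
    """R squared penalised for the number of predictors."""
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


def f_statistic(ms_regression, ms_residual):
    """Ratio of regression to residual mean squares."""
    return ms_regression / ms_residual


# Figures copied from the 2 phoneme window listing (CHILDES, no word boundaries).
r2 = r_squared(53.66560, 23.29274)
adj = adjusted_r_squared(r2, n_cases=24, n_predictors=3)
f = f_statistic(17.88853, 1.16464)
print(f"R2 = {r2:.5f}, adjusted R2 = {adj:.5f}, F = {f:.4f}")
```

The recomputed values agree with the listed .69733, .65193 and 15.35975 to within rounding, and the same relations hold for the other listings in this appendix.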

3 phoneme window

Multiple R .83416
R Square .69582
Adjusted R Square .65019
Standard Error 1.08188

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.54904 17.84968
Residual 20 23.40929 1.17046

F = 15.25008 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.174204       .361668   -.415256   -3.247   .0040
LABIAL            -.761642       .310703   -.317024   -2.451   .0235
XLNRICH            .788873       .187367    .525863    4.210   .0004
(Constant)        9.267534      1.563720               5.927   .0000

4 phoneme window

Multiple R .83447
R Square .69635
Adjusted R Square .65080
Standard Error 1.08094

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.58980 17.86327
Residual 20 23.36853 1.16843

F = 15.28831 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.174091       .361353   -.415216   -3.249   .0040
LABIAL            -.763303       .310368   -.317716   -2.459   .0232
XLNRICH            .789418       .187149    .526261    4.218   .0004
(Constant)        9.496357      1.613803               5.884   .0000

5 phoneme window

Multiple R .83426
R Square .69599
Adjusted R Square .65039
Standard Error 1.08157

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.56251 17.85417
Residual 20 23.39582 1.16979

F = 15.26270 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.174926       .361562   -.415511   -3.250   .0040
LABIAL            -.762754       .310573   -.317487   -2.456   .0233
XLNRICH            .787288       .186876    .525943    4.213   .0004
(Constant)        9.653619      1.652209               5.843   .0000

CHILDES corpus – word boundaries included

2 phoneme window

ln(richness), degree, labial

Multiple R .83269
R Square .69337
Adjusted R Square .64738
Standard Error 1.08622

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.36096 17.78699
Residual 20 23.59737 1.17987

F = 15.07540 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.167201       .363146   -.412780   -3.214   .0044
LABIAL            -.754908       .312225   -.314221   -2.418   .0253
XLNRICH            .892292       .213751    .524132    4.174   .0005
(Constant)        9.226914      1.567310               5.887   .0000

3 phoneme window

sparsity, labial, euclidean

Multiple R .85250
R Square .72675
Adjusted R Square .68576
Standard Error 1.02540

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 55.92942 18.64314
Residual 20 21.02891 1.05145

F = 17.73096 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
LABIAL           -1.012171       .286359   -.421304   -3.535   .0021
XEUCLID       -3.11393E-05    1.0981E-05   -.355948   -2.836   .0102
XSPARSE          12.200683      3.274957    .475751    3.725   .0013
(Constant)        2.436262       .504747               4.827   .0001

4 phoneme window

sparsity, labial, euclidean

Multiple R .84274
R Square .71022
Adjusted R Square .66675
Standard Error 1.05596

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 54.65720 18.21907
Residual 20 22.30114 1.11506

F = 16.33914 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
LABIAL           -1.062296       .292999   -.442168   -3.626   .0017
XEUCLID       -2.32703E-05    7.5009E-06   -.391420   -3.102   .0056
XSPARSE          11.802944      3.383086    .445436    3.489   .0023
(Constant)        2.560260       .493920               5.184   .0000

5 phoneme window

ln (richness), degree, labial

Multiple R .83535
R Square .69780
Adjusted R Square .65247
Standard Error 1.07834

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.70184 17.90061
Residual 20 23.25650 1.16282

F = 15.39407 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.152242       .360595   -.407489   -3.195   .0045
LABIAL            -.742150       .310377   -.308911   -2.391   .0267
XLNRICH            .893242       .210688    .529597    4.240   .0004
(Constant)        9.417236      1.587609               5.932   .0000

FORWARD AND BACKWARD MODELS

CHILDES corpus (bounded) - Forward 1 phoneme window

sparsity, degree, sonorant

Multiple R .85440
R Square .72999
Adjusted R Square .68949
Standard Error 1.01930

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 56.17907 18.72636
Residual 20 20.77927 1.03896

F = 18.02408 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.903867       .411576   -.673301   -4.626   .0002
SONORANT         -1.483222       .460647   -.464558   -3.220   .0043
XSPARSE           5.610286      1.126509    .593921    4.980   .0001
(Constant)        1.017898       .531878               1.914   .0701

2 phoneme window

Sparsity, degree

Multiple R .81112
R Square .65792
Adjusted R Square .62534
Standard Error 1.11965

Analysis of Variance
DF Sum of Squares Mean Square
Regression 2 50.63225 25.31613
Residual 21 26.32608 1.25362

F = 20.19437 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.236429       .364812   -.437262   -3.389   .0028
XSPARSE           7.183658      1.489403    .622264    4.823   .0001
(Constant)        2.361263       .369725               6.387   .0000

3 phoneme window

Sparsity, degree, labial

Multiple R .84310
R Square .71082
Adjusted R Square .66744
Standard Error 1.05487

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 54.70346 18.23449
Residual 20 22.25487 1.11274

F = 16.38696 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE            -.997167       .355426   -.352647   -2.806   .0109
LABIAL            -.675026       .306204   -.280972   -2.204   .0394
XSPARSE           6.977379      1.572673    .554574    4.437   .0003
(Constant)        2.074033       .385387               5.382   .0000

4 phoneme window

Sparsity, degree, labial

Multiple R .83256
R Square .69315
Adjusted R Square .64712
Standard Error 1.08662

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.34362 17.78121
Residual 20 23.61471 1.18074

F = 15.05943 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE            -.996473       .366338   -.352402   -2.720   .0132
LABIAL            -.709192       .314222   -.295193   -2.257   .0354
XSPARSE           7.190177      1.723784    .535050    4.171   .0005
(Constant)        2.097038       .397157               5.280   .0000

5 phoneme window

Sparsity, degree, labial

Multiple R .81890
R Square .67060
Adjusted R Square .62119
Standard Error 1.12584

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 51.60809 17.20270
Residual 20 25.35024 1.26751

F = 13.57202 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE            -.972869       .380740   -.344054   -2.555   .0189
LABIAL            -.740652       .324590   -.308287   -2.282   .0336
XSPARSE           6.924854      1.797713    .512212    3.852   .0010
(Constant)        2.124061       .412157               5.154   .0000

CHILDES corpus (bounded) - Backward 1 phoneme window

Ln (frequency), degree, labial

Multiple R .83373
R Square .69511
Adjusted R Square .64938
Standard Error 1.08313

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.49472 17.83157
Residual 20 23.46361 1.17318

F = 15.19934 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.177426       .362078   -.416395   -3.252   .0040
LABIAL            -.760997       .311096   -.316756   -2.446   .0238
XLNFREQ           -.781561       .186089   -.525173   -4.200   .0004
(Constant)       10.778276      1.919689               5.615   .0000

2 phoneme window

Ln(euclidean), degree, labial

Multiple R .83816
R Square .70251
Adjusted R Square .65788
Standard Error 1.06992

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 54.06362 18.02121
Residual 20 22.89471 1.14474

F = 15.74268 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.177240       .357661   -.416330   -3.291   .0036
LABIAL            -.757500       .307330   -.315300   -2.465   .0229
XLOGEUCL          -.928640       .215470   -.532396   -4.310   .0003
(Constant)       11.275883      1.985161               5.680   .0000

3 phoneme window

Sparsity, degree, ln (euclidean)

Multiple R .82180
R Square .67536
Adjusted R Square .62667
Standard Error 1.11767

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 51.97471 17.32490
Residual 20 24.98363 1.24918

F = 13.86901 Signif F = .0000

------------------ Variables in the Equation ------------------


Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.265161       .367213   -.447423   -3.445   .0026
XLOGEUCL          -.689685       .273792   -.411109   -2.519   .0204
XSPARSE           4.440682      2.526856    .291134    1.757   .0941
(Constant)        9.333985      2.797266               3.337   .0033

4 phoneme window
Ln(euclidean), degree, labial

Multiple R .83961
R Square .70494
Adjusted R Square .66068
Standard Error 1.06554

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 54.25096 18.08365
Residual 20 22.70737 1.13537

F = 15.92756 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.167378       .356229   -.412842   -3.277   .0038
LABIAL            -.725704       .307217   -.302065   -2.362   .0284
XLOGEUCL          -.864879       .198978   -.537142   -4.347   .0003
(Constant)       11.306093      1.975479               5.723   .0000

5 phoneme window
Ln(euclidean), degree, labial

Multiple R .83728
R Square .70103
Adjusted R Square .65619
Standard Error 1.07257

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.95017 17.98339
Residual 20 23.00816 1.15041

F = 15.63218 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.173081       .358559   -.414859   -3.272   .0038
LABIAL            -.728523       .309197   -.303239   -2.356   .0288
XLOGEUCL          -.833177       .194317   -.533148   -4.288   .0004
(Constant)       11.205960      1.979186               5.662   .0000

Lund Corpus with boundaries – 1 phoneme window

sparse, degree, amu

Multiple R .83709
R Square .70072
Adjusted R Square .65583
Standard Error 1.07313

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.92619 17.97540
Residual 20 23.03214 1.15161

F = 15.60897 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.282067       .348190   -.453402   -3.682   .0015
XAMU               .445552       .196089    .279290    2.272   .0343
XSPARSE          10.481737      2.105680    .614427    4.978   .0001
(Constant)        2.940157       .563616               5.217   .0000

2 phoneme window

sparse, degree, amu

Multiple R .83607
R Square .69901
Adjusted R Square .65386
Standard Error 1.07619

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.79478 17.93159
Residual 20 23.16356 1.15818

F = 15.48259 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.286288       .349097   -.454895   -3.685   .0015
XAMU               .438747       .195820    .275268    2.241   .0366
XSPARSE          10.201737      2.104530    .598013    4.848   .0001
(Constant)        3.364417       .712280               4.723   .0001

3 phoneme window

sparse, degree, amu

Multiple R .83546
R Square .69799
Adjusted R Square .65269
Standard Error 1.07801

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.71607 17.90536
Residual 20 23.24227 1.16211

F = 15.40758 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.288721       .349644   -.455755   -3.686   .0015
XAMU               .434582       .195619    .273294    2.222   .0380
XSPARSE          10.041892      2.107237    .588644    4.765   .0001
(Constant)        3.605646       .808047               4.462   .0002

4 phoneme window

degree, ln(frequency), labial

Multiple R .79885
R Square .63816
Adjusted R Square .58388
Standard Error 1.17997

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 49.11169 16.37056
Residual 20 27.84665 1.39233

F = 11.75765 Signif F = .0001

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.257946       .394842   -.444871   -3.186   .0046
LABIAL            -.813157       .337702   -.338467   -2.408   .0258
XLNFREQ           -.813361       .237636   -.463978   -3.423   .0027
(Constant)       11.311697      2.501064               4.523   .0002

5 phoneme window

degree, ln(frequency), sparse

Multiple R .80168
R Square .64268
Adjusted R Square .58909
Standard Error 1.17257

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 49.45992 16.48664
Residual 20 27.49842 1.374920

F = 11.99097 Signif F = .0001

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.384870       .381515   -.489758   -3.630   .0017
XLNFREQ           -.806747       .236409   -.460205   -3.413   .0028
XSPARSE          25.187131     10.177339    .336700    2.475   .0224
(Constant)       11.189587      2.494152               4.486   .0002

Lund corpus with no boundaries – 1 phoneme window

Degree, richness, logeuclid

Multiple R .80219
R Square .64352
Adjusted R Square .59004
Standard Error 1.17120

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 49.52394 16.50798
Residual 20 27.43439 1.37172

F = 12.03451 Signif F = .0001

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.755277       .385369   -.620752   -4.555   .0002
XLOGEUCL          1.938990       .674246    .785542    2.876   .0093
XRICHNES       2807.548170    663.772450   1.160393    4.230   .0004
(Constant)      -16.900128      6.824113              -2.477   .0223

2 phoneme window

degree, richness, ln(euclidean)

Multiple R .81668
R Square .66696
Adjusted R Square .61700
Standard Error 1.13204

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 51.32803 17.10934
Residual 20 25.63031 1.28152

F = 13.35087 Signif F = .0001

------------------ Variables in the Equation ------------------


Variable                 B          SE B       Beta        T   Sig T
DEGREE           -1.723172       .372577   -.609398   -4.625   .0002
XLOGEUCL          1.997979       .632035    .776370    3.161   .0049
XRICHNES       5411.296872   1170.730854   1.137882    4.622   .0002
(Constant)      -17.627767      6.434650              -2.740   .0126

3 phoneme window
degree, richness, euclidean

Multiple R .79000
R Square .62411
Adjusted R Square .56772
Standard Error 1.20267

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 48.03026 16.01009
Residual 20 28.92807 1.44640

F = 11.06889 Signif F = .0002

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.768870       .395609   -.625559   -4.471   .0002
XEUCLID        6.44989E-05    2.5139E-05    .434386    2.566   .0184
XRICHNES       5241.679865   1223.534225    .735392    4.284   .0004
(Constant)        1.055696       .744348               1.418   .1715

4 phoneme window
degree, richness, euclidean

Multiple R .78970
R Square .62362
Adjusted R Square .56716
Standard Error 1.20344

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 47.99280 15.99760
Residual 20 28.96554 1.44828

F = 11.04595 Signif F = .0002

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.757589       .395727   -.621570   -4.441   .0003
XEUCLID        5.05871E-05    1.9854E-05    .416557    2.548   .0192
XRICHNES       6730.647975   1576.899226    .707421    4.268   .0004
(Constant)        1.234063       .688210               1.793   .0881

5 phoneme window
degree, richness, ln(euclidean)
Multiple R .78865
R Square .62197
Adjusted R Square .56526
Standard Error 1.20608

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 47.86555 15.95518
Residual 20 29.09278 1.45464

F = 10.96848 Signif F = .0002

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.721077       .396933   -.608657   -4.336   .0003
XLOGEUCL          1.534088       .607522    .619285    2.525   .0201
XRICHNES      11784.366527   2923.203630    .990891    4.031   .0007
(Constant)      -13.242646      6.315447              -2.097   .0489

JH, CH removed for bounded CHILDES corpus

1 phoneme window

degree, sparsity, sonorant

Multiple R .88220
R Square .77827
Adjusted R Square .74132
Standard Error .94057

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 55.89408 18.63136
Residual 18 15.92410 .88467

F = 21.06018 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -2.130642       .394164   -.744665   -5.405   .0000
SONORANT          -.995255       .429180   -.314976   -2.319   .0324
XSPARSE           5.263173      1.181392    .506577    4.455   .0003
(Constant)        2.038776       .363132               5.614   .0000
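
Within each "Variables in the Equation" table, the T column is the coefficient B divided by its standard error SE B. The following sketch, which is not part of the original output, checks this for the 1 phoneme window model above. The dictionary name `coefficients` and its layout are illustrative only; the numbers are copied from the table.

```python
# (B, SE B, reported T) for each term, copied from the table above.
coefficients = {
    "DEGREE": (-2.130642, 0.394164, -5.405),
    "SONORANT": (-0.995255, 0.429180, -2.319),
    "XSPARSE": (5.263173, 1.181392, 4.455),
    "(Constant)": (2.038776, 0.363132, 5.614),
}

# The t statistic for a coefficient is the estimate over its standard error.
t_values = {name: b / se for name, (b, se, _) in coefficients.items()}

for name, (_, _, reported) in coefficients.items():
    print(f"{name:<12} computed T = {t_values[name]:7.3f}   reported T = {reported:7.3f}")
```

Each computed ratio should agree with the reported T to within rounding of the printed values.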

2 phoneme window

degree, sparsity

Multiple R .81667
R Square .66695
Adjusted R Square .63189
Standard Error 1.12201

Analysis of Variance
DF Sum of Squares Mean Square
Regression 2 47.89903 23.94951
Residual 19 23.91915 1.25890

F = 19.02412 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.605008       .385240   -.560954   -4.166   .0005
XSPARSE           8.595066      2.313492    .500222    3.715   .0015
(Constant)        2.629724       .356301               7.381   .0000

3 phoneme window

sparsity, degree, labial

Multiple R .88455
R Square .78242
Adjusted R Square .74616
Standard Error .93172

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 56.19220 18.73073
Residual 18 15.62599 .86811

F = 21.57644 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.206329       .341218   -.421615   -3.535   .0024
LABIAL            -.612875       .273671   -.261661   -2.239   .0380
XSPARSE          13.383774      2.858992    .537949    4.681   .0002
(Constant)        1.997141       .363404               5.496   .0000

4 phoneme window

sparsity, degree, labial

Multiple R .86330
R Square .74529
Adjusted R Square .70284
Standard Error 1.00810

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 53.52535 17.84178
Residual 18 18.29283 1.01627

F = 17.55617 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.227939       .369460   -.429168   -3.324   .0038
LABIAL            -.657910       .295283   -.280889   -2.228   .0389
XSPARSE          12.733712      3.173970    .496694    4.012   .0008
(Constant)        2.083664       .390656               5.334   .0000

5 phoneme window

degree, sparsity, labial

Multiple R .85436
R Square .72993
Adjusted R Square .68492
Standard Error 1.03805

Analysis of Variance
DF Sum of Squares Mean Square
Regression 3 52.42218 17.47406
Residual 18 19.39601 1.07756

F = 16.21638 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable                 B          SE B       Beta        T   Sig T

DEGREE           -1.239430       .380450   -.433184   -3.258   .0044
LABIAL            -.668254       .303911   -.285305   -2.199   .0412
XSPARSE          12.910339      3.431336    .479201    3.762   .0014
(Constant)        2.095758       .404707               5.178   .0001
