
Psychomusicology: Music, Mind, and Brain
2016, Vol. 26, No. 1, 15–25
© 2015 American Psychological Association
0275-3987/16/$12.00 http://dx.doi.org/10.1037/pmu0000105

Perceiving Categorical Emotion in Sound: The Role of Timbre


Casady Bowman and Takashi Yamauchi
Texas A&M University

This study investigated the role of timbre for the perception of emotion in instrumental sounds. In 2
experiments, 180 stimuli were created by mixing sounds of 10 instruments (flute, clarinet, trumpet, tuba,
piano, French horn, violin, guitar, saxophone, and bell). In Experiment 1a, participants received stimuli
1 at a time and rated the degree to which each stimulus sounded like each of the 10 instruments (i.e.,
timbre judgment). In Experiment 1b, participants received the same sound stimuli and rated whether
these stimuli sounded happy, sad, angry, fearful, and disgusting (i.e., emotion judgment). The authors
extracted acoustic features from these instrumental sounds and examined the extent to which these
features could predict both emotion and timbre ratings made for the same sound stimuli. Our regression
analysis showed that regularity, envelope centroid, sub band 2, and sub band 9 explained timbre and
emotion ratings. The relationship between acoustic features and emotion judgments of basic emotions,
however, was not uniform. Sub band 7, related to perceived activity in a sound, predicted anger, fear and
disgust, but not sadness. Because sub band 7 could predict all emotions except for sadness, this indicates
that some timbre-related features play a substantial role in the perception of emotion and that timbre
could be a more useful indicator for specific emotions rather than emotion in general.

Keywords: timbre, emotion, acoustic features, sound perception

Emotion in sounds is perceived through a number of attributes such as pitch, loudness, duration, and timbre (Caclin et al., 2006; Hailstone et al., 2009). Timbre is the acoustic property of sound that is essential for the identification of auditory stimuli with identical pitches (Bregman, Liao, & Levitan, 1990; Hailstone et al., 2009; McAdams & Cunible, 1992; McAdams, Winsberg, Donnadieu, De Soete, & Krimphoff, 1995). For example, to identify two musical instruments playing the same note for the same duration, one uses timbre (Grey & Moorer, 1977; Risset & Wessel, 1982). This study examines the role of timbre in the perception of emotion.

There is considerable research about how acoustic and structural features contribute to emotional expression in music, but few studies have explored the connection between timbre and emotion. As a notable exception, Eerola, Ferrer, and Alluri (2012) showed that the acoustic features envelope centroid, ratio of high-frequency to low-frequency energy, and skewness could predict dimensions of emotion (i.e., valence and activation), but how does this extend to particular categories of emotion (e.g., happy, sad, anger, fear, or disgust; Ekman, 1992)? By looking at particular categories of emotion, we can describe the relationship between timbre and emotion more thoroughly. In this study we investigate the role of timbre in emotion perception by examining the following questions: do particular acoustic features of sound that predict timbre also predict different categories of emotion in instrumental sounds? If so, how are they related?

Timbre and Emotion

Timbre is the attribute of sound that a listener uses to judge that two sounds similar in loudness and pitch are nonetheless dissimilar (American National Standards Institute, 1994). Helmholtz defined different timbres as resulting from different amplitudes (of harmonic components) of a complex tone in a steady state (Helmholtz, 1885). These definitions, however, do not adequately describe acoustic features that predict both timbre and emotion in sound. A wide range of features, from loudness and roughness (e.g., Leman, Vermeulen, De Voogdt, Moelants, & Lesaffre, 2005) to mode and harmony (e.g., Gabrielsson & Lindström, 2010), can account for perceived emotions, but can these features also explain timbre (Patel, 2009)?

Psychoacoustical experiments show that timbre is multidimensional (Caclin et al., 2005); it arises from a distribution of acoustic features rather than one single physical dimension (Padova, Bianchini, Lupone, & Balardinelli, 2003). Acoustic features such as amplitude, phase, attack time, decay, spectrum fine structure, spectral fluctuation, the presence of low-amplitude to high-frequency energy, and spectral centroid work simultaneously to influence the perception of timbre and are central for instrument recognition (Caclin et al., 2005; Caclin, Giard, & McAdams, 2009; Chartrand & Belin, 2006; Grey & Moorer, 1977; Hailstone et al., 2009; Hajda, Kendall, Carterette, & Harshberger, 1997). The present research adopts these acoustic features and investigates the extent to which they predict the perception of five categorical emotions: happiness, sadness, anger, fear, and disgust.

This article was published Online First December 21, 2015.
Casady Bowman and Takashi Yamauchi, Department of Psychology, Texas A&M University.
Correspondence concerning this article should be addressed to Casady Bowman, Department of Psychology, Texas A&M University, College Station, TX 77854. E-mail: casadyb@tamu.edu

Related Work: Acoustic Features of Sound

Acoustic features of emotional sounds have been investigated since the 1970s (see Scherer & Oshinsky, 1977). Only recently have researchers studied the relationship between emotion and timbre.


There are features that can explain emotion in sound; however, there is not yet evidence for a conclusive set of acoustic features that explain both emotion and timbre (Coutinho & Dibben, 2013; Eerola & Vuoskoski, 2013).

Eerola et al. (2012) showed that a dominant portion of valence and arousal could be predicted by a few acoustic features, such as the ratio of high-frequency to low-frequency energy, attack slope, and envelope centroid. Participants rated the perceived affect of 110 instrumental sounds that were equal in duration, pitch, and dynamics. Results showed that acoustic features related to timbre played a role in affect perception.

Scherer and Oshinsky (1977) used synthetic tone sequences of expressive speech with varied timbres and demonstrated that manipulating amplitude, pitch variation, level, contour, tempo, and envelope could explain variance in emotion ratings. Participants listened to one of three types of tone sequences created from sawtooth wave bursts, rated each sound on scales of pleasantness-unpleasantness, activity-passivity, and potency-weakness, and indicated whether each sound was an expression of anger, fear, boredom, surprise, happiness, or disgust. While this showed strong effects of manipulating acoustic features of sound on emotion perception, the study did not address whether these features were related to timbre. Likewise, Juslin (1997) showed that listeners use similar acoustic cues (e.g., tempo, attack time, and sound level) to decode emotion in synthesized and live music performances. While this research helps us understand how some acoustic features are related to specific emotions, no direct comparison of features for timbre and emotion was made. Without this information, it is difficult to indicate how much timbre contributes to perceived emotion.

In summary, there is an important link between timbre and emotion, though previous studies are limited in terms of scope and stimuli. First, only a few studies have directly investigated the role of timbre in emotion perception (e.g., Eerola et al., 2012). Second, music excerpts are typically the stimuli used to study emotion, and these could have previously formed emotional associations for listeners. In this regard, it is unclear whether the acoustic features or the association of specific sound sources with emotional experience created the perceived emotion.

With these issues in mind, the present study has several aims. We extracted acoustic features from pseudo instrumental sounds and examined the extent to which these features could explain both timbre and emotion ratings made for the same sound stimuli. Stimuli for the present experiments were created to reduce participants' familiarity with the sound stimuli, which could bias a listener's perception toward a certain emotion (e.g., associating flute sounds with happiness). By using categorical emotions for affect perception, we aim to complement the work of Eerola et al. (2012), which used affect dimensions (valence and arousal).

Overview of Experiments

In this study, novel stimuli were created for Experiments 1a and 1b by mixing frequencies from 10 instrumental sounds (flute, clarinet, trumpet, tuba, piano, French horn, violin, guitar, saxophone, and bell). To identify the acoustic properties that predict timbre and emotion judgments, 29 acoustic features that are known to relate to timbre (see Table 1 for descriptions) were initially extracted from these sound stimuli.

Experiment 1a: Timbre Judgment; Experiment 1b: Emotion Judgment

In Experiment 1a, participants listened to the sound stimuli one at a time and rated the extent to which each sound was perceptually close to the sound produced by a flute, clarinet, trumpet, tuba, piano, French horn, violin, guitar, saxophone, or bell (i.e., timbre judgment). In Experiment 1b, participants received the same sound stimuli and judged the extent to which each stimulus sounded happy, sad, angry, fearful, and disgusting (i.e., emotion judgment).

Table 1
Initial Acoustic Features (29 Total)

Attack time: Temporal duration of the onset of a sound
Attack slope: Slope of the attack time
Brightness: The amount of energy above a specified frequency, usually 1,500 Hz (Caclin et al., 2005)
Mel frequency cepstral coefficients: The power spectrum of a sound based on a linear transformation from actual frequency to the Mel scale of frequency
Roughness: Sensory dissonance (Sethares, 1999)
Zero cross: Number of times a sound crosses the x-axis (Tzanetakis & Cook, 2002)
Roll off: Amount of high frequencies in a sound signal (Tzanetakis & Cook, 2002)
Fluctuation: Change between consecutive spectral frames (McAdams et al., 1995)
Spectral centroid: Geometric center of a sound spectrum (McAdams, Winsberg, Donnadieu, De Soete, & Krimphoff, 1995)
Entropy: Measure of disorder of a sound spectrum
Spectral flatness: Ratio between the geometric and arithmetic mean of a sound spectrum (Eerola, Ferrer, & Alluri, 2012)
Regularity: Degree of regularity of peaks within a sound spectrum (McAdams, Beauchamp, & Meneguzzi, 1999; Lartillot & Toiviainen, 2008)
Spread: Standard deviation of a sound spectrum
Kurtosis: Kurtosis of a sound spectrum
Skew: Skew within a sound spectrum
Envelope centroid: Centroid of the temporal envelope (Eerola et al., 2012)
Inharmonicity: Deviation of partials from harmonic frequencies within a sound (Jensen, 1999)
Hf-lf ratio: High-frequency to low-frequency energy ratio (Juslin, 2000)
Sub bands 1–10: Spectral flux within particular frequency bands (Alluri & Toiviainen, 2010; Eerola et al., 2012)
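The features in Table 1 were extracted in MATLAB with MIRtoolbox (computed over 25-ms frames with 50% overlap and averaged across frames; see Design and Analysis Procedure). As a rough illustration of the kind of computation involved, and not the authors' MIRtoolbox pipeline, the NumPy sketch below derives three representative features (zero cross, spectral centroid, and sub-band spectral flux); the window choice, band edges, and the synthetic test tone are assumptions made for the example.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=25, overlap=0.5):
    """Split a mono signal into fixed-length frames with the given overlap."""
    n = int(sr * frame_ms / 1000)
    hop = int(n * (1 - overlap))
    n_frames = 1 + max(0, (len(x) - n) // hop)
    return np.stack([x[i * hop : i * hop + n] for i in range(n_frames)])

def zero_cross(frames):
    """Mean number of sign changes per frame (related to perceived noisiness)."""
    signs = np.sign(frames)
    return np.mean(np.sum(np.abs(np.diff(signs, axis=1)) > 0, axis=1))

def spectral_centroid(frames, sr):
    """Mean geometric centre of the magnitude spectrum (brightness-related)."""
    spec = np.abs(np.fft.rfft(frames * np.hanning(frames.shape[1]), axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return np.mean(np.sum(spec * freqs, axis=1) / (np.sum(spec, axis=1) + 1e-12))

def sub_band_flux(frames, sr, lo, hi):
    """Mean frame-to-frame change of spectral energy inside one frequency band
    (the idea behind the sub-band fluctuation features of Alluri & Toiviainen, 2010)."""
    spec = np.abs(np.fft.rfft(frames * np.hanning(frames.shape[1]), axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    band = spec[:, (freqs >= lo) & (freqs < hi)]
    return np.mean(np.sqrt(np.sum(np.diff(band, axis=0) ** 2, axis=1)))

# Example with a synthetic 440-Hz tone (sampling rate and band edges are illustrative).
sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
frames = frame_signal(tone, sr)
print(zero_cross(frames), spectral_centroid(frames, sr),
      sub_band_flux(frames, sr, 50, 100))
```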

Our goal was to identify the extent to which acoustic features that could explain behavioral responses obtained in the timbre judgment task (Experiment 1a) also explain behavioral responses obtained in the emotion judgment task (Experiment 1b).

Behavioral rating scores were averaged over participants for individual sound stimuli. Data from the timbre judgment task in Experiment 1a consisted of 10 response dimensions for each sound, whereas the emotion judgment task in Experiment 1b produced five response dimensions for each sound. To make the dimensions of the timbre and emotion responses comparable, principal component analysis (PCA) was applied to equate the dimensionality of the two types of behavioral rating data (because the rating data were not normally distributed, a logarithmic transformation was applied before PCA). For our analysis, we selected the first two principal components from the timbre judgment and emotion judgment tasks, based upon eigenvalues larger than one, and investigated the extent to which the aforementioned acoustic features accounted for the principal component scores obtained from the behavioral rating data.

Eerola et al. (2012) showed that affect ratings of short instrumental sounds were explained by spectral centroid and the ratio of high-frequency to low-frequency energy. This is in accordance with previous studies that relate the expressive content of speech and music (Juslin & Laukka, 2003; Scherer & Oshinsky, 1977). Thus, we predict that features related to the fluctuation of sound should be significant predictors for the instrument as well as the emotion ratings. Furthermore, it is also possible that features such as sub bands 1–10, where lower bands are more indicative of perceived fullness and higher bands are representative of perceived activity (Alluri & Toiviainen, 2010), are likely to characterize particular categories of emotion.

Method

Participants

In total, 219 participants (73 male, mean age 18.6 years; 146 female, mean age 18.5 years) participated in Experiment 1a, and a total of 376 participants (202 male, mean age 19.2 years; 174 female, mean age 19.2 years) participated in Experiment 1b. Participants were recruited from the university subject pool and attended introductory psychology classes. Participants received course credit for their participation. No participants who participated in Experiment 1a participated in Experiment 1b.

Materials

Stimuli consisted of 180 manually produced pseudo instrumental sounds (45 instrumental pairs × 4 emotions = 180 total sounds). We first audio-recorded sounds of 10 instruments at 440 Hz: flute, clarinet, alto saxophone, trumpet, French horn, tuba, guitar, violin, piano, and bells (six professional musicians from the 395th Army Band, a United States Army Reserve band, played each instrument at 440 Hz, and a digital musical tuner was used for verification of pitch). We then paired these audio-recorded sounds (a total of 45 pairs), and five undergraduate laboratory assistants were instructed to generate four different emotional sounds (happy, sad, angry, and fearful)¹ for each pair using an audio editing and synthesis program (SPEAR; Klingbeil, 2005). The synthesis program (SPEAR) applies fast Fourier transform analysis and decomposes each sound into amplitude and frequency components. Laboratory assistants created combination sounds from each pair of instrumental sounds by manually picking frequencies from one sound (e.g., clarinet) and from the other sound (e.g., French horn) and mixing these frequencies to create a new sound (Figures 1a and 1b). When creating combinations, laboratory assistants were instructed to make sure that the combination sound still sounded like a mix between the two instruments in the given pair (e.g., the combination sound still sounded like a mix between the clarinet and the French horn, as in Figure 1a).

Laboratory assistants then modified the combined sound by manually shifting or deleting individual frequencies so that the sounds would convey happiness, anger, sadness, or fear, based on their own subjective judgments. Figure 2 illustrates how frequencies were shifted to create the emotional sounds. The same sound stimuli were used for Experiments 1a and 1b.

Before mixing, the sounds' amplitudes were normalized using the program Audacity (Version 1.3.4-beta) by first applying the DC offset function, whereby the mean amplitude of the sound sample is set to 0 to reduce distortions or extra sounds not related to the stimuli. The sound stimuli were then normalized by setting the peak amplitude to -1.0 dB (decibel).

Procedure

In Experiment 1a, participants were presented with the 180 sounds using customized Microsoft Visual Basic software through JVC Flats stereo headphones. No participants reported having difficulty hearing the sounds. Stimuli were presented in a random order for each participant. After listening to each stimulus, participants rated the sound on 10 different rating scales for the 10 instrument types. For example, after listening to a stimulus sound, the participant rated how much the stimulus sounded like the flute, then the clarinet, and so forth, separately. For each rating, a scale ranging from 1 to 7 was used (1 = strongly disagree and 7 = strongly agree with respect to the degree to which the stimulus sounded like one of the 10 given instruments).

The procedure of Experiment 1b was identical to that of Experiment 1a except that participants rated each sound on five categorical emotions: happiness, anger, sadness, fear, and disgust (Ekman, 1992). For example, after listening to a stimulus sound, the participant rated how much the stimulus was perceived to sound happy, sad, angry, fearful, and disgusting via rating scales presented on the computer screen. Participants rated each sound on all five emotions, with each emotional scale ranging from 1 to 7 (1 = strongly disagree and 7 = strongly agree). This rating method was adopted from the emotion rating procedure used by Stevenson and James (2008).

¹ The emotion disgust was not included because we found that creating sounds conveying disgust by rearranging partials or frequencies was difficult. While this emotion has not been shown to be prominent in the music and emotion literature, we included it in the analyses and rating options so that participants could make use of the five basic emotions (happy, sad, anger, fear, and disgust) and to give participants another dimension on which to rate the sound stimuli.

Figure 1. (a) Illustrates Step 1 of stimulus creation, where laboratory assistants arbitrarily selected frequencies from each instrumental sound using the program SPEAR (Klingbeil, 2005). (b) Shows Step 2, where laboratory assistants created a new sound by mixing frequencies taken from each instrumental sound within a pair. Laboratory assistants were instructed to maintain the sound identity of each instrumental sound in a pair so that the new sound sounded like a combination of the two instrumental sounds.
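Figure 1 summarizes the SPEAR-based workflow: each recorded tone is decomposed into sinusoidal partials, and a subset of partials from each instrument in a pair is re-synthesized as one combined sound. The sketch below imitates that idea with plain additive synthesis in NumPy; the partial lists are made-up placeholders, since the actual partials were hand-picked in SPEAR by the laboratory assistants.

```python
import numpy as np

def synthesize(partials, sr=44100, dur=1.0):
    """Additively re-synthesize a list of (frequency_hz, amplitude) partials."""
    t = np.arange(int(sr * dur)) / sr
    out = np.zeros_like(t)
    for freq, amp in partials:
        out += amp * np.sin(2 * np.pi * freq * t)
    return out / max(1.0, np.max(np.abs(out)))   # keep the result within [-1, 1]

# Placeholder partial sets standing in for the hand-picked SPEAR partials of
# two instruments in a pair (frequencies and amplitudes are illustrative only).
clarinet_partials    = [(440, 1.0), (1320, 0.5), (2200, 0.25)]
french_horn_partials = [(440, 0.9), (880, 0.6), (1760, 0.2)]

# "Combination sound": merge a subset of partials drawn from each instrument.
combined = synthesize(clarinet_partials[:2] + french_horn_partials[1:])
```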

Design and Analysis Procedure

Principal component analyses. Participants in Experiment 1a made timbre ratings associated with 10 different instruments (10 dimensions), and participants in Experiment 1b made emotion ratings associated with five categories of emotion (five dimensions). Because of the different response dimensions in the two sets of rating data, it was necessary to make these response dimensions analogous. We used PCA for this purpose. Before PCA, judgment scores given to each stimulus were averaged across participants, and a logarithmic transformation was applied to the rating data to reduce skewness. We selected the first two principal components because their corresponding eigenvalues were larger than one. The two components accounted for a majority of the variance in the judgment data of Experiments 1a and 1b (approximately 75% and 90%, respectively). Table 2 lists the amount of variance explained and the eigenvalues for each principal component of the timbre judgment and emotion judgment tasks.

The dependent variables in Experiment 1a were PCA scores obtained from the timbre judgment ratings, and the dependent variables in Experiment 1b were PCA scores obtained from the emotion judgment ratings. Independent variables were the predictors, or acoustic features (envelope centroid, sub bands 1–10, etc.), extracted from the 180 sound stimuli. Robust regression (Hampel, Ronchetti, Rousseeuw, & Stahel, 1986) was used to determine statistically significant predictor variables.

Acoustic features: Feature selection. In total, 29 features were chosen (see Table 1 for a complete list and descriptions). All features were computed over 25-ms frames with 50% overlap and summarized as the mean of each acoustic feature across all frames. The feature extraction analysis was carried out in the MATLAB environment using the MIRtoolbox 1.3.1 (Lartillot & Toiviainen, 2008).

To reduce the acoustic features extracted from our sound stimuli, we followed the feature selection procedure specified by Al-Kandari and Jolliffe (2001) and Abdi and Williams (2010). Specifically, we first applied PCA to the 29 acoustic features to determine the number of PCs to retain (see Al-Kandari & Jolliffe, 2001). Based on eigenvalues larger than one, we retained eight PCs. Next, we applied varimax rotation to the eight PCs. Varimax rotation is a type of orthogonal transformation that maximizes the variance of the correlation coefficients (acoustic features with PCs) so that each PC will be strongly correlated with some of the acoustic features, allowing for an easier interpretation of the representative features in each PC (Tufféry, 2011).

Figure 2. This figure illustrates how instrumental combination sounds were made to sound emotional.
Laboratory assistants manually made each instrument pair sound happy, sad, angry, and fearful by selecting and
shifting or deleting varying frequencies in the combination sound. Stimuli were made to sound emotional based
on subjective judgments of the laboratory assistants creating the sounds.

Following the varimax rotation, we selected the feature that best represented each PC by its highest loading. For example, to find the acoustic feature that represented PC1, we first obtained principal component scores for PC1 over the 180 stimuli, measured the loadings between the PC scores and the individual acoustic features, and selected the feature that had the highest loading (sub band 9 for PC1; see Table 3). This procedure was applied to the remaining principal components and acoustic features.

We used this feature selection procedure because relying solely on principal components as predictors would make the interpretation of the data more difficult, as individual principal components correspond to a linear combination of the acoustic features (Eerola et al., 2012). To avoid this complication, we selected the acoustic features that had the highest loading onto each principal component to represent that particular dimension (Al-Kandari & Jolliffe, 2001; Eerola et al., 2012). This procedure resulted in the selection of eight acoustic features: envelope centroid and kurtosis, which refer to the shape of a sound; sub bands 2, 3, 7, and 9, which relate to the fluctuation within a particular band of the sound spectrum, where lower bands (approximately 50–500 Hz) are representative of perceived fullness and higher bands (approximately 1,600–6,400 Hz) are representative of perceived activity (Alluri & Toiviainen, 2010); regularity, which is the degree of variation between peaks of a sound spectrum; and zero cross, which is related to the perceived noisiness of a sound.

Table 2
Variance Explained for Timbre and Emotion for Each Principal Component

                      Timbre                Emotion
Principal component   Inst PC1   Inst PC2   Emo PC1   Emo PC2
Cumulative variance   .41        .75        .64       .90
Variance explained    .41        .34        .64       .26
Eigenvalues           4.08       3.41       3.19      1.31

Table 3
PC Selection Before Varimax Rotation of Acoustic Features

Principal components   afPC1   afPC2   afPC3   afPC4   afPC5   afPC6   afPC7   afPC8
Variance explained     .20     .15     .10     .07     .06     .06     .05     .04
Cumulative variance    .20     .35     .45     .53     .59     .64     .69     .73
Eigenvalues            5.82    4.35    2.93    2.15    1.75    1.61    1.40    1.12

Note. The "af" of afPC1–8 stands for acoustic feature.
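As described under Design and Analysis Procedure, the 29 features were reduced by PCA (retaining the eight components with eigenvalues above one), varimax-rotated, and each retained component was then represented by its highest-loading feature. The following is a minimal sketch of that pipeline with scikit-learn and NumPy, assuming a generic textbook varimax routine and placeholder data; it is not the authors' MATLAB code, and the selection-by-largest-absolute-loading step is one straightforward reading of the procedure.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Generic varimax rotation of a loading matrix (features x components)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var_sum = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - (gamma / p) * rotated
                          @ np.diag(np.diag(rotated.T @ rotated))))
        rotation = u @ vt
        old, var_sum = var_sum, s.sum()
        if old != 0 and var_sum / old < 1 + tol:
            break
    return loadings @ rotation

# X: 180 stimuli x 29 acoustic features (placeholder random data here).
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(180, 29)))

pca = PCA().fit(X)
keep = pca.explained_variance_ > 1.0                     # eigenvalue > 1 criterion
loadings = (pca.components_.T * np.sqrt(pca.explained_variance_))[:, keep]
rotated = varimax(loadings)

# One representative feature per retained component: the largest |loading|.
selected = np.argmax(np.abs(rotated), axis=0)
print("retained components:", keep.sum(), "selected feature indices:", selected)
```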

The selected acoustic features are listed in Table 4, where the selected feature is bolded for each principal component.

Results

This section begins with an overview of the behavioral data from Experiments 1a (timbre) and 1b (emotion), followed by results indicating the extent of overlap between the acoustic features in the timbre judgment task (Experiment 1a) and the emotion judgment task (Experiment 1b). A robust regression analysis (Hampel et al., 1986) was used to examine the features that can predict timbre, emotion, and particular categories of emotion (e.g., happy, sad, etc.) because of its resiliency against outliers and distributional problems (Eerola et al., 2012; Street, Carroll, & Ruppert, 1988).

Experiment 1a: Behavioral Data Analysis of Timbre Judgments

A logarithmic transformation was performed on both the timbre and emotion rating data before the analyses. Figure 3 shows a summary of the timbre judgments. Each box indicates the transformed rating scores given to one of the 10 instruments (e.g., flute).

Experiment 1b: Behavioral Data Analysis of Emotion Judgments

Figure 4 shows a summary of the emotion judgment data for Experiment 1b, where each box indicates how happy, sad, angry, fearful, or disgusting the 180 stimuli sounded. From the whiskers of the box plot for the emotion data it is evident that there is variation within the rated emotions.

Note that the purpose of gathering emotion rating data for this study was not to create sounds that were necessarily happy or sad, for example, but to create a variety of sounds. Figure 5 shows the emotion ratings made by participants for each emotion category. Overall, the emotion ratings made by participants were congruent with the intended emotion of the stimuli (e.g., the intended happy sound was generally rated as happy). In general, emotion ratings of fear received higher scores than the other ratings.

Regression Analysis: Identifying Features for Timbre and Emotion Judgments

Table 5 summarizes the results from our regression analysis. A significant portion of the timbre judgments was predicted by the acoustic features chosen for the study. For principal component 1, the adjusted R² value indicates that timbre ratings were well predicted, at 44%, by zero cross, kurtosis, sub band 7, sub band 9, regularity, envelope centroid, and sub band 3 (see Table 5). Zero cross relates to the perceptual noisiness of a sound, kurtosis gives a measure of the variability of a sound, and sub bands 7 and 9 relate to higher frequencies in a sound that represent perceptual activity. These features had positive beta values, indicating that higher values of these features were positively associated with principal component 1 scores (i.e., Inst PC1 in Table 5). The features regularity (variation between peaks in a sound), envelope centroid (which indicates the sharpness and intensity contour of a sound), and sub band 3 (which relates to perceived fullness in a sound for lower frequency bands, between 50 and 500 Hz) had negative beta values, indicating that values of these features were negatively associated with principal component scores.

Table 4
Acoustic Feature Selection: Correlation of Acoustic Features with Principal Component Scores

Acoustic features     afPC1   afPC2   afPC3   afPC4   afPC5   afPC6   afPC7   afPC8
Variance explained    .13     .12     .09     .08     .08     .08     .07     .07
Cumulative variance   .13     .25     .34     .42     .50     .58     .65     .72
Zero cross            .07     .21     .88     .05     .04     .01     .17     .12
Regularity            .26     .24     .06     .08     .09     .06     .71     .21
Kurtosis              .14     .14     .12     .88     .19     .03     .1      .09
Envelope centroid     .06     .91     .02     .10     .07     .15     .11     .17
Sub band 3            .00     .00     .03     .11     .87     .04     .04     .03
Sub band 2            .05     .22     .03     .02     .1      .80     .05     .01
Sub band 7            .39     .05     .15     .01     .04     .09     .11     .74
Sub band 9            .85     .04     .07     .04     .00     .01     .06     .03

Note. Numbers in bold indicate the highest loadings between PC scores for the acoustic features and values of the acoustic features. Sub bands represent spectral fluctuation for different frequency bands. The "af" of afPC1–8 stands for acoustic feature.

The sounds encompassed by Inst PC1 had a high rate of fluctuation and were not perceived as full sounds (e.g., a string bass compared with a muted trumpet). Inst PC2 was predicted by regularity, envelope centroid, and sub band 2, where the predictors accounted for 35% of the data. The sounds encompassed by Inst PC2 were generally moderate in fluctuation.

For the emotion ratings, principal component 1 (i.e., Emo PC1 in Table 5) was predicted at an adjusted R² of 28% by four features: zero cross, regularity, sub band 3, and sub band 7. Emo PC2 was predicted at an adjusted R² of 42% by a majority of the eight significant features: zero cross, regularity, kurtosis, envelope centroid, sub bands 3 and 2, and sub band 9. Similar significant acoustic features are shared between emotion and timbre (zero cross, regularity, sub band 3, and sub band 7 appear in both dimensions), suggesting that these features may play a role in differentiating emotions. Overall, the timbre and emotion sounds are best characterized by regularity, envelope centroid, and sub band 3.

The features that overlap between the timbre and emotion ratings at p < .001 are listed in Table 6. These features, regularity, envelope centroid, sub band 3, and sub band 9, indicate that both timbre and emotion are perceived through the shape of, and fluctuation within, a sound signal. However, some features are better at describing timbre and others are better at describing emotion. This idea is explored further in the following section, providing an extension to the work of Eerola et al. (2012).

Figure 3. Box plot of observations for timbre ratings in Experiment 1a after a log transformation. Each box indicates a timbre rating made by participants for all of the 180 stimuli. The median is indicated by the red line in the center of each box, and the edges indicate the 25th and 75th percentiles. The whiskers of each plot indicate the extreme data points, and outliers are plotted outside of the whiskers.

Figure 4. Box plot of observations for individual emotion ratings after a log transformation. This figure illustrates emotion ratings for Experiment 1b. Each box indicates one emotion rating for all 180 stimuli made by participants. The median is indicated by the red line, and the edges show the 25th and 75th percentiles. Whiskers of each plot indicate the extreme data points, and outliers are plotted outside of the whiskers.

Figure 5. This figure shows participants' normalized emotion rating scores of sounds (y-axis) given in each emotion category (x-axis). Observed emotions (x-axis) and emotion ratings (y-axis) are shown separately for each intended emotion made by laboratory assistants (happy, sad, anger, and fear). Overall, participants' ratings were congruent with the stimulus's intended emotion over other emotions in three of the four categories (happy, sad, and fear). For example, the sounds that were produced to sound happy were rated high in happy compared with other emotions, and the sounds that were produced to sound sad were rated high in sad. In general, emotion ratings of fear received higher scores than other ratings.
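Figures 3-5 summarize the averaged, log-transformed ratings as box plots. A small matplotlib sketch of that summary step is given below; the ratings array is a random stand-in for the real data, which in the study were first averaged over participants for each stimulus.

```python
import numpy as np
import matplotlib.pyplot as plt

emotions = ["happy", "sad", "anger", "fear", "disgust"]

# Placeholder: mean 1-7 ratings for 180 stimuli on each of the five emotion
# scales (in the study these are means over participants per stimulus).
rng = np.random.default_rng(2)
mean_ratings = rng.uniform(1, 7, size=(180, 5))

# Log-transform to reduce skewness, then draw one box per rating scale,
# mirroring the layout of Figures 3 and 4.
log_ratings = np.log(mean_ratings)
plt.boxplot([log_ratings[:, i] for i in range(len(emotions))])
plt.xticks(range(1, len(emotions) + 1), emotions)
plt.ylabel("log-transformed mean rating")
plt.show()
```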

Table 5
Standardized Beta Coefficients From the Robust Regression Analysis in Experiments 1a (Timbre) and 1b (Emotion)

                      Timbre                Emotion
Acoustic features     Inst PC1   Inst PC2   Emo PC1   Emo PC2
Adjusted R²           .44        .35        .28       .42
Zero cross            .17        –          .29       .37
Regularity            .28        .28        .32       .25
Kurtosis              .30        –          –         .27
Envelope centroid     .23        .47        –         .26
Sub band 3            .65        –          .19       .45
Sub band 2            –          .27        –         .13
Sub band 7            .14        –          .29       –
Sub band 9            .29        –          –         .38

Note. Sub bands represent spectral fluctuation for different frequency bands. Empty cells indicate features that were not significant predictors for that component.
* p < .05. ** p < .01. *** p < .001.
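The coefficients in Tables 5 and 7 come from robust regressions of the rating scores (PCA component scores, or per-emotion means) on the eight selected features. One way to reproduce this kind of fit in Python is with statsmodels' RLM, sketched below; the choice of a Hampel M-estimator and the placeholder data are assumptions, since the paper cites Hampel et al. (1986) for the method without specifying an implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

feature_names = ["zero_cross", "regularity", "kurtosis", "envelope_centroid",
                 "sub_band_3", "sub_band_2", "sub_band_7", "sub_band_9"]

# Placeholder data: 180 stimuli x 8 standardized features, plus one set of
# averaged, log-transformed ratings (e.g., the "sad" scale) per stimulus.
rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(180, 8)), columns=feature_names)
y = 0.5 * X["regularity"] + 0.4 * X["sub_band_3"] + rng.normal(scale=0.5, size=180)

# Robust linear fit; the Hampel norm echoes the Hampel et al. (1986) citation,
# though the exact estimator used in the paper is not stated.
model = sm.RLM(y, sm.add_constant(X), M=sm.robust.norms.Hampel())
result = model.fit()
print(result.params)    # these are standardized betas if X and y are z-scored
print(result.pvalues)   # used here to flag significant predictors per column
```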

Table 6
A Comparison of Main Features for Experiment 1a (Timbre) and Experiment 1b (Emotion)

Acoustic features     Timbre   Emotion
Zero cross            X
Regularity            X        X
Kurtosis                       X
Envelope centroid     X        X
Sub band 3            X        X
Sub band 2
Sub band 7                     X
Sub band 9            X        X

Note. Sub bands represent spectral fluctuation for different frequency bands. Features with p < .001 are shown.

In summary, four acoustic features were able to predict both timbre and emotion ratings, suggesting a close relationship between emotion and timbre perception in instrumental sounds. While this is an area to be expanded in future work involving other types of emotion (dimensional and discrete) to create a more reliable model (Eerola & Vuoskoski, 2013), there is nonetheless an indication of shared processes used for the perception of timbre and emotion.

Regression Analyses for Individual Emotions

To examine more closely the role of timbre in emotion, we investigated the extent to which the selected acoustic features predict the individual emotions happiness, sadness, anger, fear, and disgust by applying robust regression separately for each emotion, without using PCA (see Table 7). This analysis showed that the selected acoustic features could predict all individual emotions reasonably well (adjusted R² values ranging between 25% and 50%). Consistent with previous studies (Eerola, Friberg, & Bresin, 2013; Juslin, 1997; Juslin & Lindström, 2010; Scherer & Oshinsky, 1977), the contributions of the acoustic features varied for each emotion. For example, sub band 3 (a lower band of frequencies related to the fullness of a sound) could predict the emotions sadness and fear but had a negative association with happy. Similarly, sub band 3 did not predict anger and disgust. In contrast, sub band 7, which is related to perceived activity in a sound, could predict all emotions except for sadness.

About 35% of the happiness ratings were predicted by a positive standardized beta value for sub band 7 and negative values for zero cross, regularity, and sub band 3, suggesting that as a sound increases in happiness it decreases in those acoustic features. Around 50% of the emotion sadness was best predicted by positive values of regularity, envelope centroid, and sub band 3 (fullness) and by negative values of zero cross, kurtosis, and sub band 9. Similar to happiness, the emotion sadness was predicted by less fluctuation but more fullness (sub band 3), where these features indicate sustained notes with a longer decay relative to their attack (Elliott, Hamilton, & Theunissen, 2013). In addition, the negative association between sub band 9 (perceived activity) and sadness indicates that sad sounds are less active, perhaps dull. Anger was significantly predicted, at 24%, by zero cross and regularity, indicating that angry sounds fluctuate highly, and by the features envelope centroid, sub band 2, and sub band 7, indicating that an angry sound is likely to be comprised of dissonant tones. The emotion fear was predicted by zero cross, regularity, sub band 3, and sub band 7, which predicted 25% of the data. Disgust was predicted by zero cross, sub band 7, and sub band 9, which accounted for 21% of the data. These results indicate that timbre-related features play a substantial role in the perception of emotion and that timbre could be a more useful indicator for specific emotions rather than emotion in general.

Table 7
Robust Regression for Individual Emotions

Acoustic features     Happy   Sad    Anger   Fear   Disgust
Adjusted R²           .35     .50    .24     .25    .21
Zero cross            .27     .25    .25     .26    .40
Regularity            .36     .33    .33     .27    –
Kurtosis              –       .29    –       –      –
Envelope centroid     –       .17    .19     –      –
Sub band 3            .25     .46    –       .23    –
Sub band 2            –       –      .15     –      –
Sub band 7            .30     –      .24     .26    .21
Sub band 9            –       .37    –       –      .16

Note. Sub bands represent spectral fluctuation for different frequency bands. Empty cells indicate features that did not significantly predict that emotion.
* p < .05. ** p < .01. *** p < .001.

Discussion

Our results illustrate that acoustic features that characterize differences in timbre may also predict emotional judgments of musical sounds differing in timbre. In particular, specific acoustic features are associated with particular emotions (e.g., high zero cross, or fluctuation, is associated with the emotion disgust). Consistent with results from Eerola et al. (2012) and Goydke et al. (2004), emotion ratings of short instrument sounds were predicted by a small set of acoustic features. A set of eight acoustic features was used to predict listeners' ratings across timbre and emotion; these features have previously been identified as relating to the expressive content of speech as well as music (Eerola et al., 2012; Juslin & Laukka, 2003; Scherer & Oshinsky, 1977).

Overall, a dominant part of the variance for timbre and emotion was predicted with two to six acoustic features. Regularity, envelope centroid, sub band 3, which represents spectral fluctuation at a lower frequency band, and sub band 9, which represents spectral fluctuation at a higher frequency band, were significantly associated with both timbre and emotion judgments. While some features, such as regularity, could predict the timbre, emotion, and categorical emotion data, other features predicted specific emotions. For example, sub band 7 predicted the emotions anger, fear, and disgust, such that less of this feature was associated with these emotions.

General Discussion

The two experiments explored whether acoustic features related to timbre can predict particular categories of emotion. In Experiment 1a, participants took part in a timbre judgment task in which they rated the instrument identity of a sound stimulus. In Experiment 1b, participants listened to the same stimuli as in Experiment 1a and judged the extent to which these stimuli were perceived to sound like each of the five categorical emotions: happiness, sadness, anger, fear, and disgust.

Our results indicate that envelope centroid can explain the perception of timbre and emotion, consistent with Eerola et al. (2012). In addition, we found that zero cross, regularity, and, to a lesser extent, sub bands 3, 2, 7, and 9 and kurtosis can also predict the perception of timbre and particular categories of emotion.

As an extension of the work of Eerola et al. (2012), we found that sub band 3 and zero cross are better at predicting timbre than emotion. Furthermore, kurtosis can predict categorical emotions better than timbre. Overall, envelope centroid, regularity, and sub bands 3 and 9 explained timbre and emotion, indicating that these features are likely general features used to perceive emotion. Additionally, some timbre-related features cannot describe particular emotions; for example, sub bands 2 and 7 could not predict the emotion sadness. Below, we interpret these results with respect to two issues: (a) the specificity of emotion prediction by the selected acoustic features, and (b) the question of which model of emotion is used.

Acoustic Features of Timbre and Emotion

Unlike Eerola et al. (2012), our results indicate that envelope centroid does not encompass the entirety of the emotion categories used in our experiments. In the Eerola et al. (2012) study, envelope centroid was shown to predict valence and energy; however, our results show that envelope centroid was a good predictor for the emotions sad and anger. In contrast, envelope centroid was not related to other emotions such as happiness, fear, or disgust, possibly because of the way emotions are processed. The emotions sad and anger have a negative valence, whereas happy has a positive valence. Past research shows that negative and positive emotions are processed differently.

In a brain imaging study, Koelsch, Fritz, von Cramon, Müller, and Friederici (2006) investigated emotion perception of pleasant and unpleasant stimuli. Listeners heard dissonant music contrasted with pleasant music, and functional magnetic resonance imaging (fMRI) results showed that the emotions were processed in different areas of the brain. For negative emotions, the hippocampus, parahippocampal gyrus, and temporal poles were activated, whereas for positive emotions the inferior frontal gyrus, which is used for music-syntactic analyses and working memory, the anterior superior insula, and the ventral striatum were activated. According to this fMRI research, emotions are processed in different areas of the brain depending on valence (whether they are positive or negative). Our study found that certain acoustic features are associated with specific emotions; for example, kurtosis is predictive of the emotion sadness, which has a negative valence. It is plausible that these acoustic features related to timbre are processed in different areas of the brain. Combined, these findings demonstrate that using acoustic features can potentially help develop a more functional model of emotion.

Brain mechanisms for processing timbre in music are said to have evolved for the representation and evaluation of vocal sounds (Juslin & Laukka, 2003). As noted in Gabrielsson and Lindström (2010), there are many expressive properties of music that are likely to overlap with the cues in vocal expression (Juslin & Laukka, 2003). Theoretically, findings from our experiments could be interpreted in terms of the vocal expression of emotion as compared with instrumental music (Eerola et al., 2013; Juslin & Laukka, 2003; Scherer, 2003). This is an interesting point for future research, which can explore the extent of overlap for timbre and emotion in vocal sounds. Furthermore, there is potential for this to generalize to the expression of emotion through other structural cues in music, the expression of vocal emotion, or the expression of emotion in other modalities such as gesture (Hailstone et al., 2009; Sloboda & Juslin, 2001).

Model of Emotion

Previous studies, such as Eerola et al. (2012) and Västfjäll (2013), suggest that there are unique components of emotion conveyed by dimensional models of affect. For example, affective reactions induced by auditory stimuli can be explained by a combination of valence and activation (Russell, 1980; Russell & Barrett, 1999; Västfjäll, 2013). Eerola et al. (2012) show that valence is associated with the ratio of high-frequency to low-frequency energy using a dimensional model of emotion; however, this acoustic feature does not explain the relationship between timbre and emotion using a categorical model of emotion. Results from our study indicate that envelope centroid is a good predictor of timbre and emotion when listeners rate sounds using a categorical model of emotion, similar to the results of Eerola et al. (2012) showing that envelope centroid can predict emotion using a dimensional model. While some acoustic features may not work to predict emotion in a dimensional model, they work well using a categorical model of emotion; however, not much is known about the overlap between dimensional and categorical models of emotion.

Despite these similarities, it remains unclear whether emotions are separate entities, as explained by categorical models of emotion, or whether they are governed by underlying factors, as explained by dimensional models of emotion (Kragel & LaBar, 2015). Kragel and LaBar (2015) compared dimensional and categorical models to assess the underlying neural substrates of emotion perception. While being recorded with fMRI, participants listened to and watched instrumental music and cinematic film clips to induce emotion and rated their emotion on categorical emotions as well as valence and arousal. Using computational modeling, Kragel and LaBar (2015) demonstrated that combining dimensional and categorical models of emotion could better explain neural activation patterns of emotion perception. They also found that categorical and dimensional models of emotion could explain unique sources of neural information, indicating that emotion perception is diverse and not well explained by one model of emotion. In addition, results showed that activity in different neural systems could be mapped to unique emotion categories. Future research utilizing a combination of the categorical and dimensional approaches, in addition to acoustic features, could work to better describe the relationship between timbre and emotion as well as the effect of timbre on specific emotions.

With respect to the effects of timbre on emotion judgments (Balkwill & Thompson, 1999; Eerola et al., 2012; Hailstone et al., 2009), our results are consistent with previous evidence and suggest that, because there are similarities in the acoustic features used to predict timbre and emotion, there is an overlap in the mechanisms used to perceive timbre and emotion. We extend previous paradigms by showing that timbre is potentially related to specific emotions, such as sadness.

Conclusions and Future Work

Our research reveals that a number of acoustic features, in particular regularity, envelope centroid, and sub bands 3 and 9, can predict ratings of both timbre and emotion in sound stimuli. While the results of this study provide an extension to the work of Eerola et al. (2012) by showing that some acoustic features are better for predicting specific emotions, they also raise a number of questions related to the stimuli used to rate both timbre and emotion. One limitation of these studies is that, although the stimuli were carefully constructed, they lacked variety in terms of dynamics and pitch, which could contribute to the reason that some features prevalent for timbre and emotion in past work, such as the ratio of high-frequency to low-frequency energy, were not significant in the feature selection process. The lack of similar acoustic features, however, could also be due to the difference between using dimensional versus categorical emotions to rate stimuli, though participants were able to reliably rate the emotion of the individual stimuli. In addition, the method used to create affective sounds was unique compared with previous studies, such as Eerola et al. (2012), in that global acoustic cues of these sounds were not manipulated; instead, individual frequencies of the sounds were rearranged to create happy, sad, angry, or fearful stimuli. This method of sound production could limit the emotional responses of participants in that the variability may not be diverse enough to accurately produce the intended emotional response.

Despite these limitations, this work contributes an extended approach to the study of the relationship between music, speech, and emotion by introducing a new pathway in the field of cognition, music, and emotion: relating timbre and emotion in terms of acoustic features. Specifically, we suggest that features such as sub band 7 (perceived activity) are related to the emotions anger, fear, and disgust, such that these sounds are likely more active, while zero cross may be related to general emotion perception and is used to differentiate specific emotions (happiness compared with anger). Future work can further examine the role of emotion in instrumental sounds and the possibility that they are perceived in the same way as speech sounds.

References

Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 433–459. http://dx.doi.org/10.1002/wics.101
Al-Kandari, N., & Jolliffe, I. (2001). Variable selection and interpretation of covariance principal components. Communications in Statistics: Simulation and Computation, 30, 339–354. http://dx.doi.org/10.1081/SAC-100002371
Alluri, V., & Toiviainen, P. (2010). Exploring perceptual and acoustical correlates of polyphonic timbre. Music Perception, 27, 223–242. http://dx.doi.org/10.1525/mp.2010.27.3.223
American National Standards Institute. (1973). American national psychoacoustical terminology (S3.20). New York, NY: Author.
Balkwill, L.-L., & Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. Music Perception, 17, 43–64. http://dx.doi.org/10.2307/40285811
Bregman, A. S., Liao, C., & Levitan, R. (1990). Auditory grouping based on fundamental frequency and formant peak frequency. Canadian Journal of Psychology, 44, 400–413. http://dx.doi.org/10.1037/h0084255
Caclin, A., Brattico, E., Tervaniemi, M., Näätänen, R., Morlet, D., Giard, M. H., & McAdams, S. (2006). Separate neural processing of timbre dimensions in auditory sensory memory. Journal of Cognitive Neuroscience, 18, 1959–1972. http://dx.doi.org/10.1162/jocn.2006.18.12.1959
Caclin, A., McAdams, S., Smith, B. K., & Winsberg, S. (2005). Acoustic correlates of timbre space dimensions: A confirmatory study using synthetic tones. The Journal of the Acoustical Society of America, 118, 471–482. http://dx.doi.org/10.1121/1.1929229
Caclin, A., Giard, M.-H., & McAdams, S. (2009). Perception of timbre dimensions: Psychophysics and electrophysiology in humans. The Journal of the Acoustical Society of America, 126, 2236–2240. http://dx.doi.org/10.1121/1.3249185
Chartrand, J. P., & Belin, P. (2006). Superior voice timbre processing in musicians. Neuroscience Letters, 405, 164–167. http://dx.doi.org/10.1016/j.neulet.2006.06.053
Coutinho, E., & Dibben, N. (2013). Psychoacoustic cues to emotion in speech prosody and music. Cognition and Emotion, 27, 658–684. http://dx.doi.org/10.1080/02699931.2012.732559
Eerola, T., Ferrer, R., & Alluri, V. (2012). Timbre and affect dimensions: Evidence from affect and similarity ratings and acoustic correlates of isolated instrument sounds. Music Perception, 30, 49–70. http://dx.doi.org/10.1525/mp.2012.30.1.49
Eerola, T., Friberg, A., & Bresin, R. (2013). Emotional expression in music: Contribution, linearity, and additivity of primary musical cues. Frontiers in Psychology, 4, 487. http://dx.doi.org/10.3389/fpsyg.2013.00487
Eerola, T., & Vuoskoski, J. K. (2013). A review of music and emotion studies: Approaches, emotion models, and stimuli. Music Perception, 30, 307–340. http://dx.doi.org/10.1525/mp.2012.30.3.307
Ekman, P. (1992). Are there basic emotions? Psychological Review, 99, 550–553. http://dx.doi.org/10.1037/0033-295X.99.3.550
Elliott, T. M., Hamilton, L. S., & Theunissen, F. E. (2013). Acoustic structure of the five perceptual dimensions of timbre in orchestral instrument tones. The Journal of the Acoustical Society of America, 133, 389–404. http://dx.doi.org/10.1121/1.4770244
Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer's intention and the listener's experience. Psychology of Music, 24, 68–91. http://dx.doi.org/10.1177/0305735696241007
Goydke, K. N., Altenmüller, E., Möller, J., & Münte, T. F. (2004). Changes in emotional tone and instrumental timbre are reflected by the mismatch negativity. Cognitive Brain Research, 21, 351–359. http://dx.doi.org/10.1016/j.cogbrainres.2004.06.009
Grey, J. M., & Moorer, J. A. (1977). Perceptual evaluations of synthesized musical instrument tones. The Journal of the Acoustical Society of America, 62, 454–462. http://dx.doi.org/10.1121/1.381508
Hailstone, J. C., Omar, R., Henley, S. M. D., Frost, C., Kenward, M. G., & Warren, J. D. (2009). It's not what you play, it's how you play it: Timbre affects perception of emotion in music. The Quarterly Journal of Experimental Psychology, 62, 2141–2155. http://dx.doi.org/10.1080/17470210902765957
Hajda, J. M., Kendall, R., Carterette, E., & Harshberger, M. (1997). Methodological issues in timbre research. Perception and Cognition of Music, 12, 253–306.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., & Stahel, W. A. (1986). Robust statistics: The approach based on influence functions. New York, NY: Wiley-Interscience.
Helmholtz, H. (1885). On the sensations of tone as a physiological basis for the theory of music (2nd ed.). London: Longman.
Huber, P. J., & Ronchetti, E. M. (1975). Robustness of design. Robust statistics (2nd ed., pp. 239–248).
Jensen, K. (1999). Timbre models of musical sounds (Doctoral dissertation). Department of Computer Science, University of Copenhagen, Denmark.

Juslin, P. N. (1997). Emotional communication in music performance: A functionalist perspective and some data. Music Perception, 14, 383–418. http://dx.doi.org/10.2307/40285731
Juslin, P. N. (2000). Cue utilization in communication of emotion in music performance: Relating performance to perception. Journal of Experimental Psychology: Human Perception and Performance, 26, 1797–1813. http://dx.doi.org/10.1037/0096-1523.26.6.1797
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129, 770–814.
Juslin, P. N., & Lindström, E. (2010). Musical expression of emotions: Modelling listeners' judgements of composed and performed features. Music Analysis, 29, 334–364. http://dx.doi.org/10.1111/j.1468-2249.2011.00323.x
Klingbeil, M. (2005). Software for spectral analysis, editing, and synthesis. In Proceedings of the International Computer Music Conference (pp. 107–110). Barcelona, Spain.
Koelsch, S., Fritz, T., von Cramon, D. Y., Müller, K., & Friederici, A. D. (2006). Investigating emotion with music: An fMRI study. Human Brain Mapping, 27, 239–250. http://dx.doi.org/10.1002/hbm.20180
Kragel, P. A., & LaBar, K. S. (2015). Multivariate neural biomarkers of emotional states are categorically distinct. Social Cognitive and Affective Neuroscience. Advance online publication. http://dx.doi.org/10.1093/scan/nsv032
Lartillot, O., Toiviainen, P., & Eerola, T. (2008). A Matlab toolbox for music information retrieval. In C. Preisach, H. Burkhardt, L. Schmidt-Thieme, & R. Decker (Eds.), Data analysis, machine learning and applications (pp. 261–268). New York, NY: Springer. http://dx.doi.org/10.1007/978-3-540-78246-9_31
Laukka, P., Elfenbein, H. A., Söder, N., Nordström, H., Althoff, J., Chui, W., . . . Thingujam, N. S. (2013). Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations. Frontiers in Psychology, 4, 353.
Leman, M., Vermeulen, V., De Voogdt, L., Moelants, D., & Lesaffre, M. (2005). Prediction of musical affect using a combination of acoustic structural cues. Journal of New Music Research, 34, 39–67. http://dx.doi.org/10.1080/09298210500123978
McAdams, S., Beauchamp, J. W., & Meneguzzi, S. (1999). Discrimination of musical instrument sounds resynthesized with simplified spectrotemporal parameters. The Journal of the Acoustical Society of America, 105, 882–897. http://dx.doi.org/10.1121/1.426277
McAdams, S., & Cunible, J. C. (1992). Perception of timbral analogies. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences, 336, 383–389. http://dx.doi.org/10.1098/rstb.1992.0072
McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., & Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research, 58, 177–192. http://dx.doi.org/10.1007/BF00419633
Padova, A., Bianchini, L., Lupone, M., & Balardinelli, M. (2003). Influence of specific spectral variations of musical timbre on emotions in the listeners. In Proceedings of the 5th ESCOM Conference (pp. 227–230). Germany.
Patel, A. (2009). Music, language, and the brain. New York, NY: Oxford University Press.
Risset, J. C., & Wessel, D. L. (1982). Exploration of timbre by analysis and synthesis. In D. Deutsch (Ed.), The psychology of music (pp. 25–58). New York, NY: Academic Press. http://dx.doi.org/10.1016/B978-0-12-213562-0.50006-1
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39, 1161–1178. http://dx.doi.org/10.1037/h0077714
Russell, J. A., & Barrett, L. F. (1999). Core affect, prototypical emotional episodes, and other things called emotion: Dissecting the elephant. Journal of Personality and Social Psychology, 76, 805–819. http://dx.doi.org/10.1037/0022-3514.76.5.805
Scherer, K. R. (2003). Vocal communication of emotion: A review of research paradigms. Speech Communication, 40, 227–256. http://dx.doi.org/10.1016/S0167-6393(02)00084-5
Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from auditory stimuli. Motivation and Emotion, 1, 331–346. http://dx.doi.org/10.1007/BF00992539
Sethares, W. A. (1999). Automatic detection of meter and periodicity in musical performance. In Proceedings of the Research Society for the Foundations of Music. Chicago, IL.
Sloboda, J. A., & Juslin, P. N. (2001). Psychological perspectives on music and emotion. In P. N. Juslin & J. A. Sloboda (Eds.), Music and emotion: Theory and research (pp. 71–104). Oxford, United Kingdom: Oxford University Press.
Stevenson, R. A., & James, T. W. (2008). Affective auditory stimuli: Characterization of the International Affective Digitized Sounds (IADS) by discrete emotional categories. Behavior Research Methods, 40, 315–321. http://dx.doi.org/10.3758/BRM.40.1.315
Street, J. O., Carroll, R. J., & Ruppert, R. J. (1988). A note on computing robust regression estimates via iteratively reweighted least squares. The American Statistician, 42, 152–154.
Tufféry, S. (2011). Data mining and statistics for decision making. West Sussex, United Kingdom: Wiley. http://dx.doi.org/10.1002/9780470979174
Tzanetakis, G., & Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10, 293–302. http://dx.doi.org/10.1109/TSA.2002.800560
Västfjäll, D. (2013). Emotional reactions to tonal and noise components of environmental sounds. Psychology, 4, 1051–1058.

Received April 16, 2013
Revision received June 24, 2015
Accepted August 28, 2015
