Harmony Explained:
Progress Towards A Scientific Theory of Music
The Major Scale, The Standard Chord Dictionary, and The Difference of Feeling Between The Major
and Minor Triads Explained from the First Principles of Physics and Computation; The Theory of
Helmholtz Shown To Be Incomplete and The Theory of Terhardt and Some Others Considered
Daniel Shawcross Wilkerson
Begun 23 September 2006; this version 19 February 2012.
Table of Contents
1 The Problem of Music
1.1 Modern "Music Theory" Reads Like a Medieval Medical Textbook
1.2 What is a Satisfactory, Scientific Theory?
1.3 Music "Theory" is Not a Scientific Theory of Anything
1.4 Can we Make a Satisfactory Theory of Music?
1.5 Physical Science: Harmonics Everywhere
1.5.1 Timbre: Systematic Distortions from the Ideal Harmonic Series
1.6 Computational Science: as Fundamental as Physical Science
1.6.1 Algorithms are Universal
2 Living in a Computational Cartoon
2.1 Searching for Harmonics
2.1.1 Virtual Pitch: Hearing the Harmonic Series Even When it is Not There
2.1.2 Using Greatest Common Divisor as the Missing Fundamental
2.1.3 Even Animals Seem to Compute the Ideal Harmonic Series
2.2 Artifacts of Optimization
2.2.1 Relative Pitch: Differences Between Sounds
2.2.2 Octaves: Sounds Normalized to a Factor of Two
2.3 Harmony: Sweetness is the Ideal
2.3.1 Recreating an Ideal Harmonic Series using Instruments having Systematically-Distorted Timbre
2.3.2 Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes
2.3.3 Vertical Intervals Have Pure Ratios
2.3.4 Vertical Intervals Have Balanced Amplitudes
2.3.5 Vertical Intervals Are All The Same Ratio
2.3.6 Harmony is Sweeter Than Sweet
2.4 Interestingness: Just Enough Complexity
2.4.1 The Simplicity of Theme
2.4.2 The Complexity of Ambiguity
2.5 Recognition: Feature Vectors
2.5.1 Soft Computing
2.5.2 False Recognition
2.5.3 Cubism: Partial Recognition Due to Redundant, Over-Determined Feature Vectors
3 Harmonic Music Explained
3.1 The Major Triad
3.2 The Major Scale
3.2.1 Interlocking Triads
3.2.2 Using Logarithms to Visualize Distances Between Tones/Notes
3.2.3 The Keyboard Revealed
3.3 Scales and Keys
3.3.1 Changing Key: Playing Other Groups of Triads
3.3.2 Key Changes Break Harmony
3.3.3 Just versus Equal Tuning
3.4 The Minor
3.4.1 The Minor Triad
3.4.2 The Minor as Auditory Cubism
3.4.3 Minor Scales
3.5 Chords
3.5.1 The Standard Chord Dictionary
Hmm, the white and black keys mostly just alternate, yet these alternating regions last for 5 and then 7 keys
and then that 5/7 region-pair repeats, and where these regions meet there are two adjacent white keys. There
seems to be a pattern, but it is quite an odd one.
The piano keyboard seems really weird and ad-hoc.
Doesn't it seem that something as simple as sound should have a simple device for producing it?
Further, this weirdness is not specific just to the piano: the key layout reflects the Major Scale [maj] which is
the basis of all Western music. Is that black-white pattern somehow fundamental to sound and music itself? Or
are they really just a cultural coincidence, combinations of sounds that we have heard over and over since
infancy and been trained to associate with different emotions? Is something fundamental to the ear and to
sound itself that is going on here or not?
the California school board as told in '"Surely You're Joking, Mr. Feynman!": Adventures of a Curious
Character', [Feynman1985, p. 270-271], (emphasis in the original):
For example, there was a book that started out with four pictures: first there was a wind-up toy;
then there was an automobile; then there was a boy riding a bicycle; then there was something
else. And underneath each picture it said, "What makes it go?"
I thought, "I know what it is: They're going to talk about mechanics, how the springs work inside
the toy; about chemistry, how the engine of the automobile works; and biology, about how the
muscles work."
It was the kind of thing my father would have talked about: "What makes it go? Everything goes
because the sun is shining." And then we would have fun discussing it:
"No, the toy goes because the spring is wound up," I would say.
"How did the spring get wound up?" he would ask.
"I wound it up."
"And how did you get moving?"
"From eating."
"And food grows only because the sun is shining. So it's because the sun is shining that all these
things are moving." That would get the concept across that motion is simply the transformation of
the sun's power.
I turned the page. The answer was, for the wind-up toy, "Energy makes it go." And for the boy on
the bicycle, "Energy makes it go." For everything, "Energy makes it go."
Now that doesn't mean anything. Suppose it's "Wakalixes." That's the general principle:
"Wakalixes makes it go." There's no knowledge coming in. The child doesn't learn anything; it's
just a word!
is a better theory?
For one thing, the theory is mechanical: we have some mechanism, consistent with our understanding of
inanimate matter today (physics and chemistry) such that the operation of the mechanism corresponds
with what we observe (Scientific Method) [sci].
Further, this mechanism is deterministic and precise: there isn't much arbitrariness in the mechanism: we
can compute rather well how sick someone will get and how much toxin we have to give them to kill the
bacteria and not the person.
This mechanism is universal: there is no appeal to beliefs or cultural norms: people throughout the world
get sick in the same way and the medicines work on them, with but small differences that can be further
explained by another mechanism called genetics.
This mechanical explanation is simple and minimal (Occam's razor) [occ]. We can see the parts working.
Lastly, the mechanism is factored -- made up of independent parts -- and the complexity of the observed
phenomena is emergent -- arising naturally from the operation of the parts. That is, these parts of the
explanation of disease all operate independently: (1) how the body works such that the bacterial
excretions disrupt it, (2) how bacteria work such that the toxin kills them, (3) how the toxicity to the
human depends on the size of the human, etc.
Physicist Richard Feynman gave a series of lectures where he attempted to encapsulate the basic nature of
how science is done and the kind of results it produces; these were published as "The Character of Physical
Law" [Feynman1965]. Here is a brilliant paragraph on how to know when you have finally found the truth.
[Feynman1965, p. 171] (underlining added, not in the original):
One of the most important things in this 'guess -- compute consequences -- compare with
experiment' business is to know when you are right. It is possible to know when you are right way
ahead of checking all the consequences. You can recognize truth by its beauty and simplicity. It is
always easy when you have made a guess, and done two or three little calculations to make sure
that it is not obviously wrong, to know that it is right. When you get it right, it is obvious that it is
right -- at least if you have any experience -- because usually what happens is that more comes
out than goes in. Your guess is, in fact, that something is very simple. If you cannot see
immediately that it is wrong, and it is simpler than it was before, then it is right. The
inexperienced, and crackpots, and people like that, make guesses that are simple, but you can
immediately see that they are wrong, so that does not count. Others, the inexperienced students,
make guesses that are very complicated and it sort of looks as if it is all right, but I know it is not
true because the truth always turns out to be simpler than you thought.
Using Computer Science terminology, I summarize Feynman's point as follows.
The more factored a theory and the more emergent the observed phenomena from the theory, the
more satisfying the theory.
The Ptolemaic [ptol] model of the solar system puts the earth at the center. This explanation really does
explain the movements, especially when epicycles [epi] are added, but it is rather complex and ad hoc: how
does it emerge that we need epicycles? The Copernican [cop] system is another explanation of the solar
system, one that puts the sun at the center. This second explanation only requires Newton's laws of motion plus
gravity. The consequences of Newton's laws are complex and even hard to simulate, even on a modern
computer, but the laws themselves are quite simple and independent and mechanical and factored and
observable etc. Even further, the notation used in this theory easily reflects the underlying understanding in
the theory: it allows for easy calculations when making predictions of the theory. All in all, the Copernican
system is quite a satisfying explanation, or theory, of the motions of planets in the solar system because,
not only does it explain the observed phenomena, it is factored into simple parts and the observed phenomena
are emergent from the interactions of those parts. Consequently, we use the Copernican system today.
Given that
1. sound and instruments exist in reality and
2. music only sounds like something because a human brain performs the computation of listening to it,
it seems therefore that
1. physics and
2. computation
respectively are the appropriate places to start a real theory of music.
The brain is central to our theory. Not knowing how the brain really works, we therefore have a hole to fill in
our explanation. We proceed by telling a story to explain the known properties of music; along the way we
assume certain conjectures about the structure of the brain where we need them. We make these conjectures
as reasonable as possible, given the assumption that
The brain is a machine optimized by evolution to compute human survival.
That is, being a machine, the brain is likely to be subject to properties that computer scientists and engineers
have observed across many computational systems and that these properties will be driven by evolutionary
optimization. In the end, the test of our theory will depend on (1) how well it explains the observed
phenomenon called music, and (2) how well the conjectures hold up under testing. In this essay we do (1) and
we leave (2) for future work by cognitive/brain scientists.
(length) for a certain wavelength, so that waves bouncing back or being produced at each end
reinforce each other, instead of interfering with each other and cancelling each other out. And it
really helps to keep the container very narrow, so that you don't have to worry about waves
bouncing off the sides and complicating things. So you have a bunch of regularly-spaced waves
that are trapped, bouncing back and forth in a container that fits their wavelength perfectly. If you
could watch these waves, it would not even look as if they are traveling back and forth. Instead,
waves would seem to be appearing and disappearing regularly at exactly the same spots, so these
trapped waves are called standing waves.
We will call each single sine-wave at a single frequency a "tone", whereas the collection of frequencies that
occur together due to a single physical process (such as a vocal utterance or the striking of a piano key) we
will call a "note". (A tone can be expressed simply as (1) a wave "frequency" in Hertz (Hz), the number of
cycles per second, (2) a wave "amplitude", the wave peak height, and (3) a wave "phase", where the wave is
in its cycle compared to other waves; we won't discuss amplitude and phase much.)
This sequence of tones forming a note is called the "Harmonic Series" [har] or "Overtone Series" of the
fundamental. Herein we speak of "the (ideal) Harmonic Series" when we mean an abstract computational ideal
and speak of "an overtone series" when we mean what is actually produced in reality by a particular actual
instrument (which may be quite different from the ideal); note that others quoted here may not follow this
same convention. (Further, throughout we pluralize "series" as "series-es" because in a technical discussion it
is very important to avoid the ambiguity between a single series of multiple tones and multiple series-es of
multiple tones.)
There are two conventions for numbering overtones/harmonics; we use the convention where the fundamental
or "Root" tone is called "harmonic 1", the tone vibrating twice as fast is called "harmonic 2", the tone vibrating
three times as fast is called "harmonic 3", etc.
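As a small illustration of this numbering convention, the sketch below lists the tones of an ideal Harmonic Series. The function name `ideal_harmonic_series` and the 100 Hz fundamental are hypothetical choices made only for this example.

```python
def ideal_harmonic_series(fundamental_hz, count):
    """Return the first `count` tones of the ideal Harmonic Series.

    Harmonic n vibrates n times as fast as the fundamental, so
    harmonic 1 is the fundamental (Root) itself.
    """
    return [n * fundamental_hz for n in range(1, count + 1)]


# Hypothetical 100 Hz fundamental, chosen only for illustration.
series = ideal_harmonic_series(100, 5)
print(series)  # [100, 200, 300, 400, 500]
```

This describes only the abstract ideal; the overtone series of a real instrument may differ from it.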
having relatively high amounts of energy in the odd harmonics -- three times, five times, and
seven times the multiples of the fundamental frequency, etc. (This is a consequence of their being
a tube that is closed at one end and open at the other.) Trumpets are characterized by having
relatively even amounts of energy in both the odd and the even harmonics (like the clarinet, the
trumpet is also closed at one end and open at the other, but the mouthpiece and bell are designed to
smooth out the harmonic series). A violin that is bowed in the center will yield mostly odd
harmonics and accordingly can sound similar to a clarinet. But bowing one third of the way down
the instrument emphasizes the third harmonic and its multiples: the sixth, the ninth, the twelfth,
etc.
Besides introducing us to timbre, Levitin points out:
Most real instruments systematically produce tones having amplitudes distinct from that of the
ideal Harmonic Series.
Michael O'Donnell points out that the effects of timbre on the overtone series go even further [O'Donnell, 14
January 2009]:
I suggest that you check into the importance of approximate harmonic series. E.g., the overtones
on a piano string are measurably and audibly higher in frequency than the harmonics that they
approximate. Both the nearness to harmonics, and the perceptible difference, appear to be
important....
You mentioned the way that the harmonic series of frequencies occurs naturally in air columns, as
in strings. But, on soft strings (such as guitar, violin---little resistance to bending) the natural series
of resonant frequencies is very accurately harmonic. In wind instruments, the natural resonances
of the air column approximate the harmonic series rather poorly. In the brass, the approximation is
so poor that the numbers of the harmonics don't even match between the natural resonances and
the notes as played. While the conical shape of many reeds is designed to improve the
harmonicity of the resonances, the bell on the brass is actually designed to increase the
inharmonicity of the natural resonances, which produces a better match in the misaligned
overtones. It is phase locking between vibrational modes, caused by the highly nonlinear feedback
in the excitation mechanisms (reeds, lips, bow scraping) that makes the overtone series so
accurately harmonic, not the natural resonances.
That is, O'Donnell points out:
Most real instruments systematically produce tones having frequencies distinct from that of the
ideal Harmonic Series.
Therefore, whatever our theory of harmony, it should work for sounds where the overtone series differs from
the ideal Harmonic Series by (1) altered amplitudes and (2) altered frequencies. However, notice that both of
these distortions of the ideal Harmonic Series have one important property:
The distortions made by the overtone series of a given instrument to the ideal Harmonic Series
are a predictable, systematic function of the instrument kind.
That is, two notes (series-es of overtones) made by the same (kind of) instrument will be distorted from the
ideal Harmonic Series in the same (or similar) way. This must be the case in order for an instrument or
instrument kind to have a uniform, recognizable timbre. We will use this below.
I think part of the reason the theory we develop here might not have been described before is that there aren't
many people who think about both the physical and the computational understanding needed to derive it.
The properties, or laws, of computation are just as fundamental as the physical laws.
Computation is everywhere -- you live in a sea of it.
You may see a cup, but computational engineers see an idiom for managing liquids by getting them stuck
in a local optimum.
You may think of ownership as a basic human right, but engineers think of it as a distributed decision-making algorithm.
You may enjoy a field full of bumblebees pollinating flowers, but engineers enjoy it as an information
distribution network.
You may think it is polite to not talk on top of other people at dinner, but engineers think it is optimal to
use a back-off algorithm to resolve a network packet collision.
I wrote that list off of the top of my head as fast as I can type and edit text: the examples are myriad.
Consider for a moment that perhaps you are computation: that you are the computational activity of your
brain. Some people say that this reduces the wonder of life to simple mechanism; I say it simply elevates
mechanism to the wonder of life. While you need not adopt this All-Is-Computation point of view as your
personal understanding of life or of yourself, a computational understanding of the brain has amazing
explanatory power, so please consider it at least for the rest of this essay.
"I'm not bad, I'm just drawn that way." -- Jessica Rabbit [Jessica-bad]
Jessica Rabbit [Jessica-pout] is one of the sexiest characters in Hollywood, elected 88th of The 100 Greatest
Movie Characters of All Time by Empire Magazine [Jessica-great]. Sadly, she is just a drawing and a voice.
Despite the powerful illusion to the contrary, we do not see or hear the world; we see and hear the world that
our brains compute. Like the characters in "Who Framed Roger Rabbit?" [WFRR-1988], we live in a cartoon.
Music is not what the world does; it is what we do with the world.
A friend of mine, Joel Auslander, used to intern at Pixar; his job was to make physics simulator tools for the
animators. He wanted to make simulators that were accurate to the real physics, but he said that the animators
told him that people don't want to watch real physics, people want to watch cartoon physics: even though not
as accurate as real physics, cartoon physics is somehow more satisfying [Auslander, c. 1996].
Conjecture Two: The brain uses cartoon physics, that is, physics that is easy to compute, but not
2.1.1 Virtual Pitch: Hearing the Harmonic Series Even When it is Not There
There is a reliable acoustic phenomenon called "Virtual Pitch": if the Harmonic Series is processed to remove
the Root or Fundamental tone and then played to a person, that person will hear the note, including the Root
tone, even though it is not played [miss-fund]. The "Auditory Demonstrations" CD again [acoustical-demo,
Demo 20], "Virtual pitch":
A complex tone consisting of 10 harmonics of 200 Hz having equal amplitude is presented, first
with all harmonics, then without the fundamental, then without the two lowest harmonics, etc.
Low-frequency noise (300-Hz lowpass, -10dB) is included to mask a 200-Hz difference tone that
might be generated due to distortion in playback equipment.
As they say, in the demo overtones are subtracted one at a time, from the fundamental on up. Amazingly, the
note being played seems to stay the same; however it does get more buzzy or annoying to the point where a
fellow listener Simon Goldsmith thought that he would no longer call the last example the same note
[Goldsmith, c. 2010].
Virtual pitch is what allows engineers to fake bass notes on small speakers: they don't play the low tones, as
often the speaker is too physically small to make the fundamental frequency anyway; instead they play the
overtones and rely on your brain to reconstruct the whole Harmonic Series. However, as we noted above, you
will hear that small, cheap speakers sound, well, cheap or "tinny"; the bass just doesn't sound as good as it
does when played on sub-woofers. That said, don't forget how remarkable it is that you can still "hear" the
non-existent fundamental tone at all (which helpfully prevents the need for people to jog with sub-woofers
attached to their ears). From [miss-fund]:
For example, when a note (that is not a pure tone) has a pitch of 100 Hz, it will consist of
frequency components that are integer multiples of that value (e.g. 100, 200, 300, 400, 500.... Hz).
However, smaller loudspeakers may not produce low frequencies, and so in our example, the 100
Hz component may be missing. Nevertheless, a pitch corresponding to the fundamental may still
be heard.
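One crude way to make the missing fundamental concrete is the greatest common divisor idea named in the title of Section 2.1.2: the tones that remain are all integer multiples of the fundamental, so their GCD recovers it. The sketch below is only an illustration under the assumption of exact integer frequencies; the function name `implied_fundamental` is hypothetical, and nothing here is a claim about how the ear actually computes pitch.

```python
from functools import reduce
from math import gcd


def implied_fundamental(tone_frequencies_hz):
    """Greatest common divisor of integer tone frequencies, in Hz."""
    return reduce(gcd, tone_frequencies_hz)


# Ten harmonics of 200 Hz, as in the demo quoted above.
harmonics = [200 * n for n in range(1, 11)]

# Drop the lowest harmonics one at a time and recompute the estimate.
for dropped in range(4):
    remaining = harmonics[dropped:]
    print(dropped, implied_fundamental(remaining))

# The loudspeaker example: the 100 Hz component itself is missing.
print(implied_fundamental([200, 300, 400, 500]))  # 100
```

In each step of the loop the estimate stays at 200 Hz even though the lowest tones are gone. Real overtones are not exact integers, so a practical recognizer would need a tolerant version of this idea.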
(Note that virtual pitch is a special case of (1) the feature vector understanding that we give in Section 2.5
"Recognition: Feature Vectors" and (2) the concomitant effect of false recognition that we speak of in Section
2.5.2 "False Recognition", where here virtual pitch is the false recognition of the Harmonic Series.)
(See Section 6.2 "Terhardt Does Not Explain Sustained and Minor Chords" for an illustration by Coren
[Coren1972] (as quoted by Terhardt [Terhardt1974-PCH]) which shows standard visual illusions as a metaphor
with virtual pitch.)
("How to Play From a Fake Book" [Neely1999] says that when playing a chord, you can drop not only the
Root of the chord, but also the Fifth and the listener will still hear the chord; see Section 3.5.4 "Chords
Inducing Ambiguity". We should point out that here we speak of omitting one note from a chord, a collection
of multiple notes, or multiple series-es of tones, whereas virtual pitch is a phenomenon of omitting one tone
from a single Harmonic Series of tones of a single note. However we argue later in Section 2.3.2 "Harmony
Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes" that these two
situations are closely related and therefore the fact that it works to omit the Root or Fifth of a chord is actually
the phenomenon of virtual pitch again and is thus more evidence for our theory that the brain is listening for
the Harmonic Series.)
Recognizing ratios of tones (and notes) more strongly than the absolute tones themselves is a phenomenon
called "Relative Pitch" [rel]. A ratio of a pair of tones (or notes) is called an "interval".
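To illustrate the definition of an interval as a ratio, here is a minimal sketch using exact fractions. The function name `interval` and the frequencies are hypothetical, chosen only to show that two tone pairs at different absolute pitches can share the same interval.

```python
from fractions import Fraction


def interval(lower_hz, upper_hz):
    """Interval between two tones, expressed as an exact frequency ratio."""
    return Fraction(upper_hz, lower_hz)


# Two hypothetical tone pairs at different absolute pitches.
print(interval(200, 300))  # 3/2
print(interval(440, 660))  # 3/2
```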
and their emotional state. But being perfect makes this recognition hard; from "What Caricatures Can Teach
Us About Facial Recognition" [Austen-caricature] (see Section 2.5.2 "False Recognition" for more):
[W]hen you talk to these artists about their process, you realize that the psychologists have gotten
the basics down pretty well. When Court Jones, the 2005 Golden Nosey winner, describes how he
teaches the craft to younger artists, he lays out exactly the algorithm that vision scientists believe
humans use to identify faces. Students, he says, should imagine a generic face and then notice
how the subject deviates from it: "That's what you can judge all other faces off of."
Also, just as a vision scientist would predict, symmetrical faces -- those close to our internal
average -- are especially difficult to caricature. People at the convention mention struggles with
Katy Perry and Brad Pitt; the animator Bill Plympton, a guest speaker at the convention, tells me
that Michael Caine has long been a bête noire. The same principle explains why the person at the
convention with maybe the least symmetrical of faces appears by week's end in no fewer than 33
works of art on the ballroom walls.
I don't think I need a citation to claim that Katy Perry and Brad Pitt are considered to be very beautiful
people. This suggests another conjecture.
Conjecture Six: Absence of distortion (or personality or timbre) is sweetness.
2.3.1 Recreating an Ideal Harmonic Series using Instruments having Systematically-Distorted Timbre
In Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series" above we saw that the
overtone series of a single instrument is easily distorted by myriad physical effects. However, recall that for
the same (kind of) instrument, those distortions were systematic and reliable. Therefore by playing
multiple notes,
on instruments having the same (or similar) timbre,
and relying on Relative Pitch to subtract the differences for us,
from distorted overtone series-es we can magically recreate parts of the ideal Harmonic Series!
2.3.2 Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes
Suppose we play two notes on the piano that are a Fifth (a factor of 3/2) apart. Per O'Donnell's comment in
Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series" above, since piano strings are
not the strings of ideal physics, they don't make an ideal Harmonic Series. Instead, each tone in the series is
moved by being multiplied by some fudge factor. However notice that strings on the piano are made of the
same stuff, at least nearby strings, and this fudge factor should therefore be somewhat consistent across
strings. That is, two corresponding tones at the same point in the overtone series of two different notes should
get multiplied by the same fudge.
Tones of 1st note:    1  --->  (1   * 2 * fudge2)  --->  (1   * 3 * fudge3)  ...
                      |                 |                         |
                      v                 v                         v
Tones of 2nd note:   3/2 --->  (3/2 * 2 * fudge2)  --->  (3/2 * 3 * fudge3)  ...
Now notice that there are two kinds of intervals of tone pairs:
"horizontal": intervals made by pairs of tones within the one series of tones generated by one note, and
"vertical": intervals made by pairs of tones across the two series-es of tones generated by the two
different notes, especially those of corresponding overtones.
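Horizontal intervals are not pure: the ratio of overtone 3 of the 1st note to overtone 1 of the 1st note is
(1 * 3 * fudge3) / 1 = 3 * fudge3, which carries the fudge factor.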
Vertical intervals are pure: the ratio of overtone 3 of the 2nd note to overtone 3 of the 1st note is pure:
(3/2 * 3 * fudge3) / (3 * fudge3) = 3/2 (pure!).
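The cancellation of the fudge factor can be checked with exact arithmetic. In the sketch below the fudge values are invented for illustration; the only assumption that matters is the one made in the text, namely that both notes share the same fudge factor at each harmonic.

```python
from fractions import Fraction

# Hypothetical fudge factors for harmonics 1..3 of one instrument kind.
# The values are invented; only their being shared by both notes matters.
FUDGE = {1: Fraction(1), 2: Fraction(1001, 1000), 3: Fraction(1003, 1000)}


def overtone_series(root):
    """Distorted overtone series: harmonic n sits at root * n * fudge_n."""
    return {n: root * n * fudge for n, fudge in FUDGE.items()}


first = overtone_series(Fraction(1))
second = overtone_series(Fraction(3, 2))  # a Fifth above the first note

# Horizontal interval inside the first note: harmonic 3 over harmonic 1.
horizontal = first[3] / first[1]

# Vertical intervals: corresponding harmonics across the two notes.
vertical = {n: second[n] / first[n] for n in FUDGE}

print(horizontal)  # 3009/1000, not a pure 3
print(vertical)    # every entry is Fraction(3, 2)
```

Changing the invented fudge values changes the horizontal interval but leaves every vertical interval at exactly 3/2.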
However, I would be remiss if I did not point out here [acoustical-demo, Demo 31], "Tones and Tuning with
Stretched Partials" from "Auditory Demonstrations" CD, quoted in Section 5.1 "Helmholtz's Theory Relies
Only On Interfering Overtones, But Harmony Is Something More". In Demo 31, a piece by Bach is played on
computer-generated piano (part 1) having normal overtones and (part 4) having overtones where an Octave is
stretched from a factor of 2 to a factor of 2.1. Taken naively, our theory that the purity of vertical intervals
matters to the brain suggests that these should both harmonize; however the normal one (part 1) certainly
sounds better. We suggest therefore that if the horizontal intervals are distorted grossly enough, then the fact
that the vertical intervals are pure cannot save the harmony from being destroyed by the dissonance of the
horizontal intervals.
one voice, especially that of a trained singer, as in the horizontal intervals of that voice there is one
instance of each interval of the Harmonic Series (albeit with the fudge we mentioned above of
horizontal intervals).
Vertical intervals are all of the same kind, an entire box of chocolate almond cherry: on the other
hand when two voices are sung, say, a Fifth apart, there is an entire wall of the same kind of sweetness,
a wall of many Fifths coming at you, namely the vertical intervals above, each of which is a Fifth.
(Again, for an introduction to musical intervals such as the Fifth, see Section 3.1 "The Major Triad".)
be greatly reduced. Since it is work to process information, we suggest that the brain likes to have reliable
expectations in order to minimize the amount of surprise it is dealing with all day.
Model Inference: Life is full of situations where we may observe the consequences of a situation but are not
told explicitly what is the state of the situation. There is nothing left to do but to infer a model of the state of
affairs from observation of many details, and therefore inference is likely a constant activity of the brain. For
example, people often infer the rules of a game from observation and without reading the rules.
Have you ever seen someone color-coordinate their clothes or even their room? Have you ever been to a
"theme party" where everyone was to dress and act from a given era or situation? How about a "theme
restaurant" or "theme park"? Having a theme for all of the elements of a given situation
(surprise reduction) reduces the amount of new information or "surprise" that each one introduces, and
(ease of inference) allows the brain to construct a whole from the parts.
Differences and changes are interesting to the brain, but too much difference fails to feel "unified" -- it does
not all occur as parts of a whole. In support of both surprise-reduction and ease of inference, we consider it
likely that
Conjecture Seven: The brain wants input to have a theme. That is, the brain both infers themes
from input and uses themes as context when processing input.
horse raced past the barn fell.
Sentences like these are called garden-path sentences because, in slow reading, we often notice
that we have followed an analysis path that turned out to be wrong....
But why are people surprised in garden-path situations? The brain is a massively parallel
information processor and is able to retain multiple active possibilities for interpreting sentence,
scene, and so on. Well, there must be a cutoff after which some possible interpretations are
deemed so unlikely as to be not worth keeping active. The final piece of their [referring to a model
given by other researchers] model was an assumption that a hypothesis was abandoned if its belief
net score was less than 20% of that of its rival. We experience surprise when the analysis needed
for a full sentence is one that was deactivated earlier as unlikely. This is a complex computational
model, but nothing simpler can capture all the necessary interactions.
The input the brain gets as we live life is inherently and often wildly ambiguous. Alternatives multiply and so
the number of possible ambiguities in a situation can easily grow exponentially. No machine can keep up with the
demands of a problem the size of which grows that fast. Therefore:
Much of the brain is a massive disambiguation engine that is running all the time and is
functioning at its computational limit.
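A rough sketch of both points, the exponential growth of alternatives and the pruning of unlikely ones, follows. The 20% cutoff comes from the quotation above; everything else is an assumption made for illustration: the function names, the scores, and the reading of "its rival" as the best-scoring hypothesis.

```python
def count_readings(options_per_choice):
    """Number of combined interpretations when each choice is independent."""
    total = 1
    for options in options_per_choice:
        total *= options
    return total


def prune(scores, cutoff=0.2):
    """Keep hypotheses scoring at least `cutoff` times the best score.

    Treating the best-scoring hypothesis as "the rival" is an assumption
    made for this sketch.
    """
    best = max(scores.values())
    return {name: s for name, s in scores.items() if s >= cutoff * best}


# Ten two-way ambiguities already give 2**10 combined readings.
print(count_readings([2] * 10))  # 1024

# Hypothetical scores, invented for illustration.
scores = {"reading A": 0.90, "reading B": 0.10, "reading C": 0.30}
print(prune(scores))
```

With these invented scores, reading B falls below 20% of the best score and is dropped, which is the kind of deactivation the quoted model uses to explain garden-path surprise.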
Jokes are often of the form of an ambiguity of contexts/themes resolved by a punchline which evaluates one
way in one context and another way in the other context (say true in one and false in the other); the story that
precedes the punchline serves to amplify the weaker context, the weaker side of the ambiguity, so as to
maximize the punch of the line by making it break symmetry between two almost equal contexts/themes. Story
plots are often of this form as well, in particular mysteries. The language of Shakespeare is full of double
meanings and even perhaps a triple meaning here and there. These are all to the same purpose:
Conjecture Eight: The brain enjoys having its disambiguation engine teased.
timbres; thus the Harmonic Series recognizer must be able to robustly find the fundamental even when some of
the tones are missing. See Section 2.1.1 "Virtual Pitch: Hearing the Harmonic Series Even When it is Not
There", Section 2.1.2 "Using Greatest Common Divisor as the Missing Fundamental", and Section 1.5.1
"Timbre: Systematic Distortions from the Ideal Harmonic Series".
Major Triad
Root          (harmonic 1):   1 = 1.0.
Major Third   (harmonic 5): 5/4 = 1.25.
Perfect Fifth (harmonic 3): 3/2 = 1.5.
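The three ratios above can be derived from harmonics 1, 5, and 3 by repeatedly dividing by 2 until each lands within one Octave of the Root. The sketch below assumes that reading; the helper name `fold_into_octave` is hypothetical.

```python
from fractions import Fraction


def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


# Harmonics 1, 5 and 3 give the Root, Major Third and Perfect Fifth.
major_triad = {
    "Root": fold_into_octave(1),
    "Major Third": fold_into_octave(5),
    "Perfect Fifth": fold_into_octave(3),
}

for name, ratio in major_triad.items():
    print(name, ratio, float(ratio))
```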
Note that according to our measure of interestingness in Section 2.4 "Interestingness: Just Enough
Complexity", the intervals of the Fifth (factor of 3) and the Third (factor of 5) are, in that order, the most
interesting intervals: (1) they are in the theme of the Harmonic Series, while also (2) they have some
complexity resulting from not being a simple power of two times the Root (which if they were would make
them subject to the octave effect tending to make two notes sound like one; that is, a factor of 2 is too boring).
If we pick C as the Root (as we did above) then the resulting Major Triad is called the "chord" of C Major. The
starting note of "C" was arbitrary; however the resulting triad was not. Is it so surprising that this Major Triad
is everywhere in music? It sounds rather nice to play notes in the C-Major Triad; try it. However, after a while
it is a little boring, so we would like to add some variety. How little complexity can we add and yet still change
something?
Ok, that was so much fun let's go in the other direction as well. That is, let's make yet another Major Triad
where the Perfect Fifth of that Triad is the Root of our first Triad. That means multiplying by 1/(3/2) =
2/3; therefore let's build a triad using 2/3 times C4 = F3 as the fundamental. Let's be sure to multiply by 2
when necessary to keep everything within the same Octave. (Note that throughout we use "~" (tilde) to mean
"almost equals".)
Major Triad Down by a Perfect Fifth
Root:          2/3 * 1   = 2/3  which is smaller than 1, so mult by 2: 4/3 ~ 1.333.
Major Third:   2/3 * 5/4 = 5/6  which is smaller than 1, so mult by 2: 5/3 ~ 1.667.
Perfect Fifth: 2/3 * 3/2 = 1                                                = 1.0.
Note that the selection of three interlocking triads is suggested by our measure of interestingness from Section
2.4 "Interestingness: Just Enough Complexity". That is, using three overlapping Major Triads (1) maximizes
the theme of the Harmonic Series while not requiring any harmonics beyond harmonic 5 (the interval called
the Third), while also (2) having some complexity by not all being of one Harmonic Series.
Now we have three "interlocking" Triads: the Perfect Fifth of one is the Root of the next. How many notes is
that? Three notes per triad times three triads is nine notes; however two of the notes where the triads interlock
are counted twice, so there are 3 * 3 - 2 = 7 unique notes. Let's plot them on a line to see how far they are
from one another.
log_b(b^y) = y
You can use exponentials to think about logarithms. When computing the ratios of numbers, imagine each
number represented as an exponential of a base, such as 2. Now think about what multiplication and division
of that number do to the exponent as the numbers are divided or multiplied. That is, if we think of taking the
ratio of two numbers represented as exponents we see that we are just subtracting their exponents (the
logarithms of the original numbers); that is,
2^p / 2^q = 2^(p-q).
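Major Triad
Root:          1/1;                            log_2 1/1  ~ 0.000.
Major Third:   5/4;                            log_2 5/4  ~ 0.322.
Perfect Fifth: 3/2;                            log_2 3/2  ~ 0.585.
Major Triad Up by a Perfect Fifth
Root:          (3/2) * (1/1) * (1/1) = 3/2;    log_2 3/2  ~ 0.585.
Major Third:   (3/2) * (5/4) * (1/1) = 15/8;   log_2 15/8 ~ 0.907.
Perfect Fifth: (3/2) * (3/2) * (1/2) = 9/8;    log_2 9/8  ~ 0.170.
Major Triad Down by a Perfect Fifth
Root:          (2/3) * (1/1) * (2/1) = 4/3;    log_2 4/3  ~ 0.415.
Major Third:   (2/3) * (5/4) * (2/1) = 5/3;    log_2 5/3  ~ 0.737.
Perfect Fifth: (2/3) * (3/2) * (1/1) = 1/1;    log_2 1/1  ~ 0.000.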
Note that this may look more complicated than it really is. All that is going on numerically is playing with
factors of 2, 3, and 5, in a rather systematic way, as follows:
The three Triads are each a factor of 3 (a "Fifth") apart.
Within each Triad, we have a factor of 3 (a "Fifth") and a factor of 5 (a "Third") from the Root.
We multiply or divide by 2 (an "Octave") enough times to keep everything in one Octave.
Notice that the importance of the numbers 2, 3, and 5 is not uniform: 3 is used most prominently, 5 is more
secondary, and, going the other direction, 2 is so boring we just throw it in wherever we like. This reflects our
observation from Section 3.1 "The Major Triad" that, of the different harmonics, a factor of 3 seems to have the
right amount of complexity to be most interesting, so it gets top billing (see Section 2.4 "Interestingness: Just
Enough Complexity" for more on interestingness in general).
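The three steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the original derivation; the names into_octave, roots, and scale are my own, and exact fractions are used so that no rounding sneaks in.

```python
from fractions import Fraction
from math import log2

def into_octave(ratio):
    """Multiply or divide by 2 (an Octave) until the ratio lies in [1, 2)."""
    while ratio < 1:
        ratio *= 2
    while ratio >= 2:
        ratio /= 2
    return ratio

FIFTH = Fraction(3, 2)
# Root, Major Third, Perfect Fifth of a Major Triad.
TRIAD = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]

# Roots of the three interlocking triads: a Fifth below, the center, a Fifth above.
roots = [1 / FIFTH, Fraction(1), FIFTH]

# Collect every note of every triad, folded into one Octave; the set drops
# the two notes that are counted twice where the triads interlock.
scale = sorted({into_octave(root * step) for root in roots for step in TRIAD})

for ratio in scale:
    print(f"{str(ratio):>5}  log_2 ~ {log2(ratio):.3f}")
```

Running it lists seven distinct ratios, which is the 3 * 3 - 2 = 7 count from the text.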
log_2 1/1  ~ 0.000.
log_2 9/8  ~ 0.170.
log_2 5/4  ~ 0.322.
log_2 4/3  ~ 0.415.
log_2 3/2  ~ 0.585.
log_2 5/3  ~ 0.737.
log_2 15/8 ~ 0.907.
Now let's plot them on the unit interval to within 0.02 units.
C       D       E    F       G       A       B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0
Hmm, now that's interesting, if we were to fill in a few gaps they would look almost evenly spaced. Following
music theory, we'll call the big gaps "tones" (yes, this is a different meaning of the word "tone") and the small
gaps "semi-tones". (This meaning of "tone" and "semi-tone" will not occur very often, and below I try to use
only semi-tone.) I'll fill in the big gaps with a hash sign (I'll omit computing any exact values for them) so all
the gaps are now semi-tones.
C   #   D   #   E    F   #   G   #   A   #   B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0
Does that look familiar? If not, color the letters white and the hashes black and look again at the picture of the
keyboard at the top of the article.
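The big and small gaps can also be checked numerically rather than by eye. The sketch below is my own illustration: it takes the seven ratios derived above (plus the Octave), divides each note by its lower neighbor, and labels each gap. The cut-off of 0.125 (one and a half equal semi-tones on the log_2 scale) is an assumption chosen only to separate the two sizes.

```python
from fractions import Fraction
from math import log2

# The seven notes of the derived scale, plus the Octave to close the last gap.
scale = [Fraction(1), Fraction(9, 8), Fraction(5, 4), Fraction(4, 3),
         Fraction(3, 2), Fraction(5, 3), Fraction(15, 8), Fraction(2)]
names = "CDEFGABC"

# Ratio between each note and the next one up.
gaps = [hi / lo for lo, hi in zip(scale, scale[1:])]

# Assumed threshold: anything wider than 0.125 of an Octave counts as a "tone".
kinds = ["T" if log2(gap) > 0.125 else "S" for gap in gaps]

for i, (gap, kind) in enumerate(zip(gaps, kinds)):
    label = "tone" if kind == "T" else "semi-tone"
    print(f"{names[i]}-{names[i + 1]}: {str(gap):>5}  log_2 ~ {log2(gap):.3f}  ({label})")
```

The pattern of big and small gaps that comes out is the familiar white-key layout: two big gaps, a small one, three big gaps, a small one.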
Notice that there was no resorting to the following arguments:
"Because the Ancient Greeks did it this way."
"Because if the notes were equally spaced your ear would lose its place."
"Because your culture has trained these notes into your ear since you were a baby."
The result emerged naturally just from some physics and some computer science.
After I made this derivation of the Major Triad from first principles, a friend of mine, Peter McCorquodale,
pointed me to "Aesthetic Measure" by George D. Birkhoff [Birkhoff1933]. On page 92 in the section "The
Natural Diatonic Scale", Birkhoff independently makes the same derivation of the Major Scale as we do
above, albeit providing less detail and with no motivation from computer or brain science. Given the Major
and Minor Triads, Helmholtz also seems to give the same theory of a key as interlocking chords
[Helmholtz1863, p. 300] as we do above, though we argue below that he fails to explain how it is that we find
the Major and Minor Triads compelling to listen to in the first place.
While my derivation of the Major Scale above is therefore not a completely original contribution, it is also
certainly not well known. It was quite an effort for me to invent it, given my starting point of nothing but
curiosity about the problem and disgust with all the books to which I had access. How is it that even music
majors in college do not know this derivation of the Major Scale?
The notes above are known as the 12-(Semi-)Tone Western (Chromatic) Scale (you will hear people call it the
"12-tone Western Scale", including in quotations below). The subset of lettered (or white) keys, omitting the
hashes (or black keys), is called the Major Scale.
We can now explain some standard musical terminology. In the particular case of the C-triad, the note E is
called the Major Third, as it is the third white key in the Major Scale. Similarly, the note G is called the Perfect
Fifth, as it is the fifth white key. Deep huh? We defer discussion on how it is that one is called "Major" and the
other "Perfect" until the section on Equal vs. Just Tuning below.
Explaining all musical conventions is beyond the scope of this article, but I mention a few basic ones we will
need. Going up a step is called "sharp", denoted "#", and down "flat", denoted "b", so we now have two names
for each black key; for example the black key between C and D is both "C#" and "Db".
The "key" of the scale is the Root note of what we called the base triad (the one in the middle of the three
interlocking triads); that is, in the example above the key was C Major. It will turn out that there are more
ways to build a scale than locking together those three Major Triads (F, C, G).
1. We could use a note other than C as the base of the center triad.
2. We could use another kind of triad other than the Major Triad; we haven't talked about that yet.
The name of the key indicates which choices were made for the two variables above: the one we built above
was built (1) starting at C and (2) using three Major Triads (a Major Scale), so it is called the C Major Scale.
Adding F# to our keys can be done; however there is a worse problem. Let's compute the ratios we get if we
build a Major Triad starting at D, that is, using the notes D, F#, and A. In particular, if A is to be the Perfect
Fifth above D, then their ratio should be 3/2. (I kept so many decimal places below as the fractional part is just
too cool to omit.)
We got D as the Fifth above G, and G as the Fifth above C.
(We divide by 2 to keep it in the same Octave:)
D = (3/2) * (3/2) / 2 = (9/4) / 2 = 9/8.
We got A as the Third above F, and F as the Fifth below C.
(We multiply by 2 to keep it in the same Octave:)
A = (1/(3/2)) * (5/4) * 2 = (2/3) * (5/4) * 2 = 5/3.
A over D is therefore
A / D = (5/3) / (9/8) = (5*8)/(3*9) = 40/27 ~ 1.481.
Whereas a Perfect Fifth should be
3/2 = 1.5.
The error is therefore
(PerfectFifth - (A/D)) / PerfectFifth
= ((3/2) - (40/27)) / (3/2)
~ 0.0123456790123457
~ 1.2%.
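The same computation can be redone with exact fractions, which also shows where that cool decimal expansion comes from. This is a small sketch of the arithmetic above; the variable names are mine.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)
THIRD = Fraction(5, 4)

# D: up a Fifth from C to G, up another Fifth, then down an Octave.
D = FIFTH * FIFTH / 2
# A: down a Fifth from C to F, up a Third, then up an Octave.
A = (1 / FIFTH) * THIRD * 2

ratio = A / D
error = (FIFTH - ratio) / FIFTH

print(f"D = {D}, A = {A}")
print(f"A / D = {ratio} ~ {float(ratio):.3f}")
print(f"error = {error} ~ {float(error):.16f}")
```

The relative error comes out as exactly 1/81, whose decimal expansion is the repeating 0.012345679...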
So if we measure carefully, we notice that, even with the big gaps filled in, the intervals are not all exactly right
for playing another key, such as D Major. That is:
If we want to do a key change, we can try (a) just using the same piano we derived for the key of C
Major, but (b) playing whatever piano keys we find when we just move "up" a triad; that is, using the
same notes as for C Major but making the triad rooted at D.
However, if we compute the note ratios carefully, we see that the ratios for the "triad" rooted at D will
not be quite right. They also will not sound right. In fact, if we do more key changes, moving, say,
repeatedly "up" by a Perfect Fifth again (beyond D), some of the other triads will be even less right and
will start to sound really bad.
Uh, oh. Should we buy a new piano every time we change key?
notes equally-spaced, each semi-tone is the number such that when you multiply it by itself 12 times you get
an Octave, or factor of 2. This number is also called the twelfth-root of 2, which is 1.0594630943593, or one plus
about six percent. That is, every time you go up a semi-tone, you are adding about six percent to the frequency of
the note. Below I'll call this ratio of the interval of a semi-tone "TwR2" when I want to emphasize that it is the
twelfth-root of 2.
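A quick numerical check of TwR2, as a sketch (the list name steps is my own): twelve equal semi-tone steps should land on the Octave.

```python
# Ratio of one Equal Tempered semi-tone: the twelfth-root of 2.
TwR2 = 2 ** (1 / 12)

# Ratio to the Root after 0, 1, 2, ..., 12 semi-tone steps.
steps = [TwR2 ** k for k in range(13)]

print(f"TwR2 = {TwR2:.13f}")
for k, ratio in enumerate(steps):
    print(f"{k:2d} semi-tones: {ratio:.6f}")
```

The last line printed is the Octave, a factor of 2 up to floating-point rounding.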
Note that the Equal Tempered Chromatic Scale is commensurate with our measure of interestingness from
Section 2.4 "Interestingness: Just Enough Complexity". That is, these twelve notes (1) maximize the theme of
being closed under arbitrary key changes and preserve the theme of the Major Scale (as a subset of the twelve
notes), while also (2) having probably the right amount of complexity, namely twelve notes, which Levitin
points out later in Section 4.2 "But Other Cultures Have Different Musical Scales!" seems to be a human
universal for scales (my guess is that's a limit on how much complexity the human brain can handle).
Equal Tuning is not considered to be a completely good thing. First, now none of the keys are tuned really
right -- now they all sound a little "off" -- some of the sweetness is permanently gone. Second, while all the
keys sound the same, on the other hand... all the keys sound the same! (That is, "sound the same" with respect
to relative pitch considerations; Mark Hoemmen points out that other factors of absolute pitch may still make
keys sound different [Hoemmen, October 2011].) When the notes were all tuned to make one key sound
perfect, other keys sounded "off" in different ways and this could be used for dramatic effect by the
composer; as a musician once pointed out to me, when we play that old music today on a modern keyboard,
we no longer hear it as it was intended. By adopting the Equal Tempered Scale, as with all engineering
tradeoffs, something has been gained and something lost.
In our insistence on symmetry we have lost both some sweetness and some richness -- a common
theme of Modernism.
Recall that in, say, the C Major Scale, the Fifth is called "Perfect" and the Third "Major". This is because the
approximations introduced by Equal Tuning caused more damage to the Third than the Fifth. Specifically the
Fifth on an Equally Tempered piano is very close to 3/2:
(TwR2^7) / (3/2) ~ 0.998871 = 1 - 0.001129 ~ 1 - 0.1%.
The Third, by contrast, is further from 5/4:
(TwR2^4) / (5/4) ~ 1.007937.
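These two ratios are easy to recompute. The sketch below assumes, as on the keyboard above, that the Fifth is 7 semi-tones and the Third is 4 semi-tones above the Root; the variable names are mine.

```python
TwR2 = 2 ** (1 / 12)  # one Equal Tempered semi-tone

# Equal Tempered interval divided by the pure (just) interval.
fifth_ratio = TwR2 ** 7 / (3 / 2)
third_ratio = TwR2 ** 4 / (5 / 4)

print(f"Fifth: {fifth_ratio:.6f} (off by {abs(fifth_ratio - 1):.2%})")
print(f"Third: {third_ratio:.6f} (off by {abs(third_ratio - 1):.2%})")
```

The Third is off by roughly seven times as much as the Fifth.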
We now consider what different kinds of easily-computable features the Harmonic Series might have that the
brain might use to look for it. Recall further from Section 2.2 "Artifacts of Optimization" that we assumed that
the brain uses two different tricks in order to recognize the many different possible instances of the Harmonic
Series (those based on different fundamentals) using the same hardware and using that hardware as efficiently
as possible.
Relative pitch: the brain divides pairs of co-occurring tones and recognizes the resulting pairwise
intervals.
Octaves: the brain normalizes tones into a single factor of two range by dividing or multiplying up front
by powers of two.
Octaves: when normalized into the Octave of the fundamental, the Major Triad looks like this.
Harmonics of the Major Triad
Root:          1   = 1.0.
Major Third:   5/4 = 1.25.
Perfect Fifth: 3/2 = 1.5.
Relative pitch: when expressed as a set of ratios of pairwise intervals, the Major Triad looks like this.
Intervals of the Major Triad
Major Third   / Root:        (5/4) / (1)   = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Perfect Fifth / Major Third: (3/2) / (5/4) = 6/5.
When expressed as a set of ratios of pairwise intervals, the Minor Triad looks like this.
Intervals of the Minor Triad
Perfect Fifth / Minor Third: (3/2) / (6/5) = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Minor Third   / Root:        (6/5) / (1)   = 6/5.
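The two tables can be compared mechanically. In this sketch (the function name pairwise_intervals is my own) each triad is reduced to the sorted list of ratios between every pair of its notes, which is how a purely relative-pitch recognizer would see it.

```python
from fractions import Fraction
from itertools import combinations

def pairwise_intervals(chord):
    """Sorted ratios (higher note over lower note) for every pair of notes."""
    return sorted(hi / lo for lo, hi in combinations(sorted(chord), 2))

major = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]
minor = [Fraction(1), Fraction(6, 5), Fraction(3, 2)]

print("Major:", [str(r) for r in pairwise_intervals(major)])
print("Minor:", [str(r) for r in pairwise_intervals(minor)])
print("Same set of intervals:", pairwise_intervals(major) == pairwise_intervals(minor))
```

Both triads yield the same three intervals, 6/5, 5/4, and 3/2; only their arrangement relative to the Root differs.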
Harmonic Series, and myriad ways for things to sound off in some way, resulting in the multiple Minor Scales
and other even stranger things (see Section 3.5.6 "Chords Preserving Intervals but not Harmonics").
3.5 Chords
Musicians sometimes play multiple notes at the same time. Likely by trial and error, people have discovered
that certain combinations of notes convey a particular "emotion" or "sense" to the listener. Notice I didn't say
"sound good" -- in language we often use words that don't sound good but still convey a certain sense, even if
it is bad. That is, a musical sound should sound "like something", even if not a good something. Similarly, in
written expression we may say things good or bad, but we don't (often) say complete nonsense and even if we
do it is in a context that gives the nonsense some sort of meta-sense. Let's call these groups of notes that sound
like something "chords".
Instead of resorting to trial and error to discover these chords, let's see if we can derive them from our first
principles and conjectures, given above. To make this a proper scientific experiment we need a "ground truth"
list of chords as a goal against which to test our progress. Many piano books come with a "chord dictionary"
which is a reasonable measure of the chords that people use most of the time and that have been discovered by
trial and error over the centuries to sound like something; we'll use that. Motivating observed phenomena from
the first principles of a theory is considered to be an ultimate test of the quality of a theory (together with
being well-factored; see Section 1.2 "What is a Satisfactory, Scientific Theory?"): if it succeeds, the theory has
about all of the explanatory power we could ever want.
Let's see if we can motivate from first principles all of the chords in a Standard Chord
Dictionary.
Throughout this derivation of different chords, you will note a gradual progression or degradation from
high-theme/low-complexity (Major Triad, Harmonic Series chords) to low-theme/high-complexity (Minor and
Ambiguous chords). This progression is suggested by our measure of interestingness from Section 2.4
"Interestingness: Just Enough Complexity". That is, as we progress, more and more of the theme of the
Harmonic Series is lost and more and more complexity is introduced. Notice that this progression seems to
mirror that of musical sophistication as well: musically untrained listeners like Major chords while more
musically trained listeners are more tolerant to loss of theme and more interested in complexity. (I met a signal
processing engineer who had played piano for something like 18 years and who simply did not like Major
chords at all.) Other fields seem to progress similarly: white wine is preferred by new wine drinkers, whereas
more "complex" red wines are an acquired taste.
Name                    Notes in C Major Scale  Semi-tones from fundamental
C "Major Triad"         C-E-G                   0-4-7
Cm "Minor Triad"        C-Eb-G                  0-3-7
Cdim "Diminished"       C-Eb-Gb                 0-3-6
C+ "Augmented"          C-E-G#                  0-4-8
Csus "Sustained"        C-F-G                   0-5-7
C6 "Sixth"              C-E-G-A                 0-4-7-9
Cm6 "Minor Sixth"       C-Eb-G-A                0-3-7-9
C7 "Dominant Seventh"   C-E-G-Bb                0-4-7-10
Cmaj7 "Major Seventh"   C-E-G-B                 0-4-7-11
                        C-Eb-G-Bb               0-3-7-10
                        C-Eb-Gb-A               0-3-6-9
                        C-E-G-D                 0-4-7-14
C9 "Ninth"              C-E-G-Bb-D              0-4-7-10-14
Cmaj9 "Major Ninth"     C-E-G-B-D               0-4-7-11-14
                        C-Eb-G-Bb-D             0-3-7-10-14
C11 "Eleventh"          C-E-G-Bb-D-F            0-4-7-10-14-17
C13 "Thirteenth"        C-E-G-Bb-D-A            0-4-7-10-14-21
If you really want to make it ring fill in these other notes from the Harmonic Series of C in their proper
Octaves -- you can just do it with two hands if they can span an Octave:
Harmonic 2: play the Octave interval, C4, one Octave up from the Root: a factor of 2.
Harmonic 4: play the "Double-Octave", C5, two Octaves up from the Root: a factor of 2 * 2 = 4.
Harmonic 6: play the "Double-Octave-Fifth", G5, two Octaves up from the Root: a factor of (3/2) * 2 *
2 = 6.
Since playing the notes of a chord in Octaves that put them closer to the ideal Harmonic Series really sounds
better, this experiment supports the theory that notes sound good together because they are all from one
Harmonic Series, rather than because their intervals are just somehow special.
However, I would be remiss if I did not point out the following phenomenon (I recall Michael O'Donnell
making this point to me, but he does not recall making it) [O'Donnell, c. January 2009]:
even if you invert the chord -- select an alternate voicing, say with the Root high (keyboard right) and
other harmonics low (keyboard left),
but are sure to play the notes across several Octaves anyway,
then the chord still sounds better than if played all in one Octave, though I think perhaps it doesn't quite
"ring" as nicely; try it for yourself. This observation may dampen the ringing endorsement of the Harmonic
Series of the above paragraph (puns intended): it seems that even in the absence of the ideal Harmonic Series
(due to the chord inversion), the brain would still rather hear notes played across several Octaves rather than
all bunched into one.
Having said that, we now begin our plan to derive the Standard Chord Dictionary from our theory.
that is, the Major Seventh interval may make no sense outside of the context of an effect that arises when it is
used as part of the Major Seventh chord [O'Donnell, c. April 2009]; please see the discussion of the Major
Seventh chord in Section 3.5.6 "Chords Preserving Intervals but not Harmonics" where we discuss this
situation further.
You can read a discussion on several kinds of seventh intervals here: [min7] and [harmonic7]. From that
discussion it seems that the exact ratio 7/4 (not the fudged-downward Dominant Seventh or the fudged-upward
Major Seventh) is called the "Harmonic Seventh" (at least when it's not being called the "Septimal
Minor Seventh", or the "Subminor Seventh"!).
A chord containing harmonic 7 is called a seventh chord because B is the seventh white key counting from C
(and Bb is the flatted version of B), and not because it is harmonic number 7 in the Harmonic Series. Recall
that the Perfect Fifth is actually harmonic 3 and the Major Third is actually the harmonic 5! This convenient
naming of the seventh white key being harmonic 7 is however just a coincidence! That is, in the case of seven
we are just lucky. (Get it? Lucky seven! Nevermind.) It turns out that this lucky correspondence of the two
numbers of (a) the white key count from the Root, and (b) the overtone harmonic number in the Harmonic
Series continues on for all of the rest of the subsequent harmonics above seven as well (approximately well
enough anyway and up until we stop at harmonic 13; more on this below).
Let's add some more notes to our chord just by going up the Harmonic Series. Harmonic 8 is just the Root
again if we divide by two enough times; for this reason in general we will skip even numbers from now on (to
find the interval for an even harmonic just divide by two until you get an odd number). For a harmonic N times
the fundamental the number of semi-tones to the right of the Root key on an Equal Tempered 12-semi-tone
keyboard is log N / log TwR2 (recall that "TwR2" denotes the twelfth-root of 2). Of course we subtract off any
extra multiples of 12 to keep it in the same Octave; another way to do that is divide up-front by a high-enough
power of 2. (This computation differs from the ones done earlier taking logs base 2 in Section 3.2.3 "The
Keyboard Revealed" because in that section we wanted the answer as a percentage of the Octave (a factor of
2), whereas here we now know that we have divided the Octave into 12 semi-tones and instead want the
answer as the number of semi-tones; we could just as easily instead take the log_2 x as before and then
multiply by 12.)
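The formula log N / log TwR2 can be sketched as follows. The function name semitones_from_root is my own, and the "nearest key" column uses plain rounding, which is an assumption for illustration only.

```python
from math import log

TwR2 = 2 ** (1 / 12)  # ratio of one Equal Tempered semi-tone

def semitones_from_root(harmonic):
    """Semi-tones above the Root for harmonic N, folded into one Octave."""
    return (log(harmonic) / log(TwR2)) % 12

for n in (3, 5, 7, 9, 11, 13, 15):
    steps = semitones_from_root(n)
    print(f"harmonic {n:2d}: {steps:5.2f} semi-tones, nearest key {round(steps) % 12}")
```

Harmonics 3, 5, and 9 land very close to a key, while 7, 11, and 13 fall noticeably between keys.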
Ninth:      9/8  = 1.125;
Eleventh:   11/8 = 1.375;
Thirteenth: 13/8 = 1.625;
Hmm, well those eleventh and thirteenth harmonics are pretty badly approximated by the piano keyboard!
Hey, wait a minute! I thought earlier we were saying that the names of the intervals were the number of white
keys we had to count over from the Root, not the name of the harmonic number. Remember, harmonic 3 is the
fifth white key and harmonic 5 is the third white key.
As noted above, just by coincidence harmonic 9 is really the ninth white key: 12 + 2 = D above C. It basically
works out, if we fudge just a little, that harmonic 11 is the eleventh white key: 12 + 5 (round down instead of
up!) = 17 = F (also known as the 4th). Harmonic 13 is the thirteenth white key: 12 + 9 (round up instead of
down!) = 21 = A (also known as the 6th). Very conveniently, with these fudgings, these harmonics just fit in
between the Major Triad white keys. How nice, especially if we don't let the pesky mathematical precision
bother us. Be sure to play the higher harmonics in higher Octaves, as detailed in Section 3.5.2 "How to Turn
Sweetness into Mud: Over-Using Octaves" below.
Chords from the Harmonic Series
Name                    Notes in C Major Scale  Semi-tones from fundamental
C "Major Triad"         C-E-G                   0-4-7
C7 "Dominant Seventh"   C-E-G-Bb                0-4-7-10
                        C-E-G-D                 0-4-7-14
C9 "Ninth"              C-E-G-Bb-D              0-4-7-10-14
C11 "Eleventh"          C-E-G-Bb-D-F            0-4-7-10-14-17
C13 "Thirteenth"        C-E-G-Bb-D-A            0-4-7-10-14-21
And what about the harmonic 15 and higher? I don't know if people can even hear harmonic 15, as it is also
the Major Seventh -- the brain may simply start to hear lower harmonics instead. Also, remember that my goal
is to explain the chords in my chord dictionary and it doesn't go higher than a Major Thirteenth. I don't know if
people can hear harmonics 11 or 13 either, but note that by stopping at harmonic 13 each odd interval neatly
corresponds one-to-one with a white key on the keyboard; that is, the fact that the chord dictionary stops at
harmonic 13 may be another consequence of people's desire for symmetry (also, harmonic 15 is not the
"fifteenth" white key, which would instead be the Double-Octave).
Harmonic 15: 15/8 = 1.875;
standard chords are of this kind as we will see as we enumerate a taxonomy of the ways we can tease the brain
with ambiguity.
Sustained: One ambiguity is to have two instances of the interval from Root to harmonic 3, the Perfect Fifth.
For example, if we play notes C, F and G, we create the possibility of either F or C being the Root.
Unsurprisingly, this chord is called the "Sustained" chord and musicians say that it "wants" to "resolve" to a
Major Triad at F or C. This effect is very easy to hear. Try it: play C-F-G (sustained) and then play C-E-G (C
Major).
Augmented: Another ambiguity is to have two instances of the interval from Root to harmonic 5, the Major
Third. For example, if we play notes C, E and G# we have this situation. Counting carefully, note that these
three notes actually make three Major Thirds! Unsurprisingly, it doesn't sound very satisfying, but the brain
does "recognize it as something", as opposed to sounding like noise.
Diminished: Another ambiguity we can create is to have two instances of the Minor Third. For example, if we
play notes C, Eb and Gb we have this situation and if we add the note A, we have not only three but four (they
wrap around) copies of this interval all at once! Unsurprisingly, it also doesn't sound very satisfying, but again
it still sounds "like something".
Chords Inducing Ambiguity
Name                Notes in C Major Scale  Semi-tones from fundamental
C+ "Augmented"      C-E-G#                  0-4-8
Csus "Sustained"    C-F-G                   0-5-7
Cdim "Diminished"   C-Eb-Gb                 0-3-6
                    C-Eb-Gb-A               0-3-6-9
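The repeated intervals in these chords are easy to see by listing the gaps between successive notes, wrapping around the Octave. This is a sketch on the 12-semi-tone keyboard; the function name steps_around_octave is my own.

```python
def steps_around_octave(semitones):
    """Gaps in semi-tones between successive chord notes, wrapping at the Octave."""
    notes = sorted(s % 12 for s in semitones)
    following = notes[1:] + notes[:1]
    # A gap of 0 after the modulo means a full Octave, so report it as 12.
    return [(b - a) % 12 or 12 for a, b in zip(notes, following)]

chords = {
    "C+ (0-4-8)": [0, 4, 8],
    "Csus (0-5-7)": [0, 5, 7],
    "Cdim (0-3-6)": [0, 3, 6],
    "0-3-6-9": [0, 3, 6, 9],
}

for name, semitones in chords.items():
    print(f"{name}: {steps_around_octave(semitones)}")
```

The Augmented chord comes out as three equal gaps of 4 semi-tones (three Major Thirds) and 0-3-6-9 as four equal gaps of 3 semi-tones (four Minor Thirds). The two gaps of 5 semi-tones in the Sustained chord are the two Perfect Fifths counted downward, since 5 semi-tones up is 7 semi-tones down an Octave higher.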
Name              Notes in C Major Scale  Semi-tones from fundamental
Cm "Minor Triad"  C-Eb-G                  0-3-7
C6 "Sixth"        C-E-G-A                 0-4-7-9
                  C-Eb-G-Bb               0-3-7-10
                  C-Eb-G-Bb-D             0-3-7-10-14
Note that there are two sub Triads, one Major and one Minor:
The C Major Triad.
The E Minor Triad.
The Major Seventh has always had a "cool Jazz" sound to me -- I can hear something, it is a bit of a thin,
distant flavor having just a hint of sweetness -- like one of those drinks in an upscale organic juice bar that
you've never heard of -- I'm thinking celery and watermelon rind with a touch of pomegranate.
But then what about C Minor (Dominant) 7? It doesn't seem to fit into any Harmonic Series no matter how I
look at it. There are also many other weird chords that occur in Jazz pieces that are too rare to make it
into my chord dictionary but show up in actual music, such as C7#5: C Dominant 7 with a sharped 5th. There
are two things all of these chords have in common:
they are relatively rare, and
they sound rather weird and off, even more so than a Minor Triad (probably related to the fact that they
are rare!).
Explanation? They are probably all just fragments of various sets of intervals that occur in the Harmonic
Series, with no attempt to actually represent the series, or they combine that with some Harmonic Series
ambiguity as does the Minor Triad above. We discuss a possible further effect in Section 7.2 "The Role of
Narrative Generally" where we consider the importance of content providing context for further content,
thereby creating narrative. In sum, it is likely that the brain has one disambiguation engine and that the
processing that occurs in verbal narrative would process similarly in other contexts, such as music. So, while
these chords may sound strange in isolation, the theme created by the preceding music before the chord may
bring a certain sense to them. Think of one standard structure for a joke: a story (creating a theme) and then a
punchline; the punchline would not be funny in isolation without the context provided by the story, and yet we
attribute the funniness of the joke to the punchline and not the story which did the work. Theme and ambiguity
may alternate throughout a narrative, repeating this effect. Investigating this process is left as future work.
Chords Preserving Intervals but not Harmonics
Name                   Notes in C Major Scale  Semi-tones from fundamental
Cm6 "Minor Sixth"      C-Eb-G-A                0-3-7-9
    Analysis of composition: A Minor Triad plus another Minor Third interval
    (from the ninth to the fundamental one Octave up).
Cmaj7 "Major Seventh"  C-E-G-B                 0-4-7-11
Cmaj9 "Major Ninth"    C-E-G-B-D               0-4-7-11-14
Other stuff in Jazz too weird to make it into the chord dictionary also likely exploits distant
harmonics, subtle ambiguities, or effects of context created by preceding content which cannot
be appreciated in isolation....
4 Miscellaneous Objections
Here are my thoughts on a few objections that I anticipate.
C G D A E B F# C# G# D# A#
(...and then repeats: E# = F).
Cool!
As we saw above, of all the intervals, the Fifth is the most
sweet (near the bottom of the Harmonic Series), and
interesting (not so harmonic as to be boring; that is, not the Octave interval).
As we discuss in the next section, by using the Fifth (and, well, of course the Octave) repeatedly we get all the
notes on our piano! This Circle of Fifths thing must be fundamentally important to the nature of how sounds
sound musical or something!
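On the Equal Tempered keyboard, where a Fifth is 7 semi-tones, the walk around the Circle of Fifths can be sketched as below. The sharp-only note names and the variable name circle are my own choices for illustration.

```python
# The 12 keys of one Octave, named with sharps only.
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

FIFTH = 7  # semi-tones in an Equal Tempered Perfect Fifth

# Go up by a Fifth twelve times, folding back into one Octave each time.
circle = [NAMES[(FIFTH * k) % 12] for k in range(12)]

print(" ".join(circle))
print("Distinct notes visited:", len(set(circle)))
```

Every one of the 12 keys is visited exactly once before the walk returns to C, which works because 7 and 12 share no common factor.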
count carefully we went down by an Octave, a factor of 1/2, an additional seven times, and we got back to the
"same" note, or a ratio to that same note that is very close to one (actually within 1.4%). In other words, we
have discovered that:
3^12 = 531441
almost equals
2^19 = 524288;
that is,
3^12 / 2^19 = 1.0136432647705078125 (exact!)
Wow is that close to 1. When that small amount of error is spread out evenly over twelve Fifth intervals, you
can see how the Equally Tempered Scale is rather appealing: it gets Fifths almost exactly right, to within
almost a tenth of a percent:
((3^12 / 2^19) - 1) / 12
= 0.0136432647705078125 / 12
~ 0.0011369387308756511
~ 0.1%.
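The arithmetic above can be reproduced exactly with fractions. This is just a sketch of the same computation; the names comma and per_fifth are mine.

```python
from fractions import Fraction

# Twelve Fifths up (3^12) against nineteen Octaves up (2^19).
comma = Fraction(3 ** 12, 2 ** 19)

# Spread the excess evenly (linearly, as in the text) over twelve Fifths.
per_fifth = (comma - 1) / 12

print(f"3^12 / 2^19 = {comma} ~ {float(comma):.19f}")
print(f"error per Fifth ~ {float(per_fifth):.19f} ~ {float(per_fifth):.1%}")
```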
By the way, if you write a computer program to look for other amazingly-close collisions of powers of two and
three with reasonably low exponents (I have), you won't find any.
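A program of that kind might look like the following sketch. It is not the author's program; the function name near_collisions, the exponent bound of 40, and the tolerance of 2% are all assumptions I picked for illustration.

```python
from math import log2

def near_collisions(max_exp3, tolerance):
    """Find (a, b, ratio) with 3^a / 2^b within `tolerance` of 1, for a <= max_exp3."""
    hits = []
    for a in range(1, max_exp3 + 1):
        # The power of two closest to 3^a on a logarithmic scale.
        b = round(a * log2(3))
        ratio = 3 ** a / 2 ** b
        if abs(ratio - 1) < tolerance:
            hits.append((a, b, ratio))
    return hits

# Assumed bounds: exponents of 3 up to 40, and a match within 2%.
hits = near_collisions(40, 0.02)
for a, b, ratio in hits:
    print(f"3^{a} / 2^{b} = {ratio:.6f}")
```

With these bounds the only collision reported is 3^12 against 2^19.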
Amazingly, (1) picking twelve notes makes the most important harmonic, number 3, be the fifth white key, and
(2) humans have five fingers on one hand, making the most important harmonic also the one that is easiest to
play.
essence of a piece pared down to the minimum: it consists of a chord progression, a melody line, some rhythm,
and maybe some lyrics. But it is clear that the chord progression and rhythm are more fundamental than the
melody: I have twice now asked two different Jazz pianists, after they had just finished a piece, "When you are
improvising while playing, what are you really doing?" Both times I got the same answer "I know the melody,
and I'm not playing it." The only way to be playing something that goes with the melody is to play a different
set of notes that are harmonically related to the melody, and, of course, in a related rhythm.
4.1.4 The Symmetries of the Circle of Fifths are a Terrible Red Herring
Michael O'Donnell points out the chord progression possibilities here (addressed to me) [O'Donnell, 14
January 2009]:
Many years ago, there was an article in the Computer Music Journal about the value of the
diatonic scale as a subset of the 12-tone chromatic scale. The article dealt entirely with group-theoretic
structural properties of the notes under the half-step and perfect-fifth generators. It
deliberately left out acoustical/perceptual issues. I think that there is something to learn from this
structural study, and from connecting it to acoustical/perceptual issues, even though the authors
did not even claim to have solved some defined problem. Roughly speaking, while you go into
reasons why the major and minor thirds sound nice, this article discussed why they yield a lot of
structurally interesting harmonic progressions.
Using symmetries to create chord progressions may be interesting, however no amount of symmetries is going
to explain how it works that the Fifth sounds good in and of itself; that is, no one wants interesting chord
progressions made out of chords that don't sound like anything. As people love symmetry so much, the Circle
of Fifths makes for a powerful red herring. Don't fall for it. Mark [Hoemmen, 22 October 2011] again:
The issue there is that circle of fifths transitions have become perhaps a cultural expectation[.] So
there's an interaction between what people expect culturally, vs. what their brains expect
biologically.
4.2.1 A Culture May Simply not be Fully Exploiting All of the Universal Harmonic
Features
Suppose the music of another culture is missing some important feature emerging from our canonical
derivation of chords and harmony. Does that mean the music of that culture is a counter-example to our
argument? Maybe not: just because the brain is capable of experiencing something does not mean that the art
of that culture has taken advantage of that fact.
That is, the situation could be as follows:
The brain is listening for certain canonical patterns induced by the Harmonic Series.
The brain induces processing artifacts such as relative pitch (listening for the ratios between tones/notes)
and octaves (listening for tones/notes normalized by factors of 2).
The brain is listening for parts of a feature vector firing.
However, a culture may not fully exploit all of these features in its tradition. An engineer might say that the
features above that we claim to be universal to human hearing define the shape of the parameter space, but a
particular design need not make use of the whole space.
Many cultures use only five notes in their scale; any such scale is called "Pentatonic" [pen]. For example, one
of these Pentatonic scales is the Major Pentatonic: the set of notes we get by starting at one note and going
around the Circle of Fifths four more times, resulting in five notes (starting with C, that would be C, G, D, A,
E). From [pen]:
Pentatonic scales are very common and are found all over the world....
Much African and Chinese music makes use of Pentatonic scales and yet African and Chinese people seem to
have no difficulty enjoying Western music. Their brains were always capable of hearing Western music, but
their culture has simply never made use of the rest of the available parameter space.
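The Circle-of-Fifths construction of the Major Pentatonic described above can be sketched in a few lines of Python. This is only an illustration: the note-name table and the function name circle_of_fifths are made up for this sketch, and stepping by seven semi-tones assumes the usual twelve-note keyboard.

```python
# Twelve note names of the keyboard, one per semi-tone (sharps only, an assumption).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def circle_of_fifths(start, count):
    """Return `count` notes, each a Fifth (seven semi-tones) above the previous one."""
    index = NOTE_NAMES.index(start)
    return [NOTE_NAMES[(index + 7 * k) % 12] for k in range(count)]

major_pentatonic = circle_of_fifths("C", 5)
print(major_pentatonic)

# Collapse the five notes into a single Octave, in keyboard order.
print(sorted(major_pentatonic, key=NOTE_NAMES.index))
```

Starting from C this gives C, G, D, A, E, which within one Octave is C, D, E, G, A.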
4.2.2 But The Nasca People Of Peru Use A Linear, Not A Logarithmic, Scale!
From [acoustical-demo, Demo 18], "Logarithmic and Linear Frequency Scales":
A musical scale is a succession of notes arranged in ascending or descending order. Most musical
composition is based on scales, the most common ones being those with five notes (pentatonic),
twelve notes (chromatic), or seven notes (major and minor diatonic, Dorian and Lydian modes,
etc.). Western music divides the Octave into 12 steps called semitones. All the semitones in an
Octave constitute a chromatic scale or 12-tone scale. However, most music makes use of a scale
of seven selected notes, designated as either a major scale or a minor scale and carrying the note
name of the lowest note. For example, the C-major scale is played on the piano by beginning with
any C and playing white keys until another C is reached.
Other musical cultures use different scales. The pentatonic or five-tone scale, for example, is basic
to Chinese music but also appears in Celtic and Native American music. A few cultures, such as
the Nasca Indians of Peru, have based their music on linear scales (Haeberli, 1979), but these are
rare. Most music is based on logarithmic (steps of equal frequency ratio Δf / f) rather than linear
(steps of equal frequency Δf) scales.
In this demonstration we compare both 7-step diatonic and 12-step chromatic scales with linear
and logarithmic steps.
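The difference between the two kinds of scale in the quotation can be sketched as follows. The function names, the 220 Hz starting pitch, and the choice to make both scales span exactly one Octave are assumptions of this sketch, not anything taken from the demonstration itself.

```python
def logarithmic_scale(base_hz, steps):
    """Equal frequency RATIO between neighbors, spanning one Octave."""
    ratio = 2 ** (1 / steps)
    return [base_hz * ratio ** k for k in range(steps + 1)]

def linear_scale(base_hz, steps):
    """Equal frequency DIFFERENCE between neighbors, spanning one Octave."""
    step_hz = base_hz / steps
    return [base_hz + step_hz * k for k in range(steps + 1)]

log_notes = logarithmic_scale(220.0, 12)
lin_notes = linear_scale(220.0, 12)

for low, high in zip(log_notes, log_notes[1:]):
    print(f"log: ratio {high / low:.4f}  difference {high - low:6.2f} Hz")
for low, high in zip(lin_notes, lin_notes[1:]):
    print(f"lin: ratio {high / low:.4f}  difference {high - low:6.2f} Hz")
```

In the logarithmic scale the ratio column stays fixed while the difference grows; in the linear scale the difference stays fixed while the ratio shrinks.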
Who knows what is going on with the linear scales of the Nasca. I suspect the following.
Linear-scale flutes are easy to make: the lengths of the tubes of a linear-scale pan flute increase
linearly, rather than exponentially. In contrast, to make a logarithmic-scale pan flute you must make tubes of
exponentially-related lengths, which is probably quite unintuitive the first time someone has to work that
out.
This culture was small and isolated and so no one ever noticed harmony and no one nearby showed it to
them; that is, innovation seems likely to be proportional to the number of people available to have new
ideas, but notice that the interesting number here is the number of people who are all in communication
with one another.
Let's consider the cultural evolution of visual instead of auditory processing for a moment. Cultures introduce
color names in an almost deterministic fashion. From [Feldman2006, p. 102-103]:
Berlin and Kay (Berlin et al. 1969) showed that, in languages around the world, basic color terms
had essentially the same focal colors, even though boundaries around color categories varied. The
neurophysiology of color vision was seen as directly providing the best explanation. There are
now a number of competing explanations for the commonality of focal colors, but they are all
based on embodiment (Kay et al. 2005).
... In the 1950s, color names were believed to be arbitrary in different languages. The assumption
was that you couldn't predict the ranges of these different color terms. Paul Kay and Brent Berlin
did a study in which they asked where the boundaries of color terms were and also what colors
from a color chart were the best examples of each term. Between their own experiments and the
literature, they surveyed about 100 languages. They found that the boundaries for different
languages were somewhat different, but the best examples were quite similar. This study has
since been greatly expanded and the basic result confirmed (Kay et al. 2005).
... There is also considerable evidence on how the color word system evolves over time -- usually
when its community encounters other languages. Figure 8.1 outlines the development as speakers
of a language (like Dani) that has only two color words come to express further distinctions.
Systematically, when a third word is added, it distinguishes white from warm; a fourth term will
separate black from cool, and so on. Since this progression appears to hold very widely, it is rather
further evidence that human color terms are anything but arbitrary.
Where the figure 8.1 referred to above looks like this:
light-warm ---> white      ---> white      ---> white      ---> white
           \
            \-> warm       ---> warm       ---> red        ---> red
                                           \
                                            \-> yellow     ---> yellow
dark-cool  ---> dark-cool  ---> black      ---> black      ---> black
                           \
                            \-> cool       ---> cool       ---> green
                                                           \
                                                            \-> blue
(2 Terms)       (3 Terms)       (4 Terms)       (5 Terms)       (6 Terms)
(I should add that in class Feldman said that this progression of colors was not completely deterministic: there
are two paths through the space of color sophistication [Feldman, c. 2006]. Canonicality of algorithm has
limits; the main point still holds.)
4.3 But You Can Make a Piece of Music Based Entirely on That
Utterly Un-Harmonic Interval, the Augmented Fourth!
Recall the diagram of the three interlocking Major Triads laid out all in one Octave with C as the fundamental.
There is a huge gap between F and G, which on the piano is filled by the black note F#/Gb. That results
in an interval between C and F# called the "Augmented Fourth" (or "Diminished Fifth"), as it is the Fourth, F,
plus a semi-tone (or the Fifth, G, minus a semi-tone).
Play C and F# on a piano; it sounds awful. This interval is also called the Tritone as the distance between C
and F# is three whole tones (where here "tone" means a distance of two semi-tones, so a distance of six
semi-tones). We can see how it emerges that it sounds so bad: the ratio between F# and C isn't near that of
any of the harmonics in the Harmonic Series. This interval deserves its nickname as the Devil's Interval.
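One rough way to check this claim is to measure, in cents (hundredths of a semi-tone), how far an equal-tempered interval lies from the nearest Octave-normalized harmonic. The sketch below is an illustration under stated assumptions: the function names are made up, and comparing against only the first eight harmonics is a choice made here, since higher harmonics (the eleventh, for example) do land closer to the Tritone.

```python
import math

def octave_normalize(ratio):
    """Divide or multiply by two until the ratio falls in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def cents_to_nearest_harmonic(semitones, max_harmonic=8):
    """Distance in cents from an equal-tempered interval to the nearest of the
    first `max_harmonic` harmonics, after normalizing each into one Octave."""
    interval = 2 ** (semitones / 12)
    targets = {octave_normalize(h) for h in range(1, max_harmonic + 1)}
    return min(abs(1200 * math.log2(interval / t)) for t in targets)

for name, semitones in [("Major Third", 4), ("Tritone", 6), ("Fifth", 7)]:
    print(f"{name:12s} {cents_to_nearest_harmonic(semitones):6.1f} cents away")
```

Under these assumptions the Fifth sits about 2 cents from a harmonic and the Major Third about 14 cents, while the Tritone is roughly a full semi-tone (about 102 cents) away from the closest one.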
I have heard the following argument from a music student: all music must be culturally-relative (as opposed to
the universal, physics-and-computation explanation we give) because someone has even written a piece of
music based entirely in the most un-harmonic of intervals, the Augmented Fourth, and gotten away with it.
There are people who can abuse themselves to the point of re-calibrating their expectations to all kinds of
strange inputs, including thinking that getting beaten with whips is fun or that McDonald's tastes good. That
doesn't mean that those inputs are natural or good or beautiful or true. Ben Franklin pointed this out in a letter
to Lord Kames, June 2, 1765 [Franklin]:
[T]he Pleasure Artists feel in hearing much of that compos'd in the modern Taste, is not the
natural Pleasure arising from Melody or Harmony of Sounds, but of the same kind with the
Pleasure we feel on seeing the surprizing Feats of Tumblers and Rope Dancers, who execute
difficult Things. For my part, I take this to be really the Case and suppose it the Reason why those
who being unpractis'd in Music, and therefore unacquainted with those Difficulties, have little or
no Pleasure in hearing this Music. Many Pieces of it are mere Compositions of Tricks. I have
sometimes at a Concert attended by a common Audience plac'd myself so as to see all their Faces,
and observ'd no Signs of Pleasure in them during the Performance of much that was admir'd by
the Performers themselves; while a plain old Scottish Tune, which they disdain'd and could
scarcely be prevail'd on to play, gave manifest and general Delight.
4.4 But I've Been a Musician All My Life / Studied Music In College
and I've Never Heard Any of This Before!
Michael O'Donnell again [O'Donnell, 14 January 2009]:
I think that most "Music Theory" as taught in music departments is intended more as a
development of descriptive terminology and notation than as an explanatory theory. I think that a
lot of humanists don't understand the difference between description and explanation.
Amen to that. But what makes me angry is that music teachers do not even MENTION that there is a whole
science of the perception of sound, or that if you would like to know how it really works you should talk to
the scientists who study it. They basically lie by omission, which is how it ended up taking me several decades to figure all of this
out.
Recent music theory textbooks continue to be utterly exemplary in the physical science and yet fall on their
face completely when it comes to the computational nature of the brain. Catherine Schmidt-Jones [Schmidtmusic-theory] again:
Why are some note combinations consonant and some dissonant? Preferences for certain sounds
is partly cultural; that's one of the reasons why the traditional musics of various cultures can
sound so different from each other. Even within the tradition of Western music, opinions about
what is unpleasantly dissonant have changed a great deal over the centuries. But consonance and
dissonance do also have a strong physical basis in nature.
In simplest terms, the sound waves of consonant notes "fit" together much better than the sound
waves of dissonant notes. For example, if two notes are an octave apart, there will be exactly two
waves of one note for every one wave of the other note. If there are two and a tenth waves or
eleven twelfths of a wave of one note for every wave of another note, they don't fit together as
well. For much more about the physical basis of consonance and dissonance, see Acoustics for
Music Theory, Harmonic Series, and Tuning Systems.
Nope. There are way too many asymmetric phenomena left unexplained by this overly-simplified theory of
just whole numbers dividing each other, two major ones being (1) the feeling that the Sustained Chord should
"resolve" to the Major Triad, and (2) the difference in the feeling of Major and Minor Triads. See Section 5.1
"Helmholtz's Theory Relies Only On Interfering Overtones, But Harmony Is Something More" for more
details.
know this; many questions that for millennia were the subject of superstition and wild speculation were finally
clarified, and in a way that could be verified on the anvil of experiment. This led not just to new
understanding, but new engineering, and from this we have built a wholly new world. It is a triumph that can
hardly be overstated.
However, such success can induce arrogance and therefore blindness. Computational science is the next
breakthrough, its wave washing over us even now. The science of complexity and emergent behavior,
"Computer Science" if you will, is Science 2.0 and its insights again bring new light to heretofore mysterious
questions. Physical scientists say that Computer Science is not a science, but they are wrong:
computation/algorithm is a phenomenon of nature that surrounds us and has stable, though often mysterious,
properties that can be reliably verified by experiment. It is a tradition in Music Theory to resort only to physics
for the explanations; however this is a mistake, as using only physical science ignores
the computational nature of the brain,
how much we really do know about the canonicality of the space of algorithm itself,
and how much the brain must be using the canonical algorithms to solve the very difficult problem of
surviving and thriving in everyday life using very constrained resources.
It is only because of this new computational understanding that this article before you can even be written.
seems to be what Helmholtz would suggest. Below, Terhardt [Terhardt1974-PCH] also agrees with this point
that the absence of beats/roughness does not cause consonance, however the presence of beats/roughness does
cause dissonance; see Section 6.1 "Terhardt Recognizes that the Brain is Listening For Something". Therefore,
again, the theory of Helmholtz isn't really wrong; it is however incomplete, and rather strikingly so. See
Section 2.3.3 "Vertical Intervals Have Pure Ratios" for more discussion on this point.
exception, and either no beats at all are formed, or at least only such as have so little intensity that
they produce no unpleasant disturbance of the united sound. These exceptional cases are called
Consonances.
1. The most perfect consonances are those that have been here called absolute, in which the
prime tone of one of the combined notes coincides with some partial tone of the other. To this
group belong the Octave, Twelfth, and double Octave.
2. Next follow the Fifth and the Fourth, which may be called perfect consonances, because they
may be used in all parts of the scale without any important disturbance of harmoniousness. The
Fourth is the less perfect consonance and approaches those of the next group. It owes its
superiority in musical practice simply to its being the defect of a Fifth from an Octave, a
circumstance to which we shall return in a later chapter.
3. The next group consists of the major Sixth and the major Third, which may be called medial
consonances. The old writers on harmony considered them as imperfect consonances. In lower
parts of the scale the disturbance of the harmoniousness is very sensible, but in the higher
positions it disappears, because the beats are too rapid to be sensible. But each, in good musical
qualities of tone, is independently characterized, by the fact that any little defect in its intonation
produces sensible beats of the upper partials, and consequently each interval is sharply separated
from all adjacent intervals.
4. The imperfect consonances, consisting of the minor Third and the minor Sixth, are not in
general independently characterized, because in good musical qualities of tone the partials on
which their definition depends are often not found for the minor Third, and are generally absent for
the minor Sixth, so that small imperfections in the intonation of these intervals do not necessarily
produce beats.
deliberation [Feltman, Ganz, c. mid 1990s]. Harmony is a uniquely powerful force for people. The mere lack
of a few interfering upper harmonics can't possibly produce from absence of annoyance such a powerful
presence of visceral rapture.
taken as the Root (or "fundamental bass") of the chord. But as we saw above, according to Helmholtz the
(Major) Sixth is considered to be a rather dissonant interval. So for what purpose use it at all?
Further, I suspect that he really cannot be right about Eb being the real Root of a Cm (C Minor) chord,
because otherwise it would make no sense to build Minor chords with higher harmonics unless all of those
higher harmonics were also named wrong and also by coincidence made another chord. For example, the Cm9
(C Minor ninth) chord would not make any sense as a ninth chord if C was not really the Root of the chord. If
indeed Eb is the Root of Cm9, then the chord is really EbMaj7(add6) or Eb6(add Maj7); I suppose that could
be the case but it sounds awfully ad-hoc to me. (Similarly Cm7 becomes Eb6.) Try voicing them that way
(putting the higher harmonics in higher Octaves) and see how it sounds. From [Helmholtz1863, p. 300]:
Minor chords do not represent the compound tone of their root as well as the major chords : their
Third, indeed, does not form any part of this compound tone. The dominant chord alone is major,
and it contains the two supplementary tones of the scale. Hence when these appear as constituents
of the dominant triad, and therefore of the compound tone of the dominant, they are connected
with the tonic by the close relationship of Fifths. On the other hand, the tonic and subdominant
triads do not simply represent the compound tones of the tonic and subdominant notes, but are
accompanied by Thirds which cannot be reduced to the close relationship of Fifths. The tones of
the Minor Scale can therefore not be harmonized in such a way as to link them with the tonic note
by so close a relationship as in the major mode.
So Helmholtz does seem to notice that (1) all notes in the Major Triad have a ratio to the Root that is in the
Harmonic Series whereas (2) this is not the case for the Minor Triad. Further, on p. 212 Helmholtz tries all
combinations of triads consisting of two intervals already "known" to be consonant with the Root, and notices
that the only ones which induce a third interval that is on his list of intervals known to be consonant with each
other are the Major and Minor Triads.
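The combinatorial search described here is easy to reproduce. The sketch below is not Helmholtz's own procedure or notation: it assumes the six intervals he ranks as consonances within the Octave (Minor Third, Major Third, Fourth, Fifth, Minor Sixth, Major Sixth), assigns them their usual just ratios, and uses made-up function names.

```python
from fractions import Fraction as F
from itertools import combinations

# Assumed just ratios for the consonant intervals above the lowest note.
CONSONANT = {F(6, 5), F(5, 4), F(4, 3), F(3, 2), F(8, 5), F(5, 3)}

MAJOR = (F(1), F(5, 4), F(3, 2))   # 4:5:6
MINOR = (F(1), F(6, 5), F(3, 2))   # 10:12:15

def consonant_triads():
    """Triads whose two upper notes are consonant with the lowest note
    AND with each other."""
    return [(F(1), low, high)
            for low, high in combinations(sorted(CONSONANT), 2)
            if high / low in CONSONANT]

def rotations(triad):
    """The triad and its two inversions, each rescaled so its lowest note is 1."""
    _, second, third = triad
    yield triad
    yield (F(1), third / second, 2 / second)
    yield (F(1), 2 / third, 2 * second / third)

def quality(triad):
    forms = set(rotations(triad))
    if MAJOR in forms:
        return "major"
    if MINOR in forms:
        return "minor"
    return "other"

for triad in consonant_triads():
    print([str(r) for r in triad], quality(triad))
```

With these assumed ratios, fifteen pairs of intervals are tried and six triads survive: the Major Triad and the Minor Triad, each in its three inversions.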
He almost gets to the truth: (1) he notices that the Minor Triad also has the same three intervals as the Major
Triad and (2) that the Major Triad "represents the compound tone of their root" better than the Minor. Yet he
never seems to put these two facts together into a coherent theory that the brain is listening for the presence of
something, namely an abstracted version of the overtone series of common sound-making devices (tubes and
strings, and in particular, voice) called the ideal Harmonic Series, rather than just an absence of colliding
"upper partials". That is, at best his theory explains how it is that Major sounds better than Minor, but it fails to
explain (a) how Major sounds so compelling and "right", even when inverted, (b) how Minor sounds like
something other than noise, like something I recognize, and yet simultaneously like something that is off or has
something missing about it, and (c) how it is that it might make sense that Major and Minor are afforded a kind
of almost equal dual status in music when his explanation for Minor is so convoluted and odd.
"Major" isn't just a combinatorial accident minimizing pain as Helmholtz would have us believe, instead
it is positively recognized as something.
"Minor" isn't just a more dissonant other combinatorial accident, instead it is both positively recognized
and yet not-right at the same time.
presence of something. Helmholtz does not explain virtual pitch or the related phenomenon that chords
can be played with the Root or even the Fifth omitted. We explain it simply in Section 3.5.4 "Chords
Inducing Ambiguity": the brain wants to hear the Harmonic Series.
2. As far as I know Helmholtz has no explanation for why the Sustained chord "wants" to resolve to the
Major Triad. Again, we explain it simply in Section 3.5.4 "Chords Inducing Ambiguity": the brain wants
to hear one Harmonic Series.
3. Helmholtz almost seems to hit upon the idea that the Minor Triad has the ratios of the Harmonic Series
that occur in the Major Triad, but he really just notices that both triads induce three intervals with low
whole number ratios. Further he remarks that the Minor Triad can be thought of with different notes as
the Root, making the Minor Third into the Major Sixth, but he doesn't say for what purpose we should
care about this interval at all, having also said the Major Sixth is rather dissonant anyway. His theory
has no explanation for how it is that this very odd thing called the Minor Triad seems to have about
equal status with the Major Triad in idiomatic musical usage. We explain it simply: the compellingness of
both Major and Minor comes from the Harmonic Series with the Minor functioning as a kind of
Auditory Cubism; see Section 3.4.2 "The Minor as Auditory Cubism".
Helmholtz never seems to be wrong, he just never finishes the job: his theory just stops before explaining the
observed phenomena. We can measure the ratios of the height and width of two cars but it won't explain how
they got into a crash. Helmholtz just fails to provide a simple and compelling explanation of harmony. There's
nothing wrong with that -- future generations will find errors and omissions in our work as well -- such is the
nature of progress. That said however:
People should not be lulled into thinking that they have an answer when they do not -- that they
have a full understanding of harmony when they do not.
the brain that it is pleasurable when the periods of neural firings line up? There is no motivation.
However the "Fusion or pattern matching" theory seems much more interesting, so we will investigate it
further.
Wow. Point (1) recapitulates both of our points about the theory of Helmholtz: (a) it is incomplete, however
(b) it is correct as far as it goes -- that is, absence of roughness does not provide consonance, but roughness
does provide dissonance. Point (2) recapitulates our point that the brain is listening for something and that this
likely derives from a computational mechanism for processing speech (he even calls the brain a "'central
processor'"). Point (3) posits that chords therefore result from two activities: (a) listening for something ("tonal
meaning") in addition to the usual (b) avoiding roughness.
I have not attempted to follow the computational model of the core of his article, but the result seems to be
that (a) using human speech input to train (b) his learning model results in a machine that exhibits the artifact
of hearing the classical intervals in the usual chords. This theory reminds me generally of our derivation of the
same intervals by (a) assuming, as he does, that the brain is optimized for processing speech and in particular
finding the harmonic series, and then (b) assuming a subtraction signal processing optimization in the brain
resulting in relative pitch, as we do in Section 2.2.1 "Relative Pitch: Differences Between Sounds". However,
how close his learning model and our subtraction optimization (the two part (b)-s in the preceding) are to one
another I cannot say as I did not study his model.
However I do not see how he can get away without also assuming an additional mechanism for getting rid of
powers of two, as we do in Section 2.2.2 "Octaves: Sounds Normalized to a Factor of Two". Terhardt
[Terhardt1974-PCH, section E.2] claims that a simple learning model of the brain together with inputs from
speech also induce the octave effect. I did not follow his argument, but I also did not try very hard. If he is
right, then the number of mechanisms we require in the brain for our theory could possibly be reduced by one,
as we instead assumed Octaves result from an optimization built into the brain.
concept of virtual pitch readily provides subjective cues which correspond to subharmonics of
given tones. Figure 7(h) [not shown] depicts the distribution of virtual-pitch cues which is
produced when the model is stimulated by a major triad consisting of three complex tones, i.e., a
typical sound of music. The distribution has pronounced maxima one and two octaves below the
lowest one of the primary fundamentals. This means that the system attributes to the chord I-III-V
the "tonal meaning" I. One can easily prove that the model reflects also for other musical chords
the well-known relations between fundamental frequencies and "tonal meaning." Hence, by
means of the concept of virtual pitch, the theory of consonance and harmony possibly can be
provided with what was lacking yet: a psychoacoustic basis.
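A crude way to see why a recognizer listening for the Harmonic Series would report a pitch below the chord is to take the greatest common divisor of the fundamentals, in the spirit of Section 2.1.2. The sketch below is not Terhardt's model; the function names and the integer frequencies (a Root at 400 Hz with just 4:5:6 and 10:12:15 tuning) are assumptions chosen to keep the arithmetic exact.

```python
from functools import reduce
from math import gcd, log2

def missing_fundamental(frequencies_hz):
    """Greatest common divisor of integer frequencies, in Hz."""
    return reduce(gcd, frequencies_hz)

def octaves_below_root(frequencies_hz):
    """Octaves from the implied fundamental up to the lowest note (may be fractional)."""
    return log2(min(frequencies_hz) / missing_fundamental(frequencies_hz))

major_triad = [400, 500, 600]   # ratios 4:5:6
minor_triad = [400, 480, 600]   # ratios 10:12:15

for name, chord in [("Major", major_triad), ("Minor", minor_triad)]:
    print(name, missing_fundamental(chord), "Hz,",
          round(octaves_below_root(chord), 3), "Octaves below the Root")
```

For the Major Triad the implied fundamental is the Root itself two Octaves down, which matches one of the two maxima in the quotation. For the Minor Triad the implied fundamental is not a whole number of Octaves below the Root, so it is not the Root's note at all.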
In a sense, Terhardt really gets quite close to our theory, his notion of "Gestalt" explaining the Major Triad in a
manner similar to the way we do, though without real computational sophistication: he notices that a
recognizer trained to hear the Harmonic Series will also be fired by some chords. However though Terhardt
goes on to claim that 'One can easily prove that the model reflects also for other musical chords the
well-known relations between fundamental frequencies and "tonal meaning."', he never actually does it.
Terhardt never goes further to use the common computational understanding that we have today of a feature
vector detector and computational disambiguation engine, so he does not come to our ultimate explanations of
ambiguous chords, such as the desire of the sustained chord to resolve to the Major Triad (see Section 3.5.4
"Chords Inducing Ambiguity") or how the Minor Triad can be understood as a form of auditory cubism
(see Section 3.4.2 "The Minor as Auditory Cubism").
Regarding the other work of the above "Fusion or pattern matching" theory, "Evidence for a general template
in central optimal processing for pitch of complex tones" Gerson & Goldstein 1978 seems completely
irrelevant, but I found it quite hard to follow so I cannot be certain. There seems to be an article "Fusion or
pattern matching: fundamentals may be perceived through pattern matching of the separately analyzed partials
to a best-fit exact-harmonic template" by Gerson & Goldstein 1978 that I could not locate easily and so I did
not investigate. I also did not read any of the later references mentioned. The paragraph above from Wikipedia
implies that articles in this thread will be similar to Terhardt:
They support the conjecture that the brain is listening for something, likely the Harmonic series.
This observation therefore supports the conclusion that the brain finds the classic intervals inherently
pleasurable/interesting.
However this observation does not take this line of explanation further into the computational models of
feature vectors and disambiguation engines; we suggest that these models may be an original
contribution of this work.
Perhaps I should investigate the literature further, but I must truncate this investigation somewhere; even
UC Berkeley didn't have entire journal(s) that I needed for some articles (I invite the reader to subtract the
start and stop dates at the top of the article). If anyone had come up with my theory of the Minor presented in
Section 3.4.2 "The Minor as Auditory Cubism" then Temperley [Temperley2007] would likely not have still
been wondering how it is that the Minor and Major sound so different in January 2007.
However, should any of my readers find anything relevant in the undoubtedly rich literature of prior research
into acoustic models which have computational explanatory power relevant to our question of "how does
harmony work?", please do let me know and if I publish an updated version of this article, I will include it.
Now the Construction of the old Scotch Tunes is this, that almost every succeeding emphatical
Note, is a Third, a Fifth, an Octave, or in short some Note that is in Concord with the preceding
Note. Thirds are chiefly used, which are very pleasing Concords. I use the Word emphatical, to
distinguish those Notes which have a Stress laid on them in Singing the Tune, from the lighter
connecting Notes, that serve merely, like Grammar Articles, to tack the others together. That we
have a most perfect Idea of a Sound just past, I might appeal to all acquainted with Music, who
know how easy it is to repeat a Sound in the same Pitch with one just heard.
In Tuning an Instrument, a good Ear can as easily determine that two Strings are in Unison, by
sounding them separately, as by sounding them together; their Disagreement is also as easily, I
believe I may say more easily and better distinguish'd, when sounded separately; for when
sounded together, tho' you know by the Beating that one is higher than the other, you cannot tell
which it is. Farther, when we consider by whom these ancient Tunes were composed, and how
they were first performed, we shall see that such harmonical Succession of Sounds was natural
and even necessary in their Construction.
They were compos'd by the Minstrels of those days, to be plaid on the Harp accompany'd by the
Voice. The Harp was strung with Wire, and had no Contrivance like that in the modern
Harpsichord, by which the Sound of a preceding Note could be stopt the Moment a succeding
Note began. To avoid actual Discord it was therefore necessary that the succeeding emphatic Note
should be a Chord with the preceding, as their Sounds must exist at the same time. Hence arose
that Beauty in those Tunes that has so long pleas'd, and will please for ever, tho' Men scarce know
why. That they were originally compos'd for the Harp, and of the most simple kind, I mean a Harp
without any Half Notes but those in the natural Scale, and with no more than two Octaves of
Strings from C. to C.
I conjecture from another Circumstance, which is, that not one of those Tunes really ancient has a
single artificial Half Note in it; and that in Tunes where it was most convenient for the Voice, to
use the middle Notes of the Harp, and place the Key in F. there the B. which if used should be a B
flat, is always omitted by passing over it with a Third.
The Connoisseurs in modern Music will say I have no Taste, but I cannot help adding, that I
believe our Ancestors in hearing a good Song, distinctly articulated, sung to one of those Tunes
and accompanied by the Harp, felt more real Pleasure than is communicated by the generality of
modern Operas, exclusive of that arising from the Scenery and Dancing. Most Tunes of late
Composition, not having the natural Harmony united with their Melody, have recourse to the
artificial Harmony of a Bass and other accompanying Parts. This Support, in my Opinion, the old
Tunes do not need, and are rather confus'd than aided by it. Whoever has heard James Oswald
play them on his Violoncello, will be less inclin'd to dispute this with me. I have more than once
seen Tears of Pleasure in the Eyes of his Auditors; and yet I think even his Playing those Tunes
would please more, if he gave them less modern Ornament.
melody exhibits
a tendency to go "up and down" by consecutive or almost consecutive notes,
often traveling within the harmonic expectation of the current chord or scale, but sometimes not.
What is going on?
Perhaps melody is arpeggio of an entire scale. Recall that a scale is generally not one chord but three triads
(particularly powerful chords) occurring together. Recall from our discussion of theme in Section 2.4.1 "The
Simplicity of Theme" that any kind of expectation is useful to the brain, not just those that are computable
from direct harmonic relationship. Therefore it is possible that a scale creates a weaker kind of association
than (or containing) the stronger harmonic association of the current chord. Specifically, as a scale is multiple
chords, notes within a scale may be harmonically far from one another and may instead be related simply by
the fact that they are known to occur together, due to their frequent use together to make harmonically-related chords.
Given the theme of a scale, we suggest that a general desire for the melody line to go vaguely "up" or "down"
will naturally create an expectation for the "next" or "previous" note in the scale, regardless of any direct
harmonic relationship between the two notes. Let's call this kind of association a "melodic association". Recall
the first page of the music theory book referred to above, "Jazz Improvisation 1: Tonal and Rhythmic
Principles" by John Mehegan [Mehegan1959], where Mehegan referred to harmonic association as "vertical"
and melodic association as "horizontal" (a different notion of vertical and horizontal than we give in Section
2.3.2 "Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes").
The trouble with such a sequence is that there's no place where it can stop, or rather, that it can
stop anywhere; you are unconsciously waiting for another activity to start, not free association,
but reincorporation.
... What matters to me is the ease with which I free-associate and the skill with which I
reincorporate.
Here's a 'good night' story made up by me and Dorcas (age six).
'What do you want a story about?' I asked.
'A little bird,' she said.
'That's right. And where did this little bird live?'
'With Mummy and Daddy bird.'
'Mummy and Daddy looked out of the nest one day and saw a man coming through the trees.
What did he have in his hand?'
'An axe.'
'And he took the axe and started chopping down all the trees with a white mark on. So Daddy bird
flew out of the nest, and do you know what he saw on the bark of his tree?'
'A white mark.'
'Which meant?'
'The man was going to cut down their tree.'
'So the birds all flew down to the river. Who did they meet?'
'Mr Elephant.'
'Yes. And Mr Elephant filled his trunk with water and washed the white mark away from the tree.
And what did he do with the water left in his trunk?'
'He squirted it over the man.'
'That's right. And he chased the man right out of the forest and the man never came back.'
'And is that the end of the story?'
'It is.'
At the age of six she has a better understanding of storytelling than many university students. She
links the man to the birds by giving him an axe. She links up the water left in the trunk with the
wood-cutter, who she remembers we'd shelved. She isn't concerned with content but any narrative
will have some (about insecurity, I suppose).
are thought and language products of our bodies? How, exactly, does our embodied nature shape
the way we think and communicate? Here are some of the findings discussed in the course of this
book:
Concrete words and concepts directly label our embodied experience. Think of such
short words in English as knee, kick, ask, read, want, sad.
Spatial relations, for example, concepts directly expressed by words such as in, through,
above, and around, can be seen as derived from specialized circuitry in the visual system:
topographic maps of the visual field, orientation-sensitive cells.
What is technically called "aspect" in linguistics -- the way we conceptualize the structure
of events, reason about events, and express events in language -- appears to stem from the
neural structure of our system of motor control.
Abstract thought grows out of concrete embodied experiences, typically sensory-motor
experiences. Much of abstract thought makes use of reasoning based on the underlying
embodied experience....
Grammar consists of neural circuitry pairing embodied concepts with sound (or sign).
Grammar is not a separate faculty, but depends on embodied conceptual and phonological
systems....
Thought and language are thus very strongly shaped by the nature of our bodies, our brains, and
our experience functioning in the everyday world....
Thoughts and language are not disembodied symbol systems that happen to be realized in the
human brain through its computational properties. Instead, thought and language are inherently
embodied. They reflect the structure of human bodies and have the inherent properties of neural
systems as well as the external physical and social environment.
And further [Feldman2006, p. 213]:
Understanding as Simulation
... The important point for us is that much of language can be seen as setting up the conditions for
imagining the scene being portrayed.
Perhaps we need an "embodied construction grammar" for music:
Harmony as an abstraction of human voice.
Rhythm as an abstraction of physical movement, such as walking or dance (which we discuss more in
the rhythm section below).
Since much human relationship is expressed through voice and movement, is music therefore recalling to us
experiences of vocally and physically relating to others and therefore also recalling the associated emotions?
As Zen Buddhists say, while on the one hand, "mind and body are two", on the other hand "mind and body are
one". Maddeningly, though I have often heard the non-dualism of body and mind expressed in Zen (it is quite
standard), I have given up on trying to find a citation for that exact pair of sentences. Here is a pretty good
statement of the sentiment by Shunryu Suzuki in "Zen Mind, Beginner's Mind" [Suzuki1970, Epilogue: Zen
Mind]:
We Buddhists do not have any idea of material only, or mind only, or the products of our mind, or
mind as an attribute of being. What we are always talking about is that mind and body, mind and
material are always one.
8 Acknowledgements
I gratefully acknowledge the proofreading of Simon Goldsmith, Mark Hoemmen, Emma Dzelzkalns, Peter
McCorquodale, Ryan Barrett, Karl Chen, Russell Sears, and Michael O'Donnell. Thanks also to Michelle and
others at the Art and Music Department of the Central Berkeley Public Library for their assistance finding
Helmholtz and checking a quote over the phone. Thanks also to Joanne of The 24/7 Reference Cooperative
and John Kupersmith of the University of California, Berkeley Library for help in finding the obscure Terhardt
and Gerson & Goldstein articles.
I have to particularly acknowledge Michael O'Donnell for his extensive and in-depth discussions with me on
the topic of this paper. He also recommended to me the "Auditory Demonstrations" CD [acoustical-demo],
which certainly extended my understanding of the details of what is known about how the brain processes
sound. Mike has genuine enthusiasm for and knowledge of the subject of how the brain listens to music. He
was quite generous in sharing that knowledge with me by providing much thoughtful and thorough feedback.
Thanks Mike.
Any remaining errors are mine alone.
9 References
Personal Communication
[Auslander] Joel Auslander
[Feldman] Jerome A. Feldman: http://www.eecs.berkeley.edu/Faculty/Homepages/feldman.html
[Feltman] Charles Feltman: http://www.sfcablecarchorus.org/director.html
[Fultz] Andrea Fultz: http://www.andreafultz.com/
[Ganz] William Ganz: http://ucce.berkeley.edu/about
[Goldsmith] Simon Goldsmith: http://sfg.users.sonic.net/
[Hoemmen] Mark Hoemmen: http://www.cs.berkeley.edu/~mhoemmen/
[Levitin] Daniel J. Levitin: http://ego.psych.mcgill.ca/levitin.html/
[O'Donnell] Michael J. O'Donnell: http://www.cs.uchicago.edu/people/odonnell
[Stolorow] Ben Stolorow: http://www.benstolorow.com/
[Turner] Tim Turner
Print
[Alexander1979] Alexander, Christopher. 1979. "The Timeless Way of Building". Oxford University
Press.
[Birkhoff1933] George D. Birkhoff. 1933. "Aesthetic Measure"; in particular see "Chapter V: The
Diatonic Chords". Harvard University Press, Cambridge, Massachusetts.
[Carroll1865] Lewis Carroll, John Tenniel. 1865. "Alice in Wonderland".
[Coren1972] S. Coren. 1972. Psychol. Rev. 79, 359-367.
[Feldman2006] Jerome A. Feldman. 2006. "From Molecule to Metaphor". The M.I.T. Press, Cambridge,
Massachusetts.
[Feynman1965] Richard Feynman. 1983, 1965. "The Character of Physical Law". The M.I.T. Press,
Cambridge, Massachusetts.
[Feynman1985] Richard P. Feynman. 1985. '"Surely You're Joking, Mr. Feynman!": Adventures of a
Curious Character'. Bantam Books, New York.
[Franklin] Benjamin Franklin. 1959--. "The Papers of Benjamin Franklin". Yale University Press.
Online and searchable at: http://franklinpapers.org/
[Helmholtz1863] Hermann L. F. Helmholtz. 1877, 1863. "On the Sensations of Tone as a Physiological
Basis for the Theory of Music". A translation of "Die Lehre von den Tonempfindungen als
physiologische Grundlage für die Theorie der Musik", the first edition of which was published in 1863.
Title page note: "The Second English Edition, Translated, thoroughly Revised and Corrected, rendered
conformal to the Fourth (and last) German Edition of 1877... by Alexander J. Ellis". Dover, New York.
[Johnstone1981] Keith Johnstone. 1981. "Impro: Improvisation and the Theatre". Routledge, New York.
[Levitin2006] Daniel J. Levitin. August 2006. "This Is Your Brain on Music: The Science of a Human
Obsession". Dutton, New York.
http://www.yourbrainonmusic.com/
[Mehegan1959] John Mehegan. 1984, 1959. "Jazz Improvisation 1: Tonal and Rhythmic Principles".
Watson-Guptill Publications, New York.
[Neely1999] Blake Neely. 1999. "How to Play From a Fake Book: Faking Your Own Arrangements
from Melodies and Chords". Hal Leonard, Milwaukee, WI.
[Suzuki1970] Shunryu Suzuki. 2001, 1970. "Zen Mind, Beginner's Mind". Weatherhill, New York &
Tokyo.
[Temperley2007] David Temperley. 2007. "Music and Probability". The M.I.T. Press, Cambridge,
Massachusetts.
[Terhardt1974-PCH] Ernst Terhardt. 1973/4. "Pitch, consonance, and harmony". J. Acoust. Soc. Am.,
Vol.55, No.5, May 1974 (received 1973), pp. 1061-1069.
Sound
[acoustical-demo] A. J. M. Houtsma, T. D. Rossing, W. M. Wagenaars. 1 September 1987. "Auditory
Demonstrations" CD and booklet. Prepared at the Institute for Perception Research (IPO) Eindhoven,
The Netherlands. Supported by the Acoustical Society of America. http://asa.aip.org/discs.html
Image
[Jessica-pout] Image of Jessica Rabbit from "Who Framed Roger Rabbit?" [WFRR-1988], obtained
from http://screenmusings.org/WhoFramedRogerRabbit/pages/WFRR_0604.htm
[Picasso1938] Pablo Picasso. 1938. "Head of a Woman". (I have forgotten where I obtained this image.)
Film
[Jessica-bad] Jessica Rabbit in "Who Framed Roger Rabbit?" [WFRR-1988].
[WFRR-1988] "Who Framed Roger Rabbit?", directed by Robert Zemeckis and released by Touchstone
Pictures. 1988.
General Online
[Austen-caricature] Ben Austen. 15 July 2011. "What Caricatures Can Teach Us About Facial
Recognition". http://www.wired.com/magazine/2011/07/ff_caricature/
[Grammer-harmony] Red Grammer. "Harmony". Many websites say the author is "Anonymous", but
this one http://sniff.numachi.com/pages/tiHARMNY.html says the song is by Red Grammer, Smilin'
Atcha Music.
[Harmon-art-brain] Katherine Harmon. 4 June 2010. "Why so many artists have lazy eyes, and other
things art can teach us about the brain".
http://blogs.scientificamerican.com/observations/2010/06/04/why-so-many-artists-have-lazy-eyes-and-other-things-art-can-teach-us-about-the-brain/
[Jessica-great] "The 100 Greatest Movie Characters: 88 Jessica Rabbit".
http://www.empireonline.com/100-greatest-movie-characters/default.asp?c=88
[Mobley-antibiotics] Harry Mobley. March 13, 2006. "How do antibiotics kill bacterial cells but not
human cells?". http://www.scientificamerican.com/article.cfm?id=how-do-antibiotics-kill-b
[Mosquito-harmony] James Morgan. 8 January 2009. "Mosquitoes make sweet love music".
http://news.bbc.co.uk/2/hi/science/nature/7814404.stm
[Schmidt-music-theory] Catherine Schmidt-Jones. 10 January 2007. "Understanding Basic Music
Theory". http://cnx.org/content/col10363/1.3/
[Schmidt-waves] Catherine Schmidt-Jones. 11 March 2011. "Standing Waves and Musical Instruments".
http://cnx.org/content/m12413/1.11/
[Wilkerson-entropy] Daniel S. Wilkerson. October 2006. "An Intuitive Explanation of the Information
Entropy of a Random Variable". http://danielwilkerson.com/entropy.html
Wikipedia
Note that footnotes (but not inline citations) within Wikipedia articles are simply omitted: the alternative of
inlining the citation seemed cumbersome and they are readily available online.
[alg] Algorithm: http://en.wikipedia.org/wiki/Algorithm
[archetype] Archetype: http://en.wikipedia.org/wiki/Archetype
[arp] Arpeggio: http://en.wikipedia.org/wiki/Arpeggio
[beat] Beat (acoustics): http://en.wikipedia.org/wiki/Beat_(acoustics)
[canon] Canonical: http://en.wikipedia.org/wiki/Canonical
[con] Construction grammar: http://en.wikipedia.org/wiki/Construction_grammar
[con-dis] Consonance and dissonance: http://en.wikipedia.org/wiki/Consonance_and_dissonance
[concert-pitch] Concert pitch: http://en.wikipedia.org/wiki/Concert_pitch
[conv] Convergent evolution: http://en.wikipedia.org/wiki/Convergent_evolution
[cop] Copernican model of the solar system: http://en.wikipedia.org/wiki/Copernican_heliocentrism
[cutt] Cuttlefish: http://en.wikipedia.org/wiki/Cuttlefish
[dem] Demonic possession: http://en.wikipedia.org/wiki/Demonic_possession
[ent] Entropy: http://en.wikipedia.org/wiki/Entropy
[epi] Epicycles: http://en.wikipedia.org/wiki/Epicycle
[eqt] Equal temperament: http://en.wikipedia.org/wiki/Equal_temperament
[feat] Feature vector: http://en.wikipedia.org/wiki/Feature_vector
[gorb] Mikhail Gorbachev: http://en.wikipedia.org/wiki/Mikhail_Gorbachev
[har] Harmonics: http://en.wikipedia.org/wiki/Harmonics
[harmonic7] http://en.wikipedia.org/wiki/Minor_seventh
[hum] Humorism: http://en.wikipedia.org/wiki/Humorism
[just] Just tuning (intonation): http://en.wikipedia.org/wiki/Just_intonation
[log] Logarithm: http://en.wikipedia.org/wiki/Logarithm
[maj] Major scale: http://en.wikipedia.org/wiki/Major_scale
[maj-min] Major and minor: http://en.wikipedia.org/wiki/Major_and_minor
[min] Minor scale: http://en.wikipedia.org/wiki/Minor_scale
[min7] Minor seventh: http://en.wikipedia.org/wiki/Minor_seventh
[miss-fund] Missing fundamental (and virtual pitch): http://en.wikipedia.org/wiki/Missing_fundamental
[mus] Musical tuning: http://en.wikipedia.org/wiki/Musical_tuning
[occ] Occam's razor: http://en.wikipedia.org/wiki/Occam's_razor
[oct] Octave: http://en.wikipedia.org/wiki/Octave
[pen] Pentatonic Scale: http://en.wikipedia.org/wiki/Pentatonic_scale
[pitch] Pitch: http://en.wikipedia.org/wiki/Pitch_(music)
[ptol] Ptolemaic model of the solar system: http://en.wikipedia.org/wiki/Geocentric_model
[rel] Relative pitch: http://en.wikipedia.org/wiki/Relative_pitch
[sci] Scientific method: http://en.wikipedia.org/wiki/Scientific_method
[sci-pitch] Scientific pitch notation: http://en.wikipedia.org/wiki/Scientific_pitch_notation
[twelve-bb] Twelve-Bar Blues: http://en.wikipedia.org/wiki/Twelve-bar_blues
[undertone] Undertone series: http://en.wikipedia.org/wiki/Undertone_series
[venus-fly] Venus Flytrap: http://en.wikipedia.org/wiki/Venus_Flytrap
[wolf] Wolf intervals: http://en.wikipedia.org/wiki/Meantone_temperament#Wolf_intervals