Harmony Explained

arXiv:1202.4212v1 [cs.SD] 20 Feb 2012
Harmony Explained:
Progress Towards A Scientific Theory of Music
The Major Scale, The Standard Chord Dictionary, and The Difference of Feeling Between The Major
and Minor Triads Explained from the First Principles of Physics and Computation; The Theory of
Helmholtz Shown To Be Incomplete and The Theory of Terhardt and Some Others Considered
Daniel Shawcross Wilkerson
Begun 23 September 2006; this version 19 February 2012.
Abstract and Introduction

Most music theory books are like medieval medical textbooks: they contain unjustified superstition,
non-reasoning, and funny symbols glorified by Latin phrases. How does music, in particular harmony, actually
work? We present an answer as a real, scientific theory of music.
In particular we derive from first principles of Physics and Computation the following three fundamental
phenomena of music:
the Major Scale,
the Standard Chord Dictionary, and
the difference in feeling between the Major and Minor Triads.
While the Major Scale has been independently derived before by others in a similar manner as we do here
[Helmholtz1863, p. 300], [Birkhoff1933, p. 92], I believe the derivation of the Standard Chord Dictionary as
well as the difference in feeling between the Major and Minor Triads to be an original contribution to science
and art. Further, we think our observations should convert straightforwardly into an algorithm for classifying
the basic aspects of tonal music in a manner similar to the way a human would.
Further, we examine the theory of the heretofore agreed-upon authority on this subject, 19th-century German
Physicist Hermann Helmholtz [Helmholtz1863], and show that his theory, while making correct observations,
and while qualifying as scientific, fails to actually explain the three observed phenomena listed above;
Helmholtz isn't really wrong, he just fails to be really right, and considers only physical and not computational
phenomena. We also consider the more recent and more computational theory of Terhardt
[Terhardt1974-PCH] (and others) and show that, while his approach (and, it seems, that of others following in
his thread) also attempts a computational explanation and derives some observations that seem to resemble
some of those of the initial part of our analysis, we seem to go further.
I intend this article to be satisfying to scientists as an original contribution to science (as a set of testable
conjectures that explain observed phenomena), yet I also intend it to be approachable by musicians and other
curious members of the general public who may have long wondered at the curious properties of tonal music
and been frustrated by the lack of satisfying, readable exposition on the subject. Therefore I have written in a
deliberately plain and conversational style, avoiding unnecessarily formal language; Benjamin Franklin and
Richard Feynman often wrote in a plain and conversational style, so if you don't like it, to quote Richard
Feynman, "Don't bug me man!"
Table of Contents
1 The Problem of Music
1.1 Modern "Music Theory" Reads Like a Medieval Medical Textbook
1.2 What is a Satisfactory, Scientific Theory?
1.3 Music "Theory" is Not a Scientific Theory of Anything
1.4 Can we Make a Satisfactory Theory of Music?
1.5 Physical Science: Harmonics Everywhere
1.5.1 Timbre: Systematic Distortions from the Ideal Harmonic Series
1.6 Computational Science: as Fundamental as Physical Science
1.6.1 Algorithms are Universal
2 Living in a Computational Cartoon
2.1 Searching for Harmonics
2.1.1 Virtual Pitch: Hearing the Harmonic Series Even When it is Not There
2.1.2 Using Greatest Common Divisor as the Missing Fundamental
2.1.3 Even Animals Seem to Compute the Ideal Harmonic Series
2.2 Artifacts of Optimization
2.2.1 Relative Pitch: Differences Between Sounds
2.2.2 Octaves: Sounds Normalized to a Factor of Two
2.3 Harmony: Sweetness is the Ideal
2.3.1 Recreating an Ideal Harmonic Series using Instruments having Systematically-Distorted Timbre
2.3.2 Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical
Across the Notes
2.3.3 Vertical Intervals Have Pure Ratios
2.3.4 Vertical Intervals Have Balanced Amplitudes
2.3.5 Vertical Intervals Are All The Same Ratio
2.3.6 Harmony is Sweeter Than Sweet
2.4 Interestingness: Just Enough Complexity
2.4.1 The Simplicity of Theme
2.4.2 The Complexity of Ambiguity
2.5 Recognition: Feature Vectors
2.5.1 Soft Computing
2.5.2 False Recognition
2.5.3 Cubism: Partial Recognition Due to Redundant, Over-Determined Feature Vectors
3 Harmonic Music Explained
3.1 The Major Triad
3.2 The Major Scale
3.2.1 Interlocking Triads
3.2.2 Using Logarithms to Visualize Distances Between Tones/Notes
3.2.3 The Keyboard Revealed
3.3 Scales and Keys
3.3.1 Changing Key: Playing Other Groups of Triads
3.3.2 Key Changes Break Harmony
3.3.3 Just versus Equal Tuning
3.4 The Minor
3.4.1 The Minor Triad
3.4.2 The Minor as Auditory Cubism
3.4.3 Minor Scales
3.5 Chords
3.5.1 The Standard Chord Dictionary
3.5.2 How to Turn Sweetness into Mud: Over-Using Octaves

3.5.3 Chords from the Harmonic Series
3.5.4 Chords Inducing Ambiguity
3.5.5 Chords Using the Minor Triad
3.5.6 Chords Preserving Intervals but not Harmonics
4 Miscellaneous Objections
4.1 But what about the Circle of Fifths!
4.1.1 Fifths make a Circle
4.1.2 The Circle of Fifths is Just a Combinatorial Coincidence
4.1.3 The Circle of Fifths Allows for Cool Chord Transitions
4.1.4 The Symmetries of the Circle of Fifths are a Terrible Red Herring
4.2 But Other Cultures Have Different Musical Scales!
4.2.1 A Culture May Simply not be Fully Exploiting All of the Universal Harmonic
Features
4.2.2 But The Nasca People Of Peru Use A Linear, Not A Logarithmic, Scale!
4.3 But You Can Make a Piece of Music Based Entirely on That Utterly Un-Harmonic Interval,
the Augmented Fourth!
4.4 But I've Been a Musician All My Life / Studied Music In College and I've Never Heard Any
of This Before!
5 Helmholtz Fails to Fully Explain Harmony
5.1 Helmholtz's Theory Relies Only On Interfering Overtones, But Harmony Is Something More
5.2 Helmholtz's Theory Doesn't Imply Virtual Pitch
5.3 Helmholtz's Theory is that Pleasure is Only the Absence of Pain
5.3.1 Harmony is Rapture
5.4 Helmholtz's Theory Fails to Fully Explain the Qualitative Difference Between the Major and
Minor Triads
5.5 Helmholtz Isn't Really Wrong, He Just Fails To Be Really Right
6 Other Modern Theories, such as Terhardt and 'Fusion or pattern matching' Theory
6.1 Terhardt Recognizes that the Brain is Listening For Something
6.2 Terhardt Does Not Explain Sustained and Minor Chords
7 Future Work: Towards A Unifying Theory of Music
7.1 Melody as Arpeggio
7.1.1 Scale As Theme: Melodic Association From Harmonic Association
7.1.2 Streaming: Multiple Similar Phenomenon Occurring Consecutively Are Explained By
The Brain As One Thing Moving
7.1.3 Melody can Easily Create Interesting Ambiguities
7.2 The Role of Narrative Generally
7.3 Embodiment and Emotion
7.4 A Proposal For A Unifying Physical and Computational Theory of Music
Acknowledgements
References
1 The Problem of Music

People push different keys on a piano; some combinations and patterns sound good; others do not. How does
that work? Looking at a piano, it is laid out in the following pattern (w=white, b=black)
... wbwbw wbwbwbw wbwbw wbwbwbw ...
Hmm, the white and black keys mostly just alternate, yet these alternating regions last for 5 and then 7 keys
and then that 5/7 region-pair repeats, and where these regions meet there are two adjacent white keys. There
seems to be a pattern, but it is quite an odd one.
The piano keyboard seems really weird and ad-hoc.
Doesn't it seem that something as simple as sound should have a simple device for producing it?
Further, this weirdness is not specific just to the piano: the key layout reflects the Major Scale [maj] which is
the basis of all Western music. Is that black-white pattern somehow fundamental to sound and music itself? Or
are they really just a cultural coincidence, combinations of sounds that we have heard over and over since
infancy and been trained to associate with different emotions? Is something fundamental to the ear and to
sound itself going on here or not?
1.1 Modern "Music Theory" Reads Like a Medieval Medical Textbook

These questions have bothered me literally for decades (starting when I was about ten, looking at our piano
keyboard and asking "what?!"; I basically wrote the above Section 1 "The Problem of Music" at that time).
Consulting "music theory" never helped me either, as
Reading a music theory book is like reading a medieval medical textbook: such books are full of
unjustified superstition, non-reasoning, and funny symbols glorified by Latin phrases.
For example, here is the first page from a famous book on Jazz Theory, "Jazz Improvisation 1: Tonal and
Rhythmic Principles" by John Mehegan [Mehegan1959]. Recall, this is the first page of Lesson 1 of Section 1 of
Book 1, the very first thing the student reads!
"Each of the twelve scales is a frame forming the harmonic system."
What is a "scale"? Where do they come from? For what purpose are there or how does it emerge that there are
twelve exactly? What is a "harmonic system" and what does it mean to say a scale "frames" it?
"Diatonic harmony moves in two directions: Horizontal and Vertical."
Really?! They both look pretty diagonal to me. Oh, but it's Diatonic! That sounds Latin so I guess these people
are smart.
"By combining these two movements... we derive the scale-tone seventh chords in the key of C."
What is a "chord"? What is a "key"? WHAT THE HECK ARE THEY TALKING ABOUT!
You can't start a science textbook like that. You have to start with simple observations humans can make. You
have to build up complex structures from simple ones. You have to motivate your distinctions.
Even if you say "A chord is 3 or more notes played together" that's also almost the definition of a "key" as
well; for what purpose do we have this distinction? You could say "well the notes of a key are played together
but not at the same time," but that also is true of an arpeggio-ed chord; again what's the distinction? Even if
you say "a C major chord is C-E-G" there is no motivation as to how it is that C-E-G sound good together and
other combinations of notes do not.
This "music theory" reminds me a bit of Richard Feynman's description of a science textbook he reviewed for
the California school board as told in '"Surely You're Joking, Mr. Feynman!": Adventures of a Curious
Character', [Feynman1985, p. 270-271], (emphasis in the original):
For example, there was a book that started out with four pictures: first there was a wind-up toy;
then there was an automobile; then there was a boy riding a bicycle; then there was something
else. And underneath each picture it said, "What makes it go?"
I thought, "I know what it is: They're going to talk about mechanics, how the springs work inside
the toy; about chemistry, how the engine of the automobile works; and biology, about how the
muscles work."
It was the kind of thing my father would have talked about: "What makes it go? Everything goes
because the sun is shining." And then we would have fun discussing it:
"No, the toy goes because the spring is wound up," I would say.
"How did the spring get wound up?" he would ask.
"I wound it up."
"And how did you get moving?"
"From eating."
"And food grows only because the sun is shining. So it's because the sun is shining that all these
things are moving." That would get the concept across that motion is simply the transformation of
the sun's power.
I turned the page. The answer was, for the wind-up toy, "Energy makes it go." And for the boy on
the bicycle, "Energy makes it go." For everything, "Energy makes it go."
Now that doesn't mean anything. Suppose it's "Wakalixes." That's the general principle:
"Wakalixes makes it go." There's no knowledge coming in. The child doesn't learn anything; it's
just a word!
1.2 What is a Satisfactory, Scientific Theory?

Further, a scientific theory of something is expected to have a certain "explanatory power". But what is
"explanatory power"? Is it just whatever we like? Consider the old explanations of disease; here is one: evil
spirits inhabit you [dem]. Well, did anyone ever see these spirits? Were the experiences of these spirits
universal across humankind? Were there some general rules of how the spirits behaved? How many there
were? What would appease them?
Another theory was Humorism [hum]: that there were four different fluids in the body: blood, black bile,
yellow bile, and phlegm; when they got out of balance, you had a disease. Ok, this is better than arbitrary
spirits, but did anyone measure the relative levels of these fluids? Could someone predict sickness by
observing these fluids get out of balance? Could you make someone better by, say, draining blood from them?
"Treatments" based on this theory seem to have been long practiced, but did anyone measure to see if draining
blood really made people better versus a control group that did not have their blood drained?
Now we have a new theory called modern medicine. It is much more complex, but let's take a subset of it:
there are little creatures called bacteria that live everywhere. Certain kinds can live in your body and the
results of their activity, such as their excretions, get your body out of normal working order, and thus you
become sick. If you give chemicals to a person that are more toxic to the bacteria than the person, does the
person get better? Yes [Mobley-antibiotics]! Even when compared to a control group? Yes! Can we see these
little bacteria in a microscope? Yes! Ok, this is much more satisfactory as a scientific theory.
Now, let us step back and consider what makes us more satisfied with this theory. What is going on such that it
is a better theory?
For one thing, the theory is mechanical: we have some mechanism, consistent with our understanding of
inanimate matter today (physics and chemistry) such that the operation of the mechanism corresponds
with what we observe (Scientific Method) [sci].
Further, this mechanism is deterministic and precise: there isn't much arbitrariness in the mechanism: we
can compute rather well how sick someone will get and how much toxin we have to give them to kill the
bacteria and not the person.
This mechanism is universal: there is no appeal to beliefs or cultural norms: people throughout the world
get sick in the same way and the medicines work on them, with but small differences that can be further
explained by another mechanism called genetics.
This mechanical explanation is simple and minimal (Occam's razor) [occ]. We can see the parts working.
Lastly, the mechanism is factored -- made up of independent parts -- and the complexity of the observed
phenomena is emergent -- arising naturally from the operation of the parts. That is, these parts of the
explanation of disease all operate independently: (1) how the body works such that the bacterial
excretions disrupt it, (2) how bacteria work such that the toxin kills them, (3) how the toxicity to the
human depends on the size of the human, etc.
Physicist Richard Feynman gave a series of lectures where he attempted to encapsulate the basic nature of
how science is done and the kind of results it produces; these were published as "The Character of Physical
Law" [Feynman1965]. Here is a brilliant paragraph on how to know when you have finally found the truth.
[Feynman1965, p. 171] (underlining added, not in the original):
One of the most important things in this 'guess -- compute consequences -- compare with
experiment' business is to know when you are right. It is possible to know when you are right way
ahead of checking all the consequences. You can recognize truth by its beauty and simplicity. It is
always easy when you have made a guess, and done two or three little calculations to make sure
that it is not obviously wrong, to know that it is right. When you get it right, it is obvious that it is
right -- at least if you have any experience -- because usually what happens is that more comes
out than goes in. Your guess is, in fact, that something is very simple. If you cannot see
immediately that it is wrong, and it is simpler than it was before, then it is right. The
inexperienced, and crackpots, and people like that, make guesses that are simple, but you can
immediately see that they are wrong, so that does not count. Others, the inexperienced students,
make guesses that are very complicated and it sort of looks as if it is all right, but I know it is not
true because the truth always turns out to be simpler than you thought.
Using Computer Science terminology, I summarize Feynman's point as follows.
The more factored a theory and the more emergent the observed phenomena from the theory, the
more satisfying the theory.
The Ptolemaic [ptol] model of the solar system puts the earth at the center. This explanation really does
explain the movements, especially when epicycles [epi] are added, but it is rather complex and ad hoc: how
does it emerge that we need epicycles? The Copernican [cop] system is another explanation of the solar
system that puts the sun at the center. This second explanation only requires Newton's laws of motion plus
gravity. The consequences of Newton's laws are complex and even hard to simulate, even on a modern
computer, but the laws themselves are quite simple and independent and mechanical and factored and
observable etc. Even further, the notation used in this theory easily reflects the underlying understanding in
the theory: it allows for easy calculations when making predictions of the theory. All in all, the Copernican
system is quite a satisfying explanation, or theory, of the motions of planets in the solar system because,
not only does it explain the observed phenomena, it is factored into simple parts and the observed phenomena
are emergent from the interactions of those parts. Consequently, we use the Copernican system today
(adjusted for relativity and other more recent observations).
1.3 Music "Theory" is Not a Scientific Theory of Anything

Music "theory" as we find in books today contains none of the properties of a modern theory that we find
satisfying. At the start we are presented the odd white-black-white-WHITE-black keyboard or Major Scale as
a given. We are sometimes told for example that the Major Scale comes from the Ancient Greeks. We are
sometimes told it is arbitrary and it only sounds good because we have heard it since childhood.
Nothing in music "theory" counts as a scientific theory of anything.
We are told that certain combinations of notes sound good; these combinations are called "chords" and the
fact that these combinations sound good is also arbitrary. We are told lots of strange names for intervals
between notes and these names make no sense. The Standard Chord Dictionary of common chords simply
consists of a list of note combinations we are told are good to play together and will feel a certain way when
heard. Nowhere is there any notion of how we would predict the feeling each chord engenders from the
construction of the chord.
Sometimes I have encountered vague explanations offering "pairs of notes having low whole-number ratios" as
the reason some notes sound good together, and then been told that no one really knows how that works. In Section 5
"Helmholtz Fails to Fully Explain Harmony" we address a well-known theory of Helmholtz where he attempts
an explanation of how it is that notes with frequencies that are in low whole-number ratios to one another
should sound good together. We will show that his theory has problems.
If we make any attempt to actually compute note ratios, the notation actually gets in the way of our
understanding: The notation for the notes and their distances really does not convey very well the actual ratios
of the notes. For example, in the Major Scale, sometimes going up to the next note (space to line above it or
line to space above it) goes up one whole "step", a ratio of 2^(1/6) ≈ 1.122 (the sixth root of 2), and sometimes
only a "half-step" (or "semi-tone"), half as far on a logarithmic scale, a ratio of 2^(1/12) ≈ 1.059 (the twelfth root of 2). (For more
on logarithms and exponentials, see Section 3.2.2 "Using Logarithms to Visualize Distances Between
Tones/Notes".) (To those unfamiliar with musical notation, we will explain the numbers later.) The difference
between these whole and half steps can only be discerned by looking way over to the left of the page of music
and doing complex computations with sharps and flats in order to compute the "key" of the music; and that
whole process is designed to defeat the sometimes-half/sometimes-whole steps (for the arbitrary key of C) that
is baked into the notation itself. This notation may make music easy to play, but it does not make it easy to
understand.
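To make the whole-step/half-step arithmetic concrete, here is a minimal sketch in Python. It assumes equal tuning (twelve equal half-steps per factor-of-two octave), which is where the two ratios quoted above come from. The step pattern whole-whole-half-whole-whole-whole-half is the standard Major Scale pattern, and the names `scale_ratios` and `MAJOR_SCALE_STEPS` are just illustrative choices of mine.

```python
# Sketch: whole-step and half-step ratios, assuming equal tuning
# (12 equal half-steps per octave, an octave being a factor of 2).
HALF_STEP = 2 ** (1 / 12)   # twelfth root of 2
WHOLE_STEP = 2 ** (1 / 6)   # sixth root of 2, i.e. two half-steps

# Standard Major Scale step pattern, encoded in half-steps
# (2 = whole step, 1 = half step). The encoding is a hypothetical choice.
MAJOR_SCALE_STEPS = [2, 2, 1, 2, 2, 2, 1]


def scale_ratios(steps):
    """Return each scale note's frequency ratio relative to the first note."""
    ratios = [1.0]
    half_steps_so_far = 0
    for step in steps:
        half_steps_so_far += step
        ratios.append(2 ** (half_steps_so_far / 12))
    return ratios


if __name__ == "__main__":
    print("whole step:", round(WHOLE_STEP, 3))
    print("half step: ", round(HALF_STEP, 3))
    print("major scale ratios:", [round(r, 3) for r in scale_ratios(MAJOR_SCALE_STEPS)])
```

Notice that the whole step is the half step multiplied by itself, so "half as much" only makes sense once you think in logarithms; the seven steps together land exactly on a ratio of 2.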
This music "theory" has all the properties of preventing understanding, not promoting it. It fits the description
of pseudo-science pretty well. Let's try to do better.
1.4 Can we Make a Satisfactory Theory of Music?

I simply refuse to believe that something so fundamental to human life and so satisfying to so many people is
so arbitrary and so un-explainable. I have attempted to come up with something better and I think I have
succeeded.
As we build up this theory, we want to make sure that we make as few assumptions as possible, and that these
assumptions are founded upon actual experimentally-derived facts -- just as we now demand of the rest of
science. In particular we would like a real, scientific theory of music to be universal and not appeal to cultural
relativism that says "it's all just arbitrary"; no explanation that says such things is a real scientific theory of
anything.
Given that
1. sound and instruments exist in reality and
2. music only sounds like something because a human brain is computing the listening to it,
it seems therefore that
1. physics and
2. computation
respectively are the appropriate places to start a real theory of music.
The brain is central to our theory. Not knowing how the brain really works, we therefore have a hole to fill in
our explanation. We proceed by telling a story to explain the known properties of music; along the way we
assume certain conjectures about the structure of the brain where we need them. We make these conjectures
as reasonable as possible, given the assumption that
The brain is a machine optimized by evolution to compute human survival.
That is, being a machine, the brain is likely to be subject to properties that computer scientists and engineers
have observed across many computational systems and that these properties will be driven by evolutionary
optimization. In the end, the test of our theory will depend on (1) how well it explains the observed
phenomenon called music, and (2) how well the conjectures hold up under testing. In this essay we do (1) and
we leave (2) for future work by cognitive/brain scientists.
1.5 Physical Science: Harmonics Everywhere

Physical science is about as rock-solid of a theory of the world as anything. This is a good place to start.
Catherine Schmidt-Jones [Schmidt-waves]:
For the purposes of understanding music theory, however, the important thing about standing
waves in winds is this: the harmonic series they produce is essentially the same as the harmonic
series on a string. In other words, the second harmonic is still half the length of the fundamental,
the third harmonic is one third the length, and so on.
We can either compute or observe (using, say, high-speed cameras) the properties of the stable vibrations that
occur when a string or a column of air is excited:
1. There is one frequency (the "fundamental") at which the string or air will vibrate;
2. there are also other vibrations (the "harmonics" or "overtones") having higher frequencies that are
multiples of 2, 3, 4, 5, 6, 7 etc. times the fundamental at which the string or air will also vibrate.
These harmonics can be demonstrated by two people holding a long jump-rope: (1) If they swing the rope slowly,
the whole rope makes a single wave. (2) However if they go twice as fast and out of phase (one goes up while
the other goes down) then half of the rope will be up and the other half down and the positions of up and down
will switch twice as fast; further the very middle of the rope will not move at all (a "node"). (3) A similar
effect happens with three waves if they go even faster. For a picture, see [Schmidt-waves, Figure 2]. When a
string is plucked, all of these waves are happening at the same time. That is, plucking generates all waves, but
only those whose wavelengths fit evenly into the length of the string will bounce back and forth and reinforce
each other and persist; other frequencies will die out. From [Schmidt-waves]:
In order to get the necessary constant reinforcement, the container has to be the perfect size
(length) for a certain wavelength, so that waves bouncing back or being produced at each end
reinforce each other, instead of interfering with each other and cancelling each other out. And it
really helps to keep the container very narrow, so that you don't have to worry about waves
bouncing off the sides and complicating things. So you have a bunch of regularly-spaced waves
that are trapped, bouncing back and forth in a container that fits their wavelength perfectly. If you
could watch these waves, it would not even look as if they are traveling back and forth. Instead,
waves would seem to be appearing and disappearing regularly at exactly the same spots, so these
trapped waves are called standing waves.
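The jump-rope picture can be sketched in a few lines of Python. This assumes an ideal string fixed at both ends, for which mode n fits n half-wavelengths into the string; that textbook idealization, the function name `standing_wave_modes`, and the example length and wave speed are my assumptions, not measurements from the text.

```python
# Sketch: standing-wave modes of an ideal string fixed at both ends.
# Assumption: mode n fits exactly n half-wavelengths into the string,
# so wavelength_n = 2 * L / n and frequency_n = n * v / (2 * L).


def standing_wave_modes(length_m, wave_speed_m_s, count):
    """Return (mode number, wavelength in m, frequency in Hz) for the first `count` modes."""
    modes = []
    for n in range(1, count + 1):
        wavelength = 2 * length_m / n
        frequency = n * wave_speed_m_s / (2 * length_m)
        modes.append((n, wavelength, frequency))
    return modes


if __name__ == "__main__":
    # Hypothetical string: 1 m long, waves travelling along it at 440 m/s.
    for n, wavelength, frequency in standing_wave_modes(1.0, 440.0, 4):
        print(f"mode {n}: wavelength {wavelength:.3f} m, frequency {frequency:.1f} Hz")
```

Under these assumptions the surviving frequencies are whole-number multiples of the lowest one, which is the pattern described in the numbered list above.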
We will call each single sine-wave at a single frequency a "tone", whereas the collection of frequencies that
occur together due to a single physical process (such as a vocal utterance or the striking of a piano key) we
will call a "note". (A tone can be expressed simply as (1) a wave "frequency" in Hertz (Hz), the number of
cycles per second, (2) a wave "amplitude", the wave peak height, and (3) a wave "phase", where the wave is
in its cycle compared to other waves; we won't discuss amplitude and phase much.)
This sequence of tones forming a note is called the "Harmonic Series" [har] or "Overtone Series" of the
fundamental. Herein we speak of "the (ideal) Harmonic Series" when we mean an abstract computational ideal
and speak of "an overtone series" when we mean what is actually produced in reality by a particular actual
instrument (which may be quite different from the ideal); note that others quoted here may not follow this
same convention. (Further, throughout we pluralize "series" as "series-es" because in a technical discussion it
is very important to avoid the ambiguity between a single series of multiple tones and multiple series-es of
multiple tones.)
There are two conventions for numbering overtones/harmonics; we use the convention where the fundamental
or "Root" tone is called "harmonic 1", the tone vibrating twice as fast is called "harmonic 2", the tone vibrating
three times as fast is called "harmonic 3", etc.
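The definitions of "tone", "note", and harmonic numbering can be written down as a small data structure. This is only a sketch: the names `Tone` and `ideal_harmonic_series` are hypothetical, and giving every tone an amplitude of 1.0 and a phase of 0.0 is a placeholder, since we have not said anything yet about the amplitudes of the ideal series.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tone:
    """A single sine wave: frequency in Hz, peak amplitude, and phase in radians."""
    frequency_hz: float
    amplitude: float = 1.0   # placeholder value, not specified by the text
    phase: float = 0.0       # placeholder value, not specified by the text


def ideal_harmonic_series(fundamental_hz: float, count: int) -> List[Tone]:
    """Return a note as a list of tones.

    Uses the numbering convention adopted here: harmonic 1 is the
    fundamental, harmonic 2 vibrates twice as fast, and so on.
    """
    return [Tone(frequency_hz=n * fundamental_hz) for n in range(1, count + 1)]


if __name__ == "__main__":
    note = ideal_harmonic_series(110.0, 4)
    for number, tone in enumerate(note, start=1):
        print(f"harmonic {number}: {tone.frequency_hz} Hz")
```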
1.5.1 Timbre: Systematic Distortions from the Ideal Harmonic Series

From "This is Your Brain on Music" by Daniel J. Levitin [Levitin2006, p. 43-44]:
The timbre of a sound is the principal feature that distinguishes the growl of a lion from the purr of
a cat, the crack of thunder from the crash of ocean waves,.... Timbral discrimination is so acute in
humans that most of us can recognize hundreds of different voices. We can even tell whether
someone close to us -- our mother, our spouse -- is happy or sad, healthy or coming down with a
cold, based on the timbre of that voice.
Timbre is a consequence of the overtones.... When you hear a saxophone playing a tone with a
fundamental frequency of 220 Hz, you are actually hearing many tones, not just one. The other
tones you hear are integer multiples of the fundamental: 440, 660, 880, 1200, 1420, 1640, etc.
The different tones -- the overtones -- have different intensities, and so we hear them as having
different loudnesses. The particular pattern of loudnesses for these tones is distinctive of the
saxophone, and they are what give rise to its unique tonal color, its unique sound -- its timbre. A
violin playing the same written note (220 Hz) will have overtones at the same frequencies, but the
pattern of how loud each one is with respect to the others will be different. Indeed, for each
instrument, there exists a unique pattern of overtones. For one instrument, the second overtone
might be louder than in another, while the fifth overtone might be softer. Virtually all of the tonal
variation we hear -- the quality that gives a trumpet its trumpetiness and that gives a piano its
pianoness -- comes from the unique way in which the loudnesses of the overtones are distributed.
Each instrument has its own overtone profile, which is like a fingerprint. It is a complicated
pattern that we can use to identify the instrument. Clarinets, for example, are characterized by
having relatively high amounts of energy in the odd harmonics -- three times, five times, and
seven times the multiples of the fundamental frequency, etc. (This is a consequence of their being
a tube that is closed at one end and open at the other.) Trumpets are characterized by having
relatively even amounts of energy in both the odd and the even harmonics (like the clarinet, the
trumpet is also closed at one end and open at the other, but the mouthpiece and bell are designed to
smooth out the harmonic series). A violin that is bowed in the center will yield mostly odd
harmonics and accordingly can sound similar to a clarinet. But bowing one third of the way down
the instrument emphasizes the third harmonic and its multiples: the sixth, the ninth, the twelfth,
etc.
Besides introducing us to timbre, Levitin points out:
Most real instruments systematically produce tones having amplitudes distinct from that of the
ideal Harmonic Series.
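Here is a rough sketch of timbre as "same frequencies, different loudnesses", using the 220 Hz example from the quotation. The two amplitude profiles are made-up numbers chosen only to illustrate an odd-heavy (clarinet-like) pattern against a more even one; they are not measurements of any instrument, and the function names are hypothetical.

```python
# Sketch: two notes on the same fundamental that differ only in the
# loudness pattern of their overtones.


def overtone_frequencies(fundamental_hz, count):
    """Integer multiples of the fundamental: harmonic 1, 2, ..., count."""
    return [n * fundamental_hz for n in range(1, count + 1)]


# Hypothetical relative loudness of harmonics 1..7 (illustrative, not measured).
ODD_HEAVY_PROFILE = [1.0, 0.1, 0.6, 0.1, 0.4, 0.1, 0.3]
EVEN_PROFILE = [1.0, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]


def note_with_timbre(fundamental_hz, profile):
    """Pair each overtone frequency with its relative loudness."""
    frequencies = overtone_frequencies(fundamental_hz, len(profile))
    return list(zip(frequencies, profile))


if __name__ == "__main__":
    print(overtone_frequencies(220, 7))
    print(note_with_timbre(220, ODD_HEAVY_PROFILE))
    print(note_with_timbre(220, EVEN_PROFILE))
```

Multiplying directly, harmonics 5 through 7 of 220 Hz come out as 1100, 1320, and 1540 Hz, so the last three figures printed in the quoted passage appear to be misprints; the point about timbre is unaffected.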
Michael O'Donnell points out that the effects of timbre on the overtone series go even further [O'Donnell, 14
January 2009]:
I suggest that you check into the importance of approximate harmonic series. E.g., the overtones
on a piano string are measurably and audibly higher in frequency than the harmonics that they
approximate. Both the nearness to harmonics, and the perceptible difference, appear to be
important....
You mentioned the way that the harmonic series of frequencies occurs naturally in air columns, as
in strings. But, on soft strings (such as guitar, violin---little resistance to bending) the natural series
of resonant frequencies is very accurately harmonic. In wind instruments, the natural resonances
of the air column approximate the harmonic series rather poorly. In the brass, the approximation is
so poor that the numbers of the harmonics don't even match between the natural resonances and
the notes as played. While the conical shape of many reeds is designed to improve the
harmonicity of the resonances, the bell on the brass is actually designed to increase the
inharmonicity of the natural resonances, which produces a better match in the misaligned
overtones. It is phase locking between vibrational modes, caused by the highly nonlinear feedback
in the excitation mechanisms (reeds, lips, bow scraping) that makes the overtone series so
accurately harmonic, not the natural resonances.
That is, O'Donnell points out:
Most real instruments systematically produce tones having frequencies distinct from that of the
ideal Harmonic Series.
Therefore whatever our theory of harmony it should work for sounds where the overtone series differs from
the ideal Harmonic Series by (1) altered amplitudes and (2) altered frequencies. However, notice that both of
these distortions of the ideal Harmonic Series have one important property:
The distortions made by the overtone series of a given instrument to the ideal Harmonic Series
are a predictable, systematic function of the instrument kind.
That is, two notes (series-es of overtones) made by the same (kind of) instrument will be distorted from the
ideal Harmonic Series in the same (or similar) way. This must be the case in order for an instrument or
instrument kind to have a uniform, recognizable timbre. We will use this below.
1.6 Computational Science: as Fundamental as Physical Science
I think part of the reason the theory we develop here might not have been described before is that there aren't
many people who think about both the physical and the computational understanding needed to derive it.
The properties, or laws, of computation are just as fundamental as the physical laws.
Computation is everywhere -- you live in a sea of it.
You may see a cup, but computational engineers see an idiom for managing liquids by getting them stuck
in a local optimum.
You may think of ownership as a basic human right, but engineers think of it as a distributed decision-making algorithm.
You may enjoy a field full of bumblebees pollinating flowers, but engineers enjoy it as an information
distribution network.
You may think it is polite to not talk on top of other people at dinner, but engineers think it is optimal to
use a back-off algorithm to resolve a network packet collision.
I wrote that list off of the top of my head as fast as I can type and edit text: the examples are myriad.
Consider for a moment that perhaps you are computation: that you are the computational activity of your
brain. Some people say that this reduces the wonder of life to simple mechanism; I say it simply elevates
mechanism to the wonder of life. While you need not adopt this All-Is-Computation point of view as your
personal understanding of life or of yourself, a computational understanding of the brain has amazing
explanatory power, so please consider it at least for the rest of this essay.
1.6.1 Algorithms are Universal

Finding good ways to solve a problem with fewer resources is a basic pursuit of those who study computation. A
general method for solving a problem is called an "algorithm" [alg]. New algorithms that solve common
problems well are rare and highly valued. When a solution is "reduced to the simplest and most significant
form possible without loss of generality" we say it is "canonical" [canon]. An algorithm is a canonical method.
Many tricks in engineering seem not to be merely the artifacts of human cleverness, but instead the result of
fundamental properties of the medium of computing. Algorithms invented by different species to solve the
problem called staying alive often resemble each other in ways that cannot be explained by any other means
than "that's the only way to do it" (or one of only a few ways). From [cutt]:
The organogenesis of cephalopod eyes differs fundamentally from that of vertebrates like humans.
Superficial similarities between cephalopod and vertebrate eyes are thought to be examples of
convergent evolution.
The human eye and the cuttlefish eye both address the problem of extracting information at a distance from
light. Both evolved separately and yet they both end up at a very similar solution. Biologists call this
phenomenon "convergent evolution" [conv]; architects call it "timeless pattern" [Alexander1979]; storytellers
call it "archetype" [archetype]; clothiers call it "classical style"; computer scientists call it "algorithm". When
humans tried to find a mechanical solution to the same problem, they invented the camera which is just an eye
again. We should therefore not be surprised if
Conjecture One: Computational laws/idioms/patterns/algorithms are universal: The brain works
using a combination of simple computational algorithms of which we are likely already aware.
2 Living in a Computational Cartoon
"I'm not bad, I'm just drawn that way." -- Jessica Rabbit [Jessica-bad]
Jessica Rabbit [Jessica-pout] is one of the sexiest characters in Hollywood, elected 88th of The 100 Greatest
Movie Characters of All Time by Empire Magazine [Jessica-great]. Sadly, she is just a drawing and a voice.
Despite the powerful illusion to the contrary, we do not see or hear the world; we see and hear the world that
our brains compute. Like the characters in "Who Framed Roger Rabbit?" [WFRR-1988], we live in a cartoon.
Music is not what the world does; it is what we do with the world.
A friend of mine, Joel Auslander, used to intern at Pixar; his job was to make physics simulator tools for the
animators. He wanted to make simulators that were accurate to the real physics, but he said that the animators
told him that people don't want to watch real physics, people want to watch cartoon physics: even though not
as accurate as real physics, cartoon physics is somehow more satisfying [Auslander, c. 1996].
Conjecture Two: The brain uses cartoon physics, that is, physics that is easy to compute, but not
necessarily faithfully accurate to reality.

We suggest that both the use of cartoon physics and the inaccuracy of cartoon physics are due to the simple
fact that the brain is computationally limited.
Here is a cartoon physics effect in vision. When taking a drawing class our teacher pointed out some useful
visual effects to us: (1) To make an object look round, shade the object the more its face bends away from the
viewer and (2) put highlights where the light source would reflect off of it. Now think what pantyhose do to
women's legs. (1) When the mesh of the hose is straight on, it is not very dark, but as the leg bends away and
the mesh is seen on edge, the threads line up and the grid rapidly appears to darken. (2) Pantyhose are shiny
and so naturally produce reflection highlights. That is, pantyhose fire the recognizers in your brain for the
features of roundness harder than a real round leg could: her leg looks rounder than round, impossibly round.
See Section 2.5 "Recognition: Feature Vectors" for more on this phenomenon.
We suggest that the brain is using cartoon physics when processing sounds as well. That is, explanations of
auditory effects based on the physical properties of actual overtones of different instruments (such as the
piano or the trumpet) are beside the point (or at least beside the primary point) when it comes to the brain. As
we will see in Section 5 "Helmholtz Fails to Fully Explain Harmony", this point of view is the essential point
where our theory differs from that of Helmholtz. What primarily distinguishes this essay from previous
attempts to explain music is that our whole approach is oriented primarily not from the external world of
physics, but from the internal world of the computation by our brains that is us, from the computational
cartoon in which we live and from which we think we experience the world, but which is not the world, but
instead only ourselves.
2.1 Searching for Harmonics

As Levitin pointed out in Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series",
finding the difference between what we hear and the ideal Harmonic Series is a valuable tool for recognizing
people and determining their emotional state. Many sounds are made by vibrating strings or columns of air, but
perhaps more importantly, the human voice is made up of vibrating "chords" and a "windpipe" of air. Given
that sounds associated to a single source would tend to be arranged in a Harmonic Series, and especially given
how important the voice is to humans, it would not be surprising if perhaps
Conjecture Three: Finding harmonics is a common and important problem, so the brain has
hardware for recognizing the Harmonic Series.
You can hear a demonstration of this, and of many other interesting auditory phenomena, on the
"Auditory Demonstrations" CD from the Institute for Perception Research, Eindhoven, The Netherlands and
the Acoustical Society of America [acoustical-demo, Demo 1], "Cancelled Harmonics":
[Twenty tones in the same Harmonic Series are all played together.] When the relative amplitudes
of all 20 harmonics remain steady (even if the total intensity changes), we tend to hear them
holistically. However, when one of the harmonics is turned off and on, it stands out clearly. The
same is true if one of the harmonics is given a "vibrato" (i.e. its frequency, its amplitude, or its
phase is modulated at a slow rate).
I recall my voice teacher Andrea Fultz saying the goal was to get me to sing so that my voice resonated in my
"mix": in both my head and chest voice at the same time [Fultz, c. 2006]. She was trying to get me to have a
more ringing or sweeter voice by making sure all the overtones were present by ensuring that somewhere in
my body some resonator of the right size was amplifying it (see Section 2.3 "Harmony: Sweetness is the Ideal"
below).
2.1.1 Virtual Pitch: Hearing the Harmonic Series Even When it is Not There
There is a reliable acoustic phenomenon called "Virtual Pitch": if the Harmonic Series is processed to remove
the Root or Fundamental tone and then played to a person, that person will hear the note, including the Root
tone, even though it is not played [miss-fund]. The "Auditory Demonstrations" CD again [acoustical-demo,
Demo 20], "Virtual pitch":
A complex tone consisting of 10 harmonics of 200 Hz having equal amplitude is presented, first
with all harmonics, then without the fundamental, then without the two lowest harmonics, etc.
Low-frequency noise (300-Hz lowpass, -10dB) is included to mask a 200-Hz difference tone that
might be generated due to distortion in playback equipment.
As they say, in the demo overtones are subtracted one at a time, from the fundamental on up. Amazingly, the
note being played seems to stay the same; however it does get more buzzy or annoying to the point where a
fellow listener Simon Goldsmith thought that he would no longer call the last example the same note
[Goldsmith, c. 2010].
Virtual pitch is what allows engineers to fake bass notes on small speakers: they don't play the low tones, as
often the speaker is too physically small to make the fundamental frequency anyway; instead they play the
overtones and rely on your brain to reconstruct the whole Harmonic Series. However, as we noted above, you
will hear that small, cheap speakers sound, well, cheap or "tinny"; the bass just doesn't sound as good as it
does when played on sub-woofers. That said, don't forget how remarkable it is that you can still "hear" the
non-existent fundamental tone at all (which helpfully prevents the need for people to jog with sub-woofers
attached to their ears). From [miss-fund]:
For example, when a note (that is not a pure tone) has a pitch of 100 Hz, it will consist of
frequency components that are integer multiples of that value (e.g. 100, 200, 300, 400, 500.... Hz).
However, smaller loudspeakers may not produce low frequencies, and so in our example, the 100
Hz component may be missing. Nevertheless, a pitch corresponding to the fundamental may still
be heard.
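One purely physical observation sits behind this example, and it can be checked numerically: a sum of harmonics of 100 Hz still repeats every 1/100 of a second even when the 100 Hz component itself is left out. The sketch below is only an illustration of that property of the signal; it says nothing about how the brain detects it. The function names and the choice of equal-amplitude sine partials are assumptions made for the illustration.

```python
import math

def complex_tone(t, fundamental_hz, harmonics):
    """Sum of equal-amplitude sine partials at the given harmonic numbers."""
    return sum(math.sin(2 * math.pi * n * fundamental_hz * t) for n in harmonics)

def repeats_with_period(period_s, fundamental_hz, harmonics, samples=200):
    """Check numerically that the waveform repeats after period_s seconds."""
    for k in range(samples):
        t = k * period_s / samples
        a = complex_tone(t, fundamental_hz, harmonics)
        b = complex_tone(t + period_s, fundamental_hz, harmonics)
        if abs(a - b) > 1e-9:
            return False
    return True

F0 = 100.0
full = range(1, 6)            # 100, 200, 300, 400, 500 Hz
no_fundamental = range(2, 6)  # 200, 300, 400, 500 Hz (100 Hz removed)

# Both tones repeat every 1/100 s, the period of the fundamental.
print(repeats_with_period(1 / F0, F0, full))
print(repeats_with_period(1 / F0, F0, no_fundamental))
# The tone without its fundamental does not repeat at the period of 200 Hz.
print(repeats_with_period(1 / (2 * F0), F0, no_fundamental))
```

So the period of the fundamental remains in the waveform after the fundamental tone is removed, which is at least consistent with a pitch at 100 Hz still being available to a listener.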
(Note that virtual pitch is a special case of (1) the feature vector understanding that we give in Section 2.5
"Recognition: Feature Vectors" and (2) the concomitant effect of false recognition that we speak of in Section
2.5.2 "False Recognition", where here virtual pitch is the false recognition of the Harmonic Series.)
(See Section 6.2 "Terhardt Does Not Explain Sustained and Minor Chords" for an illustration by Coren
[Coren1972] (as quoted by Terhardt [Terhardt1974-PCH]) which shows standard visual illusions as a metaphor
for virtual pitch.)
(In "How to Play From a Fake Book", Neely [Neely1999] says that when playing a chord, you can drop not only the
Root of the chord, but also the Fifth and the listener will still hear the chord; see Section 3.5.4 "Chords
Inducing Ambiguity". We should point out that here we speak of omitting one note from a chord, a collection
of multiple notes, or multiple series-es of tones, whereas virtual pitch is a phenomenon of omitting one tone
from a single Harmonic Series of tones of a single note. However we argue later in Section 2.3.2 "Harmony
Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes" that these two
situations are closely related and therefore the fact that it works to omit the Root or Fifth of a chord is actually
the phenomenon of virtual pitch again and is thus more evidence for our theory that the brain is listening for
the Harmonic Series.)
2.1.2 Using Greatest Common Divisor as the Missing Fundamental

What is the means by which the brain determines the missing fundamental? From [acoustical-demo, Demo 21],
"Shift of Virtual Pitch":

A tone having strong partials with frequencies of 800, 1000, and 1200 Hz will have a virtual pitch
corresponding to the 200 Hz missing fundamental, as in Demonstration 20. If each of these
partials is shifted upward by 20 Hz, however, they are no longer exact harmonics of any
fundamental frequency around 200 Hz. The auditory system will accept them as being "nearly
harmonic" and identify a virtual pitch slightly above 200 Hz (approximately 1/3 * (820/4 + 1020/5
+ 1220/6) = 204 Hz in this case). The auditory system appears to search for a "nearly common
factor" in the frequencies of the partials.
There is a simple algorithm for finding the Root of a partial overtone series:
Given a set of tones, hear the (approximate) Greatest Common Divisor (gcd) of the tones as the
fundamental.
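A minimal sketch of one way such an approximate gcd could be computed follows. It is not a claim about what the auditory system does. It assumes a two-step method: first a rough common divisor found by Euclid's algorithm with a tolerance, then the refinement used in the quoted passage, averaging each partial divided by its harmonic number. The names `approx_gcd` and `virtual_pitch` and the 50 Hz tolerance are hypothetical choices for this illustration.

```python
from functools import reduce

def approx_gcd(a, b, tol):
    """Euclid's algorithm on real numbers, stopping once the remainder is at most tol."""
    while b > tol:
        r = a % b
        r = min(r, b - r)  # treat "just under a multiple" like "just over"
        a, b = b, r
    return a

def virtual_pitch(partials_hz, tol_hz=50.0):
    """Estimate a 'nearly common factor' of the partials (tol_hz is an assumed tolerance)."""
    # Step 1: rough common divisor of all partials.
    rough = reduce(lambda x, y: approx_gcd(x, y, tol_hz), partials_hz)
    # Step 2: assign each partial its nearest harmonic number and average
    # the fundamentals that each partial implies, as in the quoted formula.
    harmonic_numbers = [round(f / rough) for f in partials_hz]
    implied = [f / n for f, n in zip(partials_hz, harmonic_numbers)]
    return sum(implied) / len(implied)

print(virtual_pitch([800, 1000, 1200]))            # exact harmonics of 200 Hz
print(round(virtual_pitch([820, 1020, 1220]), 1))  # shifted partials, about 204 Hz
```

With the shifted partials the rough divisor is 200, the harmonic numbers come out as 4, 5, and 6, and the average is 1/3 * (820/4 + 1020/5 + 1220/6), the same expression given in the demonstration notes.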
2.1.3 Even Animals Seem to Compute the Ideal Harmonic Series

This conjecture on the brain creating virtual pitch seems to hold even for non-humans, as pointed out in "This
is Your Brain on Music" by Daniel J. Levitin [Levitin2006, p. 41] (emphasis in the original):
When I was in graduate school, my advisor, Mike Posner, told me about the work of a graduate
student in biology, Petr Janata.... Peter [sic] placed electrodes in the inferior colliculus of the barn
owl, part of its auditory system. Then, he played the owls a version of Strauss's "The Blue Danube
Waltz" made up of tones [by "tones" here he means what we are calling "notes": each note is an
entire series of overtones] from which the fundamental frequency [what we are calling the
fundamental tone of the overtone series] had been removed. Petr hypothesized that if the missing
fundamental is restored at the early levels of auditory processing, neurons in the owl's inferior
colliculus should fire at the rate of the missing fundamental. This was exactly what he found. And
because the electrodes put out a small electrical signal with each firing -- and because the firing
rate is the same as a frequency of firing -- Petr sent the output of these electrodes to a small
amplifier, and played back the sound of the owl's neurons through a loudspeaker. What he heard
was astonishing; the melody of "The Blue Danube Waltz" sang clearly from the loudspeakers: ba
da da da da, deet deet, deet deet. We were hearing the firing rates of the neurons and they were
identical to the frequency of the missing fundamental. The harmonic series has an instantiation not
just in the early levels of auditory processing, but in a completely different species.
Michael O'Donnell pointed out to me that there is an ambiguity here [O'Donnell, 14 February 2009]:
[The above story] doesn't allow one to distinguish whether the Owl, or the human listener, is
experiencing the virtual pitch.
I passed this on to Daniel J. Levitin; his response [Levitin, 24 May 2010]:
You're absolutely right that these two possibilities need to be distinguished. The electrodes that
were placed in the brain of the owl (in the inferior colliculus) were analyzed using
specotrograms[sic] and fourier[sic] analysis. It was clear that the signal itself coming from the
owl's brain had replaced the missing fudnamental[sic]. It was only after this analysis that Petr
thought to hook it all up to play the signal over loudspeakers (so that humans could hear the
output) as a cool demonstration.
Female Mosquitoes only mate when the rate of the wing-beats of the male harmonizes at a Perfect Fifth above the
rate of her wing-beats (we start introducing musical terminology such as the Perfect Fifth in Section 3.1 "The
Major Triad"). From "Mosquitoes make sweet love music" [Mosquito-harmony]:

The familiar buzz of a flying female mosquito may be irritating to humans, but for her male
counterpart, it is an irresistible mating signal. Males and females each have their own
characteristic flight tone - which they create by beating their wings.
But when scientists from Cornell University listened in on a male Aedes aegypti pursuing his
mate, they were surprised to hear a new kind of "music" playing....
The amorous couple began to beat their wings together at a matching frequency - 1,200 hertz.
This love song is a "harmonic", or multiple, of their individual frequencies - 400 Hz for the female
and 600 Hz for the male....
"So we're trying to discover what makes a male more attractive. It's a mystery. It could be his
odour[sic], or his bright black and white markings.
"But we think females are assessing the fitness of males based on how well they can sing."
2.2 Artifacts of Optimization

The brain has constrained resources. Evolution has no time to waste and therefore these resources are likely
used in an optimal way -- or at the very least any easy optimizations will have been done for a given
organization of a brain. (That is, evolution will drive a machine into a local optimum, even if it gets stuck there
and does not reach a global optimum.)
Having separate hardware in the brain for recognizing each combination of tones that co-occur in nature is
sub-optimal and it would just be an expensive way to use up neurons. The algorithm every engineer resorts to
in this situation, and what I suspect the brain does also, is to find a way to "re-use code": to solve the problem
by generalizing the hardware a little so the same "code" can be used in many more situations. Here, we want
one Harmonic Series recognizer that works for all the different overtone series-es we may encounter.
Further, the problem that the brain is solving when listening to music is recognizing sounds that are important
to it, such as perhaps the nuances of a human voice against a background of noise. In order to recognize
something, it is ok to simplify the input or throw away information if it makes the problem easier, as long as
enough information is retained to complete the task.
We now consider two different tricks for greatly simplifying the computation the brain must do in order to
recognize the harmonic series. We will also conjecture some computational artifacts of the way the brain
computes that should result from these optimizations, resulting in well-known universal features of music:
relative pitch and octaves.
2.2.1 Relative Pitch: Differences Between Sounds

Again, most engineers would tell you that, given the problem of designing a brain to recognize the Harmonic
Series, their intuition would tell them to build one, single Harmonic Series recognizer, not a different one for
every possible note. The way to accomplish this would be to make the machine recognize only that which is
the same (or mostly the same) in all overtone series-es and ignore that which changes. While the tones of
different Harmonic Series-es differ, conveniently the ratio of their frequencies to their fundamental frequency
does not. Therefore we consider it very likely that
Conjecture Four: The brain normalizes tones by dividing tones to get tone ratios.
Recognizing ratios of tones (and notes) more strongly than the absolute tones themselves is a phenomenon
called "Relative Pitch" [rel]. A ratio of a pair of tones (or notes) is called an "interval".
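As a small illustration of why dividing by the fundamental is convenient, the sketch below builds two ideal Harmonic Series on different fundamentals and shows that they become identical once each tone is expressed as a ratio to its own fundamental. The function names and the example frequencies are assumptions for the illustration, and only the ideal (undistorted) series is modeled.

```python
from fractions import Fraction

def harmonic_series(fundamental_hz, count=6):
    """Ideal Harmonic Series: integer multiples of the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def normalize(tones_hz):
    """Express each tone as an exact ratio to the first (fundamental) tone."""
    root = tones_hz[0]
    return [Fraction(t, root) for t in tones_hz]

low_note = harmonic_series(200)   # 200, 400, 600, ...
high_note = harmonic_series(300)  # 300, 600, 900, ...

print(low_note == high_note)                        # the absolute tones differ
print(normalize(low_note) == normalize(high_note))  # the ratios agree
print([str(r) for r in normalize(high_note)])
```

A single recognizer that looks only at the normalized ratios 1, 2, 3, ... would therefore serve for both notes.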
2.2.2 Octaves: Sounds Normalized to a Factor of Two

Processing sound requires operating on frequencies over several orders of magnitude. If these frequencies
could be made to "wrap-around" then we have another opportunity for code re-use.
When the police take a mug shot of a criminal, their goal is to take the photo in such a way as to maximize the
recognizability of the subject in the future given the photo. They employ a common trick used in the
recognition problem: they photograph the subject in standard positions (front and profile), under standard
lighting conditions, against a standard backdrop, and after removing any obscuring clothing. We say they
normalize the photograph: they remove information irrelevant to the thing to be recognized and put it in a
standard form; doing this helps recognize the thing later.
Consider the conceptually straightforward process of the brain halving or doubling the frequency of a wave
until it is within a particular range. Now the brain only needs a Harmonic Series recognizer for tones within a
frequency range of a single factor of two, not across the whole spectrum of sound. Breaking the problem into
two parts like this, (1) normalization followed by (2) recognition, greatly simplifies the resulting frequency
recognizer. We therefore consider it likely that
Conjecture Five: The brain normalizes tones by halving or doubling them until within a particular
frequency range spanned by a factor of two.
The individual computational units of the brain are not as fast as those in modern electronics; however, those of
the brain are operating in "massive parallel": many operations may be computed at once and all that is needed
is that one find the answer. To the intuition of anyone who has seen hardware designed it seems very likely
that the brain is halving/doubling frequencies by many different powers of two in parallel and then running all
of the results through the frequency recognizer at once. If any one matches, the harmonic has been found.
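The normalization in Conjecture Five can be sketched in a few lines. The first function halves or doubles step by step; the second mimics the "many powers of two at once" idea by generating all shifted copies and keeping the one that lands in range. Both are illustrations under assumed names, and the reference range starting at 220 Hz is an arbitrary choice, not something the essay specifies.

```python
def fold_into_octave(freq_hz, low_hz):
    """Halve or double freq_hz until low_hz <= result < 2 * low_hz."""
    if freq_hz <= 0 or low_hz <= 0:
        raise ValueError("frequencies must be positive")
    while freq_hz >= 2 * low_hz:
        freq_hz /= 2
    while freq_hz < low_hz:
        freq_hz *= 2
    return freq_hz

def fold_by_trying_all_shifts(freq_hz, low_hz, max_shift=10):
    """Try every power-of-two shift and keep the one that falls in range.

    Raises StopIteration if no shift within max_shift lands in range.
    """
    candidates = (freq_hz * 2.0 ** k for k in range(-max_shift, max_shift + 1))
    return next(c for c in candidates if low_hz <= c < 2 * low_hz)

LOW = 220.0  # assumed lower edge of the reference range
for f in (110.0, 220.0, 880.0, 165.0, 660.0):
    print(f, "->", fold_into_octave(f, LOW), fold_by_trying_all_shifts(f, LOW))
```

Here 110, 220, and 880 all fold to the same value, as do 165 and 660, so a recognizer that sees only the folded frequencies cannot tell such tones apart.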
If this were so, then tones (and notes) that differ from each other by a factor of two would sound very much
alike. The range of notes that are all within one factor of two is called in music an "Octave" [oct]. ("Oct-" comes from
the Latin "octo", meaning eight, not two; the relationship to the number eight will become clear later.) Levitin again from "This
is Your Brain on Music" [Levitin2006, p. 29]:
Here is a fundamental quality of music. Note names repeat because of a perceptual phenomenon
that corresponds to the doubling and halving of frequencies. When we double or halve a
frequency, we end up with a note that sounds remarkably similar to the one we started out with.
This relationship, a frequency ratio of 2:1 or 1:2, is called the octave. It is so important that, in
spite of the large differences that exist between musical cultures -- between Indian, Balinese,
European, Middle Eastern, Chinese, and so on -- every culture we know of has the octave as the
basis for its music, even if it has little else in common with other musical traditions.
Again, according to Levitin, the Octave interval occurs in every musical tradition in the world. This
observation is the first of many to suggest that the musicality of sound depends on something universal about
human beings, rather than simply being learned from culture.
2.3 Harmony: Sweetness is the Ideal

Recall from Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series" that the brain uses
differences from the ideal/cartoon model as a kind of "personality" or in this case "timbre". Recall from the
same section that Levitin suggests that we use this timbre to solve the important problem of recognizing people
and their emotional state. But being perfect makes this recognition hard; from "What Caricatures Can Teach
Us About Facial Recognition" [Austen-caricature] (see Section 2.5.2 "False Recognition" for more):
[W]hen you talk to these artists about their process, you realize that the psychologists have gotten
the basics down pretty well. When Court Jones, the 2005 Golden Nosey winner, describes how he
teaches the craft to younger artists, he lays out exactly the algorithm that vision scientists believe
humans use to identify faces. Students, he says, should imagine a generic face and then notice
how the subject deviates from it: "That's what you can judge all other faces off of."
Also, just as a vision scientist would predict, symmetrical faces -- those close to our internal
average -- are especially difficult to caricature. People at the convention mention struggles with
Katy Perry and Brad Pitt; the animator Bill Plympton, a guest speaker at the convention, tells me
that Michael Caine has long been a bête noire. The same principle explains why the person at the
convention with maybe the least symmetrical of faces appears by week's end in no fewer than 33
works of art on the ballroom walls.
I don't think I need a citation to claim that Katy Perry and Brad Pitt are considered to be very beautiful
people. This suggests another conjecture.
Conjecture Six: Absence of distortion (or personality or timbre) is sweetness.
2.3.1 Recreating an Ideal Harmonic Series using Instruments having Systematically-Distorted Timbre
In Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series" above we saw that the
overtone series of a single instrument is easily distorted by myriad physical effects. However, recall that for
the same (kind of) instrument, those distortions were systematic and reliable. Therefore by playing
multiple notes,
on instruments having the same (or similar) timbre,
and relying on Relative Pitch to subtract the differences for us,
from distorted overtone series-es we can magically recreate parts of the ideal Harmonic Series!
2.3.2 Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes
Suppose we play two notes on the piano that are a Fifth (a factor of 3/2) apart. Per O'Donnell's comment in
Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series" above, since piano strings are
not the strings of ideal physics, they don't make an ideal Harmonic Series. Instead, each tone in the series is
moved by being multiplied by some fudge factor. However notice that strings on the piano are made of the
same stuff, at least nearby strings, and this fudge factor should therefore be somewhat consistent across
strings. That is, two corresponding tones at the same point in the overtone series of two different notes should
get multiplied by the same fudge.
Tones of 1st note:   1 ---> (1   * 2 * fudge2) ---> (1   * 3 * fudge3) ...
                     |              |                       |
                     v              v                       v
Tones of 2nd note: 3/2 ---> (3/2 * 2 * fudge2) ---> (3/2 * 3 * fudge3) ...
Now notice that there are two kinds of intervals of tone pairs:
"horizontal": intervals made by pairs of tones within the one series of tones generated by one note, and
"vertical": intervals made by pairs of tones across the two series-es of tones generated by the two
different notes, especially those of corresponding overtones.
2.3.3 Vertical Intervals Have Pure Ratios

As O'Donnell points out above in Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic
Series", real instruments can systematically produce overtones at frequencies different from those of the ideal
Harmonic Series; one such instrument is the piano which produces stretched overtones. However, these
distortions from the ideal Harmonic Series affect these horizontal and vertical intervals differently:
Horizontal intervals are fudged: the ratio of overtone 3 of the 2nd note to overtone 1 of the 2nd note
has fudge in it:
(3/2 * 3 * fudge3) / (3/2) = 3 * fudge3,
Vertical intervals are pure: the ratio of overtone 3 of the 2nd note to overtone 3 of the 1st note is pure:
(3/2 * 3 * fudge3) / (3 * fudge3) = 3/2 (pure!).
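The cancellation can be checked numerically. In the sketch below the fudge factors are made-up numbers chosen only for illustration (they are not measurements of any piano), and the assumption, as in the text, is that both notes share the same fudge factor at each overtone number.

```python
import math

# Hypothetical stretch factors: FUDGE[n] multiplies overtone n. Values are invented.
FUDGE = {1: 1.0, 2: 1.003, 3: 1.008, 4: 1.015}

def overtones(note, fudge=FUDGE):
    """Overtone n of a note is note * n * fudge[n]."""
    return {n: note * n * f for n, f in fudge.items()}

first = overtones(1.0)   # 1st note
second = overtones(1.5)  # 2nd note, a Fifth (factor 3/2) above

# Horizontal: overtone 3 against overtone 1 within the 2nd note.
horizontal = second[3] / second[1]
# Vertical: corresponding overtones across the two notes.
vertical = {n: second[n] / first[n] for n in FUDGE}

print("horizontal 3:1 =", horizontal)  # 3 * fudge3, not exactly 3
print("vertical       =", vertical)    # every entry is 3/2 up to rounding
```

Every vertical ratio comes out as 3/2 regardless of which fudge values are used, because the same factor appears in the numerator and the denominator; the horizontal ratio keeps its fudge.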
However, I would be remiss if I did not point out here [acoustical-demo, Demo 31], "Tones and Tuning with
Stretched Partials" from the "Auditory Demonstrations" CD, quoted in Section 5.1 "Helmholtz's Theory Relies
Only On Interfering Overtones, But Harmony Is Something More". In Demo 31, a piece by Bach is played on
a computer-generated piano (part 1) having normal overtones and (part 4) having overtones where an Octave is
stretched from a factor of 2 to a factor of 2.1. Taken naively, our theory that the purity of vertical intervals
matters to the brain suggests that these should both harmonize; however the normal one (part 1) certainly
sounds better. We suggest therefore that if the horizontal intervals are distorted grossly enough, then the fact
that the vertical intervals are pure cannot save the harmony from being destroyed by the dissonance of the
horizontal intervals.
2.3.4 Vertical Intervals Have Balanced Amplitudes

As Levitin points out above in Section 1.5.1 "Timbre: Systematic Distortions from the Ideal Harmonic Series",
real instruments can systematically produce overtones at amplitudes different from those of the ideal
Harmonic Series; one such instrument is the clarinet which emphasizes the odd overtones. Again however,
these distortions of the ideal Harmonic Series affect these horizontal and vertical intervals differently:
Horizontal intervals are sometimes made by a pair of tones having unbalanced amplitudes: for
example, with the clarinet the ratio of an odd overtone to an even overtone will be an interval between a
loud tone and a soft tone.
Vertical intervals are always made by a pair of tones having balanced amplitudes: again, the
amplitude variations are systematic, so the tones that are paired up vertically will have the same
amplitude variations.
2.3.5 Vertical Intervals Are All The Same Ratio

Further, these two kinds of intervals are going to show up very differently to the relative pitch detector:
Horizontal intervals are only one of each kind, a Whitman's Sampler: while there is sweetness in
one voice, especially that of a trained singer, as in the horizontal intervals of that voice there is one
instance of each interval of the Harmonic Series (albeit with the fudge we mentioned above of
horizontal intervals).
Vertical intervals are all of the same kind, an entire box of chocolate almond cherry: on the other
hand when two voices are sung, say, a Fifth apart, there is an entire wall of the same kind of sweetness,
a wall of many Fifths coming at you, namely the vertical intervals above, each of which is a Fifth.
(Again, for an introduction to musical intervals such as the Fifth, see Section 3.1 "The Major Triad".)
2.3.6 Harmony is Sweeter Than Sweet

Therefore we see that note ratios induce a set of the same tone ratios. Further these tone ratios
are pure, have balanced amplitudes, and are all of the same interval.
This harmonic effect works best if the two notes of an interval are played on the same instrument having
therefore the same distortions from the ideal Harmonic Series. My Men's Chorale teacher Bill Ganz told us
that to have our voices harmonize, we should sing the same vowels, which supports this theory as the same
vowels will have closer timbres [Ganz, c. fall 1991] (Bill says this is a known effect, not something he
independently observed; a cursory search does not produce a better reference, so I cite him). Notice that this
effect allows instruments making tones that are not anywhere near the Harmonic Series to still harmonize with
each other (at least up to a point where the horizontal intervals interfere too much; see the point about
[acoustical-demo, Demo 31] in Section 2.3.3 "Vertical Intervals Have Pure Ratios").
The wall of vertical intervals hammers the same relative pitch sensor with a wall of the pure interval, which is one of the
features that the cartoon-physics ideal Harmonic Series recognizer of your brain is looking for. Recall from the
introduction to Section 2 "Living in a Computational Cartoon" the effect of pantyhose making a leg look
rounder than round; again more on this effect in Section 2.5 "Recognition: Feature Vectors". Harmony is
sweeter than sweet. It's impossibly sweet -- impossible for one voice anyway -- which is just what the theory
predicts.
2.4 Interestingness: Just Enough Complexity

Anticipation and prediction are among the fundamental operations of the brain. We suggest that there is an art to
balancing simplicity and complexity: if understanding and predicting a storyline are too easy, then it is
boring, and if too hard, then it is noise, but if just right, then it is interesting. As we discuss below, (1) simplicity
comes from data having a "theme" and (2) ambiguity is the absence of a single explanation or theme and
therefore a good way to rapidly produce complexity. See Section 7.2 "The Role of Narrative Generally" for
how theme and ambiguity are unified to make narrative.
2.4.1 The Simplicity of Theme

People frequently experience that, before receiving information, having an expectation as to the context of
that information, its theme, helps considerably in the processing of it. For example, people who speak more
than one language sometimes have the experience of hearing words (1) in a language that they know, but (2)
that they were not expecting, and therefore not understanding those words until they "listen" to them again in
their mind from within the context of the language in which those words were spoken. There are myriad
examples of context influencing how something occurs to someone.
Surprise Reduction: The technical name for the amount of expected information one gets from a situation is
the entropy [ent] [Wilkerson-entropy]. Some call the entropy of a measurement the amount of surprise one
expects to get out of it. Clearly, if one knows more about what to expect in a situation, the amount of surprise can
be greatly reduced. Since it is work to process information, we suggest that the brain likes to have reliable
expectations in order to minimize the amount of surprise it is dealing with all day.
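As a minimal sketch of surprise reduction, the following computes the entropy, in bits, of two made-up probability distributions over four outcomes: one with no expectation at all, and one where a "theme" makes a single outcome very likely. The function name `entropy_bits` and both distributions are assumptions chosen for illustration only.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: the expected surprise of one observation."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# No expectation at all: four equally likely outcomes.
no_theme = [0.25, 0.25, 0.25, 0.25]
# A strong expectation (a "theme"): one outcome is very likely.
with_theme = [0.85, 0.05, 0.05, 0.05]

print(entropy_bits(no_theme))    # 2.0 bits
print(entropy_bits(with_theme))  # well under 1 bit
```

With the theme in place, the expected surprise drops to less than half of what it was.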
Model Inference: Life is full of situations where we may observe the consequences of a situation but are not
told explicitly what is the state of the situation. There is nothing left to do but to infer a model of the state of
affairs from observation of many details, and therefore inference is likely a constant activity of the brain. For
example, people often infer the rules of a game from observation and without reading the rules.
Have you ever seen someone color-coordinate their clothes or even their room? Have you ever been to a
"theme party" where everyone was to dress and act from a given era or situation? How about a "theme
restaurant" or "theme park"? Having a theme for all of the elements of a given situation
(surprise reduction) reduces the amount of new information or "surprise" that each one introduces, and
(ease of inference) allows the brain to construct a whole from the parts.
Differences and changes are interesting to the brain, but too much difference fails to feel "unified" -- it does
not all occur as parts of a whole. In support of both surprise-reduction and ease of inference, we consider it
likely that
Conjecture Seven: The brain wants input to have a theme. That is, the brain both infers themes
from input and uses themes as context when processing input.
2.4.2 The Complexity of Ambiguity

Research on parsing of sentences suggests that one of the major functions of the brain is to disambiguate
ambiguous and incomplete input. In "From Molecule to Metaphor", Jerome Feldman, both a Computer and
Cognitive Scientist, points out how much of the brain's processing of sentences is devoted to disambiguation
and how easy it is to tease the brain by using ambiguous inputs that resolve in an unusual way. From
[Feldman2006, p. 307, 308] (unconventional spacing in the original):
Please read the following sentence aloud slowly, word by word:
The
horse
raced
past
the
barn
fell.
Sentences like these are called garden-path sentences because, in slow reading, we often notice
that we have followed an analysis path that turned out to be wrong....
But why are people surprised in garden-path situations? The brain is a massively parallel
information processor and is able to retain multiple active possibilities for interpreting sentence,
scene, and so on. Well, there must be a cutoff after which some possible interpretations are
deemed so unlikely as to be not worth keeping active. The final piece of their [referring to a model
given by other researchers] model was an assumption that a hypothesis was abandoned if its belief
net score was less than 20% of that of its rival. We experience surprise when the analysis needed
for a full sentence is one that was deactivated earlier as unlikely. This is a complex computational
model, but nothing simpler can capture all the necessary interactions.
The input the brain gets as we live life is inherently and often wildly ambiguous. Alternatives multiply and so
the number of possible ambiguities in a situation can easily grow exponentially. No machine can keep up with the
demands of a problem the size of which grows that fast. Therefore:
Much of the brain is a massive disambiguation engine that is running all the time and is
functioning at its computational limit.
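The cutoff rule mentioned in the quotation above can be sketched in a few lines. This is only an illustration of pruning at 20% of the best score, not the belief-net model Feldman describes; the function `prune`, the two readings, and their scores are all hypothetical.

```python
def prune(hypotheses, cutoff=0.20):
    """Keep only hypotheses scoring at least `cutoff` times the best score."""
    best = max(hypotheses.values())
    return {name: s for name, s in hypotheses.items() if s >= cutoff * best}

# Hypothetical scores after reading "The horse raced past the barn".
readings = {
    "raced is the main verb": 0.90,
    "raced starts a reduced relative clause": 0.10,
}
active = prune(readings)

# "fell" only fits the reduced-relative reading; if that was pruned, surprise.
needed = "raced starts a reduced relative clause"
surprised = needed not in active
print(active, surprised)
```

With these made-up scores the reading that "fell" requires has already been dropped, which is the garden-path experience.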
Jokes are often of the form of an ambiguity of contexts/themes resolved by a punchline which evaluates one
way in one context and another way in the other context (say true in one and false in the other); the story that
precedes the punchline serves to amplify the weaker context, the weaker side of the ambiguity, so as to
maximize the punch of the line by making it break symmetry between two almost equal contexts/themes. Story
plots are often of this form as well, in particular mysteries. The language of Shakespeare is full of double
meanings and even perhaps a triple meaning here and there. These are all to the same purpose:
Conjecture Eight: The brain enjoys having its disambiguation engine teased.
2.5 Recognition: Feature Vectors

I need to introduce yet another computational idiom: the feature vector [feat]. It is actually a completely
straightforward idea that you already use every day. Think of how you summarize a thing when you post an
online ad to sell it. Suppose you are selling a car. You might very well put in the ad the total volume of the
cylinders in the engine. You probably won't list the number of bolts in the engine. You probably will list how
many miles the engine has driven. You probably will not list the number of hours the radio has been on (even if
you knew it). The point is that
Humans naturally abstract; that is, they retain the features that are important for a given
purpose and discard the rest.
All language is abstraction. Suppose I point at a chair and I say "what is that?" You say "that is a chair." I say
"are you telling the complete truth?" You say "yes!" I lean down and look very closely and I say "yea, but you
didn't mention this little scratch down here...." You roll your eyes in annoyance.
An abstraction is a reduced amount of information that still serves the purpose. In the context of recognizing a
thing as a member of a class, an abstract adjective is called a "feature". Usually there is more than one, so we
collect them together into a "vector", which just means a list where the elements are not interchangeable (that
is, you can't swap the mileage and the year of a car without severely changing the meaning of the car ad).
Once we have described a class of inputs as a vector of features, we have a clear algorithm for recognizing a
thing as being a member of that class:
1. Whenever we encounter a thing, for each feature (in parallel), check if that feature is present.
2. If all (or most) of the features in the vector are present ("fire"), then recognize the thing as being in the
class abstracted by the feature vector ("fire" the whole recognizer).
Note that the second part above which looks for the conjunction of features may be realized by a more
sophisticated mechanism than a simple AND gate that just fires its output when all of its inputs have fired: a
simple conjunction mechanism would be too "brittle" in the face of the noisy input of the real world. For
example, even plants such as the Venus Fly Trap can compute a rather sophisticated conjunction of features
before recognizing a fly [venus-fly]:
The trapping mechanism is so specialized that it can distinguish between living prey and non-prey
stimuli such as falling raindrops; two trigger hairs must be touched in succession within 20
seconds of each other or one hair touched twice in rapid succession, whereupon the lobes of the
trap will snap shut in about 0.1 seconds.
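The two recognition steps, and the softer-than-AND conjunction just discussed, can be sketched as follows. The feature names, the car ad, and the `required_fraction` parameter are assumptions made up for this illustration; a real recognizer would likely weight features rather than simply count them.

```python
def recognize(feature_tests, thing, required_fraction=1.0):
    """Fire the recognizer if enough of the feature tests fire on `thing`."""
    fired = [test(thing) for test in feature_tests.values()]
    return sum(fired) / len(fired) >= required_fraction

# Hypothetical feature vector for the class "car ad worth a look".
car_features = {
    "has_engine_size": lambda ad: "engine_litres" in ad,
    "has_mileage": lambda ad: "miles" in ad,
    "has_year": lambda ad: "year" in ad,
}

ad = {"engine_litres": 2.0, "miles": 60000}   # the year is missing

strict = recognize(car_features, ad)                        # simple AND gate
soft = recognize(car_features, ad, required_fraction=0.6)   # "most" features
print(strict, soft)
```

The strict AND gate rejects the ad because one feature is missing, while the softer conjunction still recognizes it.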
Recall that in the case of virtual pitch, the feature recognition mechanism seems to find the greatest common
divisor of the tones presented; that is, this recognizer uses a special holistic property of this particular set of
features in order to work well in the face of missing features. Recall that a timbre amounts to the systematic
absence of parts of the ideal Harmonic Series and that real sounds (in particular, voices) exhibit a range of
timbres; thus the Harmonic Series recognizer must be able to robustly find the fundamental even when some of
the tones are missing. See Section 2.1.1 "Virtual Pitch: Hearing the Harmonic Series Even When it is Not
There", Section 2.1.2 "Using Greatest Common Divisor as the Missing Fundamental", and Section 1.5.1
"Timbre: Systematic Distortions from the Ideal Harmonic Series".
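Here is a minimal sketch of that greatest-common-divisor idea, under the simplifying assumption that tone frequencies are exact whole numbers of Hz (real hearing must cope with noise and mistuning, which this ignores). The function name `missing_fundamental` and the example frequencies are chosen for illustration.

```python
from functools import reduce
from math import gcd

def missing_fundamental(tone_frequencies_hz):
    """Greatest common divisor of integer tone frequencies, in Hz."""
    return reduce(gcd, tone_frequencies_hz)

# Harmonics 2, 3, and 4 of 200 Hz, with the fundamental itself absent.
print(missing_fundamental([400, 600, 800]))   # 200
# Dropping another harmonic (a different "timbre") leaves the answer unchanged.
print(missing_fundamental([600, 800]))        # 200
```

Removing tones changes the timbre but not the fundamental that is recovered, which is the robustness described above.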
2.5.1 Soft Computing

Machines are good at crisp, mechanical behavior, such as adding huge lists of numbers. This is fun for a while,
but it can get old.
I don't often need huge lists of numbers added, but I really would like to go to an online auction
site and find a car that is "sort of" like my ideal car which I might be willing to describe.
You will notice the use of the non-crisp or "soft" phrase "sort of" in the previous problem specification. Some
people try to get machines to do this sort of soft reasoning that humans do so well. It can sometimes be done,
at least within a very constrained context of, say, shopping for cars or plane tickets. Such a discipline is called
Artificial Intelligence or Soft Computing or Machine Learning or Statistical Inference, depending on exactly
how one goes about it and who is providing the research funding. The important thing for us is that describing
problems using feature vectors is a very general and widely used technique. Recalling our conjecture that
computational laws are universal, we would not find it surprising if
Conjecture Nine: The brain uses feature vectors for recognition.
2.5.2 False Recognition

To get the brain to (1) have the experience of the presence of a thing, it is not necessary to (2) present the
actual thing to the brain. It is enough to just present anything that fires the feature vector in the brain assigned
to recognize that thing. That is, if I want you to think "hamburger" I don't have to show you a hamburger, only
a picture of one. Recall the example from Section 2 "Living in a Computational Cartoon" of pantyhose making
a leg look rounder than round, impossibly round.
It is pretty easy to tell the difference between a photograph of something and the thing itself: you wouldn't
accidentally eat a photograph of a hamburger. Yet at the same time the picture definitely says "hamburger" to
your brain, often strongly enough that you are willing to part with some money to have a real one right now!
But it gets even weirder.
Have you ever seen a cartoon of a person that looks more like the person than the person does?
Some political cartoonists are very good. They
1. pick some very unusual features of the person, and then
2. exaggerate those features.
Amazingly, what can result is something that looks more like the person than the person. From [Harmonart-brain]:
As someone who has worked in pen and ink for decades, cartoonist Jules Feiffer realizes that
"what we see is often quite divorced from what is actually there," he noted. He calls the
two-dimensional representations metaphors, noting that "the metaphor is often more
understandable than the real thing."
And research on the perception of faces reveals that the human brain and individual neurons are
tuned to extreme representations, explained Margaret Livingstone, a professor of neurobiology at
Harvard Medical School. Her research has shown that people are much quicker to recognize
caricatures of people than documentary photographs, showing how the brain at work prizes the
representative over the more factual.
From "What Caricatures Can Teach Us About Facial Recognition" [Austen-caricature] :
At the University of Central Lancashire in England, Charlie Frowd, a senior lecturer in
psychology, has used insights from caricature to develop a better police-composite generator. His
system, called EvoFIT, produces animated caricatures, with each successive frame showing facial
features that are more exaggerated than the last. Frowd's research supports the idea that we all
store memories as caricatures, but with our own personal degree of amplification. So as an
animated composite depicts faces at varying stages of caricature, viewers respond to the stage that
is most recognizable to them. In tests, Frowd's technique has increased identification rates from as
low as 3 percent to upwards of 30 percent.
. . . "A lot of people think that caricature is about picking out someone's worst feature and
exaggerating it as far as you can," Seiler says. "That's wrong. Caricature is basically finding the
truth. And then you push the truth."
The features can be anything that is important to the task of recognizing that person: a nose or lip shape, etc. --
technically, such a feature carries a lot of "information". A good example of this that I remember was a yellow smiley
face that had a red blotch on its forehead -- everyone knew it was Mikhail Gorbachev [gorb].
While a thing may induce a feature vector in the brain for later use in recognizing the thing, some
other things will also fire that vector, causing artificial recognition.
2.5.3 Cubism: Partial Recognition Due to Redundant, Over-Determined Feature Vectors

There is no rule that says that the features in a feature vector must be independent, that for every subset of
features, there is some input that will fire those features and not the others. If the brain is doing all it can to
recognize things as fast and cheaply as possible, it is going to use the most effective features it has and some
redundant / over-determined sets of features can easily arise.
Hmm, what would you experience if some but not all of the features in a vector were to fire? Note that, while there
may be no natural input that can cause this, that does not mean that there is no such art-ificial input. This leads to
interesting phenomena that can be exploited by artists.
Cubism is a form of art from the early 20th century that has a certain particular quality:
the parts of an object may be rendered reasonably faithfully so that one recognizes them,
however they do not arrange into a whole in a coherent way.
This produces an interesting effect:

we recognize the object, as the features we require for recognition do fire,
although we still have an overall feeling that we are not seeing the thing in its natural form, but instead
in a disturbed or unhappy or dreamy state.
You may say "Of course it looks disturbed! It's all messed up!" But think for a moment: if it is all messed up,
how is it that it looks like anything at all? Again, per Section 2.5.2 "False Recognition", because the features
are present.
Consider Picasso's "Head of a Woman" [Picasso1938] on the right. One eye is in profile and the other is
straight ahead, a physical impossibility. Yet we have no trouble at all instantly recognizing a woman.
3 Harmonic Music Explained

What can we make of all of this? Do the above insights into physics and computation yet provide enough
information for us to derive something that we recognize as music? For example, can we compute a set of
notes that will sound good when played together?
Recall the observation of Section 2.3.2 "Harmony Induces Two Kinds of Intervals: Horizontal Within the Note
and Vertical Across the Notes" that two notes induce parallel vertical series-es of overtones all of the same
ratio; this means that note ratio and tone ratio are intimately connected. That is, from now on, when speaking of
two notes that are in a ratio, what we really mean is that the overtone series-es of the two notes make two
series-es of vertical tones having that ratio. From now on we will omit reiterating this point and simply speak
of "the ratio of two notes making an interval within the Harmonic Series".
3.1 The Major Triad

Let's try the simplest thing we can that will generate notes that the brain wants to hear together (recall from
Section 2.1 "Searching for Harmonics" how much the brain wants to hear the Harmonic Series):
find the ideal Harmonic Series induced by, say, Middle C,
map it into one Octave by dividing by two whenever necessary,
replace tones with notes as, again, these notes will induce the same (vertical) intervals as the tones.
Note that in Scientific Pitch Notation [sci-pitch] the particular Octave that contains Middle C is "Octave 4",
the next Octave up is "Octave 5", etc., where we increment the octave number each time we cross the note C.
Starting at C4 the sequence of notes we get is as follows.
Factor of 1: The fundamental: C4.
Factor of 2: This is just C5 (up one Octave); dividing by 2 gives us 1 times C4 = C4 again, so no new
note in the collection.
Factor of 3: This is G5; dividing once by 2 gives us 3/2 times C4 = G4. This is the first really "interesting"
different note.
Factor of 4: This is C6; dividing twice by 2 gives us 1 times C4 = C4 again, which we have already in
our collection.
Factor of 5: This is close to E6; dividing twice by 2 gives us 5/4 times C4 = E4. Ah, another new and
"interesting" note.
Factor of 6: This is G6; dividing twice by 2 gives us 6/4 times C4 = 3/2 times C4 = G4, which we
already have in our collection.
Let's stop here. (We stop at harmonic 6 in particular for a reason that will become clear later.) The starting
tone/note of Middle C is arbitrary, but the ratios we get, namely 1, 5/4, and 3/2, times the fundamental, are
not. Three notes in these ratios are called "The Major Triad". There are standard names for these notes
(relative to the fundamental): reordering them from the harmonic order above to their numeric order when
folded down into one Octave, the first note (1) is called the "Root", the second (harmonic 5, so in this Octave
5/4) is called the "Major Third", and the third (harmonic 3, so in this Octave 3/2) is called the "Perfect Fifth"
(!). The weirdness of musical nomenclature is just beginning. Note further that it is unclear which terms should
be capitalized; we treat as proper nouns any illusory Platonic ideal objects created by the mind: "Harmonic
Series", "Major Triad", etc. Again, these names of the intervals reflect their position in the Major Scale,
described below, and as you can see, confusingly do not correspond to their order in the Harmonic Series.
Major Triad
Root (harmonic 1):          1   = 1.0.
Major Third (harmonic 5):   5/4 = 1.25.
Perfect Fifth (harmonic 3): 3/2 = 1.5.
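The folding procedure above can be checked mechanically. The following is a small sketch using exact fractions; the helper name `fold_into_octave` is an assumption introduced here, not terminology from the text.

```python
from fractions import Fraction

def fold_into_octave(ratio):
    """Divide or multiply by 2 until the ratio lies in [1, 2)."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# Harmonics 1..6, folded into one Octave, keeping first appearances only.
triad = []
for harmonic in range(1, 7):
    folded = fold_into_octave(harmonic)
    if folded not in triad:
        triad.append(folded)

print([str(r) for r in triad])          # harmonic order: ['1', '3/2', '5/4']
print([str(r) for r in sorted(triad)])  # numeric order:  ['1', '5/4', '3/2']
```

The first list is the order in which new notes appear in the Harmonic Series; the second is the Root, Major Third, Perfect Fifth order used in the table.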
Note that according to our measure of interestingness in Section 2.4 "Interestingness: Just Enough
Complexity", the intervals of the Fifth (factor of 3) and the Third (factor of 5) are, in that order, the most
interesting intervals: (1) they are in the theme of the Harmonic Series, while also (2) they have some
complexity resulting from not being a simple power of two times the Root (which if they were would make
them subject to the octave effect tending to make two notes sound like one; that is, a factor of 2 is too boring).
If we pick C as the Root (as we did above) then the resulting Major Triad is called the "chord" of C Major. The
starting note of "C" was arbitrary; however the resulting triad was not. Is it so surprising that this Major Triad
is everywhere in music? It sounds rather nice to play notes in the C-Major Triad; try it. However, after a while
it is a little boring, so we would like to add some variety. How little complexity can we add and yet still change
something?
3.2 The Major Scale

The Major Scale [maj] is so fundamental to Western music that it is even "built into" the notation (the Major
Scale is sometimes called the Diatonic Scale, although the term "Diatonic" seems to mean different things
depending on who you ask, therefore instead I use the less ambiguous term "Major"): if you play notes by
going up the white keys of a piano keyboard one step at a time, which is the same as going up the alternating
lines and spaces of an unadorned musical score, you are playing the C Major Scale. Is this Major Scale
arbitrary or is it somehow fundamental to the way the brain hears? If it is fundamental, we should be able to
derive such a thing simply from first principles as we suggested in the introduction. Let's try it.
3.2.1 Interlocking Triads

Well, we like the Major Triad, so let's make another one, but starting with a different note as the fundamental.
To preserve as much theme as possible with the previous triad, let's start with the "closest" note to the C that we have in
our first triad: The first note other than C that we hit was 3/2 times the Root, also called the Perfect Fifth;
therefore let's build a triad using 3/2 times C4 = G4 as the fundamental. Let's remember to divide by 2 when
necessary to keep everything within the same Octave.
Major Triad Up by a Perfect Fifth
Root:          3/2 * 1   = 3/2  = 1.5.
Major Third:   3/2 * 5/4 = 15/8 = 1.875.
Perfect Fifth: 3/2 * 3/2 = 9/4, which is bigger than 2, so divide by 2, giving: 9/8 = 1.125.
Ok, that was so much fun let's go in the other direction as well. That is, let's make yet another Major Triad
where the Perfect Fifth of that Triad is the Root of our first Triad. That means multiplying by 1/(3/2) =
2/3; therefore let's build a triad using 2/3 times C4 = F3 as the fundamental. Let's be sure to multiply by 2
when necessary to keep everything within the same Octave. (Note that throughout we use "~" (tilde) to mean
"almost equals".)
Major Triad Down by a Perfect Fifth
Root:          2/3 * 1   = 2/3, which is smaller than 1, so mult by 2: 4/3 ~ 1.333.
Major Third:   2/3 * 5/4 = 5/6, which is smaller than 1, so mult by 2: 5/3 ~ 1.666.
Perfect Fifth: 2/3 * 3/2 = 1 = 1.0.
Note that the selection of three interlocking triads is suggested by our measure of interestingness from Section
2.4 "Interestingness: Just Enough Complexity". That is, using three overlapping Major Triads (1) maximizes
the theme of the Harmonic Series while not requiring any harmonics beyond harmonic 5 (the interval called
the Third), while also (2) having some complexity by not all being of one Harmonic Series.
Now we have three "interlocking" Triads: the Perfect Fifth of one is the Root of the next. How many notes is
that? Three notes per triad times three triads is nine notes; however two of the notes where the triads interlock
are counted twice, so there are 3 * 3 - 2 = 7 unique notes. Let's plot them on a line to see how far they are
from one another.
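The count of seven unique notes can be confirmed with a short sketch using exact fractions. The names `MAJOR_TRIAD`, `fold_into_octave`, and `triad_on` are helpers invented for this illustration.

```python
from fractions import Fraction

MAJOR_TRIAD = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]

def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def triad_on(root):
    """The Major Triad built on `root`, folded into one Octave."""
    return [fold_into_octave(root * step) for step in MAJOR_TRIAD]

# Down a Fifth, the base triad, up a Fifth.
roots = [Fraction(2, 3), Fraction(1), Fraction(3, 2)]
all_notes = [note for root in roots for note in triad_on(root)]
unique_notes = sorted(set(all_notes))

print(len(all_notes), len(unique_notes))   # 9 7
print([str(n) for n in unique_notes])
# ['1', '9/8', '5/4', '4/3', '3/2', '5/3', '15/8']
```

The two duplicates that disappear are exactly the interlocking notes: 1 (shared by the lower and base triads) and 3/2 (shared by the base and upper triads).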
3.2.2 Using Logarithms to Visualize Distances Between Tones/Notes

Wait... before we do that, when plotting notes, such a plot should "mean something to us". As we saw above,
what makes sense would be for the ratios of the notes to have some regularity; that is the multiplicative ratios
of frequencies is what our brain is listening to, not the additive distances. For this plot to mean something, we
would want equal ratios to show up equally on the plot. How do we turn (multiplicative) ratios into (additive)
distances?
The function that does this is called the "logarithm" (or just "log") [log]. Explaining it is beyond the scope of
this article, but basically if you count someone's income by how many digits they have in it, you are already
familiar with logarithms. In general, a "six-figure income" is ten times that of a "five-figure income" (though a
high five-figure income is usually pretty close to a low six-figure income, so the example isn't perfect). That is,
by counting the number of figures, you have turned a (multiplicative) ratio of a factor of ten into an
(additive) increment of one figure of income. You see,
After going through the logarithm, multiplicative factors turn into additive increments.
The logarithm we want turns factors into increments in the same way as the income example, except that we
care about factors of 2 instead of 10, so we take logs "base 2" (instead of "base 10"): each multiplicative factor
of 2 will be displayed as one unit of additive increment in the graph called an Octave. This means that the
relative pitch of specific tone (or note) ratios shows up as specific distances between tone (or note)
logarithms.
You can skip the next little section if your eyes are glazing over. All you need to remember is that on a
logarithmic scale a multiplicative factor looks like an additive increment.
Some Technical Details on Logarithms and Exponents
*** Feel free to skip this section! ***
The "log base 2 of x" is usually denoted something like "log_2(x)"; to avoid visual clutter, we omit the
parentheses around "x" when the meaning is unambiguous, writing "log_2 x". When computing the ratio of
two logs, as we do in Section 3.5.3 "Chords from the Harmonic Series", the bases of the logs don't matter (they
cancel out) and so we omit them, writing simply "log x / log y".
We denote the exponential, raising a base, b, to a power, y, as "b^y". (Note that your browser may or may not
render the y as a superscript; in case it does not, I have also redundantly retained the "^" character.) The
logarithm and exponential are inverses, so
b^(log_b x) = x and
log_b(b^y) = y
You can use exponentials to think about logarithms. When computing the ratios of numbers, imagine each
number represented as an exponential of a base, such as 2. Now think about what multiplication and division
of that number do to the exponent as the numbers are divided or multiplied. That is, if we think of taking the
ratio of two numbers represented as exponents we see that we are just subtracting their exponents (the
logarithms of the original numbers); that is,
2^p / 2^q = 2^(p-q).
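For readers who prefer to see this numerically, here is a brief sketch using two of the ratios derived earlier (the Fifth, 3/2, and the Third, 5/4). The variable names and the choice of p and q are arbitrary choices made for illustration.

```python
import math

fifth = 3 / 2
third = 5 / 4

# Multiplying ratios adds their logarithms.
product_log = math.log2(fifth * third)
sum_of_logs = math.log2(fifth) + math.log2(third)
print(round(product_log, 3), round(sum_of_logs, 3))   # both about 0.907

# One Octave (a factor of 2) is exactly one unit on the log_2 scale.
print(math.log2(2))   # 1.0

# 2^p / 2^q = 2^(p-q), checked for one choice of p and q.
p, q = 5, 3
print(2**p / 2**q == 2**(p - q))   # True
```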
3.2.3 The Keyboard Revealed

We now plot all of the notes of the three above-derived interlocking triads. Due to the phenomenon of relative
pitch, we express each note as a ratio to the Root of the base Triad. Due to the phenomenon of octaves, we
multiply or divide by 2 to keep all the notes within one Octave. Again, since we want the plot to mean
something, we take logarithms before we plot so that same multiplicative ratios map to same additive
increments. We use 2 as the base of our logarithm so that a factor of one Octave, or 2, corresponds to an
additive increment of 1, so all numbers will be between 0 (inclusive) and 1 (exclusive).
To review, the three fractions multiplied to obtain the fraction for the note are in order:
which Major Triad: 1/1 for C, 3/2 for up a Fifth, 2/3 for down a Fifth,
which element of that Major Triad: 1/1 for Root, 5/4 for the Third (harmonic 5), 3/2 for the Fifth
(harmonic 3),
multiply or divide by more factors of 2 to keep the result in the same Octave.
Multiplying all of that together and taking log base 2 we get the following:
Base Major Triad
Root:          (1/1) * (1/1) * (1/1) = 1/1;   log_2 1/1  ~ 0.000.
Major Third:   (1/1) * (5/4) * (1/1) = 5/4;   log_2 5/4  ~ 0.322.
Perfect Fifth: (1/1) * (3/2) * (1/1) = 3/2;   log_2 3/2  ~ 0.585.
Major Triad up by a Perfect Fifth
Root:          (3/2) * (1/1) * (1/1) = 3/2;   log_2 3/2  ~ 0.585.
Major Third:   (3/2) * (5/4) * (1/1) = 15/8;  log_2 15/8 ~ 0.907.
Perfect Fifth: (3/2) * (3/2) * (1/2) = 9/8;   log_2 9/8  ~ 0.170.
Major Triad down by a Perfect Fifth
Root:          (2/3) * (1/1) * (2/1) = 4/3;   log_2 4/3  ~ 0.415.
Major Third:   (2/3) * (5/4) * (2/1) = 5/3;   log_2 5/3  ~ 0.737.
Perfect Fifth: (2/3) * (3/2) * (1/1) = 1/1;   log_2 1/1  ~ 0.000.
Note that this may look more complicated than it really is. All that is going on numerically is playing with
factors of 2, 3, and 5, in a rather systematic way, as follows:
The three Triads are each a factor of 3 (a "Fifth") apart.
Within each Triad, we have a factor of 3 (a "Fifth") and a factor of 5 (a "Third") from the Root.
We multiply or divide by 2 (an "Octave") enough times to keep everything in one Octave.
Notice that the importance of the numbers 2, 3, and 5 is not uniform: 3 is used most prominently, 5 is more
secondary, and, going the other direction, 2 is so boring we just throw it in wherever we like. This reflects our
observation from Section 3.1 "The Major Triad" that, of the different harmonics, a factor of 3 seems to have the
right amount of complexity to be most interesting, so it gets top billing (see Section 2.4 "Interestingness: Just
Enough Complexity" for more on interestingness in general).
Sorting and Plotting the Three Triads on One Line

Now we sort the logarithms and remove duplicates. Let's give them letter names and for some strange reason
let's start at C instead of A and wrap around.
C: log_2 1/1  ~ 0.000.
D: log_2 9/8  ~ 0.170.
E: log_2 5/4  ~ 0.322.
F: log_2 4/3  ~ 0.415.
G: log_2 3/2  ~ 0.585.
A: log_2 5/3  ~ 0.737.
B: log_2 15/8 ~ 0.907.
Now let's plot them on the unit interval to within 0.02 units.
C       D       E    F       G       A       B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0
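The plot above can be regenerated from the ratios. This is a sketch that assumes one character column per 0.02 units of log_2, so the unit interval spans 51 columns; the `column` helper and the layout are assumptions made for this illustration.

```python
import math
from fractions import Fraction

scale = {
    "C": Fraction(1), "D": Fraction(9, 8), "E": Fraction(5, 4),
    "F": Fraction(4, 3), "G": Fraction(3, 2), "A": Fraction(5, 3),
    "B": Fraction(15, 8),
}

def column(ratio, step=0.02):
    """Column of a note on the plot: log_2 of its ratio, in units of `step`."""
    return round(math.log2(float(ratio)) / step)

row = [" "] * 51
for name, ratio in scale.items():
    row[column(ratio)] = name
row[50] = "C"   # the Octave, a factor of 2, closes the interval

plot_row = "".join(row)
axis = "+----" * 10 + "+"
print(plot_row)
print(axis)
```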
Hmm, now that's interesting, if we were to fill in a few gaps they would look almost evenly spaced. Following
music theory, we'll call the big gaps "tones" (yes, this is a different meaning of the word "tone") and the small
gaps "semi-tones". (This meaning of "tone" and "semi-tone" will not occur very often, and below I try to use
only semi-tone.) I'll fill in the big gaps with a hash sign (I'll omit computing any exact values for them) so all
the gaps are now semi-tones.
C   #   D   #   E    F   #   G   #   A   #   B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0
Does that look familiar? If not, color the letters white and the hashes black and look again at the picture of the
keyboard at the top of the article.
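To see the big and small gaps in numbers rather than by eye, here is a sketch that computes the ratio between each pair of adjacent notes and its size on the log_2 scale. The threshold of 0.12 separating "big" from "small" is an assumption picked for this illustration, as are the labels "B" and "s".

```python
import math
from fractions import Fraction

scale = [Fraction(1), Fraction(9, 8), Fraction(5, 4), Fraction(4, 3),
         Fraction(3, 2), Fraction(5, 3), Fraction(15, 8), Fraction(2)]

# Ratio between each note and the next one up.
gaps = [upper / lower for lower, upper in zip(scale, scale[1:])]
for gap in gaps:
    size = math.log2(float(gap))
    kind = "big" if size > 0.12 else "small"
    print(f"{str(gap):>5}  log_2 ~ {size:.3f}  {kind}")

# B = big gap (gets a hash), s = small gap (does not).
pattern = "".join("B" if math.log2(float(g)) > 0.12 else "s" for g in gaps)
print(pattern)   # BBsBBBs
```

Only three distinct gap ratios occur (9/8, 10/9, and 16/15), and the two small ones fall between E and F and between B and C, exactly where the keyboard has no black key.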
Notice that there was no resorting to the following arguments:
"Because the Ancient Greeks did it this way."
"Because if the notes were equally spaced your ear would lose its place."
"Because your culture has trained these notes into your ear since you were a baby."
The result emerged naturally just from some physics and some computer science.
After I made this derivation of the Major Scale from first principles, a friend of mine, Peter McCorquodale,
pointed me to "Aesthetic Measure" by George D. Birkhoff [Birkhoff1933]. On page 92 in the section "The
Natural Diatonic Scale", Birkhoff independently makes the same derivation of the Major Scale as we do
above, albeit providing less detail and with no motivation from computer or brain science. Given the Major
and Minor Triads, Helmholtz also seems to give the same theory of a key as interlocking chords
[Helmholtz1863, p. 300] as we do above, though we argue below that he fails to explain how it is that we find
the Major and Minor Triads compelling to listen to in the first place.
While my derivation of the Major Scale above is therefore not a completely original contribution, it is also
certainly not well known. It was quite an effort for me to invent it, given my starting point of nothing but
curiosity about the problem and disgust with all the books to which I had access. How is it that even music
majors in college do not know this derivation of the Major Scale?
3.3 Scales and Keys
The notes above are known as the 12-(Semi-)Tone Western (Chromatic) Scale (you will hear people call it the
"12-tone Western Scale", including in quotations below). The subset of lettered (or white) keys, omitting the
hashes (or black keys), is called the Major Scale.
We can now explain some standard musical terminology. In the particular case of the C-triad, the note E is
called the Major Third, as it is the third white key in the Major Scale. Similarly, the note G is called the Perfect
Fifth, as it is the fifth white key. Deep huh? We defer discussion on how it is that one is called "Major" and the
other "Perfect" until the section on Equal vs. Just Tuning below.
Explaining all musical conventions is beyond the scope of this article, but I mention a few basic ones we will
need. Going up a step is called "sharp", denoted "#", and down "flat", denoted "b", so we now have two names
for each black key; for example the black key between C and D is both "C#" and "Db".
The "key" of the scale is the Root note of what we called the base triad (the one in the middle of the three
interlocking triads); that is, in the example above the key was C Major. It turns out that there are ways to
build a scale other than locking together those three Major Triads (F, C, G).
1. We could use a note other than C as the base of the center triad.
2. We could use another kind of triad other than the Major Triad; we haven't talked about that yet.
The name of the key indicates which choices were made for the two variables above: the one we built above
was built (1) starting at C and (2) using three Major Triads (a Major Scale), so it is called the C Major Scale.
3.3.1 Changing Key: Playing Other Groups of Triads

The three interlocking triads we came up with have the Roots C, F, and G. Lots of music uses just these three
triads; in fact
The entire genre of music called "Twelve-Bar Blues" [twelve-bb] basically uses only these three
triads!
In the hands of a skilled musician, these three triads of the Major Scale can actually be interesting for quite a
long time. I once asked my then Jazz piano teacher Ben Stolorow to give me more interesting chord
progressions to practice (so I wasn't just playing scales for hours). His response was that F/C/G was plenty
interesting enough and he demonstrated by simply arpeggio-ing these three chords (playing the notes of a
chord one at a time) while switching between different rhythms [Stolorow, c. 2006]; I remember saying "Wow,
I would pay money just to listen to that and it's just three chords!".
If you enjoy playing notes that all lie within the three interlocking triads we made, you will be just fine and
dandy with the group of notes above, which we call the Major Scale, and you will never need the black keys
on your piano. However, after a while, just as you got bored with one Major Triad, you might get bored with
three of them (though, per the above demo by my teacher, it might take longer). That is, if your melody is
playing around in one Major Scale, you might want to do the same playing around but all within a different
scale made starting with a different key note than C. Making this change is called a "key change".
For example, you might pick a key using two of the triads you have already, C and G, but making G the base
triad (instead of C as we did above) and adding one more triad based at the Perfect Fifth above G (which is
D). Uh, oh, we don't have all the notes for the D Major Triad in our C Major Scale (the white keys). You can
repeat the construction of the C Major Scale above and discover that the missing note is F#. This is how it
emerges that piano players playing in a Major Scale that is not C can still have to use some black keys.
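The construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works on keyboard positions (a Major Third is 4 semi-tones up and a Perfect Fifth is 7 semi-tones up) rather than on frequency ratios, and the names NOTE_NAMES, major_triad, and major_scale are made up for this sketch.

```python
# Note names for the 12 semi-tones, using sharps for the black keys.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def major_triad(root):
    """Root, Major Third (4 semi-tones up), Perfect Fifth (7 semi-tones up)."""
    return [(root + step) % 12 for step in (0, 4, 7)]

def major_scale(key):
    """Union of the triads a Fifth below the key, at the key, and a Fifth above."""
    base = NOTE_NAMES.index(key)
    notes = set()
    for root in (base - 7, base, base + 7):
        notes.update(major_triad(root % 12))
    # List the notes in ascending order starting from the key note.
    return [NOTE_NAMES[(base + i) % 12] for i in range(12) if (base + i) % 12 in notes]

c_major = major_scale("C")
g_major = major_scale("G")
missing = [note for note in g_major if note not in c_major]

print(c_major)   # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(g_major)   # ['G', 'A', 'B', 'C', 'D', 'E', 'F#']
print(missing)   # ['F#']
```

Making G the base triad pulls in exactly one black key, F#, which agrees with the construction described above.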
3.3.2 Key Changes Break Harmony
Adding F# to our keys can be done; however there is a worse problem. Let's compute the ratios we get if we
build a Major Triad starting at D, that is, using the notes D, F#, and A. In particular, if A is to be the Perfect
Fifth above D, then their ratio should be 3/2. (I kept so many decimal places below as the fractional part is just
too cool to omit.)
We got D as the Fifth above G, and G as the Fifth above C.
(We divide by 2 to keep it in the same Octave:)
D = (3/2) * (3/2) / 2 = (9/4) / 2
= 9/8.
We got A as the Third above F, and F as the Fifth below C.
(We multiply by 2 to keep it in the same Octave:)
A = (1/(3/2)) * (5/4) * 2 = (2/3) * (5/4) * 2 = 5/3.
A over D is therefore
A / D = (5/3) / (9/8) = (5*8)/(3*9) = 40/27 ~ 1.481.
Whereas a Perfect Fifth should be
3/2 = 1.5.
The error is therefore
(PerfectFifth - (A/D)) / PerfectFifth
= ((3/2) - (40/27)) / (3/2)
~ 0.0123456790123457
~ 1.2%.
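The arithmetic above can be checked with exact fractions. The following is a minimal sketch, not part of the original derivation; the helper into_octave is a name invented here for the "multiply or divide by 2 to stay in the same Octave" step.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)   # Perfect Fifth
THIRD = Fraction(5, 4)   # Major Third

def into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

d = into_octave(FIFTH * FIFTH)   # Fifth above G, which is a Fifth above C
a = into_octave(THIRD / FIFTH)   # Third above F, which is a Fifth below C
a_over_d = a / d
error = (FIFTH - a_over_d) / FIFTH

print(d, a, a_over_d, error)     # 9/8 5/3 40/27 1/81
print(float(error))
```

The exact error is 1/81, which is where the repeating digits 0.012345679... come from.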
So if we measure carefully, we notice that, even with the big gaps filled in, the intervals are not all exactly right
for playing another key, such as D Major. That is:
If we want to do a key change, we can try (a) just using the same piano we derived for the key of C
Major, but (b) playing whatever piano keys we find when we just move "up" a triad; that is, using the
same notes as for C Major but making the triad rooted at D.
However, if we compute the note ratios carefully, we see that the ratios for the "triad" rooted at D will
not be quite right. They also will not sound right. In fact, if we do more key changes, moving, say,
repeatedly "up" by a Perfect Fifth again (beyond D), some of the other triads will be even less right and
will start to sound really bad.
Uh, oh. Should we buy a new piano every time we change key?
3.3.3 Just versus Equal Tuning

There is no good solution to this problem, because there are more constraints than variables to play with.
Engineers call this situation "the problem is over-constrained".
In the old days, there were many partial solutions proposed for this problem [mus]. They all amounted to
fudging the actual note values so that instead of three triads sounding really right and the rest pretty "off",
some would sound less right and others less off. These different settings of the frequencies of the note values
were called different "tunings".
The original tuning we computed in order to make triads F, C, and G sound just right is called "Just Tuning"
[just] (for the key of C anyway). It made the key of C Major sound great, but, as we noticed above, also made
the other keys sound not so great. In the past composers used this fact to artistic effect, deliberately switching
keys to make the music sound worse or better. In fact if you go "up" by enough Perfect Fifths you come to a
key that turns out to be maximally bad (it can't keep getting worse because there are only 12 keys), namely the
key of F# (also known as Gb). This key sounded so bad it was called "the Wolf Key" [wolf]. Maybe people
thought it sounded like a wolf howling.
However, someone realized that if we just make all the steps equally far apart, well, it sounds sort of ok once
you get used to it. From Wikipedia: "Vincenzo Galilei (father of Galileo Galilei) may have been the first person
to advocate equal temperament...." [eqt]. Using this system we can change key and it works out ok since all
the gaps are the same size. This is called "Equal Tuning (or Temperament)". Notice that by making all the
notes equally-spaced, each semi-tone is the number such that when you multiply it by itself 12 times you get
an Octave, or factor of 2. This number is also called the twelfth root of 2, which is about 1.0594630943593, or one plus
about six percent. That is, every time you go up a semi-tone, you are adding six percent to the frequency of
the note. Below I'll call this ratio of the interval of a semi-tone "TwR2" when I want to emphasize that it is the
twelfth-root of 2.
Note that the Equal Tempered Chromatic Scale is commensurate with our measure of interestingness from
Section 2.4 "Interestingness: Just Enough Complexity". That is, these twelve notes (1) maximize the theme of
being closed under arbitrary key changes and preserve the theme of the Major Scale (as a subset of the twelve
notes), while also (2) having probably the right amount of complexity, namely twelve notes, which Levitin
points out later in Section 4.2 "But Other Cultures Have Different Musical Scales!" seems to be a human
universal for scales (my guess is that's a limit on how much complexity the human brain can handle).
Equal Tuning is not considered to be a completely good thing. First, now none of the keys are tuned really
right -- now they all sound a little "off" -- some of the sweetness is permanently gone. Second, while all the
keys sound the same, on the other hand... all the keys sound the same! (That is, "sound the same" with respect
to relative pitch considerations; Mark Hoemmen points out that other factors of absolute pitch may still make
keys sound different [Hoemmen, October 2011].) When the notes were all tuned to make one key sound
perfect, other keys sounded "off" in different ways and this could be used for dramatic effect by the
composer; as a musician once pointed out to me, when we play that old music today on a modern keyboard,
we no longer hear it as it was intended. By adopting the Equal Tempered Scale, as with all engineering
tradeoffs, something has been gained and something lost.
In our insistence on symmetry we have lost both some sweetness and some richness -- a common
theme of Modernism.
Recall that in, say, the C Major Scale, the Fifth is called "Perfect" and the Third "Major". This is because the
approximations introduced by Equal Tuning caused more damage to the Third than the Fifth. Specifically the
Fifth on an Equally Tempered piano is very close to 3/2:
(TwR2^7) / (3/2) ~ 0.998871 = 1 - 0.001129
~ 1 - 0.1%.
However the Third is not very close to 5/4:

(TwR2^4) / (5/4) ~ 1.007937
~ 1 + 0.8%.
You can actually hear the difference.
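These two comparisons are easy to reproduce. Here is a small sketch using ordinary floating point; the variable names are chosen only for this illustration.

```python
TWR2 = 2 ** (1 / 12)   # the twelfth root of 2, about 1.0594631

fifth_ratio = TWR2 ** 7 / (3 / 2)   # Equal Tempered Fifth over the Just Fifth
third_ratio = TWR2 ** 4 / (5 / 4)   # Equal Tempered Third over the Just Third

print(round(TWR2 ** 12, 9))    # 2.0: twelve semi-tones make one Octave
print(round(fifth_ratio, 6))   # 0.998871
print(round(third_ratio, 6))   # 1.007937
```

The Third is off by roughly seven times as much as the Fifth.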

"Western music is fast because it's not in tune." -- Terry Riley [mus]
Notice that, due to the power of Relative Pitch, it seems to be somewhat arbitrary exactly which note we pick
to be C4, Middle C (though not completely arbitrary: Middle C is in the middle of the range of human hearing
that has the best "tonal perception"; an easy way to hear this is to play a dissonant interval very high or very
low on the keyboard; you will notice that it doesn't bother you). For specificity, from now on we speak of the
Equal Tempered Scale with A3 at 220 Hz. This is what in the West is called "concert pitch" [concert-pitch],
but it varies geographically and historically [pitch].
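For concreteness, here is a sketch that lists the frequencies of one Octave of the Equal Tempered Scale starting from A3 at 220 Hz. The sharp-only note names and the function name frequency are assumptions of this sketch, not notation from the text.

```python
TWR2 = 2 ** (1 / 12)   # ratio of one semi-tone
A3_HZ = 220.0          # reference pitch used in the text
NAMES_FROM_A = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def frequency(semitones_from_a3):
    """Frequency in Hz of the key that many semi-tones above A3."""
    return A3_HZ * TWR2 ** semitones_from_a3

# From A3 up to A4, one semi-tone at a time.
for step in range(13):
    print(f"{NAMES_FROM_A[step % 12]:<2} {frequency(step):8.2f} Hz")
```

With this reference, Middle C (three semi-tones above A3) comes out near 261.63 Hz and A4 lands on 440 Hz.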
3.4 The Minor

Recall that we assumed that the brain wants to find a Harmonic Series in what it hears. Recall from Section 2.5
"Recognition: Feature Vectors" that the brain is likely looking for the Harmonic Series by looking for features.
We now consider what different kinds of easily-computable features the Harmonic Series might have that the
brain might use to look for it. Recall further from Section 2.2 "Artifacts of Optimization" that we assumed that
the brain uses two different tricks in order to recognize the many different possible instances of the Harmonic
Series (those based on different fundamentals) using the same hardware and using that hardware as efficiently
as possible.
Relative pitch: the brain divides pairs of co-occurring tones and recognizes the resulting pairwise
intervals.
Octaves: the brain normalizes tones into a single factor of two range by dividing or multiplying up front
by powers of two.
Octaves: when normalized into the Octave of the fundamental, the Major Triad looks like this.
Harmonics of the Major Triad
Root:          1   = 1.0.
Major Third:   5/4 = 1.25.
Perfect Fifth: 3/2 = 1.5.
Relative pitch: when expressed as a set of ratios of pairwise intervals, the Major Triad looks like this.
Intervals of the Major Triad
Major Third / Root:          (5/4) / (1)   = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Perfect Fifth / Major Third: (3/2) / (5/4) = 6/5.
3.4.1 The Minor Triad

Let's consider another set of notes, which we will call the "Minor Triad" for a reason that will soon become
clear. When normalized into the Octave of the fundamental, the Minor Triad looks like this.
Harmonics of the Minor Triad
Root:          1   = 1.0.
Minor Third:   6/5 = 1.2 (new!).
Perfect Fifth: 3/2 = 1.5.
When expressed as a set of ratios of pairwise intervals, the Minor Triad looks like this.
Intervals of the Minor Triad
Perfect Fifth / Minor Third: (3/2) / (6/5) = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Minor Third / Root:          (6/5) / (1)   = 6/5.
Wow, with the Minor Triad

(Relative pitch) we do get the same set of ratios of pairwise intervals as the Major Triad,
(Octaves) but we do not get the whole triple of ratios when normalized into the Octave of the
fundamental!
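The comparison can be verified mechanically. This is a minimal sketch with exact fractions; pairwise_intervals is a helper name invented for the illustration.

```python
from fractions import Fraction
from itertools import combinations

MAJOR = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]
MINOR = [Fraction(1), Fraction(6, 5), Fraction(3, 2)]

def pairwise_intervals(triad):
    """Set of ratios higher/lower over every pair of notes in the triad."""
    return {high / low for low, high in combinations(sorted(triad), 2)}

same_intervals = pairwise_intervals(MAJOR) == pairwise_intervals(MINOR)
same_harmonics = set(MAJOR) == set(MINOR)

print(sorted(pairwise_intervals(MAJOR)))  # the ratios 6/5, 5/4, 3/2
print(same_intervals)                     # True
print(same_harmonics)                     # False
```

Both triads contain the intervals 6/5, 5/4, and 3/2, but only the Major Triad contains the note 5/4 above the Root.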
3.4.2 The Minor as Auditory Cubism

Now comes the art. If the brain is hearing the series but also the relative intervals, we can tease it, for artistic
purposes, by giving it the right intervals but in the wrong order.
The way the Minor Triad will sound to the brain is
I hear the pairwise intervals as the Harmonic Series,
but I do not hear the Harmonic Series itself -- something is missing.

Recall the discussion of Cubism from Section 2.5.3 "Cubism: Partial Recognition Due to Redundant,
Over-Determined Feature Vectors". Doesn't that sound like a good description of how music in a Minor key
sounds?: "I recognize it, but something is off." I conjecture this is the same phenomenon as what happens
when we look at cubist art: it's a woman, but something is off. Recall our conjecture from Section 2.3
"Harmony: Sweetness is the Ideal" which suggests that the Major Triad should sound sweet whereas the Minor
Triad, while recognized, is "off" and therefore not as sweet.
In "Music and Probability" David Temperley [Temperley2007] goes on at length wondering what could make
the Major and Minor scales sound happy and sad respectively and does not mention this idea. (Here we derive
the scales from the triads so we consider it to be the same question.) Given the recent publication date of
January 2007, I suspect that this theory of the difference between the feeling of the Major and Minor Triads
may be really an original contribution to science and art: that is, this may be the first time in the history of
Western Music that someone has finally figured out how Major sounds right/happy and Minor sounds off/sad.
3.4.3 Minor Scales

While there is only one Major Scale, there are several different scales all called some kind of Minor Scale
[min].
The Natural Minor consists of three Minor Triads linked together in the same way as we linked together
three Major Triads to get the Major Scale.
The Major-Minor Scale which is a Major Triad, then a Minor Triad linked a Fifth above, then another
Major Triad linked a Fifth above that.
There are more: Harmonic Minor Scale, Ascending Melodic Minor Scale, Descending Melodic Minor
Scale.
Our theory of the Minor is that the Minor Scales are just games with intervals where we tease the Harmonic
Series recognizer of the brain with partial recognition:
we fire some of the intervals in the feature vector for the Harmonic Series,
but not the whole series itself.
This theory is further reinforced by the fact that there is one Major Scale whereas there are many Minor
scales. Recall that in the Major Scale, built from the Major Triad, everything goes "right", whereas in the
Minor scales, built from the Minor Triad, something is always "off" or "wrong". It is a well-known engineering
adage that
There is one way for things to go right, but many ways for things to go wrong.
For example, in the Unix operating system each program upon exit returns to the system a number. In theory it
could be used to mean anything, but all of the tools of Unix are set up to enforce the convention that the
returned number be interpreted as an error code, as follows: zero is the code meaning "ok" and any other
number means some kind of error occurred (where the map from the number code to the exact meaning of the
error depends on the program returning the number). This convention works because of the above observation:
the one uniquely distinct number, namely zero, is the code meaning that things went the one way for things to
go right, and the rest of the numbers encode the many ways that things can go wrong.
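The exit-code convention can be seen from Python by launching small child processes. This is only a sketch: the helper exit_code and the particular non-zero values 1 and 3 are arbitrary choices for the illustration, and it assumes the platform allows spawning a Python subprocess.

```python
import subprocess
import sys

def exit_code(snippet):
    """Run a tiny Python program in a child process and return its exit code."""
    return subprocess.run([sys.executable, "-c", snippet]).returncode

ok = exit_code("raise SystemExit(0)")          # the one way to go right
failure_a = exit_code("raise SystemExit(1)")   # one way to go wrong
failure_b = exit_code("raise SystemExit(3)")   # a different way to go wrong

print(ok, failure_a, failure_b)   # 0 1 3
```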
Recall from Section 2.3 "Harmony: Sweetness is the Ideal" our conjecture that when things go right they sound
sweet and anything else gives us information to recognize a thing but also makes it "off", not perfect. We
should expect therefore one way for things to sound sweet, the one Major Scale reflecting the one ideal
Harmonic Series, and myriad ways for things to sound off in some way, resulting in the multiple Minor Scales
and other even stranger things (see Section 3.5.6 "Chords Preserving Intervals but not Harmonics").
3.5 Chords
Musicians sometimes play multiple notes at the same time. Likely by trial and error, people have discovered
that certain combinations of notes convey a particular "emotion" or "sense" to the listener. Notice I didn't say
"sound good" -- in language we often use words that don't sound good but still convey a certain sense, even if
it is bad. That is, a musical sound should sound "like something", even if not a good something. Similarly, in
written expression we may say things good or bad, but we don't (often) say complete nonsense and even if we
do it is in a context that gives the nonsense some sort of meta-sense. Let's call these groups of notes that sound
like something "chords".
Instead of resorting to trial and error to discover these chords, let's see if we can derive them from our first
principles and conjectures, given above. To make this a proper scientific experiment we need a "ground truth"
list of chords as a goal against which to test our progress. Many piano books come with a "chord dictionary"
which is a reasonable measure of the chords that people use most of the time and that have been discovered by
trial and error over the centuries to sound like something; we'll use that. Motivating observed phenomena from
the first principles of a theory is considered to be an ultimate test of the quality of a theory (together with
being well-factored; see Section 1.2 "What is a Satisfactory, Scientific Theory?"): if it succeeds, the theory has
about all of the explanatory power we could ever want.
Let's see if we can motivate from first principles all of the chords in a Standard Chord
Dictionary.
Throughout this derivation of different chords, you will note a gradual progression or degradation from
high-theme/low-complexity (Major Triad, Harmonic Series chords) to low-theme/high-complexity (Minor and
Ambiguous chords). This progression is suggested by our measure of interestingness from Section 2.4
"Interestingness: Just Enough Complexity". That is, as we progress, more and more of the theme of the
Harmonic Series is lost and more and more complexity is introduced. Notice that this progression seems to
mirror that of musical sophistication as well: musically untrained listeners like Major chords while more
musically trained listeners are more tolerant to loss of theme and more interested in complexity. (I met a signal
processing engineer who had played piano for something like 18 years and who simply did not like Major
chords at all.) Other fields seem to progress similarly: white wine is preferred by new wine drinkers, whereas
more "complex" red wines are an acquired taste.
3.5.1 The Standard Chord Dictionary

Here are the chords in the chord dictionary of "How to Play from a Fake Book" by Blake Neely [Neely1999].
We give the notes on the C Major Scale and also in number of semi-tones from the fundamental. (You can
reconstruct the chord with a different Root than the C-based version given here by playing the Root note for
semi-tone 0 and for the others counting the number of semi-tones over from the Root; note that to count
semi-tones, count the keys across the top of the keyboard where the black and white keys are all the same
width. Also, you might want to play the Root (semi-tone 0) and the Fifth (semi-tone 7) in one Octave and the
remaining notes in the next Octave up; see Section 3.5.2 "How to Turn Sweetness into Mud: Over-Using
Octaves".)
Name                            Notes in C Major Scale  Semi-tones from fundamental
C "Major Triad"                 C-E-G                   0-4-7
Cm "Minor Triad"                C-Eb-G                  0-3-7
Cdim "Diminished"               C-Eb-Gb                 0-3-6
C+ "Augmented"                  C-E-G#                  0-4-8
Csus "Sustained"                C-F-G                   0-5-7
C6 "Sixth"                      C-E-G-A                 0-4-7-9
Cm6 "Minor Sixth"               C-Eb-G-A                0-3-7-9
C7 "Dominant Seventh"           C-E-G-Bb                0-4-7-10
Cmaj7 "Major Seventh"           C-E-G-B                 0-4-7-11
Cm7 "Minor (Dominant) Seventh"  C-Eb-G-Bb               0-3-7-10
Cdim7 "Diminished Seventh"      C-Eb-Gb-A               0-3-6-9
C(add9) "Add 9"                 C-E-G-D                 0-4-7-14
C9 "Ninth"                      C-E-G-Bb-D              0-4-7-10-14
Cmaj9 "Major Ninth"             C-E-G-B-D               0-4-7-11-14
Cm9 "Minor Ninth"               C-Eb-G-Bb-D             0-3-7-10-14
C11 "Eleventh"                  C-E-G-Bb-D-F            0-4-7-10-14-17
C13 "Thirteenth"                C-E-G-Bb-D-A            0-4-7-10-14-21
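The recipe for rebuilding a chord on a different Root (count semi-tones over from the Root) can be sketched as follows. Only a few rows of the dictionary are included, the black keys are always spelled with sharps (so Bb prints as A#), and the names CHORDS and spell are invented for this sketch.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# A few rows of the chord dictionary, as semi-tones from the Root.
CHORDS = {
    "Major Triad": (0, 4, 7),
    "Minor Triad": (0, 3, 7),
    "Dominant Seventh": (0, 4, 7, 10),
    "Ninth": (0, 4, 7, 10, 14),
}

def spell(root, kind):
    """Names of the notes of the given kind of chord built on the given Root."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + offset) % 12] for offset in CHORDS[kind]]

print(spell("C", "Major Triad"))        # ['C', 'E', 'G']
print(spell("G", "Dominant Seventh"))   # ['G', 'B', 'D', 'F']
print(spell("D", "Minor Triad"))        # ['D', 'F', 'A']
```

The sketch returns only note names; which Octave each note is played in is a separate choice.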
3.5.2 How to Turn Sweetness into Mud: Over-Using Octaves

Since you now have the formula to play any of these chords in any key, I recommend playing them and asking
yourself how they sound. They certainly sound different and those differences are what our theory is going to
attempt to explain.
Be careful about one point: as my then jazz piano teacher Ben Stolorow pointed out to me (I'm paraphrasing)
[Stolorow, c. 2006]:
If you play all the notes of a chord within one Octave, it sounds like mud.
All of this dividing and multiplying of intervals by factors of two as we move across Octaves has to stop
someplace. No such processing of signals can be taken to an extreme: the more you do it the more it corrupts
the signal.
Conversely, playing a chord across several Octaves, in particular putting the Root on the bottom (keyboard
left) and the higher harmonics on the top (keyboard right), helps make the chord "ring" beautifully. In fact,
selecting which Octaves in which to play the notes of a chord is so important it has a term: voicing the chord.
To hear this effect for yourself find a keyboard and try playing C Major, C-E-G, across multiple Octaves such
that there are no factors of two in the denominator:
Harmonic 1: so as to not get too high you probably want to play the Root, C, one Octave below Middle
C (C3 in Scientific Pitch Notation [sci-pitch]).
Harmonic 3: play the Fifth, G, one Octave up from the Root (G4): from the Root this interval is a factor
of (3/2) * 2 = 3.
Harmonic 5: play the Third, E, two Octaves up from the Root (E5): a factor of (5/4) * 4 = 5.
If you really want to make it ring fill in these other notes from the Harmonic Series of C in their proper
Octaves -- you can just do it with two hands if they can span an Octave:
Harmonic 2: play the Octave interval, C4, one Octave up from the Root: a factor of 2.
Harmonic 4: play the "Double-Octave", C5, two Octaves up from the Root: a factor of 2 * 2 = 4.
Harmonic 6: play the "Double-Octave-Fifth", G5, two Octaves up from the Root: a factor of (3/2) * 2 *
2 = 6.
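The voicing above can be reproduced by asking which piano key lies nearest to each harmonic of the Root. This is a sketch under stated assumptions: it rounds to the nearest Equal Tempered semi-tone, it takes the Root to be a C so that the Octave number changes exactly at the Root, and nearest_key is a name invented here.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_key(harmonic, root_octave=3):
    """Nearest key, in Scientific Pitch Notation, to a harmonic of the C in root_octave."""
    semitones = round(12 * math.log2(harmonic))
    return f"{NOTE_NAMES[semitones % 12]}{root_octave + semitones // 12}"

voicing = [nearest_key(n) for n in range(1, 7)]
print(voicing)   # ['C3', 'C4', 'G4', 'C5', 'E5', 'G5']
```

Harmonics 1, 3, and 5 give the C3, G4, E5 voicing, and harmonics 2, 4, and 6 give the extra notes C4, C5, and G5.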
Since playing the notes of a chord in Octaves that put them closer to the ideal Harmonic Series really sounds
better, this experiment supports the theory that notes sound good together because they are all from one
Harmonic Series, rather than because their intervals are just somehow special.
However, I would be remiss if I did not point out the following phenomenon (I recall Michael O'Donnell
making this point to me, but he does not recall making it) [O'Donnell, c. January 2009]:
even if you invert the chord -- select an alternate voicing, say with the Root high (keyboard right) and
other harmonics low (keyboard left),
but are sure to play the notes across several Octaves anyway,
then the chord still sounds better than if played all in one Octave, though I think perhaps it doesn't quite
"ring" as nicely; try it for yourself. This observation may dampen the ringing endorsement of the Harmonic
Series of the above paragraph (puns intended): it seems that even in the absence of the ideal Harmonic Series
(due to the chord inversion), the brain would still rather hear notes played across several Octaves rather than
all bunched into one.
Having said that, we now begin our plan to derive the Standard Chord Dictionary from our theory.
3.5.3 Chords from the Harmonic Series

First, let's try playing the six notes having Roots corresponding to the first six overtones in the Harmonic Series
(recall from Section 2.3.2 "Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical
Across the Notes" that the ratios of the notes will induce the vertical ratios of their composite tones). Well, as
we saw in Section 3.1 "The Major Triad" above, we just get the Major Triad (as a chord, that's C Major in the
chart above).
Recall however that during the derivation of the Major Triad we stopped at harmonic 6. What if we keep
going? Well we find that no note in our Twelve-Semi-Tone Scale is terribly close to 7/4 (harmonic 7 divided by
4 to keep it in the same Octave). This is just the way the
math works out: if we want enough things, sometimes we just don't get them all -- again, engineers call this an
over-constrained (or over-determined) system. In this situation we'll just pick a key on one side or the other of
the real 7/4. Fudging downward is the closer approximation and that is likely the purpose in it being called the
"Dominant Seventh"; in the key of C the Dominant Seventh is Bb. Michael O'Donnell insists that this interval
is not properly called the Dominant Seventh, but the "Minor Seventh" [O'Donnell, 18 January 2009]; however
I have retained the term "Dominant Seventh" for this fudged-down interval as (1) that is the term used to
distinguish chords that use this note as their seventh, (2) to call it Minor would confusingly introduce a second
meaning to the word "Minor", and (3) even more confusingly, there is a chord called the "Minor Seventh" that
employs a Minor Third interval and a Dominant Seventh interval!
Interestingly, fudging upward produces an interval called the "Major Seventh"; in the key of C the Major
Seventh is B. This makes sense once we notice that the Dominant Seventh is not in the Major Scale (is not a
white key in the key of C) whereas the Major Seventh is in the Major Scale. In the key of C the Major Seventh
is B. However, O'Donnell points out that this explanation of the Major Seventh interval may be a red herring
and the use of the Major Seventh interval may have nothing to do with its being close to the seventh harmonic;
that is, the Major Seventh interval may make no sense outside of the context of an effect that arises when it is
used as part of the Major Seventh chord [O'Donnell, c. April 2009]; please see the discussion of the Major
Seventh chord in Section 3.5.6 "Chords Preserving Intervals but not Harmonics" where we discuss this
situation further.
You can read a discussion on several kinds of seventh intervals here: [min7] and [harmonic7]. From that
discussion it seems that the exact ratio 7/4 (not the fudged-downward Dominant Seventh or the fudged-upward
Major Seventh) is called the "Harmonic Seventh" (at least when it's not being called the "Septimal
Minor Seventh", or the "Subminor Seventh"!).
A chord containing harmonic 7 is called a seventh chord because B is the seventh white key counting from C
(and Bb is the flatted version of B), and not because it is harmonic number 7 in the Harmonic Series. Recall
that the Perfect Fifth is actually harmonic 3 and the Major Third is actually the harmonic 5! This convenient
naming of the seventh white key being harmonic 7 is however just a coincidence! That is, in the case of seven
we are just lucky. (Get it? Lucky seven! Never mind.) It turns out that this lucky correspondence of the two
numbers of (a) the white key count from the Root, and (b) the overtone harmonic number in the Harmonic
Series continues on for all of the rest of the subsequent harmonics above seven as well (approximately well
enough anyway and up until we stop at harmonic 13; more on this below).
Let's add some more notes to our chord just by going up the Harmonic Series. Harmonic 8 is just the Root
again if we divide by two enough times; for this reason in general we will skip even numbers from now on (to
find the interval for an even harmonic just divide by two until you get an odd number). For a harmonic N times
the fundamental, the number of semi-tones to the right of the Root key on an Equal Tempered 12-semi-tone
keyboard is log N / log TwR2 (recall that "TwR2" denotes the twelfth-root of 2). Of course we subtract off any
extra multiples of 12 to keep it in the same Octave; another way to do that is divide up-front by a high-enough
power of 2. (This computation differs from the ones done earlier taking logs base 2 in Section 3.2.3 "The
Keyboard Revealed" because in that section we wanted the answer as a percentage of the Octave (a factor of
2), whereas here we now know that we have divided the Octave into 12 semi-tones and instead want the
answer as the number of semi-tones; we could just as easily instead take the log_2 x as before and then
multiply by 12.)
Ninth       9/8 = 1.125;   log 1.125 / log TwR2 ~ 2.039.
Eleventh    11/8 = 1.375;  log 1.375 / log TwR2 ~ 5.513.
Thirteenth  13/8 = 1.625;  log 1.625 / log TwR2 ~ 8.405.
Hmm, well those eleventh and thirteenth harmonics are pretty badly approximated by the piano keyboard!
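The semi-tone positions are quick to recompute. A small sketch, using the fact that log N / log TwR2 equals 12 times the base-2 log of N; the function name semitones_in_octave is made up here.

```python
import math

def semitones_in_octave(harmonic):
    """Semi-tones above the Root, after dropping whole Octaves (multiples of 12)."""
    return (12 * math.log2(harmonic)) % 12

for n in (9, 11, 13):
    position = semitones_in_octave(n)
    miss = abs(position - round(position))   # distance to the nearest key
    print(n, round(position, 3), round(miss, 3))
```

Harmonic 9 lands within about 0.04 of a semi-tone of a key, while harmonics 11 and 13 miss the nearest key by roughly 0.49 and 0.41 of a semi-tone.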
Hey, wait a minute! I thought earlier we were saying that the names of the intervals were the number of white
keys we had to count over from the Root, not the name of the harmonic number. Remember, harmonic 3 is the
fifth white key and harmonic 5 is the third white key.
As noted above, just by coincidence harmonic 9 is really the ninth white key: 12 + 2 = D above C. It basically
works out, if we fudge just a little, that harmonic 11 is the eleventh white key: 12 + 5 (round down instead of
up!) = 17 = F (also known as the 4th). Harmonic 13 is the thirteenth white key: 12 + 9 (round up instead of
down!) = 21 = A (also known as the 6th). Very conveniently, with these fudgings, these harmonics just fit in
between the Major Triad white keys. How nice, especially if we don't let the pesky mathematical precision
bother us. Be sure to play the higher harmonics in higher Octaves, as detailed in Section 3.5.2 "How to Turn
Sweetness into Mud: Over-Using Octaves" above.
Chords from the Harmonic Series
Name                   Notes in C Major Scale  Semi-tones from fundamental
C "Major Triad"        C-E-G                   0-4-7
C7 "Dominant Seventh"  C-E-G-Bb                0-4-7-10
C(add9) "Add 9"        C-E-G-D                 0-4-7-14
C9 "Ninth"             C-E-G-Bb-D              0-4-7-10-14
C11 "Eleventh"         C-E-G-Bb-D-F            0-4-7-10-14-17
C13 "Thirteenth"       C-E-G-Bb-D-A            0-4-7-10-14-21
And what about the harmonic 15 and higher? I don't know if people can even hear harmonic 15, as it is also
the Major Seventh -- the brain may simply start to hear lower harmonics instead. Also, remember that my goal
is to explain the chords in my chord dictionary and it doesn't go higher than a Major Thirteenth. I don't know if
people can hear harmonics 11 or 13 either, but note that by stopping at harmonic 13 each odd interval neatly
corresponds one-to-one with a white key on the keyboard; that is, the fact that the chord dictionary stops at
harmonic 13 may be another consequence of people's desire for symmetry (also, harmonic 15 is not the
"fifteenth" white key, which would instead be the Double-Octave).
Harmonic 15: 15/8 = 1.875; log 1.875 / log TwR2 ~ 10.883, which is B.
3.5.4 Chords Inducing Ambiguity

"How to Play From a Fake Book" [Neely1999] says that when playing, say, a thirteenth chord, that after the
7th note one should just omit any notes before we get to the thirteenth -- that is, omit the ninth and eleventh.
Further he says if it is too hard for your hand to play that, then drop the Root or even the Fifth. Given that the
Root is, well, the root of the chord, somehow like its foundation, I was astounded when I first read this; it
seemed as if someone were suggesting to me that we not bother to build the basement and first storey of a
house and instead start by building the second storey in mid-air. We explain this situation simply as follows:
Recall from Section 2.2.1 "Relative Pitch: Differences Between Sounds" that relative pitch is making us
hear differences between tones, even series-es of tones each induced by a note in a chord.
Recall from Section 2.3.2 "Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and
Vertical Across the Notes" that ratios of notes induce vertical ratios of their overtones.
Recall from Section 2.1.1 "Virtual Pitch: Hearing the Harmonic Series Even When it is Not There" that
virtual pitch will fill in a missing fundamental tone of an overtone series; it will also fill in other missing
tones near the Root, such as harmonics 2 and 3.
Therefore when listening to a chord, relative pitch on the induced vertical intervals makes us hear an artificial
Harmonic Series that is missing a Root or even a Fifth, but then virtual pitch fills those tones; that is, inducing
us to hear the missing Root or Fifth note of the chord.
Conjecture Ten: The brain wants to hear one Harmonic Series (which can be seen as Conjecture
Seven in Section 2.4.1 "The Simplicity of Theme" applied to Conjecture Three in Section 2.1
"Searching for Harmonics").
If we leave out more and more notes and the brain is filling in more and more, we can start to get really close
to barely playing enough notes for the brain to figure out which Harmonic Series it is supposed to be listening
to. What if we play so few notes that the implied Harmonic Series is ambiguous, that the missing Harmonic
Series could be completed in more than one way?
Some chords are ambiguous and therefore unstable: if we give the brain more than one alternative then the sound
is "unsettled" until the player provides enough notes to "break symmetry" and disambiguate the series. Many
standard chords are of this kind as we will see as we enumerate a taxonomy of the ways we can tease the brain
with ambiguity.
Sustained: One ambiguity is to have two instances of the interval from Root to harmonic 3, the Perfect Fifth.
For example, if we play notes C, F and G, we create the possibility of either F or C being the Root.
Unsurprisingly, this chord is called the "Sustained" chord (in standard usage "suspended", whence "sus") and
musicians say that it "wants" to "resolve" to a Major Triad at F or C. This effect is very easy to hear. Try it:
play C-F-G (sustained) and then play C-E-G (C Major).
Augmented: Another ambiguity is to have two instances of the interval from Root to harmonic 5, the Major
Third. For example, if we play notes C, E and G# we have this situation. Counting carefully, note that these
three notes actually make three Major Thirds! Unsurprisingly, it doesn't sound very satisfying, but the brain
does "recognize it as something", as opposed to sounding like noise.
Diminished: Another ambiguity we can create is to have two instances of the Minor Third. For example, if we
play notes C, Eb and Gb we have this situation and if we add the note A, we have not only three but four (they
wrap around) copies of this interval all at once! Unsurprisingly, it also doesn't sound very satisfying, but again
it still sounds "like something".
Chords Inducing Ambiguity
Name                        Notes in C Major Scale   Semi-tones from fundamental
C+ "Augmented"              C-E-G#                   0-4-8
Csus "Sustained"            C-F-G                    0-5-7
Cdim "Diminished"           C-Eb-Gb                  0-3-6
Cdim7 "Diminished Seventh"  C-Eb-Gb-A                0-3-6-9
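The counting behind this taxonomy can be checked mechanically. The following is a minimal sketch, assuming notes are written as semi-tone numbers (0 = C) and that intervals are counted modulo the Octave so that they may wrap around; the helper name `interval_instances` is hypothetical, and the sketch only counts intervals, it does not model what the brain does with them.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
FIFTH, MAJOR_THIRD, MINOR_THIRD = 7, 4, 3

def interval_instances(chord, interval):
    """Ordered pairs (low, high) of chord notes lying `interval` semi-tones
    apart, counted modulo the Octave so that intervals may wrap around."""
    pcs = sorted({note % 12 for note in chord})
    return [(a, b) for a in pcs for b in pcs
            if a != b and (b - a) % 12 == interval]

chords = {
    "C Major": ([0, 4, 7], FIFTH),
    "Csus":    ([0, 5, 7], FIFTH),
    "C+":      ([0, 4, 8], MAJOR_THIRD),
    "Cdim":    ([0, 3, 6], MINOR_THIRD),
    "Cdim7":   ([0, 3, 6, 9], MINOR_THIRD),
}
for name, (notes, interval) in chords.items():
    pairs = interval_instances(notes, interval)
    lows = [NOTE_NAMES[a] for a, _ in pairs]
    print(f"{name}: {len(pairs)} instance(s), lower notes {lows}")
```

The C Major Triad contains a single Fifth, so only C is a candidate Root; Csus contains two, with C and F as the competing lower notes, which is exactly the ambiguity described above.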
3.5.5 Chords Using the Minor Triad

Similar to the way we extended the Major Triad to a family of chords, we can also make more chords by
adding more notes from the Harmonic Series on top of this Minor Triad, producing chords such as Minor
(Dominant) 7, etc.
What about the C6 chord: C-E-G-A? What the heck is that A doing in the Chord Dictionary? This chord is
called C6 because it is the C Major Triad plus the 6th white key starting from C, that is, A. However, thinking
of the chord that way seems to be a red herring. At my (then) girlfriend's house I started discussing this
theory with her musician father Tim and he pointed out to me that C6 is also the same set of notes as the
Minor (Dominant) 7 chord when rooted at A (instead of C); that is, the 6th forms a Minor Third with C
[Turner, c. 2006]. (Again, such a rearrangement of the same notes in different Octaves is called an
"inversion").
During that discussion with Tim I also told him about my efforts to learn absolute pitch by attempting to guess
Middle C every morning and then checking my guess on the keyboard. I never seemed to hit it, but I was
getting closer and got to the point where I was predictably within maybe four semi-tones (again, after having
just woken up). He seemed to take this as a claim that I could pick Middle C out of my head and said, "let's try
it." I thought it would be wimpy to back down, especially in front of my girlfriend, and I really wanted to
impress my girlfriend's father, so I said "let's see" and hummed a note that sounded right. He matched my note
and went to the other room to check on the piano. He came back "yup, you got it right on," looking a little
impressed. I never told him that that was the first and only time I had ever pulled that off.
Chords Using the Minor Triad
Name                                                Notes in C Major Scale   Semi-tones from fundamental
Cm "Minor Triad"                                    C-Eb-G                   0-3-7
C6 "Sixth" [alias Am7 "Minor (Dominant) Seventh"]   C-E-G-A                  0-4-7-9 [from A: 0-3-7-10]
Cm7 "Minor (Dominant) Seventh"                      C-Eb-G-Bb                0-3-7-10
Cm9 "Minor Ninth"                                   C-Eb-G-Bb-D              0-3-7-10-14
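Tim's observation that C6 and Am7 are the same notes can be confirmed with a few lines. This is a small sketch, assuming chords are given as semi-tone offsets from their Root and compared as sets of notes with the Octave ignored; the helper name `pitch_classes` is an assumption of the sketch.

```python
def pitch_classes(root, semitones):
    """Set of notes (0 = C ... 11 = B), Octave ignored, for a chord given
    as semi-tone offsets from its root."""
    return {(root + s) % 12 for s in semitones}

C, A = 0, 9
c6 = pitch_classes(C, [0, 4, 7, 9])      # C-E-G-A
am7 = pitch_classes(A, [0, 3, 7, 10])    # A-C-E-G
print(c6 == am7)
```

Both chords come out as the set {C, E, G, A}; only the note treated as the Root differs.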
3.5.6 Chords Preserving Intervals but not Harmonics

However there is another seventh chord called the Major Seventh; in C, it is C-E-G-B (no flat on the B). For
what purpose do we even bother calling this group of notes a chord? Does it sound "like something"? Try it
yourself; to get the effect best, put at least two of the notes one Octave above the other two, say the E and the
B.
Perhaps the fact that it is called a "seventh" is a red herring: that is, perhaps the seventh in the chord is not
interesting because it is harmonic seven of the fundamental. Perhaps it is there for another reason. As Ben
Stolorow also pointed out to me [Stolorow, c. 2006]:
The Third and the Seventh of a Major Seventh chord form a second Perfect Fifth interval.
In fact, there is a square of Fifths and Thirds:
C -- Fifth -- G
|             |
Third       Third
|             |
E -- Fifth -- B
Note that there are two sub Triads, one Major and one Minor:
The C Major Triad.
The E Minor Triad.
The Major Seventh has always had a "cool Jazz" sound to me -- I can hear something, it is a bit of a thin,
distant flavor having just a hint of sweetness -- like one of those drinks in an upscale organic juice bar that
you've never heard of -- I'm thinking celery and watermelon rind with a touch of pomegranate.
But then what about C Minor (Dominant) 7? It doesn't seem to fit into any Harmonic Series no matter how I
look at it. There are also many other weird chords that occur in Jazz pieces that are too rare to make it
into my chord dictionary but show up in actual music, such as C7#5: C Dominant 7 with a sharped 5th. There
are two things all of these chords have in common:
they are relatively rare, and
they sound rather weird and off, even more so than a Minor Triad (probably related to the fact that they
are rare!).
Explanation? They are probably all just fragments of various sets of intervals that occur in the Harmonic
Series, with no attempt to actually represent the series, or they combine that with some Harmonic Series
ambiguity as does the Minor Triad above. We discuss a possible further effect in Section 7.2 "The Role of
Narrative Generally" where we consider the importance of content providing context for further content,
thereby creating narrative. In sum, it is likely that the brain has one disambiguation engine and that the
processing that occurs in verbal narrative would process similarly in other contexts, such as music. So, while
these chords may sound strange in isolation, the theme created by the preceding music before the chord may
bring a certain sense to them. Think of one standard structure for a joke: a story (creating a theme) and then a
punchline; the punchline would not be funny in isolation without the context provided by the story, and yet we
attribute the funniness of the joke to the punchline and not the story which did the work. Theme and ambiguity
may alternate throughout a narrative, repeating this effect. Investigating this process is left as future work.
Chords Preserving Intervals but not Harmonics
Name                    Notes in C    Semi-tones from   Analysis of composition
                        Major Scale   fundamental
Cm6 "Minor Sixth"       C-Eb-G-A      0-3-7-9           A Minor Triad plus another Minor Third interval
                                                        (from the ninth to the fundamental one Octave up).
Cmaj7 "Major Seventh"   C-E-G-B       0-4-7-11          Almost the Harmonic Series up to the 7th, but
                                                        worse approximation for the 7th chosen; also
                                                        forms two Perfect Fifths.
Cmaj9 "Major Ninth"     C-E-G-B-D     0-4-7-11-14       Almost the Harmonic Series up to the ninth, but
                                                        worse approximation for the 7th chosen; also
                                                        forms two Perfect Fifths; also forms a Major
                                                        Triad starting at the Fifth!
Other stuff in Jazz too weird to make it into the chord dictionary also likely exploits distant
harmonics, subtle ambiguities, or effects of context created by preceding content which cannot
be appreciated in isolation....
4 Miscellaneous Objections
Here are my thoughts on a few objections that I anticipate.
4.1 But what about the Circle of Fifths!

Some readers who have been abused since childhood by music "theory" have pointed out to me that I have
entirely omitted any mention of the "Circle of Fifths". They are right. First of all, what is this Circle anyway?
4.1.1 Fifths make a Circle

If you start at, say, Middle C, and go up by a Fifth you get to G. If you keep going in this manner, staying
within a single Octave by sometimes (about every other time) dropping down by an Octave (which of
course retains the same note name, since note names cycle every Octave), then eventually
the notes will repeat and you will come back to Middle C, and
you will have visited every note in the Western Twelve-(Semi-)Tone Scale exactly once along the way.
(You visit every note because a Fifth is 7 half-steps, the Octave is 12 half-steps, and 7 is "relatively prime" to
12: they have no common divisors other than 1.)
This sequence is called the "Circle of Fifths" and is so important to know as a musician that I have spent so
long drilling it into my head that I will type it in now from memory without hesitation. (Note that, given that
the C Major Scale is the F, C, and G Major Triads linked together at Fifths, you get the cleanest presentation in
C Major by starting one Fifth below C, at F; note further that splitting the sequence across two lines
emphasizes an internal almost-symmetry with sharps):
F   C   G   D   A   E   B
F#  C#  G#  D#  A#
(...and then repeats: E# = F).
Cool!
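The walk around the Circle is easy to reproduce. Here is a minimal sketch, assuming notes are numbered in semi-tones with 0 = C and folded into one Octave by arithmetic modulo 12; the function name `circle` is hypothetical and sharps are used for all black keys.

```python
from math import gcd

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def circle(start, step, size=12):
    """Notes visited by repeatedly moving `step` semi-tones, folded into one Octave."""
    visited, note = [], start % size
    while note not in visited:
        visited.append(note)
        note = (note + step) % size
    return visited

fifths = circle(start=5, step=7)             # start at F, one Fifth below C
print(" ".join(NOTE_NAMES[n] for n in fifths))
print(len(fifths) == 12, gcd(7, 12) == 1)    # all twelve notes, since gcd(7, 12) = 1
print(len(circle(start=0, step=4)))          # Major Thirds close after only 3 notes
```

Stepping by an interval that shares a divisor with 12, such as the Major Third (4 semi-tones), closes the loop early and visits only some of the notes.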
As we saw above, of all the intervals, the Fifth is the most
sweet (near the bottom of the Harmonic Series), and
interesting (not so harmonic as to be boring; that is, not the Octave interval).
As we discuss in the next section, by using the Fifth (and, well, of course the Octave) repeatedly we get all the
notes on our piano! This Circle of Fifths thing must be fundamentally important to the nature of how sounds
sound musical or something!
4.1.2 The Circle of Fifths is Just a Combinatorial Coincidence

Nope. The Circle of Fifths is not fundamental or important or anything to the nature of sounds sounding
musical. The Circle of Fifths is just a huge red herring that prevents people from understanding harmony, or at
least how it is that harmony sounds good.
The Circle is just a combinatorial coincidence. We went up by a Fifth, a factor of 3/2, twelve times, and if you
count carefully we went down by an Octave, a factor of 1/2, an additional seven times, and we got back to the
"same" note, or a ratio to that same note that is very close to one (actually within 1.4%). In other words, we
have discovered that:
3^12 = 531441
almost equals
2^19 = 524288;
that is,
3^12 / 2^19 = 1.0136432647705078125 (exact!)
Wow is that close to 1. When that small amount of error is spread out evenly over twelve Fifth intervals, you
can see how the Equally Tempered Scale is rather appealing: it gets Fifths almost exactly right, to within
about a tenth of a percent:
((3^12 / 2^19) - 1) / 12
= 0.0136432647705078125 / 12
~ 0.0011369387308756511
~ 0.1%.
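This arithmetic can be reproduced exactly with rational numbers. The sketch below assumes nothing beyond the ratios already stated (3/2 for the Fifth, 1/2 for dropping an Octave); it also computes the error of a single Equally Tempered Fifth directly, where the figure above is the even twelve-way split of the total, a close linear approximation.

```python
from fractions import Fraction

# Twelve Fifths up (3/2 each) and seven Octaves down (1/2 each).
comma = Fraction(3, 2) ** 12 * Fraction(1, 2) ** 7
print(comma, float(comma))

# Relative error of one Equally Tempered Fifth, 2^(7/12), against 3/2.
tempered_fifth = 2 ** (7 / 12)
error = (1.5 - tempered_fifth) / 1.5
print(f"{error:.4%}")
```

Both routes land at roughly 0.11%, so the "tenth of a percent" summary holds either way.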
By the way, if you write a computer program to look for other amazingly-close collisions of powers of two and
three with reasonably low exponents (I have), you won't find any.
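A search of this kind might look like the following. It is a sketch rather than the program mentioned above: the function name `best_collisions` is hypothetical, closeness is measured as the distance of the ratio from 1 in Octaves, and the cap of 40 on the exponent of three is an assumption standing in for "reasonably low".

```python
from math import log2

def best_collisions(max_exponent):
    """For each a, pair 3^a with the nearest power of two 2^b and keep the
    record-setting (a, b, ratio) triples, in order of increasing a."""
    records, best = [], float("inf")
    for a in range(1, max_exponent + 1):
        b = round(a * log2(3))
        ratio = 3 ** a / 2 ** b
        miss = abs(log2(ratio))          # distance from 1, measured in Octaves
        if miss < best:
            best = miss
            records.append((a, b, ratio))
    return records

for a, b, ratio in best_collisions(40):
    print(f"3^{a} / 2^{b} = {ratio:.6f}")
```

Within that cap the last record set is 3^12 against 2^19; the earlier records, 3^2 / 2^3 and 3^5 / 2^8, are much coarser.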
Amazingly, (1) picking twelve notes makes the most important harmonic, number 3, be the fifth white key, and
(2) humans have five fingers on one hand, making the most important harmonic also the one that is easiest to
play.
4.1.3 The Circle of Fifths Allows for Cool Chord Transitions

The amazing fact of the existence of the Circle does not explain how it emerges that chords sound good;
however, it is still cool because it does allow you to play around with chords in fun ways, as follows.
A "chord transition" is playing one chord and then another (or playing in sequence notes from one and then
the other -- see the discussion of melody and harmony in Section 7 "Future Work: Towards A Unifying Theory
of Music"). Musicians would often like to make transitions between chords that are not too jarring, or sound
"natural". One really common way to do this is for the second chord to have as its Root a note of the first
chord, such as the Fifth (as we did above when constructing the Major Scale, going from C Major to G Major)
or Third, or to do the reverse (as when going from C Major to F Major). Note that while we call it the Circle of
Fifths, really it's just another way of looking at the Western 12-Tone Scale, which also contains Circle(s) of
other intervals such as Thirds. The basic point of the coolness of the Circle of Fifths is the fun you can have
with chord transitions.
You can repeatedly make transitions from one chord to another using the three natural intervals of the
Major and Minor Triads -- Fifths (and their inverse, Fourths), Major Thirds, Minor Thirds -- and you will
not fall out of the twelve notes generated by the Circle -- well, you won't if you use an Equal
Tempered Tuning anyway, or if you fudge some of the notes along the way -- recall that the Thirds don't
work out so well in Equal Temperament.
Notice that by moving by either a Fifth (or equivalently, a Fourth), a Major Third, or a Minor Third, you
get to different places on the Circle, and therefore by combining these different transitions, you can go
one direction in the "harmonic space", and come back by another direction!
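One such round trip can be traced in a few lines. This is a sketch under the Equal Temperament assumption just mentioned, so that every move is a whole number of semi-tones and the arithmetic can be done modulo 12; the function name `walk` and the particular route are illustrative choices, not taken from any piece of music.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
FIFTH, MAJOR_THIRD, MINOR_THIRD = 7, 4, 3

def walk(start, moves):
    """Roots visited when moving by each interval (in semi-tones) in turn,
    folded into one Octave."""
    roots = [start % 12]
    for move in moves:
        roots.append((roots[-1] + move) % 12)
    return roots

# Out by three Fifths, back by one Minor Third: C -> G -> D -> A -> C.
path = walk(0, [FIFTH, FIFTH, FIFTH, MINOR_THIRD])
print(" -> ".join(NOTE_NAMES[r] for r in path))
```

Three Fifths and one Minor Third add up to 24 semi-tones, exactly two Octaves, which is why the walk ends on the note where it began.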
This kind of thing is what allows musicians to make interesting chord progressions, which are a fundamental
aspect of Western (and likely all) music. Just look at a Jazz Fake Book (see [Neely1999]) and you will see the
essence of a piece pared down to the minimum: it consists of a chord progression, a melody line, some rhythm,
and maybe some lyrics. But it is clear that the chord progression and rhythm are more fundamental than the
melody: I have twice now asked two different Jazz pianists, after they had just finished a piece, "When you are
improvising while playing, what are you really doing?" Both times I got the same answer "I know the melody,
and I'm not playing it." The only way to be playing something that goes with the melody is to play a different
set of notes that are harmonically related to the melody, and, of course, in a related rhythm.
4.1.4 The Symmetries of the Circle of Fifths are a Terrible Red Herring
Michael O'Donnell points out the chord progression possibilities here (addressed to me) [O'Donnell, 14
January 2009]:
Many years ago, there was an article in the Computer Music Journal about the value of the
diatonic scale as a subset of the 12-tone chromatic scale. The article dealt entirely with group-theoretic structural properties of the notes under the half-step and perfect-fifth generators. It
deliberately left out acoustical/perceptual issues. I think that there is something to learn from this
structural study, and from connecting it to acoustical/perceptual issues, even though the authors
did not even claim to have solved some defined problem. Roughly speaking, while you go into
reasons why the major and minor thirds sound nice, this article discussed why they yield a lot of
structurally interesting harmonic progressions.
Using symmetries to create chord progressions may be interesting, however no amount of symmetries is going
to explain how it works that the Fifth sounds good in and of itself; that is, no one wants interesting chord
progressions made out of chords that don't sound like anything. As people love symmetry so much, the Circle
of Fifths makes for a powerful red herring. Don't fall for it. Mark [Hoemmen, 22 October 2011] again:
The issue there is that circle of fifths transitions have become perhaps a cultural expectation[.] So
there's an interaction between what people expect culturally, vs. what their brains expect
biologically.
4.2 But Other Cultures Have Different Musical Scales!

Many people object to a universal theory of music and claim that it is all "culturally relative". Such people
aren't really paying attention, and more evidence will likely be needed to convince them. From [Levitin2006,
p. 37]:
Nearly all this variation in context and sound comes from different ways of dividing up the octave
and, in virtually every case we know of, dividing it up into no more than twelve tones. Although it
has been claimed that Indian and Arab-Persian music use "microtuning" -- scales with intervals
much smaller than a semitone -- close analysis reveals that their scales also rely on twelve or
fewer tones and the others are simply expressive variations, glissandos (continuous glides from
one tone to another), and momentary passing tones, similar to the American blues tradition of
sliding into a note for emotional purposes.
The choice of twelve notes to a scale is likely to simply be the amount of complexity that the brain can handle
in its note-expectation engine. A scale (such as the C Major Scale) consists of several interlocking or
overlapping chords between which one can move while playing notes. However if the brain really wants to
expect and predict some outcomes, too many notes makes that too hard. Recall our discussion in Section 2.4
"Interestingness: Just Enough Complexity"; also see Section 7.2 "The Role of Narrative Generally" below
where we discuss how theme and ambiguity are used together to create narrative.
4.2.1 A Culture May Simply not be Fully Exploiting All of the Universal Harmonic
Features
Suppose the music of another culture is missing some important feature emerging from our canonical
derivation of chords and harmony. Does that mean the music of that culture is a counter-example to our
argument? Maybe not: just because the brain is capable of experiencing something does not mean that the art
of that culture has taken advantage of that fact.
That is, the situation could be as follows:
The brain is listening for certain canonical patterns induced by the Harmonic Series.
The brain induces processing artifacts such as relative pitch (listening for the ratios between tones/notes)
and octaves (listening for tones/notes normalized by factors of 2).
The brain is listening for parts of a feature vector firing.
However, a culture may not fully exploit all of these features in its tradition. An engineer might say that the
features above that we claim to be universal to human hearing define the shape of the parameter space, but a
particular design need not make use of the whole space.
Many cultures use only five notes in their scale; any such scale is called "Pentatonic" [pen]. For example, one
of these Pentatonic scales is the Major Pentatonic: the set of notes we get by starting at one note and going
around the Circle of Fifths four more times, resulting in five notes (starting with C, that would be C, G, D, A,
E). From [pen]:
Pentatonic scales are very common and are found all over the world....
Much African and Chinese music makes use of Pentatonic scales and yet African and Chinese people seem to
have no difficulty enjoying Western music. Their brains were always capable of hearing Western music, but
their culture has simply never made use of the rest of the available parameter space.
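The Major Pentatonic construction described above (start at C and take four further steps around the Circle of Fifths) can be written out directly. This is a minimal sketch with notes numbered in semi-tones from C and folded into one Octave; the function name `fifths_from` is hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def fifths_from(start, count):
    """The first `count` notes met when stepping up by Fifths from `start`."""
    return [(start + 7 * k) % 12 for k in range(count)]

pentatonic = fifths_from(0, 5)
print([NOTE_NAMES[n] for n in pentatonic])           # in Circle-of-Fifths order
print([NOTE_NAMES[n] for n in sorted(pentatonic)])   # in scale order
```

All five notes are white keys, so this scale is a subset of the C Major Scale.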
4.2.2 But The Nasca People Of Peru Use A Linear, Not A Logarithmic, Scale!
From [acoustical-demo, Demo 18], "Logarithmic and Linear Frequency Scales":
A musical scale is a succession of notes arranged in ascending or descending order. Most musical
composition is based on scales, the most common ones being those with five notes (pentatonic),
twelve notes (chromatic), or seven notes (major and minor diatonic, Dorian and Lydian modes,
etc.). Western music divides the Octave into 12 steps called semitones. All the semitones in an
Octave constitute a chromatic scale or 12-tone scale. However, most music makes use of a scale
of seven selected notes, designated as either a major scale or a minor scale and carrying the note
name of the lowest note. For example, the C-major scale is played on the piano by beginning with
any C and playing white keys until another C is reached.
Other musical cultures use different scales. The pentatonic or five-tone scale, for example, is basic
to Chinese music but also appears in Celtic and Native American music. A few cultures, such as
the Nasca Indians of Peru, have based their music on linear scales (Haeberli, 1979), but these are
rare. Most music is based on logarithmic (steps of equal frequency ratio Δf / f) rather than linear
(steps of equal frequency Δf) scales.
In this demonstration we compare both 7-step diatonic and 12-step chromatic scales with linear
and logarithmic steps.
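The two kinds of 12-step scale contrasted in this demonstration can be tabulated side by side. The sketch below assumes that both scales span exactly one Octave and start at 220 Hz; neither detail comes from the demonstration itself, and the function names are hypothetical.

```python
def logarithmic_scale(f0, steps):
    """Octave from f0 to 2*f0 in steps of equal frequency ratio."""
    return [f0 * 2 ** (k / steps) for k in range(steps + 1)]

def linear_scale(f0, steps):
    """Octave from f0 to 2*f0 in steps of equal frequency difference."""
    return [f0 + k * f0 / steps for k in range(steps + 1)]

f0 = 220.0  # assumed starting frequency in Hz (the A below Middle C)
for log_f, lin_f in zip(logarithmic_scale(f0, 12), linear_scale(f0, 12)):
    print(f"{log_f:8.2f} {lin_f:8.2f}")
```

The two columns agree at the bottom and top of the Octave; in between, every linear step lies above its logarithmic counterpart.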
Who knows what is going on with the linear scales of the Nasca. I suspect the following.
Linear-scale flutes are easy to make, as the lengths of the tubes of a linear-scale pan-flute increase
linearly rather than exponentially; in contrast, to make, say, a logarithmic-scale pan flute you must make tubes of
exponentially-related lengths, which is probably quite unintuitive the first time someone has to work that
out.
This culture was small and isolated and so no one ever noticed harmony and no one nearby showed it to
them; that is, innovation seems likely to be proportional to the number of people available to have new
ideas, but notice that the interesting number here is the number of people who are all in communication
with one another.
Let's consider the cultural evolution of visual instead of auditory processing for a moment. Cultures introduce
color names in an almost deterministic fashion. From [Feldman2006, p. 102-103]:
Berlin and Kay (Berlin et al. 1969) showed that, in languages around the world, basic color terms
had essentially the same focal colors, even though boundaries around color categories varied. The
neurophysiology of color vision was seen as directly providing the best explanation. There are
now a number of competing explanations for the commonality of focal colors, but they are all
based on embodiment (Kay et al. 2005).
... In the 1950s, color names were believed to be arbitrary in different languages. The assumption
was that you couldn't predict the ranges of these different color terms. Paul Kay and Brent Berlin
did a study in which they asked whether the boundaries of color terms were and also what colors
from a color chart were the best examples of each term. Between their own experiments and the
literature, they surveyed about 100 languages. They found that the boundaries for different
languages were somewhat different, but the best examples were quite similar. This study has
since been greatly expanded and the basic result confirmed (Kay et al. 2005).
... There is also considerable evidence on how the color word system evolves over time -- usually
when its community encounters other languages. Figure 8.1 outlines the development as speakers
of a language (like Dani) that has only two color words come to express further distinctions.
Systematically, when a third word is added, it distinguishes white from warm; a fourth term will
separate black from cool, and so on. Since this progression appears to hold very widely, it is rather
further evidence that human color terms are anything but arbitrary.
Where the figure 8.1 referred to above looks like this:
light-warm ---> white     ---> white ---> white  ---> white
            \-> warm      ---> warm  ---> red    ---> red
                                      \-> yellow ---> yellow
dark-cool  ---> dark-cool ---> black ---> black  ---> black
                           \-> cool  ---> cool   ---> green
                                                  \-> blue
(2 Terms)       (3 Terms)      (4 Terms)  (5 Terms)   (6 Terms)
(I should add that in class Feldman said that this progression of colors was not completely deterministic: there
are two paths through the space of color sophistication [Feldman, c. 2006]. Canonicality of algorithm has
limits; the main point still holds.)
Perhaps musical sophistication of a culture proceeds similarly:

1. Linear Scale, because it is easy to carve flutes that way?
2. (Circle of Fifths) Pentatonic Scale, so you have some Fifths to play with and you notice that Fifths
sound good (any scale having 5 notes is "pentatonic", so we mean the one made by four Fifths linked
together).
3. Major (Diatonic) Scale, so you can get more overtones such as the Third.
4. Equally-Tempered Scale, so you can do arbitrary key changes.
4.3 But You Can Make a Piece of Music Based Entirely on That
Utterly Un-Harmonic Interval, the Augmented Fourth!
Recall the diagram of the three interlocking Major Triads laid out all in one Octave with C as the fundamental.
There is a huge gap between F and G, which in the Major Scale we fill in with a black note F#/Gb. That results
in an interval between C and F# called the "Augmented Fourth" (or "Diminished Fifth"), as it is the Fourth, F,
plus a semi-tone (or the Fifth, G, minus a semi-tone).
Play C and F# on a piano; it sounds awful. This interval is also called the Tritone as the distance between C
and F# is three whole tones (where here "tone" means a distance of two semi-tones, so a distance of six
semi-tones). We can see how it emerges that it sounds so bad: the ratio between F# and C is not near that of
any of the harmonics in the Harmonic Series. This interval deserves its nickname as the Devil's Interval.
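How far the Tritone sits from the harmonics can be checked with the same semi-tone measure used for harmonic 15 earlier (the log of the ratio divided by the log of the twelfth root of two). The sketch below folds each odd harmonic up to 15 into one Octave; stopping at 15 and the function name `semitone_position` are assumptions of the sketch.

```python
from math import log2

def semitone_position(harmonic):
    """Semi-tones above the fundamental for a harmonic folded into one Octave."""
    octave = harmonic.bit_length() - 1       # largest power of two not above it
    return 12 * log2(harmonic / 2 ** octave)

TRITONE = 6
positions = {h: semitone_position(h) for h in range(1, 16, 2)}
nearest = min(positions, key=lambda h: abs(positions[h] - TRITONE))
for h, p in positions.items():
    print(f"harmonic {h:2d}: {p:6.3f}")
print(nearest, round(abs(positions[nearest] - TRITONE), 3))
```

The closest candidate is harmonic 11, which lands about half a semi-tone below F#, roughly as far from a piano key as a tone can be; the next closest, harmonic 3 (the Fifth), is a full semi-tone away.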
I have heard the following argument from a music student: all music must be culturally-relative (as opposed to
the universal, physics-and-computation explanation we give) because someone has even written a piece of
music based entirely in the most un-harmonic of intervals, the Augmented Fourth, and gotten away with it.
There are people who can abuse themselves to the point of re-calibrating their expectations to all kinds of
strange inputs, including thinking that getting beaten with whips is fun or that McDonald's tastes good. That
doesn't mean that those inputs are natural or good or beautiful or true. Ben Franklin pointed this out in a letter
to Lord Kames, June 2, 1765 [Franklin]:
[T]he Pleasure Artists feel in hearing much of that compos'd in the modern Taste, is not the
natural Pleasure arising from Melody or Harmony of Sounds, but of the same kind with the
Pleasure we feel on seeing the surprizing Feats of Tumblers and Rope Dancers, who execute
difficult Things. For my part, I take this to be really the Case and suppose it the Reason why those
who being unpractis'd in Music, and therefore unacquainted with those Difficulties, have little or
no Pleasure in hearing this Music. Many Pieces of it are mere Compositions of Tricks. I have
sometimes at a Concert attended by a common Audience plac'd myself so as to see all their Faces,
and observ'd no Signs of Pleasure in them during the Performance of much that was admir'd by
the Performers themselves; while a plain old Scottish Tune, which they disdain'd and could
scarcely be prevail'd on to play, gave manifest and general Delight.
4.4 But I've Been a Musician All My Life / Studied Music In College
and I've Never Heard Any of This Before!
Michael O'Donnell again [O'Donnell, 14 January 2009]:
I think that most "Music Theory" as taught in music departments is intended more as a
development of descriptive terminology and notation than as an explanatory theory. I think that a
lot of humanists don't understand the difference between description and explanation.
Amen to that. But what makes me angry is that music teachers do not even MENTION that there is a whole
science of the perception of sound and if you would like to know how it really works, then you should talk to
them. They basically lie by omission, which is how it ended up taking me several decades to figure all of this
out.
Recent music theory textbooks continue to be utterly exemplary in the physical science and yet fall on their
face completely when it comes to the computational nature of the brain. Catherine Schmidt-Jones [Schmidtmusic-theory] again:
Why are some note combinations consonant and some dissonant? Preferences for certain sounds
is partly cultural; that's one of the reasons why the traditional musics of various cultures can
sound so different from each other. Even within the tradition of Western music, opinions about
what is unpleasantly dissonant have changed a great deal over the centuries. But consonance and
dissonance do also have a strong physical basis in nature.
In simplest terms, the sound waves of consonant notes "fit" together much better than the sound
waves of dissonant notes. For example, if two notes are an octave apart, there will be exactly two
waves of one note for every one wave of the other note. If there are two and a tenth waves or
eleven twelfths of a wave of one note for every wave of another note, they don't fit together as
well. For much more about the physical basis of consonance and dissonance, see Acoustics for
Music Theory, Harmonic Series, and Tuning Systems.
Nope. There are way too many asymmetric phenomena left unexplained by this overly-simplified theory of
just whole numbers dividing each other, two major ones being (1) the feeling that the Sustained Chord should
"resolve" to the Major Triad, and (2) the difference in the feeling of Major and Minor Triads. See Section 5.1
"Helmholtz's Theory Relies Only On Interfering Overtones, But Harmony Is Something More" for more
details.
5 Helmholtz Fails to Fully Explain Harmony

After writing the first version of this article I read "This is Your Brain on Music" by Daniel J. Levitin
[Levitin2006] and met him when he spoke on his book at Black Oak Books in Berkeley. I said "I've derived
the Major Scale from first principles!" He replied in that knowing academic manner of a professor speaking to
a crackpot [Levitin, November 2006]: "Oh, it's been done by (19th-century German Physicist Hermann)
Helmholtz." I said "would you be willing to read my article on it?" "No," he replied. (At least he signed my
copy of the book.)
Indeed, Helmholtz's theory of harmony, detailed in his treatise on the subject "On the Sensations of Tone as a
Physiological Basis for the Theory of Music" [Helmholtz1863], seems to be the prevailing view of anyone I
have spoken to about the question "how does harmony work?". I therefore feel obligated to attempt to refute
it. (Note that I only attempt to refute Helmholtz's theory of harmony of one chord; given the Major and Minor
Triads, Helmholtz seems to give the same theory of a key as interlocking chords [Helmholtz1863, p. 300] as
we do above, though I thought of it independently.)
We need yet another acoustical phenomenon in order to introduce Helmholtz's theory, called "beating" [beat]:
[A] beat is an interference between two sounds of slightly different frequencies, perceived as
periodic variations in volume whose rate is the difference between the two frequencies.... Beating
can also be heard between notes that are near to, but not exactly, a harmonic interval, due to some
harmonic of the first note beating with a harmonic of the second note.
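The second kind of beating in this quote, between harmonics of two notes that are nearly a harmonic interval apart, can be put into numbers. The sketch below compares a Fifth tuned to the exact ratio 3/2 with an Equally Tempered Fifth; the frequency used for Middle C is an assumed round value and the function name `beat_rate` is hypothetical.

```python
def beat_rate(f1, f2):
    """Beats per second between two tones of nearly equal frequency."""
    return abs(f1 - f2)

c4 = 261.63                      # assumed frequency of Middle C in Hz
just_g = c4 * 3 / 2              # Perfect Fifth as the exact ratio 3/2
tempered_g = c4 * 2 ** (7 / 12)  # Equally Tempered Fifth

# Harmonic 3 of the lower note sits next to harmonic 2 of the upper note.
print(beat_rate(3 * c4, 2 * just_g))
print(round(beat_rate(3 * c4, 2 * tempered_g), 2))
```

With the exact ratio the two harmonics coincide and nothing beats; with the tempered Fifth they beat a little less than once per second.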
As far as I can tell from studying Helmholtz's treatise, his theory is the following.
Notes have overtones, or, as Helmholtz calls them, "upper partial tones". (Indeed the word "overtone" in
English seems to come from a mis-translation of what Helmholtz intended to call "upper tone" in
German.) Helmholtz also points out the phenomenon of beating.
Intervals of notes are dissonant (sound bad) due to (1) beating of (2) their overtones. This seems to be
the case when the ratio of their frequencies is not a ratio of small whole numbers; for more, see the
extensive quote in Section 5.3 "Helmholtz's Theory is that Pleasure is Only the Absence of Pain" where
Helmholtz enumerates intervals by increasing dissonance.
Intervals are consonant (sound good) if their notes are not dissonant. Further, chords are consonant if
the intervals of their notes are pairwise consonant.
However there is a bit of a difficulty in refuting Helmholtz: the theory so misses the point that it is rather hard
to do. If someone makes an argument with a small error, one can point out where he left the path of sense: it is
a local error which shows up clearly against a background of otherwise correct reasoning. On the other hand,
if someone is completely off, pointing out their error is much harder. I am reminded of the Mad Hatter's
question "Why is a raven like a writing desk?" (from "Alice in Wonderland" by Lewis Carroll [Carroll1865]).
A raven is not like a writing desk, and so much not so that it is a little hard to say exactly how it is not. I shall
attempt it anyway.
Helmholtz did not have Access to Computer Science
Before we do that, I should point out how it is that someone as intelligent and scientifically trained as
Helmholtz could produce a theory that just misses the point on harmony. The problem is that he was only a
physical scientist. Physical science is a tremendous breakthrough for human beings and physical scientists
know this; many questions that for millennia were the subject of superstition and wild speculation were finally
clarified, and in a way that could be verified on the anvil of experiment. This led not just to new
understanding, but new engineering, and from this we have built a wholly new world. It is a triumph that can
hardly be overstated.
However, such success can induce arrogance and therefore blindness. Computation science is the next
breakthrough, its wave washing over us even now. The science of complexity and emergent behavior,
"Computer Science" if you will, is Science 2.0 and its insights again bring new light to heretofore mysterious
questions. Physical scientists say that Computer Science is not a science, but they are wrong:
computation/algorithm is a phenomenon of nature that surrounds us and has stable, though often mysterious,
properties that can be reliably verified by experiment. It is a tradition in Music Theory to resort only to physics
for the explanations; however this is a mistake, as using only physical science ignores
the computational nature of the brain,
how much we really do know about the canonicality of the space of algorithm itself,
and how much the brain must be using the canonical algorithms to solve the very difficult problem of
surviving and thriving in everyday life using very constrained resources.
It is only because of this new computational understanding that this article before you can even be written.
5.1 Helmholtz's Theory Relies Only On Interfering Overtones, But Harmony Is Something More
"On the Sensations of Tone" by Hermann Helmholtz [Helmholtz1863, p. 211] (emphasis in the original):
The first problem is to determine under what conditions chords are consonant, in which case they
are termed concords. It is quite clear that the first condition of a concord is that each tone of it
should form a consonance with each of the other tones ; for if any two tones formed a dissonance,
beats would arise destroying the tunefulness of the chord. Concords of three tones are readily
found by taking two consonant intervals to any one fundamental tone as c, and then seeing
whether the new third interval between the two new tones, which is thus produced, is also
consonant. If this is the case each one of the three tones forms a consonant interval with each one
of the other two, and the chord is consonant, or is a concord.
Now compare to this passage from the "Auditory Demonstrations" CD [acoustical-demo, Demo 31], "Tones
and Tuning with Stretched Partials" (underlining added):
Most tonal musical instruments used in Western culture have spectra that are exactly or nearly
harmonic. Tone scales used in Western music are also based on intervals or frequency ratios of
simple integers, such as the octave (2:1), the fifth (3:2), and the major third (5:4). When several
instruments play a consonant chord such as a major triad (do-mi-so), harmonics will match so that
no beats will occur. If one changes the melodic scale to, for instance, a scale of equal
temperament with 13 tones in an octave, harmonics will have nearly but not exactly the same
frequency. Similarly, instruments with a nonharmonic overtone structures, such as conventional
carillon bells, can create unpleasant beat sensations even if their pitches are tuned to a natural
scale. Beats will not occur, however, if the melodic scale of an instrument's tones matches the
overtone structures of those tones.
First, a four-part chorale ("Als der gütige Gott") by J.S. Bach is played on a synthesized piano-like
instrument whose tones have 9 exactly harmonic partials with amplitudes inversely proportional to
harmonic number, and with exponential time decay. The melodic scale used is equally tempered,
with semitone frequency ratios of $\sqrt[12]{2}$ [the twelfth root of 2].

In the second example, the same piece is played with equally stretched harmonic as well as
melodic frequency ratios. The harmonics of each tone have been uniformly stretched on a
log-frequency scale such that the second harmonic is 2.1 times the fundamental frequency, the 4th
harmonic 4.41 times the fundamental, etc. The melodic scale is similarly tuned in such a way that
each "semitone" step represents a frequency ratio of $\sqrt[12]{2.1}$ [the twelfth root of 2.1]. The music is,
in a sense not dissonant because no beats occur. Nevertheless the harmonies may sound less
consonant than they did in the first demonstration. This suggests that the presence or absence of
beats is not the only criterion for consonance. The listener may also find it difficult to tell how
many voices or parts the chorale has, since notes seem to have lost their gestalt due to the
inharmonicity of their partials.
In the third example, the tones are made exactly harmonic again, but the melodic scale remains
stretched to an "octave" ratio of 2.1. Disturbing beats are heard, but the four voices have regained
their gestalt. The piece sounds as if it is played on an out-of-tune instrument.
In the final example, the harmonics of all tones are stretched, as was done in example 2, but the
melodic scale is one of equal temperament based on an octave ratio of 2.0. Again there are
annoying beats. This time, however, it is again very difficult to hear how many voices the chorale
has.
In other words, the above Demo 31 tests Helmholtz's assertion that what makes two notes consonant is that
when played together the two overtone series "fit" into one another, rather than being off and making "beats".
Their test works like this: using a computer we can make an artificial piano-like instrument that has any
properties we want, including those the laws of physics would prohibit, such as making an overtone series that
is stretched out. That is, the overtones are not in the Harmonic Series having the ratios 1, 2, 3, 4, 5... but in
a series of those numbers stretched uniformly on a log-frequency scale, so that an octave is a factor of
2.1 instead of the usual factor of 2. (Note that this is not multiplication by the constant 2.1 / 2 = 1.05: the
quoted text puts the 4th harmonic at 4.41 = 2.1^2, not 4 x 1.05 = 4.2, so harmonic n must lie at n raised to
the power log2(2.1), which is about 1.07.) In the
test, they vary two variables independently: (a) harmonics (overtones of the notes played): normal overtone
series versus stretched, and (b) melodics (fundamental tones of the notes played): normal scale versus
stretched scale. They play the same piece in four combinations.
1. harmonics normal, melodics normal,
2. harmonics stretched, melodics stretched,
3. harmonics normal, melodics stretched,
4. harmonics stretched, melodics normal.
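The four conditions can be sketched numerically. This is a minimal illustration under stated assumptions, not the synthesis code used for the demonstration: it takes the quoted description at face value (the 2nd harmonic at 2.1 and the 4th at 4.41 times the fundamental, so partial n lies at n raised to the power log2 of the octave ratio, and a scale step is the twelfth root of the octave ratio). The helper names and the 220 Hz reference pitch are hypothetical.

```python
import math

def partials(f0, count, harmonic_octave=2.0):
    """Partials of a tone stretched uniformly on a log-frequency scale."""
    exponent = math.log2(harmonic_octave)  # 1.0 for a normal harmonic series
    return [f0 * n ** exponent for n in range(1, count + 1)]

def scale_note(f0, steps, melodic_octave=2.0):
    """Fundamental `steps` scale steps above f0, with 12 steps per 'octave'."""
    return f0 * melodic_octave ** (steps / 12)

def octave_mismatch(harmonic_octave, melodic_octave, f0=220.0):
    """Gap in Hz between the lower note's 2nd partial and the note one
    melodic 'octave' above it; a nonzero gap is a source of beats."""
    second_partial = partials(f0, 2, harmonic_octave)[1]
    upper_fundamental = scale_note(f0, 12, melodic_octave)
    return abs(second_partial - upper_fundamental)

# (harmonic octave ratio, melodic octave ratio) for each part of the demo
conditions = {
    "1. harmonics normal, melodics normal": (2.0, 2.0),
    "2. harmonics stretched, melodics stretched": (2.1, 2.1),
    "3. harmonics normal, melodics stretched": (2.0, 2.1),
    "4. harmonics stretched, melodics normal": (2.1, 2.0),
}
for label, (h, m) in conditions.items():
    print(f"{label}: mismatch = {octave_mismatch(h, m):.3f} Hz")
```

Under these assumptions the gap is essentially zero in parts 1 and 2 and about 22 Hz in parts 3 and 4, which agrees with the commentator's report that beats appear only when the harmonics and the melodics disagree.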
If Helmholtz were right about harmony, then it seems that part 2 (both harmonics and melodics stretched the
same) would sound just as good as part 1 (normal). However the commentator points out that this is simply not
the case. The commentator further points out in the underlined sentence that this is bad news for the accepted
theory of Helmholtz: "This suggests that the presence or absence of beats is not the only criterion for
consonance." That is: (a) Helmholtz is right that harmonies minimize beats, and (b) if we stretch both the
harmony and the melody the same amount, we do not get beats, but (c) we have also lost some harmony, so
something else must be going on to create harmony.
That said, note that we are claiming that the theory of Helmholtz is incomplete, not that it is completely wrong:
it seems that overtones interfering with each other likely do matter, as suggested by the observation on beats in
the quoted text of Demo 31 above. If they did not, then our theory of pure vertical intervals from Section 2.3.2
"Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes" would
imply that part 4 of Demo 31 would sound as good as part 1; however, as the commentator points out, it does
not. Note that, in particular, it does not sound good due to beats (again, according to the commentator), which
seems to be what Helmholtz would suggest. Below, Terhardt [Terhardt1974-PCH] also agrees with this point
that the absence of beats/roughness does not cause consonance, however the presence of beats/roughness does
cause dissonance; see Section 6.1 "Terhardt Recognizes that the Brain is Listening For Something". Therefore,
again, the theory of Helmholtz isn't really wrong; it is however incomplete, and rather strikingly so. See
Section 2.3.3 "Vertical Intervals Have Pure Ratios" for more discussion on this point.
5.2 Helmholtz's Theory Doesn't Imply Virtual Pitch

A secondary feature of Demo 31 above is that in exactly those cases where the harmonics are stretched, the
commentator says the notes "lose their gestalt", suggesting very strongly that the brain is hard-wired to listen
for the (normal) Harmonic Series as we discussed in Section 2.1.1 "Virtual Pitch: Hearing the Harmonic Series
Even When it is Not There".
We can already see here that Helmholtz's analysis is in contradiction to our theory: he sees the "tunefulness" of
a chord as being reductionistic, or no more than the sum of the parts, whereas our theory is holistic, that the
brain is listening to the whole chord which is somehow more than just the sum of the parts. The theory of
Helmholtz has no way of explaining Virtual Pitch nor the phenomenon observed by musicians that one may
omit the Root of the chord(!) and the listener will still hear the chord, which we discuss and explain in Section
3.5.4 "Chords Inducing Ambiguity".
Such a phenomenon can only be explained by the brain listening for whole chords (as abstracted Harmonic
Series-es). Helmholtz's theory that notes sound good together because, well, in combination they weren't so
terribly dissonant, proposes no mechanism whatsoever for a tone/note being added because it was found to be
missing.
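One way to picture a listener supplying a missing tone is to ask which single Harmonic Series a set of sounding frequencies belongs to. The sketch below only illustrates that idea; it is not a model of the ear and not an algorithm taken from Helmholtz or from this article. It assumes exact integer frequencies in Hz, and the function name `implied_fundamental` is hypothetical.

```python
from functools import reduce
from math import gcd

def implied_fundamental(frequencies_hz):
    """Largest frequency of which every input is an exact integer multiple."""
    return reduce(gcd, frequencies_hz)

# Major triad in the ratio 4:5:6 on a 400 Hz root.
print(implied_fundamental([400, 500, 600]))   # 100
# Omit the root: the remaining notes still point to the same series.
print(implied_fundamental([500, 600]))        # 100
```

In both cases the implied fundamental is 100 Hz, two octaves below the 400 Hz root, so dropping the root does not change which series the remaining notes fit. A theory based only on pairwise beating has no counterpart to this step.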
5.3 Helmholtz's Theory is that Pleasure is Only the Absence of Pain

But what is Helmholtz's definition of these consonant intervals that are so important to his theory? From
[Helmholtz1863, p. 181] (emphasis in the original):
Two musical tones, therefore, which stand in the relation of a perfect Octave [2/1], a perfect
Twelfth [3/1], or a perfect Fifth [3/2], go on sounding uniformly without disturbance, and are thus
distinguished from the next adjacent intervals, imperfect Octaves and Fifths, of which a part of
the tone breaks up into distinct pulses, and consequently the two tones do not continue to sound
without interruption. For this reason the perfect Octave, Twelfth, and Fifth will be called
consonant intervals in contradistinction to the next adjacent intervals, which are termed
dissonant. Although these names were given long ago, long before anything was known about
upper partial tones and their beats, they give a very correct notion of the essential character of the
phenomenon which consist in the undisturbed or disturbed coexistence of sounds.
Well, there isn't much science here other than the proclamation that certain ratios are consonant because
everybody says so. But notice below that we are to accept on faith that an interval having a ratio 3/2 is
consonant, but an interval having a ratio of, say, 4/3 is not quite consonant; I suppose Helmholtz simply
considers this to be an empirical observation. For completeness I have included his entire taxonomy of
intervals and their qualities. From [Helmholtz1863, p. 194] (emphasis in the original):
When two musical tones are sounded at the same time, their united sound is generally disturbed
by the beats of the upper partials, so that a greater or less part of the whole mass of sound is
broken up into pulses of tone, and the joint effect is rough. This relation is called Dissonance.
But there are certain determinate ratios between pitch numbers, for which this rule suffers an
exception, and either no beats at all are formed, or at least only such as have so little intensity that
they produce no unpleasant disturbance of the united sound. These exceptional cases are called
Consonances.
1. The most perfect consonances are those that have been here called absolute, in which the
prime tone of one of the combined notes coincides with some partial tone of the other. To this
group belong the Octave, Twelfth, and double Octave.
2. Next follow the Fifth and the Fourth, which may be called perfect consonances, because they
may be used in all parts of the scale without any important disturbance of harmoniousness. The
Fourth is the less perfect consonance and approaches those of the next group. It owes its
superiority in musical practice simply to its being the defect of a Fifth from an Octave, a
circumstance to which we shall return in a later chapter.
3. The next group consists of the major Sixth and the major Third, which may be called medial
consonances. The old writers on harmony considered them as imperfect consonances. In lower
parts of the scale the disturbance of the harmoniousness is very sensible, but in the higher
positions it disappears, because the beats are too rapid to be sensible. But each, in good musical
qualities of tone, is independently characterized, by the fact that any little defect in its intonation
produces sensible beats of the upper partials, and consequently each interval is sharply separated
from all adjacent intervals.
4. The imperfect consonances, consisting of the minor Third and the minor Sixth, are not in
general independently characterized, because in good musical qualities of tone the partials on
which their definition depends are often not found for the minor Third, and are generally absent for
the minor Sixth, so that small imperfections in the intonation of these intervals do not necessarily
produce beats.
5.3.1 Harmony is Rapture

Helmholtz defines the pleasure of a chord as the absence of the pain of the beats of the overtones. However,
pleasure is not simply the absence of pain, or otherwise we should prefer silence to any music at all.
Sometimes we enjoy silence, but sometimes we really enjoy music.
Further, I and many others experience an intense joy at the sound of harmony, especially vocal harmony; see
Section 2.3.6 "Harmony is Sweeter Than Sweet" above. If this joy were merely the absence of the pain of
dissonance, how is it that I do not simply prefer the sound of one single voice? Here is a song that describes
the feeling well, "Harmony" [Grammer-harmony]:
Singin' Har-ar-ar...mony.
One voice makes me want to sing,
Two voices make me feel like a king,
Three voices gettin' out of hand,
Four like to take me to the promised land.
Singin' Har-ar-ar...mony,
These lyrics describing the power of harmony are not mere fanciful imagination. When I sang in the Men's
Barbershop group Cable Car Chorus I had a similar experience: harmony wasn't just nice and lacking in pain, it
was rapture. Once I was being given a ride back to town from practice with one of the senior members Charles
Feltman (now director) and the then-director Bill Ganz. I was searching for a word to describe the unique
feeling of singing four-part harmony. After a while I said the feeling was "visceral". They both immediately
exclaimed that that was the same word that they themselves had eventually converged upon after a similar
deliberation [Feltman, Ganz, c. mid 1990s]. Harmony is a uniquely powerful force for people. The mere lack
of a few interfering upper harmonics can't possibly produce from absence of annoyance such a powerful
presence of visceral rapture.
5.4 Helmholtz's Theory Fails to Fully Explain the Qualitative Difference Between the Major and Minor Triads
People experience the Minor Triad as sounding quite different from the Major Triad. Yet Helmholtz's primary
theory, namely that the "tunefulness" of a chord is a function of the consonance and dissonance of the
intervals induced by the chord, has problems here: as we saw above, the Major and Minor Triads induce the
exact same three intervals (in a different order)! Can he rescue himself by finding some other difference
between them? From [Helmholtz1863, p. 189]:
The precedence given to the Fourth over the major Sixth and major Third, is rather due to its
being the inversion of the Fifth than to its own inherent harmoniousness.
Wow is that strange. He speaks as if there are two ways of hearing the Fourth: hearing its "inherent"
harmoniousness, and then hearing it as an inversion of the Fifth. Yet, his theory gives no explanation at all as
to how an interval that is an inverted form of another (and shifted down an Octave) should be heard as the
same interval or as a related interval to the original: his whole theory is the interference of partials, which as
far as I can tell allows for no notion of recognizing one interval as an inversion of another. (He speaks more of
inversions on p. 213, but with no more justification.)
From a section with the heading (perhaps given by the translator?) "Ambiguity of the Minor Chord",
[Helmholtz1863, p. 294] (note that here the square brackets are in the original):
In the minor chord c - e'b + g, the g is a constituent of the compound tones of both c and e'b.
Neither e'b nor c occurs in either of the other two compound tones c, g. Hence it is clear that g at
least is a dependent tone. But, on the other hand, this minor chord can be regarded either as a
compound tone of c with an added e'b or as a compound tone of e'b with an added c. Both views
are entertained at different times, but the first usually prevails. If we regard the chord as the
compound tone of c, we find g for its third partial, while the foreign tone e'b only occupies the
place of the weak fifth partial e1. But if we regarded the chord as a compound tone of e'b,
although the weak fifth partial g would be properly represented, the stronger third partial, which
ought to be b'b, is replaced by the foreign tone c. Hence in modern music we usually find the
minor chord c - e'b + g treated as if its root or fundamental bass were c, so that the chord appears
as a somewhat altered and obscured compound tone of c. But the chord also occurs in the position
e'b + g ... c (or better still as e'b + g ... c') even in the key of B'b major, as a substitute for the
chord of the subdominant e'b. Rameau then calls it the chord of the great Sixth [in English 'added
Sixth'], and, more correctly than most modern theoreticians, regards e'b as its fundamental bass.
He notices that the Minor Triad contains different parts of different Harmonic Series-es, but doesn't fit into any
of them.
Helmholtz's explanation of the Minor Triad is quite complex and odd for a phenomenon so
frequently used and fundamental to music as the Minor Triad. The explanation of Helmholtz
makes the Minor sound like some abortive thing that we could just as easily throw out, whereas
in music the Major and Minor are treated as a kind of duals of almost equal status. We assert
that clear and simple phenomena such as this should have clear and simple explanations.
Further, he concludes by justifying the Minor Third as really being an Added Sixth when a different note is
taken as the Root (or "fundamental bass") of the chord. But as we saw above, according to Helmholtz the
(Major) Sixth is considered to be a rather dissonant interval. So for what purpose use it at all?
Further, I suspect that he really cannot be right about Eb being the real Root of a Cm (C Minor) chord,
because otherwise it would make no sense to build Minor chords with higher harmonics unless all of those
higher harmonics were also named wrong and also by coincidence made another chord. For example, the Cm9
(C Minor ninth) chord would not make any sense as a ninth chord if C was not really the Root of the chord. If
indeed Eb is the Root of Cm9, then the chord is really EbMaj7(add6) or Eb6(add Maj7); I suppose that could
be the case but it sounds awfully ad-hoc to me. (Similarly Cm7 becomes Eb6.) Try voicing them that way
(putting the higher harmonics in higher Octaves) and see how it sounds. From [Helmholtz1863, p. 300]:
Minor chords do not represent the compound tone of their root as well as the major chords : their
Third, indeed, does not form any part of this compound tone. The dominant chord alone is major,
and it contains the two supplementary tones of the scale. Hence when these appear as constituents
of the dominant triad, and therefore of the compound tone of the dominant, they are connected
with the tonic by the close relationship of Fifths. On the other hand, the tonic and subdominant
triads do not simply represent the compound tones of the tonic and subdominant notes, but are
accompanied by Thirds which cannot be reduced to the close relationship of Fifths. The tones of
the Minor Scale can therefore not be harmonized in such a way as to link them with the tonic note
by so close a relationship as in the major mode.
So Helmholtz does seem to notice that (1) all notes in the Major Triad have a ratio to the Root that is in the
Harmonic Series whereas (2) this is not the case for the Minor Triad. Further, on p. 212 Helmholtz tries all
combinations of triads consisting of two intervals already "known" to be consonant with the Root, and notices
that the only ones which induce a third interval that is on his list of intervals known to be consonant with each
other are the Major and Minor Triads.
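The search attributed to Helmholtz here is small enough to replay by machine. The following is a sketch under an assumed reading of his taxonomy, not a transcription of his p. 212 table: the consonances within one octave are taken to be the Fifth, Fourth, and the major and minor Thirds and Sixths, with their usual just-intonation ratios. The names `CONSONANT`, `consonant_triads`, and `interval_set` are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Just-intonation ratios for the consonances named in Helmholtz's taxonomy
# (an assumed reading; octave-sized intervals are left out of the search).
CONSONANT = {
    "minor Third": F(6, 5),
    "major Third": F(5, 4),
    "Fourth": F(4, 3),
    "Fifth": F(3, 2),
    "minor Sixth": F(8, 5),
    "major Sixth": F(5, 3),
}

def consonant_triads():
    """Pairs (a, b) of consonant intervals above a root, a < b, whose
    induced third interval b / a is also on the consonant list."""
    ratios = set(CONSONANT.values())
    return [(a, b) for a, b in combinations(sorted(ratios), 2)
            if b / a in ratios]

def interval_set(triad):
    """The three intervals induced by a triad, in sorted order."""
    a, b = triad
    return sorted([a, b, b / a])

for a, b in consonant_triads():
    print(f"1 : {a} : {b}   upper interval {b / a}")
```

Under this reading the search returns six triads: the Major Triad (1 : 5/4 : 3/2) and the Minor Triad (1 : 6/5 : 3/2) in root position, plus four others that are inversions of a major or a minor triad. Calling `interval_set` on the two root-position triads returns the same list, 6/5, 5/4, 3/2, for both, so a theory built only on which intervals occur cannot tell them apart.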
He almost gets to the truth: (1) he notices that the Minor Triad also has the same three intervals as the Major
Triad and (2) that the Major Triad "represents the compound tone of their root" better than the Minor. Yet he
never seems to put these two facts together into a coherent theory that the brain is listening for the presence of
something, namely an abstracted version of the overtone series of common sound-making devices (tubes and
strings, and in particular, voice) called the ideal Harmonic Series, rather than just an absence of colliding
"upper partials". That is, at best his theory explains how it is that Major sounds better than Minor, but it fails to
explain (a) how Major sounds so compelling and "right", even when inverted, (b) how Minor sounds like
something other than noise, like something I recognize, and yet simultaneously like something that is off or has
something missing about it, and (c) how it is that it might make sense that Major and Minor are afforded a kind
of almost equal dual status in music when his explanation for Minor is so convoluted and odd.
"Major" isn't just a combinatorial accident minimizing pain as Helmholtz would have us believe, instead
it is positively recognized as something.
"Minor" isn't just a more dissonant other combinatorial accident, instead it is both positively recognized
and yet not-right at the same time.
5.5 Helmholtz Isn't Really Wrong, He Just Fails To Be Really Right

I still feel as if I am trying to explain the difference between a raven and a writing desk: Helmholtz wanders
around for pages and pages with high verbosity computing ratios of notes. He then fails to come to some
coherent theory that really is more than "intervals not of low whole number ratios sound bad because they
make overtones that make beats". While beats do sound bad, he takes his theory too far, attempting to use it to
explain compelling phenomena that it just does not explain.
1. Helmholtz explains the pleasure of harmony as the absence of pain, however it is clearly more, a
presence of something. Helmholtz does not explain virtual pitch or the related phenomenon that chords
can be played with the Root or even the Fifth omitted. We explain it simply in Section 3.5.4 "Chords
Inducing Ambiguity": the brain wants to hear the Harmonic Series.
2. As far as I know Helmholtz has no explanation for why the Sustained chord "wants" to resolve to the
Major Triad. Again, we explain it simply in Section 3.5.4 "Chords Inducing Ambiguity": the brain wants
to hear one Harmonic Series.
3. Helmholtz almost seems to hit upon the idea that the Minor Triad has the ratios of the Harmonic Series
that occur in the Major Triad, but he really just notices that both triads induce three intervals with low
whole number ratios. Further he remarks that the Minor Triad can be thought of with different notes as
the Root, making the Minor Third into the Major Sixth, but he doesn't say for what purpose we should
care about this interval at all, having also said the Major Sixth is rather dissonant anyway. His theory
has no explanation for how it is that this very odd thing called the Minor Triad seems to have about
equal status with the Major Triad in idiomatic musical usage. We explain it simply: the compellingness of
both Major and Minor comes from the Harmonic Series with the Minor functioning as a kind of
Auditory Cubism; see Section 3.4.2 "The Minor as Auditory Cubism".
Helmholtz never seems to be wrong, he just never finishes the job: his theory just stops before explaining the
observed phenomena. We can measure the ratios of the height and width of two cars but it won't explain how
they got into a crash. Helmholtz just fails to provide a simple and compelling explanation of harmony. There's
nothing wrong with that -- future generations will find errors and omissions in our work as well -- such is the
nature of progress. That said however:
People should not be lulled into thinking that they have an answer when they do not -- that they
have a full understanding of harmony when they do not.
6 Other Modern Theories, such as Terhardt and 'Fusion or pattern matching' Theory
After deriving this whole theory I found some competing theories of consonance and dissonance to the theory
of Helmholtz which I felt obligated to investigate. From [maj-min]:
Advanced theory
[diagram] Minor as upside down major.
[diagram] Major and minor triads: The minor mode is considered the inverse of the major mode.
In the German theory by or derived from Hugo Riemann, the minor mode is considered the
inverse of the major mode, an upside down major scale based on (theoretical) undertones rather
than (actual) overtones (harmonics) (See also: Utonality). The "root" of the minor triad is thus
considered the top of the fifth, which, in the United States, is called "the" fifth. So in C minor, the
tonic root is actually G, and the leading tone is Ab (a halfstep), rather than, in major, the root
being C and the leading tone B (a halfstep).
This theory would be a great explanation of the Minor Triad if in reality there actually were undertones!
(People seem to be able to produce them in rather artificial circumstances which you never experience in daily
life and which therefore make no sense as an explanation: [undertone]) Or if someone could demonstrate some
computational process in the brain resulting in an artifact of undertones, say, due to the exploitation of
some symmetry as an optimization (similar to the way we conjecture Octaves to be an artifact resulting from
an optimization) then there might be something here. However I see no optimization in the processing of
(normal, everyday) sound that would require reversing the low-high order of tones. So much for this theory.
From [con-dis]:
Fusion or pattern matching: fundamentals may be perceived through pattern matching of the
separately analyzed partials to a best-fit exact-harmonic template (Gerson & Goldstein, 1978) or
the best-fit subharmonic (Terhardt, 1974), or harmonics may be perceptually fused into one entity,
with dissonances being those intervals less likely to be mistaken for unisons, the imperfect
intervals, because of the multiple estimates, at perfect intervals, of fundamentals, for one
harmonic tone (Terhardt, 1974). By these definitions, inharmonic partials of otherwise harmonic
spectra are usually processed separately (Hartmann et al., 1990), unless frequency or amplitude
modulated coherently with the harmonic partials (McAdams, 1983). For some of these definitions,
neural firing supplies the data for pattern matching; see directly below (e.g., Moore, 1989; pp.
183-187; Srulovicz & Goldstein, 1983).
Period length or neural-firing coincidence: with the length of periodic neural firing created by two
or more waveforms, higher simple numbers creating longer periods or lesser coincidence of neural
firing and thus dissonance (Patterson, 1986; Boomsliter & Creel, 1961; Meyer, 1898; Roederer,
1973, pp. 145-149). Purely harmonic tones cause neural firing exactly with the period or some
multiple of the pure tone.
The "Period length or neural-firing coincidence" theory seems easy to dispose of: how should it work that
coincidental firing of different tones should be something the brain is looking for particularly? Is it just a bug in
the brain that it is pleasurable when the periods of neural firings line up? There is no motivation.
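The arithmetic behind the quoted "period length" idea is at least easy to state. As a rough sketch, assuming two ideal periodic tones whose frequencies stand in an exact whole-number ratio (the function name `common_period_cycles` is hypothetical), the combined waveform repeats after a number of cycles read off from the ratio in lowest terms:

```python
from fractions import Fraction

def common_period_cycles(upper, lower):
    """Cycles completed by (lower tone, upper tone) before the combined
    waveform of two tones in the ratio upper:lower repeats."""
    ratio = Fraction(upper, lower)  # Fraction reduces to lowest terms
    return ratio.denominator, ratio.numerator

intervals = {
    "Octave": (2, 1),
    "Fifth": (3, 2),
    "major Third": (5, 4),
    "Tritone (45:32)": (45, 32),
}
for name, (p, q) in intervals.items():
    low, high = common_period_cycles(p, q)
    print(f"{name}: repeats after {low} cycles of the lower tone, {high} of the upper")
```

This only restates what the theory measures, namely that larger numbers in the ratio mean a longer common period; it does not supply the missing motivation for why the brain should care.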
However the "Fusion or pattern matching" theory seems much more interesting, so we will investigate it
further.
6.1 Terhardt Recognizes that the Brain is Listening For Something

I was unable to locate a copy of Terhardt, E. (1974) "On the perception of periodic sound fluctuations
(roughness)" Acustica 30 (4): 201-213, however I did locate Terhardt "Pitch, consonance, and harmony" J.
Acoust. Soc. Am., Vol. 55, No. 5, May 1974 (received 1973), pp. 1061-1069 [Terhardt1974-PCH] (thank you
helpful university librarians! see the acknowledgements). This work is surprisingly relevant; here is the
abstract in full (emphasis in the original):
Comparison of recent psychoacoustic data on consonance with those on roughness reveals that
"psychoacoustic consonance" merely corresponds to the absence of roughness and is only slightly
and indirectly correlated with musical intervals. Thus, psychoacoustic consonance cannot be
considered as the basis of the sense of musical intervals. The basis of that sense seems to be
provided by the concept of virtual pitch. This concept is introduced with a model. The concept
accounts for many psychoacoustic and musical phenomenas[sic], e.g., the ambiguity of pitch of
complex tones, the "residue," the pitch of inharmonic signals, the dominance of certain harmonics,
pitch shifts, the sense for musical intervals, octave periodicity, octave enlargement, "stretching" of
musical scales, and the "tonal meaning" of chords in music.
His first sentence points out without naming him that the theory of Helmholtz is flawed in the sense that it does
not explain everything, just as we point out. Then he points out that the concept of virtual pitch has a lot of
explanatory power; by virtual pitch he is implying the same model we are: the brain is looking for something,
namely the Harmonic Series (which is present in speech), and that this activity of the brain results in a
phenomenon called virtual pitch; see Section 2.1.1 "Virtual Pitch: Hearing the Harmonic Series Even When it
is Not There". The main point is that hearing sound is not a physical phenomenon all the way through and at
some point computational considerations take over.
Terhardt concludes his article with several points, the first three of which again agree with us completely
(emphasis in the original):
(1) The concept of psychoacoustic consonance which is defined by the absence of roughness does
not provide a satisfactory explanation of the ear's particular sense of musical intervals. On the
other hand, roughness probably is an important factor in music since it has an annoying effect.
Thus one may conclude that psychoacoustic consonance accounts for a sound's pleasantness, in a
very general sense.
(2) Tonal music seems to be based on the fact that the auditory system behaves as an active
processor of Gestalt attributes. By repeatedly processing speech, the auditory system acquires --
among other Gestalt laws -- knowledge of the specific pitch relations which exist between the
lower six to eight harmonics of complex tones. These pitch intervals become familiar to the
"central processor" of the auditory system and, moreover, convey "virtual tonal meanings," i.e.
certain subharmonic bass notes. This way, these intervals become the so-called musical intervals.
(3) The realization of musical sounds seems to be governed by the two foregoing principles which
may be termed the principle of minimal roughness and the principle of tonal meanings. Both
principles imply certain requirements for the fundamental-frequency ratios and spectral
configurations of realized musical chords.
Wow. Point (1) recapitulates both of our points about the theory of Helmholtz: (a) it is incomplete, however
(b) it is correct as far as it goes -- that is, absence of roughness does not provide consonance, but roughness
does provide dissonance. Point (2) recapitulates our point that the brain is listening for something and that this
likely derives from a computational mechanism for processing speech (he even calls the brain a "'central
processor'"). Point (3) posits that chords therefore result from two activities: (a) listening for something ("tonal
meaning") in addition to the usual (b) avoiding roughness.
I have not attempted to follow the computational model of the core of his article, but the result seems to be
that (a) using human speech input to train (b) his learning model results in a machine that exhibits the artifact
of hearing the classical intervals in the usual chords. This theory reminds me generally of our derivation of the
same intervals by (a) assuming, as he does, that the brain is optimized for processing speech and in particular
finding the harmonic series, and then (b) assuming a subtraction signal processing optimization in the brain
resulting in relative pitch, as we do in Section 2.2.1 "Relative Pitch: Differences Between Sounds". However,
how close his learning model and our subtraction optimization (the two parts labeled (b) above) are to one
another I cannot say, as I did not study his model.
However I do not see how he can get away without also assuming an additional mechanism for getting rid of
powers of two, as we do in Section 2.2.2 "Octaves: Sounds Normalized to a Factor of Two". Terhardt
[Terhardt1974-PCH, section E.2] claims that a simple learning model of the brain together with inputs from
speech also induce the octave effect. I did not follow his argument, but I also did not try very hard. If he is
right, then the number of mechanisms we require in the brain for our theory could possibly be reduced by one,
as we instead assumed Octaves result from an optimization built into the brain.
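To make "getting rid of powers of two" concrete, here is a minimal Python sketch. It assumes that normalizing to a factor of two means repeatedly halving or doubling a frequency ratio until it lies in the range [1, 2). The function name octave_normalize is hypothetical, and this is an illustration of the idea only, not Terhardt's learning model.

```python
from fractions import Fraction


def octave_normalize(ratio):
    """Fold a positive frequency ratio into [1, 2) by removing factors of two.

    Assumption: this is what "normalized to a factor of two" means here.
    """
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


# Harmonics 1 through 8 of a fundamental, folded into a single octave.
folded_harmonics = sorted({octave_normalize(n) for n in range(1, 9)})
print(folded_harmonics)
```

Under this assumption the first eight harmonics collapse to just four distinct ratios: 1, 5/4, 3/2, and 7/4.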
6.2 Terhardt Does Not Explain Sustained and Minor Chords

Terhardt [Terhardt1974-PCH] suggests that the "Gestalt" perception of the whole Harmonic Series induces the
auditory illusion called virtual pitch. He explains this by "analogy" with the effect that the "Gestalt" perception
of images may sometimes induce visual illusions; see his figure on the right, which he quotes from [Coren1972]
(that is, the image on the right is a literal screen shot of Figure 4 and its caption in [Terhardt1974-PCH] which
includes (1) a figure that Terhardt seems to be quoting from [Coren1972] and (2) a caption which seems to be
Terhardt remarking upon the figure he quoted from Coren; my apologies if this is not the correct way to quote a
figure of someone quoting a figure of someone else while also quoting the quoter's caption making commentary on
that quoted figure; it doesn't help that HTML doesn't have a standard way to add a caption and that there is no
good way to put quotes around a figure). We
basically agree with Terhardt's visual-illusion metaphor of virtual pitch; see Section 2.1.1 "Virtual Pitch:
Hearing the Harmonic Series Even When it is Not There". Terhardt's notion of Gestalt is a more primitive
version of the feature vector understanding that we give here in Section 2.5 "Recognition: Feature Vectors"
and the visual illusion illustrates the concomitant effect of false recognition that we speak of in Section 2.5.2
"False Recognition" (though our theory was arrived at independently through use of standard Computer
Science).
Terhardt again (emphasis in the original):
Thus, with respect to a theory of consonance and harmony it appears rather promising that the
concept of virtual pitch readily provides subjective cues which correspond to subharmonics of
given tones. Figure 7(h) [not shown] depicts the distribution of virtual-pitch cues which is
produced when the model is stimulated by a major triad consisting of three complex tones, i.e., a
typical sound of music. The distribution has pronounced maxima one and two octaves below the
lowest one of the primary fundamentals. This means that the system attributes to the chord I-III-V
the "tonal meaning" I. One can easily prove that the model reflects also for other musical chords
the well-known relations between fundamental frequencies and "tonal meaning." Hence, by
means of the concept of virtual pitch, the theory of consonance and harmony possibly can be
provided with what was lacking yet: a psychoacoustic basis.
In a sense, Terhardt really gets quite close to our theory, his notion of "Gestalt" explaining the Major Triad in a
manner similar to the way we do, though without real computational sophistication: he notices that a
recognizer trained to hear the Harmonic Series will also be fired by some chords. However, though Terhardt
goes on to claim that 'One can easily prove that the model reflects also for other musical chords the
well-known relations between fundamental frequencies and "tonal meaning."', he never actually does it.
Terhardt never goes further to use the common computational understanding that we have today of a feature
vector detector and computational disambiguation engine, so he does not come to our ultimate explanations of
ambiguous chords, such as the desire of the sustained chord to resolve to the Major Triad (see Section 3.5.4
"Chords Inducing Ambiguity") or how the Minor Triad can be understood as a form of auditory cubism
(see Section 3.4.2 "The Minor as Auditory Cubism").
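The observation that a Harmonic Series recognizer is also fired by some chords can be illustrated with a toy count of shared subharmonics. The Python sketch below is not Terhardt's model: it uses only the fundamentals of the tones, lets each tone "vote" for its first eight subharmonics, and reports the candidate pitches with the most votes. The frequencies (400:500:600 Hz for a Major Triad in the ratio 4:5:6, and 400:480:600 Hz for a Minor Triad in the ratio 10:12:15), the limit of eight divisors, and all function names are assumptions chosen for illustration.

```python
from collections import Counter
from fractions import Fraction


def subharmonic_votes(fundamentals, max_divisor=8):
    """Count how many tones have each candidate pitch among their subharmonics."""
    votes = Counter()
    for f in fundamentals:
        for n in range(1, max_divisor + 1):
            votes[Fraction(f, n)] += 1  # exact arithmetic, so coincidences are exact
    return votes


def best_candidates(fundamentals, max_divisor=8):
    """Return (sorted candidate pitches with the most votes, that vote count)."""
    votes = subharmonic_votes(fundamentals, max_divisor)
    top = max(votes.values())
    return sorted(p for p, v in votes.items() if v == top), top


major_candidates, major_votes = best_candidates([400, 500, 600])  # ratio 4:5:6
minor_candidates, minor_votes = best_candidates([400, 480, 600])  # ratio 10:12:15
print(major_candidates, major_votes)
print(minor_candidates, minor_votes)
```

In this toy count the Major Triad has a single best candidate, 100 Hz, supported by all three tones and lying two octaves below the lowest tone. The Minor Triad has no candidate supported by all three tones within the divisor limit; four candidates tie with two votes each. The sketch does not reproduce the maximum one octave below the lowest tone that Terhardt also reports, and it says nothing about how the chords feel.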
Regarding the other work of the above "Fusion or pattern matching" theory, "Evidence for a general template
in central optimal processing for pitch of complex tones" Gerson & Goldstein 1978 seems completely
irrelevant, but I found it quite hard to follow so I cannot be certain. There seems to be an article "Fusion or
pattern matching: fundamentals may be perceived through pattern matching of the separately analyzed partials
to a best-fit exact-harmonic template" by Gerson & Goldstein 1978 that I could not locate easily and so I did
not investigate. I also did not read any of the later references mentioned. The paragraph above from Wikipedia
implies that articles in this thread will be similar to Terhardt:
They support the conjecture that the brain is listening for something, likely the Harmonic series.
This observation therefore supports the conclusion that the brain finds the classic intervals inherently
pleasurable/interesting.
However this observation does not take this line of explanation further into the computational models of
feature vectors and disambiguation engines; we suggest that these models may be an original
contribution of this work.
Perhaps I should investigate the literature further, but I must truncate this investigation somewhere; even
UC Berkeley didn't have entire journal(s) that I needed for some articles (I invite the reader to subtract the
start and stop dates at the top of the article). If anyone had come up with my theory of the Minor presented in
Section 3.4.2 "The Minor as Auditory Cubism" then Temperley [Temperley2007] would likely not have still
been wondering how it is that the Minor and Major sound so different in January 2007.
However, should any of my readers find anything relevant in the undoubtedly rich literature of prior research
into acoustic models which have computational explanatory power relevant to our question of "how does
harmony work?", please do let me know and if I publish an updated version of this article, I will include it.
7 Future Work: Towards A Unifying Theory of Music
Can this explanation of harmony be extended to a more general model of music, including melody, rhythm,
and most importantly, emotion? We outline some thoughts on how this might be done.
7.1 Melody as Arpeggio

Does the brain somehow "keep" notes for a period after they have finished actually sounding? It seems likely.
Therefore could notes heard over time but not simultaneously combine to form chords? Given our model of
the brain listening for the Harmonic Series, it seems likely. Could playing some of the notes of a chord make
you "want" to hear the rest of the notes and complete the chord? Again, given our theory, it seems likely.
Breaking a chord apart and playing the notes one at a time is called "arpeggio" [arp].
Conjecture Eleven (Benjamin Franklin): Melody is arpeggio (at least partially).
Ben has so many good insights in this letter on music and the performance thereof that I cannot resist quoting
most of it. From [Franklin] (paragraph breaks added), to Lord Kames, June 2, 1765:
In my Passage to America, I read your excellent Work, the Elements of Criticism, in which I
found great Entertainment, much to admire, and nothing to reprove. I only wish'd you had
examin'd more fully the Subject of Music, and demonstrated that the Pleasure Artists feel in
hearing much of that compos'd in the modern Taste, is not the natural Pleasure arising from
Melody or Harmony of Sounds, but of the same kind with the Pleasure we feel on seeing the
surprizing Feats of Tumblers and Rope Dancers, who execute difficult Things.
For my part, I take this to be really the Case and suppose it the Reason why those who being
unpractis'd in Music, and therefore unacquainted with those Difficulties, have little or no Pleasure
in hearing this Music. Many Pieces of it are mere Compositions of Tricks. I have sometimes at a
Concert attended by a common Audience plac'd myself so as to see all their Faces, and observ'd
no Signs of Pleasure in them during the Performance of much that was admir'd by the Performers
themselves; while a plain old Scottish Tune, which they disdain'd and could scarcely be prevail'd
on to play, gave manifest and general Delight.
Give me leave on this Occasion to extend a little the Sense of your Position, That "Melody and
Harmony are separately agreable, and in Union delightful;" and to give it as my Opinion, that the
Reason why the Scotch Tunes have liv'd so long, and will probably live forever (if they escape
being stifled in modern affected Ornament) is merely this, that they are really Compositions of
Melody and Harmony united, or rather that their Melody is Harmony. I mean the simple Tunes
sung by a single Voice. As this will appear paradoxical I must explain my Meaning.
In common Acceptation indeed, only an agreable Succession of Sounds is called Melody, and only
the Co-existence of agreeing Sounds, Harmony. But since the Memory is capable of retaining for
some Moments a perfect Idea of the Pitch of a past Sound, so as to compare with it the Pitch of a
succeeding Sound, and judge truly of their Agreement or Disagreement, there may and does arise
from thence a Sense of Harmony between present and past Sounds, equally pleasing with that
between two present Sounds.
Now the Construction of the old Scotch Tunes is this, that almost every succeeding emphatical
Note, is a Third, a Fifth, an Octave, or in short some Note that is in Concord with the preceding
Note. Thirds are chiefly used, which are very pleasing Concords. I use the Word emphatical, to
distinguish those Notes which have a Stress laid on them in Singing the Tune, from the lighter
connecting Notes, that serve merely, like Grammar Articles, to tack the others together. That we
have a most perfect Idea of a Sound just past, I might appeal to all acquainted with Music, who
know how easy it is to repeat a Sound in the same Pitch with one just heard.
In Tuning an Instrument, a good Ear can as easily determine that two Strings are in Unison, by
sounding them separately, as by sounding them together; their Disagreement is also as easily, I
believe I may say more easily and better distinguish'd, when sounded separately; for when
sounded together, tho' you know by the Beating that one is higher than the other, you cannot tell
which it is. Farther, when we consider by whom these ancient Tunes were composed, and how
they were first performed, we shall see that such harmonical Succession of Sounds was natural
and even necessary in their Construction.
They were compos'd by the Minstrels of those days, to be plaid on the Harp accompany'd by the
Voice. The Harp was strung with Wire, and had no Contrivance like that in the modern
Harpsichord, by which the Sound of a preceding Note could be stopt the Moment a succeding
Note began. To avoid actual Discord it was therefore necessary that the succeeding emphatic Note
should be a Chord with the preceding, as their Sounds must exist at the same time. Hence arose
that Beauty in those Tunes that has so long pleas'd, and will please for ever, tho' Men scarce know
why. That they were originally compos'd for the Harp, and of the most simple kind, I mean a Harp
without any Half Notes but those in the natural Scale, and with no more than two Octaves of
Strings from C. to C.
I conjecture from another Circumstance, which is, that not one of those Tunes really ancient has a
single artificial Half Note in it; and that in Tunes where it was most convenient for the Voice, to
use the middle Notes of the Harp, and place the Key in F. there the B. which if used should be a B
flat, is always omitted by passing over it with a Third.
The Connoisseurs in modern Music will say I have no Taste, but I cannot help adding, that I
believe our Ancestors in hearing a good Song, distinctly articulated, sung to one of those Tunes
and accompanied by the Harp, felt more real Pleasure than is communicated by the generality of
modern Operas, exclusive of that arising from the Scenery and Dancing. Most Tunes of late
Composition, not having the natural Harmony united with their Melody, have recourse to the
artificial Harmony of a Bass and other accompanying Parts. This Support, in my Opinion, the old
Tunes do not need, and are rather confus'd than aided by it. Whoever has heard James Oswald
play them on his Violoncello, will be less inclin'd to dispute this with me. I have more than once
seen Tears of Pleasure in the Eyes of his Auditors; and yet I think even his Playing those Tunes
would please more, if he gave them less modern Ornament.
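Franklin's rule for the emphatical notes can be checked mechanically. The following Python sketch assumes that notes are given as MIDI note numbers and that "Concord" means the interval classes listed in CONCORD_SEMITONES. Both the encoding and that list are assumptions made for illustration, not Franklin's, and the two example note sequences are invented rather than taken from any actual tune.

```python
# Interval sizes in semitones (modulo the octave) treated as concords.
# Assumption: unison/octave, minor and major thirds, perfect fourth and
# fifth, minor and major sixths.
CONCORD_SEMITONES = {0, 3, 4, 5, 7, 8, 9}


def is_concord(note_a, note_b):
    """True if the interval between two MIDI notes is in the concord list."""
    return abs(note_a - note_b) % 12 in CONCORD_SEMITONES


def concord_fraction(emphatic_notes):
    """Fraction of consecutive emphatic-note pairs that form a concord."""
    pairs = list(zip(emphatic_notes, emphatic_notes[1:]))
    if not pairs:
        raise ValueError("need at least two notes")
    return sum(is_concord(a, b) for a, b in pairs) / len(pairs)


arpeggio = [60, 64, 67, 72]  # C4, E4, G4, C5: every step is a concord
stepwise = [60, 62, 64, 65]  # C4, D4, E4, F4: every step is a second
print(concord_fraction(arpeggio), concord_fraction(stepwise))
```

A tune built as Franklin describes would score near 1 on its emphatical notes, while a purely stepwise line scores 0.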
7.1.1 Scale As Theme: Melodic Association From Harmonic Association

Further examination of songs reveals that melody is not quite just arpeggio: within the melody line other notes
from the same scale are often thrown in that do not harmonize with the chord currently being emphasized. This
is particularly obvious if one reads music written in fake book [Neely1999] format. In contrast to the classical
format, where every single note is written out explicitly, in fake book format all that is presented of the music
is (1) a sequence of chords, and (2) a melody line. Two forces seem to be generating the notes that are actually
played:
harmony associates some notes with one another and we therefore expect them together, whereas
melody exhibits
a tendency to go "up and down" by consecutive or almost consecutive notes,
often traveling within the harmonic expectation of the current chord or scale, but sometimes not.
What is going on?
Perhaps melody is arpeggio of an entire scale. Recall that a scale is generally not one chord but three triads
(particularly powerful chords) occurring together. Recall from our discussion of theme in Section 2.4.1 "The
Simplicity of Theme" that any kind of expectation is useful to the brain, not just those that are computable
from direct harmonic relationship. Therefore it is possible that a scale creates a weaker kind of association
than (or containing) the stronger harmonic association of the current chord. Specifically, as a scale is multiple
chords, notes within a scale may be harmonically far from one another and may instead be related simply by
the fact that they are known to occur together, due to their frequent use together to make harmonically-related
chords.
Given the theme of a scale, we suggest that a general desire for the melody line to go vaguely "up" or "down"
will naturally create an expectation for the "next" or "previous" note in the scale, regardless of any direct
harmonic relationship between the two notes. Let's call this kind of association a "melodic association". Recall
the first page of the music theory book referred to above, "Jazz Improvisation 1: Tonal and Rhythmic
Principles" by John Mehegan [Mehegan1959], where Mehegan referred to harmonic association as "vertical"
and melodic association as "horizontal" (a different notion of vertical and horizontal than we give in Section
2.3.2 "Harmony Induces Two Kinds of Intervals: Horizontal Within the Note and Vertical Across the Notes").
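The two kinds of association can be sketched in a few lines of Python. The sketch assumes that the three triads making up the Major Scale are the Major Triads built on the tonic, the fourth, and the fifth, and it represents notes as pitch classes 0 to 11 with C as 0. The function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def major_triad(root):
    """Pitch classes of the Major Triad on the given root."""
    return {root % 12, (root + 4) % 12, (root + 7) % 12}


def scale_from_triads(tonic):
    """Union of three triads. Assumption: those on the tonic, fourth, and fifth."""
    notes = major_triad(tonic) | major_triad(tonic + 5) | major_triad(tonic + 7)
    return sorted(notes)


def melodic_neighbors(scale, pitch_class):
    """Previous and next note of the scale, wrapping around the octave."""
    i = scale.index(pitch_class)
    return scale[i - 1], scale[(i + 1) % len(scale)]


c_major = scale_from_triads(0)
print([NOTE_NAMES[p] for p in c_major])
print(melodic_neighbors(c_major, 4))  # neighbors of E
```

Here E and F are melodic neighbors even though they never share one of the three triads: E occurs only in the triad on C and F only in the triad on F. This is the sense in which a melodic association can hold regardless of any direct harmonic relationship.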
7.1.2 Streaming: Multiple Similar Phenomena Occurring Consecutively Are Explained By The Brain As One Thing Moving

One of the first movies was created by a horse galloping through a sequence of strings, each of which fired a
camera when broken. When the sequence of images taken was played back rapidly, people noticed an illusion
of a single horse moving, not multiple horses appearing one after the other. Similar tones that are close enough
and also occur consecutively seem to produce a similar effect in the brain. From [acoustical-demo, Demo 19],
"Pitch Streaming" (emphasis in the original):
It is clear in listening to melodies that sequences of tone can form coherent patterns. This is called
temporal coherence. When tones do not form patterns, but seem isolated, that is called fission.
Temporal coherence and fission are illustrated in a demonstration first presented by van Noorden
(1975) and included in the "Harvard Tapes" (1978). Van Noorden describes it as a "galloping
rhythm."
We present tones A and B in the sequence ABA ABA. Tone A has a frequency of of 2000 Hz,
tone B varies from 1000 to 4000 Hz and back again to 1000 Hz. Near the crossover points, the
tones appear to form a coherent patterns, characterized by a galloping rhythm, but at large
intervals the tones seem isolated, illustrating fission.
This demo supports my conjecture at the end of the article that, while there is "harmonic locality" to pitches
that sound good together, there is also a "melodic locality" to pitches that are near each other, which is why
melodies tend to be close notes going up and down.
Simon Goldsmith [Goldsmith, c. 2010] suggests that some natural phenomena are well-modeled this way, such
as a horse galloping on a plain or raindrops on a tree: that is, it seems natural to associate similar pitches as they
tend to come from the same physical phenomenon.
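The stimulus of the pitch streaming demo is simple enough to write down. The Python sketch below generates the ABA ABA frequency schedule with tone A fixed at 2000 Hz and tone B swept from 1000 Hz to 4000 Hz and back, and reports how far B is from A in semitones. The geometric spacing of the sweep, the number of steps, and the rest between triplets (written as None) are assumptions; the demo text quoted above does not specify them, nor does it give the interval at which coherence gives way to fission, so no threshold is modeled.

```python
import math


def semitone_distance(f_from, f_to):
    """Signed distance in equal-tempered semitones from f_from to f_to."""
    return 12 * math.log2(f_to / f_from)


def sweep_up_and_back(low_hz, high_hz, steps):
    """Geometric sweep from low to high and back down (assumed spacing)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    up = [low_hz * (high_hz / low_hz) ** (i / (steps - 1)) for i in range(steps)]
    return up + up[-2::-1]


def galloping_sequence(a_hz, b_values):
    """ABA triplets separated by a rest; None marks the (assumed) rest."""
    sequence = []
    for b_hz in b_values:
        sequence.extend([a_hz, b_hz, a_hz, None])
    return sequence


b_values = sweep_up_and_back(1000.0, 4000.0, steps=5)
sequence = galloping_sequence(2000.0, b_values)
distances = [semitone_distance(2000.0, b) for b in b_values]
print([round(d, 1) for d in distances])
```

Tone B starts an octave below A, crosses A in the middle of the upward sweep, reaches an octave above, and returns. The crossover points are where the distance is near zero, which is where the demo text says the galloping rhythm is heard.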
7.1.3 Melody can Easily Create Interesting Ambiguities

Melody can easily create interesting ambiguities even between multiple chords none of which are ambiguous
when played together. That is, melody can be used to play ambiguity games with chords, extending the effect
that we noticed earlier in Section 2.4.2 "The Complexity of Ambiguity": if I play two notes and your brain is
expecting them to complete a chord, there are still many that it could complete, even if none of these chords
are the strange chords given in Section 3.5.4 "Chords Inducing Ambiguity". Doing this certainly gives the
disambiguation and prediction engine in your brain something to do.
7.2 The Role of Narrative Generally

Recall our conjectures on the importance of theme and ambiguity from Section 2.4 "Interestingness: Just
Enough Complexity". How might theme and ambiguity play out as information is added over time, and what
would we call that?
Since anticipation and prediction are among the fundamental operations of the brain, they are at the heart of
what we call narrative and of how narrative can be so entertaining to the mind. Recall that there is an art to
balancing the simplicity of theme and the complexity of ambiguity: if understanding and prediction of the
storyline are too easy, then it is boring, and if too hard, then it is noise, but if just right, then it is interesting.
That is, it is likely that the brain has one disambiguation engine and that the processing that occurs in verbal
narrative would process similarly in other contexts, such as music. Above in Section 3.5.6 "Chords Preserving
Intervals but not Harmonics" we mentioned some chords that are hard to explain in isolation. Well, these
chords do sound strange in isolation; however, the theme created by the music preceding the chord may
bring a certain sense to them. Think of one standard structure for a joke: a story (creating a theme) and then a
punchline; the punchline would not be funny in isolation without the context provided by the story, and yet we
attribute the funniness of the joke to the punchline and not the story which did the work.
Recently while listening to the rhythm of an insect at twilight I was struck by how the rhythm occurred in
layers of declining theme and increasing complexity: there was a simple rhythm creating an expectation, and
then a regular violation of that expectation, creating a rhythm on top of that. This phenomenon of narrative, of
anticipation and prediction within a theme, applies to both harmony and rhythm. The phenomenon of
expectation itself is likely generic across kinds of inputs and so harmonic expectation should work in a
similarly layered manner as rhythmic expectation.
Keith Johnstone is a genius of the narrative form who helped create modern improvisational acting. Below he
points out that the duality of theme versus ambiguity is at the heart of what makes stories work when he
says "What matters to me is the ease with which I free-associate and the skill with which I reincorporate."
Notice that "free-associat[ing]" implies that he has created ambiguity (hard-to-model complexity) and the
"reincorporat[ing]" implies that he resolves the ambiguity, creating a theme. This two-part process of creating
ambiguity and then resolving to theme repeats and layers in real stories, previously-created themes being used
as context for the upcoming ambiguities.
From "Impro: Improvisation and the Theatre" by Keith Johnstone [Johnstone1981, p. 112-113], (emphasis in
the original):
Suppose I make up a story about meeting a bear in the forest. It chases me until I come to a lake. I
leap into a boat and row across to an island. On the island is a hut. In the hut is a beautiful girl
spinning golden thread. I make passionate love to the girl...
I am now storytelling but I haven't told a story. Everyone knows it isn't finished. I could continue
forever in the same way....
The trouble with such a sequence is that there's no place where it can stop, or rather, that it can
stop anywhere; you are unconsciously waiting for another activity to start, not free association,
but reincorporation.
... What matters to me is the ease with which I free-associate and the skill with which I
reincorporate.
Here's a 'good night' story made up by me and Dorcas (age six).
'What do you want a story about?' I asked.
'A little bird,' she said.
'That's right. And where did this little bird live?'
'With Mummy and Daddy bird.'
'Mummy and Daddy looked out of the nest one day and saw a man coming through the trees.
What did he have in his hand?'
'An axe.'
'And he took the axe and started chopping down all the trees with a white mark on. So Daddy bird
flew out of the nest, and do you know what he saw on the bark of his tree?'
'A white mark.'
'Which meant?'
'The man was going to cut down their tree.'
'So the birds all flew down to the river. Who did they meet?'
'Mr Elephant.'
'Yes. And Mr Elephant filled his trunk with water and washed the white mark away from the tree.
And what did he do with the water left in his trunk?'
'He squirted it over the man.'
'That's right. And he chased the man right out of the forest and the man never came back.'
'And is that the end of the story?'
'It is.'
At the age of six she has a better understanding of storytelling than many university students. She
links the man to the birds by giving him an axe. She links up the water left in the trunk with the
wood-cutter, who she remembers we'd shelved. She isn't concerned with content but any narrative
will have some (about insecurity, I suppose).
7.3 Embodiment and Emotion

Where does the emotion of music come from? Many people are asking what I consider to be a similar
question: where does the meaning of sentences come from?
The best answer I have heard to this question is "Embodied Construction Grammars" [con]. The idea is that
words and grammatical constructions in sentences are associated directly to an embodied experience. From
[Feldman2006, p. 7] (emphasis in the original):
The Embodied Mind
One simple insight has driven much of the scientific study of how the structure and function of the
brain results in thought and language. Human language and thought are crucially shaped by the
properties of our bodies and the structure of our physical and social environment. Language and
thought are not best studied as formal mathematics and logic, but as adaptations that enable
creatures like us to thrive in a wide range of situations....
The embodied approach entails several crucial questions. How much, and in exactly what ways,
are thought and language products of our bodies? How, exactly, does our embodied nature shape
the way we think and communicate? Here are some of the findings discussed in the course of this
book:
Concrete words and concepts directly label our embodied experience. Think of of such
short words in English as knee, kick, ask, read, want, sad.
Spatial relations, for example, concepts directly expressed by words such as in, through,
above, and around, can be seen as derived form specialized circuitry in the visual system:
topographic maps of the visual field, orientation-sensitive cells.
What is technically called "aspect" in linguistics -- the way we conceptualize the structure
of events, reason about events, and express events in language -- appears to stem from the
neural structure of our system of motor control.
Abstract thought grows out of concrete embodied experiences, typically sensory-motor
experiences. Much of abstract thought makes use of reasoning based on the underlying
embodied experience....
Grammar consists of neural circuitry pairing embodied concepts with sound (or sign).
Grammar is not a separate faculty, but depends on embodied conceptual and phonological
systems....
Thought and language are thus very strongly shaped by the nature of our bodies, our brains, and
our experience functioning in the everyday world....
Thoughts and language are not disembodied symbol systems that happen to be realized in the
human brain though its computational properties. Instead, thought and language are inherently
embodied. They reflect the structure of human bodies and have the inherent properties of neural
systems as well as the external physical and social environment.
And further [Feldman2006, p. 213]:
Understanding as Simulation
... The important point for us is that much of language can be seen as setting up the conditions for
imagining the scene being portrayed.
Perhaps we need an "embodied construction grammar" for music:
Harmony as an abstraction of human voice.
Rhythm as an abstraction of physical movement, such as walking or dance (which we discuss more in
the rhythm section below).
Since much human relationship is expressed through voice and movement, is music therefore recalling to us
experiences of vocally and physically relating to others and therefore also recalling the associated emotions?
As Zen Buddhists say, while on the one hand, "mind and body are two", on the other hand "mind and body are
one". (Maddeningly) though I have heard the non-dualism of body and mind expressed often in Zen (it is quite
standard), I have given up on trying to find a citation for that exact pair of sentences. Here is a pretty good
statement of the sentiment by Shunryu Suzuki in "Zen Mind, Beginner's Mind", [Suzuki1970, Epilogue: Zen
Mind]:
We Buddhists do not have any idea of material only, or mind only, or the products of our mind, or
mind as an attribute of being. What we are always talking about is that mind and body, mind and
material are always one.
7.4 A Proposal For A Unifying Physical and Computational Theory of Music

Harmony as Abstract Voice: In sum, we have the following theory of what is compelling about harmony.
Embodiment: The voice is of great importance to humans as it is one primary way of relating to each
other. Emotion is our ancient, pre-intellectual, way of understanding each other and therefore much
emotion is communicated in voice. It is likely that the brain has much hardware devoted to processing
voice, both for finding signal and separating out noise, and this hardware is being repurposed when
listening to music.
Abstraction: The Harmonic Series is an abstraction of voice. The features that occur as the artifacts of
its processing can be fired off by artists in clever ways beyond those that occur in nature. These
manipulations of the features of harmony produce manipulations of the human response to voice,
namely emotion.
Complexity: Discovering a theme in some input is a way to manage the complexity of the input. Likely
layers of theme and the resulting unexplained residual complexity are being processed by an expectation
engine in the brain. Much of the art of manipulating harmony is simply playing with this expectation
engine, giving it just enough complexity so the input remains on the interesting border between
monotony and noise.
Rhythm as Abstract Movement: This phenomenon of the joy of just-interesting-enough anticipation and
prediction connects thinking about harmony and rhythm. We did not take up rhythm in depth in this article,
however I think it is quite likely that rhythm can be explained in a manner similar to harmony.
Embodiment: Regular moments of contact occur throughout our lives while we use our bodies, for
example in our breathing, heart-beating, speaking, and walking.
Abstraction: Just as the Harmonic Series is an abstraction of voice, rhythm is an abstraction of
coordinated body movement.
Complexity: Just as melody plays with the expectation of harmony, rhythm plays with the expectation of
movement.
Melody as the Narrative Unifying of Harmony and Rhythm: Finally, we see melody as the unification of
Harmony and Rhythm as a single narrative, intertwining both.
8 Acknowledgements
I gratefully acknowledge the proofreading of Simon Goldsmith, Mark Hoemmen, Emma Dzelzkalns, Peter
McCorquodale, Ryan Barrett, Karl Chen, Russell Sears, and Michael O'Donnell. Thanks also to Michelle and
others at the Art and Music Department of the Central Berkeley Public Library for their assistance finding
Helmholtz and checking a quote over the phone. Thanks also to Joanne of The 24/7 Reference Cooperative
and John Kupersmith of the University of California, Berkeley Library for help in finding the obscure Terhardt
and Gerson & Goldstein articles.
I have to particularly acknowledge Michael O'Donnell for his extensive and in-depth discussions with me on
the topic of this paper. He also recommended to me the "Auditory Demonstrations" CD [acoustical-demo],
which certainly extended my understanding of the details of what is known about how the brain processes
sound. Mike has genuine enthusiasm for and knowledge of the subject of how the brain listens to music. He
was quite generous in sharing that knowledge with me by providing much thoughtful and thorough feedback.
Thanks Mike.
Any remaining errors are mine alone.
9 References
Personal Communication
[Auslander] Joel Auslander
[Feldman] Jerome A. Feldman: http://www.eecs.berkeley.edu/Faculty/Homepages/feldman.html
[Feltman] Charles Feltman: http://www.sfcablecarchorus.org/director.html
[Fultz] Andrea Fultz: http://www.andreafultz.com/
[Ganz] William Ganz: http://ucce.berkeley.edu/about
[Goldsmith] Simon Goldsmith: http://sfg.users.sonic.net/
[Hoemmen] Mark Hoemmen: http://www.cs.berkeley.edu/~mhoemmen/
[Levitin] Daniel J. Levitin: http://ego.psych.mcgill.ca/levitin.html/
[O'Donnell] Michael J. O'Donnell: http://www.cs.uchicago.edu/people/odonnell
[Stolorow] Ben Stolorow: http://www.benstolorow.com/
[Turner] Tim Turner
Print
[Alexander1979] Alexander, Christopher. 1979. "The Timeless Way of Building". Oxford University
Press.
[Birkhoff1933] George D. Birkhoff. 1933. "Aesthetic Measure"; in particular see "Chapter V: The
Diatonic Chords". Harvard University Press, Cambridge, Massachusetts.
[Carroll1865] Lewis Carroll, John Tenniel. 1865. "Alice in Wonderland".
[Coren1972] S. Coren. 1972. Psychol. Rev. 79, 359-367.
[Feldman2006] Jerome A. Feldman. 2006. "From Molecule to Metaphor". The M.I.T. Press, Cambridge,
Massachusetts.
[Feynman1965] Richard Feynman. 1983, 1965. "The Character of Physical Law". The M.I.T. Press,
Cambridge, Massachusetts.
[Feynman1985] Richard P. Feynman. 1985. '"Surely You're Joking, Mr. Feynman!": Adventures of a
Curious Character'. Bantam Books, New York.
[Franklin] Benjamin Franklin. 1959--. "The Papers of Benjamin Franklin". Yale University Press.
Online and searchable at: http://franklinpapers.org/
[Helmholtz1863] Hermann L. F. Helmholtz. 1877, 1863. "On the Sensations of Tone as a Physiological
Basis for the Theory of Music". A translation of "Die Lehre von den Tonempfindungen als
physiologische Grundlage für die Theorie der Musik", the first edition of which was published in 1863.
Title page note: "The Second English Edition, Translated, thoroughly Revised and Corrected, rendered
conformal to the Fourth (and last) German Edition of 1877... by Alexander J. Ellis". Dover, New York.
[Johnstone1981] Keith Johnstone. 1981. "Impro: Improvisation and the Theatre". Routledge, New York.
[Levitin2006] Daniel J. Levitin. August 2006. "This Is Your Brain on Music: The Science of a Human
Obsession". Dutton, New York.
http://www.yourbrainonmusic.com/
[Mehegan1959] John Mehegan. 1984, 1959. "Jazz Improvisation 1: Tonal and Rhythmic Principles".
Watson-Guptill Publications, New York.
[Neely1999] Blake Neely. 1999. "How to Play From a Fake Book: Faking Your Own Arrangements
from Melodies and Chords". Hal Leonard, Milwaukee, WI.
[Suzuki1970] Shunryu Suzuki. 2001, 1970. "Zen Mind, Beginner's Mind". Weatherhill, New York &
Tokyo.
[Temperley2007] David Temperley. 2007. "Music and Probability". The M.I.T. Press, Cambridge,
Massachusetts.
[Terhardt1974-PCH] Ernst Terhardt. 1973/4. "Pitch, consonance, and harmony". J. Acoust. Soc. Am.,
Vol. 55, No. 5, May 1974 (received 1973), pp. 1061-1069.
Sound
[acoustical-demo] A. J. M. Houtsma, T. D. Rossing, W. M. Wagenaars. 1 September 1987. "Auditory
Demonstrations" CD and booklet. Prepared at the Institute for Perception Research (IPO) Eindhoven,
The Netherlands. Supported by the Acoustical Society of America. http://asa.aip.org/discs.html
Image
[Jessica-pout] Image of Jessica Rabbit from "Who Framed Roger Rabbit?" [WFRR-1988], obtained
from http://screenmusings.org/WhoFramedRogerRabbit/pages/WFRR_0604.htm
[Picasso1938] Pablo Picasso. 1938. "Head of a Woman". (I have forgotten where I obtained this image.)
Film
[Jessica-bad] Jessica Rabbit in "Who Framed Roger Rabbit?" [WFRR-1988].
[WFRR-1988] "Who Framed Roger Rabbit?", directed by Robert Zemeckis and released by Touchstone
Pictures. 1988.
General Online
[Austen-caricature] Ben Austen. 15 July 2011. "What Caricatures Can Teach Us About Facial
Recognition". http://www.wired.com/magazine/2011/07/ff_caricature/
[Grammer-harmony] Red Grammer. "Harmony". Many websites say the author is "Anonymous", but
this one http://sniff.numachi.com/pages/tiHARMNY.html says the song is by Red Grammer, Smilin'
Atcha Music.
[Harmon-art-brain] Katherine Harmon. 4 June 2010. "Why so many artists have lazy eyes, and other
things art can teach us about the brain".
http://blogs.scientificamerican.com/observations/2010/06/04/why-so-many-artists-have-lazy-eyes-and-other-things-art-can-teach-us-about-the-brain/
[Jessica-great] "The 100 Greatest Movie Characters: 88 Jessica Rabbit".
http://www.empireonline.com/100-greatest-movie-characters/default.asp?c=88
[Mobley-antibiotics] Harry Mobley. March 13, 2006. "How do antibiotics kill bacterial cells but not
human cells?". http://www.scientificamerican.com/article.cfm?id=how-do-antibiotics-kill-b
[Mosquito-harmony] James Morgan. 8 January 2009. "Mosquitoes make sweet love music".
http://news.bbc.co.uk/2/hi/science/nature/7814404.stm
[Schmidt-music-theory] Catherine Schmidt-Jones. 10 January 2007. "Understanding Basic Music
Theory". http://cnx.org/content/col10363/1.3/
[Schmidt-waves] Catherine Schmidt-Jones. 11 March 2011. "Standing Waves and Musical Instruments".
http://cnx.org/content/m12413/1.11/
[Wilkerson-entropy] Daniel S. Wilkerson. October 2006. "An Intuitive Explanation of the Information
Entropy of a Random Variable". http://danielwilkerson.com/entropy.html
Wikipedia
Note that footnotes (but not inline citations) within Wikipedia articles are simply omitted: the alternative of
inlining the citation seemed cumbersome and they are readily available online.
[alg] Algorithm: http://en.wikipedia.org/wiki/Algorithm
[archetype] Archetype: http://en.wikipedia.org/wiki/Archetype
[arp] Arpeggio: http://en.wikipedia.org/wiki/Arpeggio
[beat] Beat (acoustics): http://en.wikipedia.org/wiki/Beat_(acoustics)
[canon] Canonical: http://en.wikipedia.org/wiki/Canonical
[con] Construction grammar: http://en.wikipedia.org/wiki/Construction_grammar
[con-dis] Consonance and dissonance: http://en.wikipedia.org/wiki/Consonance_and_dissonance
[concert-pitch] Concert pitch: http://en.wikipedia.org/wiki/Concert_pitch
[conv] Convergent evolution: http://en.wikipedia.org/wiki/Convergent_evolution
[cop] Copernican model of the solar system: http://en.wikipedia.org/wiki/Copernican_heliocentrism
[cutt] Cuttlefish: http://en.wikipedia.org/wiki/Cuttlefish
[dem] Demonic possession: http://en.wikipedia.org/wiki/Demonic_possession
[ent] Entropy: http://en.wikipedia.org/wiki/Entropy
[epi] Epicycles: http://en.wikipedia.org/wiki/Epicycle
[eqt] Equal temperament: http://en.wikipedia.org/wiki/Equal_temperament
[feat] Feature vector: http://en.wikipedia.org/wiki/Feature_vector
[gorb] Mikhail Gorbachev: http://en.wikipedia.org/wiki/Mikhail_Gorbachev
[har] Harmonics: http://en.wikipedia.org/wiki/Harmonics
[harmonic7] http://en.wikipedia.org/wiki/Minor_seventh
[hum] Humorism: http://en.wikipedia.org/wiki/Humorism
[just] Just tuning (intonation): http://en.wikipedia.org/wiki/Just_intonation
[log] Logarithm: http://en.wikipedia.org/wiki/Logarithm
[maj] Major scale: http://en.wikipedia.org/wiki/Major_scale
[maj-min] Major and minor: http://en.wikipedia.org/wiki/Major_and_minor
[min] Minor scale: http://en.wikipedia.org/wiki/Minor_scale
[min7] Minor seventh: http://en.wikipedia.org/wiki/Minor_seventh
[miss-fund] Missing fundamental (and virtual pitch): http://en.wikipedia.org/wiki/Missing_fundamental
[mus] Musical tuning: http://en.wikipedia.org/wiki/Musical_tuning
[occ] Occam's razor: http://en.wikipedia.org/wiki/Occam's_razor
[oct] Octave: http://en.wikipedia.org/wiki/Octave
[pen] Pentatonic Scale: http://en.wikipedia.org/wiki/Pentatonic_scale
[pitch] Pitch: http://en.wikipedia.org/wiki/Pitch_(music)
[ptol] Ptolemaic model of the solar system: http://en.wikipedia.org/wiki/Geocentric_model
[rel] Relative pitch: http://en.wikipedia.org/wiki/Relative_pitch
[sci] Scientific method: http://en.wikipedia.org/wiki/Scientific_method
[sci-pitch] Scientific pitch notation: http://en.wikipedia.org/wiki/Scientific_pitch_notation
[twelve-bb] Twelve-Bar Blues: http://en.wikipedia.org/wiki/Twelve-bar_blues
[undertone] Undertone series: http://en.wikipedia.org/wiki/Undertone_series
[venus-fly] Venus Flytrap: http://en.wikipedia.org/wiki/Venus_Flytrap
[wolf] Wolf intervals: http://en.wikipedia.org/wiki/Meantone_temperament#Wolf_intervals
