



Module Content
1 Electronic Music Production Course



Lesson content

Introduction to Music Production

OK, so you are new. You have an idea of making and producing your own music. And
you feel inspired. Perhaps you are a seasoned musician, tired of paying someone else to
produce your music. Perhaps you are building a studio to record your band. Or you are
into producing audio for video, film, and podcasts. Maybe you have nothing more than a
spark, the urge to create, a desire to fulfill a sense of artistic vision. It's cool. You are
welcome at the lab. You have found the right place and good people. We are here to help
you build your own recording studio, your own laboratory for creative projects that works
perfectly within your needs and budget. All the rules have changed in the past few years
for putting together a recording studio and they keep changing. It used to be that you
needed expensive multi-track recorders and mixdown machines, a roomful of outboard
gear and processors, and more cables than you would want to count.


Of course, you can still build a large studio with tons of outboard gear (which sounds
better than ever), or you can let computers and modern digital multitrack machines
replace hundreds of functions that used to require separate hardware units.

We are not talking about a cheap, hissy, unprofessional sound, like we used to get with
old 4 track cassette studios. Those days are gone. With the dawn of modern recording
software (called sequencers), with their full-featured digital mixers built right into
software, you can expect your sound to rival the big boys in the studios downtown.
Yes. It's true! For a modest investment in microphones, preamps, audio interfaces and
software you can be well on your way. I'm going to tell you all about today's gear, tell
you what you need and what you don't need, give you strategies for gear acquisition that
are tried and true, and show you where you can save money and exactly where you
should not compromise.

But don't think just because you have the gear you will sound like a million bucks,
automatically. No, my friend, it does not work that way. You need to understand music to
write music and you need to know how to use the gear or software that you have as
tools. Talent is important, and there are many talents required to make a full production.
But that doesn't mean you need to know how to play an instrument, like the keyboard,
with proficiency.

Basically, we consider the studio itself to be a musical instrument. Like any instrument,
you get good by practicing, trying different things, experimenting, mimicking, tweaking,
mixing.... After a while, it dawns on you that making music is a craft, the mixer is its
workbench, and the studio is its laboratory. You supply the creativity, your musicality,
your quest for musical beauty. You capture your tracks, then tweak them into a work of art.

The great masters of the recording arts learned their techniques by devoting their lives
to creating, capturing, and tweaking sound. These secrets are hard-earned, and used to be
passed down from the pros to their apprentices at big studios. You would set up


microphones, sweep the floor, run for coffee and take out the trash, then, one day they let
you help them at the console. Those days are almost gone.

In your recording studio, you get to have three roles--musician (as creator and
performer), audio engineer, and producer. What stands between you and the masters is
simply knowledge and experience. Their knowledge translates directly-- the tools in the
modern software studio have the same names and functions as the classic hardware
machines in a pro facility and are used in the same way. The big studios downtown have
compressors, limiters, vocal processors, delays, reverbs, equalization, multitrack
recorders, computer automation, and massive consoles that hook it all together. If you
have a modern software package or hardware digital multitrack, you have all these tools
too. They know exactly when and how to use EQ to clean up a track, when to use
compression, the precise place to put reverb in the mix, how to record vocals, guitars,
drums and how to level everything to make a stunning audio image. Tweakheadz.com is
going to be your guide to acquiring all these skills.

We will tell you the things that you absolutely must know for music production in a clear,
simple, even entertaining way. While much of this knowledge is technical, we'll avoid
bogging you down with unnecessary technical details. We are not flying to Mars here
(except maybe musically), so we can have a little fun. After all, our music is something
we want people to enjoy.

Because you may be running your studio on a computer, we'll cover that too. You need to
know how to tweak your computer as well as you know how to tweak your musical
score. Then there is the matter of understanding MIDI and digital audio and how these
work together (yes, they ARE different). This is a core concept underpinning how the
contemporary computer-based home studio works.


Key Components to a Home Studio


What computer should I buy?

Building a rock-solid audio system is easier than ever, given the processing power
commercially available today. But there are always some factors to take
into consideration. While we cannot recommend any specific brands or models, here are
some guidelines to keep in mind when purchasing a new computer for audio production.

The first step is to take a look at Ableton's minimum system requirements (see below) to
get an idea of what spec is needed to run Live.

Desktop, Laptop, Mac or PC?

It is not really important which platform you use and we recommend using whatever feels
the most comfortable, within your budget.

If you are already using an audio interface, it is always a good idea to make sure that your
system will support it. Ask yourself what requirements are important to run any existing
instruments or hardware you have, for example:

• Is your soundcard compatible with the operating system that is included
with the computer?
• Are there any known issues with different OS versions?
• Do you require special hardware specs, e.g., USB, 6-pin FireWire, FireWire
800, Thunderbolt?

Most hardware companies have dedicated support sections or forums, filled with users
documenting any problems they've encountered, and a quick google/browse should reveal
any note-worthy issues.

Smaller laptops can come with limited hardware specs, so a little research can save
headaches down the line and keep precious music-making time from being lost to troubleshooting.


What CPU?

Processor options when buying a Mac are usually condensed to high-performance Intel-
based chips, but with PCs the options are considerably broader.

Generally, multi-core Intel-based processors are considered among the best on the market,
the most powerful being in the Intel i5 and i7 families. With high-end Mac
computers, the Intel Xeon is widely regarded as one of the most powerful processors on the
market, and rightly so.

Slightly older dual-core Pentium chips, from the same lineage as the i-series, can also
perform well but the i-series are a newer and more powerful technology. Intel Atom
computers, while practical, are not built for robust processes like audio production.

The Intel website documents the differences between its current models in a detailed
comparison chart.

It is also worth noting that some AMD chips can meet the same spec as Intel, and are not
to be disregarded. Balancing your budget against performance is the key and a really
useful tool to assist with this is CPUBenchmark.net, which provides performance test
results for most CPUs on the market.

Hard Drive?

For basic Projects a regular hard drive will suffice, but when Projects become larger,
especially when streaming groups of large audio files straight from disk, speed becomes
a factor.

When possible, using a 7200 RPM (revolutions per minute) hard drive is recommended.
With commercial computer packages, this upgrade is offered at an additional cost, so
keep an eye out for it when purchasing.

Better still, but considerably more expensive, are SSDs (solid-state drives). With no
internal moving parts, SSDs are becoming more popular as prices drop. If it is within
your budget, an SSD can really add to your computer's performance - but it is not essential.

Note: If you are running your system in a serious studio environment or if your
productions become more complex, you can further optimize the performance with a
multiple hard drive setup.

What about RAM?

We recommend using the 32-bit version of Live, which uses up to 4 GB of RAM. Even
though the minimum recommended amount of RAM is 2 GB, having more is always a
good idea when running external plugins or loading Live Sets with many clips and
samples. To future-proof your new computer for the next few years, please consider 4 GB
or more of RAM.

Like CPUs, RAM also comes in different speeds, so checking the RAM benchmark
website will further guide you in the right direction.
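To get a rough sense of why RAM fills up quickly with sample-heavy Sets, here is a back-of-the-envelope calculation. The figures assume CD-quality stereo audio stored as 32-bit floats, which is how many DAWs hold audio internally; Live's actual memory use will vary, so treat this as an illustration only:

```python
# Rough memory footprint of uncompressed audio held in RAM.
# Assumes 44.1 kHz stereo, 32-bit float samples (4 bytes each).

SAMPLE_RATE = 44_100   # samples per second, per channel
CHANNELS = 2           # stereo
BYTES_PER_SAMPLE = 4   # 32-bit float

def audio_ram_mb(seconds: float) -> float:
    """Approximate RAM in MB for `seconds` of audio."""
    return seconds * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE / (1024 ** 2)

one_minute = audio_ram_mb(60)
print(f"1 minute of stereo audio: {one_minute:.1f} MB")  # about 20 MB

# With a 32-bit process capped at 4 GB, how many minutes fit in,
# say, 3 GB left over after the OS, plugins, and the app itself?
print(f"Minutes that fit in 3 GB: {(3 * 1024) / one_minute:.0f}")
```

A minute of stereo audio is only about 20 MB, but dozens of clips, multisampled instruments, and plugin buffers add up fast, which is why more than the minimum 2 GB pays off.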

Note: The 64-bit version of Live can utilize considerably more memory, with a limit
theoretically higher than any home computer could house. If you wish to run the 64-
bit version and it is within your budget to purchase additional RAM, please take a look
through this FAQ for some clarification on the most commonly asked questions.

Numbers and charts are well and good when shopping around, but the best advice to take
is to use the above information as a guide, and to buy the most powerful computer you
can afford.

*** What are the Live 9 system requirements? ***

Intel® Mac with Mac OS X 10.7 or later, or PC with Windows 7 or Windows 8;
Multicore processor; 4 GB RAM; 1024x768 display; DVD drive or broadband internet
connection for installation;

Required disk space for basic installation:

3GB free disk space.

Required disk space if all included sounds are installed:

• Live 9 Suite: 55 GB free disk space

• Live 9 Standard: 12 GB free disk space
• Live 9 Intro: 6 GB free disk space

Live 9 is compatible with the legacy operating systems Mac OS X 10.5 and 10.6,
Windows XP and Vista up to version Live 9.1.10.

Live 9.2 and later is not compatible with Mac OS X 10.5 and 10.6, Windows XP, or Windows Vista.



*** DAW Software ***

In today's studio, the DAW - Digital Audio Workstation - rules the roost. While plenty of
people are still using analog machines or modular digital multitracks such as ADATs and
DA-88s, there simply is no comparison when it comes to the sheer power, capability,
sound quality, and cost-effectiveness of today's computer-based recording systems. Let's
take a look at what goes into today's DAW, and how to choose the right one for your
studio or live recording situation.

We'll be examining a number of aspects of the DAW story - software, hardware,
computers, options, and much more. Read through the info here, then check out the
product pages dedicated to each DAW program as well as all the audio interfaces that are
available. When you've finished, contact your Sales Engineer for the full story on
everything DAW.

It's a brave new computer-based studio world out there, and we've never had more power
or capability available to us for less money. Choose your weapons, assemble your studio,
and get busy making magic!

What is a DAW?

As you may know, "DAW" stands for "Digital Audio Workstation." But what does that
really mean? First of all, when most people mention a DAW, they're referring to a
recording system - a workstation - that has several components (in some cases you'll
purchase each of these components separately, in others you can get a "turnkey" rig that
includes everything you need from software to hardware):

Digital Audio Software - If your computer is the heart of your DAW system, then the
software you choose is the brain! The software is where it all comes together, creating a
virtual studio living inside your computer. There are a number of DAW software
packages out there. Finding the one that's best for you can be a challenge. But the good
news is, all the DAW software applications out there - even free ones like GarageBand -
are amazingly powerful, and can do more than most of us are likely to ever need (not that
we don't all have our wish lists for features the manufacturers can add to the next
versions, of course). The big thing with DAW software (or any software, for that matter)
is finding an application that works in the way you prefer to work. If your brain and
creative process works a particular way, you're not going to be happy with a program that
forces you to work in a completely different manner.

Fortunately, many DAW software companies offer demo versions of their software. Or
they may offer a very inexpensive "entry level" version of the package that will let you
get a feel for the software before you buy the full deal. Most of these "lite" versions can
be upgraded to the full version for the difference in price, so you're not out anything if
you decide to "make sure" before you take the plunge.


*** SPEAKERS ***

How to Choose Studio Monitors

Your studio monitor speakers need to provide as accurate and uncolored a representation
of your music as possible. Whether you’re recording, editing, mixing, or mastering audio,
a sonically transparent monitoring system ensures your mix will translate well to
headphones, car audio systems, TVs, and other listening systems. And because there is
also more to consider than accuracy when selecting studio monitors, we’ve put together
this guide to help you find the best options for your studio space.

Active vs. Passive Monitors

While there is a vast array of active studio monitors to choose from these days, it’s worth
noting that the recording industry grew up with passive monitors. While one type isn’t
completely superior to the other, it’s important to understand the differences.

Passive monitoring systems are modular in nature, requiring you to match your speakers
with an appropriate amplifier and crossover. Active monitors have all that built in, which
presents a few benefits. You don’t have to deal with extra rack gear, and you know the
internal amplifier is specifically matched to that speaker for the best sonic performance.

While you can create a world-class passive monitoring system, countless professional
studios worldwide rely on active systems with no regrets. Unless
you already have a specific reason to prefer a passive system, you’ll probably appreciate
both the convenience and performance you get from an active studio monitoring system.

Power – How Many Watts Do I Need?

In a studio monitoring system, the power handling of the system is going to have a big
effect on the overall sound, and not just in terms of volume. It also determines your
dynamic range, the amount of headroom you have before signals peak. Higher wattage
means you’ll be able to hear more transient detail, and you’ll be better able to make
precise adjustments to compressors, limiters, and gates.

Said another way, if you listen to a mix on two monitoring systems with different
wattages at the same average volume level, the higher wattage system will give you more
headroom. Many people are unaware that music peaks (transients like snare hits or kick
drums) can demand as much as 10X (yes, TEN TIMES) as much power as average music
program material. So for a given volume level that might demand an average of 20 watts,
the program peaks could require as much as 200 watts. If you have an amp that can
deliver a clean 70 watts, then you’ll be 130 watts shy of what you need. That results in
greater distortion and possibly clipping during that musical peak, and for a kick drum in
most pop music, that happens very often. While you don’t necessarily need the highest
possible power rating, keep in mind that more wattage will produce more definition and
dynamic range, not just overall volume.
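The arithmetic behind that example is just the decibel power formula. Here is a quick sketch using the wattages from the paragraph above:

```python
import math

def db_power_ratio(p1: float, p2: float) -> float:
    """Decibel difference between two power levels."""
    return 10 * math.log10(p1 / p2)

average_w = 20    # average program level from the example above
peak_w = 200      # a 10x transient peak (snare or kick hit)
amp_w = 70        # what the example amp can deliver cleanly

print(f"Headroom the peak demands:   {db_power_ratio(peak_w, average_w):.1f} dB")
print(f"Headroom the 70 W amp gives: {db_power_ratio(amp_w, average_w):.1f} dB")
# A 10x power peak needs 10 dB of headroom over the average level,
# but 70 W over a 20 W average provides only about 5.4 dB -- so the
# transient drives the amp into distortion or clipping.
```

This is also why doubling an amp's wattage buys only 3 dB of extra headroom: headroom scales with the logarithm of power, not linearly.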

Power – Single-amp, Bi-amp, and Tri-amp

How the input signal is divided up to power the drivers in a studio monitor determines
whether it’s a single-amp, bi-amp, or tri-amp configuration. Many studio monitors have
two speakers in them: a tweeter for high frequencies and a woofer for low and midrange
frequencies. Some may add a third speaker so that low frequencies are sent to the woofer,
and mid-frequencies are sent to a dedicated midrange speaker. In a single-amp
configuration, a crossover network divides the output of one amplifier, which sends the
appropriate frequencies to each speaker: low frequencies to the woofer and high
frequencies to the tweeter. In a bi-amp configuration, the crossover network precedes two
separate amplifiers, which are each used to power the high- and low-frequency drivers. A
studio monitor that divides the signal three ways to feed three amplifiers that drive each
high-, mid-, and low-frequency speaker individually is a tri-amp configuration.

Generally speaking, bi-amp and tri-amp configurations have a flatter (more accurate)
frequency response, as well as greater definition. By powering each speaker individually,
instead of all from a single amp, each driver is able to reproduce its dedicated frequency
range more precisely. When comparing single-amped with bi- or tri-amped monitors that
are similar in speaker size, the bi- and tri-amped monitors will usually sound clearer and
more defined.
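To make the split-and-recombine idea of a crossover concrete, here is a toy digital version: a first-order low-pass feeds the "woofer" and its complement feeds the "tweeter." Real monitor crossovers use analog circuits or steeper DSP filters, so this is only an illustration of the principle, not how any particular monitor works:

```python
# Toy two-way crossover: a one-pole low-pass supplies the low band,
# and the complement (input minus low band) supplies the high band.
# First-order complementary bands sum back to the original signal.

def split_bands(signal, alpha=0.1):
    """Split `signal` into (low, high) bands.
    `alpha` sets the crossover point (0 < alpha <= 1)."""
    low, high = [], []
    state = 0.0
    for x in signal:
        state += alpha * (x - state)   # one-pole low-pass
        low.append(state)
        high.append(x - state)         # complementary high band
    return low, high

# A test signal with a slow trend plus fast alternation:
sig = [i % 2 + i * 0.01 for i in range(100)]
low, high = split_bands(sig)

# The bands recombine to the original (within float rounding),
# so nothing is lost in the split -- each amp just sees its share.
print(max(abs(l + h - x) for l, h, x in zip(low, high, sig)))
```

In a bi-amped monitor, each of those two bands would go to its own amplifier and driver, which is exactly why each driver only has to reproduce the range it is built for.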

Different Driver Types and Why They Matter

You’ll find all manner of speaker construction materials out there, from paper to Kevlar
to aluminum alloys and beyond. Manufacturers are constantly innovating, and if you’re
interested there are plenty of resources available about the properties of different
materials. But step back for a moment – do you really care what it’s made of at the end of
the day?

Materials play a big part in the sound of a speaker, but would you really buy studio
monitors based on one specific material used in their construction? While we fully
acknowledge the huge impact driver materials have on a speaker's sound, you can quickly
get confused if you focus on materials instead of application-specific benefits.

Cabinet Considerations: Ported or Closed?

You’ll find that many smaller studio monitors, and quite a few larger ones too, have a
ported cabinet that helps extend the frequency response lower for more bass. While this
can be beneficial, the sonic accuracy of ported cabinets may not be as precise as closed
cabinets. This behavior is exaggerated if the ports are on the back of the speakers, and
they are then placed too close to a wall. If you can’t avoid putting your studio monitors
close to walls or corners, you may want to choose front-ported or closed designs for more
accurate monitoring.

EQ, Room Correction, and Other Features

Many studio monitors have some type of EQ built in to help you tune them to your room.
Some even have digital processing that can optimize their performance for your acoustic
space. While these are helpful features, it’s important to remember that you can’t cheat
physics. EQ and room correction DSP can help you make the most of a bad-sounding
room, and they can make a room with good acoustics sound great. But ultimately, no set
of speakers can make up for uncontrolled acoustics in your control room.

Do I Need a Subwoofer?

It depends completely on what you’re doing with audio. If you’re mixing sound for TV
or motion pictures, then a multi-speaker monitoring setup with a subwoofer is practically
essential. If you’re mixing your band’s demo tracks that you recorded in your basement,
you really only need a stereo pair of studio monitors. Ask yourself this – how will your
audience be listening to your project? If it’s likely to get played through a home theater
system with a sub, or a powerful dance club system, you’ll need a subwoofer to hear
what’s going on in the lowest bass octaves. If you’re mixing music that most people will
listen to on their iPod or in their car, your mixes probably won’t benefit much from the
extended range a subwoofer adds to your monitoring system.

Another consideration is the size of your room. Without getting into the mathematics,
smaller rooms simply aren’t large enough to allow bass frequencies to fully develop.
Putting a subwoofer in a small room is just asking for sonic inaccuracies: you’ll notice
volume peaks and dips throughout the room, some bass notes will sound solid while
others will sound clouded and indistinct, and your studio’s overall sound will be
unbalanced. You also need to be cautious about introducing too much low frequency
energy into your room and skewing your perception of how much low end is appropriate
for your mix. Acoustic treatment such as bass traps will help reduce these issues, but the
size of your room will always remain a limiting factor in your quest for sonic accuracy.


Placement and Isolation

At least there are no gray areas here – place a stereo pair of studio monitors so they form
an equilateral triangle with your head when you’re seated in your mix position. Said
another way, place them so that they’re as far away from you as they are from each other.
This will result in the most accurate frequency response and clearest stereo image. If
you’re setting up a multichannel surround sound system, it gets more complicated –
contact your Sweetwater Sales Engineer and they can direct you to more resources on
surround sound speaker placement.

Speaker stands help improve the sound of your monitors compared to placing them
directly on a desk or mixing console. Sound will reflect off the desk or console and will
arrive at your ears just slightly after the direct sound, causing subtle comb filtering that
reduces the accuracy of your monitoring. Here’s a studio tip from a pro: Place a small
mirror on any hard surface surrounding or between you and your monitor. If you can see
the monitor’s reflection in the mirror from your mix position, that surface will act like an
acoustic mirror and reflect sound via a non-direct path into your ear, causing comb
filtering. Any acoustically hard area between you and your monitors should be treated
with sound-absorbing material. Also, speakers will transmit some of their energy into the
surface they’re placed on, causing further sonic distortions if they’re not isolated. Placing
your studio monitors on stands with isolation pads is the best way to decouple them from
those surfaces.
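The comb filtering that the mirror trick exposes is easy to quantify: a reflection travels a longer path than the direct sound, arrives late, and cancels the signal wherever that delay equals half a cycle. Here is a quick sketch; the 30 cm path difference is an invented example for a typical desk bounce, not a measurement:

```python
SPEED_OF_SOUND = 343.0  # m/s, at room temperature

def comb_notches(extra_path_m: float, count: int = 3):
    """First few comb-filter notch frequencies (Hz) caused by a
    reflection arriving `extra_path_m` meters later than the direct
    sound. Notches fall at odd multiples of c / (2 * extra_path)."""
    delay = extra_path_m / SPEED_OF_SOUND   # delay in seconds
    return [(2 * k + 1) / (2 * delay) for k in range(count)]

# A desk bounce that adds roughly 30 cm of path length:
for f in comb_notches(0.30):
    print(f"notch near {f:.0f} Hz")
```

For that 30 cm bounce, the notches land near 572, 1715, and 2858 Hz, right in the midrange where vocals and guitars live, which is why treating or removing the reflective surface matters so much.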


Where Do I Go From Here?

We’ve covered most of the basic considerations when searching for new studio monitor
speakers, and much of your research from this point on will be determined largely by
what type of work you’re doing in your studio. Here are just a few more points to keep in
mind:
The size of your speakers should be appropriate for the size of your room. If you’re
mixing in a small space, you’ll get much more accurate results with smaller monitors.

Remember that technically speaking, studio monitors aren’t trying to sound “good.”
They’re trying to sound as accurate and precise as possible. The ideal set of studio
monitors should reveal every detail in your mix, both good and bad, while portraying an
accurate balance across the entire frequency range.

Keep in mind that it’s almost impossible to predict how a set of studio monitors will
sound in your room. Even if you invest time in auditioning a set at a store or a friend’s
studio, the acoustics of your room will play a huge role in what you’ll hear when you’re
mixing. You can make note of certain characteristics, but don’t expect them to sound
exactly the same.

Acoustic Treatment
If you’re serious about your choice of studio monitors, you should also be serious about
controlling the acoustics in your room. If you’re just starting out, we highly recommend
setting aside some of your budget for some basic absorbent acoustic treatment – that way
you’ll hear more of your speakers and less of your room’s reflections.



*** Headphones ***

How to Choose the Right Headphones for the Job

It’s a question we hear every day: “What’s the best pair of headphones I can get for
(various cash amount)?” That’s a tricky question. If you just look at baseline
professional-quality studio headphones, you’ll see that there are half a dozen or
more models, all at similar prices. Their specs are even similar!

Start with Your Application

Don’t start with the headphones; start with the application you need headphones for. If
you’re tracking in the studio, you’re going to have different needs than if you want to
monitor and tweak a live headphone mix. If you’re playing drums on a recording, your
needs will be totally different than if you’re singing in a well-designed vocal booth. In the
following sections, we’ll cover some of the important considerations you can use to
figure out what the right headphones are for the job at hand. We’ll also cover some
common pitfalls so you’re prepared to avoid them.


Headphone Terminology

Circumaural vs. Supra-aural

In layman’s terms, these terms simply mean “around the ear” (circumaural) and “above
the ear” (supra-aural). In the case of headphones, this refers to the design of the earcup,
which is the cushion that sits between the headphone’s speakers (drivers) and your ear.
You’re not going to find many professional-quality supra-aural headphones.

Closed-back vs. Open-back

Also sometimes referred to as simply “closed” and “open,” this distinction addresses the
design of the part of the headphone that covers the area behind the driver in a straight
line away from the side of your head. Closed headphones prevent
sound from escaping. The downside of this design is that it traps pressure inside the
headphone, which creates false low frequencies. These false low frequencies are fine for
most professional uses (and even desirable in consumer products), but less desirable for
critical listening.

For critical listening, headphones with an open back often provide a more
accurate frequency balance, with the trade-off of providing slightly less isolation.
Extremely well-engineered open-back headphones provide almost the same isolation as
high-quality closed-back headphones, but it’s a luxury you’ll have to pay for. That said,
there are some excellent “semi-open-back” headphones that are affordable, well
balanced, and provide enough isolation for some professional tracking applications.


Recording in the Studio

It almost goes without saying that isolation is one of the key characteristics to
look for in tracking headphones. Outfitting your studio with enough pro-quality
headphones isn’t cheap, but you don’t have to provide top-dollar headphones for
everyone. Rather than pick up several of the same model, however, you may want to
diversify by application.

Drums: Acoustic isolation matters more than anything for tracking drums. Not only is
this isolation necessary for blocking out the sound of the drums (thereby allowing the
drummer to hear the mix), but it keeps the click and reference mix from bleeding out into
the microphones. We recommend 25-29dB of sound isolation minimum for tracking
drums. Second to isolation is low-frequency reproduction. Fortunately, the kind of tightly
sealed closed-back headphones that provide the isolation you need also have a slightly
disproportionate low end.

Electric Guitar and Bass: Depending on how you track these, you can use standard
closed-back headphones, semi-open-back headphones, or no headphones at all. You’ll
probably need to turn up the monitor mix considerably if the guitarist or bassist is
tracking with the amp in the same room, just to allow them to hear the mix. Although it’s
not likely that semi-open-back headphones will audibly bleed into the mic, you may want
to use closed-back headphones just for their isolation.

Acoustic Guitar (and Other Instruments): This one’s simple – play it safe. You’re better
off using closed-back headphones with proper isolation than realizing too late that there’s
headphone bleed in the tracks you recorded.

Vocals: Vocalists rely both on proper pitches and frequency balance to sing correctly. To
a singer, this is the difference between a good feeling and a murky one when you sing. To
an engineer, this can be the difference between a lively performance and one that’s flat
both in energy and in pitch. Ideally, you’d use high-end open-back headphones for
tracking vocals. If you have the money for one great pair of headphones and several pairs
of baseline pro-quality sets, or you’re a vocalist who wants to have a great set of ‘phones
for recording, then this is one investment you want to make.

If you don’t have the cash for a set of high-end headphones, you can try using a less
expensive semi-open-back model. However, you still want to avoid sound bleeding from
the headphones into the mic. If you can’t find a headphone level that’s quiet enough to
prevent all but a minimal/acceptable level of bleed, you’ll need to switch over to closed-
back headphones.


Mixing and Monitoring

During tracking, a good pair of reference quality headphones (almost always high-end
open-back headphones) can help you identify problems (such as ground hum and the
buzzing of bad cables) that you don’t want to be stuck with later. During mixing,
headphones let you home in on details in the tracks you’ve recorded and isolate problems.

Headphones also act as a great way to hear your mix without hearing the impact of your
room. Simply put, the headphones you use in your control room/mixing suite should have
the clearest and most accurate frequency response possible (just like your studio
monitors). We recommend open-back or semi-open-back headphones for these applications.

One last note: Remember that this is the iPod age. You’ll want to keep a pair of regular,
single-driver earphones (a.k.a. ear buds) on hand when you mix. These will let you hear
what your final mix will sound like to the majority of listeners.

Live Sound Applications

The most important characteristic for live headphones is their ability to seal out the
overwhelming sound level in the house so that you can hear what’s going on in your own mix.

Onstage: When it comes to live headphones, in-ear monitors (also called earphones) are
by far the most popular format. Professional in-ear monitors are extremely low profile
and fit more-or-less like earplugs. They’re nearly invisible from more than a few feet
away, and they block out sound so well that most bands who use them actually benefit
from a microphone or two used to pick up some crowd noise for reference.

Due to the size and design of standard single-driver earphones, they tend to have
extremely limited low-end expression. This limited low end is a deal-breaker for pretty
much any bass player or drummer. Bass players and drummers don’t just need to hear
low frequencies, they need to feel them. To accommodate low-frequency instruments,
there are excellent multi-driver earphones out there that act like 2- or 3-way PA speakers,
with dedicated drivers for higher and lower frequency ranges. These earphones pack
serious punch, meeting the expectations of bassists and drummers, but they aren’t cheap.

At the Board: You’ll want to keep a pair of closed-back headphones at the board for
soloing and troubleshooting. The typical sound pressure level at the mix position is about
110dB (i.e. really loud), so you’ll need headphones with heroic isolation if you want to
hear clearly. Also, if you mix bands that play with in-ear monitors, be sure to have a pair
of your own in-ears back at the board, preferably the same kind, but stick with a single-
driver model. That way you can set up in-ear mixes and hear more-or-less exactly what
the band hears.


The Break-in Process

Breaking in headphones is a simple and important process, but few people understand it.
Here’s how it works. The same way that a new set of strings or a new drum head needs to
be re-tensioned after a bit of normal use, new speaker drivers begin their lives at a higher
tension and loosen up after a short while.

Professional headphone manufacturers deliberately over-tension their drivers. Once the
initial tension slackens just a bit, the driver will remain perfectly stable for many years.
The flip side is that, when you first plug in your new headphones, they’re likely to sound
harsh, brassy, and possibly unpleasant. It commonly takes about 12-24 hours of use to
break in a pair of headphones. That may not sound like much, but in real application, that
can help avoid weeks of short, unpleasant, and (worst of all) inaccurate listening.

To break in your headphones, we recommend that you plug them in and run a fairly hot
signal through them for 12-24 hours. About 84dB is gentle enough not to run the risk of
damaging healthy headphones, but loud enough to break them in properly. That’s about
the same loudness you would use if you were really into it or listening for specific lyrics
(please take any hearing loss into consideration; this is, after all, only a general idea).
Also, use tonally diverse music. Jazz and classical are ideal, hip-hop and metal aren’t.

What to Look For…

Choose headphones based on your application(s)

What you need headphones for will dictate what kind of headphones you should be
looking for.

Types of headphones
Headphones come in two basic styles: closed-back or open-back. Closed-back phones
give you better isolation, but this isolation usually comes with a slight bass boost. Open-
back ‘phones offer more natural sound, but the tradeoff is isolation.

Headphones in the studio

You’ll likely need multiple sets of standard studio headphones for general tracking, along
with specialty headphones for specific applications. Vocalists present a special challenge,
as they need a blend of isolation and accuracy. Reference-quality headphones are also
good for identifying problems such as hum and buzz in your tracks.

Headphones in live sound

Good isolation headphones are awesome for checking your mixes at the monitor mixing
station or the FOH position. Onstage, in-ear monitors give performers a clear picture of
what they and their fellow performers are doing.


*** Microphones ***

How to choose a studio microphone

Professional recording engineers know that creative microphone selection is an essential
ingredient in every great recording. If you’re new to recording, however, the subject of
microphones is probably shrouded in mystery. With a bit of studio experience under your
belt, you learn that certain mics work well for recording particular instruments, but
without an understanding of acoustic theory, you likely won’t understand why. Do you
reach for a dynamic or a condenser mic? Tube or solid-state? With so many mics to
choose from, it can seem overwhelming, so a basic understanding of microphone types
and their uses will serve you well. A bad mic choice usually comes back to haunt you –
sticking out like a sore thumb in the mix.


Types of microphones

Dynamic Microphones

In a dynamic microphone, the audio signal is generated by the motion of a conductor
within a magnetic field. In most dynamic mics, a very thin, lightweight diaphragm moves
in response to sound pressure. The diaphragm’s motion causes a voice coil suspended in
a magnetic field to move, generating a small electric current. Dynamic mics are less
sensitive (to sound pressure levels and high frequencies) than condenser mics, and
generally can take more punishment. They also tend to be less expensive. Dynamics are
perfect for drums and electric guitars. The most popular snare mic of all time is the Shure
SM57 (also great on guitar amps). Many engineers swear by the Sennheiser MD421 on
toms. The purpose-built AKG D112 is an excellent choice for bass drum.

Condenser Microphones

When it’s absolute fidelity to the source you’re after, reach for a condenser microphone.
Condensers are more responsive to the “speed” and nuances of sound waves than
dynamic mics. This simple mechanical system consists of a thin stretched conductive
diaphragm placed close to a metal disk (backplate). This arrangement creates a capacitor
which is given its electric charge by an external voltage source – a battery or dedicated
power supply, or phantom power supplied by your mixer. The diaphragm vibrates
slightly in response to sound pressure, causing the capacitance to vary and producing a
voltage variation – the signal output of the microphone. Condenser mics come in both
solid-state and tube variations and a wide variety of shapes and sizes – but they all
function according to these principles.

Ribbon Microphones

Used extensively in the golden age of radio, ribbon mics were the first commercially
successful directional microphones. Today, ribbon mics are enjoying a comeback, thanks
to the efforts of a handful of companies such as Royer. Ribbon mics respond to the
velocity of air molecules moving a small element suspended in a strong magnetic field,
rather than sound pressure level (SPL), which is what “excites” most other microphone
types. In studio applications this functional difference isn’t important, although it can be
critical during an exterior location recording on a windy day! Vintage ribbons such as the
RCA 44 and 77DX were notoriously delicate; today’s ribbon mics – such as the Royer
R121 and R122 – are designed to handle the rigors of daily studio use.


USB Microphones

A recent development in microphone technology, the USB mic contains all the elements
of a traditional microphone: capsule, diaphragm, etc. Where it differs from other
microphones is its inclusion of two additional circuits: an onboard preamp and an analog-
to-digital (A/D) converter. The preamp makes it unnecessary for the USB mic to be
connected to a mixer or external mic preamp. The A/D converter changes the mic’s
output from analog (voltage) to digital (data), so it can be plugged directly into a
computer and read by recording software. That makes mobile digital recording as easy as
plugging in the mic, launching your DAW software, and hitting record!

How to read a microphone frequency chart

A microphone’s Frequency Chart can tell you a lot about which situations are appropriate
for a given microphone and which situations are not. In theory, Frequency Charts are
generated at the factory by testing the microphones in an anechoic chamber. An anechoic
chamber is a specially constructed room just for audio testing. The idea here is to create a
controlled atmosphere where each microphone can be tested equally, so the room is
completely dead, without any form of sound reflection. Generally, a speaker is set up in
front of the microphone that is being tested and pink noise is played (pink noise is all
frequencies with equal energy in every octave). The microphone is routed into a spectrum
analyzer that measures the output and a Frequency Response Chart is produced. The chart
is usually over the 20Hz to 20kHz range, which is the range of human hearing.
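That parenthetical definition of pink noise can be checked with a little arithmetic: a 1/f power spectrum puts the same energy into every octave band (the integral of 1/f from f to 2f is ln 2, no matter where the octave sits). A quick numeric sketch:

```python
import math

# Pink noise has power spectral density proportional to 1/f, so the energy in
# any octave band [f, 2f] is the integral of (1/f) df = ln 2 -- the same for
# every octave. A quick check with a simple midpoint Riemann sum:
def octave_energy(f_low, steps=200_000):
    f_high = 2 * f_low
    df = (f_high - f_low) / steps
    return sum(df / (f_low + (i + 0.5) * df) for i in range(steps))

for f in (20, 80, 320, 1280, 5120):
    print(f"{f:>5} Hz - {2*f:>5} Hz : {octave_energy(f):.6f}")
# Each band evaluates to ~0.693147 (= ln 2): equal energy per octave.
```

That equal-energy-per-octave property is exactly why pink noise (rather than white noise, which has equal energy per hertz) is the standard test signal for this kind of measurement.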

So, how do you read it? The horizontal numbers in a Microphone Frequency Chart
represent frequencies (again, usually over the 20 Hz to 20 kHz range) and the vertical
numbers represents relative responses in dB (Decibels). As you look at a Frequency
Chart, you can tell how a given microphone performs at certain frequencies. How is this
information helpful? Well, let’s look at the famous Shure SM57’s frequency chart:


The frequency response of the SM57 makes it especially good for certain instruments
such as a snare drum because the fundamental frequency of the snare resides in the
150Hz to 250Hz range – right where the SM57 Microphone Frequency Chart shows that
the SM57 response is flat, or neutral. In other words, at this frequency, what you hear
going into the microphone is what you will tend to hear coming out – nothing more,
nothing less. The presence bump to the right of the chart is just where the frequency of
the “snap” of the snare resides. In addition, its rolled off low end makes it great for de-
accentuating the kick drum which is often very close in proximity. This combination is
what most engineers are looking for in a great snare drum mic – the ability to capture the
true sound of the snare, accentuate its snap and reject other instruments in close


Understanding microphone polar patterns


Cardioid

Mics with a Cardioid polar pattern “hear” best what happens in front of them while
rejecting sound from the sides and rear. The graphic representation of the pattern
resembles a heart (thus, the “cardioid” shape). The ability to reject sound from the rear makes
Cardioid patterns useful in multi-miking situations, and where it’s not desirable to
capture a large amount of room ambience. Popular in both studio and live use (where rear
rejection cuts down on feedback and ambient noise), Cardioid mics are used for a very
high percentage of microphone applications. Keep in mind that like all non-
omnidirectional mics, Cardioid mics will exhibit proximity effect (increased bass
response when the mic is very close to the sound source).

Supercardioid and Hypercardioid

A Supercardioid polar pattern is more directional than Cardioid; Hypercardioid even
more so. Unlike Cardioid, both of these polar patterns have sensitive rear lobes (smaller
in the Supercardioid) that pick up sound, which can make positioning these highly-
directional mics somewhat tricky.



Omnidirectional

Omnidirectional microphones detect sound equally from all directions. The graphic
representation of the pattern is a circle. An omni microphone will not exhibit a
pronounced proximity effect. All mics are born omnidirectional. Additional engineering
is then applied to create directional polar patterns. Omnis are great for capturing room
sound along with whatever you’re recording.


Figure-8

A Figure-8 polar pattern is one in which the mic is equally sensitive to sounds picked up
from front and back, but rejects sounds coming from the sides. This produces a pattern
that looks like a “figure-8″, where the mic capsule is at the point of crossover on the 8.
This pattern is also known as bi-directional.
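If you like the math behind those pattern diagrams, all of the first-order patterns above can be described by one formula: sensitivity(θ) = A + B·cos(θ), with A + B = 1. The A/B weights below are the standard textbook values, sketched here in Python:

```python
import math

# First-order polar patterns: sensitivity(theta) = A + B*cos(theta), A + B = 1.
# The weights below are the standard textbook values for each pattern.
PATTERNS = {
    "omni":          (1.00, 0.00),
    "cardioid":      (0.50, 0.50),
    "supercardioid": (0.37, 0.63),
    "hypercardioid": (0.25, 0.75),
    "figure-8":      (0.00, 1.00),
}

def sensitivity(pattern, angle_deg):
    a, b = PATTERNS[pattern]
    return a + b * math.cos(math.radians(angle_deg))

# Rear (180 degree) pickup: cardioid rejects fully, while the more directional
# super/hypercardioid patterns show the rear lobes described above.
for name in PATTERNS:
    print(f"{name:>13}: front={sensitivity(name, 0):+.2f}  rear={sensitivity(name, 180):+.2f}")
```

Note the negative rear values for supercardioid, hypercardioid, and figure-8: those rear lobes pick up sound with inverted polarity, which is part of what makes positioning these mics tricky.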

Multiple Pattern

Many professional condenser mics have switchable polar patterns. Omnidirectional,

Cardioid, and Figure-8 patterns are typically included. An extreme example is AKG’s
C12VR – which sports the aforementioned patterns, plus six intermediate settings, for a
total of nine polar patterns!


Understanding microphone diaphragm sizes

Why Size Matters

Condenser and dynamic mics are classified according to the size of their capsule.
Traditionally this has resulted in two classes: large diaphragm and small diaphragm; each
has its place in a well-equipped studio. The medium diaphragm mic – a relatively recent
development – can be considered a hybrid, blending characteristics of both large- and
small-diaphragm mics.

Large Diaphragm

Large-diaphragm condenser mics such as the venerable Neumann U87 are a studio staple.
From vocals to strings to brass to percussion, you can use large-diaphragm condenser
mics to record just about anything. The multiple pickup patterns and pads found on many
large-diaphragm condensers make these the most versatile mics in your studio. There are
also large-diaphragm dynamic mics that are well suited to capturing loud sources with
robust bass energy (such as kick drum or toms). The esteemed Sennheiser MD421 falls
into this category.


Medium Diaphragm

The definition of medium diaphragm is a potentially controversial subject. Historically
there have been large-diaphragm and small-diaphragm mics, but more recently the
medium size has begun carving out its own category, though not everyone agrees on the
precise upper and lower limits. Most professionals and manufacturers agree that any
microphone with a diaphragm of roughly 5/8″ to 3/4″ in diameter can be characterized as
medium diaphragm. Generally speaking, medium-diaphragm microphones tend to do a
good job of accurately catching transients and high frequency content (as a small
diaphragm would) while delivering a slightly fuller, round and potentially warmer sound
(as a large diaphragm might).

Small Diaphragm

Often perceived by the newbie recordist as living in the shadow of its large diaphragm
counterpart, small-diaphragm condenser mics actually shine in applications that leave
their bigger cousins struggling to keep up. Their characteristic ultra-responsiveness is
precisely due to their smaller, lighter-mass diaphragms. They are the hands-down choice for
acoustic guitar, hi-hat, harp – or any instrument with sharp transients and extended
overtones – and many recording engineers actually favor small-diaphragm mics as drum
overheads. A bonus with these bantamweights (often referred to as pencil mics, due to
their typically thin cylindrical shape) is that they are easy to position.

Typical microphone applications


Drums

For starters, you’ll normally want to stick with cardioid dynamic mics on the drums, a
cardioid small diaphragm condenser on the hi-hat, and a matched pair of either large- or
small-diaphragm condensers for the overheads. Note that condenser mics can certainly be
used on the snare and toms (and for the more adventurous, even kick). It’s a case of: once
you know the rules, feel free to break ’em. After all, you’re an artist, painting with sound.

Drum mic placement requires a lot of experimentation: ask the drummer to play, listen;
move a mic a few inches, listen again. Better yet, stay in the control room and have your
assistant (if you’re fortunate enough to have one) move the mics while you listen. With
some experience, the process becomes streamlined: you’ll know which mics you like, and
where you’re going to place them. But that’s just a starting point; even professional
recording engineers will change mics and positioning on an inspired whim.

Electric Guitar

No need to over-think this one. If you can spare just one mic, use a dynamic (the iconic
Shure SM57, for example), positioned close on the speaker cabinet. If you have a large-
diaphragm condenser, try placing it a few feet back (if the source is loud, engage the pad,
if the mic has one). You can mix these mics together at the console, or record them to
separate tracks, if tracks are not at a premium. If you’re overdubbing the guitar (or you
can place the guitar cab in another room during tracking), you don’t have to be concerned
with (drum) leakage. Assuming the cab’s in a decent-sounding room, experiment by
backing up your condenser mic to soak up more room sound. If the mic has multiple
polar patterns, try switching it to omnidirectional. You can, of course, use more than two
mics – just watch your phase relationships.

Acoustic Guitar

A small-diaphragm cardioid condenser is preferable here. As a starting point, aim it
down, looking at the 12th fret, from about 6-8 inches away. Large diaphragm condensers
can also work nicely on acoustic guitars, as well as ribbon mics. Have fun experimenting
with different mics and placements to find which work best for you.



Vocals

When it comes time to overdub vocals, you’ll want a large diaphragm condenser mic. For
a lead vocal, you should match the mic to the vocalist – who may also have a personal
mic preference. The best way to approach this is to set up three or four likely mics, and
have the singer sing the same (critical) section of the song into each of them. Record each
of these passes to individual tracks, then invite the singer into the control room for
playback to decide which mic best matches his or her voice (and the song).

Acoustic Piano

Usually recorded in stereo. Use two large-diaphragm condenser mics, or a large
diaphragm for the low strings, and a small diaphragm for the highs. Piano mic placement
is highly variable, so a certain amount of experimentation is in order. There are also
piano mic kits that take the guesswork out of mic choice and placement.


Strings

Violins, violas, cellos, double bass. Just about any high-quality condenser will do. Or, if
you’re fortunate enough to have a ribbon mic, this may well be what you prefer.

Brass and Reeds

Use ribbon mics, if you have them. Otherwise go for large diaphragm condensers.



*** Audio Interfaces ***

How to Choose an Audio Interface

Choosing the right audio interface may seem a little overwhelming. There are all kinds of
different input and output configurations, connection types, formats and many other
options to consider. So, how do you find the right one for you?

What is an audio interface?

An audio interface (or “interface”) is the hardware that connects your microphones and
other audio gear to your computer. A typical audio interface converts analog signal into
the digital audio information that your computer can process. It sends that digital audio to
your computer via some kind of connection (e.g. USB, FireWire, or a special PCI/PCIe
card). This same audio interface also performs the same process in reverse, receiving
digital audio information from your computer and converting it into analog signal that
you can hear through your studio monitors or headphones. Most audio interfaces include
line-level analog inputs and outputs, one or more microphone preamplifiers, and may
even include digital inputs and outputs such as S/PDIF or ADAT (lightpipe).
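As a rough illustration of the conversion step (not how any particular interface is implemented in hardware), sampling and quantizing an analog signal looks like this; the 1kHz input, 48kHz rate, and 16-bit depth are made-up example values:

```python
import math

# Toy sketch of an A/D stage: sample an analog voltage at a fixed rate,
# then quantize each sample to an integer code. Values are illustrative.
SAMPLE_RATE = 48_000                     # samples per second
BIT_DEPTH = 16                           # bits per sample
FULL_SCALE = 2 ** (BIT_DEPTH - 1) - 1    # 32767 for 16-bit audio

def analog_signal(t):
    """Stand-in for the analog input: a 1 kHz sine at half full scale."""
    return 0.5 * math.sin(2 * math.pi * 1000 * t)

def digitize(n_samples):
    """Sample the input and round each value to a 16-bit integer code."""
    return [round(analog_signal(n / SAMPLE_RATE) * FULL_SCALE)
            for n in range(n_samples)]

samples = digitize(8)    # the first eight codes sent to the computer
print(samples)
```

Playback is the same process run backwards: the D/A converter turns each integer code back into a voltage, reconstructing the waveform your monitors or headphones play.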

Why do you need an audio interface?

There are several reasons to use a dedicated audio interface, rather than the sound card
built into your computer. Technically speaking, a sound card is an audio interface, but its


limited sound quality and minimal I/O make it less than ideal for recording. Most sound
cards only have a consumer-grade stereo line level input, a headphone output, and
possibly also a consumer-grade stereo line level output. Electromagnetic and radio
interference, jitter, and excessive latency all degrade or negatively affect audio both on
the way in and on the way out. It’s also impossible to track a full drum kit (let alone a full
band) with only two channels of input. Sound cards are great for hooking up a pair of Hi-
Fi speakers and playing back compressed audio, but you’re going to need a reliable audio
interface for recording and monitoring production-quality audio.

Choosing the right I/O configuration

With the possible exception of computer connectivity (more on that below), no other
feature is as important for choosing your audio interface as its I/O (input and output)
configuration. The number and type of inputs and outputs you need depends entirely
on what you want to be able to record, now and in the future. The range of audio
interfaces includes everything from 2-channel desktop units to systems that can record
hundreds of channels.

If you’re a singer-songwriter, then you may only need a pair of inputs, as long as they’re
the right inputs. Most audio interfaces include two or
more microphone preamps. If you’re going to use condenser microphones you’ll want to
make sure that your interface’s preamps are also equipped with phantom power. If you’re
going to plug your guitar or keyboard straight into your interface, make sure that the
interface you buy has instrument-level (also called “hi-Z”) inputs. Line-level inputs and
outputs are great for hooking up outboard processors, headphone amps (for creating
separate headphone mixes) and studio monitors.

Digital I/O may not seem important when you’re first starting out, but it can be incredibly
useful down the road. For instance, some high-end 1- or 2-channel microphone
preamplifiers come equipped with S/PDIF output, which lets you hook them up to your
audio interface without depriving you of useful line-level inputs. If your interface comes
equipped with standard ADAT lightpipe I/O, you can easily expand your system with an
ADAT-equipped 8-channel mic pre. Eight extra channels can turn your personal
recording rig into a system that’s ready to track a full band.

Computer connectivity options

One of the constants of the recording industry is that technology simply doesn’t sit still
for long. In computer-related technology “standard” is next to “obsolete.” That said, a
few audio interface connection types are considered standard, and those are: USB,
FireWire, and PCIe. Most PC and Mac computers come equipped with USB 2.0 ports,
whereas FireWire is mostly found on Macs. The two protocols offer comparable bandwidth
(480Mbps for USB 2.0; 400 or 800Mbps for FireWire), which is fast enough to record up
to 64 tracks at once under ideal conditions. There are also some simple interfaces that
still use USB 1.1, which is much slower, but fast enough to record one or two channels.
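You can sanity-check those track counts with simple arithmetic: uncompressed audio needs bits-per-sample times sample-rate bits per second per track (this ignores protocol overhead, so real-world ceilings are lower):

```python
# Back-of-the-envelope bandwidth for uncompressed multitrack audio.
# Ignores protocol overhead, so treat the results as a lower bound.
def bandwidth_mbps(tracks, bit_depth, sample_rate):
    return tracks * bit_depth * sample_rate / 1_000_000

print(bandwidth_mbps(64, 24, 48_000))   # 64 tracks of 24-bit/48kHz
print(bandwidth_mbps(2, 16, 44_100))    # a simple stereo interface
```

Sixty-four tracks of 24-bit/48kHz audio come to about 74Mbps, comfortably inside USB 2.0’s 480Mbps; a simple stereo stream is under 1.5Mbps, which is why even 12Mbps USB 1.1 can handle one or two channels.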

The advantage of FireWire is that it transfers data at a more
consistent rate than USB, which makes it slightly more reliable when you’re recording
more channels at once. The disadvantage is that fewer interfaces use FireWire
than USB, and fewer computers come equipped with FireWire ports. If you own a PC,
you might need to install a FireWire card.

The advantage to USB (3.0, 2.0, and 1.1) is that there are many interfaces
designed to run on USB bus power (rather than an external power supply), which is
excellent if you plan on doing mobile recording with your laptop. There is also a small
selection of PC Express and PCMCIA card-based interfaces, which are specifically
designed for laptops.

That brings us to the third standard audio interface connection, which is PCIe (PCI
Express). PCIe is an internal card-based interface, which
(by its very nature) means you can’t use these interfaces with laptop computers. By
effectively installing your audio interface into your computer’s motherboard, you gain the
advantage of bypassing some of the data conversion processes that cause latency and
limit bandwidth. The majority of PCIe audio interfaces are designed to handle high track
counts and the near-instantaneous speed required by professional studios, and are
consequently more expensive than FireWire or USB interfaces. That said, there are some
affordable PCIe interfaces that allow even entry-level users to take advantage of this
technology.

Tech specs and how to read them

People often ask, do things like bit depth and sample rate really matter? They’re some of
the specs listed with almost every interface out there. The answer isn’t simple, but yes,
they do matter. Let’s start with bit depth. When it comes to processing audio, bit depth
has a huge impact on your sound. The simple math is that 1 bit = 6dB. That means 16-bit
audio (CD standard) has a total dynamic range of 96dB. The problem is that the digital
noise floor is pretty high, and the remaining dynamic range is pretty small. The result is
that if you work at 16-bit, the quieter sections of your audio will tend to be noisy. With
144dB of range, 24-bit audio gives production professionals the range they need to
process audio smoothly. That’s why 24-bit is considered the professional standard and is
highly recommended.
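The arithmetic in that paragraph is worth spelling out. Each bit of depth buys roughly 6dB of dynamic range (the exact figure is 20·log10(2), about 6.02dB per bit):

```python
import math

# Rule of thumb from the text: each bit of depth adds ~6 dB of dynamic range.
# The exact per-bit figure is 20*log10(2), roughly 6.02 dB.
def dynamic_range_db(bits, exact=False):
    per_bit = 20 * math.log10(2) if exact else 6.0
    return bits * per_bit

print(dynamic_range_db(16))   # 96.0 dB  -- the CD standard
print(dynamic_range_db(24))   # 144.0 dB -- the professional standard
```

So moving from 16-bit to 24-bit buys you roughly 48dB of extra range, which is exactly the headroom that lets quiet passages sit well above the digital noise floor.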

On the flip side, sample rate is much more subjective. Each sample is a digital snapshot
of the captured audio. The CD standard 44.1kHz takes 44,100 digital pictures of the
incoming audio every second. Digital to analog conversion only needs two samples (the
top and the bottom) of a waveform to generate a frequency, so the 44.1kHz sample rate is
capable of reproducing frequencies as high as 22.05kHz. The uppermost range of human
hearing (in young females) is 20kHz, so technically, 44.1kHz is more than enough to
capture and reproduce every sound you can hear. However, there are additional
considerations (all of which are technical) that may or may not suggest higher sample
rates capture valuable information. That’s why most audio professionals choose to work
at 48kHz, 96kHz, or even 192kHz.
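The two-samples-per-cycle idea above is easy to verify: the highest frequency a converter can capture is half its sample rate (the Nyquist frequency):

```python
# Nyquist: a converter needs at least two samples per cycle, so it can
# capture frequencies up to half its sample rate.
def nyquist_khz(sample_rate_hz):
    return sample_rate_hz / 2 / 1000

for rate in (44_100, 48_000, 96_000, 192_000):
    print(f"{rate / 1000:g} kHz sampling captures up to {nyquist_khz(rate):g} kHz")
```

Run the numbers and 44.1kHz lands at 22.05kHz, just above the 20kHz ceiling of human hearing, which is why the CD standard was set where it was.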

In the end, it’s all relative. If you’re planning on releasing your tracks on CD or posting
MP3s online you’ll probably be fine working at or mixing down to 16-bit/44.1kHz. If you
plan to release jazz in DVD Audio format, don’t even consider working at less than 24-
bit/96kHz. Higher sample rates, such as 192kHz are also extremely useful for sound
design. Record a dog snarling at 192kHz, and import it into a 96kHz session (half the
speed and pitch but no loss of resolution), and you instantly have the ominous guttural
growl used in countless sci-fi monster movies. Just remember, higher sample rates and bit
depths eat up more disc space and limit your track count, so you’ll need to work within
the limits of your equipment.
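The sample-rate trick above is pure arithmetic, which a tiny sketch makes clear: reinterpreting the same samples at half the rate doubles the duration and halves every frequency (the 440Hz example tone below is just for illustration):

```python
# Sample-rate reinterpretation: playing samples recorded at one rate back at
# another stretches time and shifts pitch by the ratio of the two rates.
def reinterpret(duration_s, pitch_hz, recorded_rate, playback_rate):
    ratio = recorded_rate / playback_rate
    return duration_s * ratio, pitch_hz / ratio

new_len, new_pitch = reinterpret(2.0, 440.0, 192_000, 96_000)
print(new_len, new_pitch)   # 4.0 seconds, 220.0 Hz -- one octave down
```

Because no resampling is involved, every sample of the original recording is preserved: the growl is slower and deeper, but at full resolution.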

The most important thing to remember about sample rate and bit depth is that they mean
nothing compared to the quality of the digital converters you use. The same way that a
go-cart with a Ferrari engine in it may be able to go 130mph, but you wouldn’t want to be
along for the ride, a low-end converter may do 24-bit/96kHz, but it’s not going to give
you the professional fidelity you’re after.



*** Midi Devices ***

Control Surfaces

How to Choose a Control Surface

For hands-on command of your DAW software or soft synths, you can’t beat a control
surface. It frees you from your computer keyboard and mouse and provides visual and
tactile feedback that can help you work more creatively and productively.

Why Should I Consider Buying a Control Surface?


The advent of computer-based recording has been great for streamlining the amount of
physical gear an engineer needs to record and mix a project. Traditionally, mixing
involved rigging up your console to a multitrack to perform fader moves and such. With
DAWs, virtually every function of a traditional console can be handled on-screen by
drawing automation in or performing fader moves by dragging virtual faders with the mouse.

But if you prefer the feel of an analog console and want to work on several tracks in one
pass, a mouse doesn’t quite cut it. A control surface is perfect, because it provides the
familiar, hands-on feel of a mixer. Even if you didn’t hone your skills in the analog
world, some engineers just prefer to use physical faders and knobs, because they feel that
it provides better control for smoother fades and automation moves.

What Are All These Faders, Knobs, and Buttons For?

A control surface will normally be equipped with faders; rotaries to adjust panning and
other parameters; and controls to
start, stop, and locate within the playback. Most control surfaces offer access to advanced
functionality such as EQ and dynamics plug-ins if the tracks you’re working with have
those plug-ins instantiated. Many of the current control surfaces feature touch-
sensitive/motorized faders that update seamlessly and provide a great visual aid when
looking at your levels.

Although infinite rotary encoders are standard issue on modern controllers, you may
occasionally find traditional knobs onboard as well. Rotary encoders can handle any
number of functions and are typically assignable. For example, you could assign a
channel pan to a rotary encoder and set the pan wherever you wish. When you come back
to that track, the encoder will know where it was set, and future adjustments will be
updates to its previous position. A knob has a designated travel path and typically has set
values. If the pan was set using a knob rather than a rotary encoder, upon coming back to
a track, the software would show the correct pan value, but the knob could possibly be
showing a different value.

Transport controls are another important feature that should be included on a control
surface. They let you start and stop playback or recording, as well as fast forward and
rewind tracks. This is great, because it maintains the mouse-free convenience, provided
by a control surface, and minimizes the use of the computer keyboard.

What Are Banks?

Banks refer to groups of faders. These faders normally come in groups of eight. The eight
faders correspond to eight sequential tracks in your DAW. Using the aptly titled Bank
Select button, you can specify whether you want to work on tracks 1-8, 9-16, or other
banks. The great thing about banks is that you can stay in the sweet spot for mixing yet
still have fader control over any track you need. Whether you’re working on a kick drum
on channel one or a guitar overdub on channel 28, you never have to leave the mix position.

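The bank arithmetic is simple enough to sketch. Assuming a hypothetical 8-fader surface with 1-based track numbering, fader f in bank b maps to a DAW track like this:

```python
# With 8 faders per bank, fader f (0-7) in bank b (0-based) controls
# DAW track b*8 + f + 1 (tracks numbered from 1, as in most DAWs).
FADERS_PER_BANK = 8

def track_for(bank, fader):
    return bank * FADERS_PER_BANK + fader + 1

print(track_for(0, 0))   # bank 1, first fader  -> track 1
print(track_for(3, 3))   # bank 4, fourth fader -> track 28
```

So reaching that guitar overdub on channel 28 is just a matter of pressing Bank Select three times and grabbing the fourth fader.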

Control Surfaces That Are Also Traditional Consoles

Most control surfaces are just devices that provide control over your software. There are
a few manufacturers, however, that offer full-fledged analog or digital consoles
(complete with I/O, mic preamps, and other customary console functions), which also
provide flexible DAW-control functionality. If you want your control surface to function
as an audio interface or a traditional console, make sure that it includes the number and
type of inputs you need: XLR mic inputs, 1/4″ line, etc.

How Your Control Surface Communicates with Your DAW

Most control surfaces use MIDI to send control information to and from your DAW
computer. USB is commonly found on control surfaces as the primary means of
connecting to a DAW, but there are also surfaces that use FireWire, Ethernet, and
proprietary protocols.

Will It Work with Your Software and Hardware?

Most control surfaces are pretty flexible in terms of software compatibility, but there are
exceptions, so it’s best to check this in advance. On the hardware side of things, if you’re
planning to upgrade your computer soon, make sure that the primary connections of the
control surface are available on both your existing system and your new one.

How Many Faders/Fader Banks Do You Need?

While most surfaces are equipped with eight faders, many are expandable with additional
control surfaces. If your software tracks top out at 32, an 8-fader surface with four banks
will handle your fader needs nicely. A couple of manufacturers are making devices that
sport a single motorized fader and perhaps a knob for panning and transport controls.
This modest expansion is an inexpensive way to enhance your productivity, provided you
only want to write automation on one track at a time.


What to Look For…

• Connectivity protocol: USB, FireWire, Ethernet, etc.

• Software/hardware compatibility with your existing gear
• Number of faders/fader banks you need
• Touch-sensitive/motorized faders to enhance your workflow
• Analog or digital I/O functionality

*** MIDI Controller Keyboards ***

How to Choose a Keyboard Controller

MIDI keyboard controllers have become an important part of the music-making process
for contemporary musicians and producers due to the increasing use of virtual
instruments onstage and in the studio.

What is a Keyboard Controller?

Way back in the 1980s, one of the original purposes of developing the MIDI specification
was to allow live performers the ability to control the sounds of multiple synthesizers
from a single keyboard. That concept has been a smashing success! Today, live
performers, songwriters with laptops, studio musicians, sound designers, and others can
all benefit from the flexibility a keyboard controller offers them.

Technically, a keyboard controller is a device with piano or synth-style keys, and usually
a selection of knobs, buttons, and sliders. All of these transmit MIDI data to external
sound modules (synthesizers), computer software synthesizers, or a hardware or software
sequencer. Most keyboard controllers themselves have no internal sound-generating
capability, but almost any keyboard synthesizer/workstation can act to control the sounds
and parameters of other devices.

The real advantages of a keyboard controller are versatility and portability. They give you
control over virtually the entire range of modern music hardware and software while
sometimes even being compact enough to fit in your laptop computer bag.

Faders, Buttons, and Knobs

In addition to the piano-style keys found on keyboard controllers, many also include a
range of knobs, sliders, and buttons on their top panels.
These are capable of transmitting MIDI data and can dramatically increase the hands-on
control you have over your software or any module you have connected to your
controller. Here’s a specific example: you have your controller plugged into your
computer and your DAW of choice running with an instance of your favorite soft synth.
The controller’s knobs, sliders, and modulation wheel give you hands-on, real-time
control over tweaking the synth’s filter cutoff and resonance, amp envelope, and so forth.
This provides a much more “authentic analog” feel than using a mouse. Some controllers
now include automapping technology that sets up the knobs and faders to correspond to
your specific software applications.
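Under the hood, each knob or fader movement is transmitted as a three-byte MIDI Control Change message. Here's a minimal Python sketch of that mapping; CC 74 is a common (but not universal) assignment for filter cutoff, so treat the number as an example:

```python
# Sketch: how a controller knob position becomes a three-byte MIDI
# Control Change message. CC 74 is illustrative (often filter cutoff).

def knob_to_cc(channel: int, cc_number: int, position: float) -> bytes:
    """Convert a knob position (0.0-1.0) to a MIDI CC message."""
    status = 0xB0 | (channel & 0x0F)   # 0xB0 = Control Change status
    value = round(position * 127)      # MIDI data bytes are 7-bit (0-127)
    return bytes([status, cc_number & 0x7F, value & 0x7F])

# Knob at 50% on MIDI channel 1 (zero-based 0), mapped to CC 74:
msg = knob_to_cc(0, 74, 0.5)
print(msg.hex())  # b04a40
```

Your DAW or soft synth receives a stream of these messages as you turn the knob, which is why hardware control feels continuous rather than stepped.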

Using a Keyboard Controller in Live Performance

This was one of the original concepts of MIDI: control of other modules from one
keyboard. Onstage, you could connect your controller to your laptop – or a rack full of
synth modules and effects processors – and use presets to combine or split devices using
simple button pushes. If you’re a DJ, you would definitely appreciate the compactness of
a 25-key controller while using its knobs to modulate the filters of a loop sequencer that
lives on your laptop.

Key count


How much space do you have in your studio? Do you play two-handed? Do you want to
be able to do keyboard splits (range mapping)? How important is portability – are you
taking your controller on the road? Your answers to these questions will determine how
many keys you’ll want on your controller. Controllers generally come with 25, 49, 61, or
88 keys, and can be anywhere from under 20″ to over 50″ in length. We also occasionally
see models with 32, 37, 73, and 76 keys.
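As a rule of thumb, an N-key keyboard spans (N − 1) semitones, or (N − 1)/12 octaves. A quick calculation shows why 61 keys (a full five octaves) is such a popular middle ground:

```python
# Octave span of common controller sizes: an N-key keyboard covers
# (N - 1) semitones, i.e. (N - 1) / 12 octaves.

def octave_span(keys: int) -> float:
    return (keys - 1) / 12

for keys in (25, 49, 61, 88):
    print(f"{keys} keys = {octave_span(keys):.2f} octaves")
```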

Keyboard Action Types

A vital quality of any keyboard controller is the keyboard action – the manner in which
the key responds to playing. You, the player, need to feel comfortable using the
controller, whether live on stage or in your songwriting or recording studio. Don’t
underestimate the impact of having a less-than-ideal keyboard on your creativity and
productivity! The type of action you prefer is usually determined mostly by what you are
accustomed to, and also by the particular style of music that you play, which may call for
one type of action over another. You can choose from three basic keyboard action types:

Weighted Hammer Action

Many controllers have 88-note keyboards that replicate the mechanical action of a
conventional piano keyboard. This is difficult to do because a controller has no strings or
hammers. Manufacturers use different methods of applying weights and springs to mimic
a piano’s action. Others add a hammer action to more closely replicate a true piano
“feel.” If your primary instrument is piano, or if you compose a lot of piano-oriented
music, the realism of a weighted hammer-action keyboard might be ideal for you.

Semi-weighted Action
Similar to a weighted action, but with less key resistance and a slightly springier release,
semi-weighted actions are popular with many players. If you don’t need realistic piano
response but don’t care for spring-loaded synth actions (see below), try a semi-weighted
action.

Synth Action
A synth-action keyboard, on the other hand, feels more like an electronic organ. The
spring-loaded keys are light and capable of being moved very quickly. They also tend to
return to their resting position much more quickly. This can be an important advantage
when trying to play very fast parts such as lead lines or fast arpeggios. Synth-action keys
are perfect for musicians who aren’t pianists by nature, such as guitarists wanting to add
MIDI functionality to their setup. If you need an ultra-compact controller that slips into a
backpack, several manufacturers also make controllers with synth-action mini keys.


Aftertouch – do I need it?

Closely watch your favorite pro keyboardist lay down a synth lead line that ends in a bit
of tasty vibrato, and you may witness a finger leaning deeper into the key, providing the
extra key pressure that triggers an aftertouch event. Aftertouch is a convenient,
ergonomic way to add expressiveness to your playing. The alternative, of course, is
commandeering your left hand to rock your controller’s pitch wheel or joystick (not
possible if you’re using it to comp under your lead). Typically found on higher-end
controllers, aftertouch is one of those features you don’t know you need until you’ve used
it.

Aftertouch comes in two flavors, monophonic (channel aftertouch) and polyphonic.

Channel aftertouch typically employs a “rail” that can be pressured by any key, and it
sends an average MIDI value for all held keys. Polyphonic aftertouch lets you vary a
parameter on each note independently, based on the pressure on the key after the note is
struck. Because it’s expensive to design and manufacture, generates a lot of MIDI
information, and requires a certain dexterity on the part of the player to take full
advantage of it, polyphonic aftertouch is found on relatively few keyboards.
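In raw MIDI terms, the two flavors are different message types: channel aftertouch is a two-byte message (status 0xD0) with a single pressure value for the whole channel, while polyphonic aftertouch is a three-byte message (status 0xA0) that names the individual key. A small sketch:

```python
# The two aftertouch message types as raw MIDI bytes. Channel
# aftertouch (0xD0) carries one pressure value for the whole channel;
# polyphonic aftertouch (0xA0) carries a per-note pressure value.

def channel_aftertouch(channel: int, pressure: int) -> bytes:
    return bytes([0xD0 | (channel & 0x0F), pressure & 0x7F])

def poly_aftertouch(channel: int, note: int, pressure: int) -> bytes:
    return bytes([0xA0 | (channel & 0x0F), note & 0x7F, pressure & 0x7F])

# One message covers every held key...
print(channel_aftertouch(0, 90).hex())   # d05a
# ...versus one message per key (middle C = MIDI note 60):
print(poly_aftertouch(0, 60, 90).hex())  # a03c5a
```

The extra byte per key is one reason polyphonic aftertouch generates so much more MIDI traffic: every held note can stream its own pressure messages.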

I/O Options

While all modern controller keyboards transmit MIDI via USB, for more complex setups,
there are two other types of jacks that can make your life easier. Having conventional 5-
pin MIDI DIN jacks on your controller lets you connect and control external MIDI
instruments such as hardware synths, while CV and Gate outputs will even let you play
and modulate vintage (non-MIDI) synth gear.
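To illustrate how a CV output addresses non-MIDI gear, here is the common 1-volt-per-octave convention sketched in Python. The reference note is an assumption for the example; real hardware varies and typically needs calibration:

```python
# Sketch of the common 1V-per-octave CV convention for driving vintage
# (non-MIDI) synths. The zero-volt reference note is an assumption for
# illustration; actual hardware varies and should be calibrated.

def note_to_cv(midi_note: int, zero_volt_note: int = 24) -> float:
    """Return the control voltage for a MIDI note at 1 V per octave."""
    return (midi_note - zero_volt_note) / 12.0

print(note_to_cv(60))  # middle C: 3.0 volts above the reference note
```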

Almost all keyboard controllers are equipped with a sustain (switch-type) pedal jack, but
basic models usually don’t offer one for a continuous controller pedal. Having an
expression pedal in your rig can make your performances significantly more, well,
expressive, letting you modulate any controllable parameter in real time – all without
taking your hands off the keys! Where higher-end keyboard controllers will typically let
you assign a MIDI CC (continuous controller) number to the pedal jack, most
inexpensive controllers’ jacks are preset to send either CC 7 (volume) or CC 11
(expression). So, if your keyboard has a fixed-CC volume pedal input and you’d like to
sweep a filter that’s expecting to see CC 11, you’d have to reset the mapping of that
parameter in your software (a simple task if your software has a “MIDI learn” mode).
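The remapping described above can be sketched in a few lines: intercept incoming CC 7 (volume) messages and rewrite them as CC 11 (expression), which is essentially what a “MIDI learn” mode automates for you. The message layout assumed here is the standard three-byte Control Change format:

```python
# Sketch of a "MIDI learn"-style remap: rewrite incoming CC 7 (volume)
# messages as CC 11 (expression) so a fixed-CC pedal can drive a
# parameter listening for CC 11. Message layout: [status, cc, value].

def remap_cc(msg: bytes, from_cc: int = 7, to_cc: int = 11) -> bytes:
    is_cc = (msg[0] & 0xF0) == 0xB0   # Control Change status nibble
    if is_cc and msg[1] == from_cc:
        return bytes([msg[0], to_cc, msg[2]])
    return msg                        # pass everything else through

pedal_msg = bytes([0xB0, 7, 100])     # the pedal sends CC 7
print(remap_cc(pedal_msg).hex())      # b00b64 -> now CC 11
```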


Performance Pads

Some keyboardists have no problem using their black-and-whites for playing percussion.
Others detest it, preferring the supple feel of velocity-sensing performance pads. Many of
today’s keyboard controllers sport eight or more pads
that you can use to play drums and trigger loops. Some pads even sense aftertouch. A
bank of pads (along with knobs, faders, buttons, and LCD screen) takes up real estate on
your keyboard’s top deck, making for a larger (and heavier) controller, so you should get
out the tape measure and factor your workspace into your decision.

Alternatives to the Standard Keyboard

Some controllers go beyond the standard definition of “keyboard.” An excellent example
is the so-called “keytar,” a strap-on device that allows
live keyboardists the chance to step out from behind their rigs and claim some of the
glory that normally gets lavished on guitar players.

Popular with hip-hop producers and remixers, pad devices allow MIDI samples to be
triggered at the tap of a pad. They’re great for programming beats and drum parts.

Another device that abandons the “keyboard” concept entirely is the wind controller,
which gives wind players access to MIDI sound modules and software.


What to Look for…

Key count: Do you need 25, 49, 61, or 88 keys?

Keyboard Action Type: An important choice! Synth, semi-weighted, or weighted
hammer action?
Aftertouch – Do you need it?
Faders, buttons, and knobs: How many, layout; Automapping?
I/O: MIDI via USB, iOS; conventional 5-pin MIDI jacks, CV/Gate outputs
Performance pads: How many, feel, aftertouch-sensing?


*** Cables ***

How to Choose the Right Cable

Here at Sweetwater, we know that choosing the right cable for the right device can
sometimes be tricky. Sweetwater’s Cable Buying Guide should demystify some of the
confusion surrounding cables, how they work, and which ones connect to what. With so
many different connectors out there, it has become more and more important to
understand which cable is the appropriate choice.


Balanced vs. Unbalanced Cables

A balanced electrical signal runs along three wires: a ground, a positive leg, and a
negative leg. Both legs carry the same signal but in opposite polarity to each other. Any
noise picked up along the cable run will typically be common to both legs. Assuming the
destination is balanced, the receiving device will “flip” one signal and put the two signals
back into polarity with each other. This causes the common noise to be out of phase with
itself, thus being eliminated. This noise cancellation is called “Common Mode Rejection”
and is the reason balanced lines are generally best for long cable runs. XLR and TRS
cables are used to transmit balanced audio from one balanced device to another.
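The cancellation is easy to demonstrate with a toy numeric model (the sample values are arbitrary integers, chosen so the arithmetic is exact):

```python
# Toy model of common-mode rejection on a balanced line. Both legs
# pick up identical noise; the receiver flips the inverted leg and
# sums, doubling the signal and cancelling the noise. Sample values
# are arbitrary integers for exact arithmetic.

signal = [5, -3, 8, -1]
noise  = [2,  2, -4, 1]                         # identical on both legs

hot  = [s + n for s, n in zip(signal, noise)]   # positive leg
cold = [-s + n for s, n in zip(signal, noise)]  # inverted (negative) leg

received = [h - c for h, c in zip(hot, cold)]   # flip and sum at input
print(received)  # [10, -6, 16, -2] -> 2x signal, zero residual noise
```

Note that the result is exactly twice the original signal with the noise term gone, which is why long runs of balanced cable stay quiet where unbalanced runs pick up hum.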

Unbalanced cables are less complicated, but they’re much more susceptible to noise
problems. In general, unbalanced lines should be kept as short as possible (certainly
under 25 feet) to minimize any potential noise that may be carried with the signal into the
connected equipment. You may be able to hear a difference in the signal when cable
lengths exceed 17 feet.

Common Cable Connectors Explained

In the audio world, there are six common cable connectors: TRS and XLR for balanced
connections and TS, RCA, speakON, and banana plugs for unbalanced connections.

TRS is the abbreviation for “Tip, Ring, Sleeve.” It looks like a standard 1/4″ or 1/8″ plug
but with an extra “ring” on its shaft. TRS cables have two conductors plus a ground
(shield). They are commonly used to connect balanced equipment or for running both left
and right mono signals to stereo headphones. You will also find TRS connectors on the
stem of Y cables. These are used for mixer insert jacks where the signal is sent out
through one wire and comes back in through the other.


XLR connectors are 3-pin connectors: positive, negative, and ground. They are usually
used for transmitting microphone or balanced line-level signals. In audio, you will
typically see XLR cables connecting microphones to mixers and connecting various
outputs to powered speakers.

TS is the abbreviation for “Tip, Sleeve” and refers to a specific type of 1/4″ or 1/8″
connector that is set up for 2-conductor, unbalanced operation. One insulator ring
separates the tip and sleeve. The tip is generally considered the “hot,” or the carrier of the
signal, while the sleeve is where the ground or shield is connected. TS cables are best
known as guitar or line-level instrument cables.

RCA is the common name for phono connectors used to connect most consumer stereo
equipment. Typically, you will see tape or CD inputs and outputs using RCA connectors.
In the digital audio realm, RCA connectors are also used for S/PDIF connections,
although true S/PDIF cables are more robust.

A speakON connector is used to connect power amplifiers to PA speakers and stage
monitors. These are often preferred over 1/4″ TS connections because of their ability to
lock into place. Since you should NEVER use an instrument cable to connect an amp to a
speaker, they also help to avoid risky cabling mixups.


Banana Plug
A banana plug is an electrical connector that is designed to join audio wires, such as
speaker wires, to the binding posts on the back of many power amplifiers or to special
jacks called banana jacks. These jacks are commonly found at the ends of binding post
receptacles on the back of power amps. The ends of the wires are held in place by a
locking screw.

Cable Shielding Explained

Braided Shield
This is a cable shield that applies braided bunches of copper strands (called “picks”)
around the insulated, electrostatically shielded center conductor. Its coverage can be
varied from less than 50% to nearly 97% by changing the angle, the number of picks, and
the rate at which they are applied. The braided cable maintains consistent coverage even
if it is flexed or bent around corners.

Serve Shield
A serve shield, also known as a spiral-wrapped shield, is applied by wrapping a flat layer
of copper strands around the center in a single direction (either clockwise or counter-
clockwise). The serve shield is very flexible, providing very little restriction to the
flexibility of the cable. Although its tensile strength is much less than that of a braid, the
serve’s superior flexibility often makes it more reliable in real-world instrument


Foil Shield
A foil shield is composed of a thin layer of Mylar-backed aluminum foil in contact with a
copper drain wire used to terminate it. The foil shield/drain wire combination is very
cheap, but it severely limits flexibility and breaks down under repeated flexing. Foil’s
100% coverage advantage is largely compromised by its high transfer impedance
(aluminum being a poorer conductor of electricity than copper), especially at low
frequencies.

Glossary of Additional Cable Connectors

1/8″ (Mini) — 1/8″ diameter plug (or jack) used in smaller audio/visual interconnects.
The connector may be TRS or TS. This is the size of most iPod-style headphone
connectors.

AES — AES/EBU is the most common alternative to the S/PDIF standard and the most
common AES/EBU physical interconnect is AES Type I Balanced — 3-conductor, 110-
ohm twisted pair cabling with an XLR connector.

BNC — A type of coaxial connector often found on video and digital audio equipment.
BNC connectors are normally used to carry synchronizing clock signals between devices.
BNCs are bayonet-type connectors rather than screw on or straight plugs.

DB25 — A type of D-Sub connector. DB25s are commonly found on computing
equipment, where they are employed to connect peripherals. TASCAM commonly uses
the DB25 connector for analog and/or digital I/O on their products, as do some other
manufacturers.

Elco (or Edac) — A brand and type of multi-pin connector used in audio systems and
equipment for connecting multi-pair cables with one connector (instead of many). Both
Elco and Edac come in 20-, 38-, 56-, 90-, and 120-pin configurations.

Insert/Y Cable — A cable used to split a signal into two parts or combine two signals
into one. The term Y cable is used, because the cable is like (and looks like) the letter Y,
with two parts joined into one, or one split into two, depending upon how you look at it.
Y cables are common throughout audio as a simple and easy way to accomplish these
two tasks.


Optical — Optical cables are for compatible 2-channel S/PDIF connections and Alesis
ADAT lightpipe connections. The ADAT optical connections are for transferring digital
audio, eight tracks at a time. They have become an industry standard and are used in a
wide range of products from many manufacturers.

S/PDIF — A format for interfacing digital audio equipment together, S/PDIF
(Sony/Philips Digital Interface Format) is considered a consumer format and is largely
based on the AES/EBU standard. In fact, in many cases, the two are compatible. S/PDIF
typically uses either unbalanced, high-impedance coaxial cables or fiber-optic cables for
transmission.

FireWire (IEEE 1394) — First developed for video because of its high-speed data
transfer, FireWire is now widely used for digital interfaces in the audio realm. FireWire is
currently available as 4-pin and 6-pin (for FireWire 400), and 9-pin (for FireWire 800).
The 6- and 9-pin versions can also supply power.

TDIF — TDIF is an acronym for TASCAM Digital InterFace. This is the protocol
TASCAM developed to use in their modular digital multitrack and digital mixing
products, for doing digital transfers of audio. TDIF connections are made via a 25-pin D-
Sub connector and data is carried on shielded cable. The TDIF standard is currently one
of two major formats (the other being ADAT optical) widely used in pro and semi-pro
MDM-related products for digital transfer of more than two tracks of audio
simultaneously using only one cable.

TT (Tiny Telephone) — A miniature version of what is known as a phone jack or phone
plug. We commonly refer to this type of jack as a 1/4″ jack (our modern version actually
is slightly different in size), which could come in TS and TRS forms. ADC built
essentially the same type of connector but referred to it as a Bantam connector. Currently,
the two names are interchangeable. TT/Bantam jacks are commonly used in recording
studio patch bays.



*** MIXERS ***

How to Choose a Mixer

The mixer forms the heart of your PA system or recording studio, giving you the inputs,
outputs, and other features you need.


Live and Studio Mixers

Studio and live sound mixers alike are systems for combining signal and routing it where
it needs to go. The application you want to use it for will dictate the type and quantity of
I/O you need. For example, a live sound mixer generally needs to feed a set of mains and
a few monitor mixes, whereas a traditional studio mixer might require direct outputs from
every channel for multitrack recording. The line between “studio” and “live sound”
mixers is pretty blurry these days, so the best place to start is with the various parts that
make up a mixer.

Key Mixer Terms

Simply put, the more channels a mixer has, the more stuff you can hook up to it.
Channels (like most mixer technology) come in multiple flavors and can be either mono
or stereo. They accept mic and/or line level and normally have a preamp to get the input
up to an appropriate level plus a fader for mixing the channel’s output. Usually, each
channel will have an equalizer, auxiliary sends, and pan control, but some mixers can be
as simple as a handful of extremely rudimentary channels and a master output.

The purpose of each channel in a mixer is to pass signal along to one or more buses. The
main output of the mixer is connected to the master mix bus, which is fed by the channel
faders. Similarly, each auxiliary bus (aux send) is fed by a separate volume control on the
channel and exits the mixer via its own output jack. Aux buses can either act
independently of the channel’s volume (pre-fader) or be affected by the output fader
(post-fader). These special outputs are extremely useful for monitor mixes, headphone
mixes, recording mixes, and as effects sends. Some mixing consoles even have a special
aux bus just for effects, which may include an onboard effects processor or dedicated
return channel.
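The pre-fader versus post-fader distinction comes down to whether the channel fader sits in the send's signal path. Sketched as simple gain math (the values are illustrative linear gain factors, not from the source):

```python
# Pre-fader vs. post-fader aux sends as simple gain math. All values
# are illustrative linear gain factors (1.0 = unity).

channel_signal = 1.0
channel_fader  = 0.5      # channel pulled halfway down in the main mix
aux_send_level = 0.8

pre_fader_send  = channel_signal * aux_send_level                  # 0.8
post_fader_send = channel_signal * channel_fader * aux_send_level  # 0.4

# A monitor mix usually taps pre-fader: pulling the channel fader
# down changes the main mix, but the performer's monitor feed is
# unchanged. An effects send is usually post-fader, so the effect
# level tracks the fader.
print(pre_fader_send, post_fader_send)
```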

Some large-format mixers feature channel grouping (sometimes still referred to as VCA
groups), which is a super useful way to manage large numbers of channels. Essentially,
each group matrix is like its own mixer that sits between your channels and the main bus.


The way groups work is pretty simple. You assign the output of each channel you’d like
to control to a bus that feeds a group fader. For example, you might send all of your
drums to Group 1 and all of your vocals to Group 2. Each group has its own output fader,
which then feeds into the master bus. This lets you control entire sections of your mix
with just a few faders, without affecting the relative mix of the channels in each group.
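The arithmetic behind this is simple: the group fader multiplies every member channel by the same factor, so the balance between the channels is preserved. A quick sketch (the levels are arbitrary linear gains invented for illustration):

```python
# How a group fader scales its channels without changing their
# relative balance. Levels are arbitrary linear gains for illustration.

drums = {"kick": 0.9, "snare": 0.7, "hat": 0.4}
group_fader = 0.5                                # pull the kit down 50%

to_master = {name: level * group_fader for name, level in drums.items()}

# The kick-to-snare balance is identical before and after: one fader
# moved the whole kit down together.
ratio_before = drums["kick"] / drums["snare"]
ratio_after  = to_master["kick"] / to_master["snare"]
print(ratio_before == ratio_after)  # True
```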

Mute Groups and Scenes

In addition to mix groups, some mixers also have mute groups. These let you assign
individual channels a single control that you can use to quickly mute or unmute multiple
channels at once. That’s handy if part of the band only plays for a few songs or you want
to mute the whole band while a presenter is speaking. More advanced mixing consoles
will let you store multiple mute scenes, which allow you to store several configurations
of muted and unmuted channels for quick recall, which is great for theater and worship

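Conceptually, a mute scene is just a stored set of muted channels that the console recalls at the push of a button. A hypothetical sketch (the channel names and scenes are invented for illustration):

```python
# Mute scenes sketched as stored sets of muted channels for quick
# recall. Channel and scene names are hypothetical examples.

scenes = {
    "full_band": set(),                              # nothing muted
    "acoustic":  {"elec_gtr", "synth"},              # stripped-down songs
    "speech":    {"kick", "snare", "bass", "elec_gtr", "synth"},
}

def apply_scene(all_channels, scene):
    """Return each channel's mute state for the chosen scene."""
    return {ch: ch in scenes[scene] for ch in all_channels}

channels = ["vocal", "kick", "snare", "bass", "elec_gtr", "synth"]
states = apply_scene(channels, "speech")
print(states["vocal"])  # False: the presenter's mic stays live
```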
Inserts and Direct Outputs

Whereas aux sends are useful for effects you want to apply to multiple channels (e.g.,
reverb or delay), channel inserts are ideal for adding outboard processors such as
compressors and equalizers to individual channels. A channel’s insert point is typically
right after its preamp in the signal flow and either uses individual send and return jacks or
a single 1/4″ insert jack, which requires a special insert cable to use. By contrast, direct
outputs simply send a copy of the preamplifier signal out of the board, which makes them
useful for sending individual feeds to an external recorder or audio interface.

Analog and Digital Mixers

Analog Mixers

In an analog mixer, every channel, bus, preamp, EQ, and other component is built from
physical circuitry (e.g., wires, resistors, potentiometers, and switches). Essentially,
analog mixers are extremely simple, with individual hands-on controls for every function
and routing option. Once you know your way around an analog board, you can quickly
figure out what’s going on with a quick glance across the surface and when you need to
tweak a setting, you can just reach out and grab it.


Digital Mixers

Shortly after entering the board (usually right after the preamp stage), the analog signal in
a digital mixer is converted into digital data for processing and routing. Essentially, a
digital mixer is a sophisticated computer with a bunch of AD/DA converters and a
specialized control surface attached. Digital mixers offer a number of conveniences analog
boards don’t. For instance, DSP chips are much smaller and far less expensive than their
analog counterparts, so you’ll find digital boards loaded with advanced EQs, dynamics
processing, and effects, which can seriously cut down on outboard gear. The ability to
store and recall entire setups is also a major advantage.

Digital mixers do tend to come with a higher learning curve, and they often have just a
single master control section for configuring all of your channels. However, large screens
and intuitive layouts often make up for this limitation. What’s more, advanced features
such as Ethernet audio integration (e.g., Dante), WiFi networking, and tablet PC control
all provide some level of access unavailable to analog boards. And when you take into
account the generally higher sound quality they offer, it’s little wonder that digital mixers
have become standard live sound gear.

Line Mixers

Line mixers are nearly always analog mixers that deal exclusively with line-level signal.
They are extremely simple, often with only a single volume control per channel. Live,
line mixers let you combine multiple sources such as in-the-booth playback devices into a
single output, freeing up channels (and preamps) on your main board. In the studio, many
engineers use high-end line mixers called summing mixers to consolidate their final
mixes.


Powered Mixers

Powered mixers combine a mixer and a power amplifier into a single unit. These are ideal
for portable PA systems. In fact, sometimes powered mixers are integrated with matched
loudspeakers as well to create extremely handy all-in-one PA systems.

What to Look For In a Mixer

So, how do you narrow down the wide selection of mixers on the market to find the one
that’s best for you? Begin by creating a checklist of your own parameters:

Channels and Inputs

How many do you need? Don’t forget to include stereo inputs for keyboards or direct
inputs for bass and guitar. Also, you’ll want to keep future expansion in mind.

EQ

Some mixers offer basic low/high frequency adjustments, whereas others provide
multi-band parametric EQ on each channel with high and low shelving. What do you need?

Direct Outs/Inserts
Do you need input channels to be routed to external processing gear or recording devices?

Onboard Processors and Effects

One appeal of onboard processing is that you won’t need to worry about adding (or
transporting) outboard gear. However, if you’re putting a new mixer into a rig that
already has outboard processing gear, then onboard processing might not be a priority.

Buses and Routing

This depends on your signal routing needs. If you’re sending out monitor mixes,
recording feeds, and external effects mixes, then you’ll need enough aux sends to handle
the demand.



*** Acoustic Treatment vs. Sound Proofing ***

How to Choose Acoustic Treatment

You’ve handpicked your mics, your preamps, and your monitors, yet you’re still not
happy with the sound of your projects. Most likely, you’ve overlooked one of the most
crucial aspects of recording: the acoustics of the room. Every sound in your studio will
come into contact with a surface and either be absorbed or bounced back. From early
reflections to bass modes, the lack of acoustic treatment can really color your recordings,
and usually in a bad way.

We’d like to thank our friends at Auralex, the experts on room acoustics, for providing
much of the information that follows.

Do I Really Need Treatment for a Project Studio?

Absolutely! Your project studio was probably not initially designed or tuned to be a
comfortable tracking environment. Therefore, affordable acoustic treatment is exactly
what you need.


Did you know that your room already has a sound all its own? Sound travels and is
affected by the path it’s on. It might get reflected and/or absorbed based on a number of
factors. The noise in your room is also interacting with other sounds that are traveling in
its path, further affecting the sound of the room. All this sound traveling around creates
sonic anomalies that did not exist in the original sound.

One of the keys to accurately monitoring a recording or to getting good, clean sound is
controlling or completely removing the sound of the room. You usually want to hear (and
record) only the source, not what’s bouncing off your walls. Acoustic treatment is the
best way to ensure that what you’re composing, mixing, and editing is accurate and
unaffected by the room you’re sitting in. Even if you want to have a live vibe in your
recordings, you’ll still want to have control over the sound. Acoustically treating your
room will allow you to control how sound behaves in your studio, giving you the ability
to accurately record and monitor your music.

Flutter Echo

When two parallel surfaces reflecting sound between one another are far enough apart, a
listener hears the reflections between them. The audible effect is in many cases a sort of
fluttering sound since the echoes occur in rapid succession. In smaller rooms, it can take
on a tube-like, hollow sound as the echoes are closer together. You need a combination of
absorption and diffusion to defeat flutter echo.

Absorption, Diffusion, and Bass Traps

Here’s how absorption can help your studio: Acoustical foam is well suited to alleviate
flutter echo and slap. These are the two most common problems in rooms that aren’t
designed for music recording and performance. Clap your hands in an open room: the
resulting sound is terrible for recording or mixing. Foam is easy to work with, simple to
trim to size, and cost-effective. It will help improve the sound picked up by your
microphones and give you a more accurate monitoring environment. This ensures that
your recordings will sound better wherever they’re played. In a monitoring environment,
foam allows you to hear recorded works the way the artist intended, without your room
detrimentally modifying the sound.

Diffusion prevents sound waves from amassing, so there are few to no hot spots in a
room. It disrupts standing waves and flutter echoes, without removing acoustic energy
from the space or changing the frequency content of the sound. Diffusion can make a
small space seem large and a large space seem even larger. The proper balance of
diffusive and absorptive surfaces varies with room size, function, and desired results.

Bass Traps

Low-frequency sound waves are so long (and powerful) that they are the toughest to
control. This is true whether you’re attempting to block their transmission to a
neighboring space or absorb them to clean up the low-frequency response within a room.
What’s more, low frequencies tend to collect in corners and cause problems, such as
boosting the apparent amount of bass in the room. Therefore, corner bass trapping is vital
to smoothing out any room’s sound.


How Much Acoustic Treatment to Use

Most rooms usually fall in the 25% to 75% coverage range. This is only for walls and
ceilings; it largely depends on the room’s design, your intentions for the room, and even
your style of music or content. We checked in with Auralex, and they suggest the
following generalizations:

• Control rooms for rock, pop, rap, hip-hop, R&B, country, techno, and MIDI
music usually benefit from 50% to 75% coverage and mostly absorption.
• Control rooms for jazz, art (classical), choral, acoustic, world, and other
forms of ensemble music usually benefit from 35% to 50% coverage.
However, the control room should be more dead than the main recording
room. Diffusion is used more generously in these types of control rooms.
• Live rooms vary a lot. Some well-designed live rooms can get by with 20%
coverage (or even less!). However, most fall into the 25% to 50% range.
This is usually a healthy mix of absorption and diffusion. The most
successful live rooms usually have some degree of variability.
• Isolation booths usually call for 75% absorption or more.
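To turn those percentages into actual panel area, measure your walls and ceiling and multiply. A quick sketch with made-up room dimensions (the numbers are for illustration only):

```python
# Rough coverage math for a treatment plan. The room dimensions are
# invented for illustration; the percentages follow the guidelines above.

length, width, height = 6.0, 4.0, 2.7          # meters

wall_area    = 2 * (length + width) * height   # four walls
ceiling_area = length * width
treatable    = wall_area + ceiling_area        # walls + ceiling only

for pct in (25, 50, 75):
    print(f"{pct}% coverage -> {treatable * pct / 100:.1f} m^2 of panels")
```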

Would you like some help figuring out how much acoustic treatment you should place in
your room? Auralex and Sweetwater offer a free Personalized Room Analysis Form that
you can fill out to receive a recommended layout of treatments that will help you handle
your room.

Where to Hang Acoustic Treatment

Determining where to hang your treatment is both an art and a science. It’s based on your
goals, room function, and room design. Treatment placement is your call, but keep in
mind that once bad room sound is on tape, you can never get rid of it. Here are some
pointers to help walk you through the process of hanging your acoustic treatment.

Mixing Area
Many studio designers claim that the front of the room (walls and ceiling) should be
treated with absorption of some sort, perhaps as far back as the engineer. The ceiling
from the engineer back can contain a mix of diffusion and absorption. However, other
designers feel that the rear wall should feature a diffuser array surrounded by broad-
bandwidth absorption. The sidewalls from the engineer’s position on back can be
alternately absorptive and diffusive. If your budget is limited, then the treatment around
your mixing area should take priority over the rest of the room. This will give you a tight,
controlled listening area.

Recording Area
If you’re building an isolation booth or a tracking area, consider a completely absorptive
environment, starting from the top of the wall down to about knee-height or lower. This


will allow you to control sonic variables of the recording in the mix by adding reverb,
EQ, and more. However, it’s subjective. Some performers feel trapped by a booth that is
completely dead, so you might consider adding some diffusion.

If Soundproofing Is Your Goal

One of the biggest concepts to understand and appreciate is that acoustic foam and
diffusors aren’t going to soundproof your room. These are extremely effective treatments
for ambient and reflected sound and help make rooms sound better. But they really don’t
provide enough sound-isolating properties to keep sound in or out of a room.

Good sound isolation results from two main factors: density and air gaps. Density comes
in the form of materials such as drywall,
chipboard, plywood, soundboard, vinyl barrier products (such as SheetBlok), lead, and
more. Air gaps between existing and new walls should, if possible, be at least two inches
wide. The combination of density and air gaps will result in isolation; the amount
depends mostly on the quality of the workmanship.

In a pinch, SheetBlok sandwiched between two layers of drywall offers an amazing
amount of density. This alone won’t completely soundproof a room. In rooms not
designed for recording, “floating” via a separate floor system and double walls (thus
creating necessary air gaps and decoupling) is an expensive proposition, but the
SheetBlok sandwich might be a functional possibility.



*** Musical & Digital Instruments ***

What Are Software Plugins?

For anyone who built a studio in the late '90s
or earlier, it's like being in a strange new world these days. New,
previously unimaginable possibilities are everywhere as the day of the
Software Studio has finally arrived. It's here. It's real. And it sounds
excellent. I can do it all on my computer with my audio interface and
monitors if I want to.


There are 3 basic software devices that work in the software realm of
the digital audio sequencer. They are: Soft Synths, Soft Samplers
and Software Processors. This section covers all three in
general to get you grounded.

All 3 of these categories are sometimes referred to as

"plugins". They are called that because these are little computer
applications that run inside a "host" application, i.e., a sequencer,
typically. These plugin devices are very important, as they have led
the software revolution towards our virtual studios, which is changing
all recording studios, both home and pro. Today there are few
hardware devices left that cannot be emulated by plugins and
software. As you see from this page, software based synthesizers,
effects processors, samplers and multi-track recorders can all work
together on one single computer.

But how well can these devices work together? Aha! You are
thinking! Good! That is a matter of utmost critical importance. There
are a lot of toys in the toybox, but a lot of them will not play well
together, if at all. Ok, now remember this: Bad Plugins Crash
Sequencers. This is one reason to avoid free plugins. I have used
nearly every sequencer out there and can affirm that over half the
reasons for unstable, flaky operation come down to poorly written
plugins. You really have to be careful. If you are having stability
problems with any sequencer, always check the soundcard driver and
the plugins installed. Even expensive plugins are often released full of
bugs and fixed "eventually" as users complain. Never trust stuff that
was released last week or stuff that has not had an update in a few
years. Plugins typically break when upgraded operating systems are
released. So keep an eye on these things and talk to actual users that
use your platform and sequencer and get their take on stability.

But I digress; let's get back to the topic. Each sequencer has its
favored protocol, and may refuse to work with the rest. Cubase,
Sonar, Logic, Pro Tools LE and Digital Performer all want plugins to
follow defined rules, which we will call plugin formats.


Plugin Formats

Here it is plain and simple.

VSTi (Virtual Studio Technology instrument) was developed by Steinberg as a

universal platform for soft synths and samplers. Not all the companies bought into
it. Cubase and Nuendo use it extensively. You may also hear about VST2 and VSTi2
plugins. VST2 is simply an extension of the VST format. These pass on more
parameters to the host for automation.

Are PC VSTs and VSTi compatible on Macs? Always assume the answer is no unless
the developer makes it clear. If they made a Mac version, they will be sure to list
that. Usually, developers will have a PC VSTi version and a Mac VSTi version, so be
careful to get (and install) the right one.

Ableton Live, a leading piece of software in the modern music world, is VST compatible.

DXi: Cakewalk, initially, did not go with VSTi; it went with DXi, which is based on
Microsoft DirectX code. Today, however, they have relented and allow use of VSTis
with Sonar in a shell. DXis cannot be run on Macs. By the way, Steinberg has
dropped DX plugins in Cubase. It's going away. I personally will not be getting any
more DX or DXi plugins.

Ok by now you may be wondering what this "i" business is. You see VST, then VSTi,
then DX and DXi. The "i" stands for "instrument", like a softsynth or sampler to
distinguish it from a plugin processor, like a compressor, reverb or delay.

AU, short for Audio Units, refers to a format developed by Apple for Mac OS
10.x. Because there is support in Apple's operating system, AU is used by many Mac
sequencers and audio applications, and is the major supported format in Logic Pro.

Important note: Just because Logic favors AU does not mean it will run all Audio Unit
Plugins. As of Logic 7, Apple got stricter. Logic only supports AUs that follow Apple's
guidelines and not all of them do. This is Apple's way of forcing plugin developers to
follow the rules it developed. It’s a good thing, as the Audio Units that do pass are
less likely to crash Logic.

MAS refers to plugins that work with the MOTU Audio System in Digital Performer
(Mac only), which can also use VSTi, AU and ReWire. You'll note that fewer plugins


support MAS, and that's because MOTU DP users can use AU as well as MAS.

RTAS or AAX is the format used by Digidesign (now Avid), the makers of Pro
Tools. RTAS is used for all versions before Pro Tools 10, and AAX for all versions after.

Audiosuite is another Digidesign format which works with the above software
packages and hardware interfaces.

Rewire, finally, is a scheme that pipes digital audio from Reason, Rebirth,
Project 5 and Ableton's Live to other sequencers. It basically allows you to run a
sequencer inside of another sequencer. It's not a plugin per se, but because many
soft synths use Rewire I include it here.

Q) What is a Plugin Wrapper

A) A plugin wrapper usually refers to a software device that fools a host (i.e., the
sequencer) into using formats that would be incompatible without it. For example, in
a VSTi to DXi shell you can run VSTis and the sequencer will treat them as
DXis. Through the use of wrappers, Sonar users can use VSTis and Logic users can
use VSTis. Without a wrapper, Logic can only use AUs, Sonar (up to version 4)
can only use DXis, and Pro Tools LE cannot use VSTs.

There are some disadvantages to using wrappers. If the wrapper does not convert
and pass on the data perfectly, there could be problems. Remember: Bad plugins
crash sequencers. Right? Right!

Software Synths

A Soft Synth mimics a hardware synthesizer with different sounds and

waveforms. Many follow the model of a vintage analog synth, with
oscillators, filters, LFOs and amp envelopes making the sound; others
may use a model of FM or wavetable synthesis, or may be modeled after an
acoustic instrument.

Are Soft Synths Better than Hardware synths?


Well, you will never break your back carrying a soft synth to a gig. But softsynths will rather quickly
degrade your PC's performance as they eat CPU cycles with voracity. Why is this? The CPU must deal
with the soft synth's instructions immediately or there will be latency. Most fast computers can achieve a
latency of 5 milliseconds, and when they do, the soft synth "feels" like a hardware synth when you play
it. However, as you build your song and have 10-16 softsynths playing back at this incredible rate, the
CPU gets behind in other tasks. When you add effects on top you might notice clicks and pops and other
nasties in your audio. If you don't heed the warning, suddenly the whole shebang may stop dead in its
tracks.

Hardware synths do not suffer from this, as they just have to receive MIDI data on
time, which any computer can do easily. So you can use your CPU for other tasks, like
recording audio, effects, even running other applications.

Soft synths are as good sounding as many hardware synths, sometimes better. They
also can be very specific in their focus. People don't mind spending $350 for a
softsynth that just does pads and atmospheres, but they would mind buying a
hardware box that only does this for $2000. Hardware costs more because making
the thing costs more. Once software is made it is much less of a problem to make
100,000 units.

Soft Samplers
What is the difference between a soft synth and a soft sampler?

A software sampler works like a digital sampler. These don't sample sounds per se
(you usually need an audio editor/recorder for that). But they do take "samples"
(i.e., .wav or .aif files typically) and let you map them along your keyboard, the
same way one does in a hardware sampler. A soft sampler may let you load sample
cd roms that are used by hardware samplers which gives you access to a universe of
premium sounds. Once you map the samples to the keyboard, you can then
program them with filters, lfos, and amps like a soft synth. One advantage of soft
samplers over their hardware rivals is that there is no memory limit to how many
samples are immediately accessible--any wave file on the computer is fair
game. Compare that to hardware samplers that have banks, which are limited to
128, 256, or maybe 512 megabytes.

So the soft sampler has an open architecture, which lets you import any sound. The
soft synth is a closed architecture that allows you to select from a number of
supplied internal waveforms.

Software Processors

It's perhaps with plugin processors that the home studio operator gets a shot at
being a mixdown and mastering engineer. This is a fact that has not gone unnoticed
by true professional audio engineers, many of whom are quite rightfully irritated at
the bloody hordes of noobs that try to wrestle the secrets of mastering out of
them. As I have said many times, these skills come from knowledge and
experience. So get some. No! Don't ask people on the forums "whas da best
compress..settin for my beatz". To gain knowledge and experience you sit there with
the machines (virtual or real) and experiment over and over again. Then you start
seeing patterns and hearing results.

The mindblowing thing is that many of the processors that studio engineers have
used now have software equivalents. These plugins are sometimes so good that the
mastering engineers themselves use them!

As you might expect the professional plugins are not cheap, but they are a lot
cheaper than the roomful of gear one used to have.


Understanding Sound

*** Understanding Sound & Frequencies ***

What Is Sound?

All sounds are vibrations traveling through the air as sound waves. Sound waves are caused by the
vibrations of objects and radiate outward from their source in all directions. A vibrating object compresses
the surrounding air molecules (squeezing them closer together) and then rarefies them (pulling them farther
apart). Although the fluctuations in air pressure travel outward from the object, the air molecules
themselves stay in the same average position. As sound travels, it reflects off objects in its path, creating
further disturbances in the surrounding air. When these changes in air pressure vibrate your eardrum, nerve
signals are sent to your brain and are interpreted as sound.


Fundamentals of a Sound Wave

The simplest kind of sound wave is a sine wave. Pure sine waves rarely exist in the natural world, but they
are a useful place to start because all other sounds can be broken down into combinations of sine waves. A
sine wave clearly demonstrates the three fundamental characteristics of a sound wave: frequency,
amplitude, and phase.

Frequency is the rate, or number of times per second, that a sound wave cycles from positive to negative to
positive again. Frequency is measured in cycles per second, or hertz (Hz). Humans have a range of hearing
from 20 Hz (low) to 20,000 Hz (high). Frequencies beyond this range exist, but they are inaudible to
humans.


Amplitude (or intensity) refers to the strength of a sound wave, which the human ear interprets as volume or
loudness. People can detect a very wide range of volumes, from the sound of a pin dropping in a quiet room
to a loud rock concert. Because the range of human hearing is so large, audio meters use a logarithmic scale
(decibels) to make the units of measurement more manageable.

Phase compares the timing between two similar sound waves. If two periodic sound waves of the same
frequency begin at the same time, the two waves are said to be in phase. Phase is measured in degrees from
0 to 360, where 0 degrees means both sounds are exactly in sync (in phase) and 180 degrees means both
sounds are exactly opposite (out of phase). When two sounds that are in phase are added together, the
combination makes an even stronger result. When two sounds that are out of phase are added together, the
opposing air pressures cancel each other out, resulting in little or no sound. This is known as phase
cancelation.

Phase cancelation can be a problem when mixing similar audio signals together, or when original and
reflected sound waves interact in a reflective room. For example, when the left and right channels of a
stereo mix are combined to create a mono mix, the signals may suffer from phase cancelation.
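The reinforcement and cancelation described above are easy to demonstrate numerically. This sketch (plain Python, using an arbitrary 440 Hz test tone) sums two sine waves that are 0 and then 180 degrees apart:

```python
import math

def sine_wave(freq_hz, phase_deg, sample_rate=44100, num_samples=100):
    """One channel of a pure sine tone, as a list of sample values."""
    phase_rad = math.radians(phase_deg)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase_rad)
            for n in range(num_samples)]

def mix(a, b):
    """Sum two signals sample by sample, as a mixer bus does."""
    return [x + y for x, y in zip(a, b)]

def peak(signal):
    """Largest absolute sample value in the signal."""
    return max(abs(s) for s in signal)

tone = sine_wave(440, 0)
in_phase = mix(tone, sine_wave(440, 0))        # 0 degrees apart: reinforce
out_of_phase = mix(tone, sine_wave(440, 180))  # 180 degrees apart: cancel

print(round(peak(in_phase), 2))      # about 2.0 -- double the level
print(round(peak(out_of_phase), 6))  # about 0 -- little or no sound
```

The in-phase mix peaks at roughly twice the original level, while the out-of-phase mix collapses to effectively silence, which is exactly the mono-summing problem described above.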


Frequency Spectrum of Sounds

With the exception of pure sine waves, sounds are made up of many different frequency components
vibrating at the same time. The particular characteristics of a sound are the result of the unique combination
of frequencies it contains.

Sounds contain energy in different frequency ranges, or bands. If a sound has a lot of low-frequency
energy, it has a lot of bass. The 250–4000 Hz frequency band, where humans hear best, is described as
midrange. High-frequency energy beyond the midrange is called treble, and this adds crispness or
brilliance to a sound. The graph below shows how the sounds of different musical instruments fall within
particular frequency bands.


Note: Different manufacturers and mixing engineers define the ranges of these frequency bands differently,
so the numbers described above are approximate.

The human voice produces sounds that are mostly in the 250–4000 Hz range, which likely explains why
people’s ears are also the most sensitive to this range. If the dialogue in your movie is harder to hear when
you add music and sound effects, try reducing the midrange frequencies of the nondialogue tracks using an
equalizer filter. Reducing the midrange creates a “sonic space” in which the dialogue can be heard more
clearly.

Musical sounds typically have a regular frequency, which the human ear hears as the sound’s pitch. Pitch is
expressed using musical notes, such as C, E flat, and F sharp. The pitch is usually only the lowest, strongest
part of the sound wave, called the fundamental frequency. Every musical sound also has higher, softer parts
called overtones or harmonics, which occur at regular multiples of the fundamental frequency. The human
ear doesn’t hear the harmonics as distinct pitches, but rather as the tone color (also called the timbre) of the
sound, which allows the ear to distinguish one instrument or voice from another, even when both are
playing the same pitch.
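Because harmonics sit at regular multiples of the fundamental, the series for any note is trivial to compute. The 110 Hz fundamental here (an A note) is just an illustrative choice:

```python
# Overtones occur at whole-number multiples of the fundamental frequency.
fundamental = 110.0  # Hz -- an A note, chosen purely as an example
harmonics = [fundamental * n for n in range(1, 6)]
print(harmonics)  # [110.0, 220.0, 330.0, 440.0, 550.0]
```

The ear fuses these multiples into a single tone color rather than hearing five separate pitches.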


Musical sounds also typically have a volume envelope. Every note played on a musical instrument has a
distinct curve of rising and falling volume over time. Sounds produced by some instruments, particularly
drums and other percussion instruments, start at a high volume level but quickly decrease to a much lower
level and die away to silence. Sounds produced by other instruments, for example, a violin or a trumpet,
can be sustained at the same volume level and can be raised or lowered in volume while being sustained.
This volume curve is called the sound’s envelope and acts like a signature to help the ear recognize what
instrument is producing the sound.
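In synthesizers, this volume curve is often modeled as an ADSR (attack, decay, sustain, release) envelope. The following is a minimal linear sketch with arbitrary illustrative timings, not any particular instrument's implementation:

```python
def adsr(t, attack=0.01, decay=0.2, sustain_level=0.6, release=0.3, note_off=1.0):
    """Amplitude (0..1) of a linear ADSR envelope at time t seconds.
    All parameter values are arbitrary illustrative choices."""
    if t < attack:                       # ramp up from silence
        return t / attack
    if t < attack + decay:               # fall from peak to sustain level
        frac = (t - attack) / decay
        return 1.0 - frac * (1.0 - sustain_level)
    if t < note_off:                     # hold while the note is down
        return sustain_level
    if t < note_off + release:           # fade out after the note is released
        frac = (t - note_off) / release
        return sustain_level * (1.0 - frac)
    return 0.0

print(adsr(0.01))  # 1.0  -- the peak at the end of the attack
print(adsr(0.5))   # 0.6  -- holding at the sustain level
print(adsr(2.0))   # 0.0  -- silent after the release
```

A percussive sound would use a near-zero sustain level and short decay; a bowed or blown sound would use a high sustain level.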

Measuring Sound Intensity

Human ears are remarkably sensitive to vibrations in the air. The threshold of human hearing is around 20
microPascals (μP), which is an extremely small amount of atmospheric pressure. At the other extreme, the
loudest sound a person can withstand without pain or ear damage is about 200,000,000 μP: for example, a
loud rock concert or a nearby jet airplane taking off.

Because the human ear can handle such a large range of intensities, measuring sound pressure levels on a
linear scale is inconvenient. For example, if the range of human hearing were measured on a ruler, the scale
would go from 1 foot (quietest) to over 3000 miles (loudest)! To make this huge range of numbers easier to
work with, a logarithmic unit—the decibel—is used. Logarithms map exponential values to a linear scale.
For example, by taking the base-ten logarithm of 10 (10^1) and 1,000,000,000 (10^9), this large range of
numbers can be written as 1–9, which is a much more convenient scale.

Because the ear responds to sound pressure logarithmically, using a logarithmic scale corresponds to the
way humans perceive loudness. Audio meters and sound measurement equipment are specifically designed
to show audio levels in decibels. Small changes at the bottom of an audio meter may represent large
changes in signal level, while small changes toward the top may represent small changes in signal level.
This makes audio meters very different from linear measuring devices like rulers, thermometers, and
speedometers. Each unit on an audio meter represents an exponential increase in sound pressure, but a
perceived linear increase in loudness.

Important: When you mix audio, you don’t need to worry about the mathematics behind logarithms and
decibels. Just be aware that to hear incremental increases in sound volume, exponentially more sound
pressure is required.

What Is a Decibel?

The decibel measures sound pressure or electrical pressure (voltage) levels. It is a logarithmic unit that
describes a ratio of two intensities; such as two different sound pressures, two different voltages, and so on.
A bel (named after Alexander Graham Bell) is a base-ten logarithm of the ratio between two signals. This
means that for every additional bel on the scale, the signal represented is ten times stronger. For example,


the sound pressure level of a loud sound can be billions of times stronger than a quiet sound. Written
logarithmically, one billion (1,000,000,000 or 10^9) is simply 9. Decibels make the numbers much easier to
work with.

In practice, a bel is a bit too large to use for measuring sound, so a one-tenth unit called the decibel is used
instead. The reason for using decibels instead of bels is no different from the reason for measuring shoe
size in, say, centimeters instead of meters; it is a more practical unit.

Number of decibels    Relative increase in power

  0                   1
  1                   1.26
  3                   2
 10                   10
 20                   100
 30                   1,000
 50                   100,000
100                   10,000,000,000
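The table follows directly from the formula for power ratios, ratio = 10^(dB / 10). A quick sketch to verify it:

```python
# Relative power for a decibel value: every 10 dB multiplies power by 10.
def power_ratio(db):
    return 10 ** (db / 10)

for db in (0, 1, 3, 10, 20, 30, 50, 100):
    print(db, round(power_ratio(db), 2))
```

Note that 3 dB works out to 1.995, which is why "3 dB means double the power" is the usual rule of thumb.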

Decibel Units

Audio meters are labeled with decibels. Several reference levels have been used in audio meters over the
years, starting with the invention of the telephone and evolving to present day systems. Some of these units
are only applicable to older equipment. Today, most professional equipment uses dBu, and most consumer
equipment uses dBV. Digital meters use dBFS.

• dBm: The m stands for milliwatt (mW), which is a unit for measuring
electrical power. (Power is different from electrical voltage and current,
though it is related to both.) This was the standard used in the early days
of telephone technology and remained the professional audio standard for many years.
• dBu: This reference level measures voltage instead of power, using a
reference level of 0.775 volts. dBu has mostly replaced dBm on
professional audio equipment. The u stands for unloaded, because the
electrical load in an audio circuit is no longer as relevant as it was in the
early days of audio equipment.
• dBV: This also uses a reference voltage like dBu, but in this case the
reference level is 1 volt, which is more convenient than 0.775 volts in dBu.
dBV is often used on consumer and semiprofessional devices.
• dBFS: This scale is very different from the others because it is used for
measuring digital audio levels. FS stands for full-scale, which is used
because, unlike analog audio signals that have an optimum signal voltage,
the entire range of digital values is equally acceptable when using digital
audio. 0 dBFS is the highest-possible digital audio signal you can record
without distortion. Unlike analog audio scales like dBV and dBu, there is
no headroom past 0 dBFS.


Signal-to-Noise Ratio
Every electrical system produces a certain amount of low-level electrical activity called noise. The noise
floor is the level of noise inherent in a system. It is nearly impossible to eliminate all the noise in an
electrical system, but you don’t have to worry about the noise if you record your signals significantly
higher than the noise floor. If you record audio too low, you raise the volume to hear it, which also raises
the volume of the noise floor, causing a noticeable hiss.

The more a signal is amplified, the louder the noise becomes. Therefore, it is important to record most
audio around the nominal (ideal) level of the device, which is labeled 0 dB on an analog audio meter.

The signal-to-noise ratio, typically measured in dB, is the difference between the nominal recording level
and the noise floor of the device. For example, the signal-to-noise ratio of an analog tape deck may be
60 dB, which means the inherent noise in the system is 60 dB lower than the ideal recording level.
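The tape-deck example can be reproduced with the standard formula for voltage (amplitude) ratios, 20 × log10 of the level ratio. The 1.0 and 0.001 levels below are arbitrary illustrative values:

```python
import math

def snr_db(signal_level, noise_floor):
    """Signal-to-noise ratio in dB from two voltage (amplitude) levels.
    Voltage ratios use 20 * log10; power ratios would use 10 * log10."""
    return 20 * math.log10(signal_level / noise_floor)

# A nominal level 1000 times the noise-floor voltage, like the tape deck example:
print(round(snr_db(1.0, 0.001)))  # 60
```

This is why recording well above the noise floor matters: every halving of your recording level eats about 6 dB of that ratio.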

Headroom and Distortion

If an audio signal is too strong, it will overdrive the audio circuit, causing the shape of the signal to distort.
In analog equipment, distortion increases gradually the more the audio signal overdrives the circuit. For
some audio recordings, this kind of distortion can add a unique “warmth” to the recording that is difficult to
achieve with digital equipment. However, for audio post-production, the goal is to keep the signal clean and
undistorted.

0 dB on an analog meter refers to the ideal recording level, but there is some allowance for stronger signals
before distortion occurs. This safety margin is known as headroom, meaning that the signal can
occasionally go higher than the ideal recording level without distorting. Having headroom is critical when
recording, especially when the audio level is very dynamic and unpredictable. Even though you can adjust
the recording level while you record, you can’t always anticipate quick, loud sounds. The extra headroom
above 0 dB on the meter is there in case the audio abruptly becomes loud.

Dynamic Range and Compression

Dynamic range is the difference between the quietest and loudest sound in your mix. A mix that contains
quiet whispers and loud screams has a large dynamic range. A recording of a constant drone such as an air
conditioner or steady freeway traffic has very little amplitude variation, so it has a small dynamic range.

You can actually see the dynamic range of an audio clip by looking at its waveform. For example, two
waveforms are shown below. The top one is a section from a well-known piece of classical music. The
bottom one is from a piece of electronic music. From the widely varied shape of the waveform, you can tell
that the classical piece has the greater dynamic range.


Notice that the loud and soft parts of the classical piece vary more frequently, as compared to the fairly
consistent levels of the electronic music. The long, drawn-out part of the waveform at the left end of the top
piece is not silence—it’s actually a long, low section of the music.

Dynamic sound has drastic volume changes. Sound can be made less dynamic by reducing, or compressing,
the loudest parts of the signal to be closer to the quiet parts. Compression is a useful technique because it
makes the sounds in your mix more equal. For example, a train pulling into the station, a man talking, and
the quiet sounds of a cricket-filled evening are, in absolute terms, very different volumes. Because
televisions and film theaters must compete with ambient noise in the real world, it is important that the
quiet sounds are not lost.

The goal is to make the quiet sounds (in this case, the crickets) louder so they can compete with the
ambient noise in the listening environment. One approach to making the crickets louder is to simply raise
the level of the entire soundtrack, but when you increase the level of the quiet sounds, the loud sounds
(such as the train) get too loud and distort. Instead of raising the entire volume of your mix, you can
compress the loud sounds so they are closer to the quiet sounds. Once the loud sounds are quieter (and the
quiet sounds remain the same level), you can raise the overall level of the mix, bringing up the quiet sounds
without distorting the loud sounds.

When used sparingly, compression can help you bring up the overall level of your mix to compete with
noise in the listening environment. However, if you compress a signal too far, it sounds very unnatural. For
example, reducing the sound of an airplane jet engine to the sound of a quiet forest at night and then raising
the volume to maximum would cause the noise in the forest to be amplified immensely.

Different media and genres use different levels of compression. Radio and television commercials use
compression to achieve a consistent wall of sound. If the radio or television becomes too quiet, the
audience may change the channel—a risk advertisers and broadcasters don’t want to take. Films in theaters
have a slightly wider dynamic range because the ambient noise level of the theater is lower, so quiet sounds
can remain quiet.
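The approach described above can be sketched as a bare-bones compressor: anything over a threshold is reduced by a ratio, after which makeup gain can lift the whole mix. Real compressors smooth this with attack and release times; this is only an illustration:

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Pull sample magnitudes above the threshold toward it by the given
    ratio. A bare sample-by-sample sketch; real compressors track the
    signal level over time with attack and release settings."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            # Only the portion above the threshold is reduced.
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

loud_and_quiet = [0.9, -1.0, 0.1, 0.05]
squeezed = compress(loud_and_quiet)
# The loud peaks (0.9, -1.0) come down; the quiet samples are untouched.
# Now makeup gain can raise the whole mix without the peaks distorting.
```

With a 4:1 ratio, the 1.0 peak lands at 0.625, leaving room to raise everything (crickets included) by several dB before clipping.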

Stereo Audio
The human ear hears sounds in stereo, and the brain uses the subtle differences in sounds entering the left
and right ears to locate sounds in the environment. To re-create this sonic experience, stereo recordings
require two audio channels throughout the recording and playback process. The microphones must be
properly positioned to accurately capture a stereo image, and speakers must also be spaced properly to re-
create a stereo image accurately.

If any part of the audio reproduction pathway eliminates one of the audio channels, the stereo image will
most likely be compromised. For example, if your playback system has a CD player (two audio channels)
connected to only one speaker, you will not hear the intended stereo image.


Important: All stereo recordings require two channels, but two-channel recordings are not necessarily
stereo. For example, if you use a single-capsule microphone to record the same signal on two tracks, you
are not making a stereo recording.

Identifying Two-Channel Mono Recordings

When you are working with two-channel audio, it is important to be able to distinguish between true stereo
recordings and two tracks used to record two independent mono channels. These are called dual mono
recordings.

Examples of dual mono recordings include:

• Two independent microphones used to record two independent sounds, such as two different
actors speaking. These microphones independently follow each actor’s voice and are never
positioned in a stereo left-right configuration. In this case, the intent is not a stereo recording but
two discrete mono channels of synchronized sound.
• Two channels with exactly the same signal. This is no different than a mono recording, because
both channels contain exactly the same information. Production audio is sometimes recorded this
way, with slightly different gain settings on each channel. This way, if one channel distorts, you
have a safety channel recorded at a lower level.
• Two completely unrelated sounds, such as dialogue on track 1 and a timecode audio signal on
track 2, or music on channel 1 and sound effects on channel 2. Conceptually, this is not much
different than recording two discrete dialogue tracks in the example above.

The important point to remember is that if you have a two-track recording system, each track can be used to
record anything you want. If you use the two tracks to record properly positioned left and right
microphones, you can make a stereo recording. Otherwise, you are simply making a two-channel mono
recording.

Identifying Stereo Recordings

When you are trying to decide how to work with an audio clip, you need to know whether a two-channel
recording was intended to be stereo or not. Usually, the person recording production sound will have
labeled the tapes or audio files to indicate whether they were recorded as stereo recordings or dual-channel
mono recordings. However, things don’t always go as planned, and tapes aren’t always labeled as
thoroughly as they should be. As an editor, it’s important to learn how to differentiate between the two.

Here are some tips for distinguishing stereo from dual mono recordings:

• Stereo recordings must have two independent tracks. If you have a tape with only one track of
audio, or a one-channel audio file, your audio is mono, not stereo.

Note: It is possible that a one-channel audio file is one half of a stereo pair. These are known as
split stereo files, because the left and right channels are contained in independent files. Usually,
these files are labeled accordingly: AudioFile.L and AudioFile.R are two audio files that make up
the left and right channels of a stereo sound.

• Almost all music, especially commercially available music, is mixed in stereo.

• Listen to a clip using two (stereo) speakers. If each side sounds subtly different, it is probably
stereo. If each side sounds absolutely the same, it may be a mono recording. If each side is
completely unrelated, it is a dual mono recording.


Interleaved Versus Split Stereo Audio Files

Digital audio can send a stereo signal within a single stream by interleaving the digital samples during
transmission and de-interleaving them on playback. The way the signal is stored is unimportant as long as
the samples are properly split to left and right channels during playback. With analog technology, the signal
is not nearly as flexible.

Split stereo files are two independent audio files that work together, one for the left channel (AudioFile.L)
and one for the right channel (AudioFile.R). This mirrors the traditional analog method of one track per
channel (or in this case, one file per channel).
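Interleaving is easy to picture in code. This sketch merges and splits two channels the way an interleaved stream does (the sample values are arbitrary placeholders):

```python
def interleave(left, right):
    """Merge two channels into a single L, R, L, R... stream."""
    stream = []
    for l, r in zip(left, right):
        stream.extend((l, r))
    return stream

def deinterleave(stream):
    """Split an interleaved stream back into (left, right) channels."""
    return stream[0::2], stream[1::2]

left, right = [1, 2, 3], [10, 20, 30]
stream = interleave(left, right)
print(stream)                # [1, 10, 2, 20, 3, 30]
print(deinterleave(stream))  # ([1, 2, 3], [10, 20, 30])
```

As long as playback de-interleaves the samples correctly, the listener hears the same stereo image as with split files.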


Analog & Digital recording Foundations

*** Analog & Digital Recording Foundations ***

The acoustic era (1877 to 1925)

The earliest practical recording technologies were entirely mechanical devices. These
recorders typically used a large conical horn to collect and focus the physical air pressure
of the sound waves produced by the human voice or musical instruments. A sensitive
membrane or diaphragm, located at the apex of the cone, was connected to an articulated
scriber or stylus, and as the changing air pressure moved the diaphragm back and forth,
the stylus scratched or incised an analogue of the sound waves onto a moving recording
medium, such as a roll of coated paper, or a cylinder or disc coated with a soft material
such as wax or a soft metal. These early recordings were necessarily of low fidelity and
volume, and captured only a narrow segment of the audible sound spectrum - typically
only from around 250 Hz up to about 2,500 Hz - so musicians and engineers were forced
to adapt to these sonic limitations. Bands of the period often favored louder instruments
such as trumpet, cornet and trombone, lower-register brass instruments (such as the tuba
and the euphonium) replaced the string bass, and blocks of wood stood in for bass drums;
performers also had to arrange themselves strategically around the horn to balance the
sound, and to play as loudly as possible. The reproduction of domestic phonographs was
similarly limited in both frequency-range and volume - this period gave rise to the
expression "put a sock in it", which commemorates the common practice of placing a
sock in the horn of the phonograph to muffle the sound for quieter listening. By the end
of the acoustic era, the disc had become the standard medium for sound recording, and its
dominance in the domestic audio market lasted until the end of the 20th century.


The electrical era (1925 to 1945) (including sound on film)

The 'second wave' of sound recording history was ushered in by the introduction of
Western Electric's integrated system of electrical microphones, electronic signal
amplifiers and electrical disc-cutting machines, which was adopted by major US record
labels in 1925. Sound recording now became a hybrid process - sound could now be
captured, amplified, filtered and balanced electronically, and the disc-cutting head was
now electrically-driven, but the actual recording process remained essentially mechanical
– the signal was still physically inscribed into a metal or wax 'master' disc, and consumer
discs were mass-produced mechanically by stamping an impression of the master disc
onto a suitable medium, originally shellac and later polyvinyl plastic. The Westrex
system greatly improved the fidelity of sound recording, increasing the reproducible
frequency range to a much wider band (between 60 Hz and 6000 Hz) and allowing a new
class of professional – the audio engineer – to capture a fuller, richer and more detailed
and balanced sound on record, using multiple microphones, connected to multi-channel
electronic amplifiers, compressors, filters and sound mixers. Electrical microphones led
to a dramatic change in the performance style of singers, ushering in the age of the
"Crooner", while electrical amplification had a wide-ranging impact in many areas,
enabling the development of broadcast radio, public address systems, and electrically-
amplified home gramophones. In addition, the development of electronic amplifiers for
musical instruments now enabled quieter instruments such as the guitar and the string
bass to compete on equal terms with the naturally louder wind and horn instruments, and
musicians and composers also began to experiment with entirely new electronic musical
instruments such as the Theremin, the Ondes Martenot, the electric organ, and the
Hammond Novachord, the world's first analogue polyphonic synthesiser.

Contemporary with these developments, the movie industry was engaged in a frantic race
to develop practical methods of recording synchronised sound for films. Early attempts -
such as the landmark 1927 film The Jazz Singer – used pre-recorded discs which were
played in synchronisation with the action on the screen. By the early 1930s the movie
industry had almost universally adopted the "sound-on-film" technology (developed by
Western Electric and others) in which the audio signals picked up by the microphones
were modulated via a photoelectric element to generate a narrow band of light, of
variable width or height, which was then captured on a dedicated 'audio' strip on the edge
of the film negative, as the images were being filmed. The development of sound on film
also enabled movie-industry audio engineers to make rapid advances in the process we
now know as "multi-tracking", by which multiple separately-recorded audio sources
(such as voices, sound effects and background music) could be replayed simultaneously,
mixed together, and synchronised with the action on film to create new 'blended' audio
tracks of great sophistication and complexity. One of the best-known examples of a
'constructed' composite sound source from this era is the famous "Tarzan yell", originally
created for the RKO Picture series of Tarzan movies, starring Johnny Weissmuller.

Among the vast and often rapid changes that have taken place over the last century of
audio recording, it is notable that there is one crucial audio device, invented at the start of
the "Electrical Era", which has survived virtually unchanged since its introduction in
1925 – the electro-acoustic transducer, or loudspeaker. The most common form of
electro-acoustic speaker is the dynamic loudspeaker, which is effectively a dynamic
microphone in reverse. This device typically consists of a flattened conical acoustic
diaphragm (usually made of a stiff paper compound) suspended in a metal ring, at the
apex of which a moving-coil magnet is attached. When an audio signal from a recording,
a microphone, or an electrified instrument is fed through an amplifier to a loudspeaker,
the electrical impulses drive the speaker magnet backward and forward, causing the
speaker cone to vibrate, and this movement generates the audio-frequency pressure waves
that travel through the air to our ears, which hear them as sound. Although there have
been numerous refinements to the technology, and other related technologies have been
introduced (e.g. the electrostatic loudspeaker), the basic design and function of the
dynamic loudspeaker has not changed substantially in 90 years, and it remains
overwhelmingly the most common, sonically accurate and reliable means of converting
electronic audio signals back into audible sound.

The magnetic era (1945 to 1975)

The third wave of development in audio recording began in 1945, when the Allied
nations gained access to a new German invention - magnetic tape recording. The
technology was invented in the 1930s, but remained restricted to Germany (where it was
widely used in broadcasting) until the end of World War II. Magnetic tape provided
another dramatic leap in audio fidelity - indeed, Allied observers first became aware of
the existence of the new technology because they noticed that the audio quality of
obviously pre-recorded programs was practically indistinguishable from live broadcasts.
From 1950 onwards, magnetic tape quickly became the standard medium of audio master
recording in the radio and music industries, and led to the development of the first hi-fi
stereo recordings for the domestic market, the development of multi-track tape recording
for music, and the demise of the disc as the primary mastering medium for sound.
Magnetic tape also brought about a radical reshaping of the recording process - it made
possible recordings of far longer duration and much higher fidelity than ever before, and
it offered recording engineers the same exceptional plasticity that film gave to cinema
editors - sounds captured on tape could now easily be manipulated sonically, edited, and
combined in ways that were simply impossible with disc recordings. These experiments
reached an early peak in the 1950s with the recordings of Les Paul and Mary Ford, who
pioneered the use of tape editing and "multi-tracking" to create large 'virtual' ensembles
of voices and instruments, constructed entirely from multiple taped recordings of their
own voices and instruments. Magnetic tape fueled a rapid and radical expansion in the
sophistication of popular music and other genres, allowing composers, producers,
engineers and performers to realize previously unattainable levels of complexity. Other
concurrent advances in audio technology led to the introduction of a range of new
consumer audio formats and devices, on both disc and tape, including the development of
full-frequency-range disc reproduction, the change from shellac to polyvinyl plastic for
disc manufacture, the invention of the 33rpm, 12-inch long-playing (LP) disc and the
45rpm 7-inch "single", the introduction of domestic and professional portable tape
recorders (which enabled high-fidelity recordings of live performances), the popular 4-
track cartridge and compact cassette formats, and even the world's first "sampling
keyboards" - the pioneering tape-based keyboard instrument the Chamberlin, and its
more famous successor, the Mellotron.

The "digital" era (1975 to present day)

The fourth and current “phase,” the “digital” era, has seen the most rapid, dramatic and
far-reaching series of changes in the history of audio recording. In a period of less than
20 years, all previous recording technologies were rapidly superseded by digital sound
encoding, which was perfected by the Japanese electronics corporation Sony in the
1970s. Unlike all previous technologies, which captured a continuous analogue of the
sounds being recorded, digital recording captured sound by means of a very dense and
rapid series of discrete samples of the sound. When played back through a digital-to-
analogue converter, these audio samples are recombined to form a continuous flow of
sound. The first all-digitally-recorded popular music album, Ry Cooder's Bop 'Til You
Drop, was released in 1979, and from that point, digital sound recording and reproduction
quickly became the new standard at every level, from the professional recording studio to
the home hi-fi.
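The sampling principle described above can be sketched in a few lines of Python. This is a purely illustrative model (the 44.1 kHz rate and 16-bit depth are the values later standardised for the CD; the 440 Hz test tone and function names are my own):

```python
import math

SAMPLE_RATE = 44_100          # samples per second (the CD standard)
FREQ = 440.0                  # a 440 Hz (A4) test tone

def sample_tone(duration_s):
    """Capture a continuous sine wave as a dense series of discrete 16-bit samples."""
    n_samples = int(SAMPLE_RATE * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE                       # instant at which sample n is taken
        value = math.sin(2 * math.pi * FREQ * t)  # continuous amplitude at that instant
        samples.append(int(value * 32767))        # quantise to the 16-bit range
    return samples

one_ms = sample_tone(0.001)
print(len(one_ms))   # 44 discrete samples stand in for 1 ms of continuous sound
```

On playback, a digital-to-analogue converter performs the reverse trip: the stream of numbers is smoothed back into a continuous voltage, which drives the loudspeaker.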

Although a number of short-lived "hybrid" studio and consumer technologies appeared in
this period (e.g. Digital Audio Tape or DAT, which recorded digital signal samples onto
standard magnetic tape), Sony assured the preeminence of its new digital recording
system by introducing the most advanced consumer audio format to date - the digital
compact disc (CD). The Compact disc rapidly replaced both the 12" album and the 7"
single as the new standard consumer format, and ushered in a new era of high-fidelity
consumer audio - CDs were small, portable and durable, and they could reproduce the
entire audible sound spectrum, with unrestricted dynamic range, perfect clarity and no
distortion. Because CDs were encoded and read optically, using a laser beam, there was
no physical contact between the disc and the playback mechanism, so a well-cared-for
CD could be played over and over, with absolutely no degradation or loss of fidelity. CDs
also represented a considerable advance in both the physical size of the medium, and its
storage capacity - LPs could only practically hold about 50 minutes of audio, because
they were physically limited by the size of the disc itself and the density of the grooves
that could be cut into it - the longer the recording, the closer together the grooves and
thus the lower the overall fidelity; CDs, on the other hand, were less than half the
overall size of the old 12" LP format, but offered about double the duration of the average
LP, with up to 80 minutes of audio.
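The capacity figures above follow directly from the CD's audio parameters (44,100 samples per second, 16 bits per sample, two channels). A quick back-of-the-envelope check, computing the raw audio data only (the disc also carries error-correction and framing data, which this sketch ignores):

```python
# Raw data rate of CD audio: 44,100 samples/s x 2 bytes/sample x 2 channels
SAMPLE_RATE = 44_100
BYTES_PER_SAMPLE = 2      # 16 bits
CHANNELS = 2              # stereo

bytes_per_second = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
capacity_80_min = bytes_per_second * 80 * 60     # 80 minutes, in seconds

print(bytes_per_second)              # 176400 bytes of audio per second
print(capacity_80_min / 1_000_000)   # 846.72 - roughly 850 MB of raw audio
```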

The Compact Disc almost totally dominated the consumer audio market by the end of the
20th century, but within another decade, rapid developments in computing technology
saw it rendered virtually redundant in just a few years by the most significant new
invention in the history of audio recording - the digital audio file (.wav, .mp3 and other
formats). When combined with newly-developed digital signal compression algorithms,
which greatly reduced file sizes, digital audio files rapidly came to dominate the domestic
market, thanks to commercial innovations such as Apple's iTunes media application, and
their hugely popular iPod portable media player.


However, the introduction of digital audio files, in concert with the rapid developments in
home computing, soon led to an unforeseen consequence - the widespread pirating of
audio and other digital media files, thanks to the rapid diffusion of file sharing
technologies such as Napster, and especially the introduction of freeware peer-to-peer
BitTorrent file-sharing software, which enabled users to locate, upload and download
large volumes of digital media files across the internet at high speed. The concurrent
development of high-volume private data storage networks, combined with rapidly
increasing internet signal speeds and continuous improvements in data storage devices,
fuelled an explosion in the illegal sharing of copyrighted digital media. This has caused
great consternation among record labels and copyright owners such as ASCAP, who have
strongly pressured government agencies to make trans-national efforts to shut down data-
storage and file-sharing networks, and to prosecute site operators and even individual
file-sharers.

Although piracy remains a significant issue for copyright owners, the development of
digital audio has had considerable benefits for consumers. In addition to facilitating the
high-volume, low-cost transfer and storage of digital audio files, this new technology has
also powered an explosion in the availability of so-called "back-catalogue" titles stored in
the archives of recording labels, thanks to the fact that labels can now convert old
recordings and distribute them digitally at a fraction of the cost of physically reissuing
albums on LP or CD. Digital audio has also enabled dramatic improvements in the
restoration and remastering of acoustic and pre-digital electric recordings, and even
freeware consumer-level digital software can very effectively eliminate scratches, surface
noise and other unwanted sonic artefacts from old 78rpm and vinyl recordings and
greatly enhance the sound quality of all but the most badly damaged records. In the field
of consumer-level digital data storage, the continuing trend towards increasing capacity
and falling costs means that consumers can now acquire and store vast quantities of high-
quality digital media (audio, video, games and other applications), and build up media
libraries consisting of tens or even hundreds of thousands of songs, albums, or videos -
collections which, for all but the wealthiest, would have been both physically and
financially impossible to amass in such quantities if they were on 78 or LP, yet which can
now be contained on storage devices no larger than the average hardcover book.

The Digital Audio File marked the end of one era in recording and the beginning of
another. Digital files effectively eliminated the need to create or use a discrete, purpose-
made physical recording medium (a disc, or a reel of tape, etc.) as the primary means of
capturing, manufacturing and distributing commercial sound recordings. Concurrent with
the development of these digital file formats, dramatic advances in home computing and
the rapid expansion of the Internet mean that digital sound recordings can now be
captured, processed, reproduced, distributed and stored entirely electronically, on a range
of magnetic and optical recording media, and these can be distributed almost anywhere in
the world, with no loss of fidelity, and crucially, without the need to first transfer these
files to some form of permanent recording medium for shipment and sale.


Acoustical recording

The earliest method of sound recording and reproduction involved the live recording of a
performance directly to a recording medium by an entirely mechanical process, often
called "acoustical recording". In the standard procedure used until the mid-1920s, the
sounds generated by the performance vibrated a diaphragm with a recording stylus
connected to it while the stylus cut a groove into a soft recording medium rotating
beneath it. To make this process as efficient as possible, the diaphragm was located at the
apex of a hollow cone that served to collect and focus the acoustical energy, with the
performers crowded around the other end. Recording balance was achieved empirically.
A performer who recorded too strongly or not strongly enough would be moved away
from or nearer to the mouth of the cone. The number and kind of instruments that could
be recorded were limited. Brass instruments, which recorded well, were often substituted
for instruments such as cellos and bass fiddles, which did not. In some early jazz
recordings, a block of wood was used in place of the bass drum, which could easily
overload the recording diaphragm.


In 1857, Édouard-Léon Scott de Martinville invented the phonautograph, the first device
that could record sound waves as they passed through the air. It was intended only for
visual study of the recording and could not play back the sound. The recording medium
was a sheet of soot-coated paper wrapped around a rotating cylinder carried on a threaded
rod. A stylus, attached to a diaphragm through a series of levers, traced a line through the
soot, creating a graphic record of the motions of the diaphragm as it was minutely
propelled back and forth by the audio-frequency variations in air pressure.

In the spring of 1877 another inventor, Charles Cros, suggested that the process could be
reversed by using photoengraving to convert the traced line into a groove that would
guide the stylus, causing the original stylus vibrations to be recreated, passed on to the
linked diaphragm, and sent back into the air as sound. This idea was soon eclipsed by
the work of an American inventor, and it was not until 1887 that yet another inventor, Emile
Berliner, actually photoengraved a phonautograph recording into metal and played it back.

Scott's early recordings languished in French archives until 2008, when scholars keen to
resurrect the sounds captured in these and other types of early experimental recordings
tracked them down. Rather than using rough 19th century technology to create playable
versions, they were scanned into a computer and software was used to convert their
sound-modulated traces into digital audio files. Brief excerpts from two French songs and
a recitation in Italian, all recorded in 1860, are the most substantial results.[1]



An Edison Home Phonograph for recording and playing brown wax cylinders, c.

The phonograph, invented by Thomas Edison in 1877,[2] could both record sound and
play it back. The earliest type of phonograph sold recorded on a thin sheet of tinfoil
wrapped around a grooved metal cylinder. A stylus connected to a sound-vibrated
diaphragm indented the foil into the groove as the cylinder rotated. The stylus vibration
was at a right angle to the recording surface, so the depth of the indentation varied with
the audio-frequency changes in air pressure that carried the sound. This arrangement is
known as vertical or "hill-and-dale" recording. The sound could be played back by
tracing the stylus along the recorded groove and acoustically coupling its resulting
vibrations to the surrounding air through the diaphragm and a so-called "amplifying" horn.

The crude tinfoil phonograph proved to be of little use except as a novelty. It was not
until the late 1880s that an improved and much more useful form of phonograph was
marketed. The new machines recorded on easily removable hollow wax cylinders and the
groove was engraved into the surface rather than indented. The targeted use was business
communication, and in that context the cylinder format had some advantages. When
entertainment use proved to be the real source of profits, one seemingly negligible
disadvantage became a major problem: the difficulty of replicating a recorded cylinder in
large quantities.

At first, cylinders were copied by acoustically connecting a playback machine to one or
more recording machines through flexible tubing, an arrangement that degraded the audio
quality of the copies. Later, a pantograph mechanism was used, but it could only produce
about 25 fair copies before the original was too worn down. During a recording session,
as many as a dozen machines could be arrayed in front of the performers to record
multiple originals. Still, a single "take" would ultimately yield only a few hundred copies
at best, so performers were booked for marathon recording sessions in which they had to
repeat their most popular numbers over and over again. By 1902, successful molding
processes for manufacturing prerecorded cylinders had been developed.


Spring-motor-powered disc record player, c. 1909

The wax cylinder got a competitor with the advent of the Gramophone, which was
patented by Emile Berliner in 1887. The vibration of the Gramophone's recording stylus
was horizontal, parallel to the recording surface, resulting in a zig-zag groove of constant
depth. This is known as lateral recording. Berliner's original patent showed a lateral
recording etched around the surface of a cylinder, but in practice he opted for the disc
format. The Gramophones he soon began to market were intended solely for playing
prerecorded entertainment discs and could not be used to record. The spiral groove on the
flat surface of a disc was relatively easy to replicate: a negative metal electrotype of the
original record could be used to stamp out hundreds or thousands of copies before it wore
out. Early on, the copies were made of hard rubber, and sometimes of celluloid, but soon
a shellac-based compound was adopted.

"Gramophone", Berliner's trademark name, was abandoned in the US in 1900 because of
legal complications, with the result that in American English Gramophones and
Gramophone records, along with disc records and players made by other manufacturers,
were long ago brought under the umbrella term "phonograph", a word which Edison's
competitors avoided using but which was never his trademark, simply a generic term he
introduced and applied to cylinders, discs, tapes and any other formats capable of
carrying a sound-modulated groove. In the UK, proprietary use of the name Gramophone
continued for another decade until, in a court case, it was adjudged to have become
genericized and so could be used freely by competing disc record makers, with the result
that in British English a disc record is called a "gramophone record" and "phonograph
record" is traditionally assumed to mean a cylinder.

Not all cylinder records are alike. They were made of various soft or hard waxy
formulations or early plastics, sometimes in unusual sizes, did not all use the same groove
pitch, and were not all recorded at the same speed. Early brown wax cylinders were
usually cut at about 120 rpm, whereas later cylinders ran at 160 rpm for clearer and
louder sound at the cost of reduced maximum playing time. As a medium for
entertainment, the cylinder was already losing the format war with the disc by 1910, but
the production of entertainment cylinders did not entirely cease until 1929 and their use
for business dictation purposes persisted into the 1950s.


Disc records, too, were sometimes made in unusual sizes, or from unusual materials, or
otherwise deviated from the format norms of their era in some substantial way. The speed
at which disc records were rotated was eventually standardized at about 78 rpm, but other
speeds were sometimes used. Around 1950, slower speeds became standard: 45, 33⅓,
and the rarely used 16⅔ rpm. The standard material for discs changed from shellac to
vinyl, although vinyl had been used for some special-purpose records since the early
1930s and some 78 rpm shellac records were still being made in the late 1950s.

Electrical recording

Until the mid-1920s records were played on purely mechanical record players usually
powered by a wind-up spring motor. The sound was "amplified" by an external or
internal horn that was coupled to the diaphragm and stylus, although there was no real
amplification: the horn simply improved the efficiency with which the diaphragm's
vibrations were transmitted into the open air. The recording process was in essence the
same non-electronic setup operating in reverse, but with a recording stylus engraving a
groove into a soft waxy master disc and carried slowly inward across it by a feed
mechanism.

The advent of electrical recording in 1925 made it possible to use sensitive microphones
to capture the sound and greatly improved the audio quality of records. A much wider
range of frequencies could be recorded, the balance of high and low frequencies could be
controlled by elementary electronic filters, and the signal could be amplified to the
optimum level for driving the recording stylus. The leading record labels switched to the
electrical process in 1925 and the rest soon followed, although one straggler in the US
held out until 1929.

There was a period of nearly five years, from 1925 to 1930, when the top "audiophile"
technology for home sound reproduction consisted of a combination of electrically
recorded records with the specially-developed Victor Orthophonic Victrola, an acoustic
phonograph that used waveguide engineering and a folded horn to provide a reasonably
flat frequency response. The first electronically amplified record players reached the
market only a few months later, around the start of 1926, but at first they were much
more expensive and their audio quality was impaired by their primitive loudspeakers;
they did not become common until the late 1930s.

Electrical recording increased the flexibility of the process, but the performance was still
cut directly to the recording medium, so if a mistake was made the whole recording was
spoiled. Disc-to-disc editing was possible, by using multiple turntables to play parts of
different "takes" and recording them to a new master disc, but switching sources with
split-second accuracy was difficult and lower sound quality was inevitable, so except for
use in editing some early sound films and radio recordings it was rarely done.

Electrical recording made it more feasible to record one part to disc and then play that
back while playing another part, recording both parts to a second disc. This and
conceptually related techniques, known as overdubbing, enabled studios to create


recorded "performances" that feature one or more artists each singing multiple parts or
playing multiple instrument parts and that therefore could not be duplicated by the same
artist or artists performing live. The first commercially issued records using overdubbing
were released by the Victor Talking Machine Company in the late 1920s. However
overdubbing was of limited use until the advent of audio tape. Use of tape overdubbing
was pioneered by Les Paul in the 1940s.

Magnetic recording

Magnetic recording was demonstrated in principle as early as 1898 by Valdemar Poulsen
in his telegraphone. Magnetic wire recording, and its successor, magnetic tape recording,
involve the use of a magnetized medium which moves with a constant speed past a
recording head. An electrical signal, which is analogous to the sound that is to be
recorded, is fed to the recording head, inducing a pattern of magnetization similar to the
signal. A playback head can then pick up the changes in magnetic field from the tape and
convert them into an electrical signal.

With the addition of electronic amplification developed by Curt Stille in the 1920s, the
telegraphone evolved into wire recorders which were popular for voice recording and
dictation during the 1940s and into the 1950s. The reproduction quality of wire recorders
was significantly lower than that achievable with phonograph disk recording technology.
There were also practical difficulties, such as the tendency of the wire to become tangled
or snarled. Splicing could be performed by knotting together the cut wire ends, but the
results were not very satisfactory.

On Christmas Day, 1932 the British Broadcasting Corporation first used a steel tape
recorder for their broadcasts. The device used was a Marconi-Stille recorder,[3] a huge
and dangerous machine which used steel razor tape 3 mm (0.1") wide and 0.08 mm
(0.003") thick running at 90 metres per minute (approximately 300 feet per minute) past
the recording and reproducing heads. This meant that the length of tape required for a
half-hour programme was nearly 3 kilometres (1.9 mi) and a full reel weighed 25 kg (55 lb).
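The tape-consumption figure quoted for the Marconi-Stille recorder can be verified with simple arithmetic (a trivial sketch; the constant comes straight from the text):

```python
TAPE_SPEED_M_PER_MIN = 90          # steel tape speed quoted above

def tape_needed_m(programme_minutes):
    """Metres of steel tape consumed by a programme of the given length."""
    return TAPE_SPEED_M_PER_MIN * programme_minutes

print(tape_needed_m(30))   # 2700 metres - indeed nearly 3 km for a half-hour show
```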

Magnetic tape


7" reel of ¼" recording tape, typical of audiophile, consumer and educational use
in the 1950s–60s

Engineers at AEG, working with the chemical giant IG Farben, created the world's first
practical magnetic tape recorder, the 'K1', which was first demonstrated in 1935. During
World War II, an engineer at the Reichs-Rundfunk-Gesellschaft discovered the AC
biasing technique. With this technique, an inaudible high-frequency signal, typically in
the range of 50 to 150 kHz, is added to the audio signal before being applied to the
recording head. Biasing radically improved the sound quality of magnetic tape
recordings. By 1943 AEG had developed stereo tape recorders.
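In signal terms, AC biasing simply sums an inaudible high-frequency tone with the programme signal before it reaches the record head. A minimal sketch of that mixing step, under assumed, illustrative values (the 100 kHz bias frequency sits inside the 50-150 kHz range cited above; the bias level and function names are hypothetical - real bias levels are tuned per tape formulation):

```python
import math

SIM_RATE = 1_000_000    # 1 MHz simulation rate, high enough to represent the bias tone
BIAS_FREQ = 100_000     # 100 kHz bias tone, far above the audible range
BIAS_LEVEL = 0.5        # bias amplitude relative to full scale (illustrative)

def biased_signal(audio_sample, n):
    """Mix one audio sample with the bias tone before it reaches the record head."""
    t = n / SIM_RATE
    bias = BIAS_LEVEL * math.sin(2 * math.pi * BIAS_FREQ * t)
    return audio_sample + bias

# A 1 kHz programme tone with the bias added, sample by sample
mixed = [biased_signal(math.sin(2 * math.pi * 1_000 * n / SIM_RATE), n)
         for n in range(100)]
print(max(mixed) <= 1.5)   # True: combined signal stays within audio + bias headroom
```

The bias tone is never heard; it merely drives the tape's magnetic particles through their non-linear region so that the audio portion of the signal is recorded linearly.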

During the war, the Allies became aware of radio broadcasts that seemed to be
transcriptions (much of this due to the work of Richard H. Ranger), but their audio
quality was indistinguishable from that of a live broadcast and their duration was far
longer than was possible with 78 rpm discs. At the end of the war, the Allies captured a
number of German Magnetophon recorders from Radio Luxembourg that aroused great
interest. These recorders incorporated all of the key technological features of analogue
magnetic recording, particularly the use of high-frequency bias.

Development of magnetic tape recorders in the late 1940s and early 1950s is associated
with the Brush Development Company and its licensee, Ampex; the equally important
development of magnetic tape media itself was led by Minnesota Mining and
Manufacturing corporation (now known as 3M).

American audio engineer John T. Mullin and entertainer Bing Crosby were key players in
the commercial development of magnetic tape. Mullin served in the U.S. Army Signal
Corps and was posted to Paris in the final months of World War II; his unit was assigned
to find out everything they could about German radio and electronics, including the
investigation of claims that the Germans had been experimenting with high-energy
directed radio beams as a means of disabling the electrical systems of aircraft. Mullin's
unit soon amassed a collection of hundreds of low-quality magnetic dictating machines,
but it was a chance visit to a studio at Bad Nauheim near Frankfurt while investigating
radio beam rumours that yielded the real prize.

Mullin was given two suitcase-sized AEG 'Magnetophon' high-fidelity recorders and fifty
reels of recording tape. He had them shipped home and over the next two years he
worked on the machines constantly, modifying them and improving their performance.
His major aim was to interest Hollywood studios in using magnetic tape for movie
soundtrack recording.

Mullin gave two public demonstrations of his machines, and they caused a sensation
among American audio professionals—many listeners could not believe that what they
were hearing was not a live performance. By luck, Mullin's second demonstration was
held at MGM studios in Hollywood and in the audience that day was Bing Crosby's
technical director, Murdo Mackenzie. He arranged for Mullin to meet Crosby and in June
1947 he gave Crosby a private demonstration of his magnetic tape recorders.


Crosby was stunned by the amazing sound quality and instantly saw the huge commercial
potential of the new machines. Live music was the standard for American radio at the
time and the major radio networks did not permit the use of disc recording in many
programs because of their comparatively poor sound quality. But Crosby disliked the
regimentation of live broadcasts, preferring the relaxed atmosphere of the recording
studio. He had asked NBC to let him pre-record his 1944–45 series on transcription discs,
but the network refused, so Crosby had withdrawn from live radio for a year, returning
for the 1946–47 season only reluctantly.

Mullin's tape recorder came along at precisely the right moment. Crosby realized that the
new technology would enable him to pre-record his radio show with a sound quality that
equaled live broadcasts, and that these tapes could be replayed many times with no
appreciable loss of quality. Mullin was asked to tape one show as a test and was
immediately hired as Crosby's chief engineer to pre-record the rest of the series.

Crosby became the first major American music star to use tape to pre-record radio
broadcasts, and the first to master commercial recordings on tape. The taped Crosby radio
shows were painstakingly edited through tape-splicing to give them a pace and flow that
was wholly unprecedented in radio. Mullin even claims to have been the first to use
"canned laughter"; at the insistence of Crosby's head writer, Bill Morrow, he inserted a
segment of raucous laughter from an earlier show into a joke in a later show that had not
worked well.

Keen to make use of the new recorders as soon as possible, Crosby invested $50,000 of
his own money into Ampex, and the tiny six-man concern soon became the world leader
in the development of tape recording, revolutionizing radio and recording with its famous
Ampex Model 200 tape deck, issued in 1948 and developed directly from Mullin's
modified Magnetophones.

Multitrack recording

The next major development in magnetic tape was multitrack recording, in which the tape
is divided into multiple tracks parallel with each other. Because they are carried on the
same medium, the tracks stay in perfect synchronization. The first development in
multitracking was stereo sound, which divided the recording head into two tracks. First
developed by German audio engineers ca. 1943, 2-track recording was rapidly adopted
for modern music in the 1950s because it enabled signals from two or more separate
microphones to be recorded simultaneously, enabling stereophonic recordings to be made
and edited conveniently. (The first stereo recordings, on disks, had been made in the
1930s, but were never issued commercially.) Stereo (either true, two-microphone stereo
or multimixed) quickly became the norm for commercial classical recordings and radio
broadcasts, although many pop music and jazz recordings continued to be issued in
monophonic sound until the mid-1960s.

Much of the credit for the development of multitrack recording goes to guitarist,
composer and technician Les Paul, who also helped design the famous electric guitar that
bears his name. His experiments with tapes and recorders in the early 1950s led him to
order the first custom-built eight-track recorder from Ampex, and his pioneering
recordings with his then wife, singer Mary Ford, were the first to make use of the
technique of multitracking to record separate elements of a musical piece asynchronously
— that is, separate elements could be recorded at different times. Paul's technique
enabled him to listen to the tracks he had already taped and record new parts in time
alongside them.

Multitrack recording was immediately taken up in a limited way by Ampex, who soon
produced a commercial 3-track recorder. These proved extremely useful for popular
music, since they enabled backing music to be recorded on two tracks (either to allow the
overdubbing of separate parts, or to create a full stereo backing track) while the third
track was reserved for the lead vocalist. Three-track recorders remained in widespread
commercial use until the mid-1960s and many famous pop recordings — including many
of Phil Spector's so-called "Wall of Sound" productions and early Motown hits — were
taped on Ampex 3-track recorders. Engineer Tom Dowd was among the first to use
multitrack recording for popular music production while working for Atlantic Records
during the 1950s.

The next important development was 4-track recording. The advent of this improved
system gave recording engineers and musicians vastly greater flexibility for recording
and overdubbing, and 4-track was the studio standard for most of the later 1960s. Many
of the most famous recordings by The Beatles and The Rolling Stones were recorded on
4-track, and the engineers at London's Abbey Road Studios became particularly adept at
a technique called "reduction mixes" in the UK and "bouncing down" in the United
States, in which several tracks were recorded onto one 4-track machine and then mixed
together and transferred (bounced down) to one track of a second 4-track machine. In this
way, it was possible to record literally dozens of separate tracks and combine them into
finished recordings of great complexity.

All of the Beatles classic mid-1960s recordings, including the albums Revolver and Sgt.
Pepper's Lonely Hearts Club Band, were recorded in this way. There were limitations,
however, because of the build-up of noise during the bouncing-down process, and the
Abbey Road engineers are still famed for their ability to create dense multi-track
recordings while keeping background noise to a minimum.

4-track tape also enabled the development of quadraphonic sound, in which the four
tracks were used together to create a 360-degree surround-sound field. A number of
albums were released in both stereo and quadraphonic formats in the 1970s, but 'quad'
failed to gain wide commercial acceptance. Although it is now considered a gimmick, it
was the direct precursor of the surround sound technology that has become standard in
many modern home theatre systems.

In a professional setting today, such as a studio, audio engineers may use 24 tracks or
more for their recordings, using one or more tracks for each instrument played.


The combination of the ability to edit via tape splicing and the ability to record multiple
tracks revolutionized studio recording. It became common studio recording practice to
record on multiple tracks, and bounce down afterward. The convenience of tape editing
and multi-track recording led to the rapid adoption of magnetic tape as the primary
technology for commercial musical recordings. Although 33⅓ rpm and 45 rpm vinyl
records were the dominant consumer format, recordings were customarily made first on
tape, then transferred to disc, with Bing Crosby leading the way in the adoption of this
method in the United States.

Further developments

Analog magnetic tape recording introduces noise, usually called "tape hiss", caused by
the finite size of the magnetic particles in the tape. There is a direct tradeoff between
noise and economics. Signal-to-noise ratio is increased at higher speeds and with wider
tracks, and decreased at lower speeds and with narrower tracks.

By the late 1960s, disk reproducing equipment became so good that audiophiles soon
became aware that some of the noise audible on recordings was not surface noise or
deficiencies in their equipment, but reproduced tape hiss. A few specialist companies
started making "direct to disc recordings", made by feeding microphone signals directly
to a disk cutter (after amplification and mixing), in essence reverting to the pre-War
direct method of recording. These recordings never became popular, but they
dramatically demonstrated the magnitude and importance of the tape hiss problem.

Audio Cassette

Before 1963, when Philips introduced the Compact audio cassette, almost all tape
recording had used the reel-to-reel (also called "open reel") format. Previous attempts to
package the tape in a convenient cassette that required no threading met with limited
success; the most successful was 8-track cartridge used primarily in automobiles for
playback only. The Philips Compact audio cassette added much needed convenience to
the tape recording format and a decade or so later had begun to dominate the consumer
market, although it was to remain lower in quality than open reel formats.


In the 1970s, advances in solid-state electronics made the design and marketing of more
sophisticated analog circuitry economically feasible. This led to a number of attempts to
reduce tape hiss through the use of various forms of volume compression and expansion,
the most notable and commercially successful being several systems developed by Dolby
Laboratories. These systems divided the frequency spectrum into several bands and
applied volume compression/expansion independently to each band (Engineers now often
use the term "compansion" to refer to this process). The Dolby systems were very
successful at increasing the effective dynamic range and signal-to-noise ratio of analog
audio recording; to all intents and purposes, audible tape hiss could be eliminated. The
original Dolby A was only used in professional recording. Successors found use in both
professional and consumer formats; Dolby B became almost universal for prerecorded
music on cassette. Subsequent forms, including Dolby C, (and the short-lived Dolby S)
were developed for home use.

In the 1980s, digital recording methods were introduced, and analog tape recording was
gradually displaced, although it has not disappeared by any means. (Many professional
studios, particularly those catering to big-budget clients, use analog recorders for
multitracking and/or mixdown.) Digital audio tape never became important as a
consumer recording medium partially due to legal complications arising from piracy fears
on the part of the record companies. They had opposed magnetic tape recording when it
first became available to consumers, but the technical difficulty of juggling recording
levels, overload distortion, and residual tape hiss was sufficiently high that magnetic tape
piracy never became an insurmountable commercial problem. With digital methods,
copies of recordings could be exact, and piracy might have become a serious commercial
problem. Digital tape is still used in professional situations and the DAT variant has
found a home in computer data backup applications. Many professional and home
recordists now use hard-disk-based systems for recording, burning the final mixes to
recordable CDs (CD-R's).

Most police forces in the United Kingdom (and elsewhere) still use analogue compact
cassette systems to record police interviews, as the medium is less prone to
accusations of tampering.

Recording on film

The first attempts to record sound to an optical medium occurred around 1900. In 1906,
Eugene Augustin Lauste applied for a patent to record Sound-on-film, but was ahead of
his time. In 1923, Lee de Forest applied for a patent to record to film; he also made a
number of short experimental films, mostly of vaudeville performers. William Fox began
releasing sound-on-film newsreels in 1926, the same year that Warner Bros. released Don
Juan with music and sound effects recorded on discs, as well as a series of short films
with fully-synchronized sound on discs. In 1927, the sound film The Jazz Singer was
released; while not the first sound film, it made a tremendous hit and made the public and
the film industry realize that sound film was more than a mere novelty.


The Jazz Singer used a process called Vitaphone that involved synchronizing the
projected film to sound recorded on disk. It essentially amounted to playing a phonograph
record, but one that was recorded with the best electronic technology of the time.
Audiences used to acoustic phonographs and recordings would, in the theatre, have heard
something resembling 1950s "high fidelity".

However, in the days of analog technology, no process involving a separate disk could
hold synchronization precisely or reliably. Vitaphone was quickly supplanted by
technologies that recorded an optical soundtrack directly onto the side of the strip of
motion picture film. This was the dominant technology from the 1930s through the 1960s
and is still in use as of 2013 although the analog soundtrack is being replaced by digital
sound on film formats.

There are two types of synchronized film soundtrack, optical and magnetic. Optical
sound tracks are visual renditions of sound wave forms and provide sound through a light
beam and optical sensor within the projector. Magnetic sound tracks are essentially the
same as used in conventional analog tape recording.

Magnetic soundtracks can be spliced along with the moving image, but a splice creates
an abrupt discontinuity because the audio track is offset relative to the picture. Whether
optical or magnetic, the audio pickup must be located several inches ahead of the
projection lamp, shutter and drive sprockets. There is usually a flywheel as well to
smooth out the film motion to eliminate the flutter that would otherwise result from the
negative pulldown mechanism. If you have films with a magnetic track, you should keep
them away from strong magnetic sources, such as televisions. These can weaken or wipe
the magnetic sound signal. Magnetic sound on a cellulose acetate film base is also more
prone to vinegar syndrome than a film with just the image.

A variable density soundtrack (left) and a bi-lateral variable area soundtrack



For optical recording on film there are two methods utilized. Variable density recording
uses changes in the darkness of the soundtrack side of the film to represent the
soundwave. Variable area recording uses changes in the width of a dark strip to represent
the soundwave.

In both cases, light that is sent through the part of the film that corresponds to the
soundtrack changes in intensity, proportional to the original sound, and that light is not
projected on the screen but converted into an electrical signal by a light sensitive device.

Optical soundtracks are prone to the same sorts of degradation that affect the picture,
such as scratching and copying.

Unlike the film image that creates the illusion of continuity, soundtracks are continuous.
This means that if film with a combined soundtrack is cut and spliced, the image will cut
cleanly but the sound track will most likely produce a cracking sound. Fingerprints on the
film may also produce cracking or interference.

In the late 1950s, the cinema industry, desperate to provide a theatre experience that
would be overwhelmingly superior to television, introduced widescreen processes such as
Cinerama, Todd-AO and CinemaScope. These processes at the same time introduced
technical improvements in sound, generally involving the use of multi-track magnetic
sound, recorded on an oxide stripe laminated onto the film. In subsequent decades, a
gradual evolution occurred with more and more theatres installing various forms of
magnetic-sound equipment.

In the 1990s, digital audio systems were introduced and began to prevail. In many of
them the sound recording is again recorded on a separate disk, as in Vitaphone; but
digital processes can now achieve reliable and perfect synchronization.

Digital Recording

The DAT or Digital Audio Tape

The first digital audio recorders were reel-to-reel decks introduced by companies such as
Denon (1972), Soundstream (1979) and Mitsubishi. They used a digital technology
known as PCM recording. Within a few years, however, many studios were using devices
that encoded the digital audio data into a standard video signal, which was then recorded
on a U-matic or other videotape recorder, using the rotating-head technology that was
standard for video. A similar technology was used for a consumer format, Digital Audio
Tape (DAT) which used rotating heads on a narrow tape contained in a cassette. DAT
records at sampling rates of 48 kHz or 44.1 kHz, the latter being the same rate used on
compact discs. Bit depth is 16 bits, also the same as compact discs. DAT was a failure in
the consumer-audio field (too expensive, too finicky, and crippled by anti-copying
regulations), but it became popular in studios (particularly home studios) and radio
stations. A failed digital tape recording system was the Digital Compact Cassette (DCC).

Within a few years after the introduction of digital recording, multitrack recorders (using
stationary heads) were being produced for use in professional studios. In the early 1990s,
relatively low-priced multitrack digital recorders were introduced for use in home
studios; they returned to recording on videotape. The most notable of this type of
recorder is the ADAT. Developed by Alesis and first released in 1991, the ADAT
machine is capable of recording 8 tracks of digital audio onto a single S-VHS video
cassette. The ADAT machine is still a very common fixture in professional and home
studios around the world.

In the consumer market, tapes and gramophones were largely displaced by the compact
disc (CD) and, to a lesser extent, the MiniDisc. These recording media are fully digital and
require complex electronics to play back.

Interference colors on a compact disc.

Digital sound files can be stored on any computer storage medium. The development of
the MP3 audio file format, and legal issues involved in copying such files, has driven
most of the innovation in music distribution since their introduction in the late 1990s.

As hard disk capacities and computer CPU speeds increased at the end of the 1990s, hard
disk recording became more popular. As of early 2005 hard disk recording takes two
forms. One is the use of standard desktop or laptop computers, with adapters for encoding
audio into two or many tracks of digital audio. These adapters can either be in-the-box
soundcards or external devices, either connecting to in-box interface cards or connecting
to the computer via USB or Firewire cables. The other common form of hard disk
recording uses a dedicated recorder which contains analog-to-digital and digital-to-analog
converters as well as one or two removable hard drives for data storage. Such recorders,
packing 24 tracks in a few units of rack space, are actually single-purpose computers,
which can in turn be connected to standard computers for editing.



The analog tape recorder made it possible to erase or record over a previous recording so
that mistakes could be fixed. Another advantage of recording on tape is the ability to cut
the tape and join it back together. This allows the recording to be edited: pieces of the
recording can be removed or rearranged (see also audio editing, audio mixing, and
multitrack recording).

The advent of electronic instruments (especially keyboards and synthesizers), effects
units and other gear has led to the importance of MIDI in recording. For example, using
MIDI timecode, it is possible to have different equipment 'trigger' without direct human
intervention at the time of recording.

In more recent times, computers (digital audio workstations) have found an increasing
role in the recording studio, as their use eases the tasks of cutting and looping, as well as
allowing for instantaneous changes, such as duplication of parts, the addition of effects
and the rearranging of parts of the recording.


Audio Quality & File Types

*** Audio Quality & File Types ***

Understanding Digital Sound and Analog Sound

The creation of the digital compact disc changed the way music is recorded and played.
Digital sound has, to a large extent, replaced analog sound. But what really is the
difference between digital and analog recordings?

When musical recordings were stored on cassettes or vinyl records, the recordings stored
sound in an analog format — the sound was recorded to the disc or tape as physical
grooves or magnetic impulses. The medium got the song from the artist to the listener,
but it still had some drawbacks.

• Sound degradation: Analog recordings degrade each time they are
played. When you press the Play button, physical contact is made
between the recording and the player. As when you rub sandpaper
against wood, some of the detail on the recording wears away. Before
long, you start to hear the cracks and pops associated with old recordings.
The music gets lost behind the noise, and fairly soon, you need to go out
and buy a new copy to get that wonderfully clear sound back.

• Lack of portability: Vinyl records are hard to carry around and listen to
wherever you want. Unless you have a full stereo system available, it isn't
easy to hear your records in their intended glory. Cassettes made the
music a little more mobile with the advent of the portable stereo and the
Walkman, but the sound wasn't quite as good as that from vinyl records.
Ah, then came the compact disc. With compact discs, the music is encoded on the disc as
numerical information. A laser reads the information and translates that into your favorite
song. This type of recording creates digital sound.

The benefits to using digital sound are the following:

• Portability: You can take digital sound anywhere on a variety of devices,
and you can transfer it from network to computer very easily.

• Durability: Digital audio doesn't degrade like analog audio sources.

• Options: You can buy or record your audio in differing levels of quality
and size, depending on your needs.

• Sound quality: Unless you've invested thousands of dollars in
audiophile-quality analog audio gear, you'll probably notice a better sound
coming from digital audio systems.

Comparing digital sound and analog sound

To compare digital and analog sound, you need to look at a variety of factors.

• Sample rate: In analog recordings, the machine is always recording any
sound or noise that is coming through the microphones. In digital
recording, however, you don't have a constant recording of what's going
on. Instead, you have a series of samples taken from the sound being
recorded.

Think of it like a movie — a motion picture strings together a series of
pictures to make it look like moving action. In this case, digital recording
takes a series of "pictures" of what the sound is like and turns it into a
digital recording. A standard compact disc contains sound that has been
sampled at 44.1 kHz, or just over 44,000 times a second (that's a lot of
pictures!). However, you may run into digital sound on the Internet that's
been recorded at 48 kHz, 96 kHz, or even higher. Just think of it as getting
more detail from more pictures.

• Bits: By increasing the number of bits ("units of information") contained in
the file, the amount of detail contained in each sample is increased. It's the
difference between saying "The cat has white fur" and "The purebred
Siamese feline has ivory fur with charcoal roots." See the difference? Now
imagine the detail that you can get from higher bit depths in your music.
Again, a standard CD has 16-bit sound, although you might occasionally
run into higher bit depths on the Internet.

• Bit rate. Digital music files are measured in the amount of information
they play per second. In most cases, it's measured in Kbps, or kilobits per
second. This is the amount of sound information presented to the listener
every second. The standard for near-CD quality is 128 Kbps, and some
files go up to 320 Kbps. On the other hand, files played over Internet radio
are 56 or 64 Kbps to allow faster transport over networks, like your dial-up
or broadband Internet connection.
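The three quantities above can be sketched numerically. A toy Python illustration using the CD figures quoted in the text (the 6 dB-per-bit rule of thumb is a standard approximation, not something stated in this article):

```python
# Illustrative numbers only: standard CD audio parameters.
SAMPLE_RATE = 44_100   # samples ("pictures") taken per second
BIT_DEPTH = 16         # bits of detail stored in each sample

# How many discrete amplitude levels can each 16-bit sample distinguish?
levels = 2 ** BIT_DEPTH
print(levels)                      # 65536 possible values per sample

# Rule of thumb: each bit adds roughly 6 dB of dynamic range.
dynamic_range_db = BIT_DEPTH * 6.02
print(round(dynamic_range_db, 1))  # 96.3 dB for CD audio

# One second of stereo CD audio therefore carries:
samples_per_second = SAMPLE_RATE * 2   # two channels
print(samples_per_second)              # 88200 samples
```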

The tradeoff for better sound

As with everything else in life, there is a tradeoff in getting this improved sound. For the
increased detail and better sound, you give up space on your hard drive or memory card.
The extra information and detail mean that more memory is taken up. The size of these
files is usually measured in megabytes, or MB. For comparison, the size of current hard
drives is measured in gigabytes, or GB. There are approximately 1,000MB in each GB. If
your songs are recorded at a lower bit rate, you can fit more songs on your drive — but
they won't sound as good. It all depends on what you want — more songs or better
sound.

To use less memory space, you can compress the data. Data compression reduces the file size while
keeping the detail of the music as close as possible to the original.
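The space side of this tradeoff is easy to quantify. A minimal sketch, using the decimal conversion (1,000 kB per MB) that this article uses elsewhere:

```python
def mb_per_minute(kbps: int) -> float:
    """Approximate storage for one minute of audio at a given bit rate.
    kilobits -> kilobytes (divide by 8) -> megabytes (divide by 1000),
    over 60 seconds of playback."""
    return kbps * 60 / 8 / 1000

print(mb_per_minute(128))  # 0.96 -- roughly the "1 MB per minute" often quoted
print(mb_per_minute(320))  # 2.4 MB per minute
```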


Music fans have their preferences — some people still insist on the superiority of analog
sound. But make no mistake: Digital music is here to stay.

*** Digital file Formats to know ***

For file-based music playback there are a number of popular file formats out
there and you may be wondering which is best. The answer is, it depends. It
depends on your hardware and software environment since not all music file
formats are supported by all file-based music playback devices. Network players
typically come with a list of supported file formats, and media player software may
come with file format restrictions as well. The most popular (and annoying)
example is iTunes' lack of support for the FLAC format.

Pulse-code modulation (PCM) is the most common method of storing analog audio
signals in digital format. CD data is stored as PCM data as are most of the file formats
associated with digital downloads.

In a PCM stream, the analog waveform is represented by two values: the sample rate,
which represents the number of times per second a sample is taken, and the bit depth
which represents the number of possible values each sample can have. A file's bit rate is
obtained by multiplying its bit depth times the sample rate, times the number of channels
(for stereo that means x2). So a CD's bit rate is equal to 1,411kbps (16 x 44,100 x 2).
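The bit-rate formula in this paragraph can be written as a one-line function; plugging in the CD figures reproduces the 1,411kbps quoted above:

```python
def bit_rate_bps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> int:
    """PCM bit rate = bit depth x sample rate x number of channels."""
    return bit_depth * sample_rate_hz * channels

cd = bit_rate_bps(16, 44_100)   # stereo CD ("Redbook") audio
print(cd)                       # 1411200 bits per second
print(cd // 1000)               # 1411 kbps, the figure quoted above
```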

For digital downloads, common bit rates, bit depths and sample rates include 256kbps,
320kbps, 16-bit/44.1kHz (CD-quality or Redbook), 16-bit/48kHz, 24-bit/44.1kHz, 24-
bit/48kHz, 24-bit/96kHz, 24-bit/176.4kHz, and 24-bit/192kHz. We are also seeing Digital
eXtreme Definition (DXD) files becoming available, which are PCM files with a bit depth
and sample rate of 24-bit/352.8kHz. There is also some debate as to what represents a
High Definition (HD) download. Some people feel that any file with a bit depth of 24-bit
is HD, while others restrict this classification to files with a sample rate of 48kHz or
greater. AudioStream has taken the position that 24-bit/48kHz and greater resolutions
represent HD.

Beyond bit and sample rates, PCM data can be stored in a number of file formats which
are either uncompressed or compressed. Compressed files are further broken down into
either lossy or lossless compression. An uncompressed file format means that the
associated data has not been altered in any way. It is stored bit for bit. A compressed file
format means the data has been altered in order to achieve a smaller file size. In lossless
compression, all of the original data is kept while in lossy compression some data is
discarded, i.e. thrown away forever.
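The lossless round trip described here can be demonstrated with any general-purpose lossless compressor. The sketch below uses Python's zlib as a stand-in (audio codecs like FLAC use models tuned to musical signals, but the principle of exact, bit-for-bit recovery is the same):

```python
import zlib

# Stand-in for raw PCM audio data (repetitive bytes compress well).
original = bytes(range(256)) * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # the compressed copy is smaller on disk...
print(restored == original)            # True -- every bit is recovered
```

A lossy codec, by contrast, could never pass that final equality check: the discarded data is gone for good.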

PCM Lossy Compressed File Formats

Lossy compressed formats are used by streaming services like Pandora and Spotify and
remain the download format of choice for sites like Amazon and the iTunes store. You
can think of lossy compressed formats as the fast food of downloads and while they are
perfectly suitable for streaming services due to their reduced file size, when paying for
downloads we recommend sticking with lossless and uncompressed formats due to their
improved sound quality. It's important to note that you should not convert a lossy
compressed file from one lossy format to another, since with each conversion you will
lose more musical data. I wonder if, after enough conversions from one lossy format to
another, you'd end up with silence (joke).

Advanced Audio Coding (AAC), or MPEG-4 File Format, V.2 with Advanced Audio
Coding. AAC is a lossy compressed file format designed to be the successor to MP3 and
is used by sites like the iTunes Store for its music downloads (bitrate: 256kbps) and
YouTube for its streaming audio.

MP3
The original and still most popular and widely supported lossy compressed file format,
MP3 became an MPEG (Moving Picture Experts Group) standard in 1993. Amazon,
among many others, uses MP3 as the delivery format for its music downloads (average
bitrate: 256kbps). See our article The Aging Anatomy of MP3 for more info and history
on MP3.

OGG Vorbis
OGG Vorbis is an open source lossy compressed file format developed by the
Xiph.Org Foundation. Among others, Spotify uses the Vorbis format for its streaming
services and offers three levels of quality: 96kbps (Spotify mobile "Low bandwidth"
setting), 160kbps (Spotify Desktop standard streaming quality), and 320kbps (Spotify
Mobile "High quality" setting).

Windows Media Audio (WMA) is Microsoft's proprietary codec and comes in a lossy
compressed version as well as a lossless compressed version, WMA Lossless.

PCM Lossless Compressed File Formats

Since lossless compression is just that, lossless, all of the original data remains intact,
unlike lossy compression which discards musical data in order to achieve smaller file
sizes. You can convert from one lossless format to another or to an uncompressed format
with no loss of data.

Apple Lossless Audio Codec (ALAC). ALAC is Apple's open source (since 2011)
lossless compressed file format.

Monkey's Audio (Monkey's Audio APE). APE is a lossless compressed format.

Free Lossless Audio Codec (FLAC). The most common lossless compressed file format
for music downloads. FLAC, which is open source, supports embedded metadata and
typically reduces the original uncompressed file size by 50-60%. The only drawback for
FLAC is Apple's iTunes does not support it.

PCM Uncompressed File Formats

Uncompressed file formats are exact copies of the original data. As such they take up
more space than compressed formats. Some suggest, and I'm one of them, that the cost of
storage has reached a point where the extra storage requirements and associated cost of
uncompressed formats are negligible.

Audio Interchange File Format (AIFF). AIFF is Apple's proprietary uncompressed file
format. iTunes users interested in an uncompressed file format with embedded metadata
choose AIFF since iTunes does not support FLAC.

FLAC (uncompressed)
Free Lossless Audio Codec (FLAC). The application dbPoweramp offers an option to rip
or convert your music data to an uncompressed FLAC format.

Waveform Audio File Format (WAVE or WAV) is another popular uncompressed format
for music downloads developed by Microsoft and IBM. The one drawback for the WAV
format is a lack of widespread support for its method of encoding metadata.

Direct Stream Digital (DSD), developed by Sony and Philips originally for SACD, uses
pulse-density modulation (PDM) encoding to store analog signals. DSD is a 1-bit format
with sample rates of 2.8224 MHz (also referred to as single-rate DSD or 64x DSD, 64
times CD's sample rate) and 5.6448 MHz (also known as double-rate DSD or 128x DSD,
128 times the sample rate of CD) at present.
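The DSD rates quoted above follow directly from CD's 44.1kHz sample rate. A quick check in Python (the rates come from the paragraph above; the stereo bits-per-second line is my own arithmetic for illustration):

```python
# Single-rate DSD: 64 times CD's 44,100 Hz sample rate.
DSD64_RATE = 2_822_400
print(DSD64_RATE // 44_100)            # 64 -- hence the name "DSD64"

# A 1-bit format: the raw stereo data rate is just rate x 1 bit x 2 channels.
bits_per_second = DSD64_RATE * 1 * 2
print(bits_per_second)                 # 5644800 bits/s, about 5.6 Mbps
```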

DSD File Formats

Digital Interchange File Format (DFF). A DSD format that does not support embedded
metadata.
Direct Stream File (DSF). A DSD format that supports embedded metadata.


File Sizes and Storage

Here's a guide to different file types by bitrate and bit depth/sample rate and their
associated storage requirements.

File Type    MB of Storage per Minute of Music (stereo)    Albums per 1TB of storage*
128kbps      1MB                                           20,900
256kbps      2MB                                           10,500
320kbps      2.4MB                                          8,700

* Numbers are approximate and based on 45 minutes of compressed music per album

File Type                    MB of Storage per Minute of Music (stereo)    Albums per 1TB of storage**
16-bit/44.1kHz (CD quality)  10MB                                          2,000
16-bit/48kHz                 11MB                                          1,900
24-bit/48kHz                 16.5MB                                        1,200
24-bit/96kHz                 33MB                                            630
24-bit/192kHz                66MB                                            320
DXD (24-bit/352.8kHz)        121.12MB                                        165
DSD64                        40.375MB                                        500
DSD128                       80.75MB                                         250

** Numbers are approximate and based on 45 minutes of uncompressed music per album
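The uncompressed PCM rows in the table above can be reproduced from first principles. A sketch assuming the table counts binary megabytes (1 MB = 2^20 bytes), which matches its figures:

```python
def pcm_mb_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Uncompressed PCM storage for one minute of audio, in binary
    megabytes (1 MB = 2**20 bytes, the convention the table appears to use)."""
    bytes_per_minute = sample_rate_hz * (bit_depth // 8) * channels * 60
    return bytes_per_minute / 2**20

print(round(pcm_mb_per_minute(44_100, 16), 1))  # 10.1 -- the table's ~10 MB/min (CD)
print(round(pcm_mb_per_minute(48_000, 24), 1))  # 16.5 MB/min, as tabulated
print(round(pcm_mb_per_minute(96_000, 24), 1))  # 33.0 MB/min, as tabulated
```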



Music Theory and Piano Foundations

Beats, Bars, Phrasing

*** Beats, Bars & Pattern Phrasing ***


Tempo is the Italian word for time, and we use it to describe how quickly a piece of
music is performed; the higher the tempo, the more sprightly the music is. With acoustic
music, the style is traditionally indicated by words – again in Italian – that describe the
intended pace of the piece of music. There are quite a few, but you may have heard some
of the more common ones:

Grave: Plodding

Adagio: Relaxed

Allegro: Jaunty

Presto: Fast

Whilst we tend to have an innate sense of what people mean when they say ‘fast’ or
‘slow’ music, in order to have a firmer benchmark we can also pin tempo to
something that everyone already knows: time measurement. We do this with Beats Per
Minute, or BPM. If something is defined as 60BPM then we know there is a beat every
second, and if a piece of music is set to a specific BPM then those beats will fall to a
strict time, every time: a key identifier of electronic music. In order to make sure that we
keep to a strict BPM, rather than our natural human tendency to drift slightly around a
more general tempo, we need some kind of aid to keep us in check. Enter the
metronome. A metronome can be set to a specific BPM (electronic metronomes can be
set extremely precisely, but traditional, mechanical metronomes like the one pictured are
a little less precise although still very accurate) and is used to keep time for you to play
along with. A metronome, sometimes called simply a click, is a standard feature included
in more or less all music production software and often offers precision of 100ths of a
beat. Oh, wait… let’s take a look at what a beat actually is…
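The BPM-to-time relationship described above is simple enough to express directly; a tiny sketch:

```python
def seconds_per_beat(bpm: float) -> float:
    """At a given tempo, how long does each beat last?"""
    return 60.0 / bpm

print(seconds_per_beat(60))   # 1.0 -- one beat every second, as the text says
print(seconds_per_beat(120))  # 0.5 -- twice the tempo, half the beat length
```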


Time Signature Explained

It’s all well and good saying that BPM means beats per minute, but when you actually
think about it what does that even mean? We don’t question what a ‘minute’ is – it’s just
a standard unit of measurement that we assign to time and trust that it’ll always be the
same, right? To draw a comparison, a minute is to the grand dimension of time as a beat
is to a whole note. In other words, a whole note is a musical constant: it exists in its
entirety, a perfect ideal. We chop whole notes up into divisions of themselves, be they
half notes, quarter notes, eighth notes, and so on, but the note is the whole. A beat is
a musical constant too, but it is relative to a (whole) note.

We write music in terms of beats and bars, and a time signature is the key to what those
beats and bars mean relative to a ‘whole’ note. This is one of the most commonly
misunderstood aspects of basic music theory, but it’s actually very simple to grasp. Let’s
take an example; by far the most common time signature in western music is 4/4 (in fact
it’s so common that it’s actually referred to as common time). The first 4 is the number
of beats there are in a bar, and the second 4 tells us how many of those beats make up a
whole note – so in 4/4 each beat is a quarter note. From this
signature we can surmise that there is always a whole note’s worth of beats in every bar of music written
in the 4/4 time signature. A 3/4 time signature, most popularly used for waltz style music,
contains three beats per bar and each beat is a quarter note long, so a bar always contains
three quarters of a whole note.
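Putting tempo and time signature together, the length of one bar follows directly from the arithmetic above. A small sketch, with a function name of our own choosing:

```python
def bar_duration_seconds(bpm: float, beats_per_bar: int) -> float:
    """Length of one bar: beats in the bar times the length of one beat."""
    return beats_per_bar * (60.0 / bpm)

print(bar_duration_seconds(120, 4))  # 2.0 - a bar of 4/4 at 120 BPM
print(bar_duration_seconds(120, 3))  # 1.5 - a bar of 3/4 at the same tempo
```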

So a beat is a division of a bar, and both the kind of division and how much of a whole
note fits in one bar are defined by the time signature. We’re on the way to having time
signature explained; now we need to look at why.



Without rhythm, music isn’t music – it’s just a collection of sounds with no drive behind
them. The rhythm of a piece of music is the thing that lends some sense to the tempo and
time signature in a similar sort of way to how the sun rising and setting every day gives
us some sense of time going by – we know the beginning of each day follows the end of
the last one. Without an all-important sense of rhythm, time signature wouldn’t matter and
we could simply write beats down in a line and play them indiscriminately. In reality,
some beats in music are given more weight than others – they are stressed. How we
stress notes in a bar is what gives music that head nodding feel, and the design of musical
structure gives us the power to understand how that head nodding feel translates to the
page (or screen).

One way to get your head around how we approach creating rhythm is to work somewhat
backwards. Let’s take the example of 4/4 music again: if we really wanted to we could
write the exact same beat structure in 3/4 (as the four after the dividing line, the
indication of how many beats make up a whole note, is the same), but if we were to do that
each bar would only be three quarters as long – because there are only three beats in a bar
in 3/4 compared to four in 4/4. This would mean that sequences that fit neatly into 4/4
timing would sprawl over the bars in 3/4 timing and start to make the notion of a bar
irrelevant. Therein lies the power of the bar; we write music in phrases, and how those
phrases naturally resolve gives us a natural ‘feel’ for what time signature the music is in.


With 4/4 music, the phrases sound something like ‘1 2 3 4, 1 2 3 4, 1 2 3 4, 1 2 3 4’. 3/4,
on the other hand, sounds like ‘1 2 3, 1 2 3, 1 2 3, 1 2 3’. The pattern in which beats are
stressed controls the rhythm, as we said, and when that pattern starts over it gives us a
natural place to split into a new bar.

The pattern of beat stresses doesn’t have to start on the first beat of the bar like in the
previous examples, of course – if you’re a reggae fan you’ll be familiar with the way that
stresses tend to occur on the second (and perhaps fourth) beat in a bar of 4/4, with the
effect of making all the energy when you nod your head to it come from lifting it up
rather than bobbing it down.

The down/up phases of your nodding head tend to work along with what we call on beats
and off beats, too. A bar starts with an on beat; this first beat is called the down beat.
The on/off pattern continues through the bar, and the last beat is called the up beat. They
are named after the way conductors (not the ones at train stations, those guys with the
sticks in orchestras) wave their sticks – or batons to use the correct term – to direct the
first beat in a bar with a down stroke and the last beat with an upstroke. Just to make
matters a little complicated, down beat and up beat are often used to describe rhythm
when on beat and off beat should be used instead, and on beat and off beat have their own
connotations when used outside of music theory – nobody likes to be told they’re off
beat, right?

If your brain is feeling a little like scrambled egg right now, don’t worry. This is pretty
heavy stuff, and it’s actually quite difficult to skim the surface of music theory because
every answer seems to pose three more questions and before you know it you’re so far
down the rabbit hole that you’re wondering whether the Cheshire Cat prefers techno or
dubstep. The good news is that we all have an innate sense of rhythm – after all, our
hearts are forever keeping time inside us – and some of the greatest musicians and
producers of all time have never so much as broken the spine on a music theory book.
The key to understanding rhythm is practice.


Let’s look at beats and rhythmic patterns

To create a beat pattern we place percussive sounds on beats so that they repeat in a
pleasing and predictable way. This is also known as creating a loop.

Beats, or note lengths, fit inside bars, and using a number of bars we will create musical
sentences, or phrases.

Here is a visual representation of 1 Bar in Ableton Live where the blue selected area
represents a whole note being divided into smaller and smaller note lengths:

• Whole Note
• Half Note
• Quarter Note
• 8th Note
• 16th Note
• 32nd Note
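Assuming a 4/4 bar (so a whole note lasts four beats), each division halves the duration of the one before it. A rough Python sketch of the arithmetic (function name ours):

```python
def note_duration_seconds(bpm: float, division: int) -> float:
    """Duration of a 1/division note in 4/4: 1 = whole, 2 = half, 4 = quarter..."""
    whole_note = 4 * (60.0 / bpm)  # a whole note fills the 4-beat bar
    return whole_note / division

for division in (1, 2, 4, 8, 16, 32):
    print(f"1/{division} note at 120 BPM lasts {note_duration_seconds(120, division)} s")
```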

What is a Musical sentence or a phrase?

Well, just like when you start and stop talking to someone, a sentence needs to make
sense: it has to start somewhere, give the necessary information, and then finish
strong or leave room to let the listener realize you are going to continue.

Musical sentences are quite similar.

When using patterns to create music and rhythm we need to be aware that
repetition can get boring and cause the listener to lose interest. But if we understand
how to create subtle changes every so often, we can recapture a listener’s attention
and keep them entertained even while they feel they are able to predict the patterns
we have made. Introducing variation in our patterns at regular intervals does this.

“Our humanity rests upon a series of learned behaviors, woven together into patterns that are infinitely
fragile and never directly inherited.” – Margaret Mead

“Intelligence is the ability to take in information from the world and to find patterns in that information
that allow you to organize your perceptions and understand the external world.” – Brian Greene

The human brain seeks comprehension through the identification of patterns. Yet
while we seek predictable organization, we also crave the excitement of the
unexpected. Let’s examine the many layers of patterns that fill our music, as well as
the unexpected disruptions within those patterns that captivate our imagination.


Here’s an example:

Pattern A is the fundamental beat. B, C, and D are variations of A.

This creates an interesting musical sentence that is predictable but entertaining when
the variations occur, keeping the listener intrigued throughout.
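One way to picture this is as step-sequencer rows, where 'x' is a hit and '.' a rest over 16 sixteenth-note steps. The patterns below are invented for illustration, not taken from any particular track:

```python
pattern_a = "x...x...x...x..."  # Pattern A: the fundamental four-on-the-floor beat
pattern_b = "x...x...x..x.x.."  # B, C and D: variations of A
pattern_c = "x...x.x.x...x..."
pattern_d = "x...x...x...xxxx"

# A musical sentence: mostly A, with variations placed at predictable intervals.
phrase = [pattern_a, pattern_a, pattern_a, pattern_b,
          pattern_a, pattern_a, pattern_c, pattern_d]
for bar in phrase:
    print(bar)
```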

*** Instrument Identification ***






*** The Music Alphabet ***

The Music Alphabet: Start Speaking the Language by

Learning the A-B-C's
When you play music with other people, you have to be able to talk about what you're
going to play. To do that, you have to know how to speak the language. Learning the
music alphabet will help you do that.

The language of music is just like any other language. That's why we use the music
alphabet. It's a simple system of letters that help us write down the sounds we actually
want to play.

Notes: the Building Blocks of Music

All music is made up of notes. A note is just any pitch made by a musical instrument.
Every note in music has a letter name. The music alphabet is made up of only seven
letters: A-G. This is because when we play the notes in order, the note that we would call
"H", sounds like another "A", so we just start the set over. Example:




This is similar to counting numbers. You start with 1, then 2, 3 and so on, but when you
get past 9, you don't start making up new numbers, you start back with 0, with a 1 in
front. Then the same thing happens with 20 and 30 and so on.

In music, we call these sets octaves. So when you get to the next highest note with the
same letter name, it will still be an "A" note, but its pitch will be in a higher octave.

The image above shows how the notes continue in both directions, (hence the "..." on
both ends). When you are going to the right, or up the alphabet, you don't stop at G.
Instead you start over at the next highest A and keep going.

The same thing happens for the opposite direction as well. When you are going to the
left, or down the alphabet, you don't stop at A. You simply start over at the next lowest G
and keep going.


Sharps and Flats

Even though the music alphabet is only made up of 7 letters, that doesn't mean that there
are only seven notes. There are actually 12. The seven letters represent natural notes.
Don't get hung up on this word. Natural just means they are regular notes; they are just
the regular alphabet letter name like A, C, or G.

The other five notes fall in between these letters in the form of sharp notes or flat notes.
We write these sharp notes and flat notes by adding either a sharp symbol ♯, or a flat
symbol ♭ to the natural notes.

A sharp note is one note higher in pitch than the natural letter it uses. (So A♯ is higher
than A.) A flat note is one note lower in pitch than the natural letter it uses. (So A♭ is
lower than A.)

NOTE: Going to the right on this line would be going higher in pitch (or going
"up"), and going to the left would be going lower in pitch (or going "down").

Here is how this would work out on a piano keyboard. The blue notes are natural, the red
notes are sharp, and the orange notes are flat.


You might notice that certain sharps or flats take up the same space on the keyboard.
That's because they are actually the same note; they are just called by a different name
depending on where we use them. We call these kinds of notes enharmonics. That
means that A♯ and B♭ are the same note, C♯ and D♭ are the same note, and so on.

You may also notice that there are no sharps or flats between B-C and E-F. Now that
doesn't mean there is no such thing as B♯ or F♭ – we just usually call them C and E instead.

"Sharp" and "flat" can also be verbs. When we sharp a note, we raise its pitch by one
note. When we flat a note, we lower its pitch by one note.

So when we sharp a D, we get D♯. When we sharp a B, we get a C. Also, when we flat a
B, we get B♭. When we flat an F, we get E.

Notice: Don't worry about why there is no note between B-C or E-F. It is best just to
accept that this is how it is and learn it this way. To question this would be like
questioning why you never see Q without a U beside it in English words. Let's not
waste time on it.


*** Scales ***

Music Scales: Subsets of the Alphabet

Music scales are a set of notes that we choose to use for a particular song. These notes are
chosen because they sound good together. Different cultures have different scales that
they use more. The most common to most of us is the major scale.

To form a scale, we proceed through the alphabet choosing notes that go together to
achieve a particular sounding set. Most of the time we do this by putting together 'whole
steps' and 'half steps'.

Major Scales

In this lesson we cover how to put together scales by starting with the simplest one, the
major scale. A major scale is put together using whole steps and half steps in a particular
regular pattern.


Minor Scales

Minor scales are different from major scales in that there is no one "right" minor
scale. Minor refers to the fact that the 3rd degree (or 3rd note in the scale) is a half step
lower than it is in the major scale. The other members of this scale can change to get
different effects while still maintaining a "minor" sound.

Pentatonic Scales

In this lesson we'll cover the infamous pentatonic scales. Pentatonic is a word that comes
from the Greek 'penta' meaning five and 'tonic' referring to tone. Quite literally, a
pentatonic scale is one that is made of only five notes. There are two main forms of
pentatonics: minor and major.

Major Scales
A scale is a set of notes usually in sequential order that is used to play in a particular
key or range.

Different songs will use different scales and different parts of the same song can use
different scales. Simply put, a scale is a set of notes we work with in any music. Right
now we'll just be working with major scales and chromatic scales.

Scales are usually presented to us in a sequential order from one note (C) up through the
other notes in the scale to the next note of the same letter name (C). This is a one octave
scale because it only goes up through one full set of the notes, or one full octave.


This is a two octave scale:


These are the notes of the C major scale. The key of C has no sharps or flats. The key of
D has two sharps.

D E F# G A B C# D


So how do you know which notes to make sharp or flat? There are two ways to tell. One
way is to use a key signature chart as a reference. I actually don't prefer this way
because it relies on either referring back to the chart every time, or simply brute force
memorization of all of the key signatures. (While it is indeed a good idea to memorize all
of them, it's better to do it in a way that makes sense.)

Whole Steps & Half Steps / Tones & Semi Tones

That's why we have this way. This is also one way that we can use to actually build scales
from scratch, and that is by counting up in whole steps and half steps.

A half step is the distance between one note and the next note up or down. So referring to
the music alphabet, A to A# is a half step, B to C is a half step, and E to Eb is a half step.
A whole step is two of these, so A to B is a whole step, F# to G# is a whole step, and
F to Eb is a whole step. Any major scale is made up of these steps:

Root, W W H W W W H (in tones: Root, T T S T T T S)

That means that there is a whole step between the first note and the second note, a whole
step between the second note and the third note, a half step between the third note and the
fourth, and so on.


D w E w F# h G w A w B w C# h D

Remember this pattern. Get used to saying "whole whole half, whole whole whole half".
This pattern will remain the same for all major scales. Say this pattern out loud several
times until you memorize it.

When you play any music scales on an instrument or sing them, be sure to play/sing them
forwards and backwards, or ascending and descending. This gives you a firmer
understanding of how those scales are supposed to sound.

Numbering Scale Degrees

When we talk about scales, it's useful to be able to talk about individual components of
those scales. We call those components 'degrees.' But rather than having to say "it's the
F# in a D scale" and "it's the G in an Eb scale" we just call that degree the 3rd. This
means that any major scale can be converted to these numbers:


That means that we can use this formula to make any major scale. Just start on a note, and
use that "whole whole half, whole whole whole half" pattern.
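That formula translates directly into code. Here is a sketch (names ours; sharps-only spelling for simplicity) that builds any major scale by counting half steps through the chromatic notes:

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole whole half, whole whole whole half

def major_scale(root: str) -> list[str]:
    """Walk the chromatic notes from the root using the major-scale step pattern."""
    i = CHROMATIC.index(root)
    scale = [root]
    for step in MAJOR_STEPS:
        i = (i + step) % 12
        scale.append(CHROMATIC[i])
    return scale

print(major_scale("C"))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C']
print(major_scale("D"))  # ['D', 'E', 'F#', 'G', 'A', 'B', 'C#', 'D']
```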


NB: Tones & Semi Tones are movements up or down the keyboard, for example…

Tone: move up or down the keyboard and skip a note

Semi-Tone: Move up or down the keyboard to the very next note

Major Pentatonic Scale

The major pentatonic consists of the 1st, 2nd, 3rd, 5th and 6th notes of a major
scale. For instance, let’s take a look at the C major scale. The notes of the C major scale are
C – D – E – F – G – A – B. The C major pentatonic would therefore be, the 1st note, C, the
2nd note, D, the third note, E, the 5th note, G, and the 6th note, A, or C – D – E – G – A.

If you play the black keys on your piano, you will hear that pentatonic sound. The black keys
on your piano are pentatonic: they form the G flat major pentatonic and its relative minor, E flat minor pentatonic.
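The 1-2-3-5-6 recipe above is easy to express as a sketch (the function name is ours):

```python
def major_pentatonic(major_scale_notes: list[str]) -> list[str]:
    """Pick the 1st, 2nd, 3rd, 5th and 6th degrees of a 7-note major scale."""
    return [major_scale_notes[degree - 1] for degree in (1, 2, 3, 5, 6)]

c_major = ["C", "D", "E", "F", "G", "A", "B"]
print(major_pentatonic(c_major))  # ['C', 'D', 'E', 'G', 'A']
```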



Minor Scales
A lot of people find minor scales to be confusing because when most people start
learning music, they start with mainly major scales and end up getting a stronger
foundation with those and not minors. Also there are several types of minors which
sometimes get confused and run together.

But just because they are different, doesn't mean they are harder, we just might have to
spend more time with them because we aren't used to them yet.

1. A natural minor is essentially its relative major rotated to start on a different note

C Major -> A Minor:

C D E F G A B C -> A B C D E F G A

Eb Major -> C Minor:

Eb F G Ab Bb C D Eb -> C D Eb F G Ab Bb C

2. Now lets compare the natural minor to its parallel major. (In music, parallel
means same letter name.)

C Major -> C Minor

C D E F G A B C -> C D Eb F G Ab Bb C

R 2 3 4 5 6 7 R -> R 2 b3 4 5 b6 b7 R

So we see that one way of looking at this is to say that the natural minor scale has a
flat 3, 6, and 7.

3. Another way is to count whole steps and half steps. (This is slightly more time
consuming but makes more sense to some people.) In this example the lower
case (w) and (h) indicate Whole and Half steps between the notes. Notice the italicized
notes are the ones that change.


C Major:
C w D w E h F w G w A w B h C

C Minor:
C w D h Eb w F w G h Ab w Bb w C

So there are 3 different ways of looking at natural minor scales: comparing them to
relative majors, comparing them to parallel majors, and counting whole steps and half
steps. Now let's check out the other types of minors.

Natural Minor: C D Eb F G Ab Bb C
Melodic Minor: C D Eb F G A B C (ascending) + C Bb Ab G F Eb D C (descending)
Harmonic Minor: C D Eb F G Ab B C
Minor Pentatonic: C Eb F G Bb
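Point 1 above – the natural minor as its relative major started from the 6th degree – can be sketched as a simple rotation (the function name is ours):

```python
def relative_natural_minor(major: list[str]) -> list[str]:
    """Rotate a 7-note major scale to begin on its 6th degree,
    giving the relative natural minor (C major -> A minor)."""
    return major[5:] + major[:5]

c_major = ["C", "D", "E", "F", "G", "A", "B"]
eb_major = ["Eb", "F", "G", "Ab", "Bb", "C", "D"]
print(relative_natural_minor(c_major))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(relative_natural_minor(eb_major))  # ['C', 'D', 'Eb', 'F', 'G', 'Ab', 'Bb']
```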

Melodic Minor

The melodic minor seems confusing at first but you just have to remember one thing: It's
the only minor scale that is different going up than it is coming down.

C D Eb F G A B C

R 2 b3 4 5 6 7 R

C Bb Ab G F Eb D C

R b7 b6 5 4 b3 2 R

On the way up, the first half is minor with the flatted 3, but the second half is major with
the raised 6 and 7; on the way down it is the same as the natural minor. This is called a
melodic minor because it is the most common in the traditional 'melodies' that we are
used to. It sounds more like music and less like a boring exercise. The more you practice
this scale ascending (forwards) and descending (backwards), the more it will make sense to you.


Harmonic Minor

The harmonic minor is in between the other two. It uses the natural minor's flat 3 and 6
but uses the melodic minor's raised 7.

C D Eb F G Ab B C

R w 2 h b3 w 4 w 5 h b6 wh 7 h R

Notice the interval between the 6 and 7 is a step and a half, (hence the wh). This can be
tricky at first, so watch out for it. This scale has a more oriental feel that is slightly more
dissonant and could be very useful in creating catchy melodies because it is something
that is not heard as often and isn't "worn out".

The Minor Pentatonic is a natural minor but without the 2 and 6. It is the easiest minor
scale to improvise with because it eliminates the two most dissonant notes in the natural minor.




*** Circle of fifths ***

Learn the Circle of Fifths and How YOU Can Use It!
Ever heard people talking about the circle of fifths and wondered what it was? Or are you
familiar with it, but don't know what it does or how to use it? Well wonder no more.

The Circle of Fifths is a chart organizing all of the keys into a system that we can use to
relate them to one another. There are several versions of this chart. The most common
ones use only the major keys with only the chord symbols. The version below however
shows the major and minor keys with their key signatures.

How to Read the Circle

How Does This Work?


The letters on the outside of the circle are the major keys and the letters on the inside are
minor. The circle itself shows how many sharps or flats there are in each key, and the key
signatures are on the edge.

This is called the Circle of Fifths because each note is a perfect fifth away from the next.
A perfect fifth is a distance of seven half steps: for example, from A up through A#, B, C,
C#, D, and D# to E.

When we go clockwise around the circle, we are going "up a fifth". This is also called
adding a sharp (because of what it does to the key signature). When we go counter-
clockwise, we are going "down a fifth". This is also called adding a flat.

(This is also called the Circle of Fourths because when we go "up a fifth", it is the same
thing as going down a fourth. Also when we go "down a fifth", it is the same thing as
going up a fourth.)
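Because a perfect fifth is seven half steps, the whole circle can be generated by repeatedly adding 7 modulo 12. A sketch (sharps-only spelling, function name ours):

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def circle_of_fifths(start: str = "C") -> list[str]:
    """Stack perfect fifths (7 half steps) from the start note, visiting all 12 keys."""
    i = CHROMATIC.index(start)
    return [CHROMATIC[(i + 7 * k) % 12] for k in range(12)]

print(circle_of_fifths())
# ['C', 'G', 'D', 'A', 'E', 'B', 'F#', 'C#', 'G#', 'D#', 'A#', 'F']
```

Moving one position clockwise in this list is "up a fifth"; moving one position counter-clockwise is "down a fifth" (or up a fourth).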

Using the Circle

So how is this useful?

• Dominant 7th chords (which show up EVERYWHERE in all kinds of

music), have a certain tendency to go towards certain chords. But which
chord is next? The answer is simply to go down a fifth, or counter-
clockwise on the circle. So if we are dealing with an A7, it will pull toward a
D. If we are dealing with an F7, it will pull toward a Bb.
• You can also use the Circle of Fifths to work out ANY major or minor scale
by playing a game of leap frog going around the circle.
• When playing most kinds of music, the most common chords will be the
chord of the key you are working in, and the chords on either side of it on
the circle. For example, if you are playing in the key of C, you'll likely use F
and G as well.
• When writing songs, sometimes it can be difficult to come up with
interesting chord progressions. We can use the chart for this. The closer
two chords are on the circle, the better they will sound together – part of the
reason neighboring chords are used so much in music is that they sound good. To
make it more interesting, take the key you are in (for example 'D') and leap
to another chord (we'll use C). Then work back through the chords toward
home base again and use the chords around it. Try playing this on your
instrument to see how it sounds.



*** Chords ***

When building melodic and harmonic content in your song, it can be easier for some
people to start with the melody – and then build the chords later. If you can do this, kudos
to you. Others may find it easier to build the chords first, and then the melody later.

One thing I always found was that my chord progressions were boring, uninspiring, and
simple. Sometimes this can work, and there’s nothing wrong with just making an
extremely simple chord progression. BUT sometimes we want a little more than that. We
want to know how to make our music interesting, give it a mood.

And by giving it a mood, we don’t want to spend hours on end looking for that perfect
chord, right? Surely I’m not the only one who gets sick of plotting random notes in the
piano roll waiting for one to stick out?

What is Chord Quality?

See, in my opinion – if you want to be good at building chord progressions and the likes,
you’ve got to be familiar with chordal qualities in each key.

What do I mean by this?

In every key there are specific chords that contain a certain feel, or character. You could
also call this mood or quality. Chord qualities are like colors, in the simplest sense.

How important is knowing this?

Although I think almost all music theory is important, there are arguably some things that
are rendered unimportant due to advancement in DAW technology and other things
including music genres that contain a lack of harmonic content. But in saying this, I think
knowing the different chord qualities is important because it eliminates the need to spend
hours on end looking for that special chord that comes next in the progression. You won’t
need to jump around the piano roll plugging in random notes.

Most of all though, it’s about creating mood. Want a more saddening chord? Pick a
minor. Want something a little more energetic? Choose a ninth.

Different Chord Qualities

There are several relatively common chord qualities that you should be aware of. Some of these
aren’t used often, others are used extensively:


• Major and minor

• Major and minor seventh
• Dominant seventh
• Major and minor sixth
• Suspended fourth
• Ninth
• Diminished
• Augmented

You may have heard of a few of these, others may be foreign.

Now we’ll have a look at each chord quality individually.

Each section will contain an explanation of the mood, instruction on how to build them,
an audio clip of how they sound, and an image of the chord in the piano roll.


Major

We’ll start with major as it’s the simplest of them all.

Major chords sound happy and simple, and the fact is – they are simple. To build a
major chord you simply use the 1st, 3rd, and 5th notes in whichever scale you’re using.
Let’s use the scale of C Major as an example:

This will give us a simple Major triad chord.

Seeing as we work with DAWs which contain piano rolls, you may want to work it out
via half steps (notes on a piano roll) which in the case of a Major chord would be the root
(C) + 4 half steps (C# – D – D# – E), + 3 half steps (F – F# – G).


As you can see, we start counting the half steps from the note above the last one.

Minor (m)

Minor chords are considered to be sad, or ‘serious.’

To build a minor chord, you follow the same pattern as a major chord except drop the
third degree of the scale down a half step.

• A C Major chord consists of a C, E, and G

• A C minor chord consists of a C, Eb, and G

Although they seem almost identical, the difference is significant. Listen to the audio clip
for the Major chord above, and then the one for the minor below:

Unlike the major, we start at the root (C) and then go up 3 half steps (Eb/D#), and then 4
half steps to the 5th degree (G) which is the same.


Once you understand that it’s simply the 3rd degree that needs to be moved – minor
chords are also just as simple as major.

Major Seventh (maj7)

Major seventh chords are considered to be thoughtful, soft. You’ll hear many seventh
chords in Jazz music.

Major seventh chords are built from the 1st, 3rd, 5th, and 7th tones of the Major scale.

In the scale of C Major, this would be C, E, G, B.

Another way to think about it is a Major chord plus an extra note a half step below the
root's octave.

In terms of half steps, a Major seventh would be exactly the same as a standard Major
triad, except for a fourth note added 4 half steps above the 5th degree (G in this case).

To sum it up again: Root, then 4/3/4.

Minor Seventh (m7)

Minor seventh chords are different from Major seventh chords in that they’re a lot
more moody, or contemplative.

You can follow the same concept we showed above, but starting with a minor chord
instead. The minor seventh uses the 1st degree, a flat 3rd degree, 5th degree, and flat 7th
degree of the major scale.

Instead of C, E, G, B. We’d have C, Eb, G, Bb.


Instead of going 4/3/4, we go 3/4/3 from the root note.

Dominant Seventh (7)

Dominant seventh chords are strong, powerful, and adventurous.

These are very similar to minor sevenths except for the second note not being flat.

Instead of C, Eb, G, Bb. It would be C, E, G, Bb.

Dominant seventh chords are made from the 1, 3, 5, and flat 7 tones of the major scale.

In terms of half steps? Starting at the root note, then 4/3/3.

Major Sixth (6)

Major sixth chords are fun and playful.

A major sixth chord uses the 1st, 3rd, 5th, and 6th degree of the major scale:


To build a major sixth chord, simply follow the build for a major triad, and then add the
sixth degree or two more half steps.

Starting at the root note, then 4/3/2.

Minor Sixth (m6)

Minor sixth chords are almost identical to major sixth chords apart from their lowered 3rd
degree. They sound a lot darker and mysterious.

A minor sixth chord is built from the 1st, flat 3rd, 5th, and 6th degrees of the major scale.

Starting at the root note, then 3/4/2.

Suspended Fourth (sus4)

Suspended fourth chords are majestic in nature, and sound ‘proud’ almost.


They’re a little different to the rest, as they don’t follow a major or minor pattern.

To build a Suspended fourth, you use the 1st, 4th, and 5th notes of the major scale:

To describe in half steps, it would be root + 5 half steps + 2 half steps.


Ninth (9)

Ninth chords are very energetic and full of life.

To build a ninth chord, we’d use the 1, 3, 5, and 9 tones of the major scale. This
particular chord passes over two octaves, as you can see below.


In terms of half steps, it’d be the root, plus 4 half steps up to E, then 3 half steps up to G,
and finally 7 half steps up to D.

Note: a minor ninth is built in a similar way except the E is lowered by a semitone to an Eb.

Diminished (dim)

Diminished chords are dark and edgy.

To build a diminished chord you’d use the root, flat 3, and flat 5 tones of the major scale.

In terms of half steps, this one’s pretty simple. Go up 3, and then 3 again.


Augmented (aug)

Augmented chords contain quite a lot of movement and sound suspenseful.

To build an augmented chord we use the root, 3, and sharp 5 tones of the major scale.


This is basically the same as a major chord, but with the 5th degree raised a half step.


There are far more chords than just these.

The mood and feel of chords is a subjective thing. You have to discover what chords you
like yourself. If something doesn’t sound suspenseful or dark to you, then it doesn’t
sound suspenseful or dark to you.

Another thing to note is that these chords can sound completely different when following
another chord. Certain chords may sound dissonant when on their own, but when used in
a progression they might sound far more natural.
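All of the half-step recipes above can be collected into one table. The dictionary below mirrors the 4/3, 3/4, 4/3/4 and similar patterns described in this section (the names and sharps-only spelling are ours; intervals are half steps above the root, and the ninth's 14 half steps land an octave up):

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

QUALITIES = {
    "major":            [0, 4, 7],      # root + 4 + 3
    "minor":            [0, 3, 7],      # root + 3 + 4
    "major seventh":    [0, 4, 7, 11],  # root, 4/3/4
    "minor seventh":    [0, 3, 7, 10],  # root, 3/4/3
    "dominant seventh": [0, 4, 7, 10],  # root, 4/3/3
    "major sixth":      [0, 4, 7, 9],   # root, 4/3/2
    "minor sixth":      [0, 3, 7, 9],   # root, 3/4/2
    "sus4":             [0, 5, 7],      # root + 5 + 2
    "ninth":            [0, 4, 7, 14],  # root, 4/3/7 (the 9th is an octave up)
    "diminished":       [0, 3, 6],      # root + 3 + 3
    "augmented":        [0, 4, 8],      # root + 4 + 4
}

def build_chord(root: str, quality: str) -> list[str]:
    """Name the chord tones by stepping up from the root (octaves collapsed)."""
    i = CHROMATIC.index(root)
    return [CHROMATIC[(i + step) % 12] for step in QUALITIES[quality]]

print(build_chord("C", "major"))             # ['C', 'E', 'G']
print(build_chord("C", "minor"))             # ['C', 'D#', 'G']  (D# = Eb)
print(build_chord("C", "dominant seventh"))  # ['C', 'E', 'G', 'A#']  (A# = Bb)
```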


First things first, don’t panic! This isn’t as complicated as it sounds; it’s a
simple technique that will make your songs sound a lot better and also give
you much more choice in what to play when you see a chord symbol.

Up until now you have been playing all your chords in what is known as root
position, this means playing the note with the same name as the chord as the
lowest note of your chord.

C Major Chord: The C Major chord shown above is in root position because the C
note is on the bottom, but hold on a second, what if we don’t play C at the bottom,
but we play it at the top like this.

C Major Chord first inversion denoted as C/E

Notice that it’s still the same chord, C Major. It contains the same notes as your
regular C Major chord, but the root note (C) is now at the top. Try playing this
and comparing the sound it makes with the regular C Major chord; it sounds
slightly different but has the same quality as the normal C major chord. You
can play this version of the chord almost anywhere you see a C chord
specified. If they specifically want you to play this version of the C chord then
they would denote it as C/E. This means, "play me a C chord but use E as the
bottom note".

How many such inversions are there for a particular chord? Well, simple
math tells us that for a 3 note chord there are three possible ways to play it.


Either with the C on the bottom, the E on the bottom or the G on the bottom.
The third way to play the C chord is shown below, with G on the bottom.

C Major chord second inversion, denoted as C/G

For a four note chord such as C7 (C E G Bb) there are four possible ways to
play it: with either the C, E, G or Bb on the bottom. Try out each of the
different ways of playing this chord and see how they sound. It may take you
a while to work out the chord in the new positions, but believe me it’s well
worth the effort.

This applies to any chord you may come across: they can all be played
inverted to give a slightly different feel to the chord.
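Since an inversion is just the same notes with a different one at the bottom, every voicing of a chord is a rotation of its note list. A sketch (the function name is ours):

```python
def inversions(chord: list[str]) -> list[list[str]]:
    """Every rotation of the chord: root position first, then each inversion."""
    return [chord[k:] + chord[:k] for k in range(len(chord))]

for voicing in inversions(["C", "E", "G"]):
    print(voicing)
# ['C', 'E', 'G']  root position
# ['E', 'G', 'C']  first inversion  (C/E)
# ['G', 'C', 'E']  second inversion (C/G)
```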



*** Arpeggios ***

Let's talk about playing arpeggios. If you're like most music theory students, you have
heard the term and been intimidated by it. A lot of people think that arpeggio is Italian for
"hard to play." While I don't know the exact translation, the real definition of an arpeggio
is just a 'broken chord.'

What is an Arpeggio?

Some people will argue with me and say that an arpeggio is more like a scale than a
chord because it is a linear set of notes and not a simultaneous "tone cluster." True, but
who cares? It's still a broken up chord.

Like a scale, an arpeggio is linear: it's a set of notes that you play one at a time either in
order or otherwise. Like a chord, it is made up of only certain notes from that set. So an
arpeggio is a chord played like a scale.

Let's try one!

Let's say we have an A major chord. It is made up of A, C#, and E. Instead of playing
them all at once like we would with a chord, we play them individually:

A C# E A C# E A C# E A C#...
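That run of notes is just the chord cycled one note at a time. A tiny sketch (mine, using only the standard library) spells this out:

```python
from itertools import cycle, islice

def arpeggio(chord, length):
    """The chord's notes played one at a time, cycling in order."""
    return list(islice(cycle(chord), length))

print(" ".join(arpeggio(["A", "C#", "E"], 11)))
# A C# E A C# E A C# E A C#
```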

Here is a list of major chords and their arpeggios, just so you can see how they
all work. I recommend playing all of these on your instrument right now!!

Chord Arpeggio
D D,F#,A
A A,C#,E
E E,G#,B
B B,D#,F#
F# F#,A#,C#
C# C#,E#,G#
Bb Bb,D,F
Eb Eb,G,Bb
Ab Ab,C,Eb
Db Db,F,Ab
Gb Gb,Bb,Db
Cb Cb,Eb,Gb
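As a quick sanity check, a short sketch of my own (not from the lesson) can verify that any spelling in the table, including the unusual E# and Cb, really is a major triad: a major third (4 semitones) and a perfect fifth (7 semitones) above the root.

```python
# Map note names (including enharmonic spellings like E# and Cb) to
# pitch classes 0-11, then test for the major-triad shape:
# root, root + 4 semitones, root + 7 semitones.
PITCH = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "E#": 5,
         "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10,
         "Bb": 10, "B": 11, "Cb": 11}

def is_major_triad(notes):
    root, third, fifth = (PITCH[n] for n in notes)
    return ((third - root) % 12, (fifth - root) % 12) == (4, 7)

print(is_major_triad(["C#", "E#", "G#"]))  # True
print(is_major_triad(["Cb", "Eb", "Gb"]))  # True
print(is_major_triad(["A", "C", "E"]))     # False: A minor, not A major
```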

So now we know what arpeggios are. But what do they do, and how can we use them?
Most melodies don't just use the previous or next note in a scale. There are some
exceptions, such as Mary Had a Little Lamb. In the key of C:

E D C D E E E... Each note is right next to another one on the C scale. (Try playing this
on an instrument to see how it lays out on the scale.)

Real Life Examples

When we go from one note in the scale to the next note above or below it, this is called a
step (whole step or half step, it doesn't matter). Most melodies will at some point skip notes
in the scale to get to the next note, like Twinkle, Twinkle, Little Star. In C again:

C C -> G G A A G ... The jump from C up to G is called a leap.

A C major chord is made of C, E and G. Notice how in the leap above we go from C to
G. These are two of the notes in the arpeggio/chord. So if you are good at playing
arpeggios, you don't have to worry about finding that G note; you'll already know exactly
where it is.
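Under the assumption that we stay in the key of C, the two melody snippets quoted above can be classified note by note with a short sketch of mine:

```python
# Classify the motion between consecutive melody notes in C major:
# same note = repeat, adjacent scale degree = step, anything wider = leap.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]

def motions(melody):
    degrees = [C_MAJOR.index(note) for note in melody]
    labels = []
    for a, b in zip(degrees, degrees[1:]):
        gap = abs(b - a)
        labels.append("repeat" if gap == 0 else "step" if gap == 1 else "leap")
    return labels

print(motions(["E", "D", "C", "D", "E"]))  # Mary Had a Little Lamb: all steps
print(motions(["C", "C", "G", "G", "A"]))  # Twinkle: the C to G jump is a leap
```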

Both of these are very elementary examples, but I use them because everyone knows
them, so they don't require much explanation.

Playing arpeggios is common in melodies because they contain notes that naturally sound
good with the song's chords (because they ARE chords).

(NOTE: If you are looking to improvise on a melody, try using arpeggios. Take the
current chord you're in, and use those notes to expand the melody.)

How do you get to improvising?

Practice, practice, practice. Learn lots of arpeggios on your instrument. Start by naming a
chord: F. Use the notes of that chord: F A C, and play them all in order: F A C F A C F A
... going up through several octaves until you get familiar enough to play them in your
sleep. This will make you a much better player and will make learning songs much
easier. Playing arpeggios is one of the FASTEST ways to get better on your instrument. It
is also one of the fastest ways to start understanding general music theory.



*** Chord Progressions ***

• A chord progression can be described as a series of chord changes played
throughout a piece of music. When you listen to music of different genres you
will notice that several changes occur, and these changes normally flow
with the melody of the piece. The changes you hear taking place in music are
chord progressions.
• Whether you play the guitar, organ or piano in a band, you will have to make some
form of relative movement that falls within the structure of the song.
• Important Notice - Before you approach the concept of chord progression, it is
very important that you learn to play all the basic chords on your piano or
keyboard. Some of the basic chords you need to know are major, minor,
augmented and diminished. All other chords will fall into place as you learn more
about the piano.
• The main idea behind progressions
• Progressions are built based on the position of notes in a scale. Each note within a
major scale is used to form a specific chord. The positions of the notes in a scale are
represented by numbers called scale degrees.
• Here is an example below:

• Roman numerals are traditionally used in music to represent note positions, but
contemporary musicians also use ordinary Arabic numerals.
• Here is an example below:


• Each position of a note within a major scale is used to form a chord. Below is an
illustration that will explain more.

• In the illustration above, the 1st or tonic note forms the tonic or main chord, which
is major. You should notice that the 1st, 4th and 5th notes of the scale all form
major chords, while the 2nd, 3rd and 6th notes form minor chords. The 7th note
forms a diminished chord, which will be discussed in another lesson. This pattern
applies to all major scales.
• Chord progressions are normally represented by numbers. For example – 1,4,5 or
I-IV-V. This type of progression is called the one-four-five progression. If this
type of movement is done in the C major scale then you would play C major
chord, F major chord and G major chord.
• Types of progressions
• There are different types of chord progression in music. Here is a list of some
popular progressions that are played on the piano, organ or synthesizer.
• 1-4-5 (I-IV-V)
• 1-3-4-5 (I-III-IV-V)
• 1-2-5-1 (I-II-V-I)
• 1-4-2-5-1 (I-IV-II-V-I)
• 1-6-2-5-1 (I-VI-II-V-I)
• 1-3-6-2-5-1 (I-III-VI-II-V-I)
• Take a look at how these progressions apply in the key of C major.


• There are some songs that use the 1-4-5 or 1-5-4 progression throughout the
entire song. However, most popular musical styles such as rhythm and blues,
funk, rock, gospel and jazz use two or more different progressions.
• It is very important that you know how to play all the progressions listed above in
every key, so you will need to practice. However, it is very important that you
take them one step at a time.
• Below is a table showing all the chords for the 1-4-5 progression in all 12 keys on
the piano.
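Such a table is easy to check by computation. Here is a small sketch of mine (not part of the lesson; it uses one flat-based spelling per pitch, so for example F# prints as Gb) that builds the triad on every degree of a major key, labels its quality, and reads off the 1-4-5 progression in any of the 12 keys:

```python
# Build the major scale of any key, stack every other scale note to get
# the triad on each degree, classify it by its semitone shape, and
# extract the 1-4-5 (I-IV-V) progression.
NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # whole/half-step pattern of the major scale
QUALITY = {(4, 7): "major", (3, 7): "minor", (3, 6): "diminished"}

def major_scale(key):
    root = NAMES.index(key)
    return [NAMES[(root + s) % 12] for s in MAJOR_STEPS]

def triad_on(key, degree):
    """Triad on a 1-based scale degree: the degree note plus the two
    notes two and four scale steps above it."""
    scale = major_scale(key)
    notes = [scale[(degree - 1 + i) % 7] for i in (0, 2, 4)]
    root, third, fifth = (NAMES.index(n) for n in notes)
    shape = ((third - root) % 12, (fifth - root) % 12)
    return notes, QUALITY[shape]

def one_four_five(key):
    return [triad_on(key, d)[0][0] + " major" for d in (1, 4, 5)]

for degree in range(1, 8):  # the seven diatonic chords of C major
    notes, quality = triad_on("C", degree)
    print(degree, " ".join(notes), quality)
print(one_four_five("C"))   # 1-4-5 in C: C major, F major, G major
```

The loop reproduces the rule stated above: degrees 1, 4 and 5 come out major, degrees 2, 3 and 6 minor, and degree 7 diminished, in every major key.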




*** Structure and Arrangement ***

So far we have quickly examined how melody and harmony contribute to creating
some sort of order and coherence in a musical piece, which would otherwise sound
not harmonious and pleasant, but rather discordant and undirected.

However, another element that plays an important role in how music is put together
is the structure of the musical piece. We are used to listening to musical pieces whose
structure we can somehow recognize as familiar, especially in popular
music. This does not mean that other kinds of music do not have rules or guidelines
of this kind: classical music, for example, is based on models such as the sonata, the fugue,
the interlude, etc. By choosing what kind of piece to write, a composer is expected to
follow specific rules that make that piece recognizable as being of a particular kind.

Pop music structure is rather simple compared to classical canons. Being able to
recognize these musical elements is important to a sound engineer because they come up
very often in the studio: if a talent asks you to re-record 'from the end of the second
verse until just before the bridge', we must be able to understand what he is saying.
We engineers are probably more familiar with time codes and absolute time
references, but of course it is easier for a talent in the live room to refer to 'the end
of the first chorus' than to a time like '01:02:35:22', as we would...

So, that's why we must be able to talk the talk. Here are definitions of the most
common elements of a pop song; they will be demonstrated in a few examples
for a better understanding. Please understand that there is some degree of
generalization in these definitions; of course there are exceptions:


Intro

Stands for introduction, and as you may have expected it is the very beginning of a
song. Generally instrumental, it usually ends where the first verse begins,
but sometimes we can find a chorus right after the intro. It usually contains melodic
elements that resemble the chorus or the riff.


Riff

A catchy instrumental sequence that very often becomes the most characteristic
part of a song, usually four to eight bars long. Riffs are used most frequently in rock
music. To name a few classics, good examples are the guitar lines in Deep Purple's
'Smoke on the Water' or the Rolling Stones' 'Satisfaction'. A very famous bass riff is
Queen's 'Another One Bites the Dust'. Riffs are usually repeated consistently
throughout the song.



Verse

The verse is usually the first lyrical element of a song; therefore it is common that
the instrumental parts here work to support the vocals, which are the main focus.
For the same reason, verses are often calmer and tend to contain fewer instruments.
Verses commonly have a structure based on four lines or multiples of four; this
structure is repeated several times through the song, but with different lyrics. It is
not rare to find two or more verses in sequence, perhaps divided by a four- or
eight-bar instrumental part to enhance the separation.


Build

Not so commonly used but still important, builds are placed before a part of the song
that is considered important or very characteristic, such as a chorus or a solo. A build
precedes these parts to create a sense of ascending climax, which
culminates in the part that immediately follows. We may hear, for example, the
band playing with progressively increasing intensity, or a slight increase in
volume, due to the greater emphasis progressively employed in playing this part. If it
contains lyrics, their meaning may lead to a conclusion in the following part, and
singers may further stress the ascending climax with their singing style. For
these reasons, builds cannot be very long; otherwise the climax would not be as
effective. They usually range from two to eight bars; longer ones are less common.


Chorus

Choruses play a most important role in pop music, being the part that is intended to
be remembered most easily, since it contains the whole identity of a song. Many pop
songs are actually written around a chorus. Usually the first chorus starts after a
couple of verses, but it is not unusual to hear songs that actually start with the
chorus; think about The Beatles' 'She Loves You', for example. Many modern pop songs
feature the chorus within the first minute, or even less. This trend has developed to
make sure that the listener is presented with the catchiest part of the song as soon
as possible. Choruses are repeated several times during a song, and unlike the
verses, the lyrics usually remain the same, or change very slightly. Sometimes,
especially towards the end of the song, choruses are transposed to a higher key to
further enhance them, or repeated twice or more ('ad libitum') to stress that the
song has come to an end.


Bridge (or Special)

Pop songs employ a very repetitive structure, so it is common to introduce some
elements of variation. Bridges, or specials, are usually placed after a second or third
chorus, so in a later stage of the song, and they may contain melodic and harmonic
variations to the song. Often they are transposed to a higher key, which can be kept
until the end of the song or restored to the original key afterwards, and they tend to
convey some sort of unexpected element that surprises the listener: a new
instrument, a different vocal melody, or different dynamics. After the special, the
song usually arrives soon at an end.


Solo

A solo is an instrumental part where a particular instrument is dominant. It usually
features virtuoso parts that show the musical skills of the player. Guitar solos are
very common in rock music, but keyboard/synth solos are also very popular.
Depending on the genre, different instruments are used more frequently: in jazz
music, for example, brass instruments are widely used. Even instruments that are
usually dedicated to the rhythm parts can perform a solo: bass and drum solos are
common and very effective. More rarely, it is possible to find vocal solos as well.
Commonly (but not always) the melodic elements of the chorus are employed in a
solo.


Breakdown

A breakdown, like a special, is usually employed to introduce some variation into the
otherwise repetitive structure of a pop song. In a breakdown there is a sudden
change in the dynamics of the song, and most instruments stop playing apart from a
few. Commonly only the drums are left, or only a rhythm guitar. Sometimes
breakdowns contain lyrical parts, in which case the voice can be unaccompanied by
instruments, or supported by a drum pattern or a simple riff, or both.
Breakdowns are usually four or eight bars long, and they often lead to a chorus
or a solo, to achieve an effect similar to the builds.


Coda

The last part of the song. It is often instrumental and can be a part specifically
designed to be a coda, or a repetition of previously used parts (most
commonly the riff or the chorus) that fades out into silence. A solo can also be used
as a coda. Some songs do not have a coda at all; they just finish abruptly after a
chorus or after an ending verse.

The definitions are not in a particular order, but songs tend to follow some kind of
common pattern with regard to their structure.


Please Note: all references are found in the Ableton Help View or in the Ableton Live 9 Manual.

Ableton Live Level 1: Beats, Sketches, and Ideas

Learn the fundamentals of Ableton Live by producing a series of musical sketches.

Learn how to create beats, bass lines, and melodies with MIDI instruments and
audio clips. Discover the infinite potential of Live as a studio production tool.

What You'll Learn

• Ableton First Steps: See Help View Audio I/O & Chpt 2.2
• Organizational techniques and practices: See Chpt 4.15
• File Types & File Structure: See Chpt 5.1/ 5.5.1 / 5.6.1 / 5.8.1
• Navigating Live's interface: See 6.1
• Session View/Arrangement View: See Chapter 6 & 7
• Session automation: See Chp 4.12
• Clips & Clip envelopes: See 4.13 Chp 8 Chpt 9
• Track automation: See 4.12 Chp 19
• Programming original drum beats: See Help View - Creating Beats
• Sequencing musical parts: See Help View - Playing Software Instruments
• MIDI and Key mapping: See Chp 4.14
• Introduction to effects and routing: See Chp 4.10 - 15.1 / Chp 23



Ableton Live Level 2: Analyze, Deconstruct, Recompose, and Assemble

Learn powerful workflow techniques, and increase the speed at which you can
compose and sequence musical parts. Build your musical vocabulary by analyzing a
variety of electronic music styles, and assemble your first full tune using modified
preset content.
What You'll Learn:

• The Drum Rack: See Chpt 18.6

• Recording original drum and percussion parts
• Modifying and shaping drum sounds: See Chpt 18.7
• Creating bass lines
• Creating melodies and leads
• Creating pads, textures, and keyboard parts
• Audio recording methods and techniques: See Help View - Audio rec & Chpt
• Resampling - See Chpt 14.5
• Arranging
• Mixing basics and exporting: See Chpt 15 & 5.2.3


Ableton Live Level 3: Synthesis and Original Sound Creation

Create your own sounds. Learn the fundamentals of subtractive synthesis, make
custom drum kits, and program your own synthesizer sounds in Analog. In this
level, you'll learn universal sound design concepts that apply to almost any software
or hardware synthesizer you'll ever use. By the end of Level 3, you will have
completed a full tune using your own original sounds.

What You'll Learn:

• Building custom Drum Racks: See Chpt 18

• Layering drums for a custom sound
• Slice to MIDI: Chpt 11
• Creating original bass sounds: Help view adv, Creating a Bass Patch
• Creating original melody and lead sounds: Help view adv, Operator Creating
Lead Sounds
• Creating original pads, textures, and keyboard sounds
• Creating ear candy: sweeps, risers, and sound effects
• Using dynamics processors as part of the composition process: See Chpt 22.1
& 22.18 / 22.17 / 22.20



Ableton Live Level 4: Advanced Sound Creation

Expand your sonic repertoire and music making abilities with advanced synthesis
concepts and techniques. Delve into FM synthesis, advanced sampling, and physical
modeling. In this level you will complete a full song that utilizes the concepts and
techniques learned thus far.

What You'll Learn:

• FM synthesis with Operator

• Creating custom patches with Sampler: See help view sampler
• Physical modeling using Electric, Tension, and Collision: See 24.2 / 24.3 /
24.9 / 22.5
• "Breaking" physical models to make unimaginable sounds: See 22.1
• Combining instruments with Instrument Racks to make huge sounds
• Making sound effects with advanced synthesis, sampling, and vocoding
• How to best use original sounds in music creation


Ableton Live Level 5: Advanced Effect Processing

Take your music to the next level with advanced effect processing and routing
techniques. Gain mastery of Live's effect processors, both individually, and
combined in powerful Audio Effect Racks. By the end of this level, you will have
completed another full tune using advanced processing techniques.

What You'll Learn:

• The Audio Effect Rack: See Chpt 4.9

• Parallel processing and parallel compression
• Morphing effects: See Chp 22.2
• Multi-band effects: See Chp 22.22
• Advanced distortion: See Chp 22.9 / 22.23 / 22.26 / 22.29
• Time-based effects: See Chp 22.14 / 22.4 / 22.14 / 22.19 / 22.21 / 22.25 /
22.28 / 22.30
• Pitch and modulation effects: See Chp 22.16 / 22.15 / 22.12 / 22.6 / 22.3 /
22.24 / 22.34
• Advanced EQ techniques: See Chp 22.10
• Advanced Mixing techniques: See Chp 15.1
• Mastering setups and tools: See Chp 22.20 / 22.31 / 22.32



Ableton Live Level 6: Going Global with your Music

Go far beyond the foundations of Live, and incorporate advanced features and
functions into your workflow. Learn how to create custom MIDI processors, utilize
unique Max for Live devices, and take your music to new levels that will continue to
grow and expand as you do. By the end of this level, you will have entered an active
remix contest, and completed a final composition that draws upon all the concepts,
skills, and techniques you have mastered throughout the course.

What You'll Learn:

• Making custom MIDI processors using the MIDI Effect Rack

• MIDI resampling
• Max for Live Instruments Chp 25
• Max for Live Audio Effects Chp 25
• Max for Live MIDI Effects: Chp 25
• Composing with MIDI controllers chp 27
• Remixing and Mashup concepts and techniques
• Using video in Live: See Chp 21.2 / 5.2.3
• Marketing, self-promotion, and the world stage