Lehrstuhl für Informatik 4
Kommunikation und verteilte Systeme
Chapter 2: Basics
Audio Technology
Images and Graphics
Video and Animation
2.1: Audio Technology
Representation and encoding of audio information
Pulse Code Modulation
Digital Audio Broadcasting
Music Formats: MIDI
Speech Processing
What is Sound?
Sound is a continuous wave that propagates in the air.
The wave is made up of pressure differences caused by the vibration of some material
(e.g. violin string).
The period determines the frequency of a sound
The frequency denotes the number of periods per second (measured in Hertz)
Important for audio processing: frequency range 20 Hz to 20 kHz
The amplitude determines the volume of a sound
The amplitude of a sound is the measure of deviation of the air pressure wave from its mean
Sampling
The analog signal is sampled at regular time intervals, i.e. the amplitude of the wave
is measured
Discretization both in time and amplitude (quantization) to get representative values in a
limited range (e.g. quantization with 8 bit: 256 possible values)
Sampling rate: rate with which a continuous wave is sampled (measured in Hertz)
CD standard - 44100 samples/sec
Telephone - 8000 samples/sec
[Figure: audio signal over time, with sample heights and quantized samples]
Result: series of values:
0.5, 0.5, 0.75, 0.75, 0.75, 0.25, 0.5, 0.5, 0.25, -0.25, -0.5, ...
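The two discretization steps described on the sampling slide can be sketched as follows. This is a minimal illustration, not part of the original slides: the 440 Hz test tone, the amplitude range [-1, 1] and the uniform quantizer are assumptions chosen for the example.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Measure signal(t) at regular intervals of 1/rate_hz seconds."""
    n = round(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(n)]

def quantize(value, bits, lo=-1.0, hi=1.0):
    """Map a value in [lo, hi] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    clipped = min(max(value, lo), hi)      # values outside the range are clipped
    index = round((clipped - lo) / step)   # index of the nearest level
    return lo + index * step

def tone(t):
    """Assumed test signal: a 440 Hz sine wave with amplitude 1."""
    return math.sin(2 * math.pi * 440 * t)

samples = sample(tone, 0.001, 8000)              # 1 ms at the telephone rate
quantized = [quantize(s, 8) for s in samples]    # 8 bit: 256 possible values
```

With 8 bits the quantization error of each sample is at most half a step, i.e. (2/255)/2 for the assumed range.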
Nyquist Theorem
Understanding Nyquist Sampling Theorem:
Example 1: sampling at the same frequency as the original signal
Nyquist Theorem
Noiseless Channel:
Nyquist proved that if any arbitrary signal has been run through a low-pass filter of
bandwidth H, the filtered signal can be completely reconstructed by making only
2H samples per second. If the signal consists of V discrete levels, Nyquist's
theorem states:
max. data rate = 2H log2 V bit/sec
Example: a noiseless 3 kHz channel with quantization level 1 bit cannot transmit
binary (i.e., two level) signals at a rate which exceeds 6000 Bit/s.
Noisy Channel:
The noise present is measured by the ratio of the signal power S to the noise
power N (signal-to-noise ratio S/N).
The ratio itself is dimensionless; it is usually quoted as 10 log10 S/N in decibel (dB).
Based on Shannon's result we can specify the maximal data rate of a noisy
channel:
max. data rate = H log2 (1+S/N) bit/sec
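Both limits are easy to evaluate numerically. The sketch below simply codes the two formulas; the function names and the 30 dB example value are assumptions made for illustration.

```python
import math

def nyquist_max_rate(bandwidth_hz, levels):
    """Noiseless channel: max. data rate = 2 H log2 V (bit/s)."""
    return 2 * bandwidth_hz * math.log2(levels)

def shannon_max_rate(bandwidth_hz, snr_db):
    """Noisy channel: max. data rate = H log2(1 + S/N) (bit/s), S/N given in dB."""
    snr = 10 ** (snr_db / 10)   # convert decibel back to the plain power ratio
    return bandwidth_hz * math.log2(1 + snr)

binary_3khz = nyquist_max_rate(3000, 2)    # the slide's example: 6000 bit/s
noisy_3khz = shannon_max_rate(3000, 30)    # assumed S/N of 30 dB, i.e. a ratio of 1000
```

For the assumed 30 dB channel the Shannon limit is a little under 30 kbit/s, no matter how many signal levels are used.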
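Fourier Transformation
A periodic function f(t) with period T and $\omega_0 = 2\pi/T$ can be written as a Fourier series:
$$f(t) = \frac{a_0}{2} + \sum_{v=1}^{\infty} \left( a_v \cos(v\omega_0 t) + b_v \sin(v\omega_0 t) \right) = \sum_{v=0}^{\infty} z_v \cos(v\omega_0 t + \varphi_v)$$
where:
$$a_v = \frac{2}{T} \int_{-T/2}^{T/2} f(t) \cos(v\omega_0 t)\,dt, \quad v = 0, 1, 2, \ldots$$
$$b_v = \frac{2}{T} \int_{-T/2}^{T/2} f(t) \sin(v\omega_0 t)\,dt, \quad v = 0, 1, 2, \ldots$$
$$z_v = \sqrt{a_v^2 + b_v^2}, \quad \varphi_v = \arctan\left(-\frac{b_v}{a_v}\right) \quad (v \geq 1), \qquad z_0 = \frac{a_0}{2}, \quad \varphi_0 = 0$$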
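Fourier Transformation
With $C_v := \frac{1}{2}(a_v - j b_v)$ for v = 0, 1, 2, ... we have $C_{-v} = \frac{1}{2}(a_v + j b_v)$ and
$$f(t) = \sum_{v=-\infty}^{\infty} C_v\, e^{jv\omega_0 t}, \qquad C_v = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-jv\omega_0 t}\,dt$$
For a non-periodic function:
$$F(\omega) := \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\,dt \quad \text{(Transformation)}$$
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\,d\omega \quad \text{(Retransformation)}$$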
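Fourier Transformation
Thus there is a unique correspondence between f(t) and its Fourier transform F(ω).
[Figure: correspondence between f(t) and F(ω), compared with logarithms: a and b are transformed to log(a) and log(b), and the result is retransformed to ab]
c) Retransform:
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\,d\omega$$
Example: rectangular pulse with period T
$$f(t) = \begin{cases} A & 0 \leq t \leq \frac{T}{4};\ \frac{3T}{4} \leq t \leq T \\ 0 & \frac{T}{4} < t < \frac{3T}{4} \end{cases}$$
[Figure: f(t) with height A over t, axis marks at -T, -T/2, T/4, T/2]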
Fourier Transformation
$$C_v = \frac{1}{T} \int_{-T/4}^{T/4} A\, e^{-jv\omega_0 t}\,dt = \frac{2A}{v\omega_0 T} \sin\left(\frac{v\omega_0 T}{4}\right) = \frac{A}{v\pi} \sin\left(\frac{v\pi}{2}\right) \quad (\text{for } v \neq 0)$$
since $T = 2\pi/\omega_0$.
$$C_0 = \frac{1}{T} \int_{-T/4}^{T/4} A\, e^{0}\,dt = \frac{A}{2}$$
Composing positive and negative values of v:
$$f(t) = \frac{A}{2} + \sum_{v \neq 0} \frac{A}{v\pi} \sin\left(\frac{v\pi}{2}\right) e^{jv\omega_0 t} = \frac{A}{2} + \frac{2A}{\pi} \sum_{v=1}^{\infty} \frac{1}{v} \sin\left(\frac{v\pi}{2}\right) \cos(v\omega_0 t)$$
At the frequencies 0, ±ω0, ±3ω0, ±5ω0, ..., ±(2n+1)ω0 we get waves with amplitudes
A/2, A/π, -A/(3π), A/(5π), ..., A(-1)^n/((2n+1)π)
[Figure: amplitude spectrum over the frequency axis from -5ω0 to 5ω0]
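The closed form for the coefficients of the rectangular pulse can be checked numerically. The sketch below approximates the integral for C_v with a simple midpoint rule and compares it with A/(vπ) · sin(vπ/2); the values A = 1, T = 1 and the number of integration steps are assumptions chosen for the example.

```python
import cmath
import math

A, T = 1.0, 1.0   # assumed pulse height and period

def pulse(t):
    """One period of the rectangular pulse, centred on t = 0."""
    return A if abs(t) <= T / 4 else 0.0

def coefficient(f, period, v, steps=4000):
    """Approximate C_v = (1/T) * integral of f(t) e^{-j v w0 t} over one period."""
    w0 = 2 * math.pi / period
    dt = period / steps
    total = 0j
    for k in range(steps):
        t = -period / 2 + (k + 0.5) * dt   # midpoint of the k-th interval
        total += f(t) * cmath.exp(-1j * v * w0 * t) * dt
    return total / period

def closed_form(v):
    """C_0 = A/2 and C_v = A/(v*pi) * sin(v*pi/2) for v != 0."""
    return A / 2 if v == 0 else A / (v * math.pi) * math.sin(v * math.pi / 2)

errors = [abs(coefficient(pulse, T, v) - closed_form(v)) for v in range(6)]
```

The even coefficients (v = 2, 4, ...) vanish, which is why only odd multiples of ω0 appear in the spectrum.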
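Nyquist Theorem
Example: bandwidth limitation
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\,dt = 0 \quad \text{for } |\omega| > \omega_g = 2\pi f_g$$
[Figure: spectrum F(ω), limited to the range from -ω_g to ω_g, and its periodic continuation F0(ω)]
Recall the Fourier series of a periodic function f(t):
$$f(t) = \sum_{v=-\infty}^{\infty} C_v\, e^{jv\omega_0 t} \quad \text{where} \quad C_v = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-jv\omega_0 t}\,dt$$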
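Nyquist Theorem
F0(ω) is periodic with period 2ω_g, so it can be written as a Fourier series:
$$F_0(\omega) = \sum_{v=-\infty}^{\infty} C_v\, e^{jv\pi\omega/\omega_g}$$
$$C_v = \frac{1}{2\omega_g} \int_{-\omega_g}^{\omega_g} F_0(\omega)\, e^{-jv\pi\omega/\omega_g}\,d\omega = \frac{1}{2\omega_g} \int_{-\omega_g}^{\omega_g} F(\omega)\, e^{-jv\pi\omega/\omega_g}\,d\omega$$
since the integral is taken between -ω_g and ω_g where F(ω) and F0(ω) are identical.
And since F(ω) = 0 for |ω| > ω_g we may write:
$$C_v = \frac{1}{2\omega_g} \int_{-\infty}^{\infty} F(\omega)\, e^{-jv\pi\omega/\omega_g}\,d\omega$$
Since $f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\,d\omega$ we have
$$f\left(-\frac{v\pi}{\omega_g}\right) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{-jv\pi\omega/\omega_g}\,d\omega = \frac{\omega_g}{\pi}\, C_v$$
Nyquist Theorem
$$f(t) = \frac{1}{2\pi} \int_{-\omega_g}^{\omega_g} F(\omega)\, e^{j\omega t}\,d\omega = \frac{1}{2\pi} \int_{-\omega_g}^{\omega_g} F_0(\omega)\, e^{j\omega t}\,d\omega = \frac{1}{2\pi} \int_{-\omega_g}^{\omega_g} \sum_{v=-\infty}^{\infty} \frac{\pi}{\omega_g}\, f\left(-\frac{v\pi}{\omega_g}\right) e^{jv\pi\omega/\omega_g}\, e^{j\omega t}\,d\omega$$
$$= \frac{1}{2\omega_g} \sum_{v=-\infty}^{\infty} f\left(-\frac{v\pi}{\omega_g}\right) \frac{2 \sin\left(\omega_g (t + v\pi/\omega_g)\right)}{t + v\pi/\omega_g} = \sum_{v=-\infty}^{\infty} f\left(\frac{v\pi}{\omega_g}\right) \frac{\sin(\omega_g t - v\pi)}{\omega_g t - v\pi}$$
Nyquist Theorem
Finally:
$$f(t) = \sum_{v=-\infty}^{\infty} f\left(\frac{v\pi}{\omega_g}\right) g_v(t) \quad \text{where} \quad g_v(t) = \frac{\sin(\omega_g t - v\pi)}{\omega_g t - v\pi} \quad (v = 0, \pm 1, \pm 2, \ldots)$$
$$g_v(t) = g_0\left(t - \frac{v\pi}{\omega_g}\right), \qquad \frac{\pi}{\omega_g} = \frac{1}{2 f_g}$$
[Figure: the functions g0(t) and g3(t)]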
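The interpolation formula f(t) = Σ f(vπ/ω_g) · g_v(t) can be tried out directly. The sketch below is an illustration under assumptions: it sums over a finite list of samples only (v = 0, 1, 2, ...), whereas the formula sums over all v, and the cut-off frequency of 4000 Hz is an arbitrary example value. The sample values are the ones shown in the slide's figure.

```python
import math

def g(v, t, f_g):
    """g_v(t) = sin(w_g t - v*pi) / (w_g t - v*pi), equal to 1 at its own sample instant."""
    x = 2 * math.pi * f_g * t - v * math.pi
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def reconstruct(samples, t, f_g):
    """Sum of f(v*pi/w_g) * g_v(t); samples[v] is the value taken at t = v / (2 f_g)."""
    return sum(s * g(v, t, f_g) for v, s in enumerate(samples))

f_g = 4000.0                               # assumed cut-off frequency in Hz
samples = [1.0, 1.5, 2.0, 1.8, 1.5, 2.1]   # f(0), f(pi/w_g), ..., f(5 pi/w_g)

# At every sample instant all other g_v are zero, so the sample value is reproduced.
at_instants = [reconstruct(samples, k / (2 * f_g), f_g) for k in range(len(samples))]

# Between two sample instants the value is interpolated from all samples.
between = reconstruct(samples, 0.5 / (2 * f_g), f_g)
```

Because only six samples are used, values between the instants are an approximation of what the infinite sum would give.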
Nyquist Theorem
[Figure: original signal f(t) over t, built up from the weighted functions f(vπ/ω_g) g_v(t), e.g. f(0) g0(t) and f(5π/ω_g) g5(t)]
Sample values in the figure:
f(0) = 1, f(π/ω_g) = 1.5, f(2π/ω_g) = 2, f(3π/ω_g) = 1.8, f(4π/ω_g) = 1.5, f(5π/ω_g) = 2.1
sampling distance: 1/(2fg), i.e. samples at 1/(2fg), 2/(2fg), 3/(2fg), ...
The sampled signal carries all information which is necessary for reconstructing f(t) out of it.
[Figure: quantization of the sample values into 4-bit codes]
Value   Code
  4     0100
  3     0011
  2     0010
  1     0001
  0     0000
 -1     1111
 -2     1110
 -3     1101
 -4     1100
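The code table above matches 4-bit two's complement. A short sketch of that mapping follows; the function name is hypothetical, and that the slide's figure uses two's complement is inferred from the listed codes rather than stated in the source.

```python
def pcm_code(value, bits=4):
    """Return the two's complement bit pattern of an integer sample value."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lo <= value <= hi:
        raise ValueError(f"value {value} does not fit into {bits} bits")
    # Masking with 2**bits - 1 maps negative values to their two's complement form.
    return format(value & (2 ** bits - 1), f"0{bits}b")

table = {v: pcm_code(v) for v in range(4, -5, -1)}
```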
1. Precision
The resolution of the sample wave (sample precision), bits per sample
Samples are typically stored as raw numbers (linear PCM format)
or as logarithms (µ-law or A-law)
Example: 16-bit CD quality quantization results in 65536 (2^16) values
2. Sample Rate
Sampling Frequency, samples per second
Example: 16-bit CD audio is sampled at 44100 Hz
3. Number of Channels
Example: CD audio uses 2 channels (left + right channel)
4. Encoding
Techniques
µ-law encoding
Standard for voice data in telephone companies in USA, Canada, Japan.
Part of the standard CCITT G.711.
A-law encoding
Standard used for telephony elsewhere.
Part of the standard CCITT G.711.
A-law and µ-law are sampled at 8000 samples/second with precision of 12 bits and
compressed to 8 bit samples (standard analog telephone service).
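The idea of logarithmic compression can be sketched with the continuous µ-law characteristic, F(x) = sgn(x) · ln(1 + µ|x|) / ln(1 + µ) with µ = 255. This is an illustration under assumptions: it works on amplitudes normalised to [-1, 1] and uses a plain uniform 8-bit quantizer after compression; it is not the exact segmented bit layout that G.711 defines.

```python
import math

MU = 255  # value commonly used for mu-law

def mu_compress(x, mu=MU):
    """Logarithmic compression of an amplitude x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

def encode_8bit(x):
    """Compress, then quantize uniformly to one of 256 codes (0..255)."""
    return round((mu_compress(x) + 1) / 2 * 255)

def decode_8bit(code):
    """Map an 8-bit code back to an amplitude in [-1, 1]."""
    return mu_expand(code / 255 * 2 - 1)

quiet = decode_8bit(encode_8bit(0.01))   # small amplitudes keep a fine resolution
loud = decode_8bit(encode_8bit(0.9))     # large amplitudes are quantized more coarsely
```

The design point is that quantization steps grow with the amplitude, so quiet passages of speech are represented with a smaller absolute error than loud ones.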
Deltamodulation
Aim: Digital audio streams with lower data capacity than PCM requires
Idea: Deltamodulation
Number of samples: as before
Sender: - Follows the actual signal f(t) by a staircase function s(t1), s(t2), s(t3), . . .
- 8000 values of s(ti) per second (telephony)
- Stepsize has unit value (normalised to 1)
Algorithm:
if s(ti) ≤ f(ti) then
s(ti+1) := s(ti) + 1
else
s(ti+1) := s(ti) - 1
Instead of s(ti+1) [i.e. 8 bits] transmit only +1 or -1 [i.e. 1 bit], i.e. the differential change.
Problem:
Low stepsize unit and/or rapidly changing signal:
s(ti) can hardly follow the behaviour of f(t)
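The sender rule above and the matching receiver can be sketched in a few lines. The function names, the start value and the test signal are assumptions for illustration; the comparison uses "less than or equal", which is one reading of the rule on the slide.

```python
def delta_encode(signal, start=0):
    """Follow the signal with a unit-step staircase; return the +1/-1 stream and s(ti)."""
    s = start
    bits, staircase = [], []
    for f in signal:
        staircase.append(s)
        step = 1 if s <= f else -1   # staircase below (or at) the signal: go up
        bits.append(step)
        s += step
    return bits, staircase

def delta_decode(bits, start=0):
    """Rebuild the staircase from the received +1/-1 stream."""
    s = start
    values = []
    for b in bits:
        s += b
        values.append(s)
    return values

bits, staircase = delta_encode([0, 1, 2, 3, 3, 2])
# bits      == [1, 1, 1, 1, -1, -1]
# staircase == [0, 1, 2, 3, 4, 3]

# A rapidly rising signal cannot be followed with unit steps:
slow_bits, slow_staircase = delta_encode([0, 10, 20])
# slow_staircase == [0, 1, 2]
```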
ADPCM
Adaptive Stepsize
Begin with a standard stepsize STEP (might be relatively small in order to adapt to
non-rapidly moving curves).
SINC := STEP
Receiver:
Learns about the actual stepsize from the number of +1
(or -1 respectively) which have been received without any
interruption.
Adaptive Stepsize
actual curve:        7  11  19  21  19  17  14  14  15  14
s(ti):               5   6   8  12  20  19  17  13  14  16
Stepsize:           +1  +2  +4  +8  -1  -2  -4  +1  +2  -1
Sender transmits:   +1  +1  +1  +1  -1  -1  -1  +1  +1  -1
Receiver receives:  +1  +1  +1  +1  -1  -1  -1  +1  +1  -1
[Figure: actual value and its approximation by staircase, value axis from 10 to 20]
Serious mistakes if transmission error occurs: thus, full information is additionally
required from time to time.
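The adaptive scheme can be sketched so that it reproduces the example values above. The doubling rule (the stepsize doubles with every repeated sign and falls back to the standard stepsize on a sign change) is an assumption read off the example's stepsizes +1, +2, +4, +8; the function names and the start value 5 are likewise taken from the example rather than from a stated algorithm.

```python
def adaptive_encode(signal, start, step=1):
    """Return the transmitted +1/-1 stream and the sender staircase s(ti)."""
    s, size, last = start, step, 0
    bits, staircase = [], []
    for f in signal:
        staircase.append(s)
        sign = 1 if s <= f else -1
        size = size * 2 if sign == last else step   # double on an uninterrupted run
        last = sign
        bits.append(sign)
        s += sign * size
    return bits, staircase

def adaptive_decode(bits, start, step=1):
    """The receiver learns the stepsize from the run length of equal signs."""
    s, size, last = start, step, 0
    values = []
    for sign in bits:
        size = size * 2 if sign == last else step
        last = sign
        s += sign * size
        values.append(s)
    return values

curve = [7, 11, 19, 21, 19, 17, 14, 14, 15, 14]
bits, staircase = adaptive_encode(curve, start=5)
# bits      == [1, 1, 1, 1, -1, -1, -1, 1, 1, -1]
# staircase == [5, 6, 8, 12, 20, 19, 17, 13, 14, 16]
received = adaptive_decode(bits, start=5)
```

Flipping a single entry of `bits` before decoding shows why a transmission error is serious here: all later stepsizes are derived from the run lengths, so the receiver's staircase drifts away from the sender's.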
Adaptive Stepsize
Transmission errors and adaptive stepsizes
[Figure: stepsize sequences at sender, channel and receiver when one transmitted value is corrupted. The sender uses the stepsizes +1, +2, +4, ..., +2^m followed by -1, -2, -4, ..., -2^n; the receiver derives different stepsizes and reconstructs wrong values. Distance in total: 2^m + 2^(n+1)]
Adaptive Stepsize
The examples show that in a worst case situation adaptive schemes may result in a
deviation between sender staircase function and receiver staircase function of
2^m + 2^(n+1) or 2^(m+1) + 2^n
[Figure: second example of sender and receiver stepsize sequences after a transmission error, with wrong values at the receiver. Distance in total: 2^(m+1) + 2^n]
The deviation would remain constant from that point onwards. However:
The deviation may be levelled off in a period of silence
It might be possible that mainly the differences of amplitudes are recognised by the
human ear
[Thus the consequences of transmission errors in an adaptive scheme are not quite as
dramatic as expected at first sight]
Adaptive Stepsize
Anyway it is recommended to insert a full actual sample from time to time; thus
classifying transmitted information into
F = full sample: sample bits + additional coding that this is an F sequence
D = difference value: 1 bit with value 0 or 1; interpreted as 2^x · (+1) or as 2^x · (-1)
[Figure: transmitted sequence over time: ...DDD DDDDDDDDD DDDDDDDDD...]
Bits:       1   1   0   0   0   1   0   0   1
Stepsize:  +1  +2  -1  -2  -4  +1  -1  -2  +1
quality               bandwidth   mode     bitrate           reduction ratio
telephone sound       2.5 kHz     mono     8 kbps            96:1
-                     4.5 kHz     mono     16 kbps           48:1
-                     7.5 kHz     mono     32 kbps           24:1
similar to FM radio   11 kHz      stereo   56...64 kbps      26...24:1
near-CD               15 kHz      stereo   96 kbps           16:1
CD                    >15 kHz     stereo   112...128 kbps    14...12:1
Audio Coding
Today, radio programmes are also transmitted digitally. Digital Audio Broadcasting (DAB) is used for
bringing the audio data to a large number of receivers
Medium access
OFDM (Orthogonal Frequency Division Multiplex)
192 to 1536 subcarriers within a 1.5 MHz frequency band
Frequencies
First phase: one out of 32 frequency blocks for terrestrial TV channels 5 to 12 (174 - 230 MHz, 5A - 12D)
Second phase: one out of 9 frequency blocks in the L-band (1452 - 1467.5 MHz, LA - LI)
Sending power: 6.1 kW (VHF, 120 km) or 4 kW (L-band, 30 km)
Data rates: 2.304 MBit/s (net 1.2 to 1.536 Mbit/s)
Modulation: Differential 4-phase modulation (D-QPSK)
Audio channels per frequency block: typically 6, max. 192 kbit/s
Digital services: 0.6 - 16 kbit/s (Program Associated Data, PAD), 24 kbit/s (Non-PAD,
NPAD)
Goal
Audio transmission almost with CD quality
Robust against multipath propagation
Minimal distortion of audio signals during signal fading
Mechanisms
Fully digital audio signals (PCM, 16 Bit, 48 kHz, stereo)
MPEG compression of audio signals, compression ratio 1:10
Redundancy bits for error detection and correction
Burst errors are typical for radio transmissions, therefore the signal is interleaved; receivers can then correct the single bit errors that result from interference
Low symbol-rate, many symbols
Transmission of digital data using long symbol sequences, separated by
guard spaces
Delayed symbols, e.g., reflection, still remain within the guard space
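The figures in the mechanisms list can be combined into a quick estimate of the bit rate of one audio programme. This is a back-of-the-envelope sketch; the function name is hypothetical and the 1:10 ratio is taken from the list above as a round number.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Uncompressed PCM bit rate in bit/s."""
    return bits_per_sample * sample_rate_hz * channels

raw = pcm_bit_rate(16, 48_000, 2)   # PCM, 16 Bit, 48 kHz, stereo: 1,536,000 bit/s
compressed = raw / 10               # MPEG compression ratio 1:10: 153,600 bit/s
```

The result of roughly 154 kbit/s is in line with the stated maximum of 192 kbit/s per audio channel.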
OFDM
Properties
Lower data rate on each subcarrier, hence less intersymbol interference
Interference on one frequency results in interference of one subcarrier only
No guard space necessary
Orthogonality allows for signal separation via inverse FFT on receiver side
Precise synchronization necessary (sender/receiver)
Maximum of one subcarrier frequency appears exactly at a frequency where all other
subcarriers equal zero
superposition of frequencies in the same frequency range
[Figure: amplitude over frequency f of overlapping subcarriers; subcarrier function = sin(x)/x]
Advantages
No equalizer necessary
No expensive filters with sharp edges necessary
Good spectral efficiency
Application
802.11a/g, HiperLAN2, DAB, DVB, ADSL
Transmission Frame
[Figure: transmission frame of duration TF, made up of symbols with guard interval Td and useful part Tu: null symbol and phase reference symbol (synchronization channel, SC), data symbols of the fast information channel (FIC), and data symbols up to L-1 of the main service channel (MSC)]
DAB sender
[Figure: DAB sender. Service information and multiplex information form the FIC. Audio services pass through audio encoder and channel coder, data services through packet mux and channel coder, into the MSC multiplexer. FIC and MSC are combined in the transmission multiplexer and passed to OFDM and the transmitter, which emits the DAB signal (carriers within 1.5 MHz) at radio frequency]
FIC: Fast Information Channel
MSC: Main Service Channel
OFDM: Orthogonal Frequency Division Multiplexing
DAB receiver
[Figure: DAB receiver (partial): tuner, OFDM demodulator, channel decoder, audio decoder (audio service), packet demux (independent data service), controller with control bus and user interface; MSC and FIC as in the sender]
DAB - Multiplex
Audio 1: 192 kbit/s, Audio 2: 192 kbit/s, Audio 3: 192 kbit/s, Audio 4: 160 kbit/s, Audio 5: 160 kbit/s, Audio 6: 128 kbit/s (each with PAD)
Data services: D1 to D9
DAB - Multiplex - reconfigured
Audio 1: 192 kbit/s, Audio 2: 192 kbit/s, Audio 3: 128 kbit/s, Audio 4: 160 kbit/s, Audio 5: 160 kbit/s, Audio 7: 96 kbit/s, Audio 8: 96 kbit/s (each with PAD)
Data services: D1 to D11
Problem
Broad range of receiver capabilities: audio-only devices with single/multiple line text
display, additional color graphic display, PC adapters etc.
Different types of receivers should at least be able to recognize all kinds of program
associated and program independent data and process some of it
Solution
Common standard for data transmission: MOT (Multimedia Object Transfer)
Important for MOT is the support of data formats used in other multimedia systems
(e.g., online services, Internet, CD-ROM)
DAB can therefore transmit HTML documents from the WWW with very little
additional effort
MOT formats
MHEG, Java, JPEG, ASCII, MPEG, HTML, HTTP, BMP, GIF, ...
MOT Structure
Header core (7 byte)
Size of header and body, content type
Header extension
Handling information, e.g., repetition distance, segmentation, priority
Information supports caching mechanisms
Body: Arbitrary data
[Figure: MOT object consisting of 7 byte header core, header extension and body]
Thus: DAB is a standard for broadcasting audio data by radio, and it can additionally be
used to transfer any other data.
MIDI Devices
Keyboard:
Sound generator:
Sequencer:
Synthesizer:
combined device which is the heart of the MIDI system.
It consists of:
Sound generator
produces audio signals which become sound when fed into a loudspeaker;
in most cases: manufacturer specific
Processor
sends and receives MIDI messages to and from keyboard, control panel
and sound generator
MIDI Devices
Keyboard
Pressing keys informs the processor about
what score to play
how long to play a score
how loud to play
whether to add effects such as vibrato
Control panel
Controls functions which are not directly concerned with notes and their duration,
e.g. synthesizer on/off, overall volume, . . .
Auxiliary controllers
For additional functions, e.g. pitch bend, smooth glide from one tone to another
Memory
[Several scores can be played simultaneously per channel, typically 3-16 scores can be arranged in a synthesizer.
Examples: Flute: 1 score, Organ: up to 16 scores]
MIDI Messages
MIDI messages contain
instrument specification
beginning and end of a score
key and volume information etc.
MIDI devices (musical instruments) communicate with each other via channels.
Music data is reproduced by the receiver's synthesizer.
128 instruments, including noise, aircraft, marimba, ..., flute, ...
[Figure: message format made up of 1-byte fields (..., data 2), and a keyboard with the key numbers 24, 36, 48, 60, 72, 84, 96, 108, 120]
Examples:
[9B,30,40]
key pressed (= event 9) on channel 11,
score C (= 48 dec = 30 hex),
velocity 64 (= 40 hex)
[8B,30,40]
key released (= event 8) on channel 11,
score C,
velocity 64
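The two example messages can be decoded with a few bit operations: the upper four bits of the first byte give the event, the lower four bits the channel. The sketch below handles only the two events named on the slide; the function and table names are hypothetical.

```python
EVENT_NAMES = {0x8: "key released", 0x9: "key pressed"}   # only the events shown above

def parse_message(message):
    """Split a 3-byte message into (event name, channel, key number, velocity)."""
    status, key, velocity = message
    event = status >> 4        # upper nibble: event type
    channel = status & 0x0F    # lower nibble: channel
    return EVENT_NAMES.get(event, f"event {event:X}"), channel, key, velocity

pressed = parse_message([0x9B, 0x30, 0x40])
# pressed == ("key pressed", 11, 48, 64)
released = parse_message([0x8B, 0x30, 0x40])
# released == ("key released", 11, 48, 64)
```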
MIDI Timing
MIDI Events
Events in MIDI
Keyboard related events: key pressed, key released
Common controller related events: modulation slider moved, pitch bend slider
moved
Program changed
A new instrument (sound characteristics) has been selected on the specified
channel. MIDI was standardized for musicians, not for multimedia. Therefore
programs are manufacturer-specific and not included in the specification.
MIDI Compression
Example
Event 1: Piano key pressed by pianist
Event 2: Piano key released by pianist
...
Result
10 minutes of (piano) music: up to 200 Kbytes of MIDI information.
Crude comparison - PCM with CD quality:
600 seconds, 16 bit, 44,100 samples per second, 2 channels (stereo)
600 s × (16/8) bytes × 44,100 1/s × 2 = 600 s × 176,400 bytes/s = 105,840,000 bytes ≈ 103,359 KBytes
Compression factor: about 100,000/200 = 500, thus MIDI needs only a very small fraction
of data (when compared to CD)
But: only useful for music, not for speech
An analogy
Representation of a circle by centre + radius + thickness of line needs much less
data than collection of pixels which compose the picture of the circle
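The crude comparison between MIDI and CD-quality PCM can be recomputed as follows. The function name is hypothetical; the 200 KBytes of MIDI data is the upper bound quoted in the text, and 1 KByte is taken as 1024 bytes.

```python
def pcm_bytes(seconds, bits_per_sample, sample_rate_hz, channels):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * (bits_per_sample // 8) * sample_rate_hz * channels

cd_bytes = pcm_bytes(600, 16, 44_100, 2)   # 105,840,000 bytes for 10 minutes
cd_kbytes = cd_bytes / 1024                # 103,359.375 KBytes
midi_kbytes = 200                          # upper bound quoted for 10 minutes of piano music
factor = cd_kbytes / midi_kbytes           # a little over 500
```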