
ETSI/SMG11 "Speech Aspects"

Presentation of SMG11 Activities to Tiphon

Outline

SMG11
GSM Speech Codecs
GSM Enhanced Full Rate Codec
Tandem Free Operation
Adaptive Multi-Rate (AMR) Codec

Narrowband AMR
Wideband AMR

UMTS Matters
Next Meetings

SMG11

ETSI STC SMG11 is the competent body responsible for speech aspects of the GSM
and UMTS standards (since 1996)

SMG11 Chairman: Mr. Kari Järvinen, Nokia


SMG11 plenary meets four times a year

Additional extraordinary meetings as needed

SMG11 currently consists of three sub-groups

TFO sub-group (Tandem Free Operation issues)
AMR sub-group (Adaptive Multi-Rate codec issues)
SQ sub-group (Speech Quality issues)

Sub-groups may have ad-hoc meetings between SMG11 plenary meetings


Typical attendance at SMG11 plenary meetings is between 30 and 50
The SMG11 e-mail reflector and the sub-group reflectors are used extensively between
the meetings

GSM Speech Codecs

GSM has so far standardised three codecs

13 kbps GSM-FR (1987); good cellular quality and robust operation in the
presence of background noise

5.6 kbps GSM-HR (1994); possibility for higher system capacity at the expense
of slightly lower speech quality in some conditions (particularly in background
noise)

12.2 kbps GSM-EFR (1996); high quality even exceeding the G.726 "wireline
reference" under clear channel conditions and in background noise

SMG11 is currently in the process of defining an Adaptive Multi Rate (AMR) codec
which will be the fourth GSM speech codec

Subjective speech quality

                          GSM FR   GSM HR   GSM EFR   No coding
Clean conditions (MOS)    3.71     3.85     4.43      4.61
Vehicle noise (DMOS)      3.83     3.45     4.25      4.42
Street noise (DMOS)       3.92     3.56     4.18      4.35

Source: TR 06.85 v5.0.0 (1998-07), "Subjective tests on the interoperability of the HR/FR/EFR speech
codecs; single, tandem and tandem free operation"

GSM EFR Codec

Selected as a basis for a new high quality speech service for PCS 1900 in the US in
1995 (formal standardization procedure in TIA and T1 completed in 1996)

ETSI standardized the same codec for GSM in 1996

Provides high quality speech service for GSM, GSM 1800 (DCS 1800), and GSM
1900 (PCS 1900) systems in all continents

Technical summary

Source coding rate 12.2 kbps (channel coding 10.6 kbps)
Based on the Algebraic CELP (ACELP) algorithm
Speech frame size and algorithmic delay 20 ms
Optional VAD/DTX function with comfort noise generation
Example implementation for error concealment
Complexity (encoder/decoder) approximately 18 MIPS (processor dependent)
Memory requirement (incl. RAM and ROM) approximately 16-19k 16-bit words
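
The rate figures above translate directly into a per-frame bit budget. The following is a minimal sketch of that arithmetic, using only the 12.2 kbps source rate, the 10.6 kbps channel coding rate and the 20 ms frame size quoted above; the function and variable names are illustrative and not taken from any specification.

```python
# Per-frame bit budget for GSM EFR on a full-rate channel.
# Inputs are the figures quoted in the technical summary; names are illustrative.
FRAME_MS = 20                # speech frame size in milliseconds
SOURCE_KBPS = 12.2           # EFR source coding rate
CHANNEL_CODING_KBPS = 10.6   # redundancy added by channel coding


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """kbit/s multiplied by ms gives bits; round() absorbs float noise."""
    return round(rate_kbps * frame_ms)


source_bits = bits_per_frame(SOURCE_KBPS)          # speech parameter bits
coding_bits = bits_per_frame(CHANNEL_CODING_KBPS)  # channel coding bits
gross_bits = source_bits + coding_bits             # bits sent per frame
gross_kbps = gross_bits / FRAME_MS                 # gross channel rate

print(source_bits, coding_bits, gross_bits, gross_kbps)
```

The gross rate that results is the 22.8 kbps of the GSM full-rate channel, which is why a 12.2 kbps source codec leaves 10.6 kbps for error protection.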

GSM EFR Speech Quality

GSM EFR speech quality is characterized in

ETSI Technical Report
"Performance Characterisation of the GSM EFR speech codec", GSM 06.55

Additional performance data can be found in

ETSI Technical Report
"Subjective tests on the interoperability of the HR/FR/EFR speech codecs;
single, tandem and tandem free operation", GSM 06.85

The GSM EFR codec has been included in numerous other formal and informal
subjective listening tests and extensive test data is available

The examples in the following slides are an extract of test results from COMSAT
laboratories obtained during the PCS 1900 EFR codec standardization, comparing
12.2 kbps EFR codec
32 kbps G.726 codec
8 kbps G.729
(13 kbps GSM FR codec)

GSM EFR Performance

Basic speech quality at different input levels and tandeming

Test Condition                                G.726 at 32 kbit/s   GSM EFR   G.729
Clean speech, high level, -16 dBOL (MOS)      3.7                  3.8       3.4
Clean speech, medium level, -26 dBOL (MOS)    3.6                  3.6       3.3
Clean speech, low level, -36 dBOL (MOS)       3.0                  2.9       2.7
Self-tandem = codec-codec tandem (MOS)        3.1                  3.4       2.9
Tandem with G.726 at 32 kbit/s (MOS)          3.2                  3.6       3.3

GSM EFR Performance

Performance in background noise

Test Condition                                G.726 at 32 kbit/s   GSM EFR   G.729
Background noise, Home noise 20 dB (DMOS)     4.5                  4.6       4.3
Background noise, Car noise 10 dB (DMOS)      4.4                  4.5       3.9
Background noise, Car noise 20 dB (DMOS)      4.6                  4.6       4.1
Background noise, Street noise 10 dB (DMOS)   3.7                  4.1       3.7
Background noise, Office noise 20 dB (DMOS)   4.3                  4.5       3.7

GSM EFR Performance

Performance in error conditions

Test Condition                           Frame error rate   BER class 2   GSM FR   GSM EFR
Clean speech, No errors (MOS)            0.0%               0%            3.4      4.1
Clean speech, 13 dB C/I, 30 mph (MOS)    0.0%               2%            3.3      4.0
Clean speech, 10 dB C/I, 30 mph (MOS)    0.5%               4%            3.0      3.8
Clean speech, 7 dB C/I, 30 mph (MOS)     3.0%               8%            2.3      3.2

In GSM, part of the coded bits is protected by a convolutional code, and residual
errors are detected via CRC. The frame error rate for this part is indicated above.
The rest of the data is unprotected and is subject to the class 2 BER indicated above.

The frame error rates are not directly comparable to quality figures with no residual
errors

Tandem Free Operation (TFO)

Motivation: "Unnecessary" dual speech encoding and decoding in mobile-to-mobile
calls can significantly decrease speech quality

TFO avoids the intermediate decoding and re-encoding performed in the network

The same speech codec must be used in both mobile stations for TFO to work

TFO Standardization ongoing in ETSI SMG11 TFO Sub-group

Applicable to all the three GSM codecs (FR, HR, and EFR)

Work started (TFO sub-group established) in early 1996

Target: specifications ready by 4Q/1998 (ETSI GSM release 98)

Current work is concentrating on completing four Annexes to the Stage 3 description:
in-band signalling, operation with In-Path Equipment (IPEs), SDL definition, test
vectors

The TFO Stage 3 GSM 04.53 will be forwarded to the SMG#27 plenary in October 1998

Formal subjective tests to evaluate the audible effects of TFO signalling are being
carried out by Coherent

MS-to-MS Call, no TFO


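[Figure: MS-to-MS call without TFO. Path from A-side to B-side: MSa (encoding), BSS, TRAU (decoding), MSC, PLMN, PLMN, MSC, TRAU (encoding), BSS, MSb (decoding). 8 or 16 kbit/s voice coded speech is carried between BSS and TRAU; 64 kbit/s PCM coded speech is carried between the TRAUs.]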

Effect of Tandeming

                      MOS value
Speech codec          One encoding and      Two encodings and
                      decoding (normal)     decodings (tandem)
Enhanced Full Rate    4.43                  4.29
Full Rate             3.71                  3.13
Half Rate             3.85                  3.15

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability
of the HR/FR/EFR speech codecs; single, tandem and tandem free
operation"
Note: The above results are from clean conditions (no background noise, no
channel errors)

Effect of Tandeming in Error Conditions

                      MOS value
Speech codec          One encoding and      Two encodings and
                      decoding (normal)     decodings (tandem)
Enhanced Full Rate    4.12                  3.45
Full Rate             3.41                  2.64
Half Rate             3.68                  2.77

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability
of the HR/FR/EFR speech codecs; single, tandem and tandem free
operation"
Note: EP1 error condition was used (moderate errors).

Effect of Tandeming in Background Noise

                      MOS value
Speech codec          One encoding and      Two encodings and
                      decoding (normal)     decodings (tandem)
Enhanced Full Rate    4.25                  3.87
Full Rate             3.83                  3.34
Half Rate             3.45                  2.38

Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability
of the HR/FR/EFR speech codecs; single, tandem and tandem free
operation"
Note: Vehicle noise of 10 dB was used.
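
The quality cost of tandeming is easier to compare across codecs as a difference between the single-encoding score and the tandem score. The sketch below tabulates that difference for the three conditions reported above; the scores are copied from the slides, while the dictionary layout and the name `tandem_loss` are only illustrative.

```python
# Score lost to tandeming, per codec and test condition.
# Each pair is (one encoding/decoding, two encodings/decodings), as quoted
# from TR 06.85 in the three tables above.
SCORES = {
    "clean": {"EFR": (4.43, 4.29), "FR": (3.71, 3.13), "HR": (3.85, 3.15)},
    "errors (EP1)": {"EFR": (4.12, 3.45), "FR": (3.41, 2.64), "HR": (3.68, 2.77)},
    "vehicle noise 10 dB": {"EFR": (4.25, 3.87), "FR": (3.83, 3.34), "HR": (3.45, 2.38)},
}


def tandem_loss(single, tandem):
    """Score points lost by the second encode/decode pass."""
    return round(single - tandem, 2)


losses = {
    condition: {codec: tandem_loss(*pair) for codec, pair in codecs.items()}
    for condition, codecs in SCORES.items()
}

for condition, per_codec in losses.items():
    print(condition, per_codec)
```

In every condition EFR loses least and HR loses most, which is the degradation that TFO is meant to remove in mobile-to-mobile calls.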

TFO Modes

Two modes in TFO


Establishment mode: the necessary conditions for TFO are verified with inaudible
bit stealing

Verify whether both transcoders support TFO


Possible change of speech codecs to enable TFO
Duration typically 0.5-1.0 seconds

TFO mode: speech is transmitted compressed through the whole network with bit
stealing that guarantees smooth transitions in all situations

TFO includes means to maintain tandem free operation also when In Path Equipment
such as Echo Cancellers and DCMEs is used in the fixed network

MS-to-MS Call, with TFO


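[Figure: MS-to-MS call with TFO. Path from A-side to B-side: MSa (encoding), BSS, TRAU, MSC, PLMN, PLMN, MSC, TRAU, BSS, MSb (decoding), with decoding and encoding still shown at the TRAUs. 8 or 16 kbit/s coded speech is carried alongside 56 or 48 kbit/s PCM.]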

TFO Mode
[Figure: TFO mode. The 64 kbit/s link carries either 56 kbit/s PCM coded speech plus 8 kbit/s voice coded speech, or 48 kbit/s PCM coded speech plus 16 kbit/s voice coded speech.]

Coded speech is transmitted in the LSBs of the PCM samples on the A
interface, together with the decoded PCM samples

Both speech representations (PCM and coded) are available at the
receiving end

Minor speech degradation (increased noise) at transitions between TFO and
non-TFO operation, due to bit stealing, when the 48/56 kbit/s speech
samples are used for a very short period
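
The bit-stealing idea can be illustrated with a few lines of code. This is only a sketch of the principle, assuming 8-bit PCM octets at the usual 8000 samples per second: it is not the GSM 04.53 TFO frame format, which also defines synchronisation and signalling. All function names are hypothetical.

```python
# Illustration of carrying coded-speech bits in the LSBs of 8-bit PCM samples.
SAMPLE_RATE_HZ = 8000  # 64 kbit/s PCM = 8000 samples/s x 8 bits


def steal_lsbs(pcm_octets, coded_bits, n_lsb):
    """Overwrite the n_lsb least significant bits of each octet with coded bits.

    Missing coded bits are padded with zeros.
    """
    mask = (1 << n_lsb) - 1
    bit_iter = iter(coded_bits)
    out = []
    for sample in pcm_octets:
        chunk = 0
        for _ in range(n_lsb):
            chunk = (chunk << 1) | next(bit_iter, 0)
        out.append((sample & ~mask & 0xFF) | chunk)
    return out


def extract_lsbs(octets, n_lsb):
    """Recover the embedded bits, most significant stolen bit first."""
    bits = []
    for sample in octets:
        for shift in range(n_lsb - 1, -1, -1):
            bits.append((sample >> shift) & 1)
    return bits


def split_rates_kbps(n_lsb):
    """Return (remaining PCM rate, embedded coded-speech rate) in kbit/s."""
    coded = SAMPLE_RATE_HZ * n_lsb / 1000
    pcm = SAMPLE_RATE_HZ * (8 - n_lsb) / 1000
    return pcm, coded


pcm = [0xFF, 0x80, 0x7F, 0x00]
bits = [1, 0, 0, 1, 1, 1, 0, 0]
carried = steal_lsbs(pcm, bits, n_lsb=2)
recovered = extract_lsbs(carried, n_lsb=2)
print(split_rates_kbps(1), split_rates_kbps(2))
```

Stealing one LSB per sample leaves 56 kbit/s of PCM and carries 8 kbit/s of coded speech; stealing two leaves 48 kbit/s and carries 16 kbit/s, matching the rates shown for TFO mode. The upper bits still form a slightly noisier PCM signal, which is why both representations remain available at the receiving end.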

Adaptive Multi-Rate Codec

Source codec rates probably between 4 kbit/s and 14.4 kbit/s (no fixed source rate
requirements)

Operation in both GSM full rate (22.8 kbps) and half rate (11.4 kbps) channels
Main advantages in GSM

Increased robustness against channel errors


Enhanced quality in the half-rate channel in good channel conditions

Codec rate selected dynamically depending on radio conditions and local capacity
requirements

Codec bit rate selected by an adaptation algorithm specific to the system application
e.g. GSM or UMTS

Generic speech codec applicable to many mobile systems

Ability to adapt the bit-rate in a wide range may also be of interest for VoIP
applications

High AMR performance targets and the flexibility obtained by the switchable codec
bit-rates (modes) have made it an interesting candidate for UMTS and IMT2000.
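
Dynamic rate selection can be pictured as a threshold rule on the measured C/I. The sketch below is purely illustrative: the three source rates and their C/I thresholds are hypothetical values chosen inside the 4 to 14.4 kbit/s range mentioned above, not modes or thresholds defined by SMG11, and the actual adaptation algorithm is system specific. Only the 22.8 kbps full-rate gross rate comes from the slides.

```python
# Hypothetical threshold-based codec mode selection for a full-rate channel.
FR_GROSS_KBPS = 22.8  # GSM full-rate channel gross rate (from the slides)

# (source rate in kbit/s, minimum C/I in dB), ordered from highest rate down.
# These values are assumptions for illustration only.
HYPOTHETICAL_FR_MODES = [
    (12.2, 13.0),
    (7.4, 7.0),
    (5.0, float("-inf")),  # most robust mode, always usable
]


def select_mode(c_over_i_db, modes=HYPOTHETICAL_FR_MODES):
    """Pick the highest source rate whose C/I threshold is met."""
    for rate_kbps, min_ci_db in modes:
        if c_over_i_db >= min_ci_db:
            return rate_kbps
    raise ValueError("no mode available for this C/I")


def channel_coding_kbps(source_kbps, gross_kbps=FR_GROSS_KBPS):
    """Whatever the source codec does not use is left for error protection."""
    return round(gross_kbps - source_kbps, 1)


for ci in (16.0, 10.0, 2.0):
    rate = select_mode(ci)
    print(ci, rate, channel_coding_kbps(rate))
```

Lowering the source rate in poor radio conditions frees more of the fixed gross rate for channel coding, which is the mechanism behind the increased robustness against channel errors.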

Adaptive Multi-Rate Codec Schedule

Qualification testing has been completed on schedule

Substantial improvements demonstrated, justifying the AMR technique
5 codecs advanced to the selection phase
Good expectation that all, or nearly all, requirements will be met

The AMR codec will be selected from among five different proposals passing the
qualification phase

Alcatel/BT/Cellnet/France Telecom/Nortel/Rockwell
Ericsson/Nokia 1
Ericsson/Nokia 2
Lucent
NEC

Selection phase to end by September 1998

The AMR speech codec specifications are planned to be completed by December
1998

Delivery Dates of AMR Specifications


Target date                 Specifications

December 1998 (required)    source codec
                            channel codec
                            bad frame handling
                            in-band signalling of codec mode: transmission aspects and definition of parameters
                            in-band signalling of channel metric and side information: transmission aspects (bit allocation and channel protection)

December 1998 (objective)   VAD/DTX/comfort noise generation
                            definition of channel metric and side information parameters
                            example of codec mode adaptation
                            layer 3 signalling

June 1999, December 1999    performance characterisation
                            [minimum performance of adaptation algorithms]
                            AMR TRAU frames
                            channel performance tables (GSM 05.05)
                            TFO
                            test sequences

AMR Speech Quality Requirements


Static error conditions: without background noise

            Full-Rate Channel                     Half-Rate Channel
C/I         Ideal case         Worst case         Ideal case         Worst case
            performance        performance        performance        performance
            (requirement)      (objective)        (requirement)      (objective)
no errors   EFR no errors      G.728 no errors    G.728 no errors    FR no errors
19 dB       EFR no errors      G.728 no errors    G.728 no errors    FR no errors
16 dB       EFR no errors      G.728 no errors    G.728 no errors    FR at 10 dB
13 dB       EFR no errors      G.728 no errors    FR at 13 dB        FR at 7 dB
10 dB       G.728 no errors    EFR at 10 dB       FR at 10 dB        FR at 4 dB
7 dB        G.728 no errors    EFR at 7 dB        FR at 7 dB         -
4 dB        EFR at 10 dB       EFR at 4 dB        FR at 4 dB         -

Table 1a: Clean speech requirements and objectives under static test
conditions.

AMR Speech Quality Requirements


Static error conditions: in the presence of background noise

            Full-Rate Channel                                 Half-Rate Channel
C/I         Ideal case               Worst case               Ideal case                            Worst case
            performance              performance              performance                           performance
            (requirement)            (objective)              (requirement)                         (objective)
no errors   EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors    G.729 and FR no errors
19 dB       EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors    G.729 and FR no errors
16 dB       EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors    FR at 10 dB
13 dB       EFR no errors            G.729 and FR no errors   FR at 13 dB                           FR at 7 dB
10 dB       G.729 and FR no errors   FR at 10 dB              FR at 10 dB                           FR at 4 dB
7 dB        G.729 and FR no errors   FR at 7 dB               FR at 7 dB                            -
4 dB        FR at 10 dB              FR at 4 dB               FR at 4 dB                            -

Table 1b: Background noise requirements and objectives under static test
conditions.

AMR Speech Quality Requirements


Dynamic conditions (no background noise):

Full-Rate Channel

Requirement   Same or better than the EFR under the same conditions, and also the
              same or better than all the AMR full rate tested modes under the
              same conditions
Objective 1   Same or better than the EFR using the error pattern + 3 dB
Objective 2   Same or better than the EFR using the error pattern + 6 dB

Table 2a: Requirements and objectives under dynamic test conditions
for the full-rate channel

Half-Rate Channel

Requirement   Same or better than the FR under the same conditions, and also the
              same or better than all the AMR half rate tested modes under the
              same conditions
Objective 1   Same or better than the FR on a full rate channel using the error
              pattern + 3 dB
Objective 2   Same or better than the FR on a full rate channel using the error
              pattern + 6 dB

Table 2b: Requirements and objectives under dynamic test conditions
for the half-rate channel

AMR Design Constraints

Some AMR design constraints (simplified to a general form)

Only very moderate complexity increase compared to existing GSM codecs

In-band signalling for codec modes. Independent adaptation on the up- and
down-links.

The AMR codec shall support Tandem Free Operation

It shall be possible to operate power control independently of the AMR
adaptation. Not included in qualification and selection tests.

Maximum source coding rate for FR channel modes is 14.4 kbit/s (due to 16
kbit/s sub multiplexing)

The AMR codec shall support DTX operation

The AMR codec and its control will operate without any changes to the
air-interface channel multiplexing, with the possible exception of the interleave
depth.

AMR Design Constraints

Some AMR design constraints (continued)

Codec mode control relating to capacity or radio link quality should be located in
the network (BSS).

Transmission delay: The total algorithmic round trip delay is limited by EFR+10
ms in AMR FR channel, and HR+10 ms in AMR HR channel.

Frame size: 5 ms, 10 ms or 20 ms


The AMR in-band signalling shall be expandable to signal the use of future AMR
modes including signalling the use of the existing GSM FR, GSM HR and GSM
EFR speech coders, one or two wideband modes and all AMR speech codec
modes in FR channel mode (to guarantee proper TFO operation).

Qualification tests

The expected performance of the AMR candidates was evaluated in
qualification tests

Tests conducted in FR and HR channels, including

Clean speech: no errors and C/I 19 dB to 1 dB
Speech in background noise with channel errors: street noise (@15 dB SNR),
car noise (@15 dB SNR)
Tandeming
Speech level dependency
Switching between codec modes
Dynamic C/I: 5 error profiles (3 profiles for downlink test, 2 profiles for
uplink test)

Overall performance aims


Introduce improvements where they are needed: low C/I in FR mode, high C/I in
HR mode.

[Figure: AMR-FR and AMR-HR performance envelopes compared with EFR and HR, plotted against C/I (dB) with ideal frequency hopping]

Qualification results overview

Major benefits of the AMR technique demonstrated, especially at

low C/I in FR mode (1 - 2 delta MOS)

high C/I in HR mode (same as G.728 - wireline)

dynamic conditions in FR mode (up to 1.6 delta MOS)

Several codecs close to meeting all the requirements


Most challenging condition - background noise in HR mode

Static C/I - examples

[Figure: Experiment 1a, family of curves, FR channel. MOS for Rate A, Rate B, Rate C and Spec. under the conditions No Errors @ -26 dBovl and C/I = 19, 16, 13, 10, 7, 4 and 1 dB]

[Figure: Experiment 1b, family of curves, HR channel. MOS for Rate A, Rate B, Rate C and Spec. under the conditions No Errors @ -26 dBovl and C/I = 19, 16, 13, 10, 7, 4 and 1 dB]

Background noise; static C/I examples


[Figure: Experiment 2a, family of curves in FR channel, annotated with failed conditions. MOS for Rate A, Rate B, Rate C and Spec. FR under the conditions FR No Errors, FR EC16, FR EC10 and FR EC4]

[Figure: Experiment 2a, family of curves in HR channel. MOS for Rate A, Rate B, Rate C and Spec. HR under the conditions HR No Errors, HR EC19, HR EC13 and HR EC7]

Dynamic C/I - examples

Dynamic test designed to evaluate AMR performance in a realistic radio environment with codec
adaptation turned on

Consistent results demonstrated by all candidates

adaptation mechanism finds best codec mode

in FR mode, significant improvement compared to fixed rate codec reference, EFR (up to 1.6
delta MOS)

in HR mode, quality equivalent to GSM FR or better (improvement sensitive to dynamic profile)

[Figure: Experiment 4a test results, typical result in FR. MOS for EFR and Rates A to D under dynamic error conditions DEC1 to DEC5]

[Figure: Experiment 4b test results, typical result in HR. MOS for FR and Rates A to D under dynamic error conditions DEC1 to DEC5]

Examples of dynamic conditions


Dynamic error profiles

from Radio Simulator (SMG2)
One minute long
Up and down links
Correlation of C/I between up and down links controlled

[Figure: C and I profile etsiq3. C and I [dBm] versus time [s] over 60 s, downlink (DL) and uplink (UL)]

[Figure: C/(I+N) profile etsiq3. C/(I+N) [dB] versus time [s] over 60 s, downlink (DL) and uplink (UL)]

[Figure: C and I profile etsiq11. C and I [dBm] versus time [s] over 60 s, downlink (DL) and uplink (UL)]

[Figure: C/(I+N) profile etsiq11. C/(I+N) [dB] versus time [s] over 60 s, downlink (DL) and uplink (UL)]

Wideband AMR

The narrowband AMR work will continue with the specification of a wideband mode

Feasibility phase on-going

Discussion on Design Constraints and Recommended audio bandwidth
No target date for finalized specification yet

Preliminary working assumption for optimum audio bandwidth (to be confirmed)

100 Hz to 7 kHz (possibly also 100 Hz to 5 kHz)
In some types of background noise, advantages to reducing low frequencies

So far, there has been little activity on wideband AMR due to work load on the
narrowband AMR

Several organisations indicated they are studying wideband AMR.
Results probably not available until end 1998.

UMTS Matters

Liaisons with ARIB (Japan)

Set-up collaboration on UMTS/IMT-2000 speech coding matters


ARIB representatives attending SMG11 meetings

AMR in UMTS and IMT-2000

Working assumption for UMTS (decision from SMG#26, subject to re-evaluation
after the AMR selection)

A possible candidate for IMT-2000 in ARIB, if standardized on schedule

WCDMA simulations

Initial simulation results with the GSM EFR codec and the AMR concept in a
WCDMA channel have been presented to SMG11

New Work Item: Noise Suppression

A new Work Item on Noise Suppression with AMR was approved by SMG in June
Optional DSP feature to reduce audio background noise
Can improve ease of conversation
Located ahead of the speech codec
Effective in many but not all background noise environments
Optimised for the AMR speech codec
Standardisation to guarantee minimum performance level
The work has not started yet and the scope of the work and possible standardization
has not been fully defined and agreed to

Next SMG11 Plenary Meetings

SMG11#7: 28 September - 2 October 1998; Sophia Antipolis; host Texas
Instruments
SMG11#8: 11 - 15 January 1999
SMG11#9: 3 - 5 June 1999
