
SwissQual

Transition to SQuad08 Version and Wideband Voice Tests

Test & Measurement

White Paper 01

The firmware of the instrument makes use of several valuable open source software packages. For information, see the "Open
Source Acknowledgement" on the user documentation CD-ROM (included in delivery).
Rohde & Schwarz would like to thank the open source community for their valuable contribution to embedded computing.

SwissQual AG
Allmendweg 8, 4528 Zuchwil, Switzerland
Phone: +41 32 686 65 65
Fax:+41 32 686 65 66
E-mail: info@swissqual.com
Internet: http://www.swissqual.com/
Printed in Germany. Subject to change. Data without tolerance limits is not binding.
R&S is a registered trademark of Rohde & Schwarz GmbH & Co. KG.
Trade names are trademarks of the owners.
SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free of errors and
omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual's liability for any
errors in the documents is limited to the correction of errors and the aforementioned advisory services.
Copyright 2000 - 2013 SwissQual AG. All rights reserved.
No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any
human or computer language without the prior written permission of SwissQual AG.
Confidential materials.
All information in this document is regarded as commercial valuable, protected and privileged intellectual property, and is provided
under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in
your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are registered trademarks of SwissQual AG.
Diversity Explorer™, Diversity Ranger™, Diversity Unattended™, NiNA+™, NiNA™, NQAgent™, NQComm™, NQDI™, NQTM™,
NQView™, NQWeb™, QPControl™, QPView™, QualiPoc Freerider™, QualiPoc iQ™, QualiPoc Mobile™, QualiPoc Static™, QualiWatch-M™, QualiWatch-S™, SystemInspector™, TestManager™, VMon™, VQuad-HD™ are trademarks of SwissQual AG.
The following abbreviations are used throughout this manual: R&S___ is abbreviated as R&S ___.

Contents

1 Introduction
1.1 More Complex Telecommunication Networks And Handsets
1.2 Demand For Wideband Audio Transmission
1.3 Technical Details
2 SQuad08 Wide-Band Voice Quality Measurements
2.1 Differences to Narrow-Band
2.2 Wide-Band Speech Reference Signals
2.3 Where Wide-Band Quality Can Be Assessed
2.4 Wide-Band Analysis In Diversity
2.5 SQuad08 Wide-Band Performance
3 SQuad08 Narrow-Band Voice Quality Measurements
3.1 SQuad08 Prediction Performance in Narrow-Band Test Cases
3.2 Differences to the Previous Version Of SQuad
4 SQuad in Diversity
4.1 Voice Telephony
4.2 Video Streaming
5 Conclusion


1 Introduction
SwissQual has used SQuad as the heart of its voice quality suite since the company was founded in 2000. SQuad was developed specifically to meet the requirements of mobile and Voice-over-IP scenarios and forms the backbone of SwissQual's entire voice quality suite to this day. With its reliable and accurate results, SQuad is widely accepted and has been used for years for benchmarking and optimizing mobile and fixed-line networks.
SQuad has been maintained and continuously improved over the years. However, the evolution of networks and services expected in the near future triggered a complete revision of the SQuad algorithm. The new and improved version supports voice signals without bandwidth restriction as well as traditional narrow-band measurements, and it is SwissQual's candidate for the new ITU-T Recommendation P.OLQA.
To avoid any ambiguity with the previous version, the new version is called SQuad version 08 in this document.

1.1 More Complex Telecommunication Networks And Handsets
Telecommunication networks are increasingly equipped with highly non-linear components, and long-distance calls usually pass through several such components and even through different networks.
Today, speech quality is no longer determined by the speech codec or by lost frames alone. Other components also interact: they control the signal level automatically, apply smart loss concealment, and use similar strategies to increase intelligibility in critical situations. Unfortunately, such components are not used just once in a connection; several of them are usually chained, and they can interfere with each other.
In addition, the standardization of speech codecs has progressed, and recently standardized coding schemes are now being integrated into the networks. New schemes such as EVRC and EVRC-B, which are used in CDMA networks, were considered from the very beginning of the development of the new SQuad version. Besides the traditional schemes for voice source coding, audio compression methods such as MP3 and AAC are increasingly being used in telecommunication services as well.
This wide range of new and updated transmission technologies is now covered by the revised SQuad version '08'.


1.2 Demand For Wideband Audio Transmission


The telecom industry is now starting the evolution from narrow-band telephony to wide-band speech transmission. The wide-band codecs are ready and approved by the standardization bodies, the handsets are no longer restricted in processing power, and the core networks are being upgraded.
Narrow-band speech has been the normal experience for telephone users and has been accepted for decades. As the mobile telephone becomes an increasingly multimedia-based device, however, the traditional telephone sound seems less and less acceptable. Consumer expectations are changing while, at the same time, increased processing power allows wider audio bandwidths. The first step, wide-band transmission up to 7000 Hz, is already being overtaken by so-called super-wideband transmission technology, which opens the band up to 14000 Hz, and by the family of audio codecs including MP3 and AAC, which allow transmission up to and beyond the upper limit of human hearing.
In SwissQual products, wide-band actually means wide-band: audio signals are evaluated up to a limit of 14000 Hz, which is more than sufficient for speech signals.

1.3 Technical Details


To meet these upcoming demands on voice quality predictors in the expected reliable and robust manner, SwissQual launched a new voice quality prediction algorithm, SQuad version '08', as part of the Diversity 10.2 release.
SwissQual has already upgraded the entire audio processing chain in its products to support wide-band audio transmission and launched a measurement suite for wide-band Listening Quality in the last quarter of 2008.
SQuad version '08' is a completely rewritten program in which individual modules can be used and scaled according to the application area. It follows the same proven base concept as the previous version of SQuad, namely the use of advanced psycho-acoustic models. However, the psycho-acoustic model has been extended significantly, and the subsequent cognitive processing model now has to deal with new types of distortions that are measurable but might not be relevant for the perceived quality. At the other end of the program, SQuad version '08' uses a completely new time-alignment procedure that draws on SwissQual's experience in image recognition and video evaluation. This allows a very robust alignment even in the case of strong time warping coupled with other distortions.
SQuad version '08' is not used as a standalone solution. It is fully integrated in the well-known SQuad suite, which provides a lot of additional information about the connection under test, including a cause analysis that gives the most probable cause for a given degradation.
The existing and widely deployed previous SQuad remains a part of Diversity and can still be used if desired. This allows all customers to continue their ongoing measurement campaigns and to plan a transition to SQuad version '08' based on their own schedule.


2 SQuad08 Wide-Band Voice Quality Measurements


As already mentioned, SQuad version 08 provides Listening Quality scores for audio applications whose bandwidth extends well beyond the traditional telephone band.
The term wide-band is not used consistently. The first trials with extended audio bandwidths started in the 1980s and opened the band up to 7000 Hz (using a sampling frequency of 16 kHz). The lower band limit was sometimes 50 Hz, sometimes 100 Hz. An early coding standard was based on an ADPCM scheme (ITU-T G.722) and remained untouched for many years.
After the year 2000, new efforts were made to standardize wide-band versions of common telephony speech codecs, such as AMR-WB based on AMR, EVRC-WB based on EVRC-B, and G.729.1, which is the wide-band extension of G.729.
As audio codecs evolved, the limitation to 7000 Hz became restrictive. Ongoing voice codec developments already process the so-called super-wideband (up to 14000 Hz) or even higher bandwidths (full-band). For human speech, the perceived difference between super-wideband and full-band can be ignored. However, voice codecs are now entering the domain of common audio coding schemes such as MP3 and AAC.
Within SwissQual's product lines, the term narrow-band refers to the traditional telephone band, whilst the term wide-band stands for an audio bandwidth of 50 to 14000 Hz.
With Diversity Release 10.2, SwissQual launched a wide-band test application for the first time. On the one hand, a completely revised, renewed and extended SQuad was developed as the core algorithm for estimating the Listening Quality. On the other hand, the whole audio processing chain, from the handset's audio connector across the analog circuits in Diversity to the digital signal processing, was extended to higher sampling frequencies.

2.1 Differences to Narrow-Band


In traditional telephony scenarios, the listener's expectation is set to a perfect but narrow-band voice signal. A signal that is close or identical to such a signal is scored with a high quality value (usually a MOS-LQ of around 4.5 on a five-point scale). Additional degradations decrease the quality value towards 1.0.
For more detailed information, see the White Paper About MOS and Quality Measurements, which was published by SwissQual AG in 2009.
Within a wide-band scenario, the expectation of excellent quality is a perfect wide-band speech signal. Since the same scale is used here, such a perfect wide-band signal is scored with 4.5 too. A narrow-band signal in the same context cannot fulfill the expectation of high quality because of its band limitation. Consequently, it is scored lower in this context.
This is, roughly speaking, the main difference. There are other effects, such as a different perception of noise: noise components in the higher frequency ranges are masked less by voice, or not at all. But the main difference remains the lower scores for narrow-band signals.
Most important for customers are the typical values obtained with the wide-band application compared to narrow-band measurements.
The following table shows typical values obtained in subjective experiments. These are also predicted by SQuad version 08.
Subjective MOS-LQ scores depend on the individual experiment design and the cultural attitudes of the test listeners. The given values are just typical examples, averaged over a series of tests.
Table 2-1: Typical MOS-LQ values for common transmission techniques

| Transmission technique | MOS-LQ wide-band (50-14000 Hz) | MOS-LQ narrow-band (300-3400 Hz) |
| --- | --- | --- |
| Transparent transmission 50-14000 Hz or wider | 4.5 | - |
| Transparent transmission 50-7000 Hz (old wide-band) | 4.3 | - |
| AMR-WB 12.65 kbps (50-7000 Hz) | 3.9 | - |
| Transparent transmission 300-3400 Hz (POTS) | 3.9 | 4.5 |
| Transparent transmission 300-3400 Hz (POTS) | 3.8 | 4.4 |
| EFR / AMR 12.2 kbps | 3.5 | 4.1 |
| EVRC 9.5 kbps | 3.5 | 3.9 |
| EVRC-B 9.5 kbps | 3.5 | 4.0 |
| EVRC-B 9.5 kbps | 3.4 | 3.8 |

It can be seen that the rank order of the systems is preserved. The upper range of the wide-band scale is used only for high-quality wide-band voice samples. The common narrow-band scenarios are compressed into the lower 70% of the scale and show a smaller gradient as well.
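The rank-order statement can be checked against Table 2-1. The following Python sketch uses the six conditions that have a score on both scales and counts how many pairs of systems swap their order between the two scales. The helper names `rank_inversions` and `spread` are hypothetical and not part of any SwissQual tool, and ties on one scale are not counted as inversions.

```python
from itertools import combinations

# (MOS-LQ on the wide-band scale, MOS-LQ on the narrow-band scale) for the
# six conditions in Table 2-1 that have a value in both columns, in table order.
scores = [
    (3.9, 4.5),  # Transparent transmission 300-3400 Hz (POTS)
    (3.8, 4.4),  # Transparent transmission 300-3400 Hz (POTS), second row
    (3.5, 4.1),  # EFR / AMR 12.2 kbps
    (3.5, 3.9),  # EVRC 9.5 kbps
    (3.5, 4.0),  # EVRC-B 9.5 kbps
    (3.4, 3.8),  # EVRC-B 9.5 kbps, second row
]


def rank_inversions(pairs):
    """Count pairs of systems whose order flips between the two scales.

    Hypothetical helper for this sketch; ties on either scale are ignored.
    """
    inversions = 0
    for (wb_a, nb_a), (wb_b, nb_b) in combinations(pairs, 2):
        if (wb_a - wb_b) * (nb_a - nb_b) < 0:
            inversions += 1
    return inversions


def spread(values):
    """Range between the best and the worst score, rounded to two decimals."""
    return round(max(values) - min(values), 2)


inversions = rank_inversions(scores)
wb_spread = spread([wb for wb, _ in scores])
nb_spread = spread([nb for _, nb in scores])

print("rank inversions:", inversions)
print("spread on wide-band scale:", wb_spread)
print("spread on narrow-band scale:", nb_spread)
```

With the table values, no pair of systems changes order, and the same six conditions span a smaller range on the wide-band scale than on the narrow-band scale, which matches the smaller gradient described above.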
For optimizing and benchmarking pure narrow-band networks and applications, the common narrow-band test application can be used without any problems. It discriminates the individual systems better because a wider range of the scale is used.
For optimizing wide-band applications and networks, and especially for benchmarking wide-band networks against narrow-band ones, a wide-band test application is required.


Firstly, the degradations in wide-band mode can only be assessed in a wide-band test application. Secondly, a wide-band signal can only show its better quality compared to narrow-band in wide-band mode.
Narrow-band MOS-LQ values and wide-band MOS-LQ values must never be mixed or directly compared. They refer to different interpretations of the MOS scale.
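One way to enforce this rule in post-processing scripts is to carry the scale context along with every score. The following Python sketch is only an illustration of that idea: the class `MosLq`, its field names and the mode labels are assumptions made here and do not correspond to any SwissQual API. The example values are taken from Table 2-1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MosLq:
    """A MOS-LQ value tagged with the scale context it was measured in."""

    value: float
    mode: str  # "narrow-band" or "wide-band" (labels chosen for this sketch)

    def difference_to(self, other: "MosLq") -> float:
        """Return self - other, refusing to compare across scale contexts."""
        if self.mode != other.mode:
            raise ValueError(
                f"cannot compare {self.mode} and {other.mode} MOS-LQ values"
            )
        return round(self.value - other.value, 2)


efr_nb = MosLq(4.1, "narrow-band")   # EFR / AMR 12.2 kbps, narrow-band scale
evrc_nb = MosLq(3.9, "narrow-band")  # EVRC 9.5 kbps, narrow-band scale
amr_wb = MosLq(3.9, "wide-band")     # AMR-WB 12.65 kbps, wide-band scale

print(efr_nb.difference_to(evrc_nb))
try:
    efr_nb.difference_to(amr_wb)
except ValueError as err:
    print("rejected:", err)
```

The comparison within one scale succeeds, while the attempt to subtract a wide-band score from a narrow-band score is rejected, even though the two numbers look comparable.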

2.2 Wide-Band Speech Reference Signals


To feed true wide-band signals into the channel, new voice samples were recorded. They have no perceptible bandwidth limitation and are stored with a sampling frequency of 32 kHz in a separate reference folder, Speech-Wideband. As usual, each sample consists of one sentence spoken by a male and one by a female speaker and has a constant length of 6 s. Continuity with the narrow-band tests is therefore fully preserved.
For the time being, SwissQual provides samples in:

German (Swiss pronunciation)

American English

British English

Italian

Dutch

Each language sample is provided without any pre-filtering (except for a 50 to 14000 Hz band-pass) and is named, for example, AM_fm_wide.wav. In addition, for special applications there are samples pre-filtered with the wide-band IRS (send) filter according to ITU-T P.830. This filter reduces the effective bandwidth to the range of 50 to 7000 Hz, with a pre-emphasis in the range of 2500 Hz. Please note that this bandwidth restriction already decreases the MOS value slightly because of the band-pass filter applied to the signals. These signals are named, for example, AM_fm_IRS_wide.wav.
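The stated properties of the reference samples (32 kHz sampling frequency, 6 s length) are easy to verify before a sample is used. The following Python sketch uses the standard `wave` module and assumes that the files are plain PCM WAV files; the function names, the tolerance and the silent stand-in files are assumptions made for this illustration only.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 32000  # sampling frequency stated in the text
EXPECTED_LENGTH_S = 6.0   # constant sample length stated in the text


def check_reference_sample(path, tolerance_s=0.05):
    """Return True if the WAV file has the expected rate and length.

    The tolerance is an assumption made for this sketch.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration_s = wav.getnframes() / rate
    return rate == EXPECTED_RATE_HZ and abs(duration_s - EXPECTED_LENGTH_S) <= tolerance_s


def write_silence(path, rate_hz, length_s):
    """Write a mono 16-bit PCM file of silence, used here as stand-in data."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * length_s))


demo_dir = tempfile.mkdtemp()
good = os.path.join(demo_dir, "AM_fm_wide.wav")    # file name pattern from the text
bad = os.path.join(demo_dir, "narrowband_8k.wav")  # hypothetical file name
write_silence(good, 32000, 6.0)
write_silence(bad, 8000, 6.0)

print(check_reference_sample(good))
print(check_reference_sample(bad))
```

The check only looks at the file header. It does not verify the actual audio bandwidth of the content, which would require a spectral analysis.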

2.3 Where Wide-Band Quality Can Be Assessed


Although wide-band audio is a normal part of everyday communication such as TV and FM radio, it is still not widespread in telecommunications.
It has been used in commercial video conferencing systems, but Internet telephony was the first to bring wide-band telephony to ordinary users. Today, common VoIP clients support a wide range of wide-band codecs and use them if a sufficient bit rate is available for the service.
The next step in wide-band telephony is the evolution of cellular networks and handsets. Networks and user devices are being equipped with AMR-WB, which allows an audio bandwidth of up to 7000 Hz while remaining in the typical bit-rate range used for GSM and UMTS.


The main focus of Diversity's wide-band test solution is, of course, the evaluation and benchmarking of wide-band channels in cellular networks.
An additional application area for wide-band voice testing in Diversity is video streaming. Video streaming usually uses audio codecs, which have no bandwidth restriction except under very low bit-rate conditions. Consequently, Speech Wideband is also applied as a test case to video streaming, starting with Release 10.2 of Diversity.

2.4 Wide-Band Analysis In Diversity


The wide-band test application is a standalone test, Speech Wideband, whilst the test Speech remains narrow-band.
Speech Wideband can be selected as a separate test in SwissQual's NQView and in the post-processing tool NQDI.
Within NQDI, the detailed analysis is named "Listening Quality Wideband".

Fig. 2-1: Presentation of the main set of SQuad Wideband results in SwissQual's NQDI

The application type (highlighted in green) describes the modeled listening situation in detail. In addition, since a potential bandwidth reduction has a serious impact in a wide-band scenario, the actual bandwidth of the channel is measured and reported as well (highlighted in red). There are three classes: narrow-band, wideband (up to 8000 Hz) and super-wideband (up to 14000 Hz). The remaining values are the same as usual and are well known from SQuad.
The "SQuad Details" tab shows the audio bandwidth of the measured audio channel.

Fig. 2-2: Presentation of the channel's frequency response in SwissQual's NQDI

The lower and upper bounds are marked with blue lines. As is clearly visible, SQuad version 08 makes use of real full-band signals. The frequency scale here ends at 16000 Hz; this corresponds to an internal sampling frequency of 32 kHz.
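The relation between the 32 kHz sampling frequency and the 16000 Hz end of the frequency scale is the Nyquist limit, and the three reported bandwidth classes can be written as a simple threshold rule. The following Python sketch illustrates both. The upper bounds of 8000 Hz and 14000 Hz come from the text; the narrow-band bound of 4000 Hz and the function names are assumptions made for this sketch, and how SQuad actually measures and classifies the channel bandwidth is not described here.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency that can be represented at the given sampling rate."""
    return sampling_rate_hz / 2


# Upper bound of the wideband class as given in the text.
WIDEBAND_MAX_HZ = 8000
# Assumed upper bound of the narrow-band class (not stated in the text).
NARROW_BAND_MAX_HZ = 4000


def bandwidth_class(upper_bound_hz):
    """Map a measured upper band limit to one of the three reported classes."""
    if upper_bound_hz <= NARROW_BAND_MAX_HZ:
        return "narrow-band"
    if upper_bound_hz <= WIDEBAND_MAX_HZ:
        return "wideband"
    return "super-wideband"  # up to 14000 Hz according to the text


print(nyquist_hz(32000))
for limit in (3400, 7000, 14000):
    print(limit, bandwidth_class(limit))
```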


2.5 SQuad08 Wide-Band Performance


To assess the performance of an objective quality measure, its results are usually compared with subjectively obtained data.
For more detailed information, see the White Paper About MOS and Quality Measurements, which was published by SwissQual AG in 2009.
The average accuracy across a data set (often called an experiment or speech database) is expressed by correlation coefficients and/or root mean square errors. These values give an overview of the general performance. However, the actual numbers reached depend on the construction of the data set and the kind of conditions it contains. There are always test conditions that a model can predict easily and accurately, such as noise or waveform codecs, and others where the deviation is higher (usually combinations of distortions). How often such conditions occur in a data set has a strong influence on these figures.
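Both figures are straightforward to compute. The following Python sketch shows the Pearson correlation coefficient and the root mean square error between subjective and predicted scores. The score lists are invented for illustration and are not measurement results.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def rmse(x, y):
    """Root mean square error between two equally long score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Invented example values, one entry per test condition.
subjective = [4.4, 3.9, 3.5, 2.8, 1.9]
predicted = [4.3, 4.0, 3.4, 3.0, 2.0]

print("correlation:", round(pearson(subjective, predicted), 3))
print("rmse:", round(rmse(subjective, predicted), 3))
```

Adding a few conditions with large deviations to such a list lowers the correlation and raises the error noticeably, which is why the composition of the data set matters when such figures are compared.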
For this reason, the reliability of a predicted score is sometimes more important to a user. Reliability means: how is a certain condition predicted relative to another condition that is also of interest? It is mainly the correct relative prediction that matters, because it allows benchmarking statements such as "A is better than B, and C is even worse".
Figure 2-3 tries to give an impression of both aspects. The analysis plot is based on a database that covers an extremely wide range of distortion types and amounts. It covers codecs, real live channels, noise, bandwidth limitation and interruptions. It even contains recordings made in the acoustical domain with an artificial ear simulator. The example data set corresponds to the strict specifications made within ITU-T's P.OLQA model selection.


Fig. 2-3: Performance of SQuad 08 WB for a wide-band data set (up to 14000 Hz)

Firstly, we can see that SQuad version 08 does an excellent job in wide-band mode as well. The correlation coefficient is very high at 0.95. In addition, the filled symbols include AMR, AMR-WB as well as EVRC-B and EVRC-WB conditions. Their ranking relative to each other is excellent, which shows a high reliability.
Over a range of other data sets, SQuad version 08 performs similarly in wide-band mode and usually shows correlation coefficients between 0.9 and 0.97.
It is very important to note that all these data sets cover a much wider range of distortions than is used for Diversity today. In addition to the standard use cases for Diversity, SQuad version 08 can also handle acoustical recordings in real room environments. Even hands-free devices can be assessed if the speech signal is recorded by an artificial head and ear simulator.


3 SQuad08 Narrow-Band Voice Quality Measurements
SQuad version 08 also supports a narrow-band operational mode that corresponds to the existing SQuad. Owing to the revision of SQuad and the extension of its internal psycho-acoustic and cognitive models, the performance in narrow-band mode was improved further. SQuad version 08 is suited to new transmission technologies that are being launched now or in the near future and provides stable and reliable results for them, along with improved performance for existing technologies.
The development of SQuad version 08 is coordinated with SwissQual's activities in ITU-T's P.OLQA standardization efforts. In recent years, SwissQual and other partners conducted a series of new subjective listening tests, which were used as reference and anchor scores for the training of SQuad. These tests focus on new technologies and how they are rated against previous speech processing techniques in narrow-band and wide-band applications. They were used to improve and train the SQuad algorithm in narrow-band operational mode too.
Based on the subjective test results, the SQuad scores were revised to predict accurately how MOS-LQ is perceived in today's subjective experiments. There are small shifts compared to experiments conducted 10 years ago; most notably, people have become more familiar with speech coding artifacts because of the widespread use of cellular phones in recent years. Ten years ago, a customer's expectation was mainly set by high-quality ISDN connections, and any kind of coding distortion was rated significantly lower. Today, the expectation has shifted. People are more tolerant of slight coding artifacts as produced, for example, by an EFR codec. Consequently, the MOS scores obtained in those conditions are slightly more optimistic than in traditional tests. The fact that such small differences can be detected and modeled accordingly speaks for the high accuracy of the conducted experiments and of SQuad as a predictor of listening quality.
One typical shortcoming of common narrow-band prediction methods was the lack of training data covering CDMA technologies. In particular, coding schemes such as QCELP, EVRC and EVRC-B were not part of traditional ITU-T reference data sets. For the revision of SQuad, EVRC-type codecs were widely included in the data sets to obtain reliable and accurate results, especially in relation to GSM/UMTS coding schemes. A reliable and accurate comparative rating is the key to true benchmarking of GSM/UMTS and CDMA networks. These evaluations were not restricted to offline simulation of those codecs; they also included live recordings via physical handsets, with all interactions with other processing components in the phone and networks.
Another focus of the newly conducted tests was Voice Quality Enhancement systems and their typical processing functions, such as Active Gain Control, Noise Reduction, Time-Variant Filtering and many more. These types of systems are being used more and more in cellular networks and intelligent devices.


3.1 SQuad08 Prediction Performance in Narrow-Band Test Cases
The prediction performance or prediction accuracy is usually expressed by correlation coefficients with respect to subjective test data. Figure 3-1 makes use of so-called scatter plots.
For more detailed information, see the White Paper About MOS and Quality Measurements, which was published by SwissQual AG in 2009.
SQuad is trained on several data sets containing an extremely wide range of possible distortion types. These types go far beyond the typical distortions in cellular networks. This enables SQuad to predict accurate and reliable MOS-LQ values even in future network structures without having to be revised at short intervals.
Two data sets serve as examples. The first is a typical data set provided by ITU-T that covers common coding schemes in different combinations. The black symbols indicate conditions with a strong involvement of cellular codecs.
The second is a data set derived from real field recordings as made by benchmarking systems such as Diversity; it also includes VoIP connections and handset influences.

Fig. 3-1: Performance of SQuad version 08 NB ITU-T G series codecs

For traditional test conditions, established and common quality prediction methods show quite good results. Those conditions were already available when these measures were developed. There is obviously less room to improve the prediction performance further.
The improvement of the revised SQuad version 08 becomes apparent immediately for newer and more complex setups, where traditional measures fail or become less accurate.


Table 3-1 and Table 3-2 give an impression of the improvement by means of correlation coefficients in relation to subjective MOS scores.
The correlation coefficients are calculated on a per-condition basis after third-order monotonic fitting, as referenced in ITU-T P.862.
Table 3-1: Performance for typical traditional data sets for SQuad and P.862.1

| Traditional data sets | Description | SQuad-LQ | P.862.1 | SQuad08 |
| --- | --- | --- | --- | --- |
| ITU-T G series codecs | The data set covers more than 500 individual speech samples in three different languages. The data set is made available by ITU-T SG12 as Supplement 23. | 0.95 | 0.95 | 0.97 |
| ITU-T G series codecs under frame loss and background noise | The data set covers more than 700 individual speech samples in four different languages. The data set is made available by ITU-T SG12 as Supplement 23. | 0.92 | 0.94 | 0.96 |

Table 3-2: Performance for typical complex data sets for SQuad and P.862.1

| Complex data sets | SQuad-LQ | P.862.1 | SQuad08 |
| --- | --- | --- | --- |
| Real field connections: GSM/UMTS w/ different handsets and analogue matching circuits | 0.87 | 0.85 | 0.95 |
| Real field connections: GSM, UMTS and CDMA w/ noise reduction | 0.88 | 0.88 | 0.95 |
| Narrow-band test cases acc. to P.OLQA specification | 0.80 | 0.78 | 0.87 |
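The per-condition evaluation behind such correlation figures can be sketched in a few steps: average the scores of all speech samples that belong to the same condition, map the objective scores onto the subjective scale with a third-order polynomial, and then compute the correlation. The following Python sketch uses a plain least-squares fit and only checks afterwards whether the fitted curve is monotonic; it is not the fitting procedure of ITU-T P.862 or of SQuad, and all function names and score values are invented for illustration.

```python
import math
from collections import defaultdict


def per_condition_means(samples):
    """Average (subjective, objective) scores over all samples of a condition."""
    groups = defaultdict(list)
    for condition, subjective, objective in samples:
        groups[condition].append((subjective, objective))
    subj, obj = [], []
    for condition in sorted(groups):
        values = groups[condition]
        subj.append(sum(s for s, _ in values) / len(values))
        obj.append(sum(o for _, o in values) / len(values))
    return subj, obj


def fit_cubic(x, y):
    """Least-squares fit y ~ c0 + c1*x + c2*x^2 + c3*x^3 via normal equations."""
    n = 4
    a = [[sum(v ** (i + j) for v in x) for j in range(n)] for i in range(n)]
    b = [sum(yv * xv ** i for xv, yv in zip(x, y)) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def is_monotonic(coeffs, lo, hi, steps=200):
    """Check that the fitted curve never decreases on [lo, hi]."""
    _, c1, c2, c3 = coeffs
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        if c1 + 2 * c2 * x + 3 * c3 * x * x < -1e-9:
            return False
    return True


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    return cov / math.sqrt(
        sum((p - mx) ** 2 for p in x) * sum((q - my) ** 2 for q in y)
    )


# Invented samples: (condition, subjective score, objective score).
samples = [
    ("c1", 1.8, 2.1), ("c1", 2.0, 2.3), ("c2", 2.6, 2.7), ("c2", 2.4, 2.9),
    ("c3", 3.1, 3.2), ("c3", 3.3, 3.0), ("c4", 3.6, 3.5), ("c4", 3.8, 3.7),
    ("c5", 4.0, 3.9), ("c5", 4.2, 4.1), ("c6", 4.4, 4.3), ("c6", 4.4, 4.2),
]

subj_means, obj_means = per_condition_means(samples)
coeffs = fit_cubic(obj_means, subj_means)
mapped = [evaluate(coeffs, o) for o in obj_means]
r = pearson(mapped, subj_means)

print("monotonic on data range:", is_monotonic(coeffs, min(obj_means), max(obj_means)))
print("per-condition correlation after fitting:", round(r, 3))
```

Averaging per condition removes the scatter between individual speech samples, and the fitting step compensates for systematic offsets between experiments, so the resulting coefficient mainly reflects how well the ordering and spacing of the conditions is predicted.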

Today's real field connections include components that were explicitly excluded when ITU-T's P.862 was standardized, or that were simply not available and not tested. In particular, the relation of different distortion types to each other was covered less by the data sets used. This relative assessment of a wide range of distortion types and amounts is a key point of the current development and standardization of P.OLQA.

3.2 Differences to the Previous Version Of SQuad


For common applications in cellular networks, the differences are actually very small. A customer may only see a slightly higher MOS-LQ value for error-free or high-quality transmission using EFR or AMR at higher bit rates. Instead of a typical value in the range of 4.0 for EFR, they may now obtain 4.10 to 4.15. This is also more in line with the reference values published in ITU-T P.862.3 in late 2007. The lower bit rates of AMR remain in the range of 3.5.
An improvement will be seen for EVRC-type codecs as used in CDMA. The new version of SQuad shows an even better comparability with EFR/AMR codecs.
ITU-T and 3GPP do not recommend the use of the P.862 family for EVRC-type codecs.
Furthermore, the new SQuad is trained to score complex channels that include more than just a codec, for example noise reduction, variable gain and filtering as well as strong time warping.
Table 3-3 shows the main differences in scores between the previous version of SQuad and SQuad version 08. The P.862.1 scores are also listed for reference.
The results are based on typical speech samples, that is, American English as used in Diversity. The codecs are used as reference software implementations. In addition, one EFR condition is shown as it behaves in a real loss-free channel, using a commercial Nokia handset as the access device to the network. The channel was terminated by an ISDN card device running G.711 A-Law.
Table 3-3: Typical predicted MOS-LQ values for common transmission techniques

| Transmission technique | SQuad-LQ (narrow-band) | SQuad-LQ 08 (narrow-band) | P.862.1 (narrow-band) |
| --- | --- | --- | --- |
| Transparent transmission 300-3400 Hz (POTS) | 4.50 | 4.45 | 4.50 |
| G.711 (A-Law standard PCM) | 4.35 | 4.35 | 4.45 |
| EFR / AMR 12.2 kbps | 4.00 | 4.10 | 4.10 |
| EFR (real loss-free connection) | 3.95 | 4.00 | 4.00 |
| QCELP 13 kbps | 3.95 | 3.90 | 3.90 |
| EVRC 9.5 kbps | 3.60 | 3.85 | 3.75 |
| EVRC-B 9.3 kbps | 3.80 | 3.95 | 3.70 |
| AMR 7.95 kbps | 3.75 | 3.90 | 3.75 |
| AMR 6.70 kbps | 3.70 | 3.85 | 3.65 |
| AMR 4.75 kbps | 3.50 | 3.60 | 3.25 |

Firstly, the new SQuad version gives a slightly more optimistic prediction. It comes closer to recent subjective testing, as discussed above.
The launch of EVRC-type codecs added a new challenge for objective prediction methods. Traditional methods rate EVRC-type codecs relatively low compared to AMR codecs or ITU-T G series codecs. Both the previous SQuad and P.862.1 rate the new EVRC-B at 9.5 kbps as equivalent to an AMR 7.95 kbps signal on average. This seems too low in the light of recent subjective tests.
Since the new version of SQuad was developed on data sets reflecting today's techniques, EVRC-type codecs are now scored much more realistically. Applying the revised SQuad to EVRC-B at its highest rate results in a MOS-LQ that is equivalent to AMR at 10.2 kbps and even close to EFR, which fits human perception much better. In addition, SQuad is able to differentiate between EVRC and EVRC-B by a reliable difference in scoring.
SQuad version 08 is now the core algorithm in SwissQual's speech quality analysis
suite. With its psycho-acoustic and cognitive models, it predicts the listening quality
as perceived by a customer. SQuad is, however, more than a MOS-LQ predictor: it is a
suite that provides a strong framework delivering much more data. All additional
results, such as sanity checks of the signals prior to the evaluation, cause analysis,
level analysis and much more, remain in place and have even been extended.
Consequently, the look and feel of SQuad version 08 remains the same as that of the
well-accepted previous SQuad version. To avoid an unwanted mix-up of results obtained
with SQuad version 08 and its predecessor, NQDI supports separate reports and tables.
The current ITU-T standard is supported as an optional additional measure in the
SQuad-LQ suite by both the previous SQuad and the revised version 08.


4 SQuad in Diversity
The following sections describe how Diversity measurement systems use SQuad.

4.1 Voice Telephony


SQuad is the core algorithm for Listening Quality in Diversity. It is used in an intrusive
test call application, where the far-end side sends a predefined speech sample to the
measuring probe.
The synchronously captured speech sample is analyzed by the SQuad suite, and a set
of results including a predicted MOS-LQ is reported.
The related test application is simply called Speech. Because different reference
samples are used in a wide-band application (50-14000 Hz), the wide-band test
application is called Speech Wideband to keep the two apart. Both test types can be
combined in a call and/or run with different samples.
The test definitions are independent of the SQuad version in use. The job and test
type simply define the desired set of results. Whether the old or the revised SQuad
is applied is decided during the installation of the Diversity software on the
measurement probe itself.
Customers can therefore easily upgrade their units without updating existing
measurement jobs or campaigns. The new SQuad as well as the wide-band test scenarios
can be used for voice and video telephony in intrusive mode.
The previous chapter showed the performance and typical values for isolated codecs.
A real Diversity application, however, involves many more factors: a physical handset,
an air link and core-network components. The scores actually obtained always reflect
the entire channel in an end-to-end view. Because of these additional influences and
processing steps, individual scores can be significantly lower than those for isolated
codec processing.
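One way to read an end-to-end score is to compare it with the isolated-codec value from Table 3-3. The following is a minimal sketch under that assumption: the reference values are the SQuad-LQ 08 column of the table, while the names `REFERENCE_MOS_LQ_08` and `end_to_end_loss` are hypothetical and not part of Diversity or SQuad.

```python
# SQuad-LQ 08 values for isolated reference codecs, transcribed from Table 3-3.
REFERENCE_MOS_LQ_08 = {
    "G.711 A-Law": 4.35,
    "EFR / AMR 12.2 kbps": 4.10,
    "QCELP 13 kbps": 3.90,
    "EVRC 9.5 kbps": 3.85,
    "EVRC-B 9.3 kbps": 3.95,
    "AMR 7.95 kbps": 3.90,
    "AMR 6.70 kbps": 3.85,
    "AMR 4.75 kbps": 3.60,
}


def end_to_end_loss(codec: str, measured_mos_lq: float) -> float:
    """Isolated-codec reference score minus the measured end-to-end score.

    A positive result is the share of the degradation that is not explained
    by the codec alone (handset, air link, core network, termination).
    """
    return round(REFERENCE_MOS_LQ_08[codec] - measured_mos_lq, 2)


# Table 3-3 itself provides one data point: EFR over a real loss-free
# connection (Nokia handset, ISDN termination) scored 4.00 with SQuad-LQ 08.
efr_loss = end_to_end_loss("EFR / AMR 12.2 kbps", 4.00)
print(f"EFR end-to-end loss: {efr_loss:.2f} MOS")
```

Even on a loss-free connection, the real EFR channel in Table 3-3 sits 0.10 MOS below the reference software implementation. Channels with radio impairments can be expected to show larger gaps.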

4.2 Video Streaming


The Video Streaming test focuses on video quality evaluation and on the behavior of
the streaming service on the data layer. However, the SwissQual video streams used in
the Streaming PC Full Reference test also contain an audio track that carries a
voice signal.
Consequently, the transmitted voice signal is worth evaluating with SQuad as well.
Release 10.2 brings a major change to this voice analysis: in previous releases, only
narrow-band speech was transmitted along with the video, and the common narrow-band
SQuad was applied.
For Release 10.2, the provided video streams were revised. The new streams now cover
a much wider range of bit-rates and image resolutions. This revision made it worthwhile
to rethink the voice part as well. In multimedia applications such as video streaming,
wide-band or full-band audio signals encoded by audio codecs are typical nowadays. For
this reason, SwissQual combined the new video streams with speech signals without band
limitations. The applied codecs are now AMR for the lowest bit-rates (resulting in an
internal down-sampling to narrow-band) and AAC, which allows almost transparent
transmission at higher bit-rates.
Consequently, SQuad in wide-band mode is the appropriate means of measuring voice
quality in such streaming applications. The wide-band mode is now the default
measurement for voice signals in full-reference streaming applications.
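The release-dependent default can be summarized in a few lines. This is only an illustrative sketch of the rule stated above, assuming release identifiers of the form "major.minor"; the functions `parse_release` and `default_voice_mode` are hypothetical and do not correspond to any Diversity interface.

```python
def parse_release(release: str) -> tuple:
    """Turn a release string such as "10.2" into a comparable tuple (10, 2)."""
    return tuple(int(part) for part in release.split("."))


def default_voice_mode(release: str) -> str:
    """Default SQuad mode for the voice track in full-reference streaming tests.

    Releases before 10.2 carried narrow-band speech only; from Release 10.2
    onwards the wide-band mode is the default.
    """
    return "wide-band" if parse_release(release) >= (10, 2) else "narrow-band"


for release in ("10.1", "10.2"):
    print(release, "->", default_voice_mode(release))
```

Comparing integer tuples rather than strings keeps the ordering correct for identifiers such as "10.10", which would sort before "10.2" as plain text.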


5 Conclusion
The revised version of SQuad fulfills SwissQual's expectations for a new level of core
algorithm in Diversity. It enables true and reliable quality estimations for traditional,
today's and future transmission techniques.
At the same time, ITU-T's P.OLQA is on its way to being considered for standardization.
The revised SQuad is already a confirmed candidate algorithm for P.OLQA. During the
following optimization and selection period, a large number of additional speech data
sets will be created and made available. These additional data will further improve
the performance of SQuad. Because development is ongoing, individual scores may change
slightly, but the performance will increase.
