Helsinki University of Technology

Faculty of Electrical Engineering


Laboratory of Acoustics and Audio Signal Processing

Discrete-Time Modeling of Acoustic Tubes


Using Fractional Delay Filters

Vesa Välimäki

Report 37
December 1995

Dissertation for the degree of Doctor of Technology to be presented
with due permission for public examination and debate in Auditorium
S4 at the Helsinki University of Technology (Espoo, Finland) on the
18th of December, 1995, at 12 noon.

ISBN 951-22-2880-7
ISSN 0356-083X
TKK Offset 1995

Abstract
This work deals with digital waveguide modeling of acoustic tubes, such as bores of
musical woodwind instruments or the human vocal tract. The acoustic tube systems
considered in this work are those consisting of a straight cylindrical or conical tube
section or of a concatenation of several cylindrical or conical tube sections. Also, the
junction of three tube sections is studied. Of special interest for our application are
junctions where a side branch is connected to a cylindrical or conical tube since these
are needed in the simulation of woodwind instrument bores.
Basic waveguide models are generalized by employing the concept of fractional
delay, which means a fraction of the unit sample interval. A fractional delay is implemented using bandlimited interpolation. A novel discrete-time signal processing technique, deinterpolation, is defined. Applying fractional delay filtering techniques, a spatially discretized waveguide model is turned into a spatially continuous one. This
implies that the length of the digital waveguide can be adjusted as accurately as
required, and a change of the impedance of a waveguide may occur at any desired point
between sampling points. This kind of system is called a fractional delay waveguide
filter (FDWF). It is a discrete-time structure, yet a spatially continuous model of a
physical system.
The basic principles of digital waveguide modeling are first reviewed. Modeling
techniques for cylindrical and conical acoustic tubes are described, as well as methods
to simulate junctions of two or more of these sections. Different design methods for
both FIR and IIR (allpass) fractional delay filters are reviewed and the theoretical foundations of FDWFs are studied. Fractional delay extensions for acoustic tube model
structures are discussed and approximation errors due to fractional delay filters are analyzed. In addition, a new technique for eliminating transients due to time-varying filter
coefficients in recursive filters is introduced.
The models described in this work are directly applicable to physical modeling and
model-based sound synthesis of speech and wind instruments.
Keywords: digital signal processing, digital filter design, fractional delay, interpolation, acoustic tube, conical acoustic tube, vocal tract modeling, physical modeling,
waveguide synthesis, time-varying digital filters

Preface
This work was carried out at the Laboratory of Acoustics and Audio Signal Processing in
the Helsinki University of Technology (HUT) (Espoo, Finland) during the years
1993–1995. I am deeply grateful to my supervisor, Prof. Matti Karjalainen, for constant help
and friendship during all phases of my work and also for organizing financial support for
my research.
I have had the opportunity to conduct part of this study at CARTES (Espoo, Finland). I
am grateful to Mr. Otto Romanowski (director of CARTES), Ms. Sini Sormunen, Mr.
Harri Kaimio, Mr. Jan G. Liljestrand, Mr. Tuomo Rönkkö, Mr. Erkka Suominen, Mr.
Juuso Voltti, and other people I have met in CARTES, for a pleasant and inspiring atmosphere.
I have also spent more and more time working at the Acoustics Laboratory in
Otaniemi, and I want to thank the personnel of this slightly overcrowded laboratory for
being friendly every day. In particular, I want to thank our secretary, Ms. Lea Söderman,
for her kind help in practical issues, and Mr. Jyri Huopaniemi and Mr. Martti Rahkila for
their assistance in questions associated with computer networks and file formats. Jyri
read the manuscript of this thesis and gave me several useful comments. I wish to thank
Mr. Timo Kuisma with whom I had fruitful discussions on the nature of deinterpolation.
We also worked together on two conference papers. Lic. Tech. Toomas Altosaar has
proof-read many of my conference papers and has helped me to improve my English
writing. I am deeply indebted to him for this. At the Acoustics Laboratory, I have had
many interesting discussions with experienced scientists, such as Dr. Paavo Alku, Mr.
Juha Backman, and Dr. Unto K. Laine, which I gratefully acknowledge here. Paavo read
the manuscript and helped me with his insightful comments. My special thanks go to Mr.
Tommi Huotilainen, with whom I share an office room at the Acoustics Lab, for being a
tough opponent in the game of squash.
I wish to thank the pre-examiners of my thesis, Assoc. Prof. Olli Simula (Laboratory
of Computer and Information Science, HUT) and Assoc. Prof. Julius O. Smith
(CCRMA, Stanford University, California, USA) for their positive feedback and profound comments. I am especially happy that I have had the opportunity to discuss interesting issues related to physical modeling of musical instruments with Dr. Smith several
times during the years 1991–1995 all over the world (e.g., in Finland, Sweden,
Denmark, Japan, Italy, USA, Canada, and the Internet).
I would like to express my gratitude to Prof. Timo I. Laakso (SEMSE, University of
Westminster, London, U.K.) who has played an important role in this study: with him I
have written numerous papers that are related to this thesis. Furthermore, Timo gave me
valuable comments on my manuscript thus helping me to improve my text.
I also want to thank Prof. Iiro Hartimo (Laboratory of Signal Processing and
Computer Technology, HUT) who read an early version of my manuscript and gave me
useful feedback. Mr. Luis Costa worked hard to find language deficiencies and typing
mistakes in the manuscript. I want to thank him for his diligence. I learned a lot from his
comments.
Dr. Jean-Marc Jot (IRCAM, Paris, France) and I had some inspiring e-mail discussions on time-varying digital filters. Mr. Stephan Tassart (IRCAM, Paris, France) has
amused me with theoretically interesting questions related to Lagrange interpolation. I
have had some friendly debates with Mr. Maarten van Walstijn (Royal Conservatory, the
Hague, the Netherlands) via e-mail, and also in Ferrara, Padua and Venice, about fractional delays and conical waveguides. I want to express my gratitude to these fine persons for their interest in my work.
I have had the opportunity to write a journal paper with Prof. Tapio "Tassu" Takala
(Laboratory of Information Processing Technology, HUT). I thank him for fluent cooperation and friendship. I wish to thank my coauthor Dr. Jonathan Mackenzie (SEMSE,
University of Westminster, London, U.K.) for innovative co-operation. Mr. Rami
Hänninen (Laboratory of Information Processing Technology, HUT) has recently started
implementing a woodwind instrument model that employs fractional delay techniques
presented in this work. Discussions with him have helped me to understand how to better
describe some of the filter structures. My sincere thanks go also to Mr. Sami Linnainmaa
(HUT) for his clever comments on the Thiran allpass filter. I am grateful to Mr. Martti
Vainio and Mr. Olli Salmensaari for their help in getting hold of some interesting references that appeared to be difficult to find.
I thank my good friend, mathematician, poet, and rock musician, Timo Kojonen, for
trying to weed out inaccurate mathematical notations in my work and especially for great
company in the evenings. I would also like to thank my fiancée Miia Jaatinen for love and
understanding also when I have been under pressure. Luckily enough, she is still not
showing any particular interest in digital signal processing, and has always been successful in making me think of issues other than work or this thesis.
Finally, my warmest thanks go to my parents, Anneli and Pekka Välimäki, who have
constantly encouraged me to study as much as possible, and given me the opportunity to
do so.
The financial support of the Academy of Finland, the Finnish Cultural Foundation,
and the Jenny and Antti Wihuri Foundation is acknowledged.

Westend, Espoo
November 30, 1995

Vesa Välimäki

Table of Contents
List of Symbols
List of Abbreviations
1 Introduction
1.1 Historical Notes
1.2 Overview of the Work
1.3 Contributions of the Author
1.4 Related Publications
2 Digital Waveguide Modeling
2.1 Waveguide Modeling of Strings
2.1.1 Wave Equation and the Traveling Wave Solution
2.1.2 Incorporating Losses and Dispersion
2.1.3 Karplus–Strong Algorithm and Its Extensions
2.1.4 Calibration of the Waveguide String Model
2.1.5 Excitation Signal of the Waveguide String Model
2.1.6 Discussion
2.2 Waveguide Modeling of Cylindrical Tubes
2.2.1 The One-Dimensional Wave Equation
2.2.2 A Model Composed of Coupled Uniform Tube Sections
2.2.3 KL Scattering Junction for Volume Velocity Waves
2.2.4 KL Scattering Junction for Pressure Waves
2.2.5 Boundary Conditions of a Cylindrical Tube
2.2.6 General Properties of the Cylindrical Tube Model
2.2.7 Sparse Tube Model
2.2.8 Extensions to the Basic KL Model
2.3 Junction of Three Acoustic Tubes
2.3.1 Three-Port Scattering Junction for Volume Velocity
2.3.2 Three-Port Scattering Junction for Sound Pressure
2.3.3 Side Branch in a Uniform Tube
2.4 Waveguide Modeling of Conical Tubes
2.4.1 Traveling Wave Solution for Spherical Waves
2.4.2 Junction of Conical Tube Sections
2.4.3 Reflection Function with Taper Discontinuity Only
2.4.4 Closed and Open End of a Truncated Conical Tube
2.4.5 Discrete-Time Reflection and Transmission Filters
2.4.6 Stability of the Reflection Filter
2.4.7 Modeling the Ends of a Conical Tube
2.4.8 Conclusions and Discussion
2.5 Waveguide Models for Wind Instruments
2.5.1 Need for Finger-Hole Models
2.5.2 Modeling of a Single Woodwind Finger Hole
2.5.3 Discrete-Time Scattering Junction
2.6 Relation to Other Methods
2.6.1 Comparison of Waveguide Filters and Wave Digital Filters
2.6.2 Reverberation and Simulation of Room Acoustics
2.7 Conclusions
3 Fractional Delay Filters
3.1 Ideal Fractional Delay
3.1.1 Continuous-Time System for Arbitrary Delay
3.1.2 Discrete-Time System for Arbitrary Delay
3.1.3 Fractional Delay and Shannon Reconstruction
3.1.4 Characteristics of the Ideal Fractional Delay Element
3.1.5 Conclusion and Discussion
3.2 Design Methods for Fractional Delay FIR Filters
3.2.1 Least Squared Integral Error Design
3.2.2 LS Design with Reduced Bandwidth
3.2.3 Windowing the Ideal Impulse Response
3.2.4 General Least Squares FIR Approximation of a Complex Frequency Response
3.2.5 Minimax Design of FIR FD Filters
3.2.6 Other Methods for the Design of FIR FD Filters
3.3 Maximally Flat FD FIR Filter: Lagrange Interpolation
3.3.1 Derivation of the Maximally Flat Fractional Delay FIR Filter
3.3.2 Classical Approach
3.3.3 Symmetry of the Lagrange Interpolation Coefficients
3.3.4 Computation of Coefficients for a Lagrange Interpolator
3.3.5 Windowing Approach to Lagrange Interpolation
3.3.6 Approximation Errors of Lagrange Interpolation
3.3.7 Farrow Structure of Lagrange Interpolation
3.3.8 Related Polynomial Interpolation Techniques
3.3.9 Conclusion and Discussion
3.4 Design Methods for Fractional Delay Allpass Filters
3.4.1 Properties of the Discrete-Time Allpass Filter
3.4.2 Design of Fractional Delay Allpass Filters
3.4.3 Maximally Flat Group Delay Design of FD Allpass Filters
3.4.4 Choice of Optimal Range for D in the Thiran Interpolator
3.4.5 Discussion
3.5 Time-Varying Fractional Delay Filters
3.5.1 Consequences of Changing Filter Coefficients
3.5.2 Implementation of a Variable Delay Using FIR Filters
3.5.3 Transients in Time-Varying Recursive FD Filters
3.5.4 Transient Elimination Methods for Time-Varying Recursive Filters
3.5.5 A Novel and Efficient Method for Elimination of Transients
3.5.6 Examples with a Time-Varying Thiran Allpass Filter
3.6 Conclusions

4 Fractional Delay Waveguide Filters
4.1 Deinterpolation
4.1.1 Introduction to Deinterpolation
4.1.2 Deinterpolation Versus Interpolation
4.1.3 Implementing Deinterpolation Using an FIR Filter
4.1.4 Adding a Signal to a Fractional Point of a Delay Line Using an FIR Deinterpolator
4.1.5 Discussion
4.2 Fractional Delay Waveguide Systems
4.2.1 Variable-Length Digital Waveguide
4.2.2 Connecting Waveguides at Arbitrary Points
4.3 Interpolated Waveguide Tube Model
4.3.1 Interpolated Scattering Junction
4.3.2 Implementing the FD Scattering Junction Using FIR Filters
4.3.3 Transfer Functions of FIR Interpolators and Deinterpolators
4.3.4 Choice of FIR FD Filter Design
4.3.5 Reflection and Transmission Functions
4.3.6 Analysis of Errors Due to FD Approximation
4.3.7 Simulation of a Two-Tube System Using FIR FD Filters
4.3.8 Conclusion
4.4 Interpolated Finger-Hole Model
4.4.1 FIR-Type Fractional-Delay Finger-Hole Model
4.4.2 Approximation Error Due to FD Filters
4.4.3 Implementation Issues
4.5 Fractional Delay Operations with Allpass Filters
4.5.1 Implementing a Variable Delay Line Using an Allpass Filter
4.5.2 Implementing Interpolation and Deinterpolation with Allpass Filters
4.6 Interpolated Waveguide Model: Allpass Filter Approach
4.6.1 Derivation of the Allpass Fractional Delay Junction
4.6.2 Approximation Errors due to an Allpass Fractional Delay Junction
4.6.3 Simulation of a Two-Tube Model with Allpass Filters
4.6.4 Alternative Allpass Filter Structure
4.7 Allpass Fractional Delay Finger-Hole Model
4.8 Conclusions
5 Conclusions and Further Research
5.1 Summary and Conclusions
5.2 Suggestions for Future Work
References
Appendix A. Bandlimited Squared Integral Error of FIR FD Filters
Appendix B. Effective Length of the Infinite Impulse Response
Index

List of Symbols
$a_k$                   allpass filter coefficients
                        vector of allpass filter coefficients
                        cross-sectional area
$A(z)$                  allpass transfer function
                        area ratio
$c$                     speed of sound
                        vector of transfer functions or of cosine functions
                        scaling coefficient
$C(z)$                  transfer function of a subfilter in the Farrow structure
$d$                     fractional part of the total delay $D$
$D$                     total delay to be approximated
$D(z)$                  denominator of $A(z)$
                        vector of complex exponentials
$E(e^{j\omega})$        frequency-domain error function
$f$                     frequency variable
$f_0$                   fundamental frequency
$f_N$                   Nyquist frequency ($f_N = f_s/2$)
$f_s$                   sampling frequency
                        square matrix
                        coefficient vector
$G(z)$                  transfer function including the approximation error
$h(n)$                  impulse response of an FIR filter
$h_{\mathrm{id}}(n)$    impulse response of the ideal interpolator
$h_r(n)$                impulse response of the recursive part of an IIR filter
                        vector of FIR filter coefficients
$H(z)$                  transfer function of a discrete-time filter
$H_{\mathrm{id}}(z)$    the ideal (desired) transfer function
$j$                     imaginary unit ($j = \sqrt{-1}$)
$k$                     wave number ($k = \omega/c$) or index variable
$k_n$                   slope of a discrete-time sequence at time index $n$
                        length in meters
                        length in meters or in samples (e.g., FIR filter length)
                        index variable
                        integer constant
$n$                     discrete time index
$n_c$                   time index of change in time-varying filters
$N$                     order of a digital filter or an integer constant
$N_a$                   advance time (in samples) in the transient elimination method
                        sound pressure
                        volume velocity
                        reflection coefficient or radius
                        column vector (e.g., a column of matrix $\mathbf{Q}$)
$\mathbf{Q}$            inverse matrix of Vandermonde matrix $\mathbf{U}$
$R(\omega)$             reflection function
                        vector of sine functions
$t$                     time variable
$t_k$                   transmission coefficient of the $k$th branch in a junction of waveguides
$T$                     sampling interval ($T = 1/f_s$)
$T(z)$                  transmission function
                        transformation matrix
$\mathbf{U}$            Vandermonde matrix
$v_k(n)$                state variables of a discrete-time filter
                        column vector
$\mathbf{v}(n)$         state variable vector
                        Vandermonde matrix
$w(n)$                  discrete-time sequence (e.g., a window function)
$W(z)$                  transfer function of $w(n)$
$W(\omega)$             frequency response (e.g., a frequency-domain weighting function)
$x(n)$                  input sequence of a discrete-time system
$x_c(t)$                input signal of a continuous-time system
$X$                     spatial sampling interval ($X = cT$)
$X(z)$                  z-transform of the input sequence $x(n)$
$y(n)$                  output sequence of a discrete-time system
$y_c(t)$                output signal of a continuous-time system
                        acoustic admittance
$Y(z)$                  z-transform of the output sequence $y(n)$
$z$                     z-transform variable
                        column vector of powers of $z^{-1}$
                        acoustic impedance

                        real parameter (e.g., passband width parameter)
                        phase error function in allpass filter design
                        complementary fractional delay (equal to $1 - d$)
$\delta(k)$             Kronecker delta function
                        complementary total delay (equal to $N - D$)
                        mass density of a string
                        phase response
                        phase response of the ideal interpolator
                        delay
                        group delay
                        phase delay
                        density of air
$\omega$                angular frequency
$\omega_s$              sampling frequency in radians ($\omega_s = 2\pi f_s$)
                        analog angular frequency (equal to $2\pi f$)

List of Abbreviations
CFD     complementary fractional delay
DF      direct form
DFT     discrete Fourier transform
DSP     digital signal processing
DTFT    discrete-time Fourier transform
FD      fractional delay
FDWF    fractional delay waveguide filter
FIR     finite impulse response
FM      frequency modulation
GLS     general least squares
IIR     infinite impulse response
IR      impulse response
KL      Kelly–Lochbaum
KS      Karplus–Strong
LS      least squares
MF      maximally flat
PPI     piecewise parabolic interpolation
SISO    single-input-single-output
STFT    short-time Fourier transform
TLI     tangent-line interpolation
WDF     wave digital filter
WGF     waveguide filter
ZZ      Zetterberg–Zhang

1 Introduction
The theme of this work is computational modeling of acoustic tubes. The models are
intended for use in sound synthesizers based on physical modeling. Such synthesizers
can be used for producing realistic sounds of woodwind instruments or the human
voice.
The technique used throughout this work is called digital waveguide modeling. The
term digital waveguide was proposed by Smith (CCRMA, Stanford University) who
has had a prominent role in the development of the theory of physically based sound
synthesis (Smith, 1987, 1992b, 1995a). The digital waveguide synthesis technique is
well suited to modeling of one-dimensional resonators, such as a narrow acoustic tube,
a vibrating string, or a thin bar.

1.1 Historical Notes


Methods similar to digital waveguide modeling have been used in digital sound synthesis for more than 30 years (Kelly and Lochbaum, 1962; Flanagan, 1972, pp. 272–276).
They were first applied to articulatory speech synthesis which means production of
human-like sounds by emulating the geometry and movements of articulators, such as
the tongue, the jaws, and the larynx. Nevertheless, another technique, referred to as formant synthesis, has become more popular in commercial speech synthesizers. For
speech research, the articulatory approach is more attractive, since it is closely related to
the physiology and physics of speech production.
At the end of the 1970s and in the early 1980s, related techniques were used for modeling string instruments, especially the violin (McIntyre and Woodhouse, 1979; Smith,
1983). The theory of waveguide modeling began to develop after Karplus and Strong
(1983) introduced their simple algorithm for synthesis of string instrument sounds.
Their method, which computer music people have named the Karplus–Strong algorithm, was found to be a simplified version of a more elaborate physical model for a
vibrating string (Smith, 1983).
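As a rough illustration of the recirculating delay line behind the Karplus–Strong algorithm, the following Python sketch fills a delay line with noise and feeds back the average of two successive delayed samples. The function name, the default sampling rate, and the noise seed are assumptions made for this example; they are not taken from the original publications.

```python
import random


def karplus_strong(frequency_hz, duration_s, sample_rate=44100, seed=0):
    """Sketch of the basic Karplus-Strong loop: y(n) = 0.5 * (y(n-P) + y(n-P-1))."""
    rng = random.Random(seed)
    # Delay-line length P in samples; rounding to an integer limits the
    # available pitches, which is where fractional delay filters come in.
    period = int(round(sample_rate / frequency_hz))
    # Initial excitation: one period of white noise in [-1, 1].
    line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    previous = 0.0  # holds y(n - P - 1)
    for n in range(int(duration_s * sample_rate)):
        i = n % period
        current = line[i]  # y(n - P), read from the circular buffer
        out.append(current)
        # Two-point average acts as a gentle lowpass filter in the loop.
        line[i] = 0.5 * (current + previous)
        previous = current
    return out
```

Calling `karplus_strong(441.0, 0.5)` yields half a second of a decaying, string-like tone whose high harmonics die out faster than the low ones, because the averaging attenuates high frequencies on every pass through the loop.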
Recently, physical modeling of musical instruments has become one of the most
active areas of musical acoustics research. Some scientists work in this field primarily
to understand the working principles of musical instruments (e.g., McIntyre and
Woodhouse, 1979) while others aim at developing a model-based sound synthesizer
(e.g., Jaffe and Smith, 1983; Karjalainen and Laine, 1991; Smith, 1992b, 1995b).
The methods for physical modeling can be divided into five categories (see, e.g.,
Välimäki and Takala, 1995):
1) source-filter modeling,
2) numerical solving of partial differential equations,
3) vibrating mass-spring networks,
4) modal synthesis, and
5) waveguide synthesis.

These approaches are briefly reviewed in the following.


The source-filter method is included here because it can, in some cases, be interpreted as a physical modeling technique. When the system is divided into the
excitation and resonator parts according to its physics, there is no argument about
this being a physical model. An example is the separated speech production model
where the volume velocity waveform generated by the vibrating vocal folds acts as a
source and the vocal tract is the filter. This approach has been used for synthesizing
speech (Fant, 1960) and singing (see, e.g., Rodet, 1984 or Rodet et al., 1984). In some
musical instruments it is also easy to separate the source and the filter. For example, in
many percussion instruments, the excitation is a short impulse that is not affected by the
feedback from the resonator. However, most musical instruments are more complicated
systems than just a combination of two subsystems. For instance, in wind instruments,
feedback from the resonator to the excitor is required, and in addition the interaction of
the excitation mechanism and the feedback signal is nonlinear.
The first attempts to use a physical model for generating sounds of musical instruments were made by Hiller and Ruiz (1971). They started with the differential equation
that governs the vibrations of a string and approximated this equation with finite differences. This technique is computationally very intensive. Even today, real-time sound
synthesis with this approach using affordable hardware is out of the question. This line
of physical modeling has been continued by Chaigne, Askenfelt, and Jansson (1990).
(See also Chaigne, 1992; Chaigne and Askenfelt, 1994.)
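To make the finite-difference idea concrete, the sketch below applies the standard explicit central-difference scheme to the lossless wave equation of an ideal string with fixed ends. It is a textbook scheme under simplifying assumptions (no losses, no stiffness, zero initial velocity), not a reconstruction of the model of Hiller and Ruiz; the function names and the `courant` parameter (the ratio of the distance traveled per time step to the spatial step) are introduced here for illustration only.

```python
def triangular_pluck(num_points, pluck_index, amplitude=1.0):
    """Initial string shape: a triangle peaking at pluck_index, zero at both ends."""
    last = num_points - 1
    shape = []
    for i in range(num_points):
        if i <= pluck_index:
            shape.append(amplitude * i / pluck_index)
        else:
            shape.append(amplitude * (last - i) / (last - pluck_index))
    return shape


def simulate_string(initial, num_steps, courant=1.0):
    """Advance an ideal fixed-end string by num_steps time steps.

    initial: displacement at time zero (zero initial velocity is assumed).
    courant: must not exceed 1 for the explicit scheme to remain stable.
    """
    lam2 = courant ** 2
    n = len(initial)
    prev = list(initial)
    prev[0] = prev[-1] = 0.0  # fixed ends
    if num_steps == 0:
        return prev
    # First step: second-order start consistent with zero initial velocity.
    curr = [0.0] * n
    for i in range(1, n - 1):
        curr[i] = prev[i] + 0.5 * lam2 * (prev[i + 1] - 2.0 * prev[i] + prev[i - 1])
    # Remaining steps: central differences in both time and space.
    for _ in range(num_steps - 1):
        nxt = [0.0] * n
        for i in range(1, n - 1):
            nxt[i] = (2.0 * curr[i] - prev[i]
                      + lam2 * (curr[i + 1] - 2.0 * curr[i] + curr[i - 1]))
        prev, curr = curr, nxt
    return curr
```

Every time step updates every spatial point, so the cost grows with the product of the spatial resolution and the sampling rate, which is one reason the approach is computationally heavy. With `courant=1.0` the scheme reproduces the traveling-wave solution exactly on the grid, so the string shape repeats after `2 * (num_points - 1)` steps.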
Later in the seventies, another approach to physical modeling was taken in Grenoble,
France. A system called CORDIS simulates a musical instrument as a collection of
point masses that have certain elasticity and frictional characteristics (Cadoz et al.,
1984; Florens and Cadoz, 1991). This approach is closely related to the so-called finite
element method (FEM) that is used in mechanical engineering to simulate the vibration
of structures. The object is divided into a large number of pieces in space; each piece is
connected to its neighbors with springs and microdampers, thus forming networks that
imitate the vibrating object. At the beginning of the eighties, a special-purpose processor
was built which enabled real-time sound synthesis based on a physical model (Cadoz et
al., 1984).
At IRCAM in Paris a third technique for physical modeling was developed (Adrien,
1991). It is called modal synthesis and is based on a representation of a vibrating structure as a collection of frequencies and damping coefficients of resonance modes, and
coordinates that describe the mode shapes. When the instrument is excited at some
point, this force excites some or all of the modes. An advantage of modal synthesis is
that analysis tools exist and they are not too laborious to use. Adrien (1991) points out
that modal analysis of a new object, say a violin body, only takes a couple of days. For
some simpler structures, such as a string, the model data can be computed in an analytical form.
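The modal representation can be sketched for the simplest case mentioned above, an ideal string, where the modal data are available in analytical form. In the following Python example each mode is an exponentially decaying sinusoid, weighted by the mode shape at the excitation point and at the pickup point. The function names, the decay law (higher modes decaying faster), and all parameter values are assumptions for illustration and are not taken from Adrien (1991).

```python
import math


def string_mode_shape(k, x):
    """Shape of mode k of an ideal fixed-end string at normalized position x in [0, 1]."""
    return math.sin(k * math.pi * x)


def modal_impulse_response(f0, num_modes, excite_pos, pickup_pos,
                           duration_s, sample_rate=8000, decay=2.0):
    """Response at pickup_pos to an impulse applied at excite_pos.

    f0: frequency of the lowest mode in Hz; mode k is assumed to lie at k * f0.
    decay: assumed damping constant in 1/s; mode k decays as exp(-decay * k * t).
    """
    num_samples = int(duration_s * sample_rate)
    out = [0.0] * num_samples
    for k in range(1, num_modes + 1):
        # A mode contributes only if it is non-zero at both points.
        weight = string_mode_shape(k, excite_pos) * string_mode_shape(k, pickup_pos)
        freq = k * f0
        sigma = decay * k
        for n in range(num_samples):
            t = n / sample_rate
            out[n] += weight * math.exp(-sigma * t) * math.sin(2.0 * math.pi * freq * t)
    return out
```

Exciting the string at its midpoint (`excite_pos=0.5`) gives the even-numbered modes an essentially zero weight, which illustrates how the excitation point determines which of the modes are excited.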
In modal synthesis, one of the problems to be solved is where to take the output of
the model. At IRCAM a clever solution was found using the body of a real instrument
as the loudspeaker: in the case of the violin, for example, this means that the simulated
vibration of strings is fed to electrical shakers that are attached to a violin body, the
strings of which have been carefully damped (Adrien, 1991). This approach has the
very nice advantage that the radiation properties of the instrument need not be simulated. In virtual-reality applications, however, this practical trick is not applicable, since
all the vibrating structures must create numerical data to be used by other parts of the
virtual reality system. Recently, a new implementation of modal synthesis, called
Modalys, has been developed at IRCAM (Eckel et al., 1995).
In the first half of this decade, waveguide synthesis (or waveguide modeling) has
turned out to be the most important of all the physical modeling methods (Smith,
1995b). This applies to both the academic and industrial communities: a majority of
recent advances in the theory of physical modeling have concerned digital waveguide
techniques, and all existing commercial physical modeling synthesizers utilize digital
waveguides. This methodology is also used in this work and is reviewed in detail in
Chapter 2.
The first commercial synthesizer to employ a physical model was the Yamaha VL-1
Virtual Acoustic Synthesizer that was introduced early in 1994. It was based on digital
waveguide modeling techniques. Recently, other electronic instruments following the
same guidelines have been released, and more are likely to appear in the future. Physical
modeling is considered the most important novel digital sound synthesis technique of
the past decade. FM synthesis, the previous revolutionary technique, was introduced in
1973, and the first commercial product to employ it, the Yamaha DX-7, was introduced in 1983. Nowadays, FM synthesizers, along with sampling keyboards, are the
most widespread electronic musical instruments. Model-based synthesizers are bound to
become popular because they sound and behave as acoustic instruments and thus offer
more possibilities of expression than the earlier devices (Jaffe, 1995). A fascinating
future prospect is to build hybrid synthesizers employing physical modeling combined
with sampling and filtering techniques (Smith, 1995b).
Physical modeling is interesting also for research purposes. A physical model
enables independent adjustment of individual parameters of an instrument, such as the
mass or stiffness of a vibrating reed. This is typically very difficult with an acoustic
instrument. Thus a model can help to more clearly understand how an instrument can
and should be played and calibrated.
With a highly sophisticated model, it would be possible to design new acoustic
musical instruments, and these could be played before they are actually built.
Modification of existing instruments could also be tried out using such a system. Today
this is still a dream, since modern physical models that produce sound in real time are
greatly simplified.
The major strategy in the development of current waveguide synthesis models has
been to include only the most crucial mechanisms of the instrument and eliminate the
less significant ones. Properties of the human auditory system need to be considered,
since it is the human ear that finally judges whether the synthetic sound is satisfactory.
The rule of thumb is that those features of the instrument that do not bring any audible
improvement to the sound are discarded. Learning to do this also means that one has
understood the essential factors that make a musical instrument sound the way it does.

1.2 Overview of the Work


In this work, the objective is further development of waveguide models of acoustic
tubes. Models of tube systems that are constructed of sections of cylindrical or conical
tubes are treated. Consequently, the application that this study mainly aims at is model-based sound synthesis of woodwind instruments and speech. Some of the methods can
as well be applied to waveguide models of brass instruments, strings, or bars.
In Chapter 2, the basic theory of digital waveguide modeling is summarized. The
principles of modeling vibrating strings and tubes are discussed. Since waveguide modeling of strings is already quite well established, the theory of a complete synthesis
model and an analysis technique are briefly described first. Thereafter, the classical
lossless tube model constructed of concatenated tube sections of different diameters is
examined. A step further is taken by considering modeling of a junction of three tubes
and also a junction of two piecewise conical tubes.
At the end of Chapter 2, relations of waveguide synthesis to other signal processing
techniques, especially to wave digital filters, are examined and former waveguide
models for synthesis of speech and musical instruments are reviewed.
Chapter 3 deals with fractional delay filters that are important building blocks of
waveguide models. Namely, the main task in waveguide models is to process time
delays that it takes for a sound signal to propagate from one point of the system to
another, e.g., the propagation time from one end of a tube to another. In digital systems,
signals are represented as samples or sequences of numbers. Thus a delay is simply
generated by storing the numbers in the computer's memory and retrieving them after
the desired time. However, in uniformly sampled digital signal processing systems the
delay time is always a multiple of the shortest possible time, called the unit delay. This
delay is equal to the sample interval that is determined by the sampling rate of the system.
It is quickly discovered that time delays needed in a physical model are in general
not multiples of any finite unit, but they can take any real value. For this reason, the
concept of fractional delay is indispensable and is used for correcting the difference
between the desired time delay and the nearest multiple of the sample interval. In mathematical terms, a fractional delay is produced by using interpolation.
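As a rough illustration of this idea (and not one of the filter designs reviewed in Chapter 3), a non-integer delay can be approximated by linear interpolation between the two nearest stored samples. The function name `delayed_sample` and the test signal are hypothetical choices made for this sketch.

```python
import math

def delayed_sample(x, n, delay):
    """Estimate the value of x at the non-integer time index n - delay.

    Linear interpolation between the two nearest stored samples is used.
    Samples outside the stored signal are taken to be zero.
    """
    pos = n - delay                 # real-valued read position
    i = math.floor(pos)             # integer part of the read position
    frac = pos - i                  # fractional part, 0 <= frac < 1
    x0 = x[i] if 0 <= i < len(x) else 0.0
    x1 = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
    return (1.0 - frac) * x0 + frac * x1

# A slowly varying sinusoid, read with a delay of 2.3 samples.
omega = 0.05
x = [math.sin(omega * n) for n in range(200)]
n = 100
approx = delayed_sample(x, n, 2.3)
exact = math.sin(omega * (n - 2.3))
print(approx, exact)
```

For a low-frequency signal such as this one the interpolated value is close to the exact one; the error grows with frequency, which is one reason why better interpolating filters are of interest.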
Chapter 3 reviews several methods for designing digital interpolating filters. A
choice that commonly has to be made in digital signal processing problems is whether
to use recursive (IIR) or nonrecursive (FIR) digital filters. In Chapter 3 both approaches
are reviewed, and the message to a DSP engineer is that in both cases the maximally flat
error criterion yields an interpolating filter suitable for audio signal processing. In the
case of recursive filters, an allpass filter with a maximally flat group delay approximation is a good choice while in the case of nonrecursive filters, a maximally flat frequency response approximation (taking into account both magnitude and phase) is preferred. The argument here is that the approximation error due to maximally flat approximation can be forced to be small at low frequencies which are in many audio applications perceptually more important than details at high frequencies.
In most fractional delay systems the delay must be changed during operation. For
this reason it is important to consider time-varying fractional delay filters. Time-varying
filters are not as simple to use as time-invariant (fixed) filters. Especially with recursive
digital filters, transients will appear every time the coefficients of the filter are changed.

At the end of Chapter 3, a method to suppress these transients caused by changing the
coefficients is introduced.
Chapter 4 discusses the possibilities that the use of fractional delays offers. In short,
fractional delays are needed for two purposes in waveguide models. First, they are used
for adjusting the total length of the waveguide system. In musical instrument models this is
equivalent to tuning. The fundamental frequency of a one-dimensional resonator is
determined by the time it takes for a sound signal to travel from one end of it to the
other. Obviously, this delay has to be very precisely modeled. Otherwise the waveguide
synthesizer sounds out of tune. In vocal tract models, the length of the digital waveguide determines the formant frequencies.
Second, fractional delays are required for simulation of waveguides where some
change in the medium takes place at an arbitrary point along the waveguide. For example, in a model of the human vocal tract the different vowels are generated by changing
its profile. This is achieved by moving the tongue, jaws or other articulators. Hence the
exact location where the change takes place is very significant because it determines
how the voice sounds.
Chapter 4 first presents a new signal processing technique, deinterpolation. Using
both interpolation and deinterpolation operations, a vocal tract model can simulate a
change of profile at any given point. A new feature of the proposed model is that,
thanks to fractional delays, it is possible to independently vary the length of each tube
section.
Furthermore, tube systems with side branches are discussed. A practical example of
an acoustic tube system with a side branch is the human vocal tract with both oral and
nasal tracts included. Another example is a finger hole in the side of a woodwind bore.
These systems can be modeled with digital waveguides employing fractional delay filters. Structures for both FIR and IIR fractional delay waveguide filters are presented.
Errors due to the fractional delay approximation are examined.
Finally, Chapter 5 summarizes the results of this work and provides suggestions for
future research.

1.3 Contributions of the Author


The principal innovation in this work is the interpolated scattering junction described in
Chapter 4. The fractional delay filters are combined with the waveguide filter structure
so that a new kind of discrete-time filter structure, that has been called a fractional
delay waveguide filter, is formed. The author had an essential role in the development
of this framework.
The author is responsible for the definition of the inverse operation to interpolation,
which was named deinterpolation in January 1993. This new concept simplifies the
discussion of fractional delay waveguide systems. Especially in FIR fractional delay
waveguide filters, deinterpolation helps in the systematic construction of new structures.
However, its usefulness in other fields of signal processing still remains to be examined.
More new results developed by the author follow directly from the previous ones:
several novel filter structures involving FIR and IIR-type fractional delay filters are
described in Chapter 4. They all are physically meaningful since they are discrete-time
simulations of a junction of two or three acoustic tubes. The acoustic tubes considered
in this work are either cylindrical or conical.

In Chapter 2 of this work, modeling of vibrating strings is also discussed as an introduction to the topic of waveguide synthesis. The author has contributed to the development and implementation of the string models. He also developed and programmed the
analysis methods described in Sections 2.1.4 and 2.1.5.
The design methods for fractional delay filters described in Chapter 3 were first
described in a report co-authored by this author (Laakso et al., 1994). The author has
made considerable contributions to this work describing the maximally flat design
techniques for both FIR and allpass filters (see Sections 3.3 and 3.4 of this work). He
also developed the Farrow structure for Lagrange interpolation (see Section 3.3.7).
Furthermore, the author has invented a new technique for minimizing the transients
caused by time-varying filter coefficients in recursive filters (Section 3.5.5). This problem is relevant not only to this work but arises in every DSP application where filter
coefficients need to be changed during the filtering operation. Here this technique is
utilized with allpass fractional delay filters, but the same theory and algorithm are
applicable to any recursive filter. The determination of the parameter Na (the advance
time of the transient eliminator), based on the cumulated energy of the recursive filter's
impulse response, is a result of cooperation with Prof. Laakso. He initiated this work
but the author derived many of the results.
To conclude, in this work the author presents a new framework for simulation of
physical systems. His main message is that it is not necessary to suffer from discretization of time delays inside a discrete-time system, such as a digital waveguide model,
although its input and output signals are inevitably discretized in time. Consequently,
discretization of the spatial variable in simulation of wave propagation is not obligatory.
This limitation can be overcome using interpolation techniques described in this work.

1.4 Related Publications


Parts of the following publications, to which the author has contributed, have been used
in this work:
Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine, "Real-time implementation techniques for a continuously variable digital delay in modeling musical instruments," in Proc. Int. Computer Music Conf. (ICMC'92), San Jose, California, October 14-18, 1992, pp. 140-141.
Vesa Välimäki, Matti Karjalainen, and Timo I. Laakso, "Fractional delay digital filters," in Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS'93), Chicago, Illinois, May 3-6, 1993, vol. 1, pp. 355-358.
Matti Karjalainen and Vesa Välimäki, "Model-based analysis/synthesis of the acoustic guitar," in Proc. Stockholm Music Acoustics Conf. (SMAC'93), pp. 443-447, Stockholm, Sweden, July 28-August 1, 1993. Published in October 1994. Sound examples are included on the SMAC'93 CD.
Vesa Välimäki, Matti Karjalainen, and Timo I. Laakso, "Modeling of woodwind bores with finger holes," in Proc. Int. Computer Music Conf. (ICMC'93), Tokyo, Japan, September 10-15, 1993, pp. 32-39.

Vesa Välimäki, Matti Karjalainen, and Timo Kuisma, "Articulatory control of a vocal tract model based on fractional delay waveguide filters," in Proc. IEEE Int. Symp. Speech, Image Processing and Neural Networks (ISSIPNN'94), Hong Kong, April 13-16, 1994, vol. 2, pp. 571-574.
Vesa Välimäki, Matti Karjalainen, and Timo Kuisma, "Articulatory speech synthesis based on fractional delay waveguide filters," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'94), Adelaide, Australia, April 19-22, 1994, vol. 1, pp. 585-588.
Vesa Välimäki and Matti Karjalainen, "Digital waveguide modeling of wind instrument bores constructed of truncated cones," in Proc. Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12-17, 1994, pp. 423-430.
Vesa Välimäki and Matti Karjalainen, "Improving the Kelly-Lochbaum vocal tract model using conical tube sections and fractional delay filtering techniques," in Proc. 1994 Int. Conf. Spoken Language Processing (ICSLP'94), Yokohama, Japan, September 18-22, 1994, vol. 2, pp. 615-618.
Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine, Crushing the Delay - Tools for Fractional Delay Filter Design. Report no. 35, Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, 46 pages + figures + 2 tables, October 1994. A revised version entitled "Splitting the unit delay - tools for fractional delay filter design" has been accepted for publication in the IEEE Signal Processing Magazine, vol. 13, no. 1, January 1996.
Vesa Välimäki, Jyri Huopaniemi, Matti Karjalainen, and Zoltán Jánosy, "Physical modeling of plucked string instruments with application to real-time sound synthesis," presented at the 98th AES Int. Convention 1995, preprint no. 3956, 52 p., Paris, France, February 25-28, 1995. A revised version has been submitted to the Journal of the Audio Engineering Society, July 1995.
Vesa Välimäki, "A new filter implementation strategy for Lagrange interpolation," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS'95), Seattle, Washington, April 29-May 3, 1995, vol. 1, pp. 361-364.
Vesa Välimäki and Matti Karjalainen, "Implementation of fractional delay waveguide models using allpass filters," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP'95), Detroit, Michigan, May 9-12, 1995, vol. 2, pp. 1524-1527.
Vesa Välimäki, "Fractional delay waveguide filters for modeling acoustic tubes," in Proc. Second Int. Conf. Acoustics and Music Research (CIARM'95), Ferrara, Italy, May 19-21, 1995, pp. 41-46.
Vesa Välimäki, Timo I. Laakso, and Jonathan Mackenzie, "Elimination of transients in time-varying allpass fractional delay filters with application to digital waveguide modeling," in Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3-7, 1995, pp. 327-334.

Vesa Välimäki and Tapio Takala, "Virtual musical instruments - natural sound with physical models," submitted to Organised Sound, vol. 1, no. 1, April 1996 (special issue on sounds and sources).
Parts of the present work have been published in the same or slightly modified form in
Vesa Välimäki, Fractional Delay Waveguide Modeling of Acoustic Tubes. Report no. 34, Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, 133 pages, July 1994.

2 Digital Waveguide Modeling


The simulation technique used in this work is called digital waveguide modeling. This
methodology has recently gained much interest in musical acoustics and computer
music, since it is suitable for developing physical models of musical instruments that
can be efficiently implemented. This chapter reviews the principles of waveguide modeling and discusses related approaches.
The aim in digital waveguide modeling is to design a discrete-time model that
behaves like a physical system. The theory of digital waveguide modeling has been
primarily developed by Smith (Smith, 1987, 1992b, 1993a, 1995a). This technique is
especially well suited to simulation of one-dimensional resonators, such as a vibrating
string, a narrow acoustic tube, or a thin bar. The method was, however, first applied to
artificial reverberation using delay line networks (Smith, 1985) and only after that to the
synthesis of wind and string instruments (Smith, 1986).
A vibrating string is one of the traditional topics of research in musical acoustics, and
waveguide modeling has been applied to this case. In the following we first examine the
foundations of waveguide modeling using the string as an example. The motivation for
this is that the waveguide string models are nowadays highly developed: both synthesis
and analysis algorithms have been studied. The simplest model for a vibrating string, the
Karplus-Strong algorithm, and its extensions are considered. A large part of this
chapter concentrates on waveguide modeling of acoustic tubes which are the main
application area of this work. The final sections discuss relations of waveguide modeling to wave digital filters and to filter structures that are used in digital reverberation.

2.1 Waveguide Modeling of Strings


The most important single idea in waveguide modeling is that approximate traveling
wave solutions of wave equations encountered in musical instruments can be used for
time-domain simulation.
2.1.1 Wave Equation and the Traveling Wave Solution
The following exposition has been adapted from Smith (1992b). The one-dimensional
wave equation for an ideal vibrating string is given by

$$K \frac{\partial^2 y}{\partial x^2} = \varepsilon \frac{\partial^2 y}{\partial t^2} \tag{2.1}$$

where t is time, x is distance, y(t, x) is the string displacement, K is the string tension, and $\varepsilon$ is the mass density of the string. The general solution of the wave equation was published by d'Alembert in 1747 and is given by

$$y(t, x) = y^{+}(x - ct) + y^{-}(x + ct) \tag{2.2}$$

where $y^{+}(x - ct)$ and $y^{-}(x + ct)$ are two arbitrary twice-differentiable functions that represent waves traveling in opposite directions with speed $c = \sqrt{K/\varepsilon}$.

Fig. 2.1 A digital waveguide consists of two delay lines where the signal components travel in opposite directions. The amplitude of the signal at any point (n, m) is obtained by summing the two components.

When the traveling waveforms are bandlimited lowpass signals, it is straightforward
to sample the traveling wave solution of Eq. (2.2). This yields

$$y(n, m) = y^{+}(n, m) + y^{-}(n, m) \tag{2.3}$$

Here we have discretized both the temporal variable t and the spatial variable x by substituting

$$\begin{aligned} t &= nT \quad \text{for } n = 0, 1, 2, \ldots \\ x &= mX \quad \text{for } m = 0, 1, 2, \ldots \end{aligned} \tag{2.4}$$

where T is the sampling interval, i.e., the inverse of the sampling rate, and X is the spatial sampling interval, which is related to T as

$$X = cT \tag{2.5}$$

where c is the speed of sound (c ≈ 340 m/s).


Equation (2.3) can be interpreted as a bidirectional delay line where a fixed sampled
waveform $y^{+}(n, m)$ travels in the positive direction and $y^{-}(n, m)$ in the negative direction, as illustrated in Fig. 2.1. The amplitude of the signal at any point is obtained by
summing the values of the components at that point, as indicated by Eq. (2.3). This
structure is called a digital waveguide. Usually the spatial index m is not
used since it depends on the temporal index n.
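The bidirectional delay line can be sketched in a few lines of Python. This is only an illustrative, lossless structure without terminations; the class name `DigitalWaveguide` and its methods are hypothetical names chosen for the sketch.

```python
from collections import deque

class DigitalWaveguide:
    """Two delay lines carrying the right- and left-going wave components."""

    def __init__(self, length):
        self.right = deque([0.0] * length)   # y+ travels toward larger m
        self.left = deque([0.0] * length)    # y- travels toward smaller m

    def step(self, in_left_end=0.0, in_right_end=0.0):
        """Advance both lines by one sample.

        Returns the samples that leave the left end and the right end.
        """
        out_right_end = self.right.pop()       # y+ sample leaving at the right end
        out_left_end = self.left.popleft()     # y- sample leaving at the left end
        self.right.appendleft(in_left_end)     # new y+ sample enters at the left end
        self.left.append(in_right_end)         # new y- sample enters at the right end
        return out_left_end, out_right_end

    def amplitude(self, m):
        """Physical signal at spatial index m: the sum of Eq. (2.3)."""
        return self.right[m] + self.left[m]

# Inject a unit pulse at the left end and watch it travel to the right.
wg = DigitalWaveguide(5)
wg.step(in_left_end=1.0)
print([wg.amplitude(m) for m in range(5)])
```

In a complete model the samples leaving each end would be filtered by a reflection function and fed back into the opposite line.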
2.1.2 Incorporating Losses and Dispersion
Above we have considered modeling of a lossless ideal string. This kind of a string is
purely hypothetical but it has several properties in common with a realistic string. Smith
(1990) has pointed out that the wave equation including frequency-dependent losses and
dispersion can also be solved analytically. This results in a modified traveling wave
solution that can be sampled and used for waveguide simulation.
Resistive losses (i.e., yielding terminations of the string, drag of air, and internal
friction of the material) add to the wave equation a term that is proportional to $\partial y / \partial t$,
i.e., the time derivative of displacement. Smith (1990) has shown that distributed losses of this kind can be modeled by a real gain factor associated with each unit delay of
the waveguide. The multipliers between each input and output point can be commuted
and combined, thus preserving the computational efficiency of the waveguide system.
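The effect of commuting the loss multipliers can be checked numerically. The following sketch compares a delay line with a gain g after every unit delay to a plain delay line followed by the single lumped gain g^N. Both function names are hypothetical, and only the frequency-independent case is shown.

```python
def lossy_line_distributed(x, num_delays, g):
    """Pass x through num_delays unit delays, each associated with a gain g."""
    state = [0.0] * num_delays
    out = []
    for xn in x:
        out.append(state[-1])                      # sample leaving the last delay
        # shift by one sample; every stage applies the gain g once
        state = [g * xn] + [g * s for s in state[:-1]]
    return out

def lossy_line_commuted(x, num_delays, g):
    """Same input-output behavior with a single lumped gain g**num_delays."""
    total_gain = g ** num_delays
    delayed = [0.0] * num_delays + [total_gain * xn for xn in x]
    return delayed[:len(x)]

x = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.3, 0.0, 0.0, 0.0]
a = lossy_line_distributed(x, 3, 0.9)
b = lossy_line_commuted(x, 3, 0.9)
print(max(abs(u - v) for u, v in zip(a, b)))
```

The distributed version needs one multiplication per delay element and sample, whereas the commuted version needs only one multiplication per sample for the whole line.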
Frequency-dependent losses, such as those due to the bridge where the strings are
attached, can be modeled by placing a linear digital filter between each unit delay element of the waveguide. These elements may also be commuted and combined so that
between each input-output pair the total loss is modeled correctly. The computational
load of the commuted system may be two or three orders of magnitude less than that of
the straightforward one.
Physical systems are always passive and stable. In waveguide simulation it is of
paramount importance to preserve the passive nature of the real-world system. A
waveguide model includes a feedback loop, and if the loop gain exceeds unity at any
frequency, the system becomes unstable. Hence, digital filters to be included in a digital
waveguide have to be designed so that their magnitude response is equal to or less than
unity at all frequencies. This requirement has to be remembered when any linear system, such as a fractional delay filter, is added to a waveguide system.
2.1.3 Karplus-Strong Algorithm and Its Extensions
Karplus and Strong (1983) proposed a simple algorithm for the synthesis of plucked
string sounds. The method is based on the idea that a wavetable, i.e., a table containing
a sampled waveform of an audio signal, is modified while it is read. This technique was
found (Smith, 1983) to be a special case of a string simulation studied by McIntyre,
Woodhouse, and Schumacher (McIntyre and Woodhouse, 1979; McIntyre et al., 1983),
and it was extended with these ideas in mind by Jaffe and Smith (1983). The Karplus-Strong (KS) algorithm is an important predecessor of current waveguide models
because it established the usefulness of highly simplified cases, and recently it has led to
quite detailed string instrument models (see, e.g., Karjalainen et al., 1993; Karjalainen
and Välimäki, 1993; Smith, 1993; Välimäki et al., 1995a).
The block diagram of the KS algorithm is shown in Fig. 2.2. The system is a recursive comb filter. The delay line (or wavetable) is initialized with white noise. The output of the delay line is fed to a lowpass filter that is called the loop filter. The filtering
result is the output of the system and it is also fed back to the delay line.
This technique does not produce a purely repetitive output signal as does the basic
wavetable synthesis (sampling) technique. Namely, those frequency components of the
random signal that coincide with the resonances of the comb filter will attenuate more slowly
than the other components. Thus, the signal in the delay line progressively turns into a
pseudo-periodic signal with a clearly perceivable fundamental frequency.
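A minimal Python sketch of this recursion is given below. It assumes the two-tap averager of the original KS algorithm as the loop filter; the function name `karplus_strong` and the fixed random seed are choices made only for this example.

```python
import random

def karplus_strong(delay_length, num_samples, seed=0):
    """Generate a plucked-string-like tone with the basic KS recursion."""
    rng = random.Random(seed)
    # The delay line (wavetable) is initialized with white noise.
    line = [rng.uniform(-1.0, 1.0) for _ in range(delay_length)]
    out = []
    prev = 0.0          # previous delay-line output (state of the averager)
    idx = 0             # read/write position in the circular delay line
    for _ in range(num_samples):
        delayed = line[idx]
        y = 0.5 * (delayed + prev)    # loop filter: two-tap averager
        prev = delayed
        line[idx] = y                 # the filtered sample is fed back to the delay line
        idx = (idx + 1) % delay_length
        out.append(y)                 # and it is also the output of the system
    return out

tone = karplus_strong(delay_length=50, num_samples=5000)
early = sum(v * v for v in tone[:500])
late = sum(v * v for v in tone[-500:])
print(early, late)
```

The energy of the signal decreases over time, and the high-frequency components decay fastest, which mimics the behavior of a plucked string.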
In the original KS model, the loop filter $H_l(z)$ is a two-tap averager, which is easy to
implement without multiplications. This simple filter, however, cannot match the frequency-dependent damping of a physical string. We feel that an FIR filter is not suited
to this purpose since its order should be rather high to match the desired characteristics.

Fig. 2.2 The Karplus-Strong algorithm. [Block diagram: delay line, loop filter, output.]

Fig. 2.3 The generalized Karplus-Strong model consists of a string model S(z) cascaded with a plucking point equalizer P(z). [Block diagram: input x(n), pluck point equalizer P(z), string model S(z), output y(n).]

Instead we suggest the use of an IIR lowpass filter for simulating the damping characteristics of a physical string. From a physical point of view, it seems well motivated to
use a second or third-order all-pole filter as a loop filter (Fletcher and Rossing, 1991). It
is meaningful to use the simplest satisfying loop filter since one such filter is needed for
the model of each string. Also, the loop filter is not a static filter but its coefficients
should be changed as a function of the string length and other playing parameters. This
is easiest to achieve using a low-order filter.
Furthermore, it is important to remember that the loop filter's magnitude response
must not exceed unity at any frequency since otherwise the system becomes unstable.
When the magnitude response of the filter is less than unity at all frequencies the resulting sound will gradually attenuate. The timbre will also vary over time if the magnitude
response is not flat, because the harmonics of the signal will attenuate at different rates.
Dispersion can be brought about using a filter with a nonlinear phase response.
We have found that a reasonable approximation for the frequency-dependent damping in a string may be obtained using a first-order all-pole filter

$$H_l(z) = g \frac{1 + a_1}{1 + a_1 z^{-1}} \tag{2.6}$$

where g is the gain of the filter at $\omega = 0$, and $a_1$ is the filter coefficient that determines
the cutoff frequency of the filter. For $H_l(z)$ to be a stable lowpass transfer function, we
require that $-1 < a_1 < 0$. The numerator $1 + a_1$ scales the frequency response (divided
by g) at $\omega = 0$ to unity, thus allowing control of the low-frequency gain using the coefficient g. We require that $g \le 1$.
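The properties of the loop filter of Eq. (2.6) can be verified numerically. The sketch below evaluates its magnitude response on a frequency grid and implements the corresponding difference equation. The function names and the example values g = 0.99 and a1 = -0.1 are assumptions made for illustration only.

```python
import cmath
import math

def loop_filter_response(g, a1, omega):
    """Frequency response of H_l(z) = g (1 + a1) / (1 + a1 z^-1) at omega (rad/sample)."""
    z_inv = cmath.exp(-1j * omega)
    return g * (1.0 + a1) / (1.0 + a1 * z_inv)

def loop_filter(x, g, a1):
    """Time-domain form: y(n) = g (1 + a1) x(n) - a1 y(n - 1)."""
    y = []
    prev = 0.0
    for xn in x:
        prev = g * (1.0 + a1) * xn - a1 * prev
        y.append(prev)
    return y

g, a1 = 0.99, -0.1      # example values only
# Magnitude response from omega = 0 to omega = pi
mags = [abs(loop_filter_response(g, a1, math.pi * k / 256)) for k in range(257)]
step = loop_filter([1.0] * 200, g, a1)
print(mags[0], mags[-1], step[-1])
```

With these values the gain equals g at zero frequency and decreases monotonically toward the Nyquist frequency, so the magnitude response never exceeds unity.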
A comb filter P(z) may be cascaded with the string model to cause the filtering effect
due to the plucking position (Jaffe and Smith, 1983). Its transfer function is

$$P(z) = 1 + z^{-M} R_f(z) \tag{2.7}$$

where $R_f(z)$ is the reflection filter of one end of the string model and M is the delay (in
samples) between the excitation points in the upper and lower lines of the original
waveguide model. In practice $R_f(z)$ in Eq. (2.7) need not be modeled very accurately since
its magnitude response is only slightly less than unity at all frequencies. The
most remarkable difference between the direct and reflected excitation is the inverted
phase, and thus the filter can be replaced by a constant multiplier $r_f = -1 + \varepsilon$, where $\varepsilon$ is
a small nonnegative real number. This modified system, which we call a generalized KS
model, is illustrated in Fig. 2.3.
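With the reflection filter replaced by a constant multiplier, the plucking point equalizer reduces to a feedforward comb filter. The following sketch shows this simplified form; the function names and the value eps = 0.01 are hypothetical choices for the example.

```python
import cmath
import math

def pluck_point_equalizer(x, M, eps=0.01):
    """Apply P(z) = 1 + r_f z^-M with the constant multiplier r_f = -1 + eps."""
    r_f = -1.0 + eps
    return [x[n] + (r_f * x[n - M] if n >= M else 0.0) for n in range(len(x))]

def equalizer_magnitude(M, omega, eps=0.01):
    """Magnitude response of the simplified equalizer at omega (rad/sample)."""
    r_f = -1.0 + eps
    return abs(1.0 + r_f * cmath.exp(-1j * omega * M))

M = 8
impulse = [1.0] + [0.0] * 19
h = pluck_point_equalizer(impulse, M)
notch = equalizer_magnitude(M, 2.0 * math.pi / M)   # first notch above zero frequency
peak = equalizer_magnitude(M, math.pi / M)          # midway between two notches
print(h, notch, peak)
```

The notches of the comb filter lie at multiples of the sampling rate divided by M, which corresponds to the harmonics that are suppressed by plucking at the respective position.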
Fig. 2.4 The block diagram of the generalized Karplus-Strong model S(z) for a vibrating string. The filter F(z) is an FIR or allpass-type fractional delay filter, $L_{\mathrm{Int}}$ is the length of the delay line (in samples), and $H_l(z)$ is the loop filter that accounts for losses and dispersion. The input signal $x_P(n)$ has been filtered with the pluck point equalizer (see Fig. 2.3).

One of the major problems of the basic Karplus-Strong algorithm is that the fundamental frequency of the tone cannot be accurately controlled. The fundamental frequency $f_0$ of the synthetic signal is determined by the length of the delay line, that is

$$f_0 = \frac{f_s}{M + 1/2} \tag{2.8}$$

where $f_s$ is the sampling rate, M is the length of the delay line in samples, and 1/2
refers to the phase delay (or, equivalently, the group delay) of the two-point averaging
filter. It is seen that Eq. (2.8) is a function of the integer-valued variable M. Thus the
fundamental frequency is quantized and an arbitrary pitch cannot be produced.
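The quantization implied by Eq. (2.8) is easy to demonstrate with numbers. The sampling rate of 44.1 kHz and the target pitch of 440 Hz in the sketch below are example values that do not come from the text.

```python
import math

def ks_fundamental(fs, M):
    """Fundamental frequency of the basic KS algorithm, Eq. (2.8)."""
    return fs / (M + 0.5)

fs = 44100.0          # example sampling rate
target = 440.0        # example target pitch

M_ideal = fs / target - 0.5                  # generally not an integer
M_low, M_high = math.floor(M_ideal), math.floor(M_ideal) + 1
f_high = ks_fundamental(fs, M_low)           # nearest attainable pitch above the target
f_low = ks_fundamental(fs, M_high)           # nearest attainable pitch below the target

def cents(f, ref):
    """Pitch deviation of f from ref in cents."""
    return 1200.0 * math.log2(f / ref)

print(M_ideal, f_low, f_high, cents(f_low, target), cents(f_high, target))
```

With these values the two nearest delay-line lengths are M = 99 and M = 100, which give about 443.2 Hz and 438.8 Hz, respectively. Neither matches the target, and the mismatch grows for higher pitches because M becomes smaller.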
The length of the delay line loop, and consequently the fundamental frequency $f_0$ of
the synthetic signal, can be accurately controlled by adding a fractional delay filter in
the feedback loop of the Karplus-Strong algorithm. The block diagram of the string
model S(z) is shown in Fig. 2.4. Jaffe and Smith (1983) introduced a first-order allpass
filter for producing the desired fractional delay. Later, a linear interpolator (Sullivan,
1990) and a third-order Lagrange interpolator (Karjalainen and Laine, 1991) have been
used for this task. Chapter 3 of this work discusses various digital filter approximations
for the fractional delay.
2.1.4 Calibration of the Waveguide String Model
The synthesis system includes three parts that completely determine the character of the
synthetic sound: the string model, the plucking point equalizer, and the input sequence.
This implies that to calibrate the model to some particular instrument one needs to
estimate the values for the length of the delay line L, the coefficients of the loop filter
$H_l(z)$, the delay M and the parameter $r_f$ of the plucking-point equalizer, and the input
signal x(n). In this section, parameter estimation procedures are described that will
extract these values.
The delay length L (in samples) determines the fundamental frequency $f_0$ of the
synthetic signal as follows

$$f_0 = \frac{f_s}{L} \tag{2.9}$$

where $f_s$ is the sample rate. For pitch detection we have used a well-known method
based on the windowed autocorrelation function (Rabiner, 1977). An estimate for the
pitch is obtained by searching for the maximum of this function for each frame. Median
filtering may be used to remove erroneous results from the pitch estimate sequence. We
have successfully used three-point median filtering for this purpose.
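The pitch detection procedure can be sketched as follows. This is a simplified illustration and not the author's implementation: the Hann window, the restriction of the lag search to a given frequency range, and both function names are assumptions made for the example.

```python
import math

def autocorr_pitch(frame, fs, f_min, f_max):
    """Estimate the pitch of one frame from the maximum of its windowed autocorrelation."""
    n = len(frame)
    # Hann window (an assumption; the text does not specify the window)
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    x = [a * b for a, b in zip(frame, w)]
    lag_min = int(fs / f_max)       # shortest period considered
    lag_max = int(fs / f_min)       # longest period considered
    best_lag, best_val = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag

def median3(values):
    """Three-point median filter; the first and last values are left unchanged."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = sorted(values[i - 1:i + 2])[1]
    return out

# Test frame: a 200 Hz tone with three harmonics, sampled at 8 kHz.
fs = 8000.0
frame = [sum((0.5 ** k) * math.sin(2.0 * math.pi * 200.0 * (k + 1) * n / fs)
             for k in range(3)) for n in range(512)]
print(autocorr_pitch(frame, fs, 100.0, 400.0))
print(median3([200.0, 200.0, 400.0, 200.0, 200.0]))
```

The estimate obtained in this way is limited to integer lags. The median filter removes isolated outliers, such as octave errors, from the sequence of frame-wise estimates.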
The pitch of a plucked string tone is usually a function of time, decreasing slightly
after the attack and reaching a steady value within about 0.5 s. The problem then is to
determine the best estimate $\hat{f}_0$ for the fundamental frequency in a perceptual sense. In
practice a good solution is to use the average pitch value after some 500 ms as the
nominal value, since it is important to have a reliable pitch estimate towards the end of
the note. This improves the quality of synthesized tones. When the estimate $\hat{f}_0$ for the
fundamental frequency has been chosen, the effective length L of the delay line can be
computed as

$$L = \frac{f_s}{\hat{f}_0} \tag{2.10}$$


When the loop filter has also been designed, we can subtract its phase delay $\tau_H(\omega)$
(at the estimated fundamental frequency) from L and define the nominal fractional delay
$L_f$ as

$$L_f = L - \tau_H(\omega_0) - \lfloor L - \tau_H(\omega_0) \rfloor \tag{2.11}$$

where $\omega_0 = 2\pi \hat{f}_0$, and $\lfloor \cdot \rfloor$ is the floor function [see Eq. (3.12)]. The delay $L_f$ is used as
the desired delay in the design of the fractional delay filter F(z). In practice the phase
delay of this filter can be expressed as $\tau_f(\omega_0) = M_f + L_f$, where $M_f \in \mathbb{Z}^{+}$ is the integral part of the phase delay that depends on the order of the fractional delay filter and
$L_f$ is the fractional part.
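Equations (2.10) and (2.11) can be evaluated numerically, for example as in the sketch below. It assumes the first-order loop filter of Eq. (2.6) with g > 0 and, so that the phase delay is obtained in samples, expresses the fundamental frequency as a normalized frequency in radians per sample. The function names and the example values are hypothetical.

```python
import cmath
import math

def loop_filter_phase_delay(a1, omega):
    """Phase delay (in samples) of H_l(z) = g (1 + a1) / (1 + a1 z^-1), assuming g > 0."""
    h = (1.0 + a1) / (1.0 + a1 * cmath.exp(-1j * omega))
    return -cmath.phase(h) / omega

def nominal_delays(fs, f0_hat, a1):
    """Return the loop length L, the integer part, and the fractional delay L_f."""
    L = fs / f0_hat                                  # Eq. (2.10)
    omega0 = 2.0 * math.pi * f0_hat / fs             # normalized frequency (assumption)
    remainder = L - loop_filter_phase_delay(a1, omega0)
    integer_part = math.floor(remainder)
    L_f = remainder - integer_part                   # Eq. (2.11)
    return L, integer_part, L_f

L, integer_part, L_f = nominal_delays(fs=44100.0, f0_hat=440.0, a1=-0.1)
print(L, integer_part, L_f)
```

The integer part still includes the integral delay of the fractional delay filter, so the length of the actual delay line is obtained by subtracting that delay from it.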
Next we discuss how to measure the damping of the string as a function of frequency. All losses, including the reflections at the two ends of the string, are incorporated in the loop filter $H_l(z)$. In order to design a loop filter we need to estimate the
damping factors at the harmonic frequencies of the sound signal. This is achieved using
the short-time Fourier transform (STFT) and tracking the amplitude of each harmonic.
The STFT of signal y(n) is a sequence of discrete Fourier transforms (DFT) defined
as

$$Y_m(k) = \sum_{n=0}^{N-1} w(n)\, y(n + mH)\, e^{-j\omega_k n} \quad \text{for } m = 0, 1, 2, \ldots \tag{2.12a}$$

with

$$\omega_k = \frac{2\pi k}{N} \quad \text{for } k = 0, 1, 2, \ldots, N - 1 \tag{2.12b}$$


where N is the length of the DFT, w(n) is a window function, and H is the hop size or
time advance (in samples) per frame. In practice we compute each DFT using the FFT
algorithm. To obtain a suitable compromise between time and frequency resolution, we
use a window length of four times the period length of the signal. The overlap of the
windows is 50% implying that H is 0.5 times the window length. We apply excessive
zero-padding by filling the signal buffer with zeros to reach N = 4096 in order to
increase the resolution in the frequency domain. The spectral peaks corresponding to
harmonics can be found from the magnitude spectrum by first finding the nearest local
minimum at both sides of an assumed maximum. The largest magnitude between these
two local minima is assumed to be the harmonic peak. The estimates for the frequency
and magnitude of the peak are fine-tuned by applying parabolic interpolation around
the local maximum and solving for the value and location of the maximum of this interpolating function. For details, see Smith and Serra (1987) or Serra (1989). The number
of harmonics to be detected is typically $N_h < 20$ (for the acoustic guitar). The STFT
analysis is applied to the portion of the signal that starts before the attack and ends
some 0.5–1 s after the attack.

Footnote: Smith (1993b) has independently developed a very similar analysis method for the waveguide string
model.
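The peak refinement described above can be sketched as follows. This is a minimal illustration for a single frame and a single harmonic, assuming a Hann window and a dB magnitude scale for the parabola. It also replaces the search for the two neighboring local minima with a fixed search band around a guessed frequency; the function name, the band width, and the test signal are hypothetical.

```python
import numpy as np

def harmonic_peak(frame, fs, f_guess, n_fft=4096):
    """Return (frequency in Hz, level in dB) of the largest peak within 25 % of f_guess."""
    windowed = frame * np.hanning(len(frame))
    # rfft zero-pads the windowed frame up to n_fft samples.
    mag_db = 20.0 * np.log10(np.abs(np.fft.rfft(windowed, n_fft)) + 1e-12)
    # Simplification: search a fixed band instead of walking to the local minima.
    lo = max(int(0.75 * f_guess * n_fft / fs), 1)
    hi = min(int(1.25 * f_guess * n_fft / fs) + 1, len(mag_db) - 1)
    k = lo + int(np.argmax(mag_db[lo:hi]))
    # Parabola through the three dB values around the largest bin.
    a, b, c = mag_db[k - 1], mag_db[k], mag_db[k + 1]
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)      # peak location in bins relative to k
    return (k + offset) * fs / n_fft, b - 0.25 * (a - c) * offset

fs, f0 = 44100.0, 446.0                             # hypothetical test tone
n = np.arange(int(round(4 * fs / f0)))              # window of about four periods
frame = np.sin(2.0 * np.pi * f0 * n / fs)
f_est, level_db = harmonic_peak(frame, fs, f_guess=440.0)
```

Applying the function to successive frames (hop size H) and to each harmonic would give the sequences of magnitude and frequency pairs used in the following analysis.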
The spectral peak detection results in a sequence of magnitude and frequency pairs
for each harmonic. The sequence of magnitude values for one harmonic is called the
envelope curve of that harmonic. A straight line is matched to each envelope curve on a
dB scale since ideally the magnitude of every harmonic should decay exponentially,
i.e., linearly on a logarithmic scale. Measurements show that this idealized case is rarely
met in practice, and many different kinds of deviations are common. It is possible to
decrease the error in the fit by starting the fit at the maximum of the envelope curve and
by terminating it before the data gets mixed with the noise floor, e.g., after the decay of
about 40 dB. As a result, a collection of slopes $\beta_k$ for $k = 1, 2, \ldots, N_h$ is obtained. The
corresponding loop gain of the string model at the harmonic frequencies is computed as
$$G_k = 10^{\frac{\beta_k L}{20 H}} \quad \text{for } k = 1, 2, \ldots, N_h \qquad (2.13)$$
where $\beta_k$ are the slopes of the envelopes, and H is the hop size. The sequence $G_k$ determines the prototype magnitude response of the loop filter $H_l(\omega)$ at the harmonic frequencies $k f_0$, where $k = 1, 2, \ldots, N_h$.
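The conversion from an envelope slope to a loop gain in Eq. (2.13) can be sketched as below. The sketch assumes that the slope is measured in dB per analysis frame, so that dividing by H gives dB per sample and multiplying by L gives dB per round trip. The helper names and the synthetic envelope are hypothetical.

```python
import math

def trim_envelope(envelope_db, drop_db=40.0):
    """Keep the envelope from its maximum until it has decayed by drop_db."""
    start = max(range(len(envelope_db)), key=envelope_db.__getitem__)
    end = start
    while end < len(envelope_db) and envelope_db[end] > envelope_db[start] - drop_db:
        end += 1
    return envelope_db[start:end]

def fit_slope(envelope_db):
    """Least-squares slope (dB per frame) of a straight line through the envelope."""
    n = len(envelope_db)
    mean_m = (n - 1) / 2.0
    mean_y = sum(envelope_db) / n
    num = sum((m - mean_m) * (y - mean_y) for m, y in enumerate(envelope_db))
    den = sum((m - mean_m) ** 2 for m in range(n))
    return num / den

def loop_gain(beta_db_per_frame, L, H):
    """Eq. (2.13): gain per round trip of L samples when the hop size is H samples."""
    return 10.0 ** (beta_db_per_frame * L / (20.0 * H))

# Synthetic envelope of a harmonic that loses a factor 0.995 on every round trip.
true_gain, L, H = 0.995, 112.5, 225
envelope_db = [20.0 * math.log10(true_gain) * m * H / L for m in range(40)]
G = loop_gain(fit_slope(trim_envelope(envelope_db)), L, H)
```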
The desired phase response for $H_l(\omega)$ can be determined based on the frequency
estimates of the harmonics. We do not, however, try to match the phase response of the
filter. There are two principal reasons for this. First, we believe that it is much more
important to match the time constants of the partials of the synthetic sound than the frequencies of the partials since quite small deviations from harmonic frequencies occur in
the case of plucked string instruments. In case we wanted to resynthesize strongly
inharmonic tones or a piano with properly stiff strings, the phase response of the loop
filter should be considered. The second reason is that as we want to use a low-order
loop filter, the complex approximation of the frequency response would not be very
successful. Thus we restrict ourselves to magnitude approximation only in the loop filter design.
The human ear is sensitive to the change of decay rate of a sinusoid, and in practice
we measure the time constants of the 20 or so lowest harmonics with the object of matching
the frequency response of the loop filter $H_l(\omega)$ to that data. Since we use a first-order
loop filter, it is clear that there are more restrictions than unknowns in this problem, and
only an approximate solution is possible. We have decided to use weighted least
squares design (Välimäki et al., 1995a).
It is reasonable to choose a weighting function $W(G_k)$ that gives a larger weight to
the errors in the time constants of the slowly decaying harmonics since the hearing
tends to focus on their decay. A candidate for such a weighting function is
$$W(G_k) = \frac{1}{1 - G_k} \qquad (2.14)$$
We require that $0 \le G_k < 1$ for all k, which is a physically reasonable assumption since
the system to be modeled is passive and stable.
The gain of the loop filter at low frequencies, i.e., g, can be chosen based on the loop
gain values of the lowest harmonics. In many cases it is good enough to set $g = G_0$
(Välimäki et al., 1995a). The value for the coefficient $a_1$ that minimizes the error function can then be found easily by differentiating the error function with respect to $a_1$ and
finding the zero of the derivative. In practice we find a near-optimal solution in the following way: the value of the derivative is evaluated and, depending on the sign of the
result, $a_1$ is changed by a small increment, the derivative is evaluated again, and so on.
After the derivative has reached a very small value, the iteration is terminated and the
final value for $a_1$ is used in the synthesis model.
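The iteration described above might look roughly like the following sketch. Two things are assumptions made here for illustration, since neither is spelled out in this section: the loop filter is taken to be the one-pole form g(1 + a1)/(1 + a1 z^-1), and the error function is taken to be the weighted sum of squared magnitude errors with the weight of Eq. (2.14). The derivative is approximated numerically, and the step is halved whenever its sign flips, which is one possible way to realize the "small increment" rule.

```python
import cmath
import math

def loop_filter_gain(g, a1, omega):
    """|H_l| for the assumed one-pole form H_l(z) = g(1 + a1)/(1 + a1 z^-1)."""
    return abs(g * (1.0 + a1) / (1.0 + a1 * cmath.exp(-1j * omega)))

def weighted_error(a1, g, omegas, gains):
    """Assumed error: sum over k of W(G_k) (|H_l(omega_k)| - G_k)^2, W from Eq. (2.14)."""
    return sum((loop_filter_gain(g, a1, w) - G) ** 2 / (1.0 - G)
               for w, G in zip(omegas, gains))

def design_a1(g, omegas, gains, a1=-0.5, step=0.1, tol=1e-12, max_iter=200):
    """Move a1 against the sign of the derivative; halve the step when the sign flips."""
    h, previous_sign = 1e-6, 0
    for _ in range(max_iter):
        slope = (weighted_error(a1 + h, g, omegas, gains)
                 - weighted_error(a1 - h, g, omegas, gains)) / (2.0 * h)
        if abs(slope) < tol:
            break
        sign = 1 if slope > 0 else -1
        if previous_sign and sign != previous_sign:
            step *= 0.5
        a1 = min(max(a1 - sign * step, -0.999), 0.0)   # keep the pole inside the unit circle
        previous_sign = sign
    return a1

# Self-check in the spirit of the text: analyze gains produced by a known filter.
fs, f0 = 22050.0, 196.0
omegas = [2.0 * math.pi * k * f0 / fs for k in range(1, 21)]
gains = [loop_filter_gain(0.99, -0.05, w) for w in omegas]
a1_est = design_a1(0.99, omegas, gains)
```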
We have verified the convergence of this design procedure in practice by analyzing
signals generated using the synthesis model. The loop filter designed based on the analysis data yields the same filter parameters, within numerical accuracy, as used in the
synthesis. Also, matches with natural tones have been found to be satisfactory in many
cases (Välimäki et al., 1995a).
It is well understood that when a string is set into vibration by plucking it, the
sound signal will lack those harmonics that have a node at the plucking point (see, e.g.,
Fletcher and Rossing, 1991). However, in general, the string is not plucked exactly at
the node of any of the lowest harmonics, and since the amplitude of the higher harmonics is quite small anyway, it is not always possible to accurately detect the
plucking point by simply searching for the lacking harmonics in the magnitude spectrum. Another practical problem may occur because of nonlinear behavior of the string.
Namely, a weak harmonic can gain energy from other
modes so that its amplitude begins to rise, reaching a maximum about 100 ms after the
attack, and then begins to decay (Legge and Fletcher, 1984). This can often be seen in the
analysis of harmonic envelopes of the guitar. For these reasons we believe that a more
comprehensive understanding of the effect of the plucking point can be achieved by
studying the time-domain behavior of the string in terms of the short-time autocorrelation function. A frequency-domain technique for estimating the plucking point has been
reported by Bradley et al. (1995).
Estimation of the plucking position is an inherently difficult problem since a
recorded tone can include contributions of several delays of approximately the same
magnitude, e.g., early reflections from objects near the player, such as the floor, the
ceiling, or a wall. To minimize the effect of these additional factors, we suggest analysis
of tones recorded in an anechoic chamber.
It is, however, not absolutely obligatory to estimate and model the effect of the
plucking position. Its contribution can be left in the excitation signal obtained using
inverse filtering, which is described in the following section.
2.1.5 Excitation Signal of the Waveguide String Model
The input signal x(n) of the waveguide string model can be estimated using inverse filtering, i.e., filtering of the signal y(n), which is now assumed to be the output of the
model, with the inverted transfer function $S^{-1}(z)$ of the string model. The transfer
function of the string model shown in Fig. 2.4 can be expressed as
$$S(z) = \frac{1}{1 - z^{-L_{\mathrm{Int}}} F(z) H_l(z)} \qquad (2.15)$$

where $L_{\mathrm{Int}}$ is the length of the delay line (in samples), F(z) is the transfer function of the
fractional delay filter, and $H_l(z)$ is the transfer function of the loop filter. If S(z) had
zeros outside the unit circle, the inverse filter $S^{-1}(z)$ would be unstable. In that case,
inverse filtering would not yield an acceptable result. Fortunately, this is not a problem
in practice as can be seen by substituting (2.6) into (2.15). Inverting this equation yields
$$S^{-1}(z) = \frac{1 + a_1 z^{-1} - g(1 + a_1) z^{-L_{\mathrm{Int}}} F(z)}{1 + a_1 z^{-1}} \qquad (2.16)$$

This technique is simple to apply but since the order of the loop filter is low, the
harmonics are not canceled accurately using the inverse filter S -1 (z). The resulting x(n)
may suffer from high-frequency noise. The low-order loop filter was chosen primarily
because it keeps the synthesis algorithm efficient to implement. However, inverse filtering is
an off-line procedure where accuracy is much more important than efficiency. We could
thus design a higher order filter to be used in the inverse filtering as suggested by
Laroche and Meillier (1994).
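A minimal sketch of the inverse filtering step is given below. It assumes the one-pole loop filter g(1 + a1)/(1 + a1 z^-1) and, purely for illustration, an FIR fractional delay filter F(z) given by its coefficients (here a first-order Lagrange interpolator); the helper names and parameter values are hypothetical. The round trip through the string model and its inverse should return the original excitation.

```python
def iir_filter(b, a, x):
    """Direct-form difference equation: a[0] y[n] = sum b[i] x[n-i] - sum a[j] y[n-j]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y.append(acc / a[0])
    return y

def inverse_string_filter(g, a1, delay_len, frac_fir):
    """Numerator and denominator coefficients of the inverse filter, cf. Eq. (2.16)."""
    numerator = [1.0, a1] + [0.0] * (delay_len + len(frac_fir) - 2)
    for i, f in enumerate(frac_fir):
        numerator[delay_len + i] -= g * (1.0 + a1) * f
    return numerator, [1.0, a1]

inv_num, inv_den = inverse_string_filter(g=0.99, a1=-0.05, delay_len=20,
                                         frac_fir=[0.6, 0.4])   # delay of 0.4 samples
x = [0.5, 1.0, -0.3] + [0.0] * 197          # hypothetical short excitation
y = iir_filter(inv_den, inv_num, x)         # string model S(z): swap the two polynomials
x_rec = iir_filter(inv_num, inv_den, y)     # inverse filtering recovers the excitation
```

With a recorded tone in place of y, the output of the last line would be the residual discussed in the text, including whatever the low-order model fails to cancel.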
Another deficiency of the inverse filtering technique described above is that it does
not take into account the nonlinear effects of the physical strings. This could be
accounted for by using a time-varying inverse filter to cancel the transfer function of the
string. This filter can be designed based on the STFT analysis that was discussed in the
previous section.
The residual signals resulting from the analysis and inverse filtering of an original
string instrument tone can be directly applied as the excitation to the string model. The
excitation signal can be shortened by windowing the first 100 ms of the residual signal
using, e.g., the right half of a Hamming window. This truncated signal can be used as
the excitation of the synthesis model. In the case of the electric guitar, the length of the
residual signal used in resynthesis can be reduced to about 50 ms without any significant loss of information. This is due to the lack of body resonances.
Usually the residual signal of an acoustic guitar tone includes only a few low-frequency components of the body of the instrument. They have large time constants. The
most prominent ones are typically the air mode (the Helmholtz resonance) and the lowest mode of the back plate. Välimäki et al. (1995a) as well as Smith and Van Duyne
(1995) have suggested the extraction of these two strongly resonating modes. Once they
have been removed from the residual signal, the resulting signal is very short. The two
body resonances can be simulated using second-order all-pole filters.
2.1.6 Discussion
In this section we have discussed waveguide modeling of string instruments. We have
seen that these synthesis models have reached an advanced level: the model is compact
and efficient to implement but still mainly based on the physics of the instrument. The
analysis method enabled resynthesis of string tones or copying the timbre of a given
plucked string instrument. Smith and Van Duyne (1995) have used similar waveguide
techniques and some of their extensions for modeling the piano (see also Van Duyne
and Smith, 1995).
Plucked string instruments can be considered relatively easy from the viewpoint of
physical modeling, since their resonator is a simple linear system. Later in this chapter
we will tackle modeling of wind instruments which are strongly nonlinear systems with
a complicated linear resonator: a bore that contains finger holes or valves. For these reasons modeling is more complicated. Furthermore, nonlinearities make analysis tools
difficult to devise.

2.2 Waveguide Modeling of Cylindrical Tubes


The cylindrical tubes that are of interest in speech communication and in musical
acoustics are narrow with respect to the wavelength of the sound waves propagating in
them. In narrow tubes, only plane waves can propagate at frequencies below a certain
critical frequency. Above this frequency, the wavelength is no longer much larger than
the diameter of the tube and thus also transverse modes can propagate. The main interest in this work is the plane-wave propagation since it can be simulated using a one-dimensional digital waveguide model.
An articulatory speech synthesizer is a model of the human speech production system that generates speech-like signals based on slowly-varying physiological parameters, such as lung pressure, shape of the tongue, and lip opening. A common feature of
articulatory speech synthesizers is that they contain a model for the wave propagation in
the human vocal tract. There are, nevertheless, several approaches available to do this.
Kelly and Lochbaum (1962) used a simple time-domain model of the vocal tract.
Their method has now become classic and is called the Kelly–Lochbaum (KL) model.
Later several research groups have used versions of the technique (e.g., Kabasawa et al.,
1983; Liljencrants, 1985; Meyer et al., 1989). Other approaches include solving of partial differential equations (see, e.g., Maeda, 1982) and a chain-matrix method (Sondhi
and Schroeter, 1987). All of the methods are based on the linear wave equation. Since
the model is also assumed to be linear, the problem can be handled either in the time or
in the frequency domain.

Fig. 2.5 Vocal tract profile (from the vocal folds to the lips) and its approximation by cylindrical tube sections of length L = cT.

The Kelly–Lochbaum speech synthesis model is of interest here since it is based on
the same idea as the digital waveguide technique. The KL model has not been used for
commercial speech synthesizers but rather for research purposes. Recently, Cook (1989,
1990, 1993) has applied the KL model to the synthesis of the singing voice. His model
can be used by composers of computer music, and it may also generate stimuli for psychoacoustic testing.
The KL model implements a piecewise zeroth-order polynomial approximation for
the continuously variable shape of the tube. This is illustrated in Fig. 2.5. Each tube
section of the approximated system has its length L determined by the sampling interval
T and the speed of sound c. At the junction of tube sections of different diameter the
acoustic wave is scattered, i.e., the wave is partly reflected from the discontinuity and
partly transmitted through it. Digital implementation of the KL model is discussed in
the following.
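The relation L = cT ties the number of sections to the sampling rate. The following sketch illustrates this and the piecewise constant approximation of the tube shape; the area function, the tract length of 17 cm, and the choice of sampling the area at the midpoint of each section are assumptions made for the example only.

```python
def sampling_rate_for_tube(c, total_length, n_sections):
    """Sampling rate fs = 1/T for which n_sections sections of length cT span the tube."""
    return c * n_sections / total_length

def sample_areas(area_fn, total_length, n_sections):
    """Zeroth-order approximation: one constant area per section, taken at its midpoint."""
    section = total_length / n_sections
    return [area_fn((k + 0.5) * section) for k in range(n_sections)]

fs = sampling_rate_for_tube(c=340.0, total_length=0.17, n_sections=8)
areas = sample_areas(lambda x: 1.0 + 20.0 * x, 0.17, 8)   # hypothetical area function
```

With these numbers, eight sections correspond to a sampling rate of about 16 kHz.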
2.2.1 The One-Dimensional Wave Equation
The wave equation for a plane wave in a lossless cylindrical tube is a partial differential
equation given by
$$\frac{\partial^2 p}{\partial t^2} = c^2 \frac{\partial^2 p}{\partial x^2} \qquad (2.17)$$
where c is the propagation speed of the sound wave (≈ 340 m/s), p is the longitudinal
pressure displacement in the tube, t is time, and x is the distance. This equation may be
derived using Newton's second law of motion (Fletcher and Rossing, 1991). The second-order terms have been neglected in Eq. (2.17) and thus it is sometimes called the
linearized wave equation. This is, of course, an approximation, but it appears to be quite
adequate even with high sound pressures. This equation is practically always valid
when considering the human vocal tract or bores of musical instruments.
The general solution to the one-dimensional wave equation has the form
$$p(x,t) = p^+(x - ct) + p^-(x + ct) \qquad (2.18)$$
where $p^+$ and $p^-$ are arbitrary continuous functions. The term $p^+(x - ct)$ represents a
pressure wave component traveling in the positive x direction at speed c whereas
$p^-(x + ct)$ represents a pressure wave traveling in the negative x direction at the same
speed. Provided that the signal traveling in the tube is bandlimited, this solution can be
discretized and a discrete-time waveguide model can be constructed as explained in
Section 2.1.1.
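As a brief reminder of what the discretization involves, the following worked step samples Eq. (2.18) on the grid t = nT, x = mcT. The grid indices n and m and the sampled sequences $\tilde p^{\pm}$ are notation introduced here for illustration and may differ from that of Section 2.1.1.

```latex
\begin{align*}
p(mcT,\, nT) &= p^{+}\big((m - n)\,cT\big) + p^{-}\big((m + n)\,cT\big) \\
             &= \tilde p^{+}(n - m) + \tilde p^{-}(n + m),
\qquad \tilde p^{\pm}(k) \equiv p^{\pm}(\mp k\,cT)
\end{align*}
```

At spatial index m the right-going component is thus the sequence $\tilde p^{+}$ delayed by m samples and the left-going component is $\tilde p^{-}$ advanced by m samples, which is why a pair of delay lines suffices to simulate the tube.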
The exact shape of the pressure and volume velocity waves in a cylindrical tube
depends on the initial conditions and the boundary conditions. Typical cases for boundary conditions are a closed and an open end.
2.2.2 A Model Composed of Coupled Uniform Tube Sections
The characteristic impedance Z of an acoustic tube is defined by
$$Z = \frac{\rho c}{A} = \frac{\rho c}{\pi a^2} \qquad (2.19)$$
where $\rho$ is the density of air in the tube, c is the speed of sound, and A is the cross-sectional area of the tube, the diameter of which is 2a. The characteristic impedance
determines the ratio of pressure and volume velocity components in the tube, that is
$$Z = \frac{p^+(x,t)}{u^+(x,t)} = \frac{p^-(x,t)}{u^-(x,t)} \qquad (2.20)$$
This is an analogy of Ohm's law in electrical circuit theory.

Fig. 2.6 Junction of two uniform tubes of different diameter. [Figure labels: impedances $Z_k$, $Z_{k+1}$ and the wave components $p_k^{\pm}$, $u_k^{\pm}$, $p_{k+1}^{\pm}$, $u_{k+1}^{\pm}$.]


A discontinuity in diameter of a cylindrical tube changes the characteristic impedance and, consequently, disturbs the acoustic wave. The incident wave is only partly
transmitted through the discontinuity and part of its energy is reflected back. This phenomenon is called scattering. Here we shall consider a system consisting of M coupled
uniform tubes of different diameter but same length. In Chapter 4, an extended model
with variable-length sections is presented.
The computational model for a discontinuity of a cylindrical acoustic tube is derived
below for both the volume velocity and the pressure waves. The derivations are based
on a classical assumption that pressure and volume velocity are continuous at every
junction. Sondhi (1983) has stated that, strictly speaking, the pressure is discontinuous
and therefore an additional acoustic inductance should be added at every junction. We
shall, however, use the classical approach and Sondhi's improvement is left for future
work.
2.2.3 KL Scattering Junction for Volume Velocity Waves
The reflection and transmission coefficients needed for modeling the discontinuity of
impedance in a cylindrical tube are derived below. Figure 2.6 introduces the variable
quantities that are used in the derivation.
The total sound pressure in the kth tube section is expressed as
$$p_k(x,t) = Z_k\left[u_k^+(t - \tau) + u_k^-(t + \tau)\right] \qquad (2.21)$$
where $\tau = x / c$ and $Z_k$ is the characteristic impedance of the kth tube section. The total
volume velocity in the kth segment is the difference of the two components
$$u_k(x,t) = u_k^+(t - \tau) - u_k^-(t + \tau) \qquad (2.22)$$

The pressure has to be continuous at the connection of the tubes and the net flow in
the contiguous tubes has to be the same:
$$\begin{cases} p_k(L,t) = p_{k+1}(0,t) \\ u_k(L,t) = u_{k+1}(0,t) \end{cases} \qquad (2.23)$$
where L = cT is the length of tube sections. We continue by substituting Eqs. (2.21) and
(2.22) into Eq. (2.23). This yields
$$\begin{cases} Z_k\left[u_k^+(t - \tau) + u_k^-(t + \tau)\right] = Z_{k+1}\left[u_{k+1}^+(t) + u_{k+1}^-(t)\right] \\ u_k^+(t - \tau) - u_k^-(t + \tau) = u_{k+1}^+(t) - u_{k+1}^-(t) \end{cases} \qquad (2.24)$$

From this pair of equations, it is easy to derive the equations describing the volume
velocity components $u_k^-$ and $u_{k+1}^+$. They are
$$u_k^-(t + \tau) = \frac{Z_{k+1} - Z_k}{Z_{k+1} + Z_k}\, u_k^+(t - \tau) + \frac{2 Z_{k+1}}{Z_{k+1} + Z_k}\, u_{k+1}^-(t) \qquad (2.25a)$$
$$u_{k+1}^+(t) = \frac{2 Z_k}{Z_{k+1} + Z_k}\, u_k^+(t - \tau) + \frac{Z_k - Z_{k+1}}{Z_{k+1} + Z_k}\, u_{k+1}^-(t) \qquad (2.25b)$$

Based on the above equations it is reasonable to define the reflection coefficient for
the kth junction in the positive direction as (Liljencrants, 1985, p. 1-3; Fletcher and
Rossing, 1991, pp. 147–148)
$$r_k = \frac{Z_{k+1} - Z_k}{Z_{k+1} + Z_k} = \frac{A_k - A_{k+1}}{A_k + A_{k+1}} \qquad (2.26a)$$
where $Z_k$ and $A_k$ denote the acoustic impedance and cross-sectional area of the kth tube
section, respectively. The latter form has been devised using relation (2.19). It is seen
that $|r_k| \le 1$, since the areas (and consequently the impedances) are always nonnegative.
The reflection coefficient of the same junction in the negative direction is opposite in
sign. The transmission coefficient in the positive direction is written as
$$t_k = \frac{2 Z_k}{Z_{k+1} + Z_k} = \frac{2 A_{k+1}}{A_k + A_{k+1}} = 1 - r_k \qquad (2.26b)$$
The transmission coefficient in the negative direction is $1 + r_k$. Note that if $Z_{k+1} = Z_k$,
no scattering occurs.
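The area form of Eq. (2.26a) translates directly into code. The following sketch computes the reflection coefficients of all junctions of a tube profile; the function name and the area values are hypothetical.

```python
def reflection_coefficients(areas):
    """Eq. (2.26a): r_k = (A_k - A_{k+1}) / (A_k + A_{k+1}) for adjacent sections."""
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]

areas_cm2 = [1.0, 2.0, 2.0, 0.5]                 # hypothetical tube profile
r = reflection_coefficients(areas_cm2)
r_scaled = reflection_coefficients([3.0 * a for a in areas_cm2])
```

Two sections of equal area give a zero coefficient, and scaling all areas by a common factor leaves the coefficients unchanged because only area ratios enter the formula.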
Equations (2.25a) and (2.25b) may be expressed more compactly using Eq. (2.26a),
that is
$$u_k^-(t + \tau) = r_k\, u_k^+(t - \tau) + (1 + r_k)\, u_{k+1}^-(t) \qquad (2.27a)$$
$$u_{k+1}^+(t) = (1 - r_k)\, u_k^+(t - \tau) - r_k\, u_{k+1}^-(t) \qquad (2.27b)$$

Footnote: There is some confusion in the literature about the definition of the reflection coefficient for the two-port junction. In some references the definition is opposite in sign with respect to the definition used here
(see, e.g., Rabiner and Schafer, 1978, pp. 84–85).

Fig. 2.7 a) The two-port scattering junction for acoustic volume velocity signals and b)
an equivalent one-multiplier junction.
The signal flow diagram describing the scattering of volume velocity at a junction of
two uniform tube segments is illustrated in Fig. 2.7a. A more economical implementation of this junction is obtained by rewriting Eqs. (2.27a) and (2.27b) in the following way:
$$u_k^-(t + \tau) = u_{k+1}^-(t) + w(t) \qquad (2.28a)$$
$$u_{k+1}^+(t) = u_k^+(t - \tau) - w(t) \qquad (2.28b)$$
where
$$w(t) = r_k\left[u_k^+(t - \tau) + u_{k+1}^-(t)\right] \qquad (2.28c)$$
Since the term w(t) is common to Eqs. (2.28a) and (2.28b), it needs to be computed only
once. This one-multiplier configuration of the two-port junction is depicted in Fig. 2.7b.
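The equivalence of the two junction forms for volume velocity can be checked numerically with a sketch like the one below. It treats the junction as a memoryless map from the two incoming wave samples to the two outgoing ones; the function and argument names are hypothetical.

```python
def velocity_junction(r, u_k_plus, u_k1_minus):
    """Eqs. (2.27a) and (2.27b). Returns (u_k^-, u_{k+1}^+)."""
    u_k_minus = r * u_k_plus + (1.0 + r) * u_k1_minus
    u_k1_plus = (1.0 - r) * u_k_plus - r * u_k1_minus
    return u_k_minus, u_k1_plus

def velocity_junction_one_mult(r, u_k_plus, u_k1_minus):
    """Eqs. (2.28a) to (2.28c): the same outputs with a single multiplication."""
    w = r * (u_k_plus + u_k1_minus)
    return u_k1_minus + w, u_k_plus - w

r, u_in_left, u_in_right = 0.3, 0.8, -0.25       # hypothetical values
out_a = velocity_junction(r, u_in_left, u_in_right)
out_b = velocity_junction_one_mult(r, u_in_left, u_in_right)
```

Both forms also preserve the net flow across the junction, in line with the continuity condition of Eq. (2.23).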
2.2.4 KL Scattering Junction for Pressure Waves
The two-port junction may also be derived for pressure signals. The total pressure in the
kth section is given by (2.18) and the net flow (total volume velocity) is written as
$$u_k(x,t) = u_k^+(t - \tau) - u_k^-(t + \tau) = \frac{1}{Z_k}\left[p_k^+(t - \tau) - p_k^-(t + \tau)\right] \qquad (2.29)$$

The application of continuity requirements for both pressure and volume velocity
results in the following pair of equations
$$\begin{cases} \dfrac{1}{Z_k}\left[p_k^+(t - \tau) - p_k^-(t + \tau)\right] = \dfrac{1}{Z_{k+1}}\left[p_{k+1}^+(t) - p_{k+1}^-(t)\right] \\ p_k^+(t - \tau) + p_k^-(t + \tau) = p_{k+1}^+(t) + p_{k+1}^-(t) \end{cases} \qquad (2.30)$$

By eliminating the term $p_{k+1}^+(t)$ it is easy to solve for $p_k^-(t + \tau)$ and, similarly by eliminating $p_k^-(t + \tau)$, for $p_{k+1}^+(t)$:
$$p_k^-(t + \tau) = \frac{Z_{k+1} - Z_k}{Z_{k+1} + Z_k}\, p_k^+(t - \tau) + \frac{2 Z_k}{Z_{k+1} + Z_k}\, p_{k+1}^-(t) \qquad (2.31a)$$
$$p_{k+1}^+(t) = \frac{2 Z_{k+1}}{Z_{k+1} + Z_k}\, p_k^+(t - \tau) + \frac{Z_k - Z_{k+1}}{Z_{k+1} + Z_k}\, p_{k+1}^-(t) \qquad (2.31b)$$

Fig. 2.8 a) The two-port scattering junction for the acoustic pressure signal and b) the
equivalent one-multiplier junction.

The same reflection coefficient as in the case of volume velocity signals can still be
used. By using Eq. (2.26a), the pair of Eqs. (2.31a) and (2.31b) may be expressed as
$$p_k^-(t + \tau) = r_k\, p_k^+(t - \tau) + (1 - r_k)\, p_{k+1}^-(t) \qquad (2.32a)$$
$$p_{k+1}^+(t) = (1 + r_k)\, p_k^+(t - \tau) - r_k\, p_{k+1}^-(t) \qquad (2.32b)$$

The scattering of the pressure wave at a junction of two uniform tube segments is illustrated in Fig. 2.8a. Note that the transmission coefficients differ from those defined for
volume velocity waves (see Fig. 2.7a).
A one-multiplier two-port junction for the sound pressure signals is obtained by
rewriting Eqs. (2.32a) and (2.32b) as follows
$$p_k^-(t + \tau) = p_{k+1}^-(t) + w(t) \qquad (2.33a)$$
$$p_{k+1}^+(t) = p_k^+(t - \tau) + w(t) \qquad (2.33b)$$
where
$$w(t) = r_k\left[p_k^+(t - \tau) - p_{k+1}^-(t)\right] \qquad (2.33c)$$
The one-multiplier scattering junction for sound pressure signals is shown in Fig. 2.8b.
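The pressure junction can be sketched in the same way as the volume velocity junction. As before, the junction is treated as a memoryless map from the incoming to the outgoing wave samples, and the names and numerical values are hypothetical.

```python
def pressure_junction(r, p_k_plus, p_k1_minus):
    """Eqs. (2.32a) and (2.32b). Returns (p_k^-, p_{k+1}^+)."""
    p_k_minus = r * p_k_plus + (1.0 - r) * p_k1_minus
    p_k1_plus = (1.0 + r) * p_k_plus - r * p_k1_minus
    return p_k_minus, p_k1_plus

def pressure_junction_one_mult(r, p_k_plus, p_k1_minus):
    """Eqs. (2.33a) to (2.33c): the same outputs with a single multiplication."""
    w = r * (p_k_plus - p_k1_minus)
    return p_k1_minus + w, p_k_plus + w

r, p_in_left, p_in_right = 0.3, 0.8, -0.25       # hypothetical values
out_a = pressure_junction(r, p_in_left, p_in_right)
out_b = pressure_junction_one_mult(r, p_in_left, p_in_right)
```

The total pressure is the same on both sides of the junction, which is the pressure continuity condition of Eq. (2.23).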
2.2.5 Boundary Conditions of a Cylindrical Tube
Modeling of the end point of a cylindrical tube is achieved with the help of the reflection functions defined above. For a closed end, the radiation impedance $Z_{\mathrm{rad}}$ tends to
infinity. Then the reflection coefficient for the closed end (from the positive side) is
$$r_M = \lim_{Z_{\mathrm{rad}} \to \infty} \frac{Z_{\mathrm{rad}} - Z_M}{Z_{\mathrm{rad}} + Z_M} = 1 \qquad (2.34)$$
where $Z_M$ is the characteristic impedance of the last tube section before the closed end.
Equation (2.34) means that the wave (either pressure or volume velocity) reflects back
from a closed end as such.

Fig. 2.9 The Kelly–Lochbaum model is a discrete-time implementation of an M-tube
model of the vocal tract. The blocks labeled $S_n$ (n = 0, 1, ..., M) are two-port
scattering junctions.
Modeling of an open end of a cylindrical tube is more complicated because the radiation impedance depends on the frequency. Continuous-time models for the radiation
impedance $Z_{\mathrm{rad}}(\omega)$ can be found in the acoustics literature. One approximation is the
impedance of a vibrating piston at the end of a long tube (see Beranek, 1954, pp. 123–128).
The continuous-time radiation impedance can be converted into a discrete-time
one using the impulse-invariant transform which is a standard technique in digital signal
processing (e.g., Jackson, 1989).
In the case of an open end, a reflection filter, instead of a real reflection coefficient,
must be used. It can be expressed in the z-domain as
$$R(z) = \frac{Z_{\mathrm{rad}}(z) - Z_M}{Z_{\mathrm{rad}}(z) + Z_M} \qquad (2.35)$$

The radiated volume velocity signal is obtained by filtering the signal at the open end of
the tube using the filter 1 - R(z) as suggested by Eq. (2.27b) (see also Fig. 2.7). The
radiated pressure signal, however, is computed using the filter 1 + R(z) (see Fig. 2.8).
In some applications it is of interest to model radiation from an open tube in a flange.
This is the case, e.g., in vocal tract modeling. Laine (1982) has derived a simple discrete-time reflection filter that is suitable for speech synthesis. It is based on the use of a
first-order differentiator (with a real scaling factor) as a model for the radiation
impedance. This approximation leads to a first-order IIR reflection filter. This model
has been used successfully as an approximation for the radiation impedance of a woodwind instrument (Välimäki et al., 1992b).
2.2.6 General Properties of the Cylindrical Tube Model
The traditional Kelly–Lochbaum tube model is a ladder or lattice filter with a two-port
adaptor between every two unit delays. A model composed of M cylindrical tube sections is illustrated in Fig. 2.9. The model has two delay lines (a digital waveguide) and
M + 1 two-port scattering junctions $S_n$.
Bonder (1983a) has derived an expression for the formant frequencies of the cylindrical M-tube model. He has also shown six general properties of the model. These
properties are recapitulated below.
1) If the area of each tube is scaled by a common factor, the resonance frequencies
remain unchanged. This is a consequence of the fact that the reflection coefficients depend on the ratio of the tube impedances (or areas), not on their exact
values [see Eq. (2.26a)].
2) Formant frequencies are inversely proportional to the length of the tube segments.
This property suggests that if the lengths of all the tube segments are multiplied
by some factor x, then the resonance frequencies are those of the original model
multiplied by 1/x.
3) A tube model with a certain configuration has the same formant frequencies as the
configuration with tubes in the reverse order.
4) In case of an odd-tube, i.e., when the number of tube segments is odd (or can be
reduced to odd by combining pieces with the same diameter), the model has fixed
formants at certain frequencies, irrespective of the profile of the tube. These frequencies are given by
$$f_{\mathrm{fix},k} = \left(k + \frac{1}{2}\right)\frac{cM}{2L} \quad \text{for } k = 0, 1, 2, \ldots \qquad (2.36)$$
where c is the speed of sound, M is the total number (odd) of tube sections, and
L = Ml is the overall length of the tube. The indices of the fixed formant frequencies are [(M + 1) / 2] + kM (for k = 0, 1, 2, ...). A model consisting of an even
number of sections also has fixed formants if it can be reduced to an odd-tube.
As an example of property 4, let us consider a 3-tube model (M = 3) with L = 17.5
cm. This model has fixed formants at frequencies 1500 Hz, 4500 Hz, 7500 Hz, etc.
These correspond to the resonance frequencies $f_2$, $f_5$, $f_8$, and so on (Bonder, 1983a).
The other formants depend on the shape of the tube.
5) The lowest $\lfloor M / 2 \rfloor$ formants of an M-tube model determine all the other formants except for the fixed ones: if f is one of the formants of an M-tube model,
then also $cM/(2L) - f$ and $f + kcM/(2L)$ (for k = 1, 2, 3, ...) are its formants. Here
$\lfloor \cdot \rfloor$ is the floor function defined by Eq. (3.12) and $cM/(2L)$ corresponds to the
Nyquist frequency $f_N$ of the system. Property 5 is another way of saying that the
frequency response of a sampled system is periodic with a period of $f_N$.
6) There are $\lfloor (M - 1) / 2 \rfloor$ different tube configurations with the same formant
structure. This feature is a consequence of the previous property. For a 2-tube
model there is no degree of freedom because $\lfloor 1 / 2 \rfloor = 0$. For tube models with
$M \ge 3$, there is always a number of different ways to choose the diameters of the
sections and still end up with the same formant frequencies. Property 6 is the
explanation for the difficulty of estimating the diameters of the tube sections
directly from an acoustic signal. This question has been investigated in Bonder
(1983b).
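The fixed formants of property 4 can be reproduced with a few lines of code. The sketch below evaluates Eq. (2.36) and the corresponding formant indices for the 3-tube example. The value c = 350 m/s is an assumption chosen here because it reproduces the 1500 Hz figure quoted in the example; the function name is hypothetical.

```python
def fixed_formants(c, n_sections, total_length, count):
    """Eq. (2.36): fixed formant frequencies (Hz) and their indices for an odd-tube."""
    nyquist = c * n_sections / (2.0 * total_length)
    freqs = [(k + 0.5) * nyquist for k in range(count)]
    indices = [(n_sections + 1) // 2 + k * n_sections for k in range(count)]
    return freqs, indices

# 3-tube model with an overall length of 17.5 cm; c = 350 m/s is an assumed value.
freqs, indices = fixed_formants(c=350.0, n_sections=3, total_length=0.175, count=3)
```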
2.2.7 Sparse Tube Model
One possible generalization of the M-tube model is such that some of the two-port
junctions are removed (or their reflection coefficient is set to 0). This kind of model is
more efficient to implement but can still produce essentially a similar spectral envelope
as the corresponding acoustic tube system. The number of degrees of freedom is, of
course, less than that for the traditional M-tube KL model.
Usually an M-tube model is considered in the context of digital ladder or lattice filters. Probably for this reason, there do not seem to have been many studies of models
where fewer than M + 1 scattering junctions are employed with M unit delays.
Examples are works by Lacroix and Makai (1979), Frank and Lacroix (1986), and Chan
and Leung (1991). From the physical viewpoint this seems like a natural way to reduce
the complexity of the model. We shall call a model where fewer than M + 1 junctions are
used in connection with M unit delays (with the original sampling frequency) a sparse
tube model. A further generalization of the M-tube model using fractional delay filtering
techniques is introduced in Chapter 4 of this work.
2.2.8 Extensions to the Basic KL Model
There are two severe limitations in the basic Kelly–Lochbaum (KL) model:
1) variable tract length is not incorporated, and
2) all the tube sections have the same length.
Due to the first restriction it is impossible to exactly match given formants, since the
vocal tract length is in general not correctly modeled. The second one gives rise to difficulties in the control of the model. Namely, the mapping of the physical parameters to
the model parameters (reflection coefficients in the scattering junctions) is not simple.
The articulators move in the front-back dimension and the diameter of the tract also
varies, whereas in the model only the diameter of the tube can be adjusted at the fixed
points. Control would be more intuitive if the junctions could be moved.
There have been two solutions to these problems. Strube (1975) suggested the
use of a fractional delay element in place of the last unit delay element of the delay line.
He showed that the frequencies and bandwidths of the formants were then changed
compared with the ideal ones. Later, Lagrange interpolation was proposed by Laine
(1988).
The other method for the accurate adjustment of the vocal tract length is due to Wu
et al. (1987a, 1987b). They keep the number of sections in the model constant but scale
the length of each section by a common factor, which can in principle be an arbitrary
real number. This is achieved by changing the sample rate of the model. This, however,
does not require that the sample rate of the computer hardware and D/A converters
should be variable, but the operation is done by software instead. A time-varying FIR
interpolating filter is used for sampling-rate conversion. A more efficient form of this
technique has been reported by Wright and Owens (1993).
A drawback of both techniques described above is that the length of each tube section cannot be independently controlled in the discrete-time model. In Chapter 4 of this
work we present a fractional delay waveguide model where the length of each tube section can be arbitrarily controlled. Consequently, the total length of the vocal tract is
continuously variable.

2.3 Junction of Three Acoustic Tubes


A three-port junction of waveguides is needed, for example, in modeling of woodwind
instruments. Namely, the finger holes, or tone holes, of these musical instruments may
be interpreted as short side branches connected to the main bore. Another application is
simulation of the human vocal tract with both the oral and the nasal tract models.
In this section, we present the three-port scattering junction. In addition, we study the
special case of the three-port junction where two of the three branches have the same
impedance, and discuss application of this model to the simulation of finger holes.


Fig. 2.10 A junction of three cylindrical acoustic tubes.


2.3.1 Three-Port Scattering Junction for Volume Velocity
The equations governing the reflection and transmission of volume velocity waves at a
three-port junction are derived first. Generally, for a passive M-way junction, the net
volume velocity vanishes, i.e.,
$$\sum_{k=1}^{M} u_k = 0 \tag{2.37}$$

For a three-port junction this yields

$$u_1^+(t) - u_1^-(t) + u_2^+(t) - u_2^-(t) + u_3^+(t) - u_3^-(t) = 0 \tag{2.38}$$

The pressure across the M-way junction must be continuous, that is


$$p_1 = p_2 = \cdots = p_M \tag{2.39}$$

This rule gives another equation that is needed in the derivation of the three-port junction.
Equations (2.37) and (2.39) are analogous to Kirchhoff's current and voltage laws,
respectively, used for electrical networks.
In the following the three-port junction for the acoustic volume velocity is derived.
The variable quantities are illustrated in Fig. 2.10. Superscript + means that the signal
propagates towards the junction, and − denotes the reverse direction. Subscript k refers
to the index of the branch.
The pressure in each branch of the junction at point x is expressed in the following
form:

$$p_k(x,n) = p_k^+(n - \tau) + p_k^-(n + \tau) = \frac{1}{Y_k}\left[u_k^+(n - \tau) + u_k^-(n + \tau)\right] \tag{2.40}$$

where $\tau = x/c$ is the propagation delay and $Y_k$ is the characteristic admittance of the
kth branch, defined as $A_k/(\rho c)$, where $A_k$ is the cross-sectional area of the tube.
According to Eqs. (2.39) and (2.40), the pressure signals in the branches of the junction are related in the following way


$$\frac{1}{Y_1}\left[u_1^+(t) + u_1^-(t)\right] = \frac{1}{Y_2}\left[u_2^+(t) + u_2^-(t)\right] = \frac{1}{Y_3}\left[u_3^+(t) + u_3^-(t)\right] \tag{2.41}$$

Using Eqs. (2.38) and (2.41), the outgoing volume velocity signals $u_1^-(t)$, $u_2^-(t)$, and
$u_3^-(t)$ may now be solved in terms of the ingoing signals $u_1^+(t)$, $u_2^+(t)$, and $u_3^+(t)$:

$$u_1^-(t) = \frac{(Y_1 - Y_2 - Y_3)\,u_1^+(t) + 2Y_1\left[u_2^+(t) + u_3^+(t)\right]}{Y_1 + Y_2 + Y_3} \tag{2.42a}$$

$$u_2^-(t) = \frac{(Y_2 - Y_1 - Y_3)\,u_2^+(t) + 2Y_2\left[u_1^+(t) + u_3^+(t)\right]}{Y_1 + Y_2 + Y_3} \tag{2.42b}$$

$$u_3^-(t) = \frac{(Y_3 - Y_1 - Y_2)\,u_3^+(t) + 2Y_3\left[u_1^+(t) + u_2^+(t)\right]}{Y_1 + Y_2 + Y_3} \tag{2.42c}$$

Let us define the reflection coefficient for the kth branch as


$$r_k = \frac{2Y_k - Y_1 - Y_2 - Y_3}{Y_1 + Y_2 + Y_3} \tag{2.43}$$

Using this reflection coefficient for each branch, the Eqs. (2.42) may be rewritten as

$$u_1^-(t) = r_1 u_1^+(t) + (1 + r_1)\left[u_2^+(t) + u_3^+(t)\right] \tag{2.44a}$$

$$u_2^-(t) = r_2 u_2^+(t) + (1 + r_2)\left[u_1^+(t) + u_3^+(t)\right] \tag{2.44b}$$

$$u_3^-(t) = r_3 u_3^+(t) + (1 + r_3)\left[u_1^+(t) + u_2^+(t)\right] \tag{2.44c}$$
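The scattering rule of Eqs. (2.43) and (2.44) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `three_port_velocity` and the numerical admittances and input values are assumptions, not taken from the text. The two quantities computed at the end check the physical constraints used in the derivation, namely zero net volume velocity (Eq. 2.38) and a common pressure in all branches (Eq. 2.41).

```python
def three_port_velocity(y, u_in):
    """Scatter incoming volume velocities at a lossless three-port junction.

    y    : (Y1, Y2, Y3) characteristic admittances of the branches
    u_in : (u1+, u2+, u3+) waves travelling towards the junction
    Returns (u1-, u2-, u3-), the waves travelling away from the junction.
    """
    y_sum = sum(y)
    total_in = sum(u_in)
    u_out = []
    for yk, uk in zip(y, u_in):
        r = (2.0 * yk - y_sum) / y_sum                        # Eq. (2.43)
        u_out.append(r * uk + (1.0 + r) * (total_in - uk))    # Eq. (2.44)
    return tuple(u_out)


# Hypothetical values: two equal main branches and a narrower third branch.
y = (1.0, 1.0, 0.25)
u_in = (0.3, -0.1, 0.05)
u_out = three_port_velocity(y, u_in)

# Constraint checks: net flow into the junction and the pressure seen in each branch.
net_flow = sum(a - b for a, b in zip(u_in, u_out))
pressures = [(a + b) / yk for a, b, yk in zip(u_in, u_out, y)]
```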

2.3.2 Three-Port Scattering Junction for Sound Pressure


We shall next derive the equations governing the scattering of acoustic pressure waves
in a three-port junction. The continuity equation (2.38) for volume velocity can be
rewritten in terms of pressure as

$$Y_1\left[p_1^+(t) - p_1^-(t)\right] + Y_2\left[p_2^+(t) - p_2^-(t)\right] + Y_3\left[p_3^+(t) - p_3^-(t)\right] = 0 \tag{2.45}$$

According to Eq. (2.39) we can write

$$p_1^+(t) + p_1^-(t) = p_2^+(t) + p_2^-(t) = p_3^+(t) + p_3^-(t) \tag{2.46}$$

From the above two equations it is possible to solve for the outgoing pressure signals
$p_1^-(t)$, $p_2^-(t)$, and $p_3^-(t)$. This results in the following three equations

$$p_1^-(t) = \frac{(Y_1 - Y_2 - Y_3)\,p_1^+(t) + 2Y_2\,p_2^+(t) + 2Y_3\,p_3^+(t)}{Y_1 + Y_2 + Y_3} \tag{2.47a}$$

$$p_2^-(t) = \frac{(Y_2 - Y_1 - Y_3)\,p_2^+(t) + 2Y_1\,p_1^+(t) + 2Y_3\,p_3^+(t)}{Y_1 + Y_2 + Y_3} \tag{2.47b}$$

$$p_3^-(t) = \frac{(Y_3 - Y_1 - Y_2)\,p_3^+(t) + 2Y_1\,p_1^+(t) + 2Y_2\,p_2^+(t)}{Y_1 + Y_2 + Y_3} \tag{2.47c}$$

The reflection coefficient for each branch can be defined similarly as earlier for the
volume velocity signals. However, the transmission coefficients, which were simply
$1 + r_k$ for the kth branch in the case of volume velocity, are now different in every case.
Equations (2.47) can be rewritten as

$$p_1^-(t) = r_1 p_1^+(t) + (1 + r_2)\,p_2^+(t) + (1 + r_3)\,p_3^+(t) \tag{2.48a}$$

$$p_2^-(t) = r_2 p_2^+(t) + (1 + r_1)\,p_1^+(t) + (1 + r_3)\,p_3^+(t) \tag{2.48b}$$

$$p_3^-(t) = r_3 p_3^+(t) + (1 + r_1)\,p_1^+(t) + (1 + r_2)\,p_2^+(t) \tag{2.48c}$$
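A pressure-wave version of the three-port junction can be sketched as follows. The sketch relies on the observation that the sum of (1 + r_k) p_k+ over all branches is the common junction pressure, so that each outgoing wave in Eqs. (2.48) is this pressure minus the corresponding incoming wave. The function name `three_port_pressure` and all numerical values are assumptions made for illustration.

```python
def three_port_pressure(y, p_in):
    """Scatter incoming pressure waves at a lossless three-port junction.

    y    : (Y1, Y2, Y3) characteristic admittances
    p_in : (p1+, p2+, p3+) waves travelling towards the junction
    Returns ((p1-, p2-, p3-), junction_pressure).
    """
    y_sum = sum(y)
    r = [(2.0 * yk - y_sum) / y_sum for yk in y]              # Eq. (2.43)
    # Weighted sum of the incoming waves: the pressure common to all branches.
    p_junction = sum((1.0 + rk) * pk for rk, pk in zip(r, p_in))
    # Eqs. (2.48): p_k- = r_k p_k+ + sum over the other branches of (1 + r_i) p_i+.
    p_out = tuple(p_junction - pk for pk in p_in)
    return p_out, p_junction


# Hypothetical values.
y = (1.0, 0.6, 0.25)
p_in = (0.4, -0.2, 0.1)
p_out, p_j = three_port_pressure(y, p_in)

# Net volume velocity into the junction, Eq. (2.45).
net_flow = sum(yk * (a - b) for yk, a, b in zip(y, p_in, p_out))
```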

2.3.3 Side Branch in a Uniform Tube


Let us now consider a special case of the three-port junction, namely, a connection of a
side branch to a uniform tube. This model is useful in the simulation of a woodwind
instrument bore including finger holes (Välimäki et al., 1993b). Figure 2.11 shows the
variables used in the following. Note that there are essential differences in the notation
with respect to earlier sections. Superscript + now means that the signal propagates
towards the end (output) of the bore, and − means that the signal propagates towards
the input. Subscripts a and b refer to the pieces of the bore before and after the side
branch connection, respectively, and s stands for the side branch.
In this case, two of the three branches have the same characteristic admittance
because they are parts of the same uniform tube. We denote the admittances of the uniform tube and the side branch, respectively, by $Y_0$ and $Y_s(\omega)$. Note that the latter is a
function of frequency, and for this reason we use the frequency-domain equivalents of
all equations in the following.
The reflection functions for the branches are obtained by substituting $Y_1 = Y_2 = Y_0$
and $Y_3 = Y_s(\omega)$ into Eq. (2.43). This yields

$$R_a(\omega) = R_b(\omega) = -\frac{Y_s(\omega)}{Y_s(\omega) + 2Y_0} = -\frac{Z_0}{Z_0 + 2Z_s(\omega)} \tag{2.49}$$

$$R_s(\omega) = \frac{Y_s(\omega) - 2Y_0}{Y_s(\omega) + 2Y_0} = \frac{Z_0 - 2Z_s(\omega)}{Z_0 + 2Z_s(\omega)} \tag{2.50}$$

Fig. 2.11 A three-port junction where a side branch is connected to a uniform tube.

Note that the configuration in Fig. 2.11 is symmetrical, and hence the reflection coefficients $R_a(\omega)$ and $R_b(\omega)$ are identical and we may define

$$R_0(\omega) = R_a(\omega) = R_b(\omega) \tag{2.51}$$

The reflection function $R_s(\omega)$ can also be written in terms of $R_0(\omega)$:

$$R_s(\omega) = -1 - 2R_0(\omega) \tag{2.52}$$
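Equation (2.52) follows in one step from the definitions in Eqs. (2.49) to (2.51), where the common reflection function of the main tube is the negative of Y_s divided by Y_s + 2Y_0. The short derivation below is added for convenience and uses only symbols defined above:

```latex
\begin{align*}
-1 - 2R_0(\omega)
  &= -1 + \frac{2Y_s(\omega)}{Y_s(\omega) + 2Y_0} \\
  &= \frac{-Y_s(\omega) - 2Y_0 + 2Y_s(\omega)}{Y_s(\omega) + 2Y_0} \\
  &= \frac{Y_s(\omega) - 2Y_0}{Y_s(\omega) + 2Y_0} = R_s(\omega).
\end{align*}
```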

Side Branch Junction for Volume Velocity


We shall first give the equations for the Fourier transforms of the volume velocity signals $u_a^-(t)$, $u_b^+(t)$, and $u_s^+(t)$ that propagate away from the junction. These formulas are
obtained from Eqs. (2.42) by substituting $Y_1 = Y_2 = Y_0$, $Y_3 = Y_s(\omega)$, $u_1^\pm = u_a^\pm$, $u_2^\pm = u_b^\mp$,
and $u_3^\pm = u_s^\mp$, and then writing the frequency-domain equivalents of all equations:

$$U_a^-(\omega) = \frac{-Y_s(\omega)\,U_a^+(\omega) + 2Y_0\left[U_b^-(\omega) + U_s^-(\omega)\right]}{Y_s(\omega) + 2Y_0} \tag{2.53a}$$

$$U_b^+(\omega) = \frac{-Y_s(\omega)\,U_b^-(\omega) + 2Y_0\left[U_a^+(\omega) + U_s^-(\omega)\right]}{Y_s(\omega) + 2Y_0} \tag{2.53b}$$

$$U_s^+(\omega) = \frac{\left[Y_s(\omega) - 2Y_0\right]U_s^-(\omega) + 2Y_s(\omega)\left[U_a^+(\omega) + U_b^-(\omega)\right]}{Y_s(\omega) + 2Y_0} \tag{2.53c}$$

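These equations can be rewritten using the reflection function $R_0(\omega)$ defined by Eq.
(2.51) as

$$U_a^-(\omega) = R_0(\omega)\,U_a^+(\omega) + \left[1 + R_0(\omega)\right]\left[U_b^-(\omega) + U_s^-(\omega)\right] \tag{2.54a}$$

$$U_b^+(\omega) = R_0(\omega)\,U_b^-(\omega) + \left[1 + R_0(\omega)\right]\left[U_a^+(\omega) + U_s^-(\omega)\right] \tag{2.54b}$$

$$U_s^+(\omega) = -\left[1 + 2R_0(\omega)\right]U_s^-(\omega) - 2R_0(\omega)\left[U_a^+(\omega) + U_b^-(\omega)\right] \tag{2.54c}$$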

It is possible to derive an efficient configuration for the side branch junction for volume velocity that is reminiscent of the one-multiplier KL junction developed in Section
2.2.1. This is achieved by regrouping the terms in the above equations. This results in
$$U_a^-(\omega) = U_b^-(\omega) + U_s^-(\omega) + W(\omega) \tag{2.55a}$$

$$U_b^+(\omega) = U_a^+(\omega) + U_s^-(\omega) + W(\omega) \tag{2.55b}$$

$$U_s^+(\omega) = -U_s^-(\omega) - 2W(\omega) \tag{2.55c}$$

where we have defined

$$W(\omega) = R_0(\omega)\left[U_a^+(\omega) + U_b^-(\omega) + U_s^-(\omega)\right] \tag{2.55d}$$

The term $W(\omega)$ is common to Eqs. (2.55a)-(2.55c) and the reflection function $R_0(\omega)$
appears only in Eq. (2.55d). Thus the implementation of this side branch junction
involves only one filtering operation.
In Välimäki et al. (1993b) this result was derived using characteristic impedances
instead of admittances. A special case of the three-port junction, where the volume
velocity $u_s^-$ was assumed to be zero, has been presented by Borin et al. (1990).
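The regrouping that leads to the one-filter structure can be checked numerically at a single frequency, where the reflection function is just a complex number. The sketch below evaluates the junction both in the direct form of Eqs. (2.54) and in the one-filter form of Eqs. (2.55). The function names and the admittance and signal values are hypothetical and chosen only for illustration.

```python
def side_branch_direct(r0, ua_in, ub_in, us_in):
    """Eqs. (2.54): r0 enters each output separately.

    ua_in = Ua+, ub_in = Ub-, us_in = Us- (all travelling towards the junction).
    Returns (Ua-, Ub+, Us+).
    """
    ua_out = r0 * ua_in + (1 + r0) * (ub_in + us_in)
    ub_out = r0 * ub_in + (1 + r0) * (ua_in + us_in)
    us_out = -(1 + 2 * r0) * us_in - 2 * r0 * (ua_in + ub_in)
    return ua_out, ub_out, us_out


def side_branch_one_filter(r0, ua_in, ub_in, us_in):
    """Eqs. (2.55): a single multiplication (filtering) by r0."""
    w = r0 * (ua_in + ub_in + us_in)                  # Eq. (2.55d)
    return ub_in + us_in + w, ua_in + us_in + w, -us_in - 2 * w


# Hypothetical single-frequency values.
y0 = 1.0
ys = 0.3 - 0.8j                      # side-branch admittance at this frequency
r0 = -ys / (ys + 2 * y0)             # Eq. (2.49)
inputs = (0.5 + 0.1j, -0.2j, 0.05)   # (Ua+, Ub-, Us-)

direct = side_branch_direct(r0, *inputs)
one_filter = side_branch_one_filter(r0, *inputs)

# Net volume velocity into the junction should vanish.
net_flow = sum(inputs) - sum(one_filter)
```

In a time-domain implementation the multiplication by `r0` would be replaced by a digital filter approximating the reflection function, applied once per sample to the sum of the three incoming signals.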
Side Branch Junction for Sound Pressure
We shall now derive the model for the side branch junction in terms of sound pressure
signals. The equations for the pressure signals are obtained from Eqs. (2.47) by substituting $Y_1 = Y_2 = Y_0$, $Y_3 = Y_s(\omega)$, $p_1^\pm = p_a^\pm$, $p_2^\pm = p_b^\mp$, and $p_3^\pm = p_s^\mp$. This yields

$$P_a^-(\omega) = \frac{-Y_s(\omega)\,P_a^+(\omega) + 2Y_0\,P_b^-(\omega) + 2Y_s(\omega)\,P_s^-(\omega)}{Y_s(\omega) + 2Y_0} \tag{2.56a}$$

$$P_b^+(\omega) = \frac{-Y_s(\omega)\,P_b^-(\omega) + 2Y_0\,P_a^+(\omega) + 2Y_s(\omega)\,P_s^-(\omega)}{Y_s(\omega) + 2Y_0} \tag{2.56b}$$

$$P_s^+(\omega) = \frac{\left[Y_s(\omega) - 2Y_0\right]P_s^-(\omega) + 2Y_0\left[P_a^+(\omega) + P_b^-(\omega)\right]}{Y_s(\omega) + 2Y_0} \tag{2.56c}$$

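These equations can be simplified, using the reflection function $R_0(\omega)$ that has been
defined by Eq. (2.51), as

$$P_a^-(\omega) = R_0(\omega)\,P_a^+(\omega) + \left[1 + R_0(\omega)\right]P_b^-(\omega) - 2R_0(\omega)\,P_s^-(\omega) \tag{2.57a}$$

$$P_b^+(\omega) = R_0(\omega)\,P_b^-(\omega) + \left[1 + R_0(\omega)\right]P_a^+(\omega) - 2R_0(\omega)\,P_s^-(\omega) \tag{2.57b}$$

$$P_s^+(\omega) = -\left[1 + 2R_0(\omega)\right]P_s^-(\omega) + \left[1 + R_0(\omega)\right]\left[P_a^+(\omega) + P_b^-(\omega)\right] \tag{2.57c}$$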

Unfortunately, it is not possible to derive a form where the filtering by $R_0(\omega)$ needs
to be computed only once, as was the case for the volume velocity signals. If, however,
the pressure signal $p_s^-$ is zero, which is a reasonable assumption in many practical
cases, then an economical implementation can be derived. The result is

$$P_a^-(\omega) = P_b^-(\omega) + W(\omega) \tag{2.58a}$$

$$P_b^+(\omega) = P_a^+(\omega) + W(\omega) \tag{2.58b}$$

$$P_s^+(\omega) = P_a^+(\omega) + P_b^-(\omega) + W(\omega) \tag{2.58c}$$

where we have defined

$$W(\omega) = R_0(\omega)\left[P_a^+(\omega) + P_b^-(\omega)\right] \tag{2.58d}$$

and assumed that $P_s^-(\omega) = 0$.
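The simplified pressure junction for a side branch with no returning wave can be sketched as follows and compared with a direct evaluation of Eqs. (2.56). Here the signal radiated into the side branch is computed as the sum of the two incoming main-tube waves plus W, which is what Eq. (2.56c) reduces to when the returning side-branch wave is zero. Function names and numerical values are assumptions for illustration.

```python
def side_branch_pressure_general(y0, ys, pa_in, pb_in, ps_in):
    """Direct evaluation of Eqs. (2.56) from the admittances.

    pa_in = Pa+, pb_in = Pb-, ps_in = Ps- (all travelling towards the junction).
    Returns (Pa-, Pb+, Ps+).
    """
    den = ys + 2 * y0
    pa_out = (-ys * pa_in + 2 * y0 * pb_in + 2 * ys * ps_in) / den
    pb_out = (-ys * pb_in + 2 * y0 * pa_in + 2 * ys * ps_in) / den
    ps_out = ((ys - 2 * y0) * ps_in + 2 * y0 * (pa_in + pb_in)) / den
    return pa_out, pb_out, ps_out


def side_branch_pressure_no_return(r0, pa_in, pb_in):
    """One-filter form, valid only under the assumption Ps- = 0."""
    w = r0 * (pa_in + pb_in)
    return pb_in + w, pa_in + w, pa_in + pb_in + w


# Hypothetical single-frequency values.
y0 = 1.0
ys = 0.4 + 0.5j
r0 = -ys / (ys + 2 * y0)
pa_in, pb_in = 0.7 - 0.2j, 0.1 + 0.3j

general = side_branch_pressure_general(y0, ys, pa_in, pb_in, 0.0)
simplified = side_branch_pressure_no_return(r0, pa_in, pb_in)
```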

2.4 Waveguide Modeling of Conical Tubes


In this section we discuss modeling of acoustic tube systems that are constructed of
truncated cones. The basic idea of simulating non-cylindrical tubes using digital
waveguides was presented by Smith (1991). Formerly, conical tubes have been modeled
by approximating the profile of the tube by cylindrical tube sections (see, e.g., Caussé et
al., 1984; Cook, 1988). As the number of the cylindrical sections is increased, the
approximation approaches the ideal solution. The same idea can as well be applied to
other tube profiles, such as exponential or catenoidal horns that are used in brass wind
instruments and loudspeakers.
The motivation for the conical tube models is that many practical acoustic systems
are constructed of conical sections. For example, most woodwind instruments, e.g., the
saxophones, oboes, and bassoons, have a conical bore. The reason for the success of
cylindrical and conical bores in musical acoustics is that they are non-dispersive
waveguides and thus support harmonically related modes (Fletcher and Rossing, 1991).
An important field of application for conical modeling is approximation of other
non-cylindrical tube shapes. For these approximations, a model consisting of conical
sections will converge more quickly towards the ideal solution than when using the
traditional cylindrical sections (Caussé et al., 1984). This approach offers another
extension to the modeling of the human vocal tract. Figure 2.12 shows the basic idea of
this technique.
In this new kind of articulatory model, the diameter profile of the vocal tract is
piecewise linear. This may be interpreted as a first-order polynomial approximation of
the profile function. With this vocal tract model, synthetic speech signals with realistic
formant structure can be obtained using a sparse tube model with only a few sections.

Fig. 2.12 Approximation of the vocal tract profile using conical tube sections.

In this section, we first introduce a digital filter model for a junction of two conical
tube sections with different tapers and diameters. A useful special case is a junction of
two conical sections with different tapers but the same diameter. This is applicable to
the modeling of at least partly conical acoustic tubes as well as modeling of arbitrary
tube profiles. In addition, a discrete-time model for the radiation and reflection at the
open end of a conical tube is developed. This model can be applied to the modeling of
wind instruments.
We first discuss wave propagation in a conical tube and derive the reflection and
transmission functions that govern the scattering of pressure waves in a junction of two
conical tube sections.
2.4.1 Traveling Wave Solution for Spherical Waves
The wave equation for spherical waves propagating in a conical tube is obtained by
writing the general three-dimensional wave equation in spherical coordinates and
assuming that the pressure p is a function of r and t only. This yields
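$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial p}{\partial r}\right) = \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} \tag{2.59}$$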

where r is the distance from the cone apex. This may be rewritten as
$$\frac{\partial^2 (rp)}{\partial t^2} = c^2 \frac{\partial^2 (rp)}{\partial r^2} \tag{2.60}$$
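The step from Eq. (2.59) to Eq. (2.60) can be made explicit as follows. The identity in the first line is the standard expansion of the radial part of the Laplacian, and the second line uses only the fact that r does not depend on t:

```latex
\begin{align*}
\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial p}{\partial r}\right)
  &= \frac{\partial^2 p}{\partial r^2} + \frac{2}{r}\frac{\partial p}{\partial r}
   = \frac{1}{r}\frac{\partial^2 (rp)}{\partial r^2}, \\
\frac{1}{c^2}\frac{\partial^2 p}{\partial t^2}
  &= \frac{1}{r c^2}\frac{\partial^2 (rp)}{\partial t^2}.
\end{align*}
```

Equating the two right-hand sides and multiplying by r c^2 gives the one-dimensional wave equation for the product rp.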

Substituting $\psi = rp$ it is seen that the equation is of the same form as the one-dimensional wave equation [see Eq. (2.1)]

$$\frac{\partial^2 \psi}{\partial t^2} = c^2 \frac{\partial^2 \psi}{\partial r^2} \tag{2.61}$$

The general solution thus has a form similar to that given by Eq. (2.2), that is

$$\psi(r,t) = f^+(r - ct) + f^-(r + ct) \tag{2.62}$$

where $f^+$ and $f^-$ are two traveling waves of arbitrary shape. The solution in terms of
pressure p is obtained by substituting $\psi = rp$ so that

$$p(r,t) = \frac{1}{r} f^+(r - ct) + \frac{1}{r} f^-(r + ct) \tag{2.63}$$

where $(1/r) f^+(r - ct)$ represents a pressure wave component propagating in the positive r direction with speed c whereas $(1/r) f^-(r + ct)$ represents a pressure wave propagating in the negative r direction with the same speed.
The solution is similar to Eq. (2.2) except for the factor 1/r. This difference is interpreted so that the amplitude of a pressure wave is decreased as it travels further away
from the apex of the cone and, equivalently, the pressure is increased as the wave travels towards the apex. Note that the amplitude of the wave would be infinite at r = 0, and
therefore it is not possible to consider the behavior of the wave at the cone apex.

Fig. 2.13 A junction of conical tubes that have a different diameter at the junction. In
this example $A_b > A_a$, $r_a < 0$, and $r_b > 0$.
2.4.2 Junction of Conical Tube Sections
In the following we derive equations that govern the reflection and transmission of
pressure waves at the junction of two conical tubes of different diameter and taper. We
shall not discuss the scattering of volume velocity waves, since this case is essentially
more complicated (see Ayers et al., 1985).
The following conditions must be fulfilled at the junction:
$$p_a^+(r,t) + p_a^-(r,t) = p_b^+(r,t) + p_b^-(r,t) \tag{2.64}$$

$$u_a^+(r,t) - u_a^-(r,t) = u_b^+(r,t) - u_b^-(r,t) \tag{2.65}$$

The physical interpretation of the above and the following equations is depicted in Fig.
2.13. Our notation follows the guidelines of Gilbert et al. (1990). Note that in some references, e.g., Martínez and Agulló (1988) and Gilbert et al. (1990), the superscripts + and − are used to denote opposite directions with respect to our notation.
The admittances in different directions determine the ratio of volume velocity and
pressure components of the propagating wave, that is

$$Y_a^\pm = \frac{u_a^\pm}{p_a^\pm} \tag{2.66a}$$

$$Y_b^\pm = \frac{u_b^\pm}{p_b^\pm} \tag{2.66b}$$

On the other hand, the acoustic admittances of spherical waves in a conical tube are
defined as (see, e.g., Kinsler and Frey, 1950, pp. 300-303)

$$Y_a^\pm(\omega) = \frac{A_a}{\rho c}\left(1 \mp \frac{j}{k r_a}\right) \tag{2.67a}$$

$$Y_b^\pm(\omega) = \frac{A_b}{\rho c}\left(1 \mp \frac{j}{k r_b}\right) \tag{2.67b}$$

where $k = \omega/c$ is the wave number. Note that the acoustic admittance is always a
function of frequency in the case of conical tubes. For this reason we use frequency-domain description in the following.
Based on Eqs. (2.65) and (2.66) we may write

$$P_a^+(\omega)\,Y_a^+(\omega) - P_a^-(\omega)\,Y_a^-(\omega) = P_b^+(\omega)\,Y_b^+(\omega) - P_b^-(\omega)\,Y_b^-(\omega) \tag{2.68}$$

Eliminating $P_b^+(\omega)$ using Eq. (2.64) and solving for $P_a^-(\omega)$ yields

$$P_a^-(\omega) = \frac{Y_a^+(\omega) - Y_b^+(\omega)}{Y_a^-(\omega) + Y_b^+(\omega)}\,P_a^+(\omega) + \frac{Y_b^+(\omega) + Y_b^-(\omega)}{Y_a^-(\omega) + Y_b^+(\omega)}\,P_b^-(\omega) \tag{2.69}$$

Here the first term determines the pressure signal that reflects back from the junction
while the second term determines the transmitted pressure signal. Thus we may define
the reflection function $R^+(\omega)$ and the transmission function $T^-(\omega)$ in the following
way:

$$R^+(\omega) = \frac{Y_a^+(\omega) - Y_b^+(\omega)}{Y_a^-(\omega) + Y_b^+(\omega)} \tag{2.70a}$$

$$T^-(\omega) = \frac{Y_b^+(\omega) + Y_b^-(\omega)}{Y_a^-(\omega) + Y_b^+(\omega)} \tag{2.70b}$$

Similarly, we may eliminate $P_a^-(\omega)$ from Eq. (2.68) and solve for $P_b^+(\omega)$. We then
obtain

$$P_b^+(\omega) = \frac{Y_a^+(\omega) + Y_a^-(\omega)}{Y_b^+(\omega) + Y_a^-(\omega)}\,P_a^+(\omega) + \frac{Y_b^-(\omega) - Y_a^-(\omega)}{Y_b^+(\omega) + Y_a^-(\omega)}\,P_b^-(\omega) \tag{2.71}$$

The reflection and transmission functions are now expressed as

$$R^-(\omega) = \frac{Y_b^-(\omega) - Y_a^-(\omega)}{Y_b^+(\omega) + Y_a^-(\omega)} \tag{2.72a}$$

$$T^+(\omega) = \frac{Y_a^+(\omega) + Y_a^-(\omega)}{Y_b^+(\omega) + Y_a^-(\omega)} \tag{2.72b}$$

Now we can derive general formulas for the reflection and transmission functions by
substituting Eqs. (2.67a) and (2.67b). After some algebraic manipulation this yields

$$R^+(\omega) = \frac{B - 1}{B + 1} - \frac{2B\alpha}{(B + 1)(j\omega + \alpha)} \tag{2.73a}$$

$$R^-(\omega) = \frac{1 - B}{1 + B} - \frac{2\alpha}{(1 + B)(j\omega + \alpha)} \tag{2.73b}$$

$$T^+(\omega) = \frac{2B}{B + 1}\left(1 - \frac{\alpha}{j\omega + \alpha}\right) = 1 + R^+(\omega) \tag{2.73c}$$

$$T^-(\omega) = \frac{2}{B + 1}\left(1 - \frac{\alpha}{j\omega + \alpha}\right) = 1 + R^-(\omega) \tag{2.73d}$$

where B is the area ratio of the conical sections a and b at the junction, i.e.,

$$B = \frac{A_a}{A_b} \tag{2.74}$$

and $\alpha$ is defined by

$$\alpha = -\frac{c}{A_a + A_b}\left(\frac{A_a}{r_a} - \frac{A_b}{r_b}\right) \tag{2.75}$$

A major difference between the junction of cylindrical tubes used in the KL model
and of the conical tubes presented above is that now the scattering is described by
means of a filter instead of a real coefficient. Note, however, that in the limit $r_a, r_b \to \infty$ the
reflection and transmission functions given by Eqs. (2.73) approach the reflection and
transmission coefficients for a junction of two cylindrical tubes specified for pressure
waves (see Section 2.2.4).
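The closed-form reflection function can be cross-checked numerically against its definition in terms of the directional admittances. The sketch below evaluates R+ both from the admittance ratio of Eq. (2.70a), using admittances of the form (A / rho c)(1 -/+ j / (k r)) as in Eq. (2.67), and from the first-order expression in terms of B and alpha as in Eqs. (2.73a) to (2.75). The air density, speed of sound, geometry, and function names are assumed values chosen for illustration.

```python
import math

RHO = 1.2     # assumed air density (kg/m^3)
C = 343.0     # assumed speed of sound (m/s)


def cone_admittance(area, r, omega, direction):
    """Directional admittance of a cone; direction=+1 gives Y+, direction=-1 gives Y-."""
    k = omega / C
    return area / (RHO * C) * (1.0 - direction * 1j / (k * r))


def reflection_from_admittances(aa, ab, ra, rb, omega):
    """R+ as the admittance ratio (Ya+ - Yb+) / (Ya- + Yb+)."""
    ya_plus = cone_admittance(aa, ra, omega, +1)
    ya_minus = cone_admittance(aa, ra, omega, -1)
    yb_plus = cone_admittance(ab, rb, omega, +1)
    return (ya_plus - yb_plus) / (ya_minus + yb_plus)


def reflection_closed_form(aa, ab, ra, rb, omega):
    """R+ as a constant minus a first-order low-pass term."""
    b = aa / ab
    alpha = -C / (aa + ab) * (aa / ra - ab / rb)
    return (b - 1) / (b + 1) - 2 * b * alpha / ((b + 1) * (1j * omega + alpha))


# Hypothetical geometry: areas in m^2, apex distances in m, evaluated at 1 kHz.
aa, ab, ra, rb = 3e-4, 5e-4, -0.1, 0.3
omega = 2 * math.pi * 1000.0
r_adm = reflection_from_admittances(aa, ab, ra, rb, omega)
r_cf = reflection_closed_form(aa, ab, ra, rb, omega)

# Nearly cylindrical limit: very distant apexes.
r_cyl = reflection_closed_form(aa, ab, 1e9, 1e9, omega)
```

For very large apex distances the result approaches the real coefficient (B - 1)/(B + 1), in line with the cylindrical limit noted above.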
The expression for the reflection function $R^+$ of a junction of conical tubes of different diameters and tapers has also been derived by Martínez and Agulló (1988) but using
another approach. Also, they use the subscripts + and − to mean exactly the opposite.
Above we have derived equations for the reflection and transmission functions of a
junction of two conical tubes using acoustic admittances. This choice was made because
it leads to simpler formulas. As an example we give the equation for $R^+(\omega)$ using
acoustic impedances to show the difference between these two formulations:

$$R^+(\omega) = \frac{\dfrac{1}{Z_a^+(\omega)} - \dfrac{1}{Z_b^+(\omega)}}{\dfrac{1}{Z_a^-(\omega)} + \dfrac{1}{Z_b^+(\omega)}} = \frac{Z_b^+(\omega) - Z_a^+(\omega)}{\dfrac{Z_a^+(\omega)\,Z_b^+(\omega)}{Z_a^-(\omega)} + Z_a^+(\omega)} \tag{2.76}$$

This form should be compared with Eq. (2.70a).


2.4.3 Reflection Function with Taper Discontinuity Only
A useful special case of the junction of two conical tube sections is obtained by setting
areas $A_a$ and $A_b$ to be equal (see Fig. 2.14). This implies that B = 1. Substituting this
into Eqs. (2.73) we obtain the reflection and transmission functions for a junction with
taper discontinuity only, that is

$$R^+(\omega) = R^-(\omega) = -\frac{\alpha}{j\omega + \alpha} \tag{2.77a}$$

$$T^+(\omega) = T^-(\omega) = 1 - \frac{\alpha}{j\omega + \alpha} \tag{2.77b}$$

where $\alpha$ is defined as in Eq. (2.75). However, now when the areas $A_a$ and $A_b$ are equal,
the coefficient $\alpha$ is simplified into the form

$$\alpha = -\frac{c}{2r_a} + \frac{c}{2r_b} = \frac{c\,(r_a - r_b)}{2 r_a r_b} \tag{2.78}$$

where c is the speed of sound, and $r_a$ and $r_b$ are the distances from the (imagined) tips
of cones a and b, respectively, as illustrated in Fig. 2.14. Note that now the reflection
and transmission functions given by Eqs. (2.77) are the same seen from both sides of
the junction.

Fig. 2.14 A junction of conical tubes that have the same diameter A at the junction. In
this example $r_a < 0$ and $r_b > 0$.
2.4.4 Closed and Open End of a Truncated Conical Tube
When the end of a truncated conical tube is closed, the radiation impedance $Z_b^+(\omega)$
tends to infinity, which implies that the radiation admittance $Y_b^+(\omega)$ tends to zero. Then
the reflection function of a spherical wave is given by

$$R^+(\omega) = \frac{Y_a^+(\omega) - 0}{Y_a^-(\omega) + 0} = \frac{1 - j/kr_a}{1 + j/kr_a} = \frac{j\omega + c/r_a}{j\omega - c/r_a} = 1 + \frac{2c/r_a}{j\omega - c/r_a} \tag{2.79}$$

This equation can be used for modeling the reflection of an acoustic wave at the closed end
of a conical tube. For $r_a = 0$ Eq. (2.79) is not well defined. This is not a problem in
practice because the case is not physically meaningful. Note that in the limit $r_a \to \infty$
the transfer function approaches unity, which is the reflection coefficient for a closed
end of a lossless cylindrical tube (see Section 2.2.5).
The acoustic impedance of an open end of a conical tube has not been formulated.
Thus it is not possible to derive the corresponding reflection function. Martínez and
Agulló (1988) have proposed an ad hoc continuous-time reflection function that can be
used as a first approximation:

$$r_0(t) = \begin{cases} -a^2\, t\, e^{-at}, & t \ge 0 \\ 0, & t < 0 \end{cases} \tag{2.80a}$$

with

0.787e
1
a = c

2ra
Do

(2.80b)

where c is the speed of sound, $D_o$ is the diameter of the open end, $r_a$ the distance of the
opening from the apex of the conical tube section, and e Napier's constant (e ≈ 2.71828).
The function given by Martínez and Agulló approximates the reflection of a pressure
wave from the open end of a conical tube in an infinite wall. This assumption is often
used in models of speech production. The above reflection function is not a very accurate approximation as may be observed by comparing the impulse responses shown in
Martínez and Agulló (1988), but it may be good enough for a speech synthesizer.
Note that the case of a conical tube in an infinite wall is not interesting for modeling
wind instruments. Rather, we should find a proper approximation for the situation
where there is no wall of any kind, but the tube is in free space.
Another approximation for the radiation impedance of the open end of a conical tube
is obtained by using the impedance of a spherical source of an equivalent radiation area.
Interestingly enough, this impedance has the same form as that of an infinite conical
section.
2.4.5 Discrete-Time Reflection and Transmission Filters
In this section we discuss digital waveguide modeling of conical tube systems. We convert the continuous-time reflection functions derived in Section 2.1 into discrete-time
filters using the impulse-invariant transformation. This approach has been suggested by
Smith (1987) to be used in digital waveguide modeling.
The reflection function for the taper and diameter discontinuity in a conical tube is
given by Eq. (2.73a) as

$$R(\omega) = \frac{B - 1}{B + 1} - \frac{2B\alpha}{(B + 1)(j\omega + \alpha)} \tag{2.81}$$

where the coefficient $\alpha$ is defined by Eq. (2.75). The digital filter for the reflection
junction is obtained by applying the impulse-invariant transform to Eq. (2.81):

$$R(z) = \frac{B - 1}{B + 1} - \frac{2BT\alpha/(B + 1)}{1 - e^{-\alpha T} z^{-1}} \tag{2.82}$$

In many applications, such as modeling of wind instrument bores or the human
vocal tract, one needs to model a junction of conical tubes with no diameter discontinuity. Since the reflection filters $R^+(\omega)$ and $R^-(\omega)$ are in this case the same and the
transmission filters are defined by $T^+(\omega) = 1 + R^+(\omega)$ and $T^-(\omega) = 1 + R^-(\omega)$, we can
use a single reflection filter $R(\omega) = R^+(\omega)$. The impulse-invariant transform of the
reflection function given by Eq. (2.77a) yields a first-order all-pole filter

$$R(z) = \frac{b_0}{1 + a_1 z^{-1}} \tag{2.83a}$$

with the filter coefficient

$$a_1 = -e^{-\alpha T} \tag{2.83b}$$

where $\alpha$ is defined as in Eq. (2.78). It is necessary to scale this transfer function to have
a gain of −1 at $\omega = 0$, which is the value of the continuous-time reflection function (2.77a) at zero frequency. This is achieved by setting

$$b_0 = -(1 + a_1) = -1 + e^{-\alpha T} \tag{2.83c}$$

The discrete-time model for the junction of two conical tubes that have the same diameter at the junction is illustrated in Fig. 2.15.

Fig. 2.15 The two-port junction that models the scattering of pressure signals at the
junction of two conical tubes when there is no discontinuity of diameter.
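A minimal Python sketch of the first-order reflection filter for a taper discontinuity is given below. It computes the coefficient alpha from the apex distances, derives the pole coefficient and the scaled numerator so that the filter response equals -1 at zero frequency (the value of -alpha / (j omega + alpha) at omega = 0), and runs the resulting one-pole recursion. The function names, the speed of sound, and the geometry (ra = 0.1 m, rb = 0.02 m, sampling rate 44.1 kHz) are assumptions used for illustration.

```python
import cmath
import math


def taper_alpha(ra, rb, c=343.0):
    """alpha = c (ra - rb) / (2 ra rb) for equal areas at the junction."""
    return c * (ra - rb) / (2.0 * ra * rb)


def reflection_filter_coeffs(alpha, fs):
    """Coefficients (b0, a1) of R(z) = b0 / (1 + a1 z^-1)."""
    a1 = -math.exp(-alpha / fs)     # pole from the impulse-invariant mapping
    b0 = -(1.0 + a1)                # scaling that makes the DC response equal to -1
    return b0, a1


def freq_response(b0, a1, f, fs):
    """Evaluate R(z) on the unit circle at frequency f (Hz)."""
    z_inv = cmath.exp(-2j * math.pi * f / fs)
    return b0 / (1.0 + a1 * z_inv)


def filter_signal(b0, a1, x):
    """One-pole recursion y[n] = b0 x[n] - a1 y[n-1] with zero initial state."""
    y, prev = [], 0.0
    for xn in x:
        prev = b0 * xn - a1 * prev
        y.append(prev)
    return y


fs = 44100.0
alpha = taper_alpha(0.1, 0.02)      # positive, so the pole lies inside the unit circle
b0, a1 = reflection_filter_coeffs(alpha, fs)
dc_gain = freq_response(b0, a1, 0.0, fs)
step = filter_signal(b0, a1, [1.0] * 400)
```

With these values the step response settles to -1, which reflects the inverting low-frequency behavior of the junction.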
Figure 2.16 shows the magnitude responses of the continuous-time filter $R(\omega)$ and
its discrete-time version $R(e^{j\omega})$ (obtained with the impulse-invariant method) with five
different values of $r_b$. The parameter $r_a$ is equal to 0.1 m in these examples. The effect
of aliasing due to the impulse-invariant transformation is clearly seen by comparing the
curves of the analog (dotted lines) and the digital (solid lines) filters. Note, however,
that the difference between the magnitude response pairs is several dB only at frequencies over 10 kHz. At low frequencies, which are of great importance, the digital approximations coincide with the analog ones.

Fig. 2.16 Magnitude responses of the analog (dashed lines) and digital (solid lines)
reflection functions with taper discontinuity only. The sampling rate is 44.1
kHz. [Plot of magnitude (dB) versus frequency (kHz) for $r_a$ = 0.1 and $r_b$ = 0.01, 0.02, 0.12, 0.2, and −0.2.]

Fig. 2.17 The stability of the reflection filter R(z) depends on the parameters $r_a$ and $r_b$.
The filter is unstable in the shaded regions and stable in the white ones.
2.4.6 Stability of the Reflection Filter
The reflection filter R(z) defined by Eq. (2.83) is stable when $e^{-\alpha T} < 1$, which implies
that $\alpha$ must be positive (since T > 0). The sign of $\alpha$ depends on the values of $r_a$ and $r_b$
according to Eq. (2.78).
The regions of stability and instability of R(z) are illustrated in Fig. 2.17 on a
parameter plane where the abscissa corresponds to the value of ra and the ordinate to
the value of rb . There are three distinct regions where the filter R(z) is stable and three
where it is unstable. The unstable regions include half of all possible conical tube configurations. These cases may be impossible to avoid in practice.
Fortunately it appears that having an unstable filter as a part of a larger system does
not necessarily imply that the overall system is unstable. Physical systems are passive
and they are always stable. Also digital models that simulate physically realizable systems have to be stable.
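The stability condition on the sign of alpha can be turned into a small test that classifies a given pair of apex distances. The sketch below is illustrative only: the function names and the sample values are assumptions, and the case where either distance is zero is excluded because alpha is then undefined.

```python
def taper_alpha(ra, rb, c=343.0):
    """alpha = c (ra - rb) / (2 ra rb); ra and rb must be nonzero."""
    return c * (ra - rb) / (2.0 * ra * rb)


def reflection_filter_is_stable(ra, rb, c=343.0):
    """The one-pole reflection filter is stable when exp(-alpha T) < 1, i.e. alpha > 0."""
    return taper_alpha(ra, rb, c) > 0.0


# One hypothetical sample point from each of the six regions of the (ra, rb) plane.
samples = {
    (0.2, 0.1): reflection_filter_is_stable(0.2, 0.1),       # ra > rb > 0
    (0.1, 0.2): reflection_filter_is_stable(0.1, 0.2),       # rb > ra > 0
    (-0.1, 0.2): reflection_filter_is_stable(-0.1, 0.2),     # ra < 0 < rb
    (0.1, -0.2): reflection_filter_is_stable(0.1, -0.2),     # rb < 0 < ra
    (-0.1, -0.2): reflection_filter_is_stable(-0.1, -0.2),   # rb < ra < 0
    (-0.2, -0.1): reflection_filter_is_stable(-0.2, -0.1),   # ra < rb < 0
}
```

The six sample points give three stable and three unstable cases, matching the number of regions described for the parameter plane.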
2.4.7 Modeling the Ends of a Conical Tube
The reflection of pressure waves at an open or a closed end of a conical tube depends on
the frequency as was discussed in Section 2.4.4. The continuous-time formulas that
describe the reflection at the end of a conical tube may be converted into digital filters
using the impulse-invariant transformation.
The impulse-invariant transform of the reflection function (2.79) for the closed end
of a conical tube yields a first-order IIR filter

$$R(z) = 1 + \frac{b_0}{1 + a_1 z^{-1}} \tag{2.84}$$

with the coefficients $b_0 = 2cT/r_a$ and $a_1 = -e^{cT/r_a}$. This transfer function is stable
when $r_a < 0$ but unstable when $r_a > 0$.
The open end of a conical tube in an infinite wall can be modeled using the continuous-time reflection function of Eq. (2.80) designed by Martínez (Martínez and Agulló,
1988). It is easily transformed into the frequency domain by using the property that
differentiation in the frequency domain corresponds to multiplication by time in the
time domain. Thus, the Laplace transform may be written as

$$R_0(s) = a^2 \frac{d}{ds}\left(\frac{1}{s + a}\right) = -\frac{a^2}{(s + a)^2} \tag{2.85}$$

The corresponding z-transform obtained by the impulse-invariant method is given by

$$R_0(z) = a^2 T z \frac{d}{dz}\left(\frac{1}{1 - e^{-aT} z^{-1}}\right) \tag{2.86}$$

This yields

$$R_0(z) = \frac{b_1 z^{-1}}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{2.87a}$$

with the coefficients

$$b_1 = -a^2 T e^{-aT}, \quad a_1 = -2e^{-aT}, \quad a_2 = e^{-2aT} \tag{2.87b}$$

The impulse response of the discrete-time filter given by Eq. (2.87) is a sampled version of Eq. (2.80a), that is

$$h(nT) = -a^2\, nT\, e^{-anT} \quad \text{for } n = 0, 1, 2, \ldots \tag{2.88}$$
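The relation between the second-order filter and the sampled reflection function can be verified with a short simulation. The sketch below builds the coefficients b1 = -a^2 T exp(-aT), a1 = -2 exp(-aT), a2 = exp(-2aT), runs the difference equation on a unit impulse, and compares the output with -a^2 nT exp(-anT). The value of the parameter a is a hypothetical number chosen for illustration (it is not computed from the tube geometry here), and the function names are assumptions.

```python
import math


def open_end_coeffs(a, fs):
    """Coefficients (b1, a1, a2) of R0(z) = b1 z^-1 / (1 + a1 z^-1 + a2 z^-2)."""
    T = 1.0 / fs
    q = math.exp(-a * T)
    return -a * a * T * q, -2.0 * q, q * q


def impulse_response(b1, a1, a2, n_samples):
    """Run y[n] = b1 x[n-1] - a1 y[n-1] - a2 y[n-2] for a unit impulse input."""
    y = []
    for n in range(n_samples):
        x_delayed = 1.0 if n == 1 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(b1 * x_delayed - a1 * y1 - a2 * y2)
    return y


def sampled_reflection(a, fs, n_samples):
    """Samples of -a^2 t exp(-a t) at t = nT."""
    T = 1.0 / fs
    return [-a * a * n * T * math.exp(-a * n * T) for n in range(n_samples)]


fs = 44100.0
a = 2.0e4                      # hypothetical decay parameter in 1/s
b1, a1, a2 = open_end_coeffs(a, fs)
h_filter = impulse_response(b1, a1, a2, 20)
h_target = sampled_reflection(a, fs, 20)
```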

2.4.8 Conclusions and Discussion


This section has discussed computational modeling of conical acoustic tubes. A tube
system consisting of truncated cones was studied. An interesting special case of this
configuration is one with no discontinuity in the diameter between the contiguous conical tube sections, that is, only the taper of the tube is changed at the junction. This kind
of model may be utilized for conical waveguide simulation of the human vocal tract or
the bores of wind instruments.
The waveguide simulation of acoustic tubes constructed of conical parts is a new
area of research. It was first discussed by Välimäki and Karjalainen (1994a, 1994b), see
also Välimäki (1994b). Recently, de Bruin and Walstijn (1995) have implemented a
conical waveguide model following the same guidelines as presented here (see also
Walstijn and de Bruin, 1995).
Several questions are left for future research. For example, waveguide simulation of
finger holes in a conical bore should be studied using the formulations given in
Martínez et al. (1988). This would be useful for the modeling of woodwind instruments
such as the saxophone. Furthermore, the stability of the overall waveguide model is an
issue that should be demonstrated in the general case or at least for certain useful configurations.
Another important topic for future research is to verify the accuracy of the digital
model by acoustic measurements. Tube systems including conical parts can be excited
by a short pulse and the response measured by a microphone. The actual impulse
response is then obtained by deconvolution of the pulse and the response. These results
may be compared to those obtained using a waveguide model. Agulló et al. (1994,
1995) have measured conical tube systems and showed that the results are in good
agreement with the theoretical formulations reported in Martínez and Agulló (1988).
A major unsolved problem is the reflection function for the open end of a conical
tube. The exact mathematical solution is obviously involved. Hence we suggest a series
of measurements of impulse responses and reflection functions of open conical horns of
different taper and diameter. A useful digital reflection filter may then be designed by
applying one of the standard techniques used in digital signal processing.

2.5 Waveguide Models for Wind Instruments


One of the first applications of the waveguide modeling technique was a model-based
sound synthesis of the clarinet (Smith, 1986; Hirschman, 1991; Välimäki et al., 1992b;
Rocchesso and Turra, 1993). A waveguide flute model was introduced in Karjalainen et
al. (1991) and Välimäki et al. (1992a). The block diagram of a general wind instrument
model is depicted in Fig. 2.18.
Now the waveguide represents the propagation of wave components in opposite
directions in an acoustic tube. The length L of the delay lines is related to the effective
length leff of the tube in the following way:
$$L = \frac{l_{\mathrm{eff}}\, f_s}{c} \tag{2.89}$$

where $f_s$ is the sampling rate. Note that in general L is not an integer and again a fractional delay filter must be used to implement a real-valued delay.
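As a small worked example of Eq. (2.89), the following sketch computes the delay-line length for an assumed effective tube length and splits it into an integer number of unit delays and a fractional remainder to be realized by a fractional delay filter. The tube length (0.6 m), sampling rate (44.1 kHz), speed of sound (343 m/s), and function names are assumptions chosen for illustration.

```python
import math


def delay_line_length(l_eff, fs, c=343.0):
    """Delay-line length in samples: L = l_eff * fs / c."""
    return l_eff * fs / c


def split_delay(total_delay):
    """Split a real-valued delay into an integer part and a fractional part in [0, 1)."""
    integer_part = math.floor(total_delay)
    return integer_part, total_delay - integer_part


L = delay_line_length(0.6, 44100.0)       # 26460 / 343 samples
n_unit_delays, fraction = split_delay(L)
```

For these assumed values the length is 77 whole samples plus one seventh of a sample, so rounding to an integer length would detune the model slightly.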
The reflection model consists of a digital filter that brings about the frequency-dependent reflection and radiation of the sound wave at the end of the bore. Note that
the reflection model has two outputs, one for the outgoing signal and the other for the
reflected, ingoing signal.
The excitation model includes a nonlinearity which is an essential part of the sound
production mechanism in wind instruments. In the case of the clarinet, this system
models the operation of a reed that controls the air flow through the mouthpiece. The
same nonlinearity is also suitable for other reed wind instruments, such as the saxophone (Cook, 1988). In the flute, the nonlinearity models the airflow into the bore and
is similar to a sigmoid function (Karjalainen et al., 1991; Välimäki et al., 1992a;
Välimäki, 1992).

[Figure 2.18: block diagram with the blocks Input, Excitation Model, Delay Line (two, one per propagation direction), Reflection Model, and Output.]
Fig. 2.18 A generic waveguide wind instrument model.

Cook (1991, 1992) introduced a waveguide model that is capable of synthesizing
brass instrument tones. His system is also based on the principle shown in Fig. 2.18.
However, the nonlinearity has been obtained by modeling the lips of the player as a
mass-spring oscillator. For discrete-time simulation the differential equations have been
replaced with difference equations.
2.5.1 Need for Finger-Hole Models
Later in this work we will address the problem of simulating finger holes in the bore of
a woodwind instrument. Finger holes (or tone holes, or side holes) are a unique feature
of woodwinds. They have two essential functions:
1) controlling the fundamental frequency, and
2) influencing the sound quality.
The effect of open finger holes on the sound is characterized by the open-hole lattice
cutoff frequency (Benade, 1960, 1976). Above this frequency the sound waves mainly
radiate out of the tube and produce only weak resonances in the tube. Furthermore, even
closed tone holes have their effect, as they effectively lengthen and enlarge the bore in
their vicinity (Benade, 1960, 1976).
So far the tone holes have usually been neglected in digital waveguide models for
woodwind instruments (see Smith, 1986; Välimäki et al., 1992a, 1992b; Cook, 1992).
The obvious limitation in these models is that the fundamental frequency of a synthesized tone can only be controlled by changing the length of the delay lines that form the
digital waveguide model for the bore. From the viewpoint of the sound wave propagating inside the bore, this is a very unnatural procedure. Dropping or adding unit delays
causes a discontinuity in the signal which can be heard as an annoying click. Another
deficiency of simplified woodwind bore modeling is that the tone hole lattice cutoff
effect is not automatically included in the model. Thus, its effect has to be added separately, usually during the design of the radiation filter.
In addition to the above-mentioned problems, the lack of tone holes limits the playing styles that are available to the user of the woodwind instrument model. Multiphonics, for example, cannot be produced by a model without finger holes. When
proper finger-hole models are included, multiphonics as well as other special effects,
such as key claps, become available to the user of the computer model just as they are
for the player of an acoustic instrument.
2.5.2 Modeling of a Single Woodwind Finger Hole
In traditional transmission line models for woodwind instruments a side hole is represented as a T network (Keefe, 1982). A two-by-two linear transformation will convert T
parameters to reflection and transmission coefficients. The T network model includes
both shunt and series impedances. In practice, the series impedance has a small value
and can be neglected (Coltman, 1979).
It is most important to simulate the open finger holes accurately since they determine
the fundamental frequency of the tone. Closed holes have a considerably less significant
effect and for simplification they are omitted in our model. In the following a computational model for an open hole is derived.
Model for an Open Finger Hole
The simplest approximation for the acoustic impedance $Z_s$ of a small open hole is pure
acoustic inertance, that is (see, e.g., Fletcher and Rossing, 1991)

$$
Z_s(\omega) = j\omega \frac{\rho l}{A_s} \tag{2.90}
$$

where $\rho$ is the density of air, $A_s$ is the cross-sectional area of the finger hole, and $l$ is
its effective height (see Keefe, 1982, 1990). Multiplication by $j\omega$ in the frequency
domain corresponds to differentiation in the time domain. In the discrete-time model the
impedance of an open hole can be approximated by means of a digital differentiator
$H_D(z)$. The transfer function of a digital filter approximating Eq. (2.90) is then

$$
Z_s(z) = \frac{\rho l}{A_s} H_D(z) \tag{2.91}
$$

Let us denote the discrete-time version of the reflection function of Eq. (2.49) by
$R(z)$. This reflection function is expressed as

$$
R(z) = -\frac{Y_s(z)}{Y_s(z) + 2Y_0} = -\frac{Z_0}{Z_0 + 2Z_s(z)} \tag{2.92}
$$

where $Y_s(z)$ and $Z_s(z)$ denote the discrete-time approximations for the continuous-time
functions $Y_s(\omega)$ and $Z_s(\omega)$, respectively. When Eq. (2.91) is substituted into Eq. (2.92)
the digital reflection filter associated with the finger-hole junction is given by

$$
R(z) = -\frac{Z_0}{Z_0 + 2Z_s(z)} = -\frac{\rho c / A_0}{(\rho c / A_0) + (2\rho l / A_s) H_D(z)} \tag{2.93}
$$

where $A_0$ is the cross-sectional area of the bore.


The simplest transfer function of a digital filter approximating the differentiator is

$$
H_D(z) = \frac{1}{T}\left(1 - z^{-1}\right) \tag{2.94}
$$

where $T$ is the sampling interval. This filter computes the difference of two successive
input samples, scaled by $1/T$, and acts as a high-pass filter. Using this approximation Eq. (2.93) can be
simplified into the form

$$
R(z) = -\frac{1 + a}{1 + a z^{-1}} \tag{2.95a}
$$

where the filter coefficient is given by

$$
a = -\frac{2 A_0 l}{2 A_0 l + A_s T c} \tag{2.95b}
$$

Equation (2.95a) is the transfer function of a first-order all-pole filter. Since the coefficient $a$ is always negative this is a lowpass filter. At very low frequencies it acts like a
constant multiplier, multiplying by $-1$.

[Figure 2.19: magnitude (dB) versus frequency (kHz) of the reflection functions.]
Fig. 2.19 Analog (dashed lines) and digital (solid lines) approximations to the reflection function of a finger-hole junction.
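
The step from Eq. (2.93) to Eq. (2.95) can be spelled out as follows. This is a worked sketch that uses only $Z_0 = \rho c / A_0$ and the first-order differentiator of Eq. (2.94); numerator and denominator are multiplied by $A_0 A_s T / \rho$.

$$
\begin{aligned}
R(z) &= -\frac{\rho c / A_0}{\rho c / A_0 + \dfrac{2\rho l}{A_s T}\left(1 - z^{-1}\right)}
      = -\frac{A_s T c}{\left(2 A_0 l + A_s T c\right) - 2 A_0 l\, z^{-1}} \\
     &= -\frac{1 + a}{1 + a z^{-1}},
      \qquad 1 + a = \frac{A_s T c}{2 A_0 l + A_s T c}
\end{aligned}
$$

Because all physical quantities are positive, $0 < 1 + a < 1$, so $-1 < a < 0$ and the single pole lies at $z = -a$ on the positive real axis inside the unit circle. Setting $z = 1$ gives $R(1) = -1$, which is the low-frequency behavior stated in the text.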
The magnitude responses of the analog reflection function $R_0(\omega)$ computed using
Eq. (2.49) with the impedance defined in Eq. (2.90) and the digital reflection function
$R(e^{j\omega})$ of Eq. (2.95) are illustrated in Fig. 2.19. The sampling rate used in the discrete-time simulation is 44.1 kHz, i.e., $T$ = 22.7 μs. The upper curves are computed for a hole
radius of 8.0 mm and an effective height of 17.5 mm, and the lower curves for a hole
radius of 6.25 mm and an effective height of 31.7 mm. These examples correspond to
two different finger holes of the flute (Coltman, 1979). The bore radius in the flute is
9.5 mm. The approximation error in the magnitude response of the digital filter is less
than 1 dB at frequencies below 10 kHz.
Each finger-hole model requires one filter consistent with Eq. (2.95). By controlling
the filter coefficient a in the range [ a0 , 0] (where a0 is the nominal value for the open
hole) in a suitable way, the finger hole can be closed and opened as in real woodwind
instruments to attain desired changes in the effective length of the bore.
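
The comparison described above can be sketched numerically. The code below evaluates the coefficient of Eq. (2.95b) and the magnitude of the digital filter of Eq. (2.95a) for the two flute holes quoted in the text, next to an analog response of the form $-Z_0/(Z_0 + 2Z_s)$ with $Z_s$ from Eq. (2.90). The function names, the assumption of circular cross-sections, the speed of sound of 343 m/s, and the use of that analog form for Eq. (2.49) are assumptions made for this example.

```python
import numpy as np

def hole_coefficient(r_bore, r_hole, l_eff, fs, c=343.0):
    """Coefficient a of Eq. (2.95b) for circular bore and hole cross-sections."""
    A0 = np.pi * r_bore ** 2     # cross-sectional area of the bore
    As = np.pi * r_hole ** 2     # cross-sectional area of the finger hole
    T = 1.0 / fs                 # sampling interval
    return -2.0 * A0 * l_eff / (2.0 * A0 * l_eff + As * T * c)

def digital_reflection(a, f, fs):
    """Frequency response of Eq. (2.95a) at the frequencies f (in Hz)."""
    z_inv = np.exp(-2j * np.pi * np.asarray(f) / fs)
    return -(1.0 + a) / (1.0 + a * z_inv)

def analog_reflection(r_bore, r_hole, l_eff, f, c=343.0):
    """-Z0 / (Z0 + 2 Zs) with Zs from Eq. (2.90); the air density cancels."""
    A0 = np.pi * r_bore ** 2
    As = np.pi * r_hole ** 2
    w = 2.0 * np.pi * np.asarray(f)
    return -(c / A0) / (c / A0 + 2j * w * l_eff / As)

fs = 44100.0
f = np.array([0.0, 1000.0, 5000.0, 10000.0])
# (hole radius, effective height) in metres for the two flute holes in the text
for r_hole, l_eff in [(8.0e-3, 17.5e-3), (6.25e-3, 31.7e-3)]:
    a = hole_coefficient(9.5e-3, r_hole, l_eff, fs)
    dig = 20 * np.log10(np.abs(digital_reflection(a, f, fs)))
    ana = 20 * np.log10(np.abs(analog_reflection(9.5e-3, r_hole, l_eff, f)))
    print(f"a = {a:.4f}  digital dB: {np.round(dig, 2)}  analog dB: {np.round(ana, 2)}")
```

The printed values depend on the assumed speed of sound, so they should be read as indicative rather than as a reproduction of Fig. 2.19.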
2.5.3 Discrete-Time Scattering Junction
Let us consider the implementation of a finger-hole model via a three-port junction as
discussed in Section 2.3.3. The frequency-domain Eqs. (2.55) that describe the three-port junction for volume velocity waves can be expressed in the (discrete) time domain
as

$$
u_a^-(n - D) = u_b^-(n - D) + u_s^-(n) + w(n) \tag{2.96a}
$$

$$
u_b^+(n - D) = u_a^+(n - D) + u_s^-(n) + w(n) \tag{2.96b}
$$

$$
u_s^+(n) = -u_s^-(n) - 2w(n) \tag{2.96c}
$$

where $D$ is the real-valued delay corresponding to the position of the center of the finger
hole and $w(n)$ is the output of the reflection filter $R(z)$ (see Fig. 2.20), i.e.,

$$
w(n) = r(n) * w_1(n) \tag{2.97}
$$

where $r(n)$ is the impulse response of the reflection filter $R(z)$, $*$ denotes the convolution sum, and the input signal $w_1(n)$ is

$$
w_1(n) = u_a^+(n - D) + u_b^-(n - D) + u_s^-(n) \tag{2.98}
$$

The signal flow graph for a finger-hole model is depicted in Fig. 2.20. In this figure
the variable quantities are the same as in Fig. 2.11.

[Figure 2.20: signal flow graph with the wave variables $u_a^+$, $u_a^-$, $u_b^+$, $u_b^-$, $u_s^+$, $u_s^-$, the filter $R(z)$, and its output $w(n)$.]
Fig. 2.20 A computational model for scattering of volume velocity waves in a finger-hole junction.
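
The junction equations (2.96) to (2.98) translate into a per-sample update. The sketch below assumes that the caller supplies the already delayed samples $u_a^+(n - D)$ and $u_b^-(n - D)$, for example from fractional delay filters, and that $R(z)$ is the one-pole filter of Eq. (2.95a). The class name, the method name, and the coefficient value $-0.86$ are hypothetical.

```python
class FingerHoleJunction:
    """One-sample update of Eqs. (2.96)-(2.98) with the one-pole R(z) of Eq. (2.95a)."""

    def __init__(self, a):
        self.a = a          # reflection filter coefficient, assumed -1 <= a < 0
        self.w_prev = 0.0   # previous filter output w(n - 1)

    def tick(self, ua_plus, ub_minus, us_minus=0.0):
        """Inputs are the delayed samples ua+(n - D), ub-(n - D) and the hole input us-(n)."""
        w1 = ua_plus + ub_minus + us_minus                  # Eq. (2.98)
        # R(z) = -(1 + a) / (1 + a z^-1)  =>  w(n) = -(1 + a) w1(n) - a w(n - 1)
        w = -(1.0 + self.a) * w1 - self.a * self.w_prev     # Eq. (2.97)
        self.w_prev = w
        ua_minus = ub_minus + us_minus + w                  # Eq. (2.96a)
        ub_plus = ua_plus + us_minus + w                    # Eq. (2.96b)
        us_plus = -us_minus - 2.0 * w                       # Eq. (2.96c)
        return ua_minus, ub_plus, us_plus

# Constant (DC) wave arriving from side a at an open hole with a hypothetical a = -0.86
junction = FingerHoleJunction(a=-0.86)
for _ in range(500):
    out = junction.tick(ua_plus=1.0, ub_minus=0.0)
print(out)
```

With $a = -1$ the filter output stays at zero and the junction becomes transparent, which corresponds to a fully closed hole in this simplified model.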

2.6 Relation to Other Methods


Smith (1987) has found that there are other approaches that have a lot in common with
digital waveguide filters (WGF). Digital ladder and lattice filters investigated by Gray
and Markel (1973, 1975) are discrete-time structures similar to the WGFs. The normalized ladder filter structure is a special case of a WGF. The KL vocal tract model is also
a kind of digital lattice filter.
Digital waveguide filters are also closely related to wave digital filters (WDF). In
WDFs, circuits are constructed utilizing adaptors that interconnect ports having different impedances (Fettweis, 1972). Adaptors are equivalent to junctions of waveguides. A
comprehensive tutorial on WDFs has been written by Fettweis (1986).

2.6.1 Comparison of Waveguide Filters and Wave Digital Filters


Smith (1987, pp. 92-93) has stated the differences between WDFs and WGFs:
1) WDFs typically digitize lumped analog filter structures, in general RLC networks,
whereas WGFs digitize distributed physical systems such as acoustic resonators.
2) In WDFs the frequency variable is transformed into the discrete-time domain via
the bilinear transform. In WGFs, the frequency is digitized using the impulse-invariant method.
3) In WDFs, the wave variable is a combination of voltage and current. The wave
variable in WGFs is typically a basic physical quantity, e.g., the acoustic pressure
or volume velocity.
A major motivation for using WDFs in an application is their low sensitivity to
changes in multiplier coefficients. Namely, the coefficients of WDFs can be represented
with the accuracy of a few bits without changing the behavior of the filter considerably.
On the other hand, WGFs are used because they provide a simple, intuitive representation of a physical system. Furthermore, WGFs can be implemented efficiently using a
computer.
The Kelly-Lochbaum speech synthesis model is based on a waveguide model for the
human vocal tract, although in some references it is stated to be an example of the wave
digital filter approach (see, e.g., Schroeter and Sondhi, 1994). This is a reflection of the
fact that WDFs and WGFs are closely related. Today WGFs are considered to be a special case of WDFs.*
Smith (1987, p. 115) criticizes the use of the bilinear transform in WDFs by pointing
out that the model is exact only at three frequencies: 0, ∞, and an arbitrary mapping
frequency. The waveguide model, on the other hand, utilizes the impulse-invariant
method for transforming the frequency variable. Thus wave propagation is modeled
accurately at discrete points. Smith states that for this reason it is possible to use voltage
or current as a signal variable without having realization problems. In WDFs it is necessary to use a wave variable that is a combination of voltage and current (see, e.g., Fettweis, 1986).
2.6.2 Reverberation and Simulation of Room Acoustics
Reverberators are used to add spatial impression to studio recordings of speech and
music and also to change the acoustics of listening rooms. Recursive digital filters
including delay lines have been used for implementing artificial reverberators for about
30 years. The initial work in this field is due to Schroeder (1962). A comprehensive
study on digital filter structures for reverberation was carried out by Moorer (1979).
These techniques have been extended to include a set of delay lines with a feedback
matrix (Jot and Chaigne, 1991; Rocchesso, 1993; Rocchesso and Smith, 1994). This
kind of system is called a feedback delay network.
Artificial reverberation was the first problem where waveguide filters were applied
(Smith, 1985). In this paper Smith defined the digital waveguide as a lossless bidirectional signal branch, the simplest special case being a bidirectional delay line. As a
generalization he mentions a cascade of allpass filters. As we have seen in this chapter,
a practical digital waveguide system also incorporates losses in terms of gain factors
less than one or digital (lowpass) filters whose magnitude response is not greater than
unity at any frequency.

* J. O. Smith, personal communication, November 27, 1995.

The relationship of the feedback delay network and the digital waveguide network
has been studied by Smith and Rocchesso (1994). Recently, the digital waveguide
technique has been generalized to two dimensions (Van Duyne and Smith, 1993).
Further generalization to three or more dimensions is straightforward. Savioja et al.
(1994) have demonstrated that a three-dimensional waveguide mesh can be used for the
simulation of room acoustics.

2.7 Conclusions
In this chapter the fundamentals of waveguide modeling were discussed. Digital
waveguides provide a systematic method for designing computational models for
physical systems. So far this approach has primarily been applied to one-dimensional
resonators since two- and three-dimensional extensions have been introduced only
recently.
The digital waveguide methods have been used for synthesis of music and speech.
Other techniques that lead to structures similar to WGFs were reviewed. These include
the Karplus-Strong plucked string algorithm, the Kelly-Lochbaum vocal tract model,
delay networks for reverberation, and wave digital filters.
One of the main drawbacks of the WGFs is the discretization of the time variable,
which leads to inaccurate modeling of propagation delays. This is the motivation for
the use of fractional delays in waveguide models. In the next chapter we discuss the
theory of fractional delay filters.

3 Fractional Delay Filters


In this chapter we review the digital filter design techniques for the approximation of a
fractional delay (FD). They can be utilized in many areas of digital signal processing.
Examples of these fields are time delay estimation (Smith and Friedlander, 1985), null
steering in the direction pattern of antenna arrays (Ko and Lim, 1988), timing adjustment of digital modems (Farrow, 1988; Armstrong and Strickland, 1993; Gardner,
1993; Erup et al., 1993), speech coding (Marques et al., 1989, 1990; Kroon and Atal,
1991; Medan, 1991), speech synthesis (Matsui et al., 1991), and arbitrary sampling-rate
conversion (Tarczynski et al., 1994). Fractional delays are essential in digital waveguide models for one-dimensional resonators (Jaffe and Smith, 1983; Laine, 1988;
Smith, 1983; Sullivan, 1990; Karjalainen and Laine, 1991; Välimäki et al., 1992a).
There are plenty of design methods for both FIR and IIR fractional delay filters.
Only the most useful of them are discussed in detail. A more comprehensive study of
known filter design methods for FD approximation has been written by Laakso et al.
(1994). Of special interest in our applications are digital filters that approximate the
ideal interpolation in a maximally flat manner at low frequencies. The maximally flat
FIR filter approximation is equivalent to the classical Lagrange interpolation method.
This technique is discussed in Section 3.3. The corresponding IIR or allpass filter has a
maximally flat group delay response. It is called the Thiran interpolator and is studied in
Section 3.4.3. These two design methods appear to be most effective in digital waveguide modeling mainly because they approximate ideal interpolation very accurately at
frequencies near the fundamental pitch of speech and music signals.

3.1 Ideal Fractional Delay


We first discuss the concept of fractional delay in continuous and discrete time and
consider the ideal solution of the FD problem to show why approximation is necessary.
3.1.1 Continuous-Time System for Arbitrary Delay
Consider a delay element, which is a linear system whose purpose is to delay an incoming continuous-time signal $x_c(t)$ by $\tau$ (in seconds). The output signal $y_c(t)$ of this system can be expressed as

$$
y_c(t) = x_c(t - \tau) \tag{3.1}
$$

where the subscript c refers to continuous time. The Fourier transform $X_c(\Omega)$ of a
continuous-time signal $x_c(t)$ is defined as

$$
X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t)\, e^{-j\Omega t}\, dt \tag{3.2}
$$

where $\Omega = 2\pi f$ is the angular frequency in radians per second. The Fourier transform $Y_c(\Omega)$ of the
delayed signal $y_c(t)$ can be written in terms of $X_c(\Omega)$:

$$
Y_c(\Omega) = \int_{-\infty}^{\infty} y_c(t)\, e^{-j\Omega t}\, dt = \int_{-\infty}^{\infty} x_c(t - \tau)\, e^{-j\Omega t}\, dt = e^{-j\Omega\tau} X_c(\Omega) \tag{3.3}
$$

The transfer function $H_{id}(\Omega)$ of the delay element can be expressed by means of
Fourier transforms $X_c(\Omega)$ and $Y_c(\Omega)$. This yields

$$
H_{id}(\Omega) = \frac{Y_c(\Omega)}{X_c(\Omega)} = \frac{e^{-j\Omega\tau} X_c(\Omega)}{X_c(\Omega)} = e^{-j\Omega\tau} \tag{3.4}
$$

The term $e^{-j\Omega\tau}$ corresponds to the Fourier transform of the delay of $\tau$.


3.1.2 Discrete-Time System for Arbitrary Delay
Next we consider a discrete-time delay system. If the Fourier transform $X_c(\Omega)$ is nonzero only on a finite interval around $\Omega = 0$, the continuous-time signal $x_c(t)$ is said to be
bandlimited. It may then be expressed by its samples $x(nT)$, where $n \in \mathbb{Z}$ is the sample
index and $T$ is the sample interval, i.e., the inverse of the sampling rate. For convenience of notation, we omit $T$ and simply write $x(n)$ to denote the samples of the discrete-time signal.
We want to express the discrete-time version of the delay operation for a sampled
bandlimited signal. The outcoming discrete-time signal y(n) can simply be written as
y(n) = x(n - D)

(3.5)

where $D = \tau / T$ is the desired delay as a multiple of the unit delay. Note that $\tau / T$ is generally irrational since $\tau$ is usually not an integral multiple of the sampling interval $T$.
Unfortunately, Eq. (3.5) is meaningful only for integral values of D. Then the samples
of the output sequence y(n) are equal to the delayed samples of the input sequence x(n),
and the delay element is called a digital delay line. However, if D were real, the delay
operation would not be this simple, since the output value would lie somewhere
between the known samples of x(n). The sample values of y(n) would then have to be
obtained by way of interpolation from the sequence x(n).
The spectrum of a discrete-time signal can be expressed by means of the discrete-time Fourier transform (DTFT). In this integral transform, the time variable is discretized, but the frequency variable is continuous. The DTFT of signal $x(n)$ is defined as
(see, e.g., Roberts and Mullis, 1987, pp. 87-88; Jackson, 1989, pp. 106-118)

$$
X(\omega) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n}, \qquad |\omega| \le \pi \tag{3.6}
$$

where $\omega = 2\pi f T$ is the normalized angular frequency. The DTFT of the output signal
$y(n)$ can be written as

$$
Y(\omega) = \sum_{n=-\infty}^{\infty} y(n)\, e^{-j\omega n} = \sum_{n=-\infty}^{\infty} x(n - D)\, e^{-j\omega n} = e^{-j\omega D} X(\omega) \tag{3.7}
$$

The transfer function of an ideal discrete-time delay element can be given as

$$
H_{id}(\omega) = \frac{Y(\omega)}{X(\omega)} = \frac{e^{-j\omega D} X(\omega)}{X(\omega)} = e^{-j\omega D}, \qquad |\omega| \le \pi \tag{3.8}
$$

This is the same result as given in Eq. (3.4); only now the angular frequency is circular
due to discretization in time. In order to be consistent with the z-transform notation used
commonly in digital signal processing, let us express the transfer function as

$$
H_{id}(z) = \frac{Y(z)}{X(z)} = \frac{z^{-D} X(z)}{X(z)} = z^{-D} \tag{3.9}
$$

where $D \in \mathbb{R}_+$ is the length of the delay in samples. The delay $D$ may be written in the
form

$$
D = D_{int} + d \tag{3.10}
$$

where $0 \le d < 1$ is the fractional delay and the integer part $D_{int}$ is given by

$$
D_{int} = \lfloor D \rfloor \tag{3.11}
$$

where $\lfloor \cdot \rfloor$ is the greatest integer function. It is often called the floor function and is
defined as

$$
\lfloor x \rfloor = \max_{k \le x} k \tag{3.12}
$$

where $x \in \mathbb{R}$ and $k \in \mathbb{Z}$. The block diagram of the ideal delay element is illustrated in
Fig. 3.1.
Note that the z-transform representation in (3.9) is used in the Fourier transform
sense so that $z = e^{j\omega}$. In principle, the z-transform is defined only for integral powers of
$z$ and thus, if $D$ were a non-integer real number, the term $z^{-D}$ should be written as an infinite series making
the notation unnecessarily involved.
To understand how to produce a fractional delay using a discrete-time system it is
necessary to discuss interpolation techniques. Interpolation of a discrete-time signal is
based on the fact that the amplitude of the corresponding continuous-time bandlimited
signal changes smoothly between the sampling instants.

[Figure 3.1: block diagram in which the input $x(n)$ passes through the delay block $z^{-D}$ to give the output $y(n)$.]
Fig. 3.1 Ideal discrete-time delay element.

3.1.3 Fractional Delay and Shannon Reconstruction


The fractional delay d can in principle have any value between 0 and 1. To be able to
produce an arbitrary fractional delay for a discrete-time signal x(n), one has to know a
way to compute the amplitude of the underlying continuous-time signal x(t) for all t.
This leads us to fundamental issues of digital signal processing: sampling and reconstruction.
In 1949, Shannon introduced the sampling theorem (Shannon, 1949). It states that if
the Fourier transform $X_c(\Omega)$ of a continuous-time signal $x_c(t)$ is zero outside the frequency interval $(-f_c, f_c)$ then the signal $x_c(t)$ is completely determined by its values at equidistant
points spaced $1/(2 f_c)$ apart. This implies that the sampling rate $f_s$ has to be at least $2 f_c$.
Shannon gave the reconstruction formula for a sampled signal using the cardinal series

$$
x_c(t) = \sum_{n=-\infty}^{\infty} x(nT)\, \frac{\sin\left[\frac{\omega_s}{2}(t - nT)\right]}{\frac{\omega_s}{2}(t - nT)} = \sum_{n=-\infty}^{\infty} x(nT)\, \mathrm{sinc}\left[\frac{\omega_s}{2\pi}(t - nT)\right] \tag{3.13}
$$

where $\omega_s = 2\pi f_s$ is the sampling angular frequency in radians per second and $T$ is the
corresponding sampling interval, i.e., $T = 1/f_s$. The sinc function is defined as

$$
\mathrm{sinc}(t) = \frac{\sin(\pi t)}{\pi t} \tag{3.14}
$$

Note that $\mathrm{sinc}(0) = 1$ since $\lim_{t \to 0} \sin(\pi t) / (\pi t) = 1$.

According to Eq. (3.13) the ideal bandlimited interpolator has a continuous-time
impulse response

$$
h_c(t) = \frac{\sin\left(\frac{\omega_s t}{2}\right)}{\frac{\omega_s t}{2}} = \mathrm{sinc}\left(\frac{\omega_s t}{2\pi}\right) \tag{3.15}
$$

for $t \in \mathbb{R}$. This impulse response, which Shannon called the Whittaker cardinal function,
converts a discrete-time signal to a continuous-time one. In delay applications, however,
it is necessary to know the value of a signal at a single time instant between the
samples. The desired result may be obtained by shifting Eq. (3.15) by $D$ and then
sampling it at equidistant points. Hence, the output $y(n)$ of the ideal discrete-time fractional delay element is computed as

$$
y(n) = x(n - D) = \sum_{k=-\infty}^{\infty} x(k)\, \mathrm{sinc}(n - D - k) \tag{3.16}
$$

for $n \in \mathbb{Z}$ and $D \in \mathbb{R}$. Here we have simplified the notation by setting the sampling rate $f_s$
to 1 (and consequently, the sampling interval $T = 1$), since with a discrete-time system
description it is not necessary to fix the actual sampling rate.
It can be concluded that producing a fractional delay requires reconstruction of the
discrete-time signal and shifted resampling of the resulting continuous-time signal. In
Eq. (3.16) these two operations have been combined.
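
The combined reconstruction and resampling of Eq. (3.16) can be tried out numerically by truncating the infinite sum. The sketch below is only an approximation of the ideal element: the function name `sinc_delay`, the window of 50 samples on each side, and the test sinusoid are assumptions made for the example.

```python
import numpy as np

def sinc_delay(x, D, half_width=50):
    """Approximate y(n) = x(n - D) by truncating the sum of Eq. (3.16)."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    for n in range(len(x)):
        centre = n - int(np.floor(D))          # input index next to the peak of the shifted sinc
        lo = max(0, centre - half_width)
        hi = min(len(x), centre + half_width + 1)
        k = np.arange(lo, hi)
        # np.sinc(t) = sin(pi t) / (pi t), the definition of Eq. (3.14)
        y[n] = np.sum(x[k] * np.sinc(n - D - k))
    return y

# Delay a low-frequency sinusoid by D = 3.3 samples and compare with the exact result
w0 = 0.2 * np.pi
n = np.arange(400)
x = np.cos(w0 * n)
y = sinc_delay(x, D=3.3)
mid = slice(100, 300)                          # region where the full window fits in the signal
err = np.max(np.abs(y[mid] - np.cos(w0 * (n[mid] - 3.3))))
print(f"max error in the middle of the signal: {err:.4f}")
```

The residual error comes from the discarded tails of the sinc function, which decay only as $1/n$; this slow decay is one reason why better finite-length approximations are needed.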

3.1.4 Characteristics of the Ideal Fractional Delay Element


In digital signal processing, it is often advantageous to study linear filters in the frequency domain, that is, by their frequency response. The ideal bandlimited interpolator
given by (3.16) has the frequency response obtained from (3.8) as

$$
H_{id}(e^{j\omega}) = e^{-j\omega D}, \qquad |\omega| \le \pi \tag{3.17}
$$

The frequency response of the ideal delay element has the following magnitude and
phase characteristics

$$
\left| H_{id}(e^{j\omega}) \right| \equiv 1 \tag{3.18a}
$$

$$
\arg\left\{ H_{id}(e^{j\omega}) \right\} = \Theta_{id}(\omega) = -D\omega \tag{3.18b}
$$

In other words, the delay element is a linear-phase allpass system. Its phase delay and
the group delay are, respectively,

$$
\tau_{p,id}(\omega) = -\frac{\Theta_{id}(\omega)}{\omega} = D \tag{3.19a}
$$

and

$$
\tau_{g,id}(\omega) = -\frac{\partial \Theta_{id}(\omega)}{\partial \omega} = D \tag{3.19b}
$$

where $\Theta_{id}(\omega)$ is the ideal phase response as defined in Eq. (3.18b). From these results
it can be concluded that the ideal delay element passes all the frequency components of
an incoming signal with the same delay $D$.
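The inverse discrete-time Fourier transform of Eq. (3.17) gives the impulse response
(IR) of an ideal delay element [compare with Eq. (3.16)] as

$$
\begin{aligned}
h_{id}(n) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} H_{id}(e^{j\omega})\, e^{j\omega n}\, d\omega
           = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-j\omega D} e^{j\omega n}\, d\omega
           = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j\omega (n - D)}\, d\omega \\
          &= \frac{e^{j\pi(n - D)} - e^{-j\pi(n - D)}}{j 2\pi (n - D)}
           = \frac{\sin[\pi(n - D)]}{\pi(n - D)} = \mathrm{sinc}(n - D)
\end{aligned} \tag{3.20}
$$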

If $D$ is an integer ($d = 0$), the IR of the delay element is zero at all sampling points
except at $n = D$, that is

$$
d = 0 \;\Rightarrow\; h_{id}(n) = \begin{cases} 1 & \text{for } n = D \\ 0 & \text{otherwise} \end{cases} \tag{3.21}
$$

In this case, the delay element is implemented by a cascade of unit delays as discussed
earlier. When $D$ is a fractional number, i.e., $0 < d < 1$, the IR has non-zero values at all
index values $n \in \mathbb{Z}$, or

$$
0 < d < 1 \;\Rightarrow\; h_{id}(n) \ne 0 \quad \text{for all } n \tag{3.22}
$$

[Figure 3.2: two panels, each plotting amplitude versus sample index.]
Fig. 3.2 (Upper) The sinc function (dashed line) shifted by D = 3.0. The circles indicate the sampled values of the sinc function. All the sample values except the
centermost are zero when the delay is an integer, that is d = 0. (Lower) The
sinc function shifted by D = 3.3. The dotted line indicates the center point of
the shifted sinc function. This figure illustrates the fact that all the sample values (circles) are non-zero in the case of a fractional delay, i.e., d ≠ 0.
These time-domain properties of the ideal delay element are illustrated in Fig. 3.2. In
the upper part of this figure, the delay D is an integer and only one sample is non-zero
because the zero crossings of the sinc function coincide with the other sampling points.
In the lower part of Fig. 3.2, however, the delay D is a fractional number and all the
samples on the interval $(-\infty, \infty)$ are non-zero.
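
The two cases of Eqs. (3.21) and (3.22) are easy to check by sampling the shifted sinc function. The short sketch below uses the delay values 3.0 and 3.3 mentioned for Fig. 3.2; the index range from -2 to 8 is an arbitrary choice for the example.

```python
import numpy as np

n = np.arange(-2, 9)                  # a finite window of sample indices
h_integer = np.sinc(n - 3.0)          # D = 3.0, i.e. d = 0
h_fractional = np.sinc(n - 3.3)       # D = 3.3, i.e. d = 0.3

# For the integer delay only the sample at n = 3 survives (up to rounding error);
# for the fractional delay every sample in the window is non-zero.
print(np.round(h_integer, 3))
print(np.round(h_fractional, 3))
```

In floating-point arithmetic the zero crossings of the integer-delay case come out as values on the order of machine precision rather than exact zeros, hence the rounding before printing.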
To conclude, the impulse response $h_{id}(n)$ of the delay element is a shifted and sampled version of the sinc function, which is infinitely long. Due to this the IR corresponds
to a noncausal filter which cannot be made causal by a finite shift in time. In addition,
the filter is not BIBO (bounded-input bounded-output) stable since the impulse response (3.20) is not absolutely
summable. This kind of filter is nonrealizable. To produce a realizable fractional delay
filter, some finite-length approximation for the sinc function has to be used.
Before considering these approximations we note one particular property of the
ideal transfer function that makes the fractional delay approximation difficult. The
imaginary part of the ideal transfer function at the Nyquist frequency is

$$
\mathrm{Im}\left\{ H_{id}(e^{j\omega}) \right\}\Big|_{\omega = \pi} = -\sin(\pi D) \tag{3.23}
$$

This implies that when $D$ is a fractional number, the transfer function has a complex
value at $\omega = \pi$. Discrete-time filters with real coefficients have the property that

$$
H(e^{j\omega})\Big|_{\omega = \pi} \in \mathbb{R} \tag{3.24}
$$

This implies that, at the Nyquist frequency, the approximation error cannot be smaller
than $|\sin(\pi D)|$. Cain and Yardim (1994) call this the Tarczynski bound.
Some of the useful fractional delay approximation techniques will be discussed in
the following sections.
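
The bound at the Nyquist frequency can be illustrated numerically. The sketch below evaluates the error at $\omega = \pi$ for many randomly chosen real-coefficient FIR filters and compares the smallest error found with $|\sin(\pi D)|$. The function name, the filter length of four taps, the delay value 1.3, and the random test filters are assumptions made for the example.

```python
import numpy as np

def nyquist_error(h, D):
    """|H(e^{j pi}) - Hid(e^{j pi})| for a real-coefficient FIR filter h."""
    n = np.arange(len(h))
    H_pi = np.sum(h * (-1.0) ** n)             # real, since e^{-j pi n} = (-1)^n
    return abs(H_pi - np.exp(-1j * np.pi * D))

rng = np.random.default_rng(0)
D = 1.3
bound = abs(np.sin(np.pi * D))
errors = [nyquist_error(rng.standard_normal(4), D) for _ in range(1000)]
print(f"bound = {bound:.4f}, smallest error found = {min(errors):.4f}")
```

Because the filter response at $\omega = \pi$ is real, it can only match the real part $\cos(\pi D)$ of the ideal response; the imaginary part of Eq. (3.23) always remains as error.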
3.1.5 Conclusion and Discussion
Up to now we have discussed the theoretical background of noninteger digital delay, or
fractional delay. The relationship between the underlying analog signal and fractional
delay was clarified. Furthermore, the characteristics of the ideal fractional delay element were reviewed and it was shown that the ideal FD system (i.e., ideal bandlimited
interpolator) is nonrealizable, since the corresponding impulse response is infinitely
long and noncausal. Thus, approximation techniques have to be employed to obtain a
finite-length causal implementation of FD.
Formerly, similar theoretical aspects have been considered in the context of signal
recovery and multirate signal processing. Implementation of a fractional delay requires
that the system is in principle able to compute the amplitude of the signal at any time
instant between known samples. Thus, the FD problem and recovery of an analog signal
from its samples are essentially equivalent.
The major differences between the FD problem and sampling-rate conversion are as
follows.
1) In FD processing only one new sample per sampling interval needs to be computed, whereas in interpolation for increasing the sampling rate, several new
samples per original sample interval may be needed.
2) In the FD problem the fractional interval d to be approximated is in general irrational, whereas in sample-rate conversions, simple rational ratios are often used.
The underlying theory of these two DSP problems is, of course, the same.
Even today, FD processing is not a well-known topic in signal processing.
Discussion on this topic has appeared in only a few textbooks (see, e.g., Crochiere and
Rabiner, 1983, pp. 271-274; Regalia, 1993, pp. 948-953).

3.2 Design Methods for Fractional Delay FIR Filters


In this section, the FIR filter approximation of the fractional delay is discussed. For
more detailed discussion and examples see Laakso et al. (1994). A comparison of four
FD FIR filter design methods has been made by Cain et al. (1994).
The transfer function of an FIR filter is of the form

$$
H(z) = \sum_{n=0}^{N} h(n)\, z^{-n} \tag{3.25}
$$

where N is the order of the filter and h(n) (n = 0, 1, ..., N) are the real coefficients that
form the impulse response of the FIR filter. Note that the length of the impulse response
(i.e., the number of the filter coefficients) is
L=N+1

(3.26)

In the design procedure our aim is to minimize the error function defined by
$$
E(e^{j\omega}) = H(e^{j\omega}) - H_{id}(e^{j\omega}) \tag{3.27}
$$

i.e., the difference of the ideal frequency response and the approximation.
3.2.1 Least Squared Integral Error Design
The intuitively most attractive method for designing realizable FD filters is certainly the
truncation of the ideal IR defined by Eq. (3.20). This method minimizes the least
squared (LS) error function $E_{LS}$, which is equal to the squared $L_2$ norm (integrated squared
magnitude) of the error frequency response $E(e^{j\omega})$, that is

$$
E_{LS} = \frac{1}{\pi}\int_{0}^{\pi} \left| E(e^{j\omega}) \right|^2 d\omega = \frac{1}{\pi}\int_{0}^{\pi} \left| H(e^{j\omega}) - H_{id}(e^{j\omega}) \right|^2 d\omega \tag{3.28}
$$

Using Parseval's relation this equation can be converted into the time domain. This
results in

$$
E_{LS} = \sum_{n=-\infty}^{\infty} \left| h(n) - h_{id}(n) \right|^2 = \sum_{n=-\infty}^{\infty} \left[ h_{id}^2(n) + h^2(n) - 2 h(n) h_{id}(n) \right] \tag{3.29}
$$

From this equation, it is possible to derive a closed-form solution for the squared integral error in the case of fractional delay approximation. According to Parseval's relation, the first term in (3.29) can be evaluated in the following way:

$$
\sum_{n=-\infty}^{\infty} h_{id}^2(n) = \frac{1}{\pi}\int_{0}^{\pi} \left| H_{id}(e^{j\omega}) \right|^2 d\omega = \frac{1}{\pi}\int_{0}^{\pi} \left| e^{-j\omega D} \right|^2 d\omega = 1 \tag{3.30}
$$

The second and third term of Eq. (3.29) include the coefficients of the Nth-order FIR
filter and thus the summation indices can be limited. The closed-form solution is
$$
E_{LS} = 1 + \sum_{n=0}^{N} \left[ h^2(n) - 2 h(n)\, \mathrm{sinc}(n - D) \right] \tag{3.31}
$$

where the sinc function is defined by Eq. (3.14).


It is also possible to derive a closed-form formula for the bandlimited squared integral error. The derivation is presented in Appendix A.
According to Eq. (3.31), the optimal solution for an Nth-order FIR filter in the $L_2$
sense is obviously the one with N + 1 coefficients truncated symmetrically around the
maximum value, i.e., the central point of $h_{id}(n)$. The approximation error resulting from
this approach may be written as

$$
E_{LS} = \sum_{n=-\infty}^{-1} h_{id}^2(n) + \sum_{n=N+1}^{\infty} h_{id}^2(n) \tag{3.32}
$$

It is seen that the approximation error decreases as N increases. This is intuitively clear.
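
The closed-form error of Eq. (3.31) can be cross-checked against a direct numerical evaluation of the integral in Eq. (3.28). The sketch below does this for truncated sinc filters of a few odd orders, with the delay placed 0.3 samples past the lower central tap. The function names, the midpoint integration grid of 20000 points, and the chosen delay values are assumptions made for the example.

```python
import numpy as np

def ls_error_closed_form(h, D):
    """Squared integral error of Eq. (3.31) for FIR coefficients h(0..N)."""
    n = np.arange(len(h))
    return 1.0 + np.sum(h ** 2 - 2.0 * h * np.sinc(n - D))

def ls_error_numeric(h, D, K=20000):
    """Eq. (3.28) evaluated with a midpoint rule on K points over (0, pi)."""
    w = (np.arange(K) + 0.5) * np.pi / K
    n = np.arange(len(h))
    H = np.exp(-1j * np.outer(w, n)) @ h       # frequency response of the FIR filter
    return np.mean(np.abs(H - np.exp(-1j * w * D)) ** 2)

results = []
for N in (1, 3, 5, 7):
    D = (N - 1) / 2 + 0.3                      # delay between the two central taps
    h = np.sinc(np.arange(N + 1) - D)          # truncated ideal impulse response
    results.append((N, ls_error_closed_form(h, D), ls_error_numeric(h, D)))
    print(N, results[-1][1], results[-1][2])
```

The two columns should agree closely, and the error should shrink as the order grows, in line with the remark after Eq. (3.32).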
The impulse response h(n) of the LS FD FIR filter can be expressed as
$$
h(n) = \begin{cases} \mathrm{sinc}(n - D), & 0 \le n \le N \\ 0, & \text{otherwise} \end{cases} \tag{3.33}
$$

The delay D should be located between the two central taps of the filter when N is
odd (L = N + 1 is even), or within half a sample from the central tap when N is even (L
odd), since then the approximation error is smallest. This means that the delay D should
be chosen so that the following inequality is valid:

$$\frac{N-1}{2} \le D \le \frac{N+1}{2} \tag{3.34}$$

For odd-order FIR interpolators (N odd and L even) this simply implies that the integer
part $D_{int}$ of the delay has to be chosen in the following way:

$$D_{int} = \frac{N-1}{2} \tag{3.35}$$

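When N is even (L odd) the integer part of the delay should be chosen so that

$$D_{int} = \begin{cases} \dfrac{N}{2} & \text{when } 0 \le d < \dfrac{1}{2} \\[2mm] \dfrac{N}{2} - 1 & \text{when } \dfrac{1}{2} \le d < 1 \end{cases} \tag{3.36}$$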

This will ensure that point D is located within half a sample from the midpoint of the
interpolator and consequently that the approximation error is smallest possible with the
used FIR design technique.
Above we have assumed that the FIR filter coefficients are truncated from the
beginning of the ideal impulse response. This is not the only possible choice. The index
M of the first non-zero sample should be chosen in the following way:

$$M = \begin{cases} \mathrm{round}(D) - \dfrac{N}{2} & \text{for even } N \\[2mm] \lfloor D \rfloor - \dfrac{N-1}{2} & \text{for odd } N \end{cases} \tag{3.37}$$

where round(·) denotes the operation of rounding to the nearest integer, and $\lfloor \cdot \rfloor$ is the
greatest integer function as given by Eq. (3.12). The FIR filter h(n) will be causal if
$M \ge 0$. If M < 0, the filter will be noncausal and thus nonrealizable. An integer must then
be added to D in order to shift the central point of the sinc function so that the requirement of Eq. (3.34) is fulfilled. This is equivalent to adding unit delays to the delay line.
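
The choice of M in Eq. (3.37) can be sketched as follows. The function name `first_tap_index` is hypothetical. Rounding is written as floor(D + 0.5), which is an assumption about how ties are meant to be resolved; it agrees with the ranges quoted in this section (for N = 4, M = 0 when 1.5 ≤ D < 2.5), whereas Python's built-in `round()` resolves ties to the nearest even integer.

```python
import math

def first_tap_index(D, N):
    """Index M of the first non-zero tap, Eq. (3.37), for total delay D and order N."""
    if N % 2 == 0:
        # Even order: centre the filter on the tap nearest to D (halves round upward).
        return math.floor(D + 0.5) - N // 2
    # Odd order: D lies between the two central taps.
    return math.floor(D) - (N - 1) // 2

# A negative M signals a noncausal filter: D is too small for the chosen order.
m_causal = first_tap_index(1.5, 3)     # D between the central taps of a 4-tap filter
m_noncausal = first_tap_index(0.3, 3)  # would need a tap at n = -1
```

When M > 0, the first M samples of delay can be realized with a plain delay line and the remaining delay D - M with the FIR filter.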


[Figure: magnitude in dB (upper panel) and phase delay in samples (lower panel) versus normalized frequency.]
Fig. 3.3 Magnitude responses (upper) and phase delay curves (lower) of a third-order
FIR filter designed by truncating the shifted sinc function. The curves are
plotted for 11 fractional delay values between 1.0 and 2.0. Note that the
magnitude responses for N - D are the same as for D.

However, unit delays cannot be added if the actual delay D, including the integer part, is
important. Another method to make the FIR filter causal is to choose a smaller value for
N, that is, to design a filter of lower order. This will obviously make the approximation
less accurate.
If Eq. (3.34) is valid, then M = 0 and the entire delay D is implemented by the FIR
filter. This case corresponds to the best (highest-order) causal approximation but also to
the heaviest computational load that an LS FIR filter can offer. If M > 0 then part of the
desired delay has to be implemented by a sequence of unit delays while the rest of that
delay is approximated by the FD filter. The implementation of a fractional delay with
FIR filters will be discussed further in Sections 3.5.2 and 4.1.
Truncating the shifted sinc function is an easy way to design FD FIR filters. This
approach has been proposed, e.g., by Sivanand et al. (1991) (see also Sivanand, 1992)
and Cain et al. (1994). However, it is often not useful since truncation of the impulse
response introduces ripple to the frequency response. This is called the Gibbs phenomenon. It causes the maximum deviation from the ideal frequency response to remain
approximately constant irrespective of the filter order. This goes for both the magnitude
and the phase response.
The magnitude and phase delay characteristics of an LS FD FIR filter are illustrated
in Fig. 3.3. The order N of the filter is 3 and thus the best approximation (M = 0) is
obtained with delay values $1 \le D < 2$. Were N = 4, M would be 0 for $1.5 \le D < 2.5$.
Notice the overshoot in both the magnitude and the phase delay curves. It is seen that
the phase delay curves are symmetrical with respect to the delay of 1.5 samples, which
corresponds to the central point of the filter's impulse response.
3.2.2 LS Design with Reduced Bandwidth
A variation of the LS design technique is to use a lowpass interpolator as a prototype
filter instead of a fullband filter (Laakso et al., 1994). The ideal solution is then defined
in the interval $[0, \alpha\pi]$ where $0 < \alpha < 1$. The solution corresponding to Eq. (3.33) is now

$$h(n) = \begin{cases} \dfrac{\sin[\alpha\pi(n - D)]}{\pi(n - D)} & \text{for } M \le n \le M + N \\[2mm] 0 & \text{otherwise} \end{cases} \tag{3.38}$$

with M defined as in Eq. (3.37). It appears that the Gibbs phenomenon will be reduced
considerably, as desired, but the usable bandwidth will contract by the factor $\alpha$. Thus we conclude
that narrowing the bandwidth does not necessarily improve the LS FD filter design.
Another variation of the basic LS design is to use a reduced bandwidth with a smooth
transition band function (Parks and Burrus, 1987, pp. 63-70). This will make the
impulse response of the FD element decay fast. The IR will still be infinitely long and
must be truncated, but the Gibbs phenomenon is guaranteed to be reduced, since the
discontinuity is not as sharp as originally. A good choice for the transition band is a
low-order spline multiplied by $e^{-j\omega D}$. Using this design the magnitude response of the
FD FIR filter will remain constant with high precision. However, the price paid is that
the phase response will be severely nonlinear (Laakso et al., 1994).
3.2.3 Windowing the Ideal Impulse Response
A well-known method to reduce the Gibbs phenomenon in FIR filter design is to use a
bell-shaped window function for weighting in the time domain (see, e.g., Parks and
Burrus, 1987, pp. 71-83). Truncation corresponds to windowing with a rectangular
window function. An extensive tutorial on window functions and their properties has
been written by Harris (1978). The impulse response of an FIR filter designed by the
windowed LS method can be written in the form

$$h(n) = \begin{cases} w(n - D)\,\mathrm{sinc}(n - D) & \text{for } M \le n \le M + N \\ 0 & \text{otherwise} \end{cases} \tag{3.39}$$

Note that the midpoint of the window function w(n) of length N + 1 has been shifted by
D so that the shifted sinc function will be windowed symmetrically with respect to its
center. The window function w(n - D) is asymmetric, however. Many window functions, such as the Hamming and von Hann windows, can be easily delayed by a fractional value D (Laakso et al., 1994; Cain and Yardim, 1994; Cain et al., 1994, 1995)
whereas some others cannot. For instance, there is no known method to design exact
fractionally shifted Dolph-Chebyshev or Saramäki windows for an arbitrary D. This is
because these windows have been designed in the frequency domain. An approximative
technique for shifting these windows has been proposed by Laakso et al. (1995b).
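
A windowed design in the sense of Eq. (3.39) might look as follows for a von Hann window. This is only a sketch under stated assumptions: the window is taken as the continuous raised-cosine function evaluated at n - D, and its support is set to N + 2 samples so that the outermost taps do not vanish; neither choice is prescribed by the text, and the names `hann_shifted` and `windowed_fd_fir` are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def hann_shifted(t, width):
    """Continuous von Hann window of the given width, centred on t = 0."""
    if abs(t) >= width / 2:
        return 0.0
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * t / width)

def windowed_fd_fir(N, D, M=0):
    """Taps h(M..M+N) of Eq. (3.39): shifted sinc weighted by a window shifted by D."""
    width = N + 2  # assumed support of the window, in samples
    return [hann_shifted(n - D, width) * sinc(n - D) for n in range(M, M + N + 1)]

h_mid = windowed_fd_fir(3, 1.5)  # delay halfway between the central taps
h_int = windowed_fd_fir(3, 1.0)  # integer delay: reduces to a unit impulse at n = 1
```

Because both factors depend on D only through n - D, the coefficients for a new delay are obtained by re-evaluating two smooth functions, which is what makes table-based updating practical.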
It is readily seen that the approximation error $E_{LS}$ [see Eq. (3.29)] will be larger with
a window function than without it [see Eq. (3.32)]. Thus, this design method does not
minimize the LS error measure, but it is an ad hoc modification of the LS technique. In
general, the frequency response of an FD FIR filter designed using this technique has a
lower ripple but also a wider transition band than a corresponding LS filter (see, e.g.,
Parks and Burrus, 1987, p. 73). A drawback is that the control of the magnitude error is
difficult when changing the parameters of, e.g., the Kaiser or the Dolph-Chebyshev
window.
The windowing method is suitable for real-time systems where the fractional delay is
changed since the coefficients can be updated quickly. The samples of the sinc and the
window function can be stored in memory for several values of D. The filter coefficients for delay values between the stored ones may be obtained, e.g., by linear interpolation (Smith and Gossett, 1984). Smith (1992a) has studied the implementation and
error analysis of this kind of interpolator in detail.
3.2.4 General Least Squares FIR Approximation of a Complex Frequency
Response
In principle, the FIR fractional delay filter with the smallest LS error in the defined
approximation band is accomplished by defining the response only in that part of the
frequency band and by leaving the rest out of the error measure as a "don't care" band
(Laakso et al., 1994). This scheme also enables frequency-domain weighting of the LS
error. This technique is called general least squares (GLS) FIR filter approximation, and
it results in the following error function

$$E_{GLS} = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left|E(e^{j\omega})\right|^2 d\omega = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left|H(e^{j\omega}) - H_{id}(e^{j\omega})\right|^2 d\omega \tag{3.40}$$

where the error is defined in the lowpass frequency band $[0, \alpha\pi]$ only and $W(\omega)$ is the
nonnegative frequency-domain weighting function [which has nothing to do with the
time-domain window function w(n)]. The derivation of the bandlimited squared integral
error function for the case $W(\omega) \equiv 1$ is presented in Appendix A. Now it is shown how
a filter design algorithm can be obtained based on this approach.
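The DTFT of the FIR filter can be written as

$$H(e^{j\omega}) = \mathbf{h}^T\mathbf{e} \tag{3.41a}$$

where $\mathbf{h}$ is the coefficient vector

$$\mathbf{h} = [h(0)\;\; h(1)\;\; \cdots\;\; h(N)]^T \tag{3.41b}$$

and $\mathbf{e}$ is defined as

$$\mathbf{e} = [1\;\; e^{-j\omega}\;\; \cdots\;\; e^{-jN\omega}]^T \tag{3.41c}$$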

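The error function (3.40) can now be rewritten in the following way:

$$E_{GLS} = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left[\mathbf{h}^T\mathbf{e} - H_{id}(e^{j\omega})\right]\left[\mathbf{h}^T\mathbf{e} - H_{id}(e^{j\omega})\right]^* d\omega \tag{3.42}$$

where the superscript * denotes complex conjugation. This equation can be further
elaborated as

$$E_{GLS} = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left[\mathbf{h}^T\mathbf{C}\mathbf{h} - 2\mathbf{h}^T\mathrm{Re}\{H_{id}(e^{j\omega})\mathbf{e}^*\} + \left|H_{id}(e^{j\omega})\right|^2\right] d\omega \tag{3.43a}$$

where

$$\mathbf{C} = \mathrm{Re}\{\mathbf{e}\mathbf{e}^H\} = \begin{bmatrix} 1 & \cos(\omega) & \cdots & \cos(N\omega) \\ \cos(\omega) & 1 & \cdots & \cos[(N-1)\omega] \\ \vdots & & \ddots & \vdots \\ \cos(N\omega) & \cos[(N-1)\omega] & \cdots & 1 \end{bmatrix} \tag{3.43b}$$

Here the superscript H stands for the Hermitian operation, i.e., transposition with conjugation. The error function can further be expressed as

$$E_{GLS} = \mathbf{h}^T\mathbf{P}\mathbf{h} - 2\mathbf{h}^T\mathbf{p}_1 + p_0 \tag{3.44a}$$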

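where we have used the following matrices and vectors:

$$\mathbf{P} = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\,\mathbf{C}\,d\omega \tag{3.44b}$$

$$\mathbf{p}_1 = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left[\mathrm{Re}\{H_{id}(e^{j\omega})\}\,\mathbf{c} - \mathrm{Im}\{H_{id}(e^{j\omega})\}\,\mathbf{s}\right] d\omega \tag{3.44c}$$

$$p_0 = \frac{1}{\pi}\int_0^{\alpha\pi} W(\omega)\left|H_{id}(e^{j\omega})\right|^2 d\omega \tag{3.44d}$$

$$\mathbf{c} = [1\;\; \cos(\omega)\;\; \cdots\;\; \cos(N\omega)]^T \tag{3.44e}$$

and

$$\mathbf{s} = [0\;\; \sin(\omega)\;\; \cdots\;\; \sin(N\omega)]^T \tag{3.44f}$$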

The optimal solution in terms of the L2 norm is obtained by solving for the minimum
of the error measure (3.44a). The unique minimum-error solution is found by setting its
derivative with respect to $\mathbf{h}$ to zero. This results in the following normal equation

$$2\mathbf{P}\mathbf{h} - 2\mathbf{p}_1 = \mathbf{0} \tag{3.45}$$

which is solved formally by matrix inversion, i.e.,

$$\mathbf{h} = \mathbf{P}^{-1}\mathbf{p}_1 \tag{3.46}$$

[Figure: magnitude in dB (upper panel) and phase delay in samples (lower panel) versus normalized frequency.]
Fig. 3.4 Magnitude responses (upper) and phase delay curves (lower) of a third-order
GLS FIR filter with $\alpha$ = 0.5. The curves are plotted for 11 fractional delay values between 1.0 and 2.0. Note that the magnitude responses for N - D are the
same as for D.

In practice, the optimal solution is obtained by determining the integrals involved in
Eqs. (3.44) (usually numerically) and solving the set of N + 1 linear equations (3.45).
Numerical problems may arise, particularly in narrowband approximation (Laakso et
al., 1994). However, in FD filter design this is not typical. The design is much easier if
the weighting function is not used, i.e., $W(\omega) \equiv 1$, since the solution can be given in
closed form. Then the elements of $\mathbf{P}$ and $\mathbf{p}_1$ can be expressed as

$$P_{k,l} = \frac{1}{\pi}\int_0^{\alpha\pi} \cos[(k - l)\omega]\,d\omega = \alpha\,\mathrm{sinc}[\alpha(k - l)], \quad k, l = 0, 1, \ldots, N \tag{3.47}$$

and

$$p_{1,k} = \frac{1}{\pi}\int_0^{\alpha\pi} \cos[(k - D)\omega]\,d\omega = \alpha\,\mathrm{sinc}[\alpha(k - D)], \quad k = 0, 1, \ldots, N \tag{3.48}$$

with k and l running over the tap indices of h(n).


Note that matrix $\mathbf{P}$ is independent of the delay parameter D and only needs to be
inverted once. The resulting FIR filter coefficients depend on the choice of frequency
band parameter $\alpha$ and the weighting function.
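
For the unweighted case the design reduces to filling P and p1 with sinc values and solving one small linear system. The sketch below assumes that the taps are indexed n = 0, ..., N, so that entry n of p1 is α sinc[α(n - D)]; the helper `solve` is a plain Gaussian elimination written here to keep the example free of external libraries, and all function names are illustrative.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gls_fd_fir(N, D, alpha):
    """GLS design with W = 1 on [0, alpha*pi]: solve P h = p1 (Eqs. 3.45, 3.47, 3.48)."""
    taps = range(N + 1)
    P = [[alpha * sinc(alpha * (k - l)) for l in taps] for k in taps]
    p1 = [alpha * sinc(alpha * (k - D)) for k in taps]
    return solve(P, p1)

h_half = gls_fd_fir(3, 1.5, 0.5)  # halfband approximation
h_full = gls_fd_fir(3, 1.5, 1.0)  # alpha = 1 gives P = I, i.e. the truncated sinc
```

Since P does not depend on D, an implementation that redesigns the filter for many delay values would factorize P once and reuse the factorization; the sketch solves the system from scratch for brevity.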
Figure 3.4 presents a third-order GLS filter with unity weighting function $W(\omega) \equiv 1$
and $\alpha$ = 0.5 (i.e., halfband approximation). This result can be compared with the fullband LS FIR approximation of Fig. 3.3. It is seen that the maximum error in both magnitude and phase delay has been reduced in the approximation band. At high frequencies the error has slightly increased since now the approximation error outside the passband $[0, 0.5\pi]$ has not been considered in the design.


3.2.5 Minimax Design of FIR FD Filters
If it is desired to minimize the peak approximation error, it is suitable to use the minimax or Chebyshev design. These approximation problems can usually be solved only by
iterative techniques. Advanced algorithms for complex approximation with minimax
error characteristics have been presented, e.g., by Parks and Burrus (1987), Pyfer and
Ansari (1987), Preuss (1989), Schulist (1990), Alkhairy et al. (1991), and Karam and
McClellan (1995). For a more detailed discussion on minimax FIR FD filters, see
Laakso et al. (1994).
Here we consider a technique proposed by Oetken (1979). The observation that led
to the development of this design method is that the magnitude error functions $E(e^{j\omega})$
of odd-order equiripple FIR FD filters are almost exactly proportional to each other
over the whole frequency range. This implies that it suffices to determine one
optimal filter (the Chebyshev prototype filter), calculate the zeros of its magnitude error
function, and then use these zeros to design other FIR FD filters with similar characteristics.
The following discussion has been adapted from Laakso et al. (1994). Let us consider that an Nth-order (with odd N) symmetric FIR filter has been designed. This filter
must approximate flat magnitude response in the equiripple sense in the passband. The
error function of this filter has K = L/2 zeros, i.e.

$$E(e^{j\omega}) = H(e^{j\omega}) - H_{id}(e^{j\omega}) = 0, \quad \omega = \Omega_k, \quad k = 1, 2, \ldots, K \tag{3.49}$$

implying that

$$\sum_{n=0}^{N} h(n)e^{-jn\Omega_k} = e^{-j\Omega_k N/2}, \quad k = 1, 2, \ldots, K \tag{3.50}$$

where N/2 is the delay of the filter. The filter coefficients can be solved from (3.50) for
a chosen total delay D which is close to N/2. This can be expressed in matrix form as

$$\mathbf{E}_\Omega\mathbf{h} = \mathbf{e}_D \tag{3.51a}$$

where $\mathbf{E}_\Omega$ is a $K \times (N + 1)$ matrix defined by

$$\mathbf{E}_\Omega = \begin{bmatrix} 1 & e^{-j\Omega_1} & e^{-j2\Omega_1} & \cdots & e^{-jN\Omega_1} \\ 1 & e^{-j\Omega_2} & e^{-j2\Omega_2} & \cdots & e^{-jN\Omega_2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & e^{-j\Omega_K} & e^{-j2\Omega_K} & \cdots & e^{-jN\Omega_K} \end{bmatrix} \tag{3.51b}$$

and

$$\mathbf{e}_D = [e^{-jD\Omega_1}\;\; e^{-jD\Omega_2}\;\; \cdots\;\; e^{-jD\Omega_K}]^T \tag{3.51c}$$

[Figure: magnitude in dB (upper panel) and phase delay in samples (lower panel) versus normalized frequency.]
Fig. 3.5 The magnitude and phase delay responses of a third-order (N = 3) approximately equiripple FIR FD filter. The curves are plotted for 11 delay values
between 1.0 and 2.0. Note that the magnitude responses for N - D are the
same as those for D.

Equation (3.51) is a set of K complex equations with L = 2K unknowns, which can
be expressed as a fully determined set of L real equations by equating the real and
imaginary parts of both sides as

$$\mathbf{P}_\Omega\mathbf{h} = \mathbf{p}_\Omega \tag{3.52a}$$

with

$$\mathbf{P}_\Omega = \begin{bmatrix} \mathbf{C}_\Omega \\ \mathbf{S}_\Omega \end{bmatrix} \tag{3.52b}$$

and

$$\mathbf{p}_\Omega = \begin{bmatrix} \mathbf{c}_D \\ \mathbf{s}_D \end{bmatrix} \tag{3.52c}$$

where the matrices and vectors contain appropriate cosine and sine elements such that
$\mathbf{E}_\Omega = \mathbf{C}_\Omega - j\mathbf{S}_\Omega$ and $\mathbf{e}_D = \mathbf{c}_D - j\mathbf{s}_D$. The coefficient vector of the almost-equiripple FIR
FD filter is obtained as

$$\mathbf{h} = \mathbf{P}_\Omega^{-1}\mathbf{p}_\Omega \tag{3.53}$$

Note that the cosine-sine matrix $\mathbf{P}_\Omega$ depends only on the prototype filter but not on
the delay parameter D. Thus it can be inverted once and the same inverse matrix can be
used for approximating FD filters for several values of D.
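
The linear-algebra step of this method, Eqs. (3.52) and (3.53), can be sketched as below. The zero locations are placeholders chosen for illustration only; in an actual design they would be read from the error function of an equiripple prototype, which this sketch does not compute. The names `oetken_fd_fir` and `solve` are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def oetken_fd_fir(N, D, zeros):
    """Solve Eq. (3.52a) for odd N, given K = (N + 1) / 2 distinct zeros in (0, pi)."""
    taps = range(N + 1)
    C = [[math.cos(n * w) for n in taps] for w in zeros]  # real part of E_Omega
    S = [[math.sin(n * w) for n in taps] for w in zeros]  # minus its imaginary part
    rhs = [math.cos(D * w) for w in zeros] + [math.sin(D * w) for w in zeros]
    return solve(C + S, rhs)

# Placeholder zeros (radians), NOT taken from a real prototype filter.
zeros = [0.15 * math.pi, 0.42 * math.pi]
h = oetken_fd_fir(3, 1.3, zeros)
```

By construction the resulting filter matches the ideal response exactly at the chosen frequencies; how close the error comes to equiripple between them depends entirely on how well the zeros were chosen.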


An example of third-order equiripple FIR FD filters designed using Oetken's method
is shown in Fig. 3.5. The prototype filter is a linear-phase halfband ($\alpha$ = 0.5) equiripple
FIR filter designed using the Remez algorithm (see, e.g., Parks and Burrus, 1987).
Figure 3.5 can be compared with Figs. 3.3 and 3.4. It is apparent that at the lower half
of the frequency band the equiripple design (Fig. 3.5) yields a much more accurate
approximation than the LS approximation (Fig. 3.3). However, the result is comparable
to the halfband GLS filter of Fig. 3.4. In both of these figures, the magnitude and phase
response curves oscillate around the nominal value at the lower half of the frequency
band, and show lowpass behavior at high frequencies.
3.2.6 Other Methods for the Design of FIR FD Filters
In principle, many general FIR filter design techniques may also be applied to the
design of FD filters. Altogether, there is a large variety of methods, some of which are
mentioned in the following. These techniques are not discussed in detail because they
have not appeared to be very useful.
Oetken et al. (1975) proposed a stochastic approach to the design of interpolating
filters. In this technique, the design criterion is the minimum expectable mean squared
output error. It appears that this is mathematically equivalent to the general LS design
with the frequency-domain weighting function being equal to the average power spectrum of the input signal (Laakso et al., 1994). All the LS methods can also be modified
so that the desired frequency response is sampled using a uniform or nonuniform grid.
The maximally flat FD filter design seems to be appropriate for digital waveguide
systems since it approximates the ideal bandlimited interpolator very accurately at low
frequencies, and its magnitude response never exceeds unity. For these reasons we
examine this technique in detail in the following.


3.3 Maximally Flat FD FIR Filter: Lagrange Interpolation


A useful FIR filter approximation for the fractional delay (FD) is obtained by setting the
error function and its N derivatives to zero at zero frequency. This is the maximally flat
(MF) design at $\omega = 0$. It is interesting to notice that the FIR filter coefficients obtained
by this method are the same as the weighting coefficients in the classical Lagrange
interpolation.
It appears that already in the 1950s Lagrange interpolation was used for recovering
an analog bandlimited signal from its samples (see Jerri, 1977). For some reason this
approach never gained much popularity and it is not described in basic text books.
Later, Lagrange interpolation has been used for increasing the sampling rate of signals
and systems (see, e.g., Schafer and Rabiner, 1973; Oetken, 1979).
To our knowledge, Lagrange interpolation was first used for fractional delay approximation by Strube (1975) who derived it using the Taylor series approach. He did not,
however, notice that he was actually using the Lagrangian interpolation technique.
Laine (1988) applied Lagrange interpolation for FD approximation and observed its
maximally flat property but he did not give the mathematical derivation. The MF design
of an FD FIR filter has been independently proposed by Ko and Lim (1988) and
Sivanand et al. (1991). Liu and Wei (1990, 1992) derived the FD FIR filter approximation letting an Nth-order polynomial pass through N + 1 equidistant signal values. This
approach is often used for deriving the classical Lagrange interpolation formula (see
Section 3.3.2), but it does not reflect the frequency-domain properties of the technique.
Liu and Wei give the solution for even-order Lagrange interpolation only.
Recently, Hermanowicz (1992) pointed out the equivalence of the MF FD approximation and Lagrange interpolation. This fact was already stated by Oetken in 1979, but
since his paper addressed the design of FIR interpolators for sampling-rate conversion,
it was supposedly not known among those who studied FD approximation. Minocha,
Dutta Roy, and Kumar (1993) demonstrated that the Lagrange interpolation formula
may also be derived by truncating the Taylor expansion of the error function.
Kootsookos and Williamson (1995) have shown that Lagrange interpolation is also
obtained by windowing the impulse response of the ideal bandlimited interpolator, i.e.,
the sinc function, using a scaled binomial window (see Section 3.3.5).
Below we derive the explicit formula for the coefficients of the MF approximation in
a manner that is familiar in digital filter design and describe the relationship of this
technique to the basic idea applied in Lagrange interpolation. We also discuss the properties of this FD filter in the frequency domain and introduce a novel structure for
implementing Lagrange interpolation. Two related polynomial approximation techniques are considered as potential solutions for the FD approximation problem.
3.3.1 Derivation of the Maximally Flat Fractional Delay FIR Filter
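The error function E and its N derivatives are first set to zero at a frequency $\omega_0$, or

$$\left.\frac{d^k E(e^{j\omega})}{d\omega^k}\right|_{\omega=\omega_0} = 0 \quad \text{for } k = 0, 1, 2, \ldots, N \tag{3.54}$$

where $E(e^{j\omega})$ is the complex error function defined by Eq. (3.27). Equation (3.54) can
thus be written as

$$\left.\frac{d^k}{d\omega^k}\left[\sum_{n=0}^{N} h(n)e^{-j\omega n} - e^{-j\omega D}\right]\right|_{\omega=\omega_0} = 0 \quad \text{for } k = 0, 1, 2, \ldots, N \tag{3.55}$$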

We proceed by differentiating and setting $\omega_0 = 0$. For k = 0 this yields

$$\sum_{n=0}^{N} h(n) - 1 = 0 \;\Leftrightarrow\; \sum_{n=0}^{N} h(n) = 1 \tag{3.56}$$

This is the requirement that the coefficients of the FIR filter must sum up to unity, i.e.,
the magnitude response at $\omega = 0$ has to be equal to 1. For k = 1, Eq. (3.55) yields

$$-j\sum_{n=0}^{N} n\,h(n) + jD = 0 \;\Leftrightarrow\; \sum_{n=0}^{N} n\,h(n) = D \tag{3.57}$$

for k = 2

$$-\sum_{n=0}^{N} n^2 h(n) + D^2 = 0 \;\Leftrightarrow\; \sum_{n=0}^{N} n^2 h(n) = D^2 \tag{3.58}$$

and so on until k = N.
The N + 1 requirements of Eq. (3.55) can be collected together as

$$\sum_{n=0}^{N} n^k h(n) = D^k \quad \text{for } k = 0, 1, 2, \ldots, N \tag{3.59}$$

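This is a set of N + 1 linear equations and may be rewritten in the matrix form as

$$\mathbf{V}\mathbf{h} = \mathbf{v} \tag{3.60}$$

where $\mathbf{V}$ is an $L \times L$ Vandermonde matrix (L = N + 1)

$$\mathbf{V} = \begin{bmatrix} 0^0 & 1^0 & 2^0 & \cdots & N^0 \\ 0^1 & 1^1 & 2^1 & \cdots & N^1 \\ 0^2 & 1^2 & 2^2 & \cdots & N^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0^N & 1^N & 2^N & \cdots & N^N \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 0 & 1 & 2 & \cdots & N \\ 0 & 1 & 4 & \cdots & N^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 1 & 2^N & \cdots & N^N \end{bmatrix} \tag{3.61a}$$

$\mathbf{h}$ is the coefficient vector of the FIR filter

$$\mathbf{h} = [h(0)\;\; h(1)\;\; h(2)\;\; \cdots\;\; h(N)]^T \tag{3.61b}$$

and

$$\mathbf{v} = [1\;\; D\;\; D^2\;\; \cdots\;\; D^N]^T \tag{3.61c}$$

The Vandermonde matrix is known to be nonsingular. Thus it has an inverse matrix $\mathbf{V}^{-1}$
and the solution of Eq. (3.60) can be expressed as

$$\mathbf{h} = \mathbf{V}^{-1}\mathbf{v} \tag{3.62}$$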

where $\mathbf{V}^{-1}$ may be evaluated applying Cramer's rule (see, e.g., Hildebrand, 1974, p.
541). The solution is given in an explicit form as

$$h(n) = \prod_{\substack{k=0 \\ k \ne n}}^{N} \frac{D - k}{n - k} \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.63}$$

where D is the fractional delay and N is the order of the FIR filter. The coefficients h(n)
can be seen to be equal to those of the Lagrange interpolation formula for equally
spaced abscissas (see, e.g., Hildebrand, 1974, pp. 89-91).
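
The explicit formula of Eq. (3.63) translates directly into code, and the maximally flat conditions of Eq. (3.59) give a simple numerical check. The following is a minimal sketch; the function names `lagrange_fd_fir` and `moment` are chosen here for illustration.

```python
def lagrange_fd_fir(N, D):
    """Coefficients h(0..N) of the Nth-order Lagrange interpolator, Eq. (3.63)."""
    h = []
    for n in range(N + 1):
        c = 1.0
        for k in range(N + 1):
            if k != n:
                c *= (D - k) / (n - k)
        h.append(c)
    return h

def moment(h, k):
    """Left-hand side of Eq. (3.59): the sum of n**k * h(n) over all taps."""
    return sum((n ** k) * c for n, c in enumerate(h))

D = 1.4
h = lagrange_fd_fir(3, D)
# Each residual should vanish up to rounding if the N + 1 flatness conditions hold.
residuals = [moment(h, k) - D ** k for k in range(4)]
```

The k = 0 residual checks the unity gain at zero frequency of Eq. (3.56), and the k = 1 residual checks that the filter's low-frequency delay equals D, as required by Eq. (3.57).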
Ko and Lim (1988) and Hermanowicz (1992) have proposed a more general maximally flat FD approximation where the frequency $\omega_0$ of maximal flatness is arbitrarily
chosen from the interval $[0, \pi]$. The closed-form solution is (Hermanowicz, 1992)

$$h(n) = e^{j\omega_0(n - D)} \prod_{\substack{k=0 \\ k \ne n}}^{N} \frac{D - k}{n - k} \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.64}$$

Note that $\omega_0 = 0$ results in Eq. (3.63). If $\omega_0 = \pi$ the filter coefficients are real and they
are the same as the Lagrange interpolation coefficients except that the sign of every
second one of them has been toggled. This implies that the filter has been modulated to
the Nyquist frequency, i.e., its frequency response has been shifted by $\pi$.
In the other cases, that is $0 < \omega_0 < \pi$, the coefficients h(n) will be complex-valued.
Complex FIR filters are computationally much more expensive than the real-valued FIR
filters of the same order. We prefer the real-valued FIR interpolators because they are
computationally efficient and conceptually simple, and also because in audio signal processing it is normally appropriate to have the smallest approximation error at low frequencies.
3.3.2 Classical Approach
The Lagrange interpolation method is based on the well-known result in polynomial
algebra that using an Nth-order polynomial it is possible to match N + 1 given arbitrary
points (see, e.g., Davis, 1963, pp. 24-26). The uniformly spaced Lagrange interpolation
formula can be derived in many different ways. The simplest of them is to approximate
a function x(t), $t \in \mathbb{R}$, known at N + 1 integral points n = 0, 1, 2, ..., N using Nth-order
polynomials h(n, D). The polynomial approximation $\hat{x}(t)$ in the range [0, N] is given by

$$\hat{x}(D) = \sum_{n=0}^{N} h(n, D)\,x(n) \tag{3.65}$$

where $D \in \mathbb{R}$ ($0 \le D \le N$) is the distance from the point n = 0 (analogously to fractional
delay) and h(n, D), n = 0, 1, 2, ..., N are polynomials in D of order N or less which, at the
integer points D = 0, 1, 2, ..., N, take on the value 1 when n = D and 0 otherwise. This means
that at these points the polynomial h(n, D) may be expressed by means of the Kronecker delta

$$h(n, D) = \delta(n - D) = \begin{cases} 1 & \text{if } n = D \\ 0 & \text{if } n \ne D \end{cases} \tag{3.66}$$

It is easy to write an Nth-order polynomial that vanishes at the points 0, 1, 2, ..., n - 1,
n + 1, ..., N:

$$h(n, D) = C_n\left[D(D - 1)\cdots(D - n + 1)(D - n - 1)\cdots(D - N + 1)(D - N)\right] \tag{3.67}$$

The requirement that h(n, D) = 1 when n = D is then fulfilled by the scaling constant

$$C_n = \frac{1}{n(n - 1)\cdots 1\cdot(-1)\cdots(n - N + 1)(n - N)} \tag{3.68}$$

Substitution of Eq. (3.68) into (3.67) yields Eq. (3.63), the Lagrange interpolation
formula. This approach is a simple, but purely mathematical way to derive the solution.
It does not throw light upon the signal processing aspects of the technique like the
derivation given in Section 3.3.1.
3.3.3 Symmetry of the Lagrange Interpolation Coefficients
One remarkable feature of Lagrange interpolation is that the coefficients h(n) for the
fractional delay N - D are the same as those for D, but in reverse order, that is

$$h(n, D) = h(N - n, N - D) \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.69}$$

This property of Lagrange interpolation is easily proved by writing

$$h(n, D) = \prod_{\substack{k=0 \\ k \ne n}}^{N} \frac{D - k}{n - k} = \frac{D(D - 1)\cdots(D - n + 1)(D - n - 1)\cdots(D - N + 1)(D - N)}{n(n - 1)\cdots(1)(-1)\cdots(n - N + 1)(n - N)} \tag{3.70}$$

and similarly,

$$\begin{aligned} h(N - n, N - D) &= \prod_{\substack{k=0 \\ k \ne N-n}}^{N} \frac{N - D - k}{N - n - k} \\ &= \frac{(N - D)(N - D - 1)\cdots(-D + n + 1)(-D + n - 1)\cdots(-D + 1)(-D)}{(N - n)(N - n - 1)\cdots(1)(-1)\cdots(-n + 1)(-n)} \\ &= \frac{D(D - 1)\cdots(D - n + 1)(D - n - 1)\cdots(D - N + 1)(D - N)}{n(n - 1)\cdots(1)(-1)\cdots(n - N + 1)(n - N)} \end{aligned} \tag{3.71}$$

The equality of the second and third rows of Eq. (3.71) is obtained by changing the sign
of every term in the denominator and numerator of the second row. This is valid, since
there is the same number of terms (N) in both the denominator and the numerator. It is
seen that the last row of Eq. (3.71) is equivalent to that of Eq. (3.70).


The symmetry of Lagrange interpolation coefficients implies that the complementary
fractional delay N - D can be approximated by the same coefficients h(n) as the
delay D. The order of coefficients just has to be reversed. This property is generally true
for FD FIR filters. In Chapter 4 it is shown that this feature is advantageous when considering the operation of nonrecursive deinterpolation in digital waveguide systems.
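
The symmetry of Eq. (3.69) is easy to confirm numerically. In this short sketch the order and delay are arbitrary example values, and `lagrange_fd_fir` is an illustrative implementation of Eq. (3.63).

```python
def lagrange_fd_fir(N, D):
    """Coefficients h(0..N) of the Nth-order Lagrange interpolator, Eq. (3.63)."""
    h = []
    for n in range(N + 1):
        c = 1.0
        for k in range(N + 1):
            if k != n:
                c *= (D - k) / (n - k)
        h.append(c)
    return h

N, D = 4, 1.7
forward = lagrange_fd_fir(N, D)
# Coefficients for the complementary delay N - D, read in reverse order.
mirrored = lagrange_fd_fir(N, N - D)[::-1]
max_diff = max(abs(a - b) for a, b in zip(forward, mirrored))
```

In an implementation this means that a single coefficient set can serve both a delay D and its complement N - D, simply by indexing the coefficient array from the other end.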
3.3.4 Computation of Coefficients for a Lagrange Interpolator
The first-order Lagrange interpolator corresponds to the well-known linear interpolation. The coefficients are obtained from Eq. (3.63) with N = 1 as

$$h(0) = 1 - D, \quad h(1) = D \tag{3.72}$$

Note that in the first-order case D = d since the integer part of D is zero. The formulas
for the filter coefficients of the Lagrange interpolator with N = 2 are

$$h(0) = \tfrac{1}{2}(D - 1)(D - 2), \quad h(1) = -D(D - 2), \quad h(2) = \tfrac{1}{2}D(D - 1) \tag{3.73}$$

and with N = 3,

$$\begin{aligned} h(0) &= -\tfrac{1}{6}(D - 1)(D - 2)(D - 3), & h(1) &= \tfrac{1}{2}D(D - 2)(D - 3), \\ h(2) &= -\tfrac{1}{2}D(D - 1)(D - 3), & h(3) &= \tfrac{1}{6}D(D - 1)(D - 2) \end{aligned} \tag{3.74}$$

In general, the coefficients h(n) of the Lagrange interpolator are determined by an
Nth-order polynomial in D. This implies that N additions and N multiplications per
coefficient are required to evaluate its value for a given D. Altogether, it takes
$N(N + 1) = N^2 + N$ additions and $N^2 + N$ multiplications to update all the N + 1 coefficients. An efficient procedure to update the coefficients of a Lagrange interpolator was
introduced in Välimäki (1992, Appendix A):
1) evaluate the differences D - 1, D - 2, ..., D - N;
2) evaluate the products D(D - 1), (D - 2)(D - 3), ..., (D - N + 1)(D - N);
3) evaluate the coefficients h(0), h(1), ..., h(N) by multiplying the results of Steps 1
and 2 and constant coefficients.
Using this technique, the computational burden of evaluating the coefficients of an Nth-order interpolator consists of N additions (subtractions) and less than $N^2 + N$ multiplications.
For example, computing the values of the four coefficients of a third-order Lagrange
interpolator using this method requires 3 additions and 10 multiplications. This is a
remarkable saving when compared to 12 additions and 12 multiplications that would be
required were the straightforward polynomial evaluation used.
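
One way to organize the third-order case so that it uses exactly 3 subtractions and 10 multiplications is sketched below. This is an illustration consistent with the operation counts quoted above, not a transcription of the procedure in the cited appendix; the function name `lagrange3_update` is hypothetical, and the constants -1/6, 1/2, -1/2 and 1/6 are assumed to be precomputed.

```python
def lagrange3_update(D):
    """Third-order Lagrange coefficients of Eq. (3.74) using shared partial products."""
    # Step 1: three subtractions.
    d1, d2, d3 = D - 1.0, D - 2.0, D - 3.0
    # Step 2: two products shared between coefficients (2 multiplications).
    p01 = D * d1
    p23 = d2 * d3
    # Step 3: two multiplications per coefficient, one of them by a constant (8 total).
    h0 = (-1.0 / 6.0) * (d1 * p23)
    h1 = 0.5 * (D * p23)
    h2 = -0.5 * (p01 * d3)
    h3 = (1.0 / 6.0) * (p01 * d2)
    return [h0, h1, h2, h3]

h = lagrange3_update(1.5)
```

For D = 1.5 the four values are symmetric about the centre of the filter, as expected for a delay halfway between the two central taps.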
The most efficient practical solution to the updating of interpolating coefficients is to
use table lookup. In software implementations this is usually the fastest way, and thus
most recommendable, provided that the required memory space is available.


3.3.5 Windowing Approach to Lagrange Interpolation


Kootsookos and Williamson (1995) have proved that the Lagrange interpolation coefficients can also be obtained by windowing the shifted and sampled sinc function with a
scaled binomial window. They only discuss the even-order Lagrange interpolator. It is
straightforward to extend this result for odd-order Lagrange filters. Below we repeat the
derivation of Kootsookos and Williamson (1995) with our notation and give the odd-order
extension.
The Lagrange interpolation coefficients can be expressed in the following way:

$$h(n, D) = \prod_{\substack{k=0 \\ k \ne n}}^{N} \frac{D - k}{n - k} = (-1)^{N-n}\binom{D}{n}\binom{D - n - 1}{L - n - 1} \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.75a}$$

where L = N + 1 and

$$\binom{D}{n} = \frac{\Gamma(D + 1)}{n!\,\Gamma(D - n + 1)} \tag{3.75b}$$

where $\Gamma(\cdot)$ is the gamma function. (The binomial coefficient must be evaluated using
the gamma function when one of its parameters is real or non-positive.) It is easy to
verify Eq. (3.75) by examining the form in Eq. (3.70). This result is valid for both even
and odd N.
When N is even, the samples of the sinc function under the window of length L = N +
1 can be expressed as (Kootsookos and Williamson, 1995)

$$h_{id}(n) = \mathrm{sinc}(n - D) = \frac{\sin[\pi(n - D)]}{\pi(n - D)} = \frac{(-1)^{N-n+1}\sin(\pi D)}{\pi(n - D)} \quad \text{for } n = 0, 1, 2, \ldots, N \;\; (N \text{ even}) \tag{3.76}$$

Now the formula of the Lagrange interpolation coefficients can be manipulated in the
following way (Kootsookos and Williamson, 1995):

$$\begin{aligned} h(n, D) &= (-1)^{N-n}\binom{D}{n}\binom{D - n - 1}{N - n} = (-1)^{N-n}\binom{D}{n}\binom{D - n}{N - n}\frac{D - N}{D - n} \\ &= (-1)^{N-n}\binom{D}{N}\binom{N}{n}\frac{D - N}{D - n} = (-1)^{N-n}\binom{D}{N + 1}\binom{N}{n}\frac{N + 1}{D - n} \\ &= \frac{(-1)^{N-n+1}\sin(\pi D)}{\pi(n - D)}\binom{D}{N + 1}\binom{N}{n}\frac{\pi(N + 1)}{\sin(\pi D)} \\ &= \frac{\pi(N + 1)}{\sin(\pi D)}\binom{D}{N + 1}\binom{N}{n}h_{id}(n) = C_{bin,e}(D, N)\,w_{bin}(n)\,h_{id}(n) \end{aligned} \tag{3.77}$$

where $w_{bin}(n)$ is the binomial window function defined by

$$w_{bin}(n) = \binom{N}{n} \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.78}$$

and $C_{bin,e}(D, N)$ is the scaling coefficient (in the case of even N)

$$C_{bin,e}(D, N) = \frac{\pi(N + 1)}{\sin(\pi D)}\binom{D}{N + 1} \tag{3.79}$$

Note that the binomial window approaches the Gaussian function as N approaches
infinity. The scaling coefficient $C_{bin,e}(D, N)$ is illustrated in Fig. 3.6 for some low-order
Lagrange interpolators when D - N/2 varies between -1 and 1.

[Figure, panel title "Even N": scaling coefficient versus fractional delay.]
Fig. 3.6 Examples of the scaling coefficient of the binomial window for computation of
even-order Lagrange interpolation coefficients (solid line: N = 2; dashed line:
N = 4; dash-dot line: N = 6).



Odd N
When N is odd, the sinc function can be rewritten as

$$h_{id}(n) = \mathrm{sinc}(n - D) = \frac{\sin[\pi(n - D)]}{\pi(n - D)} = \frac{(-1)^{N-n}\sin(\pi D)}{\pi(n - D)} \quad \text{for } n = 0, 1, 2, \ldots, N \;\; (N \text{ odd}) \tag{3.80}$$

Now the formula of the Lagrange interpolation coefficients can be manipulated in the
following way:

$$\begin{aligned} h(n, D) &= (-1)^{N-n}\binom{D}{n}\binom{D - n - 1}{N - n} \\ &= \frac{(-1)^{N-n+1}\sin(\pi D)}{\pi(n - D)}\binom{D}{N + 1}\binom{N}{n}\frac{\pi(N + 1)}{\sin(\pi D)} \\ &= (-1)\frac{\pi(N + 1)}{\sin(\pi D)}\binom{D}{N + 1}\binom{N}{n}h_{id}(n) = C_{bin,o}(D, N)\,w_{bin}(n)\,h_{id}(n) \end{aligned} \tag{3.81}$$

where the scaling coefficient is

$$C_{bin,o}(D, N) = -\frac{\pi(N + 1)}{\sin(\pi D)}\binom{D}{N + 1} \tag{3.82}$$

Figure 3.7 shows the scaling coefficient for some low-order Lagrange interpolators
(with odd N) for fractional delay values $0 \le d < 1$.
General Result
The two results can be combined by defining a scaling coefficient $C_{bin}(D, N)$ as

$$
C_{bin}(D, N) = (-1)^{N} \frac{\pi (N+1)}{\sin(\pi D)} \binom{D}{N+1} \tag{3.83}
$$

and the general form of the window-based design of an Nth-order Lagrange interpolator can be expressed as

$$
h(n) = C_{bin}(D, N)\, w_{bin}(n)\, \operatorname{sinc}(n - D) \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.84}
$$
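As a numerical cross-check of Eq. (3.84), the following Python sketch (NumPy assumed; the function names are illustrative and do not come from the original work) computes the Lagrange coefficients in two ways: from the classical product formula, and by windowing the shifted sinc function with the scaled binomial window. The windowed form is only usable for non-integer D, because sin(πD) appears in the denominator of the scaling coefficient.

```python
import numpy as np
from math import comb


def gen_binom(x, k):
    """Generalized binomial coefficient C(x, k) for real x and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (x - i) / (i + 1)
    return out


def lagrange_direct(N, D):
    """Lagrange coefficients h(n) = prod_{k != n} (D - k) / (n - k)."""
    n = np.arange(N + 1)
    h = np.ones(N + 1)
    for k in range(N + 1):
        mask = n != k
        h[mask] *= (D - k) / (n[mask] - k)
    return h


def lagrange_windowed(N, D):
    """Same coefficients via Eq. (3.84); D must not be an integer."""
    n = np.arange(N + 1)
    # Scaling coefficient of Eq. (3.83)
    c_bin = (-1) ** N * np.pi * (N + 1) / np.sin(np.pi * D) * gen_binom(D, N + 1)
    # Binomial window of Eq. (3.78)
    w_bin = np.array([comb(N, int(k)) for k in n], dtype=float)
    # np.sinc(x) is the normalized sinc, sin(pi x) / (pi x)
    return c_bin * w_bin * np.sinc(n - D)


print(lagrange_direct(3, 1.4))
print(lagrange_windowed(3, 1.4))
```

Both calls should print the same four coefficients up to rounding error.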

[Figure 3.7 (plot titled "Odd N"): Magnitude of Scaling Coefficient (vertical axis, 0 to 10) versus Fractional Delay (horizontal axis, 0 to 1).]
Fig. 3.7 Examples of the scaling coefficient of the binomial window for computation of
odd-order Lagrange interpolation coefficients (solid line: N = 1; dashed line: N
= 3; dash-dot line: N = 5).
3.3.6 Approximation Errors of Lagrange Interpolation
The approximation error of Lagrange interpolation depends drastically on the fractional
part d of the interpolation interval D. The error vanishes when d = 0. Namely, if D is an
integer, Eq. (3.63) can be written as
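$$
h(n) = \delta(n - D) \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.85}
$$

where δ(n − D) is the Kronecker delta function defined by

$$
\delta(n - D) =
\begin{cases}
1 & \text{when } n = D \\
0 & \text{when } n \neq D
\end{cases}
\tag{3.86}
$$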

In other words, the impulse response of the Lagrange interpolator reduces to a delayed
unit impulse when D is an integer. This implies that the Lagrange interpolation polynomial goes through the known signal samples.
The worst case is obtained with d = 0.5. Then, in the odd-order case, the impulse
response of the interpolator is even-symmetric with respect to its center of gravity, or
$$
h(n) = h(N - n) \quad \text{for } n = 0, 1, 2, \ldots, N \tag{3.87}
$$

The interpolator is thus a linear-phase FIR filter and its phase response is exactly −Dω, as it should be. Unfortunately, for an odd-order FIR filter this also means that it has a real zero at the Nyquist frequency (see, e.g., Parks and Burrus, 1987, pp. 23-26) and consequently the total error is larger than for any other value of d. For an even-order Lagrange interpolator there is no particular symmetry when d = 0.5, but the approximation error in both magnitude and phase is the largest.
The magnitude and phase responses of the first, second, and third-order Lagrange interpolators are illustrated in Figs. 3.8-3.10. Note that in these figures the normalized frequency 0.5 corresponds to the Nyquist frequency. It is seen that both the magnitude and the phase delay curves coincide with the ideal response at low frequencies as expected.

[Figure 3.8: two panels, Magnitude (dB) (upper) and Phase Delay in Samples (lower), both plotted against Normalized Frequency from 0 to 0.5.]
Fig. 3.8 The magnitude (upper) and phase delay (lower) response of a linear interpolator for eleven different fractional delay values (D = 0, 0.1, 0.2, ..., 1.0). Note that there are only six different curves in the upper figure (not eleven), because the magnitude responses for fractional delays d and 1 − d are the same. In the lower figure, the ideal phase delay in each case is illustrated by a dotted line.
Cain et al. (1994) have compared the approximation error of Lagrange interpolation
with that of other FD FIR filters. They concluded that it is preferable for applications
where a low-order FD filter is needed and a fullband approximation is not necessary.
From the viewpoint of waveguide models, Lagrange interpolation has two favorable
features:
1) it is accurate at low frequencies and

2) it never overestimates the amplitude of the signal when the delay has been chosen so that (N − 1)/2 ≤ D ≤ (N + 1)/2 when N is odd and (N/2) − 1 ≤ D ≤ (N/2) + 1 when N is even.

[Figure 3.9: two panels, Magnitude (dB) (upper) and Phase Delay in Samples (lower), both plotted against Normalized Frequency from 0 to 0.5.]
Fig. 3.9 The magnitude (upper) and phase delay (lower) response of a second-order Lagrange interpolator (N = 2) for eleven different fractional delay values (D = 0.5, 0.6, ..., 1.0, ..., 1.4, 1.5). Note that the magnitude responses for fractional delays D and N − D are the same. In the lower figure, the ideal phase delay in each case is illustrated by a dotted line.

The first advantage is justified by the fact that audio signals are usually lowpass signals
and thus it is wise to use an approximation technique that has the smallest error at low
frequencies.
The second property is called passivity and it implies that the magnitude response of
the Lagrange interpolator is less than or equal to one for the mentioned values of D.
This property is advantageous because in digital waveguide models the interpolator is
normally used inside a feedback loop and then it is extremely important to preserve the
loop gain less than unity. Otherwise the system may become unstable. Since a Lagrange
interpolator is a passive filter, the interpolation error only decreases the loop gain but
never increases it. It is interesting that in the case of even-order Lagrange interpolators,
the filter is passive on a range of two unit delays, while odd-order filters only have a
range of one unit delay.
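The passivity property can be probed numerically. The sketch below (Python with NumPy; the function names and the grid size are illustrative assumptions) evaluates the peak magnitude response of a Lagrange interpolator on a dense frequency grid for delays inside and outside the ranges stated above. A grid search of this kind is only an approximate check, not a proof of passivity.

```python
import numpy as np


def lagrange_fir(N, D):
    """Lagrange coefficients h(n) = prod_{k != n} (D - k) / (n - k)."""
    n = np.arange(N + 1)
    h = np.ones(N + 1)
    for k in range(N + 1):
        mask = n != k
        h[mask] *= (D - k) / (n[mask] - k)
    return h


def max_gain(h, num_points=1024):
    """Largest |H(e^{jw})| over a uniform grid on [0, pi]."""
    w = np.linspace(0.0, np.pi, num_points)
    n = np.arange(len(h))
    H = np.exp(-1j * np.outer(w, n)) @ h
    return np.max(np.abs(H))


# Third-order interpolator: passive range 1 <= D <= 2, non-passive example D = 0.7
for D in (1.0, 1.2, 1.5, 2.0, 0.7):
    print(f"N = 3, D = {D}: max gain = {max_gain(lagrange_fir(3, D)):.4f}")
```

For the in-range delays the printed gain should not exceed one (up to rounding), whereas the out-of-range value D = 0.7 should give a gain clearly above one.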
It has been experimentally noticed that the magnitude response of the Lagrange
interpolator exceeds unity when the delay parameter is out of the optimal range. An
example of this phenomenon is given in Fig. 3.11 for the third-order Lagrange interpolator when the parameter D varies from 0 to 1. (In this case the delay parameter should be in the range 1 ≤ D ≤ 2.) When 0.5 ≤ D < 1, the magnitude response is greater than one at all frequencies except ω = 0. When 0 < D < 0.5, the magnitude response exceeds unity at middle frequencies but shows lowpass behavior at the high end. (Note that the scale of the magnitude response in Fig. 3.11 is different from that in Figs. 3.8-3.10.) Also the phase delay approximation appears to be substantially worse now than when D is in the optimal range (compare with the phase response in Fig. 3.10).

[Figure 3.10: two panels, Magnitude (dB) (upper) and Phase Delay in Samples (lower), both plotted against Normalized Frequency from 0 to 0.5.]
Fig. 3.10 The magnitude (upper) and phase delay (lower) response of a third-order Lagrange interpolator (N = 3) for 11 equally spaced fractional delay values (D = 1.0, 1.1, 1.2, ..., 2.0).
The squared integral error has been computed for Lagrange interpolators of different
order using Eq. (3.31). Figure 3.12 shows the error as a function of the delay parameter
for odd-order filters N = 1, 3, and 5 and Fig. 3.13 for even-order filters N = 2, 4, and 6.
The odd-order Lagrange interpolators have an error curve that is symmetrical with
respect to the point D = N/2. The lobe between the middle taps has approximately the
form of the sine-squared function (Laakso et al., 1995a).
The even-order Lagrange interpolators also have error curves that are symmetrical with respect to D = N/2 (see Fig. 3.13). Note, however, that individual lobes of the error curves are asymmetrical. In practice the shape of these curves suggests that the lowest error is obtained when the even-order interpolator is used for the values (N − 1)/2 ≤ D ≤ (N + 1)/2. This is the same requirement as for the odd-order filter, although the even-order filter is passive when (N/2) − 1 ≤ D ≤ (N/2) + 1.

[Figure 3.11: two panels, Magnitude (dB) (upper) and Phase Delay in Samples (lower), both plotted against Normalized Frequency from 0 to 0.5.]
Fig. 3.11 The magnitude (upper) and phase delay (lower) response of a third-order Lagrange interpolator (N = 3) for 11 equally spaced fractional delay values (D = 0.0, 0.1, 0.2, ..., 1.0). Note that now D is varied on a non-optimal range since D ≤ (N − 1)/2.
As seen above, both even and odd-order Lagrange interpolators can be used for FD approximation. This is in contrast to some statements in the literature that recommend the use of odd-order interpolators only (see, e.g., Erup et al., 1993, p. 999). However, in dynamic FD applications even-order interpolators introduce a practical problem: when d passes the value 0.5, the locations of the filter taps have to be changed to maintain d within half a sample from the center point of the filter. This ensures that the overall approximation error is always as small as possible. However, this change of filter taps causes a discontinuity in the output signal. The odd-order Lagrange interpolators do not have this problem since the filter taps need to be moved only when D passes an integer value.

[Figure 3.12: Squared Error (vertical axis) versus Delay (horizontal axis).]
Fig. 3.12 Squared error of some odd-order Lagrange interpolators as a function of delay D: N = 1 (dash-dot line), N = 3 (dashed line), and N = 5 (solid line).
3.3.7 Farrow Structure of Lagrange Interpolation
In this section we present a new implementation structure for Lagrange interpolation. This derivation was first published by Välimäki (1994a, 1994b, 1995a). Lagrange interpolation is usually implemented using a direct-form FIR filter structure. An alternative structure is obtained by approximating the continuous-time function $x_c(t)$ by a polynomial in D, which is the interpolation interval or fractional delay. The interpolants, i.e., the new samples, are now represented by

$$
y(n) = x_c(n - D) = \sum_{k=0}^{N} c(k) D^k \tag{3.88}
$$

that takes on the value x(n) when D = n. The coefficients c(k) are solved from a set of N + 1 linear equations. Farrow (1988) suggested that every filter coefficient of an FIR interpolating filter could be expressed as an Nth-order polynomial in the delay parameter D. He stated that this results in N + 1 FIR filters with constant coefficients. The above approach to Lagrange interpolation is seen to be related to Farrow's idea.

After publication of the theory of this new structure, it was found out that the same idea had already been mentioned by Erup et al. (1993, Appendix, pp. 1007-1008) but without derivation. This is, however, an independent work. To the knowledge of the author, the derivation given here has not been published by anybody else.

[Figure 3.13: Squared Error (vertical axis) versus Delay (horizontal axis).]
Fig. 3.13 Squared error of some even-order Lagrange interpolators as a function of delay D: N = 2 (dash-dot line), N = 4 (dashed line), and N = 6 (solid line).
The alternative implementation for Lagrange interpolation is obtained by formulating the polynomial interpolation problem in the z-domain as

$$
Y(z) = H(z) X(z) \tag{3.89}
$$

where X(z) and Y(z) are the z-transforms of the input and output signals, x(n) and y(n), respectively, and the transfer function H(z) is now expressed as a polynomial in D (instead of $z^{-1}$):

$$
H(z) = \sum_{k=0}^{N} C_k(z) D^k \tag{3.90}
$$

The familiar requirement that the output sample should be one of the input samples for integer D may be written in the z-domain as

$$
Y(z) = z^{-D} X(z) \quad \text{for } D = 0, 1, 2, \ldots, N \tag{3.91}
$$

Together with Eqs. (3.89) and (3.90) this leads to the following N + 1 conditions

$$
\sum_{k=0}^{N} C_k(z) D^k = z^{-D} \quad \text{for } D = 0, 1, 2, \ldots, N \tag{3.92}
$$

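This may be expressed in matrix form as

$$
\mathbf{U} \mathbf{c} = \mathbf{z} \tag{3.93}
$$

where the L × L matrix U (with L = N + 1) is given by

$$
\mathbf{U} =
\begin{bmatrix}
0^0 & 0^1 & 0^2 & \cdots & 0^N \\
1^0 & 1^1 & 1^2 & \cdots & 1^N \\
2^0 & 2^1 & 2^2 & \cdots & 2^N \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
N^0 & N^1 & N^2 & \cdots & N^N
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
1 & 1 & 1 & \cdots & 1 \\
1 & 2 & 4 & \cdots & 2^N \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & N & N^2 & \cdots & N^N
\end{bmatrix}
\tag{3.94}
$$

vector c is

$$
\mathbf{c} = [C_0(z) \ \ C_1(z) \ \ C_2(z) \ \cdots \ C_N(z)]^T \tag{3.95}
$$

and the delay vector

$$
\mathbf{z} = [1 \ \ z^{-1} \ \ z^{-2} \ \cdots \ z^{-N}]^T \tag{3.96}
$$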

Note that U is the transpose of the matrix V given in Eq. (3.60b). The matrix U has the Vandermonde structure and thus it has an inverse matrix $\mathbf{U}^{-1}$. The solution of Eq. (3.93) can thus be written as

$$
\mathbf{c} = \mathbf{U}^{-1} \mathbf{z} \tag{3.97}
$$

The inverse matrix $\mathbf{U}^{-1}$, which we shall denote by Q, may be solved using Cramer's rule. The rows of the inverse Vandermonde matrix Q contain the filter coefficients used in the new structure, and thus it is convenient to write

$$
\mathbf{Q} = [\mathbf{q}_0 \ \ \mathbf{q}_1 \ \ \mathbf{q}_2 \ \cdots \ \mathbf{q}_N]^T \tag{3.98}
$$

The transfer functions $C_n(z)$ are thus obtained by inner product as

$$
C_n(z) = \mathbf{q}_n^T \mathbf{z} = \sum_{k=0}^{N} q_n(k) z^{-k} \quad \text{for } n = 1, 2, \ldots, N \tag{3.99}
$$

The coefficients $q_n(k)$ for the FIR filters $C_n(z)$ are computed by inverting the Vandermonde matrix U.
By setting D = 0 in Eq. (3.92), it is seen that

$$
\sum_{k=0}^{N} C_k(z)\, 0^k = 1 \quad \Rightarrow \quad C_0(z) = 1 \tag{3.100}
$$

This implies that the transfer function $C_0(z) = 1$ regardless of the order of the interpolator. The other transfer functions $C_n(z)$ given by Eq. (3.99) are Nth-order polynomials in $z^{-1}$, that is, they are Nth-order FIR filters.
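The coefficient matrix Q can also be obtained numerically instead of by Cramer's rule. The following Python sketch (NumPy assumed; `farrow_matrix` is a hypothetical helper name) builds the Vandermonde matrix U of Eq. (3.94) and inverts it with a general-purpose routine. This is adequate for the low orders discussed here, although Vandermonde matrices become ill-conditioned as N grows.

```python
import numpy as np


def farrow_matrix(N):
    """Return Q = U^{-1}, where U[m, k] = m**k for m, k = 0, ..., N.

    Row n of Q holds the coefficients q_n(0), ..., q_n(N) of the FIR filter C_n(z).
    """
    # increasing=True gives columns x**0, x**1, ..., x**N, as in Eq. (3.94)
    U = np.vander(np.arange(N + 1), N + 1, increasing=True).astype(float)
    return np.linalg.inv(U)


Q = farrow_matrix(2)
print(Q)  # rows: coefficients of C_0(z), C_1(z), C_2(z)
```

The first row of Q is always [1, 0, ..., 0], in agreement with $C_0(z) = 1$.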
We shall call the implementation technique described above the Farrow structure of Lagrange interpolation. A remarkable feature of this form is that the transfer functions $C_n(z)$ are fixed for a given order N. The interpolator is directly controlled by the fractional delay D, i.e., no computationally intensive coefficient update is needed when D is changed.

[Figure 3.14: block diagram with input x(n), FIR filters $C_N(z)$, $C_{N-1}(z)$, ..., $C_2(z)$, $C_1(z)$, multiplications by D, and output y(n).]
Fig. 3.14 The Farrow structure of Lagrange interpolation implemented using Horner's method. The overall transfer function has been formulated as a polynomial in the delay parameter D. The transfer functions $C_n(z)$ are Nth-order FIR filters with constant coefficients for a given N.

The Farrow structure is most efficiently implemented using Horner's method (see, e.g., Hildebrand, 1974, p. 28), that is

$$
\sum_{k=0}^{N} C_k(z) D^k = C_0(z) + \bigl[ C_1(z) + \bigl[ C_2(z) + \cdots + \bigl[ C_{N-1}(z) + C_N(z) D \bigr] D \cdots \bigr] D \bigr] D \tag{3.101}
$$

With this method N multiplications by D are needed. A general Nth-order Lagrange interpolator that employs the suggested approach is shown in Fig. 3.14. Since there is no need for the updating of coefficients, this structure is particularly well suited to applications where the fractional delay D is changed often, even after every sample interval.
If the delay is constant or updated very seldom, it is recommended to implement Lagrange interpolation using the standard FIR filter structure because it is computationally less expensive. Namely, with the FIR filter structure N + 1 multiplications and N additions are needed. In Farrow's structure, there are N Nth-order FIR filters, which results in N(N + 1) multiplications and N² additions. There are also N multiplications by D and N additions. Altogether this means N² + 2N multiplications and N² + N additions per output sample.
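The Horner evaluation of Eq. (3.101) can be sketched for a single output sample as follows. The code is an illustrative Python example (NumPy assumed; all function names are hypothetical): it computes the outputs of the fixed filters $C_k(z)$ once, combines them with N multiplications by D, and compares the result with the direct-form Lagrange FIR filter.

```python
import numpy as np


def lagrange_fir(N, D):
    """Direct-form Lagrange coefficients h(n) = prod_{k != n} (D - k) / (n - k)."""
    n = np.arange(N + 1)
    h = np.ones(N + 1)
    for k in range(N + 1):
        mask = n != k
        h[mask] *= (D - k) / (n[mask] - k)
    return h


def farrow_matrix(N):
    """Inverse Vandermonde matrix Q; row k holds the coefficients of C_k(z)."""
    U = np.vander(np.arange(N + 1), N + 1, increasing=True).astype(float)
    return np.linalg.inv(U)


def farrow_output(x_taps, Q, D):
    """One output sample; x_taps = [x(n), x(n-1), ..., x(n-N)]."""
    v = Q @ x_taps              # outputs of the fixed FIR filters C_0 ... C_N
    y = v[-1]
    for v_k in v[-2::-1]:       # Horner's rule: N multiplications by D
        y = v_k + D * y
    return y


rng = np.random.default_rng(0)
N = 3
Q = farrow_matrix(N)            # computed once, independent of D
x_taps = rng.standard_normal(N + 1)
for D in (1.0, 1.25, 1.5, 1.9):
    print(D, farrow_output(x_taps, Q, D), lagrange_fir(N, D) @ x_taps)
```

Only the scalar D changes between calls, which is the reason the structure suits delays that vary from sample to sample.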
As examples, let us solve for the transfer functions Cn (z) for linear interpolation and
second-order Lagrange interpolation. For N = 1, Eq. (3.93) yields
$$
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} C_0(z) \\ C_1(z) \end{bmatrix}
=
\begin{bmatrix} 1 \\ z^{-1} \end{bmatrix}
\tag{3.102}
$$

The solution is given by

$$
\begin{bmatrix} C_0(z) \\ C_1(z) \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 1 \\ z^{-1} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ z^{-1} \end{bmatrix}
=
\begin{bmatrix} 1 \\ -1 + z^{-1} \end{bmatrix}
\tag{3.103}
$$

It is seen that $C_1(z) = z^{-1} - 1$ when N = 1. The overall transfer function H(z) of the Farrow structure of linear interpolation is written as

$$
H(z) = 1 + (z^{-1} - 1) D \tag{3.104}
$$

Note that in this case D = d. The linear interpolator may thus be implemented by the structure illustrated in Fig. 3.15b. It is seen that the Farrow structure is as efficient as the direct-form nonrecursive structure (Fig. 3.15a) when N = 1, since the number of operations is the same in both. If multiplication is more expensive than addition, as it is in VLSI implementations, then the Farrow structure (Fig. 3.15b) is preferable.

[Figure 3.15: block diagrams a) and b), each with input x(n), one unit delay, and output y(n).]
Fig. 3.15 a) The direct-form FIR filter structure for linear interpolation and b) the equivalent Farrow structure.
For N = 2, Eq. (3.93) is written as
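$$
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \end{bmatrix}
\begin{bmatrix} C_0(z) \\ C_1(z) \\ C_2(z) \end{bmatrix}
=
\begin{bmatrix} 1 \\ z^{-1} \\ z^{-2} \end{bmatrix}
\tag{3.105}
$$

Now the inverse Vandermonde matrix Q is given by

$$
\mathbf{Q} = \mathbf{U}^{-1} =
\begin{bmatrix}
1 & 0 & 0 \\
-3/2 & 2 & -1/2 \\
1/2 & -1 & 1/2
\end{bmatrix}
\tag{3.106}
$$

The transfer functions thus obtained are

$$
C_0(z) = 1, \quad
C_1(z) = -\frac{3}{2} + 2 z^{-1} - \frac{1}{2} z^{-2}, \quad
C_2(z) = \frac{1}{2} - z^{-1} + \frac{1}{2} z^{-2}
\tag{3.107}
$$

and the overall transfer function H(z) can be written as

$$
H(z) = C_0(z) + C_1(z) D + C_2(z) D^2 \tag{3.108}
$$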

The Farrow form of parabolic or second-order Lagrange interpolation is illustrated in Fig. 3.16. In this example it is seen that the transfer functions $C_n(z)$ can share unit delays, because they all use the same delayed signal values x(n − k) with k = 0, 1, 2, ..., N. Furthermore, the number of multiplications can be reduced in a practical implementation. It may be taken into account that some of the coefficients have the value 1 or −1, thus eliminating the need for a multiplication [e.g., $q_2(2)$, the second coefficient of $C_2(z)$] and that the corresponding coefficients of two transfer functions can be equal [e.g., the third coefficient of $C_1(z)$ and $C_2(z)$]. These special cases are often met with the inverse Vandermonde matrix.

[Figure 3.16: block diagram with input x(n), two unit delays, coefficients 1/2, 1/2, and 3/2, and output y(n).]
Fig. 3.16 The Farrow structure for a second-order (N = 2) Lagrange interpolator. This structure involves 6 additions and 6 multiplications. The best fractional delay approximation is obtained when 0.5 ≤ D ≤ 1.5.
Modified Farrow Structure
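The Farrow structure can be made more efficient by changing the range of the parameter D so that the integer part is removed. The new parameter range is 0 ≤ d ≤ 1 (for odd N) or −0.5 ≤ d ≤ 0.5 (for even N). This change can be obtained by introducing a transformation matrix T defined by

$$
T_{n,m} =
\begin{cases}
\left[ \operatorname{round}\!\left( \dfrac{N}{2} \right) \right]^{n-m} \dbinom{n}{m} & \text{for } n \geq m \\
0 & \text{for } n < m
\end{cases}
\tag{3.109}
$$

where n, m = 0, 1, 2, ..., N. For N = 2 this matrix is expressed as

$$
\mathbf{T} =
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 2 \\
0 & 0 & 1
\end{bmatrix}
\tag{3.110}
$$

Multiplying the coefficient matrix Q by matrix (3.110), a modified coefficient matrix $\tilde{\mathbf{Q}}$ is obtained as

$$
\tilde{\mathbf{Q}} = \mathbf{T} \mathbf{Q} =
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 2 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
-3/2 & 2 & -1/2 \\
1/2 & -1 & 1/2
\end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 \\
-1/2 & 0 & 1/2 \\
1/2 & -1 & 1/2
\end{bmatrix}
\tag{3.111}
$$

This transformation is equivalent to substituting D = d + 1. The FIR filters of the modified structure are written as

$$
C_0(z) = z^{-1}, \quad
C_1(z) = -\frac{1}{2} + \frac{1}{2} z^{-2}, \quad
C_2(z) = \frac{1}{2} - z^{-1} + \frac{1}{2} z^{-2}
\tag{3.112}
$$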

[Figure 3.17: block diagram with input x(n), two unit delays, coefficients 1/2 and 1/2, multiplications by d, and output y(n).]
Fig. 3.17 The modified Farrow structure for a second-order (N = 2) Lagrange interpolator. This structure involves 4 additions and 4 multiplications. In this case the best fractional delay approximation is obtained when −0.5 ≤ d ≤ 0.5. Note, however, that the structure produces an extra delay of one sample so that the actual delay then lies within the range [0.5, 1.5].
Figure 3.17 shows the implementation of this modified second-order Farrow structure of Lagrange interpolation. It is seen that now only 4 multiplications and additions
are needed in contrast with 6 in the basic implementation shown in Fig. 3.16.
The final example of the modified Farrow structure of Lagrange interpolation presents the third-order system. The inverse Vandermonde matrix Q associated with the third-order case is given by

$$
\mathbf{Q} = \mathbf{U}^{-1} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
1 & 2 & 4 & 8 \\
1 & 3 & 9 & 27
\end{bmatrix}^{-1}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
-11/6 & 3 & -3/2 & 1/3 \\
1 & -5/2 & 2 & -1/2 \\
-1/6 & 1/2 & -1/2 & 1/6
\end{bmatrix}
\tag{3.113}
$$

This coefficient matrix does not yield a very efficient implementation since there are only two pairs of coefficients that can be combined. The coefficient matrix $\tilde{\mathbf{Q}}$ for the modified structure is obtained by multiplying Q by T, which yields

$$
\tilde{\mathbf{Q}} = \mathbf{T} \mathbf{Q} =
\begin{bmatrix}
1 & 1 & 1 & 1 \\
0 & 1 & 2 & 3 \\
0 & 0 & 1 & 3 \\
0 & 0 & 0 & 1
\end{bmatrix}
\mathbf{Q}
=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-1/3 & -1/2 & 1 & -1/6 \\
1/2 & -1 & 1/2 & 0 \\
-1/6 & 1/2 & -1/2 & 1/6
\end{bmatrix}
\tag{3.114}
$$

The block diagram of this system is shown in Fig. 3.18. It is seen that 11 additions and 9 multiplications are needed in the implementation of this system.
For comparison we consider third-order Lagrange interpolation implemented using the direct-form FIR filter structure with the coefficients updated according to the technique suggested in Section 3.3.4. The computational load for the coefficient update would be 3 additions and 10 multiplications, and for the computation of the output 3 additions and 4 multiplications. Altogether this yields 6 additions and 14 multiplications, which makes 20 operations, the same number as with the modified Farrow structure. The number of multiplications, however, is smaller in the case of the modified Farrow structure.

[Figure 3.18: block diagram with input x(n), three unit delays, coefficients 1/2, 1/3, 1/2, 1/2, 1/6, and 1/6, and output y(n).]
Fig. 3.18 The modified Farrow structure for a third-order (N = 3) Lagrange interpolator. This structure involves 11 additions and 9 multiplications. In this case the best fractional delay approximation is obtained when 0 < d ≤ 1. Note, however, that the structure produces an extra delay of one sample so that the actual delay lies within the range [1, 2].
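The modified coefficient matrices of Eqs. (3.111) and (3.114) can be reproduced with a short script. The Python sketch below (NumPy assumed; function names are hypothetical) builds the transformation matrix from the binomial expansion of (d + M)^k, which is what the substitution D = d + M amounts to. Two assumptions are made here: the integer shift is taken as M = floor(N/2), which matches the shift of one sample used in both the N = 2 and N = 3 examples, and the matrix is indexed as T[row, column] so that it is upper triangular as in Eq. (3.110).

```python
import numpy as np
from math import comb


def shift_matrix(N):
    """Upper-triangular T with T[row, col] = C(col, row) * M**(col - row)."""
    M = N // 2  # assumed integer shift: 1 for both N = 2 and N = 3
    T = np.zeros((N + 1, N + 1))
    for row in range(N + 1):
        for col in range(row, N + 1):
            T[row, col] = comb(col, row) * M ** (col - row)
    return T


def modified_farrow_matrix(N):
    """Modified coefficient matrix T @ Q with Q the inverse Vandermonde matrix."""
    U = np.vander(np.arange(N + 1), N + 1, increasing=True).astype(float)
    return shift_matrix(N) @ np.linalg.inv(U)


print(modified_farrow_matrix(2))
print(modified_farrow_matrix(3))
```

The zeros and repeated magnitudes in the printed matrices are what allow the reduced operation counts quoted for Figs. 3.17 and 3.18.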
3.3.8 Related Polynomial Interpolation Techniques
Lagrange interpolation is one representative of a class of polynomial interpolation
techniques. Other interpolation methods based on polynomials have not been widely
studied. Here we discuss some examples from recent DSP literature.
Matsui et al. (1991) proposed a tangent-line interpolation (TLI) technique for fractional delay approximation. They used it to control the fundamental frequency of a
speech synthesizer. The basic idea of the TLI method is to approximate the value of a
sampled signal in the neighborhood of a known sample x(n) by a straight line that takes
on the value x(n) at point n. This technique is not equivalent to linear interpolation since
the slope of the line is computed from the previous and next sample values. This results
in a three-tap FIR filter with coefficients
$$
h(0) = \frac{1 - D}{2}, \quad h(1) = 1, \quad \text{and} \quad h(2) = \frac{D - 1}{2} \tag{3.115}
$$

The frequency-domain properties of the TLI method were studied in Välimäki (1994b). It was found that there is an overshoot in the magnitude response of the filter with a maximum at about $f_s/4$. In the worst case (d = 0.5) this overshoot is about 1 dB. The phase delay of the TLI filter is exact at ω = 0, as in the case of Lagrange interpolators.
Unfortunately the phase delay approaches the value D = 1 almost linearly as a function
of frequency. Thus this method is unsuitable for tasks where the accuracy of fractional
delay approximation is important.
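The reported behavior of the TLI filter can be reproduced from Eq. (3.115). The following Python sketch (NumPy assumed; function and variable names are illustrative) evaluates the frequency response for D = 0.5 on a uniform grid, locates the magnitude peak, and estimates the phase delay at the lowest nonzero grid frequency.

```python
import numpy as np


def tli_fir(D):
    """Three-tap tangent-line interpolation filter of Eq. (3.115)."""
    return np.array([(1.0 - D) / 2.0, 1.0, (D - 1.0) / 2.0])


def freq_response(h, w):
    """H(e^{jw}) = sum_n h(n) exp(-j w n) at the angular frequencies w."""
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(w, n)) @ h


w = np.linspace(0.0, np.pi, 2001)           # 0 ... Nyquist
H = freq_response(tli_fir(0.5), w)
overshoot_db = 20.0 * np.log10(np.abs(H).max())
f_peak = w[np.argmax(np.abs(H))] / (2.0 * np.pi)   # in units of the sampling rate
low_freq_delay = -np.angle(H[1]) / w[1]            # phase delay near w = 0

print(f"overshoot {overshoot_db:.2f} dB at {f_peak:.3f} fs, "
      f"low-frequency phase delay {low_freq_delay:.3f} samples")
```

For D = 0.5 the response reduces to exp(−jω)[1 + 0.5j sin ω], so the peak gain is sqrt(1.25) at a quarter of the sampling rate, consistent with the overshoot of about 1 dB quoted above.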

Another polynomial interpolator with a closed-form expression for the coefficients has been introduced by Erup et al. (1993). This technique is called piecewise parabolic interpolation (PPI), and it is implemented with a four-tap FIR filter with coefficients

$$
\begin{aligned}
h(0) &= \alpha d^2 - \alpha d \\
h(1) &= -\alpha d^2 + (\alpha - 1) d + 1 \\
h(2) &= -\alpha d^2 + (\alpha + 1) d \\
h(3) &= \alpha d^2 - \alpha d
\end{aligned}
\tag{3.116}
$$

where α is a real-valued parameter and d is the fractional delay (0 ≤ d ≤ 1). This interpolation technique produces a constant delay of one sample plus a delay of approximately d. Note that the coefficients h(0) and h(3) are identical. With α = 0, the PPI filter reduces to linear interpolation.
Erup et al. (1993) recommend the PPI method with parameter value α = 0.5 for FD approximation. This filter was analyzed in Välimäki (1994b). Both the magnitude response and the phase delay are almost flat at low frequencies. At the high end the magnitude decreases and the phase delay approaches 1 or 2 depending on d. The main drawback of this technique is that the magnitude response exceeds unity at middle frequencies. In the worst case (d = 0.5), the overshoot at the culmination point $0.2 f_s$ is 0.74 dB or slightly less than 9%.
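The overshoot figure can be checked directly from Eq. (3.116). The Python sketch below (NumPy assumed; names are illustrative) builds the four-tap PPI filter for α = 0.5 and d = 0.5 and searches a dense frequency grid for the peak of the magnitude response.

```python
import numpy as np


def ppi_fir(d, alpha=0.5):
    """Four-tap piecewise parabolic interpolation filter of Eq. (3.116)."""
    return np.array([
        alpha * d * d - alpha * d,
        -alpha * d * d + (alpha - 1.0) * d + 1.0,
        -alpha * d * d + (alpha + 1.0) * d,
        alpha * d * d - alpha * d,
    ])


def freq_response(h, w):
    """H(e^{jw}) = sum_n h(n) exp(-j w n) at the angular frequencies w."""
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(w, n)) @ h


w = np.linspace(0.0, np.pi, 4001)
mag = np.abs(freq_response(ppi_fir(0.5, alpha=0.5), w))
overshoot_db = 20.0 * np.log10(mag.max())
f_peak = w[np.argmax(mag)] / (2.0 * np.pi)   # in units of the sampling rate

print(f"peak gain {mag.max():.4f} ({overshoot_db:.2f} dB) at {f_peak:.3f} fs")
```

Setting `alpha=0.0` returns the coefficients [0, 1 − d, d, 0], i.e., linear interpolation with one extra sample of delay.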
The properties of the PPI filter are quite comparable to those of low-order Lagrange
interpolators. Furthermore, the PPI filter can be implemented very efficiently with the
Farrow structure (Erup et al., 1993). This method is preferred over the third-order
Lagrange interpolator if the small overshoot in its magnitude response is not a serious
drawback. In waveguide models, the PPI filter could be used for tuning the delay-line
lengths if the loop gain were much less than unity at middle frequencies, that is, if a
great deal of losses were included. In this case the overshoot would not risk the stability
of the waveguide system.
Splines are a class of polynomial interpolation techniques that have several applications in numerical computation. So far they have been tried in many DSP applications,
such as image processing (Hou and Andrews, 1978; Unser et al., 1993b), sampling-rate
conversion (e.g., Cucchi et al., 1991; Zölzer and Boltze, 1994), and signal reconstruction
(Unser et al., 1992, 1993a). Aldroubi et al. (1992) have proved that a cardinal spline
interpolator converges toward the ideal interpolator as the order of the filter approaches
infinity. An interesting field of future research is to compare this technique with other
known methods, e.g., Lagrange interpolation.
3.3.9 Conclusion and Discussion
In this section the maximally flat design of an FD FIR filter, which is equivalent to classical Lagrange interpolation, was discussed. It appears to be well suited to waveguide modeling because the approximation error is small at low frequencies and it is passive (i.e., its magnitude response never exceeds unity) when the delay parameter is within the optimal range. It was shown that both the even and odd-order Lagrange interpolation filters can also be designed by windowing the shifted and sampled sinc function with a scaled binomial window.
A novel implementation technique for the Lagrange interpolation was derived. This formulation, called the Farrow structure, leads to a version of Lagrange interpolation that is well suited to time-varying FD filtering. The computational cost of this structure is in general the same as that for the direct-form FIR filter when the Lagrange interpolation coefficients are updated every cycle. The number of multiplications, however, is always smaller in the new structure. Suitable applications for the Farrow structure of Lagrange interpolation are, for example, irrational sampling-rate conversion (Laakso et al., 1995a) or time-varying resampling of a discrete-time signal.
Before leaving Lagrange interpolation, it is worthwhile emphasizing a potential point of misunderstanding. In a classical paper Rabiner and Schafer (1973) explain that when Lagrange interpolation is used for increasing the sample rate of a discrete-time signal by an integer factor Q, the operation will not change the phase of that signal since the interpolator has a linear phase function. Above we have conversely seen that in general Lagrange interpolators do not have a linear phase response. This disagreement is caused by the different way the interpolator is used: in upsampling by a factor Q, the interpolator will compute Q − 1 new signal values between every two input samples. Each of the new samples is obtained with a different nonlinear-phase Lagrange interpolator but the resulting signal (consisting of the original signal and its Q − 1 fractionally shifted versions) has a linear phase. On the other hand, in fractional delay filtering (usually) a single interpolator is used to produce the contiguous output samples, one sample per sampling interval, and for this reason the output signal suffers from phase distortion. If upsampling by Q is followed by decimation by an integer factor P, the resulting signal is not generally linear phase.

[Figure 3.19: block diagram with input x(n), output y(n), unit delays, feedforward coefficients $a_N$, $a_{N-1}$, ..., $a_1$, and feedback coefficients $-a_1$, ..., $-a_{N-1}$, $-a_N$.]
Fig. 3.19 Direct form I implementation of an Nth-order discrete-time allpass filter.

3.4 Design Methods for Fractional Delay Allpass Filters


Above we have studied the design of FIR filters for fractional delay approximation.
Now we show how recursive or IIR filters can be used for the same problem. Since the
magnitude response of an ideal FD element is perfectly flat, we consider only the so-called allpass filters. Their magnitude response is always flat irrespective of the filter
coefficients. Allpass filters are typically used for phase equalization and other signal
processing tasks where the phase characteristics are of greatest concern. Also the FD
approximation is essentially a phase approximation problem and thus the allpass filter is
particularly well suited to this task.
3.4.1 Properties of the Discrete-Time Allpass Filter
A discrete-time allpass filter has the transfer function

$$
A(z) = \frac{z^{-N} D(z^{-1})}{D(z)} = \frac{a_N + a_{N-1} z^{-1} + \cdots + a_1 z^{-(N-1)} + z^{-N}}{1 + a_1 z^{-1} + \cdots + a_{N-1} z^{-(N-1)} + a_N z^{-N}} \tag{3.117}
$$

where N is the order of the filter, D(z) is the denominator polynomial, and the filter
coefficients ak (k = 1, 2, ..., N) are real. The block diagram of the allpass filter is
depicted in Fig. 3.19.
The numerator of the allpass transfer function is a mirrored version of the denominator polynomial. The poles (i.e., roots of the denominator) of a stable allpass filter are
located inside the unit circle in the complex plane and as a result its zeros (i.e., roots of
the numerator) are located outside the unit circle so that their angle is the same but the
radius is the inverse of the corresponding pole. For this reason the magnitude response
of an allpass filter is flat. It is expressed as
$$
\left| A(e^{j\omega}) \right| = \left| \frac{e^{-j\omega N} D(e^{-j\omega})}{D(e^{j\omega})} \right| \equiv 1 \tag{3.118}
$$

Obviously, the name allpass filter comes from the above property that this filter passes
signal components of all frequencies without attenuating or boosting them.
The frequency response of the allpass filter can be expressed in the form

$$
A(e^{j\omega}) = e^{j \Theta_A(\omega)} \tag{3.119}
$$

This form stresses the fact that the main feature of an allpass filter is its phase response $\Theta_A(\omega)$. It can be written as

$$
\Theta_A(\omega) = \arg\{A(e^{j\omega})\} = -N\omega + 2\Theta_D(\omega) \tag{3.120}
$$

where $\Theta_D(\omega)$ is the phase response of $1/D(e^{j\omega})$, that is

$$
\Theta_D(\omega) = \arg\left\{ \frac{1}{D(e^{j\omega})} \right\}
= \arctan\left\{ \frac{\sum_{k=0}^{N} a_k \sin(k\omega)}{\sum_{k=0}^{N} a_k \cos(k\omega)} \right\}
= \arctan\left\{ \frac{\mathbf{a}^T \mathbf{s}}{\mathbf{a}^T \mathbf{c}} \right\}
\tag{3.121a}
$$

with $a_0 = 1$ and the vectors a, s, and c are defined by

$$
\mathbf{a} = [1 \ \ a_1 \ \ a_2 \ \cdots \ a_N]^T \tag{3.121b}
$$

$$
\mathbf{s} = [0 \ \ \sin\omega \ \ \sin(2\omega) \ \cdots \ \sin(N\omega)]^T \tag{3.121c}
$$

$$
\mathbf{c} = [1 \ \ \cos\omega \ \ \cos(2\omega) \ \cdots \ \cos(N\omega)]^T \tag{3.121d}
$$

The term allpass filter is sometimes used to denote any discrete-time system (FIR or IIR) whose magnitude response is almost flat at all frequencies ($|H(e^{j\omega})| \approx 1$). Throughout this text this term is used only for recursive discrete-time filters which have the exact allpass property, that is, $|H(e^{j\omega})| \equiv 1$.

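The phase delay $\tau_{p,A}(\omega)$ of an allpass filter can be expressed with the help of Eq. (3.120) as

$$
\tau_{p,A}(\omega) = -\frac{\Theta_A(\omega)}{\omega} = N - \frac{2\Theta_D(\omega)}{\omega} = N + 2\tau_{p,D}(\omega) \tag{3.122}
$$

where $\tau_{p,D}(\omega) = -\Theta_D(\omega)/\omega$ is the phase delay of $1/D(e^{j\omega})$. The corresponding group delay is given by

$$
\tau_{g,A}(\omega) = -\frac{d\Theta_A(\omega)}{d\omega} = N - 2\frac{d\Theta_D(\omega)}{d\omega} = N + 2\tau_{g,D}(\omega) \tag{3.123}
$$

where $\tau_{g,D}(\omega) = -d\Theta_D(\omega)/d\omega$ is the group delay of $1/D(e^{j\omega})$.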
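The allpass properties above are easy to confirm numerically. The following Python sketch (NumPy assumed; the function names and the example coefficients are arbitrary illustrations, chosen so that the poles lie inside the unit circle) evaluates an allpass frequency response from its denominator coefficients and compares it with the phase expression built from $\Theta_D(\omega)$, i.e., $-N\omega + 2\Theta_D(\omega)$.

```python
import numpy as np


def allpass_response(a, w):
    """A(e^{jw}) for denominator coefficients a = [1, a_1, ..., a_N] (real)."""
    N = len(a) - 1
    k = np.arange(N + 1)
    D_w = np.exp(-1j * np.outer(w, k)) @ a       # D(e^{jw})
    # For real coefficients, D(e^{-jw}) is the complex conjugate of D(e^{jw})
    return np.exp(-1j * N * w) * np.conj(D_w) / D_w


def allpass_phase(a, w):
    """Phase -N*w + 2*Theta_D(w), with Theta_D the phase of 1/D(e^{jw})."""
    N = len(a) - 1
    k = np.arange(N + 1)
    s = np.sin(np.outer(w, k)) @ a               # a^T s
    c = np.cos(np.outer(w, k)) @ a               # a^T c
    theta_D = np.arctan2(s, c)                   # four-quadrant arctangent
    return -N * w + 2.0 * theta_D


a = np.array([1.0, 0.5, 0.2])                    # example second-order denominator
w = np.linspace(0.01, np.pi, 500)
A = allpass_response(a, w)
print("max deviation of |A| from 1:", np.max(np.abs(np.abs(A) - 1.0)))
print("phase formula matches:", np.allclose(np.exp(1j * allpass_phase(a, w)), A))
```

The comparison is made on exp(jΘ) rather than on the phase itself so that 2π wrapping does not matter.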


3.4.2 Design of Fractional Delay Allpass Filters
There is a noteworthy difference between the design of FIR and allpass filters: the coefficients of an FIR filter are easily obtained by the inverse discrete-time Fourier transform of the frequency-domain specifications, since the coefficients of an FIR filter are
equal to the samples of its impulse response. However, the relationship between the
transfer function coefficients and the impulse response of an allpass filter is not this
simple. Hence, most of the design techniques for allpass filters are iterative. Many of
the design methods for FD allpass filters are counterparts of FIR design methods discussed in Section 3.2.
The desired or ideal phase response in the fractional delay approximation problem is

$$
\Theta_{id}(\omega) = -D\omega \tag{3.124}
$$

as may be seen from the ideal frequency response given by Eq. (3.8). The phase error of an allpass filter can be defined as the deviation from the desired phase function $\Theta_{id}(\omega)$ as

$$
\Delta\Theta(\omega) = \Theta_{id}(\omega) - \Theta_A(\omega) = 2 \arctan\left\{ \frac{\mathbf{a}^T \mathbf{s}_\beta}{\mathbf{a}^T \mathbf{c}_\beta} \right\} \tag{3.125}
$$

where a is the coefficient vector given by Eq. (3.121b), $\mathbf{s}_\beta$ and $\mathbf{c}_\beta$ are defined as

$$
\mathbf{s}_\beta = [\sin\{\beta(\omega)\} \ \ \sin\{\beta(\omega) - \omega\} \ \cdots \ \sin\{\beta(\omega) - N\omega\}]^T \tag{3.126a}
$$

$$
\mathbf{c}_\beta = [\cos\{\beta(\omega)\} \ \ \cos\{\beta(\omega) - \omega\} \ \cdots \ \cos\{\beta(\omega) - N\omega\}]^T \tag{3.126b}
$$

and β(ω) as

$$
\beta(\omega) = \frac{1}{2} \left[ \Theta_{id}(\omega) + N\omega \right] \tag{3.126c}
$$

When approximating a fractional delay D = N + d, or the ideal phase function $\Theta_{id}(\omega) = -D\omega = -(N + d)\omega$, the last form reduces to

$$
\beta(\omega) = -\frac{\omega d}{2} \tag{3.127}
$$
The weighted least-squares phase error that is to be minimized is defined as

$$E_{LS} = \frac{1}{\pi}\int_0^{\pi} W(\omega)\,|\Delta\Theta(\omega)|^2\, d\omega \tag{3.128}$$

where W(w ) is a nonnegative weighting function. Lang and Laakso (1994) have introduced an iterative algorithm for the design of the coefficient vector a. The algorithm
typically converges to the desired solution, but this cannot be guaranteed.
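The phase error of Eqs. (3.125) to (3.127) and the error measure of Eq. (3.128) are straightforward to evaluate for a given coefficient vector. The sketch below is only an evaluation of these formulas, not the iterative design algorithm of Lang and Laakso (1994). The function names, the midpoint integration rule, and the grid size are assumptions made for illustration.

```python
import math


def phase_error(a, d, w):
    """Delta Theta(w) = 2*arctan(a^T s_beta / a^T c_beta), Eq. (3.125).

    a = [1, a_1, ..., a_N]; d is the fractional part of D = N + d,
    so that beta(w) = -w*d/2 as in Eq. (3.127).
    """
    beta = -w * d / 2.0
    s = sum(ak * math.sin(beta - k * w) for k, ak in enumerate(a))
    c = sum(ak * math.cos(beta - k * w) for k, ak in enumerate(a))
    return 2.0 * math.atan2(s, c)


def ls_phase_error(a, d, weight=lambda w: 1.0, num=2000):
    """Midpoint-rule estimate of E_LS = (1/pi) * int_0^pi W |dTheta|^2 dw."""
    h = math.pi / num
    total = 0.0
    for i in range(num):
        w = (i + 0.5) * h  # midpoints avoid w = 0 exactly
        total += weight(w) * phase_error(a, d, w) ** 2
    return total * h / math.pi


if __name__ == "__main__":
    a = [1.0, -0.2]  # illustrative first-order coefficient vector
    print(ls_phase_error(a, 0.5))
    # Weighting by 1/w^2 turns the phase error into a phase delay error.
    print(ls_phase_error(a, 0.5, weight=lambda w: 1.0 / (w * w)))
```

Passing a different `weight` function restricts or emphasizes part of the frequency band without changing the rest of the computation.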
The LS phase delay error can be defined as

$$E_{LS} = \frac{1}{\pi}\int_0^{\pi} W(\omega)\,|\Delta\tau_p(\omega)|^2\, d\omega = \frac{1}{\pi}\int_0^{\pi} W(\omega)\left|\frac{\Delta\Theta(\omega)}{\omega}\right|^2 d\omega = \frac{1}{\pi}\int_0^{\pi} \frac{W(\omega)}{\omega^2}\,|\Delta\Theta(\omega)|^2\, d\omega \tag{3.129}$$

In other words, the phase delay error solution is obtained by introducing an additional weighting function $W(\omega) = 1/\omega^2$ to the phase error, Eq. (3.128). Allpass filters may be designed using the iterative algorithm modified from that mentioned above.
There exists a large variety of algorithms for equiripple or minimax allpass filter
design in terms of phase, group delay, or phase delay (see Laakso et al., 1994 for references). New iterative techniques for equiripple phase error and phase delay error design
are described in Laakso et al. (1994).
It appears that, just as with the FD FIR filters, a well suited digital allpass filter for
FD approximation in audio signal processing is obtained by using the maximally flat
design at w = 0. This technique deserves more attention than the others.
3.4.3 Maximally Flat Group Delay Design of FD Allpass Filters
The maximally flat group delay design is the only known FD allpass filter design technique that has a closed-form solution (Laakso et al., 1992, 1994). Here we discuss this
solution.
Thiran (1971) proposed an analytic solution for the coefficients of an all-pole lowpass filter with a maximally flat group delay response at the zero frequency

$$a_k = (-1)^k \binom{N}{k} \prod_{n=0}^{N} \frac{2d + n}{2d + k + n} \qquad \text{for } k = 0, 1, 2, \ldots, N \tag{3.130}$$

with the binomial coefficient

$$\binom{N}{k} = \frac{N!}{k!\,(N - k)!} \tag{3.131}$$

It has been shown by Thiran (1971) that when d > 0 the denominator polynomial obtained by the above method will have its roots within the unit circle in the complex plane, i.e., the filter will be stable. A drawback of Thiran's design technique is that the magnitude response of the all-pole lowpass filter cannot be controlled. In the allpass design, however, this kind of problem does not exist. Thus it seems that the result of Thiran is better suited to the design of allpass than all-pole filters.
As can be seen from Eq. (3.123) the group delay of an allpass filter is twice that of its all-pole counterpart. Thus Thiran's solution may easily be applied to the design of allpass filters by replacing d with d/2 in Eq. (3.130). The solution for the coefficients of a maximally flat (MF) allpass filter is thus (Laakso et al., 1992)

$$a_k = (-1)^k \binom{N}{k} \prod_{n=0}^{N} \frac{d + n}{d + k + n} \qquad \text{for } k = 0, 1, 2, \ldots, N \tag{3.132}$$

Note that a0 = 1 always and thus the coefficient vector a need not be scaled. This
closed-form solution to the allpass filter approximation is called the Thiran allpass filter.
The allpass filters designed using Eq. (3.132) approximate a group delay of N + d samples. This kind of parametrization is rather impractical. For this reason we substitute $d = D - N$ into Eq. (3.132). As a result we obtain the following formula:

$$a_k = (-1)^k \binom{N}{k} \prod_{n=0}^{N} \frac{D - N + n}{D - N + k + n} \qquad \text{for } k = 0, 1, 2, \ldots, N \tag{3.133}$$

Now the delay parameter D refers to the actual delay rather than the offset from N
samples. In Table 3.1 we give the coefficients for the Thiran allpass filters with N = 1,
2, and 3.
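Table 3.1 Coefficients of the Thiran allpass filters of order N = 1, 2, and 3.

| | $a_1$ | $a_2$ | $a_3$ |
|---|---|---|---|
| N = 1 | $-\dfrac{D - 1}{D + 1}$ | | |
| N = 2 | $-2\,\dfrac{D - 2}{D + 1}$ | $\dfrac{(D - 1)(D - 2)}{(D + 1)(D + 2)}$ | |
| N = 3 | $-3\,\dfrac{D - 3}{D + 1}$ | $3\,\dfrac{(D - 2)(D - 3)}{(D + 1)(D + 2)}$ | $-\dfrac{(D - 1)(D - 2)(D - 3)}{(D + 1)(D + 2)(D + 3)}$ |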
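The closed-form solution of Eq. (3.133) can be sketched in a few lines. The function name `thiran_coeffs` is hypothetical. The code uses an algebraically equivalent form of the product in which the factors common to numerator and denominator have been cancelled, leaving k factors each; this is a choice made here to avoid 0/0 at integer values of D and is not taken from the text.

```python
from math import comb


def thiran_coeffs(N, D):
    """Coefficients [a_0, ..., a_N] of the order-N Thiran allpass filter.

    Equivalent to Eq. (3.133) after cancelling common factors:
        a_k = (-1)^k * C(N, k) * prod_{m=1}^{k} (D - N + m - 1) / (D + m)
    """
    if D <= -1.0:
        raise ValueError("D must be greater than -1 for this form")
    coeffs = []
    for k in range(N + 1):
        prod = 1.0
        for m in range(1, k + 1):
            prod *= (D - N + m - 1) / (D + m)
        coeffs.append((-1) ** k * comb(N, k) * prod)
    return coeffs


if __name__ == "__main__":
    for N, D in ((1, 1.3), (2, 2.3), (3, 3.4)):
        print(N, D, thiran_coeffs(N, D))
```

The first coefficient is always 1, and for D = N all the remaining coefficients vanish, so no scaling of the coefficient vector is needed.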

Fig. 3.20 The phase delay (in samples) of a first-order (N = 1) Thiran allpass filter as a function of normalized frequency (0 to 0.5) for D = 0, 0.25, ..., 3. The dashed lines indicate the ideal phase delay which is constant.
Thiran's proof of stability implies that this allpass filter will be stable when $D > N$. We have experimentally observed that the Thiran allpass filter is also stable when the delay is $N - 1 < D < N$. At $D = N - 1$, one of the poles (as well as one zero) of the Thiran allpass filter will be on the unit circle making the filter asymptotically unstable. When $D < N - 1$, one or more poles will be outside the unit circle and the filter will be unstable.
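The stability regions described above can be probed numerically. The sketch below does not compute the poles themselves; instead it applies the standard step-down (Schur-Cohn type) recursion, which reports whether all roots of the denominator lie strictly inside the unit circle. This test is a technique substituted here for illustration, and `thiran_coeffs` and `is_stable` are hypothetical names.

```python
from math import comb


def thiran_coeffs(N, D):
    """Thiran allpass coefficients, Eq. (3.133) with common factors cancelled."""
    coeffs = []
    for k in range(N + 1):
        prod = 1.0
        for m in range(1, k + 1):
            prod *= (D - N + m - 1) / (D + m)
        coeffs.append((-1) ** k * comb(N, k) * prod)
    return coeffs


def is_stable(a):
    """True if all roots of 1 + a_1 z^-1 + ... + a_N z^-N are inside |z| = 1.

    Step-down recursion: the last coefficient of each reduced polynomial is a
    reflection coefficient, and every one must have magnitude below 1.
    A root on the unit circle is reported as not stable.
    """
    a = list(a)
    for m in range(len(a) - 1, 0, -1):
        k = a[m]
        if abs(k) >= 1.0:
            return False
        a = [(a[i] - k * a[m - i]) / (1.0 - k * k) for i in range(m)]
    return True


if __name__ == "__main__":
    N = 2
    for D in (0.5, 1.0, 1.5, 2.0, 2.3):
        print(D, is_stable(thiran_coeffs(N, D)))
```

For N = 2 this reproduces the pattern stated in the text: unstable below D = N - 1, a root on the unit circle at D = N - 1, and stable above it.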
In Figs. 3.20, 3.21, and 3.22, the phase delay responses of first, second, and third-order MF allpass interpolators are presented for several values of the delay parameter D starting at $D = N - 1$. The ideal phase delay (constant at all frequencies) is illustrated with a dotted line in each figure. It is seen that the Thiran allpass filter is most accurate at low frequencies. This is an expected result since Thiran designed the denominator polynomial so that the group delay and its N derivatives evaluated at the zero frequency match the desired result. As the filter order is increased, the phase delay curves remain nearly constant up to higher frequencies.
The closed-form design of the first-order allpass filter for varying the length of a
delay line has been discussed in papers by Jaffe and Smith (1983) and Smith and
Friedlander (1985). They obtained the solution by first expressing the phase response of
the first-order allpass filter by its Taylor series with the expansion point z = 0 and then
truncating this series after the first term. The expression for the filter coefficient a1 of
the first-order (N = 1) allpass filter is the same as that given in Table 3.1.

Fig. 3.21 The phase delay (in samples) of a second-order (N = 2) Thiran allpass filter as a function of normalized frequency (0 to 0.5) for D = 1, 1.25, ..., 4.


3.4.4 Choice of Optimal Range for D in the Thiran Interpolator
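In the case of FIR FD filters we noticed that it is a good choice to use values $(N - 1)/2 \le D < (N + 1)/2$, since then the mean squared error is the smallest in all cases of d. In the case of recursive filters this rule does not apply. In the following we analyze the optimal range for the delay parameter in the case of the Thiran allpass filter.
Let us examine the mean squared error function of the Thiran allpass filter given by

$$\begin{aligned}
E_S &= \frac{1}{\pi}\int_0^{\pi} \left|E(e^{j\omega})\right|^2 d\omega = \frac{1}{\pi}\int_0^{\pi} \left|A(e^{j\omega}) - H_{id}(e^{j\omega})\right|^2 d\omega \\
&= \frac{1}{\pi}\int_0^{\pi} \left|e^{j\Theta_A(\omega)} - e^{-j\omega D}\right|^2 d\omega = \frac{1}{\pi}\int_0^{\pi} \left[e^{j\Theta_A(\omega)} - e^{-j\omega D}\right]\left[e^{-j\Theta_A(\omega)} - e^{j\omega D}\right] d\omega \\
&= \frac{2}{\pi}\int_0^{\pi} \left\{1 - \cos[\Theta_A(\omega) + \omega D]\right\} d\omega
\end{aligned} \tag{3.134}$$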

Integration of the last form is difficult and we thus use numerical methods for investigating the approximation error. We use a numerical integration technique to evaluate the error for Thiran allpass filters of different order. Figure 3.23 gives the error $E_S$ as a function of the delay parameter D for Thiran allpass filters of orders 1 to 6.
It is seen that the error is zero for integral delay values $N - D = 1$ or 0. This implies that the Thiran allpass filter yields the original signal samples in these cases. As discussed before, the case $N - D = 1$ yields an asymptotically unstable filter. It is suggested not to use this kind of filter in practical work. The Thiran allpass filter can thus only be used for delay approximations such that $D > N - 1$.
In the case $D = N$ the Thiran filter reduces to a delay line because all the coefficients $a_k$ with $k \ge 1$ are zero, so that $A(z) = z^{-N}$. This is easily shown by noticing that each of these coefficients computed according to (3.133) has a factor $D - N$ in its numerator.

Fig. 3.22 The phase delay (in samples) of a third-order (N = 3) Thiran allpass filter as a function of normalized frequency (0 to 0.5) for D = 2, 2.25, ..., 5.
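A numerical evaluation of the error of Eq. (3.134) can be sketched as follows. The integrand is computed directly as $|A(e^{j\omega}) - e^{-j\omega D}|^2$, which equals $2\{1 - \cos[\Theta_A(\omega) + \omega D]\}$. The midpoint rule, the grid size, and the function names are assumptions of this sketch, and the absolute numbers it prints depend on these choices, so they are not claimed to reproduce the curves of Fig. 3.23.

```python
import cmath
import math
from math import comb


def thiran_coeffs(N, D):
    """Thiran allpass coefficients, Eq. (3.133) with common factors cancelled."""
    coeffs = []
    for k in range(N + 1):
        prod = 1.0
        for m in range(1, k + 1):
            prod *= (D - N + m - 1) / (D + m)
        coeffs.append((-1) ** k * comb(N, k) * prod)
    return coeffs


def squared_error(N, D, num=2000):
    """Midpoint-rule estimate of E_S = (1/pi) int_0^pi |A - H_id|^2 dw."""
    a = thiran_coeffs(N, D)
    h = math.pi / num
    total = 0.0
    for i in range(num):
        w = (i + 0.5) * h
        den = sum(ak * cmath.exp(-1j * k * w) for k, ak in enumerate(a))
        allpass = cmath.exp(-1j * N * w) * den.conjugate() / den
        ideal = cmath.exp(-1j * w * D)
        total += abs(allpass - ideal) ** 2
    return total * h / math.pi


if __name__ == "__main__":
    for D in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(D, squared_error(1, D))
```

The error vanishes at D = N, where the filter is a pure delay line, and also at D = N - 1, in agreement with the observation about integral delay values.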
A fractional delay filter can usually be used for approximating a delay D for which it holds

$$D_0 \le D < D_0 + 1 \tag{3.135}$$

To solve for the optimal choice for $D_0$ we may evaluate the average error $E_{ave}$ written as

$$E_{ave}(D_0) = \int_{D_0}^{D_0 + 1} E_S(D)\, dD \tag{3.136}$$

We can evaluate this integral numerically. Figure 3.24 presents the average error as a function of $D_0$ for some low-order Thiran allpass filters. The minimum of each curve is indicated by a circle in Fig. 3.24. These points correspond to the optimal $D_0$ for each filter. The values of $D_0$ and the corresponding average error $E_{ave}$ are presented in Table 3.2.

Fig. 3.23 Squared integral error of Thiran allpass filters of orders N = 1 to 6 (top to bottom) as a function of D - N.
Table 3.2 The optimal values for $D_0$ and minimum average errors for low-order Thiran allpass filters. $E_{ave}(N - 0.5)$ is the average error when $D_0 = N - 0.5$.

| N | $D_{0,opt}$ | $E_{ave}(D_{0,opt})$ | $E_{ave}(N - 0.5)$ |
|---|---|---|---|
| 1 | 0.418 | 0.040 | 0.043 |
| 2 | 1.403 | 0.030 | 0.033 |
| 3 | 2.396 | 0.025 | 0.027 |
| 4 | 3.392 | 0.022 | 0.024 |
| 5 | 4.390 | 0.019 | 0.022 |
| 6 | 5.389 | 0.018 | 0.020 |

Fig. 3.24 The average error of low-order Thiran allpass filters as a function of $D_0$ (horizontal axis: $D_0 - N$). The filter orders from top to bottom are N = 1, 2, 3, 4, 5, and 6. The minimum of each error function is indicated by a circle.

Some general observations can be made from Fig. 3.24. Apparently, the approximation error of the Thiran allpass filter is quite small for delay values $N - 1 < D < N$, reaching a minimum at the value given in Table 3.2. For much larger values of D, the approximation error increases considerably. It is seen in Fig. 3.24 that the average error does not increase very rapidly in the neighborhood of the minimum. Thus, if the best possible accuracy is not essential, one may use a simple rule of thumb to choose the range for D: for the Thiran allpass filter, a close-to-minimum error is obtained when $D_0 = N - 0.5$, which means that D is

$$N - 0.5 \le D < N + 0.5 \tag{3.137}$$

A similar statement as in the case of FIR FD filters can be made: the total delay of the Thiran allpass filters increases linearly with order N.
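One way to apply the rule of thumb of Eq. (3.137) is to split a desired total delay into an integer number of unit delays and a remainder that falls inside the recommended range of the Thiran filter. The following sketch illustrates that bookkeeping; the function name and the idea of a preceding integer delay line of length M are assumptions made for this example.

```python
import math


def split_delay(total_delay, N):
    """Split a total delay into (M, D) with N - 0.5 <= D < N + 0.5.

    M is the number of unit delays placed before the order-N Thiran allpass
    filter, and D is the delay the filter itself has to approximate.
    """
    M = math.floor(total_delay - (N - 0.5))
    if M < 0:
        raise ValueError("total delay is too short for this filter order")
    return M, total_delay - M


if __name__ == "__main__":
    for total in (10.3, 10.5, 10.75):
        print(total, split_delay(total, N=1))
```

Because the lower limit of the range grows with N, a higher-order filter leaves fewer unit delays for the integer part of the same total delay.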
3.4.5 Discussion
This section has dealt with allpass filter approximation of fractional delay. Digital allpass filters are a good choice for this task since their magnitude response is exactly flat
and the design can concentrate on the phase properties. The Thiran interpolator was discussed in detail since it is easy to design with closed-form formulas and it is very accurate at low frequencies.
The design of general IIR filters (with different numerator and denominator polynomials) for FD approximation has not been much discussed in the DSP literature. One
example is the paper by Tarczynski and Cain (1994) where reduced-bandwidth IIR filters are optimized iteratively. The design of optimal IIR filters is a difficult problem but
new techniques are certainly worth trying since generally speaking recursive filters are
more powerful than FIR filters. This is a fascinating topic for future research.


3.5 Time-Varying Fractional Delay Filters


It is interesting to investigate the situation where the delay parameter of an FD filter is
changed during operation, since in many applications the desired delay is not constant
but varies with time. In this section we study the implementation of a variable-length
digital delay line using FIR and IIR filters.
3.5.1 Consequences of Changing Filter Coefficients
The main issue is to understand what the change of the filter characteristics means for the output signal of the filter. We consider a single change of the coefficient set at time index n = nc. There will be two kinds of consequences:
1) The output signal may suffer from a transient phenomenon that starts at n = nc if
the state variables of the filter contain intermediate results related to the former
coefficient set, or if they are cleared or part of the input data is otherwise missing.
In the case of nonrecursive filters this problem does not exist when the filter has
been implemented in a correct way, but in the case of recursive filters the occurrence of transients must be taken into account somehow.
2) The output signal of a time-varying FIR or IIR filter experiences a discontinuity
at n = nc, which is simply caused by the fact that after the coefficients have been
changed the output signal will be a result of a different kind of filtering operation.
However, the discontinuity can cause problems in some cases.
Both the transient signal and the discontinuity depend on the input signal of the filter
and the magnitude of coefficient change: a small change in the values of the coefficients
causes a transient with a smaller amplitude and a smaller discontinuity than a large
change in coefficient values. The severity of the transients and discontinuity can be
decreased by making the change in filter parameters smaller, e.g., by allowing the filter
a reasonably long transition time when the values of the coefficients are gradually
changed (by interpolation) from the initial to the target values.
3.5.2 Implementation of a Variable Delay Using FIR Filters
We first examine the realization of a variable-length digital delay line using FIR interpolating filters. Interpolation is needed when the desired length of the delay line is not an integral multiple of the unit sample interval, in other words, when the value of a discrete-time signal is needed at a noninteger point $D = \lfloor D \rfloor + d$.
In principle it is straightforward to exploit FIR filters in a time-varying situation.
This is because they do not have transient problems when the filter coefficients are
modified. This is true, of course, only when the FIR filter has been implemented so that
the input samples that are needed for computing the output of the filter after the change
in the delay value are available. Special care is needed with FIR filters when the order
of the filter is changed or whole unit delays are added to or subtracted from the system.
An advantage of nonrecursive filters is that they are stable in all (time-invariant and
time-varying) situations.
Figure 3.25 illustrates the use of an FIR interpolator for controlling the length of a
delay line. The coefficient vector of the FIR interpolator is denoted by h and is defined
as

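$$\mathbf{h} = [h(0) \;\; h(1) \;\; h(2) \;\; \cdots \;\; h(N)]^T \tag{3.138}$$

Fig. 3.25 A variable-length digital delay line implemented using FIR interpolation. The input $x(n)$ passes through a delay line $z^{-M}$ followed by N unit delays $z^{-1}$; the tap signals $x(n - M)$, ..., $x(n - M - N)$ are weighted by $h(0)$, ..., $h(N)$ and summed to give the estimate $\hat{x}(n - M - D)$.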

For simplicity of notation, the dependency of the coefficients h(n) on the delay D is not explicitly shown. It should be remembered that we should use h(n, D) in all the equations. Various techniques for determining the interpolation coefficients h(n) have been discussed earlier in this chapter.
An estimate for the ideal interpolated signal value $x(n - M - D)$ is computed as (see Fig. 3.25)

$$\hat{x}(n - M - D) = \sum_{k=0}^{N} h(k)\, x(n - M - k) = \mathbf{h}^T \mathbf{x}_n \tag{3.139a}$$

where M is the number of unit delays in the delay line before the FIR interpolator [as defined by Eq. (3.37)] and the vector of signal samples to be used in interpolation is

$$\mathbf{x}_n = [x(n - M) \;\; x(n - M - 1) \;\; x(n - M - 2) \;\; \cdots \;\; x(n - M - N)]^T \tag{3.139b}$$

The length of a delay line can now be changed continuously by computing new values for the FIR filter coefficients h(n) at every sampling interval.
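The inner product of Eq. (3.139a) can be illustrated with a short sketch. Lagrange interpolation is used here as one possible choice of h(n), written with the standard Lagrange formula, which is assumed to agree with the design discussed earlier in the chapter. The function names and the list-based signal buffer are hypothetical.

```python
def lagrange_coeffs(N, D):
    """Lagrange FD coefficients h(n) = prod_{k != n} (D - k) / (n - k)."""
    h = []
    for n in range(N + 1):
        c = 1.0
        for k in range(N + 1):
            if k != n:
                c *= (D - k) / (n - k)
        h.append(c)
    return h


def interpolated_read(x, n, M, h):
    """Estimate x(n - M - D) as sum_k h(k) x(n - M - k), Eq. (3.139a)."""
    N = len(h) - 1
    if n - M - N < 0:
        raise IndexError("not enough past samples in the delay line")
    return sum(hk * x[n - M - k] for k, hk in enumerate(h))


if __name__ == "__main__":
    x = [float(i) for i in range(50)]  # a ramp, so x(t) = t
    M = 5
    for D in (1.0, 1.25, 1.5, 1.75):   # the delay can change at every call
        h = lagrange_coeffs(3, D)
        print(D, interpolated_read(x, 20, M, h))
```

Feeding a ramp makes the result easy to inspect, since the interpolated value of a linear signal should equal the fractional time index n - M - D.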
In the following we concentrate on the case where the delay parameter is changed at time index $n_c$. The delay parameter is thus a function of time so that

$$D(n) = \begin{cases} D_1, & n < n_c \\ D_2, & n \ge n_c \end{cases} \tag{3.140}$$

If the next delay value $D_2$ does not fulfill the requirement $(N - 1)/2 \le D_2 < (N + 1)/2$, the filter taps of the FIR FD filter have to be moved to obtain the best approximation (for a given approximation method and given N). In other words, the taps of the FIR filter have to be moved with respect to the delay line if d passes the value 0 when N is odd or d passes the value 0.5 when N is even. This also implies that an integer must be added to or subtracted from $D_2$ to have a delay that is within the optimal range.
Special attention must be paid to correctly implement the delay change where D2 >
D1 and the filter taps need to be moved: the transient signal will be completely eliminated if the delayed values of the input signal are available, i.e., the delay line is long
enough. One solution is to postpone the change of delay parameter for the time that it
takes to store the input signal values into the buffer memory (i.e., to wait until the delay
line is filled with samples needed by the FIR filter) and change the delay only then. This
will, however, cause an extra processing delay which depends on the value of the delay
parameter and which may not be acceptable.


A practical solution is to define a maximum delay $D_{max} \in \mathbb{R}$ which is the assumed largest delay value (in samples) that will be needed in a particular application. The length of the buffer that stores the delayed values of the input signal is then set long enough for approximating $D_{max}$. In practice this means that the delay line length $M_{tot}$ must be

$$M_{tot} = D_{max} + N/2 \tag{3.141}$$

Here the N / 2 extra samples are required because the fractional point D should be
located near the central tap of the FIR interpolator. This issue has been discussed in
Section 3.2.1 for the LS FIR filter design but it applies to other FD FIR filters as well.
It is also necessary to consider the case $D_2 < D_1$ when the filter taps should be moved. Since the filter taps should be located in such a way that $(N - 1)/2 \le D_2 < (N + 1)/2$, a delay $D_2 < (N - 1)/2$ cannot be approximated with good accuracy. (This discussion is irrelevant for the case N = 1 since for the first-order filter the smallest delay in the optimal range is zero.) There are two possible solutions to the problem of a small fractional delay:
1) to approximate the delay outside the optimal range or
2) to reduce the order N of the FIR FD filter.
Both methods imply reduced accuracy in the FD approximation. The choice depends on the application. In digital waveguide modeling, when Lagrange interpolation is employed, it is better to use a lower order filter (choice 2) since the Lagrange interpolator is not a passive filter outside its optimal range. From the discussion in Section 3.3.6 we can also deduce that this solution will actually give a smaller total error than using a higher-order filter outside its optimal range. Notice that the above discussion is mainly relevant to real-time implementation of an FIR delay filter.
The use of an FIR interpolator is favorable in the sense that no disturbing transients will be generated when the delay D is changed. However, a discontinuity will occur at the output. If the change in the coefficient values is small enough the discontinuity will be unnoticeable, but if the change is large a serious disturbance may occur for certain signals. For computational savings it may be necessary to update the interpolating coefficients less often than at every sampling interval, say every 10 or 100 sampling cycles. It is recommended to verify experimentally that updating is frequent enough, so that the discontinuities do not cause trouble.
3.5.3 Transients in Time-Varying Recursive FD Filters
Due to the recursive nature of IIR filters, abrupt changes in their coefficients cause disturbances in the future values of the internal state variables and transients in the output values of the filter. These disturbances depend on the filter's impulse response so that they are in principle infinitely long. In the case of stable filters, however, the disturbances decay exponentially and can be treated as having a finite length. In practice, these transients are often serious enough to cause problems, such as clicks in audio applications, and they are typically the most serious problem in the implementation of tunable or time-varying recursive filters.

(Footnote to Section 3.5.2, on real-time implementation of an FIR delay filter: In off-line processing where the entire input signal is stored in the computer memory and filtering is executed afterwards, the main concern is to take into account the edge effects due to the finite data size (see, e.g., Hamming, 1983): near the ends of the signal buffer some of the FIR filter's input signal values are missing (assumed to be zero). This would, in general, give rise to an error (transient) in the output signal. In most cases this transient at the beginning and at the end of the output signal can be removed by truncating the output signal.)
The waveform of the transient signal depends on the structure of the filter. For example, recursive filters implemented with direct form (DF) I and DF II structures will yield different transient signals after the change of filter coefficients even when their initial and target coefficients and the input signal are identical.
If the coefficients of a recursive filter are changed at constant time intervals, the disturbance caused by the transient bursts can yield a quasiperiodic signal. In audio applications, this is heard as a tonal sound.
3.5.4 Transient Elimination Methods for Time-Varying Recursive Filters
Despite the importance of the problem, there exists only a handful of research reports
on strategies for elimination of transients in time-varying recursive filters. To the
knowledge of the author, there are altogether five different techniques for this task:
1) a cross-fading method,
2) gradual variation of coefficients using interpolation (Mourjopoulos et al., 1990),
3) intermediate coefficient matrix (Rabenstein, 1988),
4) an input-switching method (Verhelst and Nilens, 1986), and
5) updating of the state vector (Zetterberg and Zhang, 1988).
These methods are briefly described in this section.
Cross-Fading Method
The cross-fading method is the most intuitive of all transient elimination techniques. In
this method, typically two copies, H1 (z) and H2 (z), of the time-varying filter are used.
The input signal is connected to both of these filters.
Let us assume that first filter $H_1(z)$ is used for filtering the input signal. When the coefficients need to be changed, the filter $H_2(z)$ is given the target filter coefficients. During a transition time $\Delta n$, the outputs of the filters are cross-faded (e.g., using linear interpolation) so that afterwards only the output of filter $H_2(z)$ is connected to the overall output.
This method has been used, for example, in audio applications such as auralization, where the incident angle of the binaurally processed audio signal is changed using digital filters designed based on measurements of head-related transfer functions (Jot et al., 1995). In this application it may be necessary to cross-fade between three or four filters, if the elevation angle, in addition to the horizontal angle, is being taken into account. The overall output signal is then a linear combination of the filter outputs.
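The cross-fading idea can be sketched with two first-order allpass filters that both receive the input all the time. The class and function names, the linear fade, and the choice to let the second filter run with its target coefficient from the start are assumptions of this illustration rather than details given in the text.

```python
class FirstOrderAllpass:
    """A(z) = (a + z^-1) / (1 + a z^-1): y(n) = a x(n) + x(n-1) - a y(n-1)."""

    def __init__(self, a):
        self.a = a
        self.x1 = 0.0  # previous input sample
        self.y1 = 0.0  # previous output sample

    def tick(self, x):
        y = self.a * x + self.x1 - self.a * self.y1
        self.x1, self.y1 = x, y
        return y


def crossfade(x, a_old, a_new, n_c, fade_len):
    """Fade linearly from filter H1 (a_old) to H2 (a_new) starting at n_c."""
    h1, h2 = FirstOrderAllpass(a_old), FirstOrderAllpass(a_new)
    out = []
    for n, xn in enumerate(x):
        y1, y2 = h1.tick(xn), h2.tick(xn)  # both filters see the input
        g = min(max((n - n_c) / fade_len, 0.0), 1.0)  # fade gain, 0 to 1
        out.append((1.0 - g) * y1 + g * y2)
    return out


if __name__ == "__main__":
    x = [((7 * n) % 13) / 13.0 for n in range(200)]  # arbitrary test input
    y = crossfade(x, a_old=-0.2, a_new=0.1, n_c=100, fade_len=20)
    print(y[95:125])
```

The price of the method is visible in the loop: two filters are computed for every sample, and more copies would be needed if a new change could arrive before the previous fade has finished.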
Gradual Variation Technique
In gradual variation of coefficients one also allows a transition time $\Delta n$ (in samples) during which the coefficients monotonically change from their initial values to the target values (Mourjopoulos et al., 1990; Zölzer et al., 199). Mourjopoulos et al. (1990) defined a filter variation parameter $\eta$ as

$$\eta = \frac{M + 1}{\Delta n\, T} \tag{3.142}$$

where M is the number of intermediate coefficient sets and T is the sampling interval.
This parameter describes the number of intermediate coefficients per unit time. Note
that it is assumed that each intermediate coefficient set is running for several sampling
cycles.
Mourjopoulos et al. (1990) studied the useful range of values for parameter $\eta$ by analyzing the transients in an audio filtering application using a computational model of human auditory properties. Their results were that when $\eta$ is small (i.e., a small number of intermediate filter coefficients is used) the transition is perceivable because there are discontinuities in the filter parameters; at least 18 to 35 states per second, depending on parameter settings, should be used to ensure smooth coefficient transition. Abrupt changes generate audible transients. Mourjopoulos et al. (1990) conclude that these distortions are most easily heard when the input signal is a sine wave. Speech and music sounds mask the majority of the transient effects.
Nikolaidis et al. (1993) have considered a variation of the gradual variation technique where filter parameters (instead of coefficients) are interpolated. In this paper they also propose a generalization where two filters, $H_1(z)$ and $H_2(z)$, are used in parallel. Both filters receive the input signal all the time but the output signal of only one of them is connected to the output of the overall system. This goes on for $n_u$ sampling cycles, which is determined as

$$n_u = \frac{\Delta n}{M + 1} \tag{3.143}$$

At the same moment when the output of the filter $H_1(z)$ is connected to the overall output, the coefficients of $H_2(z)$ are changed to correspond to the next intermediate set and its output is disconnected. After $n_u$ samples, the output is switched again and the coefficients of $H_1(z)$ are changed. Nikolaidis et al. (1993) also propose a two-speed implementation where the off-time filter runs at a higher sampling rate. With this approach this filter in effect has a longer time (a multiple of $n_u$) to settle before its output is switched on.
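The quantities of Eqs. (3.142) and (3.143) are simple to compute, and the intermediate coefficient sets can be generated by interpolation. In the sketch below the function names are hypothetical, and linear interpolation between the initial and target coefficients is an assumption; the text only requires that the coefficients change monotonically.

```python
def variation_parameter(M, delta_n, T):
    """eta = (M + 1) / (delta_n * T), Eq. (3.142): coefficient sets per second."""
    return (M + 1) / (delta_n * T)


def samples_per_set(M, delta_n):
    """n_u = delta_n / (M + 1), Eq. (3.143): samples each set is in use."""
    return delta_n / (M + 1)


def intermediate_sets(start, target, M):
    """M linearly interpolated coefficient sets strictly between the end points."""
    sets = []
    for i in range(1, M + 1):
        t = i / (M + 1)
        sets.append([(1.0 - t) * s + t * g for s, g in zip(start, target)])
    return sets


if __name__ == "__main__":
    fs = 44100.0       # sampling rate in Hz, chosen for illustration
    delta_n = 4410     # transition time of 0.1 s
    M = 3              # number of intermediate coefficient sets
    print(variation_parameter(M, delta_n, 1.0 / fs))
    print(samples_per_set(M, delta_n))
    print(intermediate_sets([-0.2], [0.1], M))
```

With these example numbers the transition uses four steps in a tenth of a second, which lies above the range of 18 to 35 states per second quoted from Mourjopoulos et al. (1990).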
Intermediate Coefficient Matrix
Another technique to reduce the transients is the use of an intermediate coefficient
matrix (Rabenstein, 1988). There are altogether three coefficient matrices involved in
one change of filter characteristics: the initial, the intermediate, and the target matrices.
The intermediate matrix is used for one sampling cycle to compensate for the transient.
This approach involves assumptions on the nature of the input signal. The drawback of
this technique is that the filter has to be implemented in a state-variable form which is
far more expensive in terms of operations than a direct-form realization. For this reason
we feel that this technique is not suitable for real-time DSP applications.
Input-Switching Method
The input-switching method was introduced by Verhelst and Nilens (1986). Their application was speech synthesis using a cascade formant filter bank where the coefficients of the recursive filters are abruptly changed every M samples.


In this technique several copies of each filter are used. When the coefficients need to
be changed, the input is switched to the next filter and its coefficients are changed and
the state vector cleared. The outputs of the filters are summed so that the decay of other
filters and the starting transient of the newest one are all present in the output signal.
Verhelst and Nilens (1986) called this the "modified-superposition strategy" but the term "input-switching method" suggested by Jot appears to be more intuitive.
Verhelst and Nilens (1986) point out that the responses of the filters gradually attenuate after their input is disconnected. For this reason it is not necessary to keep increasing the number of parallel filters; it suffices to run only a couple of filters so that the oldest filter is neglected at the time of coefficient change. In speech synthesis, a free-running filter can be disconnected after some 20 ms (Verhelst and Nilens, 1986). Naturally, the removed filter can be recycled so that its state is cleared and its coefficients are given the values of the next target filter. In practice, some 2 to 5 filters per time-varying filter need to be implemented.
Verhelst and Nilens report that the input-switching method is immune to formant-labeling errors that can cause large perturbations in the output signal of a conventional cascade speech synthesizer.
Updating of the State Vector
The most general approach to transient suppression was presented by Zetterberg and Zhang (1988) by means of a state-space formulation of the recursive filter. Their main result was that, assuming a stationary input signal, every change in filter coefficients should be accompanied by an appropriate change in the internal state variables. This guarantees that the filter switches directly from one state to another without any transient response. The Zetterberg-Zhang (ZZ) method can completely eliminate the transients but it does require that all the past input samples are known. This makes the approach impractical as such but provides a good starting point for more efficient approximate algorithms. In the next section we build on the ZZ method. First we review this technique in detail.
Let us consider a recursive filter structure that can be expressed in the state-variable form as

$$\mathbf{v}(n + 1) = \mathbf{F}\mathbf{v}(n) + \mathbf{q}\,x(n) \tag{3.144a}$$

$$y(n) = \mathbf{g}^T \mathbf{v}(n) + g_0\, x(n) \tag{3.144b}$$

where x(n) and y(n) are the input and output signal of the filter, respectively. The column vector v(n) contains the state variables, and the matrix F, vectors q and g, and
coefficient g0 contain the filter coefficients. These matrices and vectors depend on the
implementation structure of the filter.
(Footnote on the term input-switching method: J.M. Jot, personal e-mail communication, Sept. 29, 1995.)

The state-variable vector can be expressed as a function of the input signal x(n) and coefficient matrices when the coefficients have been changed at time index $n_c$, that is (Zetterberg and Zhang, 1988)

$$\mathbf{v}(n) = \begin{cases} \mathbf{F}_1^{\,n}\,\mathbf{v}(0) + \displaystyle\sum_{k=0}^{n-1} \mathbf{F}_1^{\,n-1-k}\,\mathbf{q}\,x(k), & 0 < n \le n_c \\[2ex] \mathbf{F}_2^{\,n-n_c}\,\mathbf{v}(n_c) + \displaystyle\sum_{k=n_c}^{n-1} \mathbf{F}_2^{\,n-1-k}\,\mathbf{q}\,x(k), & n > n_c \end{cases} \tag{3.145}$$

where $\mathbf{v}(0)$ is the initial state of the filter, and $\mathbf{F}_1$ and $\mathbf{F}_2$ are the coefficient matrices before and after sample index $n_c$, respectively. After the coefficients have been changed, the state vector can be expressed as [substituting the upper form of Eq. (3.145) into the lower one and elaborating]

$$\mathbf{v}(n) = \mathbf{F}_2^{\,n-n_c}\,\mathbf{F}_1^{\,n_c}\,\mathbf{v}(0) + \sum_{k=0}^{n-1} \mathbf{F}_2^{\,n-k-1}\,\mathbf{q}\,x(k) + \mathbf{F}_2^{\,n-n_c}\,\Delta\mathbf{v}(n_c) \tag{3.146a}$$

with

$$\Delta\mathbf{v}(n_c) = \sum_{k=0}^{n_c-1} \left[\left(\mathbf{F}_1^{\,n_c-k-1} - \mathbf{F}_2^{\,n_c-k-1}\right)\mathbf{q}\right] x(k) \tag{3.146b}$$

The first term in (3.146a) is the contribution of the initial conditions to the state of the filter. We assume that the initial transient has died out so that this term can be neglected. The second term is the response of the filter to the input after the parameters have changed, and the last term represents the transient in the state vector due to coefficient change. Note that $\Delta\mathbf{v}(n_c)$ depends on the difference of the coefficient matrices $\mathbf{F}_1$ and $\mathbf{F}_2$ which implies that a small change in the coefficients causes a transient response with a smaller amplitude than a large change.
A way to completely eliminate the transient caused by the change of coefficients is to subtract the term $\Delta\mathbf{v}(n_c)$ from the state vector at time $n_c$ (Zetterberg and Zhang, 1988). Equivalently, the state vector can be replaced with the following vector (Välimäki et al., 1995b, 1995d)

$$\mathbf{v}_2(n_c) = \sum_{k=0}^{n_c-1} \mathbf{F}_2^{\,n_c-k-1}\,\mathbf{q}\,x(k) \tag{3.147}$$

A more pragmatic way to view the ZZ method of transient elimination is to consider that a copy of a recursive filter is running for each coefficient set that is ever needed in
the application. All these filters are connected to the input of the system, but only one of
them (the one with the appropriate coefficient set) is connected to the output at a time.
The change of coefficients is then executed by selecting the filter with the appropriate
coefficient set as illustrated in Fig. 3.26 for two filters. It is easy to understand that there
will not be transients because of change of filter characteristics (assuming that the
starting transient has died out) since all the filters are in the steady state all the time.
A serious drawback of this technique is that one filter is needed for each coefficient
set. In many applications, such as in fractional delay filtering, practically a continuum
of parameter values is desired, and this implies a very large number of filters running in
parallel.
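
The parallel-filter view can be illustrated with a minimal Python sketch. It is not taken
from Zetterberg and Zhang (1988); it assumes first-order all-pole filters
y(n) = x(n) - a y(n-1), and the names `one_pole_step`, `zz_switch`, `naive_switch` and
`sine` are hypothetical. Both filters run on every input sample and the change of
coefficients only selects which one is heard, whereas the naive version swaps the
coefficient of a single filter and keeps the old state.

```python
import math


def sine(count, omega):
    """Test input: `count` samples of sin(omega * n)."""
    return [math.sin(omega * n) for n in range(count)]


def one_pole_step(a, state, x):
    """One step of y(n) = x(n) - a*y(n-1); the returned output is also the new state."""
    return x - a * state


def zz_switch(x, a1, a2, nc):
    """Run both filters on all input; output filter 1 before nc and filter 2 from nc on."""
    s1 = s2 = 0.0
    y = []
    for n, xn in enumerate(x):
        s1 = one_pole_step(a1, s1, xn)
        s2 = one_pole_step(a2, s2, xn)
        y.append(s1 if n < nc else s2)
    return y


def naive_switch(x, a1, a2, nc):
    """Single filter whose coefficient is swapped at nc while the old state is kept."""
    s = 0.0
    y = []
    for n, xn in enumerate(x):
        s = one_pole_step(a1 if n < nc else a2, s, xn)
        y.append(s)
    return y
```

After the switch, `zz_switch` returns exactly the output of the second filter as if it had
been running alone from the start, while `naive_switch` deviates from it by a decaying
transient. The cost is one running filter per coefficient set, which is the drawback noted
above.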

Fig. 3.26 A transientless implementation of a time-varying recursive filter according to
Zetterberg and Zhang (1988). A single change of filter characteristics [from
H1(z) to H2(z)] at time n = nc is considered.
[Block diagram with labels: x(n), H1(z), H2(z), n = nc, y(n).]

3.5.5 A Novel and Efficient Method for Elimination of Transients
The motivation for the development of a new transient elimination method was to find a
practical way to update the state variables of a recursive filter when the filter coefficients are changed abruptly. It is clearly impractical to compute the state vector for a
given coefficient set starting from time 0. We present a solution for transient suppression that gives an acceptable performance at a reasonable implementation complexity.
This method was initially proposed by Välimäki et al. (1995b, 1995d).
The new approach is applicable to any recursive filter structure but it is especially
well suited for the direct form (DF) II structure. The DF II structure is described by the
following state-variable presentation:
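$$
\mathbf{v}(n) = [\,v_1(n) \;\; v_2(n) \;\; \cdots \;\; v_N(n)\,]^T
\tag{3.148a}
$$

$$
\mathbf{F} =
\begin{bmatrix}
-a_1 & -a_2 & \cdots & -a_{N-1} & -a_N \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}
\tag{3.148b}
$$

$$
\mathbf{q} = [\,1 \;\; 0 \;\; \cdots \;\; 0\,]^T
\tag{3.148c}
$$

$$
\mathbf{g} = [\,b_1 - b_0 a_1 \;\;\; b_2 - b_0 a_2 \;\;\; \cdots \;\;\; b_N - b_0 a_N\,]^T
\tag{3.148d}
$$

$$
g_0 = b_0
\tag{3.148e}
$$
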

where ak and bk are the denominator and numerator coefficients of the recursive filter,
respectively (k = 0, 1, 2, ..., N). The coefficient a0 is equal to 1. The next-state and output equations of the state-variable representation are given by Eq. (3.144a) and
(3.144b), respectively. The transfer function of the recursive filter is
$$
H(z) = \frac{B(z)}{A(z)} = \frac{b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 + a_1 z^{-1} + \cdots + a_N z^{-N}}
\tag{3.149}
$$

A key observation is that the state vector of a DF II structure is affected by the past
input values only in proportion to the magnitude of the impulse response hr (n) of its
recursive part, that is

$$
h_r(n) =
\begin{cases}
0, & n < 0 \\
1, & n = 0 \\
\mathbf{q}^T \mathbf{F}^{\,n-1}\,\mathbf{q}, & n > 0
\end{cases}
\tag{3.150}
$$

Fig. 3.27 The implementation of the novel transient elimination technique.
[Block diagram with labels: x(n), Signal Filter H(z), y(n), Transient Eliminator 1/A(z),
v2(n), z^(-Na), D(n).]

This is because vector g, that contains the feedforward coefficients, does not affect the
state vector in (3.144a). The elements of the updated state vector (at time n = nc)

$$
\mathbf{v}_2(n_c) = [\,v_{2,1}(n_c) \;\; v_{2,2}(n_c) \;\; \cdots \;\; v_{2,N}(n_c)\,]^T
\tag{3.151}
$$

can be rewritten by means of the IR h2,r (n) of the recursive part of the filter H2 (z):
$$
v_{2,m}(n_c) = \sum_{k=0}^{n_c-1} h_{2,r}(k)\,x(n_c - m + 1 - k) = y_r(n_c - m + 1),
\qquad m = 1, 2, \ldots, N
\tag{3.152}
$$

where yr (n) is the output signal of the recursive part of the IIR filter.
Thus, the knowledge of the effective length of the impulse response hr (n) (i.e., how
many values of this impulse response are observably nonzero for the application) helps
to estimate how many past input samples need to be taken into account in updating the
state vector. The effective length of an infinite impulse response can be determined
based on its time constant or by specifying an energy-based criterion. These approaches
are discussed in Appendix B of this thesis. Denoting the effective length of the filter's
impulse response (in samples) by Na we can replace expression (3.147) by (Välimäki et
al., 1995b, 1995d)
$$
\mathbf{v}_2(n_c) \approx \sum_{k=n_c-N_a}^{n_c-1} \mathbf{F}_2^{\,n_c-k-1}\,\mathbf{q}\,x(k)
= \sum_{k=1}^{N_a} \mathbf{F}_2^{\,k-1}\,\mathbf{q}\,x(n_c - k)
\tag{3.153}
$$

Let us consider implementation of the transient elimination technique when the time-varying
filter H(z) = B(z)/A(z) is implemented using the DF II structure. The transient
elimination method then works as shown in Fig. 3.27: the input signal x(n) is fed into
two systems, the actual IIR filter H(z) (hereafter called the signal filter) and the transient eliminator that consists of its recursive part 1/A(z). When the filter coefficients are
changed, the coefficients of the eliminator are updated immediately. The coefficients of
the signal filter are updated Na samples after that, and at the same time the state vector
is copied from the transient eliminator to the signal filter. The parameter Na is called the
advance time (in samples) of the transient eliminator. The larger the value of this
parameter, the more the transient signal is being suppressed.
Fig. 3.28 The output signal of a first-order Thiran allpass filter when the delay parameter
is changed from 1.42 to 0.42 at time index 20. The upper part shows the
output signal of the time-varying filter without transient elimination (solid
line) and the ideal output signal (dashed line). The lower part is the transient
signal which is obtained as the difference of the two upper signals.
[Plots: panels "Output Signal" and "Transient", amplitude versus time (samples), 0 to 50.]

The state variables of the transient eliminator are updated according to the following
equations:

$$
v_{2,k}(n) = v_{2,k-1}(n-1) \qquad \text{for } k = 2, 3, \ldots, N
\tag{3.154a}
$$

$$
v_{2,1}(n) = x(n) + \sum_{k=1}^{N} (-a_k)\,v_{2,k}(n-1)
\tag{3.154b}
$$


It is seen that Eqs. (3.154a) and (3.154b) give a state-variable description of an all-pole
filter without output. The output signal of the transient eliminator is not needed since
the purpose of this filter is to provide the state vector for the signal filter.
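
A minimal Python sketch of this update, written directly from Eqs. (3.154a) and (3.154b),
is given below. The function names `eliminator_step` and `eliminator_state` are
hypothetical, and the eliminator is assumed to start from rest (all-zero state) when the
new coefficients are loaded, so that running it over the last Na input samples produces a
truncated-history state in the spirit of Eq. (3.153).

```python
def eliminator_step(a, v, x):
    """One update of the all-pole eliminator.

    a : [a1, ..., aN], denominator coefficients of the target filter (a0 = 1 implied)
    v : [v1, ..., vN], state at time n-1
    Returns the state at time n.
    """
    v1 = x - sum(ak * vk for ak, vk in zip(a, v))   # Eq. (3.154b)
    return [v1] + v[:-1]                            # Eq. (3.154a): shift the old states


def eliminator_state(a, recent_inputs):
    """State obtained by running the eliminator from rest over the given inputs.

    recent_inputs holds the last Na input samples, oldest first.
    """
    v = [0.0] * len(a)
    for x in recent_inputs:
        v = eliminator_step(a, v, x)
    return v
```

No output signal is computed, in line with the remark above: only the state vector is
needed, and it is copied to the signal filter when the signal filter's coefficients are
updated.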
This implementation technique requires that the coefficient update is deferred by Na
sample cycles and during that time the eliminator is executed in parallel with the signal
filter as illustrated in Fig. 3.27. If only one eliminator is used, it is then possible to
change the coefficients at most once every Na samples. This lag can be compensated for
if the next coefficient values are known beforehand at least Na samples prior to the
change. Then it is not necessary to defer the coefficient update but instead the eliminator can be started Na samples before the actual time of change.
The above method of transient elimination can be used with practically all kinds of
recursive filters. The technique is most efficiently implemented when the feedforward
and feedback parts of the filter have their own state variables or when only the recursive
part affects the state of the filter, such as in the case of the DF II structure. In these
cases the transient eliminator can be an all-pole filter. On the other hand, if the state
variables depend on both the recursive and the nonrecursive parts, the transient
eliminator must be a pole-zero filter.

Fig. 3.29 The same example as in Fig. 3.28 but now the signal with the solid line is
obtained using the new transient elimination technique with advance time Na
= 0. This corresponds to clearing the state of the allpass filter at time index
nc.
[Plots: panels "Output Signal" and "Transient (Na = 0)", amplitude versus time (samples),
0 to 50.]

Two-Speed Implementation of the New Method
The new transient elimination technique can be modified to enable instantaneous coefficient
update as often as needed, even every sampling interval (Välimäki et al.,
1995b, 1995d). This can be realized by keeping the last Na input samples in a buffer
memory and executing the necessary computations in a single sample interval by running
the eliminator at Na times the sampling rate of the signal filter. This two-speed
method, of course, requires sufficiently fast hardware.
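
The two-speed idea can be sketched in Python for a first-order allpass filter in direct
form II. This is an illustrative sketch under stated assumptions, not the implementation
of the cited papers: the class name `TwoSpeedAllpass` and its methods are hypothetical, the
filter is H(z) = (a + z^-1)/(1 + a z^-1), and the buffered inputs are replayed through the
recursive part 1/A(z) inside `set_coefficient`, which stands in for running the eliminator
Na times faster than the signal filter.

```python
from collections import deque


class TwoSpeedAllpass:
    """First-order allpass H(z) = (a + z^-1) / (1 + a z^-1) in direct form II."""

    def __init__(self, a, advance):
        self.a = a
        self.w = 0.0                          # DF II state, i.e. w(n-1)
        self.recent = deque(maxlen=advance)   # last Na input samples, oldest first

    def set_coefficient(self, a_new):
        """Change the coefficient and rebuild the state from the buffered inputs."""
        w = 0.0                               # the eliminator starts from rest
        for x in self.recent:                 # replay 1/A(z) with the new coefficient
            w = x - a_new * w
        self.a = a_new
        self.w = w

    def process(self, x):
        """Filter one sample with the current coefficient."""
        w = x - self.a * self.w               # recursive part
        y = self.a * w + self.w               # feedforward part
        self.w = w
        self.recent.append(x)
        return y
```

With `advance=0` the buffer is empty and `set_coefficient` simply clears the state. For
this first-order case and inputs bounded by 1 in magnitude, the deviation from a filter
that has used the new coefficient all along is at most (1 + |a|) |a|^Na, so it shrinks
geometrically as the advance time grows.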


Fig. 3.30 The same example as in Fig. 3.28. The signal with the solid line is obtained
using the new transient elimination technique with advance time Na = 1.
[Plots: panels "Output Signal" and "Transient (Na = 1)", amplitude versus time (samples),
0 to 50.]

3.5.6 Examples with a Time-Varying Thiran Allpass Filter
We have experimented with the new transient elimination method described in Section
3.5.5 to verify its quality. In the following examples, the input signal is a low-frequency
sine signal. The delay parameter D of a first-order Thiran allpass filter is changed from
1.418 to 0.418 at time index nc = 20. The corresponding pole radii are 0.173 and 0.410,
respectively. Note that the decay rate of the transient is primarily determined by the
pole of the target filter (i.e., Thiran allpass filter with D = 0.418).
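
The quoted pole radii can be checked with a few lines of Python. The sketch uses the
closed-form coefficient of the first-order Thiran allpass filter, a1 = (1 - D)/(1 + D),
for H(z) = (a1 + z^-1)/(1 + a1 z^-1); the function names are hypothetical.

```python
def thiran1_coefficient(D):
    """Coefficient a1 of the first-order Thiran allpass filter for delay D."""
    return (1.0 - D) / (1.0 + D)


def pole_radius(D):
    """The single pole lies at z = -a1, so its radius is |a1|."""
    return abs(thiran1_coefficient(D))


print(round(pole_radius(1.418), 3))   # 0.173
print(round(pole_radius(0.418), 3))   # 0.41
```

The pole of the target filter (D = 0.418) is the larger of the two, which is why it
dominates the decay of the transient.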
The result without transient elimination is given in Fig. 3.28. The upper part shows
the output signal of the filter with a solid line and the output signal of an ideal
time-varying FD filter with a dashed line. The ideal result is obtained by concatenating two
shifted sine signals (i.e., output signals of two ideal interpolators). The lower part of
Fig. 3.28 gives the difference of the two output signals. This signal can be interpreted as
the transient. An exponentially decaying transient with peak value 0.41 is produced.
In Fig. 3.29, the state of the filter is cleared at the time of coefficient change. This
case is obtained with the transient eliminator when the advance time parameter Na is set
to zero. The amplitude of the transient is now 0.58, substantially larger than without
using the eliminator. This trick is definitely not worth using in practice.
Figure 3.30 presents the result obtained with advance time Na = 1. The amplitude of
the transient is 0.24. In effect, the first sample of the transient in Fig. 3.29 has now been
truncated. It appears that Na should be larger to suppress the transient.


Fig. 3.31 The same example as in Fig. 3.28. The signal with the solid line is obtained
using the new transient elimination technique with advance time Na = 4. The
slow oscillation in the lower figure is caused by the phase error of the Thiran
allpass filter.
[Plots: panels "Output Signal" and "Transient (Na = 4)", amplitude versus time (samples),
0 to 50; the transient axis is scaled by 10^-3.]

Finally, Fig. 3.31 shows the result with Na = 4. Now the maximum amplitude of the
difference signal is 0.0099. The low-frequency oscillation due to the phase approximation error of the Thiran allpass filter is clearly visible in Fig. 3.31.

3.6 Conclusions
Methods for the design of FIR and allpass fractional delay digital filters were studied.
The maximally flat design of both the FIR and allpass interpolating filters has turned
out to be useful in the context of waveguide modeling and they are also the only FD
filter designs that have closed-form formulas for the coefficients. These techniques were
thus discussed in detail. A novel implementation technique for Lagrange interpolation
was introduced in Section 3.3.7. A new kind of parametrization was proposed for the
Thiran allpass filter that was originally introduced by Laakso et al. (1992). The approximation errors were investigated and an optimal range for the delay parameter of Thiran
allpass filters of different orders was determined.
Finally, this chapter discussed time-varying fractional delay filters. The FIR filters
do not suffer from transients when their coefficients or locations of filter taps are
changed, unless the filter has been implemented in an incorrect way. The output signal
of an FIR filter will, however, be discontinuous at the time of the change. Recursive
filters suffer from both problems: transients and discontinuities. Several transient elimination methods from recent DSP literature were reviewed.
A new technique to suppress the transients in recursive filters was proposed and its
quality was analyzed. This method can suppress the transient signal as much as needed
by increasing the advance time parameter. The amount of computation is directly proportional to this parameter. The implementation costs of this technique also depend on
the structure of the recursive filter. The direct-form II realization results in an efficient
method where effectively 1.5 filters need to be implemented and executed in parallel:
the signal filter itself and a copy of its recursive part. This solution is computationally
more efficient than any of the previously known transient elimination techniques.
In the next chapter, interpolated waveguide filter structures that employ fractional
delay filters are developed.

4 Fractional Delay Waveguide Filters


The fractional delay filters studied in Chapter 3 are essential in digital waveguide models. This is a consequence of the fact that waveguide models deal with propagation
delays that are generally not integral multiples of the sampling interval. When FD
approximation is used, spatially continuous discrete-time simulation of a physical
waveguide is achieved.
In practice fractional delays are needed for two purposes in waveguide models:
1) for controlling the length of a digital waveguide, and
2) for connecting two or more waveguides at arbitrary points.
The former case, which involves interpolation of the signal value of delay lines, has
been used for several years. Jaffe and Smith (1983) used a first-order allpass filter to
tune a waveguide string model. Karjalainen and Laine (1991) showed how Lagrange
interpolation can be applied to the same problem.
The latter approach was introduced recently by Välimäki et al. (1993a). In that paper
it was shown how two or three digital waveguides with different impedances can be
connected so that their junction is located between two sampling points. This technique
can be employed, e.g., for modeling the human vocal tract or for modeling the finger
holes of woodwind instruments.
In this chapter, several structures for fractional delay waveguide filters are discussed.
Section 4.1 discusses deinterpolation that is the inverse operation of interpolation. In
Section 4.2 it is shown how fractional delay waveguide systems can be realized by utilizing the operations of interpolation and deinterpolation. Two specific fractional delay
junction structures, an interpolated junction of two and three acoustic tubes, are tackled.
In Sections 4.3 and 4.4 they are implemented using FIR FD filters whereas Sections
4.5 to 4.7 introduce the corresponding allpass FD structures.

4.1 Deinterpolation
Another basic operation employed in fractional delay waveguide models side by side
with interpolation is called deinterpolation. It is needed when connecting digital
waveguides at fractional points. Deinterpolation was first introduced in Välimäki et al.
(1993a). Earlier, Karjalainen and Laine (1991) had been using this technique for modeling a vibrating string, but the novelty of this technique was not recognized at that time.
4.1.1 Introduction to Deinterpolation
Interpolation is used for estimating the value of the signal between known samples. In
contrast, deinterpolation is used for computing the sample values when the amplitude of
the signal between sampling points is known. Thus, deinterpolation can be understood
as an inverse operation to interpolation in the sense that it is an operation used for
computing the input of a system while interpolation estimates the value of the output of
a system.

Fig. 4.1 a) Interpolation from a delay line. b) Deinterpolation into a delay line.

Fig. 4.2 More detailed diagrams of a) interpolation and b) deinterpolation showing the
fractional delay elements.

In Fig. 4.1a, an output is located at an arbitrary point on the delay line, and the output
value is generally obtained by bandlimited interpolation. In Fig. 4.1b, an input is located
at an arbitrary point along the delay line. If the input point does not correspond to one
of the sample points of the delay line but is directed between them, bandlimited deinterpolation must be applied to spread the incoming signal to the adjacent signal samples
of the delay line.
More detailed versions of the block diagrams given in Fig. 4.1 are presented in Fig.
4.2 showing the individual unit and fractional delay elements. One of the unit delay
elements of the delay line has been split into two pieces, that is
$$
z^{-1} = z^{-(d + \bar{d})} = z^{-d}\,z^{-\bar{d}}
\tag{4.1}
$$

where d is the fractional delay (0 < d < 1) and $\bar{d}$ is the complementary fractional
delay defined by

$$
\bar{d} = 1 - d
\tag{4.2}
$$

By examining the two cases presented in Figs. 4.2a and 4.2b, it can be understood
that in interpolation it is needed to produce a fractional delay of d whereas in
deinterpolation the complementary fractional delay $\bar{d}$ is required. Thus, the
configurations of Fig. 4.2 can be equivalently presented as shown in Fig. 4.3. Now the
fractional delays have been pulled out from the delay line. Consequently, the delay line
is again constructed of a chain of unit delays. Note that the deinterpolator is connected
to the delay line one unit delay to the right from the point where the corresponding
interpolator is connected (see Fig. 4.3).

Fig. 4.3 The operations of a) interpolation and b) deinterpolation can equivalently be
presented by pulling the fractional delay elements out from the delay lines.

Footnote: The name deinterpolation has been given to this operation instead of the more
obvious inverse interpolation because in mathematics, inverse interpolation denotes the
problem of finding an estimate for the argument x when the value of the function y = f(x)
is given (see, e.g., Hildebrand, 1974, pp. 68-70 or Kreyszig, 1988, p. 969).
Deinterpolation is thus not the same operation as inverse interpolation. Note also
that the operation of decimation, which is used in multirate signal processing as an
inverse operation of interpolation, is yet another technique.

4.1.2 Deinterpolation Versus Interpolation
In the following we discuss the relation between nonrecursive (or FIR) interpolation
and nonrecursive deinterpolation. The latter may be interpreted as an inverse operation
of the former. The inverse nature of these operations is essentially reflected by the fact
that the FIR interpolator approximates the total delay D while the FIR deinterpolator
approximates the complementary total delay $\bar{D}$ defined by

$$
\bar{D} = N - D
\tag{4.3}
$$

where N is the order of the FIR interpolator. The fractional part $\bar{d}$ of $\bar{D}$
is related to d according to Eq. (4.2). The relation of D and $\bar{D}$ is illustrated in
Fig. 4.4 for an even filter order N.
The ideal FIR interpolator and the corresponding ideal FIR deinterpolator can thus
be presented as systems with transfer functions $z^{-D}$ and $z^{-\bar{D}}$, respectively.
Were an Nth-order FIR interpolator and the corresponding Nth-order FIR deinterpolator
cascaded, the resulting system would merely cause an incoming sequence x(n) to be
delayed by N samples.

Fig. 4.4 A nonrecursive interpolator approximates the delay D but a nonrecursive
deinterpolator the complementary delay $\bar{D}$. Also shown are the fractional delay d
and the complementary fractional delay $\bar{d}$.

If the impulse response of an FIR interpolating filter approximating delay D is
reversed in time, the resulting filter approximates the complementary delay $\bar{D}$. This
statement is easily confirmed by considering a hypothetical Nth-order linear-phase
interpolating filter with the frequency response

$$
H(e^{j\omega}) = \left| H(e^{j\omega}) \right| e^{-j\omega D}
\tag{4.4}
$$

The inversion in time corresponds to inverting the sign of the phase function. To make
the filter causal, it is necessary to add a delay of N samples. The resulting time-reversed
and delayed interpolator has a frequency response
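$$
H_{rev}(e^{j\omega}) = e^{-j\omega N} H(e^{-j\omega})
= e^{-j\omega N} \left| H(e^{j\omega}) \right| e^{j\omega D}
= \left| H(e^{j\omega}) \right| e^{-j\omega (N - D)}
= \left| H(e^{j\omega}) \right| e^{-j\omega \bar{D}}
\tag{4.5}
$$

where $\bar{D}$ is the complementary total delay defined by (4.3). This clarifies the relation of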
the nonrecursive interpolator and deinterpolator. In Section 3.3.3 it was shown that for
Lagrange interpolation the coefficients for the delay N - D are the same as for D, but in
reversed order.
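
The coefficient reversal for Lagrange interpolation is easy to verify numerically. The
following Python sketch uses the standard Lagrange coefficient formula
h(n) = prod over k != n of (D - k)/(n - k), with n and k running from 0 to N; the function
name `lagrange_fd` is hypothetical.

```python
def lagrange_fd(N, D):
    """Lagrange interpolation coefficients h(0), ..., h(N) for a total delay of D samples."""
    h = []
    for n in range(N + 1):
        c = 1.0
        for k in range(N + 1):
            if k != n:
                c *= (D - k) / (n - k)
        h.append(c)
    return h


N, D = 3, 1.3
h = lagrange_fd(N, D)                    # interpolator, approximates a delay of D
h_comp = lagrange_fd(N, N - D)           # approximates the complementary delay N - D
print(all(abs(a - b) < 1e-12 for a, b in zip(h_comp, reversed(h))))   # True
```

The deinterpolator coefficients therefore do not need a separate design step: the
interpolator coefficients are read in the opposite order.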
4.1.3 Implementing Deinterpolation Using an FIR Filter
Nonrecursive deinterpolation can be thought of as a superposition of the signal values
onto the delay line using the transposed FIR filter structure with interpolating
coefficients (Välimäki et al., 1993a). The transpose of a digital filter structure is
obtained by reversing the direction of all branches, replacing all branch nodes with
summation nodes (i.e., adders) and adders with branch nodes, and interchanging the input
and output nodes of the system (see, e.g., Jackson, 1989, pp. 74-76).


Fig. 4.5 The standard direct-form FIR filter structure that can be used for nonrecursive
interpolation when interpolating coefficients h(n) are used.

Fig. 4.6 The transposed direct-form FIR filter structure.


The standard direct-form FIR filter is shown in Fig. 4.5 and its transpose in Fig. 4.6.
Note that the transposed structure in Fig. 4.6 has been vertically and horizontally mirrored to have the signal propagate from left to right or downwards.
The direct-form FIR filter computes a discrete convolution between its input
sequence x(n) and the coefficients h(n) whereas the transposed FIR structure copies
each input sample, multiplies each of them by one coefficient h(n) and adds the products to the delay line, which is represented by the state variables vk (n) for k = 1, 2, 3,
..., N. The output of the transpose structure is computed as the sum of the input sample
x(n) multiplied by one of the weighting coefficients h(n) and the value of the state variable v1 (n). This can be expressed using the state-variable description which is a standard tool in discrete-time signal processing. The next-state equation is given by
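$$
\mathbf{v}(n+1) = \mathbf{F}^T \mathbf{v}(n) + \mathbf{g}\,x(n)
\tag{4.6a}
$$

and the output equation by

$$
y(n) = v_1(n) + h(0)\,x(n)
\tag{4.6b}
$$

where we have defined the state vector (i.e., the delay line) as

$$
\mathbf{v}(n) = [\,v_N(n) \;\; v_{N-1}(n) \;\; \cdots \;\; v_1(n)\,]^T
\tag{4.6c}
$$

the N × N matrix F as

$$
\mathbf{F} =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\tag{4.6d}
$$

and the coefficient vector g as

$$
\mathbf{g} = [\,h(N) \;\; h(N-1) \;\; \cdots \;\; h(1)\,]^T
\tag{4.6e}
$$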

Note that vector g excludes the first filter coefficient h(0).


When the coefficients h(n) of the standard direct-form and the transpose structure are
the same, the two filters have identical impulse responses and transfer functions. The
proof of this is quite straightforward (see, e.g., Jackson, 1989, pp. 74-76, or Välimäki,
1994b, Appendix B). For another proof based on Tellegen's theorem, see Fettweis
(1971) or Oppenheim and Schafer (1975, pp. 173-178).
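
The equivalence of the two structures can also be checked numerically. The Python sketch
below is an illustration with hypothetical function names; it assumes a filter order
N >= 1. `fir_transposed` follows Eqs. (4.6a) and (4.6b): each input sample is multiplied
by every coefficient and the products are added into the delay line.

```python
def fir_direct(h, x):
    """Direct form: y(n) = sum of h(k) x(n-k), using a tapped delay line of past inputs."""
    N = len(h) - 1
    taps = [0.0] * N                       # x(n-1), ..., x(n-N)
    y = []
    for xn in x:
        y.append(h[0] * xn + sum(hk * t for hk, t in zip(h[1:], taps)))
        taps = [xn] + taps[:-1]
    return y


def fir_transposed(h, x):
    """Transposed form: products h(k) x(n) are accumulated into the state variables."""
    N = len(h) - 1
    v = [0.0] * N                          # v[0] = v_1(n), ..., v[N-1] = v_N(n)
    y = []
    for xn in x:
        y.append(v[0] + h[0] * xn)         # output equation (4.6b)
        shifted = v[1:] + [0.0]            # v_k(n+1) starts from v_{k+1}(n)
        v = [s + hk * xn for s, hk in zip(shifted, h[1:])]   # next-state equation (4.6a)
    return y
```

Feeding an impulse to either function returns the coefficients h(0), ..., h(N) in order,
so a transposed structure with unchanged coefficients still realizes the interpolator.
Passing the coefficients in reversed order, `h[::-1]`, gives the time-reversed impulse
response.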
Thus the transpose of an FIR interpolator does not implement deinterpolation; its
impulse response has to be reversed in time. Now we can write the state-variable
description for the nonrecursive deinterpolator as
$$
\mathbf{v}(n+1) = \mathbf{F}^T \mathbf{v}(n) + \mathbf{g}_{rev}\,x(n)
\tag{4.7a}
$$

$$
y(n) = v_1(n) + h(N)\,x(n)
\tag{4.7b}
$$

where we have defined the coefficient vector $\mathbf{g}_{rev}$ as

$$
\mathbf{g}_{rev} = [\,h(0) \;\; h(1) \;\; \cdots \;\; h(N-1)\,]^T
\tag{4.7c}
$$

The vector v(n) and matrix F are defined as in Eqs. (4.6c) and (4.6d) above.
Equivalently, one may imagine that both the delay line of the transpose structure as
well as the order of state variables vk(n) have to be reversed to obtain the FIR
realization of deinterpolation. Figure 4.7 illustrates the implementation of nonrecursive
deinterpolation by the transpose nonrecursive structure with a reversed impulse response.
Apart from the ordering of coefficients this filter is equivalent to the transpose structure
of Fig. 4.6.

Fig. 4.7 Nonrecursive deinterpolation is implemented by the transpose FIR filter
structure with a time-reversed impulse response.

Based on the above discussion it is understood that the transfer function of the FIR
deinterpolator can be written as
$$
H_d(z) = \sum_{n=0}^{N} h_d(n)\,z^{-n} = \sum_{n=0}^{N} h(N-n)\,z^{-n} = H_{rev}(z)
\tag{4.8}
$$

This implies that if the transfer function H(z) of the FIR interpolator approximates the
transfer function $z^{-D}$, then Hd(z) approximates the z-transform $z^{-\bar{D}}$ as
discussed earlier.
4.1.4 Adding a Signal to a Fractional Point of a Delay Line Using an FIR Deinterpolator
Consider addition of a signal to an arbitrary fractional point on a delay line as illustrated
in Fig. 4.1b. The implementation of this system by means of nonrecursive deinterpolation is presented in Fig. 4.8. Note that now the deinterpolator does not have independent
state variables as was assumed in the preceding section, but employs the unit delays of
the delay line.
When a signal w(n) is deinterpolated onto the delay line, the resulting signal may be
expressed as

$$
\hat{x}(n - M - k) = x(n - M - k) + h_{rev}(k)\,w(n) \qquad \text{for } k = 0, 1, 2, \ldots, N
\tag{4.9}
$$

where hrev(n) are the coefficients of the deinterpolation filter, and x(k) and
$\hat{x}(k)$ are the signals in the delay line before and after deinterpolation,
respectively. For the next sampling cycle, the updated signal values are shifted so that
we can write

$$
x(n - M - k - 1) = \hat{x}(n - M - k) \qquad \text{for } k = 0, 1, 2, \ldots, N - 1
\tag{4.10}
$$

Fig. 4.8 Adding a signal to a fractional point of a delay line using a deinterpolator.


Equation (4.9) can be expressed in vector form as


$$
\hat{\mathbf{x}}_n = \mathbf{x}_n + \mathbf{h}_{rev}\,w(n)
\tag{4.11a}
$$

where $\hat{\mathbf{x}}_n$ is the modified version of vector $\mathbf{x}_n$ defined by

$$
\mathbf{x}_n = [\,x(n-M) \;\; x(n-M-1) \;\; \cdots \;\; x(n-M-N)\,]^T
\tag{4.11b}
$$

and $\mathbf{h}_{rev}$ is the coefficient vector of the FIR deinterpolator containing the
coefficients hrev(n) that are the same as those used with interpolation but in reversed
order, i.e., hrev(n) = h(N - n) for n = 0, 1, 2, ..., N:

$$
\mathbf{h}_{rev} = [\,h_{rev}(0) \;\; h_{rev}(1) \;\; \cdots \;\; h_{rev}(N)\,]^T
= [\,h(N) \;\; h(N-1) \;\; \cdots \;\; h(0)\,]^T
\tag{4.11c}
$$

The vector notation shows that the result of the deinterpolation appears in the delay line
itself.
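
A minimal Python sketch of Eqs. (4.9) and (4.10) is given below. It assumes that the delay
line is stored as a list in which `line[i]` holds x(n - i); the function names are
hypothetical, and h holds the interpolator coefficients h(0), ..., h(N).

```python
def deinterpolate_into(line, M, h, w):
    """Eq. (4.9): add w, weighted by the reversed coefficients h(N-k), to line[M..M+N]."""
    N = len(h) - 1
    for k in range(N + 1):
        line[M + k] += h[N - k] * w


def shift(line, x_in):
    """Eq. (4.10): advance the delay line by one sample; returns the sample leaving it."""
    out = line[-1]
    line[1:] = line[:-1]
    line[0] = x_in
    return out


line = [0.0] * 8
h = [0.75, 0.25]                 # first-order (linear) interpolator for d = 0.25
deinterpolate_into(line, 3, h, 2.0)
print(line[3], line[4])          # 0.5 1.5
```

The read-modify-write access in `deinterpolate_into` is the cumulative addition to the
delay line; no state variables other than the delay line itself are involved.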
4.1.5 Discussion
The same number of additions and multiplications is needed for the implementation of
interpolation and deinterpolation. However, in deinterpolation, cumulative additions to
a delay line are needed and in software implementation they usually require more than
one instruction, e.g., a read from memory, an addition, and a save to memory. Thus, in
practice deinterpolation may be computationally more expensive than interpolation.
It can be argued that deinterpolation could be implemented just as well by a direct-form
FIR interpolator that approximates the delay of $\bar{D}$ samples instead of D. This is
true and seems to make the definition of a new technique such as deinterpolation quite
unnecessary. However, the main advantage of deinterpolation can be clearly observed
by examining addition to the delay line as illustrated in Fig. 4.8. If this system were
realized by a direct-form FIR interpolator, N extra unit delays would be required. Also,
the implementation would not be as intuitive, since the fractional delay would have to
be processed separately and the result be added to the delay line signal at a point located
about N/2 samples before the fractional point D. In the following, other benefits of
deinterpolation will become clear. Especially the implementation of fractional delay
junctions of waveguides is straightforward using nonrecursive deinterpolation.


Fig. 4.9 Block diagram of the waveguide model for an ideal string.

Fig. 4.10 FD waveguide filter for modeling a vibrating string of arbitrary length.


4.2 Fractional Delay Waveguide Systems


Our main interest in this work is not to operate with single delay lines but with digital
waveguides that are bidirectional delay lines. With the operation of deinterpolation it is
possible to implement a variable-length digital waveguide or connect waveguides at
arbitrary points.
4.2.1 Variable-Length Digital Waveguide
Karjalainen and Laine (1991) used nonrecursive interpolation and deinterpolation for
implementing a waveguide model for a variable-length guitar string. Figure 4.9 illustrates the system. The two delay lines represent waves that travel in opposite directions
along the string. The ends of the vibrating string are fixed and the waves reflect with
inverted phase. In this simple model the end reflection is assumed to be purely resistive,
i.e., independent of the frequency.
The length of the string has to be continuously variable to produce tones on some
musical scale. The actual implementation of this model by means of a fractional delay
waveguide filter is presented in Fig. 4.10. Fractional delay is only needed at one end of
the delay lines. The length of the upper delay line is controlled by an FIR interpolator.
The interpolated result is multiplied by -r and the resulting reflected signal is
superimposed onto the lower line using an FIR deinterpolator. The filter coefficients h(n)
of the deinterpolator are the same as those of the interpolator. Note, however, that the
coefficients of the deinterpolator are in the reverse order with respect to those of the
interpolator.

Fig. 4.11 Two alternative configurations for the implementation of a fractional delay
junction of two digital waveguides.

In certain cases the two delay lines that form the waveguide may be combined and a
more efficient realization results. For example, the system of Fig. 4.10 could be
implemented by a single delay line with a reflection coefficient r^2 at one end. A single
fractional delay filter could then be used to control its length. Only the delay between the
input and output points of the one-delay-line version is different from that of the
implementation of Fig. 4.10. This is usually acceptable. Other properties of the modified waveguide system remain the same.
4.2.2 Connecting Waveguides at Arbitrary Points
Deinterpolation proves to be an extremely useful operation in the context of fractional
delay junctions of digital waveguides. An example of the simplest FD junction is the
fractional delay endpoint shown in Fig. 4.10 above. Other elementary cases are
• a junction of two waveguides,
• a junction of a side branch connected to a waveguide, and
• a junction of three waveguides.
A junction indicates a change of impedance. A junction of two waveguides disappears or reduces to pure transmission if the impedances of the connected waveguides
are equal. When more waveguides are coupled, the signal inside each one of them sees
that the input impedance of the junction is equal to the parallel connection of the
impedances of the other waveguides. Thus scattering occurs. Part of the wave is transmitted through the junction to the other waveguides while the rest is reflected back.
Figure 4.11 depicts two ways of connecting two digital waveguides at fractional
points. In Fig. 4.11a the two waveguides are represented by four distinct delay lines:
delay lines 1 and 2 form one digital waveguide and lines 3 and 4 the other. In contrast,
in Fig. 4.11b the left-going and right-going delay lines of the two waveguides have been
combined. In both cases, the signals entering the two-port scattering junction are interpolated out from the delay lines and the outgoing (transmitted) signals are deinterpolated onto the delay lines.
In the case of Fig. 4.11a some extra unit delays have to be reserved for each delay
line when FIR FD filters are used for realizing interpolation and deinterpolation. This is
because FIR interpolators need to make use of signal samples on both sides of the interpolation point. The system of Fig. 4.11b does not usually require these extra data, since there are unit delays on both sides of the junction anyway. It is, nevertheless, important to make sure that the junction is not located closer than about N/2 unit delays to either end of the delay lines. Otherwise the accuracy of the interpolators has to be reduced by lowering the order of the FD filter.

Fig. 4.12 An example of a fractional delay junction of three digital waveguides: a side branch in a waveguide.
The above remarks are valid for FD junctions of three or more waveguides as well. A
special case of a junction of three waveguides is illustrated in Fig. 4.12. In this example
a side branch represented by delay lines 3 and 4 is connected to a waveguide. The three-port junction determines how much of each input signal is reflected back and how much
is transmitted to the other waveguides. The input signals of the junction are obtained by
interpolation and the output signals are superimposed onto the delay lines by deinterpolation. Later in this chapter this structure is applied to modeling of a finger hole of a
woodwind instrument. It is also useful for implementing the junction of the vocal and
nasal tract in speech production models.

4.3 Interpolated Waveguide Tube Model


An inherent limitation of the traditional Kelly-Lochbaum model is that the scattering
junctions can only be located at integral points of the waveguide. Consequently, the
total delay length of the model is quantized to a multiple of the unit delay. These problems, and earlier attempts to overcome them, were discussed in Section 2.2.8. In this
section we demonstrate how it is possible to eliminate these limitations using fractional
delay waveguide modeling techniques that were introduced above.
Figure 4.13 illustrates the basic idea of the enhanced tube model that is developed in
this work. An acoustic tube has been split into a number of parts, each of which is modeled by a cylindrical tube section, as in the traditional approach, or by a conical tube
section, as described in Section 2.4. The length of each tube section can be a real
number independent of the lengths of other sections.

Fig. 4.13 The profile of the vocal tract (from the vocal folds to the lips) and its approximation by a) cylindrical and b) conical tube sections of different lengths L1 to L5.
4.3.1 Interpolated Scattering Junction
As discussed earlier in this chapter, the length of a waveguide may be adjusted by
applying interpolation and deinterpolation techniques. With the help of these operations
it is moreover possible to locate a scattering junction at a fractional point of a waveguide. The block diagram of a single fractional delay, or interpolated, two-port junction
is presented in Fig. 4.14. The input signals of the scattering junction (denoted by S)
are now obtained using interpolation and the output signals of the junction are fed back
to the delay lines using deinterpolation. The wave scattering component is computed by multiplying the difference of the interpolated signals by the reflection coefficient r.

Fig. 4.14 Block diagram of the interpolated two-port scattering junction. The block denoted by S is a two-port scattering junction.
A more detailed block diagram of a single FD two-port junction is shown in Fig.
4.15. This example is valid for pressure signals. In this figure it is seen that only the
reflected and transmitted signal components have to be explicitly computed; the direct
signals propagating in the two delay lines need no attention since the effect of the junction is superimposed onto the delay lines. Note that in the lower delay line, the interpolator approximates the complementary delay $\bar{d}$ and the deinterpolator the delay $d$. In
other words, the roles of these two operations are interchanged.
Based on the observation that only the reflected and transmitted signals have to be
explicitly interpolated and deinterpolated we may simplify the block diagrams for the
FD junctions. Figure 4.16 illustrates the junction computation for the volume velocity
and the acoustic pressure. The signal paths starting from the delay line indicate that the
signal is obtained via interpolation. Consequently, the paths leading to the delay line
denote that the signal has to be deinterpolated onto the delay line. It is seen that two interpolations and two deinterpolations are required for the realization of an FD two-port junction, i.e., four filters are needed to turn a basic KL junction into a fractional delay one.

Fig. 4.15 The FD two-port scattering junction for the pressure signal.

Fig. 4.16 Diagrams of the FD two-port junctions for a) volume velocity and b) pressure signals showing that only the reflected and transmitted signal components for both directions are actually computed.

Fig. 4.17 Fractional delay M-tube model for sound pressure.
A complete fractional delay M-tube model is shown in Fig. 4.17. In principle an arbitrary number of interpolated two-port junctions may be used between the endpoints of
the waveguide. One of the endpoints of the model may be fixed (the left end in Fig.
4.17) while the other has to be interpolated to enable continuous change of the total
length. In principle, this model does not suffer from the restrictions of the traditional
KL model that were discussed in Section 2.2.8 (see also Välimäki et al., 1994b). A
vocal tract model based on the FD waveguide approach can be controlled in a more
straightforward way, since the scattering junctions are movable (Välimäki et al., 1994a;
Kuisma, 1994).
4.3.2 Implementing the FD Scattering Junction Using FIR Filters
Interpolation and deinterpolation can be implemented by means of FIR filters as
described in Section 4.1. The computation of the interpolated KL junction using Nth-order FIR filters is depicted in detail in Fig. 4.18. It is assumed that the filter coefficients h(n) of all the interpolating and deinterpolating FIR filters in this configuration
are the same. The FIR filter coefficients h(n) can be designed using one of the methods
discussed in Chapter 3.
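The signal value is first interpolated out from both the upper and the lower delay line, i.e.,
$$p^+(n, m + D) = \mathbf{h}^T \mathbf{p}^+(n, m) \tag{4.12a}$$
$$p^-(n, m + D) = \mathbf{h}^T \mathbf{p}^-(n, m) \tag{4.12b}$$
where
$$\mathbf{h} = [\, h(0) \;\; h(1) \;\; h(2) \;\; \cdots \;\; h(N) \,]^T \tag{4.12c}$$
$$\mathbf{p}^+(n, m) = [\, p^+(n, m) \;\; p^+(n, m + 1) \;\; p^+(n, m + 2) \;\; \cdots \;\; p^+(n, m + N) \,]^T \tag{4.12d}$$
$$\mathbf{p}^-(n, m) = [\, p^-(n, m) \;\; p^-(n, m + 1) \;\; p^-(n, m + 2) \;\; \cdots \;\; p^-(n, m + N) \,]^T \tag{4.12e}$$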

Here n is the time index and m the spatial index as was discussed in the beginning of Chapter 2. Now it is again helpful to use the spatial variable although it is in principle redundant. Note that the interpolation is essentially a spatial operation in this context, and it is appropriate to call D the fractional spatial index.

Fig. 4.18 Implementation of an FD two-port junction for pressure signals using Nth-order FIR interpolators and deinterpolators. The interpolating coefficients h(n) are assumed to be the same for all the filters.
The difference of the interpolated signals $p^+(n, m + D)$ and $p^-(n, m + D)$ is computed and multiplied by the reflection coefficient r, that is,
$$w(n) = r\,[\,p^+(n, m + D) - p^-(n, m + D)\,] = r\,\mathbf{h}^T [\,\mathbf{p}^+(n, m) - \mathbf{p}^-(n, m)\,] \tag{4.13}$$

The resulting signal w(n) is then deinterpolated onto both delay lines. We can write the updated delay line vectors as
$$\mathbf{p}^+(n, m) = \mathbf{p}^+(n, m) + \mathbf{h}^T w(n) \tag{4.14a}$$
$$\mathbf{p}^-(n, m) = \mathbf{p}^-(n, m) + \mathbf{h}^T w(n) \tag{4.14b}$$

Since the last term hT w(n) of the above equations is the same, it is economical to
compute it separately and then use it in both equations. Also, as shown by Eq. (4.13) it
is practical to first compute the difference of signal vectors p + (n, m) and p- (n, m) and
only then evaluate the inner product with hT . Thus it is seen that by reorganizing the
computation it is possible to achieve reduction of complexity. In Fig. 4.19, the two
interpolators as well as the two deinterpolators have been combined. This does not
reduce the number of additions (or subtractions) which is still 4N + 3. However, in the
straightforward configuration (Fig. 4.18) the number of multiplications is 4N + 5, but in
the optimized one (Fig. 4.19) only 2N + 3. The reduction in the number of multiplications is about 50%.

Fig. 4.19 Efficient implementation of an FD two-port junction for pressure signals.
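The reorganization described above can be illustrated with a short sketch. It is not the implementation used in this work: the function names, the list-based signal vectors, and the example values are assumptions made for illustration. The vectors hold the N + 1 delay-line samples covered by the filter taps, and the update of Eq. (4.14) is read as adding h(k) w(n) to the kth sample.

```python
def junction_direct(p_plus, p_minus, h, r):
    """Straightforward form: two interpolators and two deinterpolators."""
    ip = sum(hk * x for hk, x in zip(h, p_plus))     # Eq. (4.12a)
    im = sum(hk * x for hk, x in zip(h, p_minus))    # Eq. (4.12b)
    w = r * (ip - im)                                # Eq. (4.13)
    new_plus = [x + hk * w for hk, x in zip(h, p_plus)]     # Eq. (4.14a)
    new_minus = [x + hk * w for hk, x in zip(h, p_minus)]   # Eq. (4.14b)
    return new_plus, new_minus


def junction_combined(p_plus, p_minus, h, r):
    """Reorganized form: one shared interpolator, one shared deinterpolator."""
    diff = [a - b for a, b in zip(p_plus, p_minus)]  # subtract the vectors first
    w = r * sum(hk * d for hk, d in zip(h, diff))    # single inner product
    update = [hk * w for hk in h]                    # computed once, used twice
    new_plus = [x + u for x, u in zip(p_plus, update)]
    new_minus = [x + u for x, u in zip(p_minus, update)]
    return new_plus, new_minus


# Example values (assumed): third-order Lagrange coefficients for D = 1.5.
h = [-0.0625, 0.5625, 0.5625, -0.0625]
p_plus = [1.0, 0.5, -0.25, 0.75]
p_minus = [0.25, -0.5, 0.5, 1.0]

direct = junction_direct(p_plus, p_minus, h, 0.5)
combined = junction_combined(p_plus, p_minus, h, 0.5)
```

The direct form uses 4(N + 1) + 1 = 4N + 5 multiplications and the combined form 2(N + 1) + 1 = 2N + 3, in agreement with the counts given above.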
There are only minor differences between the FD junctions for the pressure and volume velocity signals. The junction for the volume velocity signal is obtained by replacing one of the subtractions with an addition and by inverting the sign of the signal that
is to be deinterpolated onto the upper delay line. These differences are easily noticed in Fig.
4.16.
The FD KL model works correctly even if the FIR filters associated with contiguous
FD junctions overlap, i.e., two junctions can be located so close to each other that they
use the same unit delays of the waveguide for interpolation and deinterpolation. It
should be remembered, however, that then it is of paramount importance to execute the
operations in the following order (Välimäki et al., 1993a):
1) read the signal values of the delay lines and compute the outputs of all the interpolating filters in the system,
2) compute the scattered signals w(n) of all M junctions, and
3) superimpose all the results onto the delay lines using deinterpolators.
If this rule is violated, e.g., so that all the computations associated with the junctions
are evaluated one after the other, the system becomes noncausal since part of the signal
energy will flow through the waveguide within one cycle.
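The three-step order above may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `update_junctions`, the list-based delay lines, and the `(m, h, r)` tuple layout are hypothetical, and the propagation (shifting) of the delay lines between time steps is omitted.

```python
def update_junctions(p_plus, p_minus, junctions):
    """Apply all FD junctions for one time step in three separate phases.

    p_plus, p_minus : lists holding the upper and lower delay-line samples
    junctions       : list of (m, h, r) tuples, where m is the index of the
                      first tap, h holds h(0)..h(N), and r is the reflection
                      coefficient (hypothetical layout)
    """
    # Phase 1: read the delay lines and evaluate every interpolator
    # before anything is written back.
    interpolated = []
    for m, h, r in junctions:
        ip = sum(h[k] * p_plus[m + k] for k in range(len(h)))
        im = sum(h[k] * p_minus[m + k] for k in range(len(h)))
        interpolated.append((ip, im))

    # Phase 2: compute the scattered signal w(n) of every junction.
    scattered = [r * (ip - im)
                 for (m, h, r), (ip, im) in zip(junctions, interpolated)]

    # Phase 3: superimpose all results onto the delay lines.
    for (m, h, r), w in zip(junctions, scattered):
        for k in range(len(h)):
            p_plus[m + k] += h[k] * w
            p_minus[m + k] += h[k] * w
    return scattered


# Two first-order junctions whose taps overlap at index 1 (assumed values).
upper = [1.0, 2.0, 0.5, -1.0]
lower = [0.0, 1.0, 0.25, 0.5]
w = update_junctions(upper, lower,
                     [(0, [0.5, 0.5], 0.5), (1, [0.5, 0.5], -0.5)])
```

Because every interpolator reads the delay lines before any deinterpolator writes to them, the second junction does not see the contribution of the first one within the same cycle, even though the two share a unit delay.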
4.3.3 Transfer Functions of FIR Interpolators and Deinterpolators
To understand the functioning of the FD two-port junction discussed above, we consider the transmission and reflection functions of the junction from both sides. We
assume that the coefficients of both the interpolators and the deinterpolators are the
same but they may be in the same or in the reverse order. The prototype FIR transfer
function that is used to assist in the comparison of the transfer functions in the following is defined as
$$H(z) = \sum_{n=0}^{N} h(n) z^{-n} \tag{4.15}$$

It is assumed that the filter coefficients h(n) have been designed so that the filter
approximates ideal bandlimited interpolation in some sense. The nominal value of
the total delay of this filter is $D = \lfloor D \rfloor + d$.
The following notation is used: the superscripts "+" and "−" denote that
the filter operates on the signal that propagates in the positive and the negative direction, respectively; the subscripts "i" and "d" indicate that the filter is used for interpolation or deinterpolation, respectively.
Let us first write the transfer function for each of the four FIR filters involved in the
FD junction. The signal value is interpolated from the upper delay line by the FIR interpolator with the transfer function $H_i^+(z)$, that is,
$$H_i^+(z) = \sum_{n=0}^{N} h_i^+(n) z^{-n} = \sum_{n=0}^{N} h(n) z^{-n} = H(z) \tag{4.16}$$

The deinterpolator that superimposes the scattered signal onto the upper delay line
has the transfer function $H_d^+(z)$. This filter approximates the complementary delay $\bar{D} = \lfloor \bar{D} \rfloor + \bar{d}$, where $\bar{d}$ is the complementary fractional delay. Thus the impulse response of
the filter is the mirror image of the prototype filter H(z), that is, $h_d^+(n) = h(N - n)$ for n
= 0, 1, ..., N, and the transfer function may be expressed as
$$H_d^+(z) = \sum_{n=0}^{N} h_d^+(n) z^{-n} = \sum_{n=0}^{N} h(N - n) z^{-n} = z^{-N} \sum_{n=0}^{N} h(n) z^{n} = z^{-N} H(z^{-1}) \tag{4.17}$$

In the lower delay line the fractional delays $d$ and $\bar{d}$ have changed their roles as is
seen, e.g., in Fig. 4.15. Thus, the interpolator $H_i^-(z)$ approximates the complementary
fractional delay $\bar{d}$ and the deinterpolator $H_d^-(z)$ the fractional delay $d$. This implies that
now the impulse response of the interpolator must be reversed and the transfer function
$H_i^-(z)$ is written as
$$H_i^-(z) = \sum_{n=0}^{N} h_i^-(n) z^{-n} = \sum_{n=0}^{N} h(N - n) z^{-n} = z^{-N} H(z^{-1}) \tag{4.18}$$

The transfer function $H_d^-(z)$ of the filter that deinterpolates the scattered signal onto
the lower delay line is thus
$$H_d^-(z) = \sum_{n=0}^{N} h_d^-(n) z^{-n} = \sum_{n=0}^{N} h(n) z^{-n} = H(z) \tag{4.19}$$

It is seen that the interpolator of the upper line and the deinterpolator of the lower
line have the same transfer function, that is,
$$H_i^+(z) = H_d^-(z) = H(z) \tag{4.20}$$
The equivalence of these transfer functions is a consequence of the fact that they both approximate the delay $D$. On the other hand,
$$H_d^+(z) = H_i^-(z) = z^{-N} H(z^{-1}) \tag{4.21}$$
since both the filters $H_d^+(z)$ and $H_i^-(z)$ approximate the complementary delay $\bar{D}$.
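The mirror-image relation can be verified numerically on the unit circle. The sketch below is only an illustration; the helper name `freq_response` and the example coefficients (a linear interpolator with D = 0.4) are assumptions and not taken from this work.

```python
import cmath


def freq_response(h, omega):
    """Evaluate H(e^{j omega}) = sum over n of h(n) e^{-j omega n}."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


h = [0.6, 0.4]       # example prototype: linear interpolator, D = 0.4, N = 1
h_rev = h[::-1]      # mirror image h(N - n); a linear interpolator for 0.6
N = len(h) - 1

omega = 0.3 * cmath.pi
lhs = freq_response(h_rev, omega)
# z^{-N} H(z^{-1}) evaluated at z = e^{j omega}
rhs = cmath.exp(-1j * omega * N) * freq_response(h, -omega)
```

For real coefficients the two filters have the same magnitude response, so reversing the coefficient order changes only the phase: the reversed filter approximates the complementary delay N − D instead of D.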
4.3.4 Choice of FIR FD Filter Design
In the above discussion on interpolated waveguide models, the properties of the FIR
fractional delay filters used for interpolation and deinterpolation have not been taken
into account. The theory of the FDWFs is independent of the choice of filter design
method. Let us now consider how to choose the design technique.
The FIR FD filter should yield a good approximation of bandlimited interpolation at
least at low frequencies (from about 50 Hz to a few kHz). This is because the human
auditory system is more sensitive at these frequencies than at higher frequencies.
Strictly speaking the filter design technique should take into account the properties of
human hearing. One possibility is to specify an auditory-based weighting function in the
frequency domain and use that as an error weighting function in a filter design algorithm. For example, the complex LS design and minimax (Chebyshev) design algorithms allow the use of an arbitrary non-negative weighting function (Laakso et al.,
1994).
However, the author feels that it is unnecessarily involved to try to design FIR filters
that are optimal in the auditory sense. Bandlimited approximation of ideal interpolation
can be considered a more practical choice. It is apparent that oversampling by a small
factor is necessary in order to obtain a satisfactory approximation with a low-order FD
filter. The upper part of the frequency band is left as a guard band.
FIR FD filter design methods that give good approximation on a limited low-frequency band include Lagrange interpolation, general complex least-squares design, and
equiripple design (Laakso et al., 1994). Lagrange interpolation has several good properties (see Section 3.3): it naturally gives a low-frequency approximation, its coefficients
can be computed using a closed-form formula, and its magnitude response does not
exceed unity.
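As an illustration of the closed-form property, the well-known Lagrange interpolation formula h(n) = ∏_{k≠n} (D − k)/(n − k), with k = 0, 1, ..., N, can be evaluated directly. The function name `lagrange_fd` is chosen here for illustration only.

```python
def lagrange_fd(N, D):
    """Coefficients h(0)..h(N) of an Nth-order Lagrange FD filter for delay D."""
    h = []
    for n in range(N + 1):
        c = 1.0
        for k in range(N + 1):
            if k != n:
                c *= (D - k) / (n - k)   # one factor of the product formula
        h.append(c)
    return h


h1 = lagrange_fd(1, 0.4)   # linear interpolation, close to [0.6, 0.4]
h3 = lagrange_fd(3, 1.4)   # third order, delay between the two middle taps
```

The coefficients always sum to one, so the filter is exact at zero frequency, and an integer value of D reduces the filter to a pure delay.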
In the general complex LS design (see Section 3.2.4), the approximation band can be
specified (leaving the rest of the frequency band as a "don't care" band). The filter coefficients are designed by solving a matrix equation. In principle, the complexity of the
design should not influence the choice of the filter since it is possible to design the coefficients beforehand and store them in a lookup table. In most practical cases, this is
also the most efficient implementation technique for coefficient update.
From the viewpoint of waveguide modeling, a drawback of the general complex LS
design is that its magnitude response oscillates around unity, potentially exceeding it at
several frequencies. When the FD filter is used in a feedback loop, the magnitude
response of the filter must not be greater than one at any frequency. Otherwise the overall system may become unstable. Hence, it is necessary to determine the maximum of
the magnitude response of the filter for each delay value D, and then scale the output of
the filter (or, equivalently, the filter coefficients) by the inverse of the maximum value.
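The scaling step can be sketched as follows. The sketch estimates the maximum on a uniform frequency grid, so the result is only as accurate as the grid is dense; the function names and the example coefficients are made up for illustration and do not correspond to any filter designed in this work.

```python
import cmath
import math


def max_magnitude(h, num_points=1024):
    """Largest |H(e^{j omega})| on a uniform grid over 0..pi (grid estimate)."""
    peak = 0.0
    for i in range(num_points + 1):
        omega = math.pi * i / num_points
        H = sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))
        peak = max(peak, abs(H))
    return peak


def scale_to_unity(h, num_points=1024):
    """Divide the coefficients by the estimated maximum of the magnitude."""
    peak = max_magnitude(h, num_points)
    return [hn / peak for hn in h], peak


h = [-0.08, 0.62, 0.55, -0.07]   # made-up coefficients with |H| > 1 at dc
h_scaled, peak = scale_to_unity(h)
```

In a time-varying system the scaling factor would have to be determined separately for each delay value D, for example when the coefficient lookup table is generated.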
Also the equiripple (minimax) design results in a filter which has an oscillating
magnitude response. Thus scaling of the filter is necessary in order to restore the stability of the interpolated waveguide filter model. Oetken (1979) has proposed a simple design technique that yields an approximately equiripple solution (see Section 3.2.5). This technique involves design of a linear-phase prototype filter, solving for the frequencies where the approximation error (deviation from the ideal interpolator) is zero, and a matrix multiplication.

Fig. 4.20 Definitions of the transfer functions.
In order to have an FDWF system that can be implemented efficiently, it would be
advantageous to use low-order FIR FD filters. Even-length (i.e., odd-order) FIR FD filters are easier to use especially in time-varying systems, since the filter taps need to be
moved only when the integer part of the total delay D is changed. Based on experiments, it has been found that a third-order interpolating filter is a suitable compromise
in terms of implementation costs and quality.
An easy first choice for an FIR FD filter design technique is Lagrange interpolation.
If it is good enough, it is not necessary to consider the other methods. The general
complex LS design and equiripple design yield smaller approximation error at middle
frequencies and give a wider usable frequency band. Later in Section 4.3.7 these design
methods are applied to the simulation of an acoustic tube.
4.3.5 Reflection and Transmission Functions
We proceed by studying the reflection and transmission transfer functions of the FD
junction. These results were originally published in Välimäki et al. (1994b) but the
notation differs slightly from that used here. The definitions of the transfer functions are
illustrated in Fig. 4.20.
The transfer function through the FD junction in the positive direction, i.e., $T^+(z)$ in
Fig. 4.20, may then be expressed as
$$T^+(z) = z^{-N} + r H_i^+(z) H_d^+(z) = z^{-N} \left[ 1 + r H(z) H(z^{-1}) \right] = z^{-N} \left[ 1 + r\,|H(z)|^2 \right] \tag{4.22}$$

It can be seen from this z-transform that the impulse response (IR) through the port consists of a delayed unit impulse plus the convolution of h(n) and the time-reversed (and
delayed) IR hrev (n) scaled by r. The length of the IR is thus 2N + 1. Note that when the
transfer functions Hi+ (z) and Hd+ (z) are cascaded, their phase functions cancel each
other out and a linear-phase IR results.
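The symmetry of this impulse response is easy to check numerically. The following sketch uses an assumed example (a linear interpolator with d = 0.4 and r = 0.5); the helper `convolve` and the variable names are illustrative only.

```python
def convolve(a, b):
    """Direct-form linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


r = 0.5
h = [0.6, 0.4]                  # example: linear interpolator, d = 0.4
N = len(h) - 1

c = convolve(h, h[::-1])        # IR of the interpolator-deinterpolator cascade
t_plus = [r * ck for ck in c]   # scale by the reflection coefficient
t_plus[N] += 1.0                # add the delayed unit impulse z^{-N}
```

The resulting sequence has 2N + 1 samples and is symmetric about its middle sample, which is the time-domain form of the linear-phase property.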
Similarly, a linear-phase transfer function is obtained for transmission in the negative
direction (see Fig. 4.20):
$$T^-(z) = z^{-N} - r H_i^-(z) H_d^-(z) = z^{-N} \left[ 1 - r\,|H(z)|^2 \right] \tag{4.23}$$
Footnote: Strictly speaking the last equality in (4.22) is true only when $z = e^{j\omega}$ but we use this notation for convenience.

Fig. 4.21 The magnitude and phase delay responses of the reflection function $R^+(e^{j\omega})$ of the FD two-port junction. In the upper part the magnitude of the reflection function using first-order (dashed line) and third-order (solid line) interpolators is illustrated. The corresponding phase delay curves are shown in the lower part. The dotted line corresponds to the desired curve. In this example r = 0.5 and d = 0.4. (Axes: linear magnitude and phase delay in samples versus normalized frequency.)

The reflection function $R^+(z)$ in Fig. 4.20 can be written as
$$R^+(z) = r H_i^+(z) H_d^-(z) = r\,[H(z)]^2 \tag{4.24}$$
Again we see that a finite IR of length 2N + 1 is obtained. It is the result of convolution of the IRs of the interpolation and deinterpolation filter (scaled by r), and does not have
linear phase unless both $H_i^+(z)$ and $H_d^-(z)$ are linear-phase transfer functions or they
are complementary.
The reflection function $R^-(z)$ (see Fig. 4.20) is given by
$$R^-(z) = -r H_i^-(z) H_d^+(z) = -r z^{-N} H(z^{-1})\, z^{-N} H(z^{-1}) = -r z^{-2N} [H(z^{-1})]^2 \tag{4.25}$$

Here the transfer functions of both the interpolation and deinterpolation filters appear in
the time-reversed and delayed form. Also this transfer function does not in general have
a linear phase.

Fig. 4.22 The magnitude (upper) and phase delay (lower) responses of the transmission function $T^+(e^{j\omega})$ of the FD KL junction. The dashed lines correspond to the linear interpolator and the solid lines to the third-order Lagrange interpolator. The dotted line corresponds to the desired ideal curve. In this example r = 0.5 and d = 0.4. (Axes: linear magnitude and phase delay in samples versus normalized frequency.)
By setting z = e jw in the above transfer functions it is possible to study the behavior
of a single FD two-port junction. In the following examples we use first and third-order
Lagrange interpolators and show the magnitude response and phase delay in two cases.
In these examples r = 0.5 corresponding to a configuration where two tubes with cross-sectional areas A1 and A2 = 3A1 are connected. The fractional delay is d = 0.4, which is nearly the
worst case.
In Fig. 4.21 the reflection function $R^+(z)$ is studied. In the upper part of Fig. 4.21
the magnitude response $|R^+(e^{j\omega})|$ obtained from Eq. (4.24) is shown for the cases
where the first-order (N = 1) and third-order (N = 3) Lagrange interpolators are used for the
implementation of the FD two-port junction. Both magnitude responses have lowpass
characteristics, which follows from the lowpass nature of the Lagrange interpolators.
Remember that the error due to FIR FD approximation is applied twice in the reflection
function as demonstrated by Eq. (4.24).
In the lower part of Fig. 4.21 the phase delay response is plotted for both cases. We
have subtracted a delay of 2 samples from the phase delay of the third-order interpolator to more clearly view the difference of the two responses. The ideal phase delay 2d is
represented by a dotted straight line. The phase delay of both responses is identical with
the ideal one at low frequencies, but at high frequencies the curves approach zero.
Footnote: In the worst case, i.e., when the fractional delay is 0.5, the odd-order Lagrange interpolators have linear phase, and the effect of phase error would then not be shown in the figures. For this reason we use a delay value that is nearly but not exactly 0.5.
The transmission function $T^+(z)$ of the two-port FD junction is examined in Fig.
4.22. The magnitude response $|T^+(e^{j\omega})|$ obtained from Eq. (4.22) is again shown for
both the linear and the third-order interpolator (see upper part of Fig. 4.22). At low frequencies both curves coincide with the ideal result (dotted straight line), but at high frequencies they approach unity because the magnitude responses of the Lagrange interpolators approach zero. The third-order interpolators yield a slightly better result. The
phase delay of both transmission functions is constant corresponding to a linear phase
function. This is due to the cancellation of the phase responses of the interpolators and
deinterpolators, as mentioned earlier.
From these examples we may derive some general properties of the FD KL junction
implemented using Lagrange interpolation.
• The reflected signal is a lowpass filtered version of the input signal. The phase of
the reflected signal can be severely distorted at high frequencies.
• Transmission through the junction attenuates high frequencies of a signal but
does not cause any phase distortion.
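The behavior summarized above can be reproduced by evaluating Eqs. (4.22) and (4.24) on the unit circle. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the coefficient lists are the first-order and third-order Lagrange coefficients for d = 0.4 (total delays 0.4 and 1.4) computed with the standard Lagrange formula.

```python
import cmath
import math


def fir_response(h, omega):
    """Evaluate H(e^{j omega}) for the FIR coefficients h(0)..h(N)."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


def junction_responses(h, r, omega):
    """Return (R+, T+) of Eqs. (4.24) and (4.22) at z = e^{j omega}."""
    N = len(h) - 1
    Hw = fir_response(h, omega)
    R_plus = r * Hw ** 2
    T_plus = cmath.exp(-1j * omega * N) * (1.0 + r * abs(Hw) ** 2)
    return R_plus, T_plus


def phase_delay(response, omega):
    """Phase delay in samples (valid while the phase has not wrapped)."""
    return -cmath.phase(response) / omega


r = 0.5
h1 = [0.6, 0.4]                         # Lagrange, N = 1, D = 0.4
h3 = [-0.064, 0.672, 0.448, -0.056]     # Lagrange, N = 3, D = 1.4

for name, h in (("N = 1", h1), ("N = 3", h3)):
    for omega in (0.01 * math.pi, 0.5 * math.pi, math.pi):
        R, T = junction_responses(h, r, omega)
        print(name, f"omega = {omega:.3f}",
              f"|R+| = {abs(R):.4f}", f"|T+| = {abs(T):.4f}")
```

At low frequencies the reflection magnitude is close to r and its phase delay close to 2D, whereas the transmission function has a constant phase delay of N samples at all frequencies.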
4.3.6 Analysis of Errors Due to FD Approximation
The approximation errors of the FD filters degrade the performance of the waveguide
model. The interpretation illustrated in Fig. 4.23 helps in the error analysis (Välimäki et
al., 1994b). The blocks $e^{-j\omega D}$ and $e^{-j\omega \bar{D}}$ denote the frequency responses of the ideal
fractional delay and ideal complementary fractional delay, respectively. Here we have
normalized the sampling rate to 1, i.e., the sample interval T is also equal to 1. The
complementary total delay $\bar{D}$ in Fig. 4.23 is defined by Eq. (4.3) as
$$\bar{D} = N - D \tag{4.26}$$

Fig. 4.23 Temporal flow diagram of the FD KL junction for pressure waves with ideal fractional delays of lengths $D$ and $\bar{D}$. The frequency response functions $G_i^+(e^{j\omega})$ and $G_i^-(e^{j\omega})$ include the interpolation error and $G_d^+(e^{j\omega})$ and $G_d^-(e^{j\omega})$ the deinterpolation error, and r is the reflection coefficient. Note that in this figure the direction of the lower delay line has been reversed to avoid crossing of signal paths.

Fig. 4.24 The magnitude responses of the FD two-tube model with linear interpolation (solid line) and ideal interpolation (dashed line). The dotted line shows the magnitude response of a uniform tube. (Axes: magnitude in dB versus normalized frequency.)

The frequency response functions $G(e^{j\omega})$ include the approximation error due to the
fractional delay filters. Were ideal interpolating filters used, the frequency responses
$G(e^{j\omega})$ would be equal to unity. These functions are defined as
$$G_i^+(e^{j\omega}) = e^{j\omega D} H_i^+(e^{j\omega}) = 1 - e^{j\omega D} E_i^+(e^{j\omega}) \tag{4.27a}$$
$$G_d^+(e^{j\omega}) = e^{j\omega \bar{D}} H_d^+(e^{j\omega}) = 1 - e^{j\omega \bar{D}} E_d^+(e^{j\omega}) \tag{4.27b}$$
$$G_i^-(e^{j\omega}) = e^{j\omega \bar{D}} H_i^-(e^{j\omega}) = 1 - e^{j\omega \bar{D}} E_i^-(e^{j\omega}) \tag{4.27c}$$
$$G_d^-(e^{j\omega}) = e^{j\omega D} H_d^-(e^{j\omega}) = 1 - e^{j\omega D} E_d^-(e^{j\omega}) \tag{4.27d}$$
where the error functions E represent the difference between the frequency response of
the ideal delay element and the approximation. They are defined by
$$E_i^+(e^{j\omega}) = e^{-j\omega D} - H_i^+(e^{j\omega}) \tag{4.28a}$$
$$E_d^+(e^{j\omega}) = e^{-j\omega \bar{D}} - H_d^+(e^{j\omega}) \tag{4.28b}$$
$$E_i^-(e^{j\omega}) = e^{-j\omega \bar{D}} - H_i^-(e^{j\omega}) \tag{4.28c}$$
$$E_d^-(e^{j\omega}) = e^{-j\omega D} - H_d^-(e^{j\omega}) \tag{4.28d}$$

Fig. 4.25 The magnitude responses of the FD two-tube model with third-order Lagrange interpolation (solid line) and ideal interpolation (dashed line). The dotted line presents the magnitude response of a uniform tube. (Axes: magnitude in dB versus normalized frequency.)

By using the above definitions it is possible to study the effects of the FD approximation errors due to the two-port junction. Setting all $E(e^{j\omega}) = 0$ or, equivalently, all
$G(e^{j\omega}) = 1$, results in the ideal frequency response.
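The relation between the functions G and E can be checked numerically for a single filter. In the sketch below the example filter (a linear interpolator approximating D = 0.4) and all names are assumptions made for illustration; the same check applies to the other three filters when the appropriate delay (D or its complement) is used.

```python
import cmath
import math


def fir_response(h, omega):
    """Evaluate H(e^{j omega}) for the FIR coefficients h(0)..h(N)."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


h = [0.6, 0.4]   # example: linear interpolator
D = 0.4          # delay approximated by h

results = []
for omega in (0.0, 0.1 * math.pi, 0.5 * math.pi):
    Hw = fir_response(h, omega)
    E = cmath.exp(-1j * omega * D) - Hw            # error function, Eq. (4.28a)
    G = cmath.exp(1j * omega * D) * Hw             # first form of Eq. (4.27a)
    G_alt = 1.0 - cmath.exp(1j * omega * D) * E    # second form of Eq. (4.27a)
    results.append((omega, E, G, G_alt))
```

At zero frequency the error vanishes and G equals one; the deviation of G from unity grows with frequency, which is the source of the high-frequency errors discussed in this section.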
4.3.7 Simulation of a Two-Tube System Using FIR FD Filters
As an example we shall investigate the frequency response of a two-tube model implemented with an interpolated waveguide model. The total length of the tube corresponds
to eight unit delays. The lengths of the two tube sections are 3.5 and 4.5 samples and
the scattering junction is thus located half way between unit delays. This can be considered the worst case for FIR FD filters since their approximation error is largest when d
= 0.5. The reflection coefficient at the junction of the two tubes is 0.5. One end of the
tube is assumed to be closed and the other one open, but in order to simplify the
example the terminations have been approximated with constant reflection coefficients
r1 = 0.9 and r2 = 0.9.
Figures 4.24 and 4.25 show the magnitude responses of this simulated system with
first-order and third-order Lagrange interpolation, respectively. The solid line in both figures is obtained by substituting the sampled frequency response of the interpolating filter
while the dashed line results from setting the error frequency response to zero as proposed above. This corresponds to the case of ideal interpolation, or equivalently, an
infinite sampling rate. The dotted line in Figs. 4.24 and 4.25 shows the magnitude
response of a uniform tube that is eight unit delays long. (The uniform tube is obtained
when the reflection coefficient is zero in the two-tube model.) The formants of the
uniform tube are equally spaced along the linear frequency axis.

Fig. 4.26 The magnitude responses of the FD two-tube model with general LS FIR design (N = 3, α = 0.5) (solid line) and ideal interpolation (dashed line). The dotted line presents the magnitude response of a uniform tube. (Axes: magnitude in dB versus normalized frequency.)
It is seen in both cases that at low frequencies the magnitude response of the FD
model coincides with the ideal one, whereas at high frequencies it joins the magnitude
response of the uniform tube. This is because the interpolators are lowpass filters that
have a transmission zero at the Nyquist frequency, and thus the reflection coefficient of
the FD junction effectively approaches zero (i.e., the uniform case) when the frequency
is increased. The third-order interpolator (Fig. 4.25) yields a much better result at
middle frequencies than the linear interpolator (Fig. 4.24).
For comparison, the two-tube system was simulated using general LS (GLS) and
equiripple FIR FD filters. These filter design methods were discussed in Sections 3.2.4
and 3.2.5, respectively. The magnitude responses of these two-tube simulations are presented in Figs. 4.26 and 4.27. In these figures, too, the solid line gives the result of the
FIR filter approximation, the dashed line the ideal result, and the dotted line the magnitude response of a uniform tube. The bandwidth of approximation for the GLS and
equiripple filters was α = 0.5, i.e., the results are optimal up to the normalized frequency 0.25 in Figs. 4.26 and 4.27. The filter coefficients of the GLS and the equiripple
filter were divided by 1.0157 and 1.0224, respectively, in order to force the maximum
of the magnitude response to be equal to one.

Fig. 4.27 The magnitude responses of the FD two-tube model with equiripple (Oetken)
FIR interpolation (N = 3, α = 0.5) (solid line) and ideal interpolation (dashed
line). The dotted line presents the magnitude response of a uniform tube.
(Axes: magnitude in dB versus normalized frequency, 0 to 0.5.)
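The scaling of the GLS and equiripple coefficients so that the peak of the magnitude response equals one can be sketched as follows. The coefficient vector below is hypothetical (it is not one of the filters designed in this work), and the peak is estimated on a dense frequency grid, which is an assumption about how such a normalization might be carried out.

```python
import numpy as np

def normalize_peak(h, n_grid=4096):
    """Scale FIR coefficients so that the maximum of |H| on a dense grid is one.

    Returns the scaled coefficients and the divisor that was applied.
    """
    omega = np.linspace(0.0, np.pi, n_grid)
    n = np.arange(len(h))
    # Frequency response sampled on the grid: H(omega) = sum_n h(n) exp(-j omega n)
    H = np.exp(-1j * np.outer(omega, n)) @ h
    peak = np.max(np.abs(H))
    return h / peak, peak

# Hypothetical four-tap filter whose response slightly exceeds unity.
h_example = np.array([-0.07, 0.58, 0.58, -0.07])
h_scaled, peak = normalize_peak(h_example)
print("divisor:", peak)
```

A filter whose gain exceeds one at some frequency could make the feedback loops of a waveguide model non-passive, which is one reason to apply such a scaling.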
To enable comparison of the four simulated results, the errors in the amplitude of the
formants were computed in each case. The results are collected in Table 4.1. The number in parentheses after the name of the filter design method indicates the order N of the
FIR filter approximation. The leftmost column gives the formant number and the second column the normalized center frequency of each formant. The error in each case is
the difference between the peak magnitude of the approximation and that of the ideal magnitude
response. Note that there are also considerable errors in the center frequencies of the
formants at high frequencies in Figs. 4.24–4.27, but they are not presented in Table 4.1.

Table 4.1 Formant amplitude errors (dB) in the two-tube simulations (see text).

Formant   fk/fs   Lagrange (1)   Lagrange (3)   GLS (3)    Oetken (3)
1         0.021   0.147          0.000551       0.456      0.410
2         0.10    1.06           0.0777         0.169      0.340
3         0.15    2.26           0.737          0.00322    0.0798
4         0.22    3.59           0.667          0.241      0.164
5         0.28    3.29           2.86           0.710      0.642
6         0.34    5.45           4.03           4.35       3.94
7         0.42    3.68           2.82           3.39       3.53
8         0.46    5.55           5.30           5.16       5.13
It is seen that the magnitude error of the Lagrange interpolators increases with frequency. The errors of the GLS and Oetken (equiripple) filter approximations are less
than 1 dB for the five lowest formants. At high frequencies the errors are several dB in
all cases.
It can be concluded that the use of FIR FD filters for the realization of an interpolated waveguide model leads to a low-frequency approximation of a tube system.
Implementation of a vocal tract model, or other acoustic tube simulation, thus requires
oversampling and possibly a high-order interpolator to achieve a small enough approximation error up to a desired frequency. In Välimäki et al. (1994a) it was proposed that
a sampling rate of 22 kHz together with third-order Lagrange interpolation yields an
effective bandwidth of about 5 kHz, which is quite adequate for high-quality speech
synthesis.
4.3.8 Conclusion
In this section we tackled simulation of an acoustic tube system, such as the human
vocal tract, with fractional delay filters. Combined use of nonrecursive interpolation and
deinterpolation enabled us to derive a structure for an interpolated scattering junction
which can be located at an arbitrary point along a digital waveguide. The main motivation for this new structure is the elimination of the most severe limitation of the KL
vocal tract model: the use of equal-length tube sections. In the proposed new model,
the locations of the junctions are not restricted by sampling of the waveguide system.
Realization of the interpolated scattering junction using FIR-type fractional delay
filters was considered. The effects of approximation error on the reflection and transmission functions of the scattering junction, as well as on the frequency response of a
simulated tube system, were discussed. FIR interpolation yields a low-frequency approximation of a waveguide system. The application of Lagrange interpolation, general LS
FIR design, and Oetken's almost-equiripple FIR FD design was considered for the
implementation of interpolated waveguide tube models. High-quality simulation
requires oversampling by a factor of two and at least four-tap FD filters for interpolation
and deinterpolation.
In Section 4.6 we consider implementation of acoustic tube models using allpass FD
filters. Next we tackle the simulation of an interpolated three-way scattering junction.

4.4 Interpolated Finger-Hole Model


The position of the finger holes is a critical question in the design of woodwind instruments. In a waveguide model of a woodwind instrument, too, the place of each finger
hole has to be defined carefully. This is feasible with the aid of the FD filtering techniques introduced earlier in this chapter. This approach to finger-hole modeling was
first described in Välimäki et al. (1993b).


4.4.1 FIR-Type Fractional-Delay Finger-Hole Model


We return to study the finger-hole junction discussed in Section 2.5.3. The volume
velocity signals that describe the three-port junction can be expressed in the (discrete)
time domain as

$$u_a^-(n, m + D) = u_b^-(n, m + D) + u_s^-(n) + w(n) \tag{4.29a}$$

$$u_b^+(n, m + D) = u_a^+(n, m + D) + u_s^-(n) + w(n) \tag{4.29b}$$

$$u_s^+(n) = -u_s^-(n) - 2w(n) \tag{4.29c}$$

where $D$ is the real-valued spatial index corresponding to the position of the center of
the finger hole, and signal $w(n)$ is the output of the reflection filter $R(z)$, i.e.,

$$w(n) = r(n) * w_1(n) \tag{4.29d}$$

where $r(n)$ is the impulse response of the reflection filter $R(z)$, $*$ denotes the convolution sum, and the input signal $w_1(n)$ is

$$w_1(n) = u_a^+(n, m + D) + u_b^-(n, m + D) + u_s^-(n) \tag{4.29e}$$

The signal flow graph for a finger-hole model has been depicted in Chapter 2 in Fig.
2.20.
The input signals of the tone-hole model, $u_a^+(n, m + D)$ and $u_b^-(n, m + D)$, can be
computed at an arbitrary point on the digital waveguide using interpolation. When FIR
interpolation is employed, the approximations to these signals can be written in vector
form as

$$u_a^+(n, m + D) = \mathbf{h}^T \mathbf{u}^+(n, m) \tag{4.30a}$$

$$u_b^-(n, m + D) = \mathbf{h}^T \mathbf{u}^-(n, m) \tag{4.30b}$$

where the vectors are

$$\mathbf{h} = [h(0) \;\; h(1) \;\; h(2) \;\; \cdots \;\; h(N)]^T \tag{4.30c}$$

$$\mathbf{u}^+(n, m) = [u^+(n, m) \;\; u^+(n, m + 1) \;\; u^+(n, m + 2) \;\; \cdots \;\; u^+(n, m + N)]^T \tag{4.30d}$$

$$\mathbf{u}^-(n, m) = [u^-(n, m) \;\; u^-(n, m + 1) \;\; u^-(n, m + 2) \;\; \cdots \;\; u^-(n, m + N)]^T \tag{4.30e}$$

and $N$ is the order of the FIR interpolator. The coefficients $h(n)$ can be, e.g., the
Lagrange interpolation coefficients (see Section 3.3).
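The inner product of Eqs. (4.30a) and (4.30b) can be illustrated with a short sketch. It assumes that the delay line is stored as an array indexed by the spatial index and that the coefficients are Lagrange coefficients for a point located `delay` samples after index m; the helper names are illustrative only.

```python
import numpy as np

def lagrange_fd(order, delay):
    """Lagrange FIR fractional-delay coefficients h(0..order)."""
    n = np.arange(order + 1)
    h = np.ones(order + 1)
    for k in range(order + 1):
        mask = n != k
        h[mask] *= (delay - k) / (n[mask] - k)
    return h

def interpolate(line, m, h):
    """Approximate the delay-line value at a fractional index as h^T u(n, m)."""
    return float(np.dot(h, line[m:m + len(h)]))

# Delay line holding samples of a cubic; a third-order Lagrange
# interpolator reproduces polynomials up to degree three.
line = np.arange(10, dtype=float) ** 3
h = lagrange_fd(3, 1.3)
print(interpolate(line, 4, h), 5.3 ** 3)
```

The same coefficient vector is reused for both delay lines here because the vectors are indexed spatially, as in Eqs. (4.30d) and (4.30e).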
The results $u_a^-(n, m + D)$ and $u_b^+(n, m + D)$ of the computation in Eqs. (4.29a) and
(4.29b) also have fractional spatial indices $m + D$ and should thus be stored between
the samples of the delay line. In practice this is not necessary, since the signal
$u_b^-(n, m + D)$ really is the same as $u_a^-(n, m + D)$: both are part of the lower delay line
of the waveguide. Likewise, $u_a^+(n, m + D)$ is the same as $u_b^+(n, m + D)$. The subscripts a
and b essentially refer to the order of computation. This situation is exactly the same
as in the case of the FD KL junction (see Section 4.3.2), where the direct signal was not
processed at all. Thus the computations of the FD finger-hole model can be expressed in
vector form as

$$\mathbf{u}^+(n, m) = \mathbf{u}^+(n, m) + \mathbf{h}\,[u_s^-(n) + w(n)] \tag{4.31a}$$

$$\mathbf{u}^-(n, m) = \mathbf{u}^-(n, m) + \mathbf{h}\,[u_s^-(n) + w(n)] \tag{4.31b}$$

Here the left-hand sides denote the updated contents of the delay lines. Note that these forms only involve deinterpolation.


Since the last term of the above equations is the same, it is economical to compute it
separately and then use it in the two equations. This is expressed as

$$\mathbf{u}^-(n, m) = \mathbf{u}^-(n, m) + \mathbf{v}(n) \tag{4.32a}$$

$$\mathbf{u}^+(n, m) = \mathbf{u}^+(n, m) + \mathbf{v}(n) \tag{4.32b}$$

where we have defined a vector

$$\mathbf{v}(n) = \mathbf{h}\,[u_s^-(n) + w(n)] \tag{4.32c}$$

Here the signal $w(n)$ is the output of the reflection filter $R(z)$. Its input signal $w_1(n)$
defined by Eq. (4.29e) also involves fractional indices, and thus it has to be computed
using interpolation, that is

$$w_1(n) = \mathbf{h}^T \mathbf{u}^+(n, m) + \mathbf{h}^T \mathbf{u}^-(n, m) + u_s^-(n) = \mathbf{h}^T [\mathbf{u}^+(n, m) + \mathbf{u}^-(n, m)] + u_s^-(n) \tag{4.33}$$

The last form suggests an efficient way to calculate the input for the reflection filter
$R(z)$: first add the values in the two delay lines of the digital waveguide and only after
that interpolate. Note that as the signals $u^+(n)$ and $u^-(n)$ propagate in opposite directions, the coefficients of the interpolating filter appear in time-reversed order with
respect to one of them. In other words, in the upper delay line the FIR interpolating filter approximates the fractional delay $d$, while in the lower line it approximates the
complementary fractional delay $\bar{d}$.
Note that altogether two FD filterings are involved in the interpolated finger-hole
model: interpolation is needed when Eq. (4.33) is evaluated and deinterpolation when
Eq. (4.32c) is evaluated. In addition, there is a simple digital filter $R(z)$ in the system,
and two additions are needed. It is seen that there is considerably more computation in
this model than, e.g., in the FD KL model discussed in Section 4.3.
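One time step of the interpolated finger-hole computation, as we read Eqs. (4.29c), (4.29d), (4.32), and (4.33), can be sketched as follows. The reflection filter is replaced by a hypothetical memoryless gain passed in as `reflect`, since the actual filter R(z) of Section 2.5.3 is not reproduced here; the function name and argument layout are likewise assumptions made for illustration.

```python
import numpy as np

def finger_hole_step(u_plus, u_minus, m, h, us_minus, reflect):
    """Update both delay lines in place and return the outgoing side-branch sample.

    u_plus, u_minus : spatially indexed delay-line contents (float arrays)
    m               : integer spatial index where the coefficient vector starts
    h               : FIR FD coefficients, used for interpolation and deinterpolation
    us_minus        : incoming side-branch sample
    reflect         : callable standing in for the reflection filter R(z)
    """
    seg = slice(m, m + len(h))
    # Eq. (4.33): add the two delay lines first, then interpolate once.
    w1 = float(np.dot(h, u_plus[seg] + u_minus[seg])) + us_minus
    w = reflect(w1)                     # Eq. (4.29d), placeholder filter
    v = h * (us_minus + w)              # Eq. (4.32c): deinterpolation vector
    u_plus[seg] += v                    # Eq. (4.32b)
    u_minus[seg] += v                   # Eq. (4.32a)
    return -us_minus - 2.0 * w          # Eq. (4.29c)

u_plus = np.zeros(8)
u_minus = np.zeros(8)
h = np.array([0.5, 0.5])                # linear interpolation, half-sample position
us_plus = finger_hole_step(u_plus, u_minus, 3, h, 1.0, lambda x: -0.5 * x)
print(u_plus, u_minus, us_plus)
```

Only one inner product and one scaled vector addition per delay line are needed, which reflects the economy noted above.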
4.4.2 Approximation Error Due to FD Filters
As in the case of the FD KL junction, the approximation error due to the fractional
delay filters is in effect applied twice to the FD three-port: first in the interpolation of
the delay-line value and then in the deinterpolation of the result back to the waveguide.
It is worth pointing out that the interpolated signal values are only used in the computation of the effect of the junction, and deinterpolation is only applied to the result of this
calculation. Thus, the interpolation error does not affect the signal that would in any case
propagate in the waveguide if the junction were not there.

Footnote: The trick of adding the delay-line values before interpolating in the finger-hole model can be compared with the interpolated tube model
discussed above in Section 4.3. Figure 4.19 illustrates the idea that both interpolation and deinterpolation
only need to be applied once in an FD scattering junction.

Fig. 4.28 A signal flow diagram of a finger-hole model showing the interpolation and
deinterpolation errors.
Figure 4.28 shows a high-level block diagram of the finger-hole model. The transfer
functions $E_i(z)$ and $E_d(z)$ represent the errors of the interpolating and deinterpolating
filter, respectively. In this diagram the waveguide is considered ideal, i.e., no errors due
to digital implementation are present.
The effect of the FD approximation in the finger-hole model derived above can be
studied by computing the transfer function of a simple bore model. In this experiment
one end of the tube is assumed to be ideally closed (i.e., reflection coefficient is 1) and
the other end is assumed to be matched (or equivalently an infinite-length tube is considered). A finger hole is placed at a distance K = 50.5 unit samples from the closed
end. The transfer function of this system is

$$H(e^{j\omega}) = \frac{-2j\omega\, E_i(e^{j\omega}) R(e^{j\omega}) e^{-j\omega K}}{1 - E_i(e^{j\omega}) R(e^{j\omega}) E_d(e^{j\omega}) e^{-j\omega K}} \tag{4.34}$$

where the transfer functions representing the interpolation and deinterpolation error are

$$E_i(e^{j\omega}) = e^{j\omega TD} H_i(e^{j\omega}) \tag{4.35a}$$

$$E_d(e^{j\omega}) = e^{j\omega TD} H_d(e^{j\omega}) \tag{4.35b}$$

Figure 4.29 presents the magnitude responses for the ideal analog system, a system
with the reflection filter $R(z)$ but with ideal fractional delays, and a realistic digital system with the digital reflection filter and an FD interpolator and deinterpolator implemented using first-order Lagrange interpolation (i.e., linear interpolation). The sampling
rate used in this example is 44 kHz. It is seen that in this worst case (i.e., the fractional
delay is d = 0.5) the resonances are only slightly disturbed in the frequency band of
interest, i.e., below 10 kHz. Thus it seems to be sufficient to use simple linear interpolators in the implementation of finger-hole models.
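The size of the error introduced by linear interpolation in this worst case can be estimated with a few lines of arithmetic. The sketch below assumes that the error function is the interpolator response with the ideal half-sample delay factored out, which is our reading of Eq. (4.35a); the variable names are illustrative.

```python
import numpy as np

fs = 44000.0                      # sampling rate used in the example (Hz)
f = 10000.0                       # upper edge of the band of interest (Hz)
d = 0.5                           # worst-case fractional delay
omega = 2.0 * np.pi * f / fs      # normalized angular frequency (rad/sample)

h = np.array([1.0 - d, d])        # linear interpolation coefficients
H_i = h[0] + h[1] * np.exp(-1j * omega)

# Error function: filter response with the ideal delay e^{-j omega d} removed.
E_i = np.exp(1j * omega * d) * H_i

gain_once = abs(H_i)              # error magnitude of one FD filter
gain_twice = gain_once ** 2       # interpolation followed by deinterpolation
print("single filter gain:", gain_once)
print("gain after two FD filterings (dB):", 20.0 * np.log10(gain_twice))
```

For d = 0.5 the linear interpolator has exactly linear phase, so the error is purely a magnitude roll-off equal to cos(omega/2) per filter; it affects only the signal path through the junction.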

Fig. 4.29 The magnitude responses of the ideal bore model (dashed line), a model with
a digital reflection function but ideal interpolators (dotted line), and its realistic implementation with first-order Lagrange interpolation and deinterpolation (solid line).
(Axes: magnitude in dB versus frequency in kHz.)
4.4.3 Implementation Issues
A single model for an open finger hole is computationally relatively efficient. However,
woodwind instruments have several finger holes, up to approximately twenty, and if all
of them were simulated by the technique introduced above, the computational load
would be extremely large. Fortunately, it is not necessary to implement all the open
holes since only the first two or three open holes determine the fundamental pitch of the
instrument (Benade, 1960).
The closed holes have only a slight influence on the effective length of the bore and
they also act as lowpass filters. These effects can be taken into account by far simpler
methods than using a fractional delay three-port junction. Thus, it is suggested that a
woodwind bore model with finger holes be implemented based on two or three FD
junctions that model the first open holes. When a hole is closed, the junction is removed
and re-created at another location in the waveguide. Using this strategy, the typical
playing techniques of woodwind instruments, including the key claps, can be simulated.
More tone holes may be needed for producing multiphonics.


4.5 Fractional Delay Operations with Allpass Filters


The previous sections of this chapter have concentrated on the FIR implementation
techniques of fractional delay waveguide filters. In Sections 4.6 and 4.7 similar structures are developed with allpass FD filters. The idea of using allpass filters in waveguide models is not new: Jaffe and Smith (1983) tuned their string synthesizer with a
first-order allpass filter. However, to our knowledge allpass filters have not been used
for waveguide junctions before. The initial work in this field was published in Välimäki
and Karjalainen (1995) and Välimäki (1995b). Very recently Cook (1995) used allpass
filters for realizing finger holes for a waveguide flute model.
There are major differences between the FIR and allpass implementations of
FDWFs. These differences are caused by the fact that a given impulse response of an
FIR interpolating filter can be used for approximating both a fractional delay $d$ and a
complementary fractional delay $\bar{d} = 1 - d$ (by flipping the impulse response). The complementary fractional delay (CFD) is needed in FDWFs because the aim is to use two
digital delay lines to represent a waveguide and connect the junctions between the unit
delays of these lines. For a particular FD junction, $d$ is the delay from the last integral
sampling point while $\bar{d}$ is the delay to the next integral sampling point.
In contrast to an FIR filter, the coefficients of an allpass FD filter are completely different for fractional delays $d$ and $\bar{d}$. For this reason it is not advantageous to use a one-multiplier scattering junction as a starting point for derivation of an allpass FD junction:
each junction would require altogether four allpass filters, two for approximating $z^{-d}$
and another two for $z^{-\bar{d}}$. Furthermore, the approximation error in the transmission function of this kind of junction would be unacceptable (see Section 4.6.4).
4.5.1 Implementing a Variable Delay Line Using an Allpass Filter
In Chapter 3 it was shown that there are two techniques for the design of fractional
delay filters: FIR and IIR filter approximation. While the use of a nonrecursive interpolating filter leads to a structure that is easier to understand, it is useful to examine the
use of a recursive filter as well. As discussed in Section 3.4, the digital allpass filter is a
good choice for an IIR FD filter. Now the implementation of a variable-length delay
line using an allpass filter is described.
The use of an allpass filter is clarified using a first-order filter as an example. Higher-order
allpass structures are straightforward extensions. The length of a delay line can be
varied using a first-order allpass filter as depicted in Fig. 4.30. It may be useful to use a
form of an allpass filter that has a minimum number of delay elements, since the delays
can be shared with the delay line as shown in Fig. 4.30. This scheme can be extended
for higher-order allpass filters.

Fig. 4.30 Implementation of a variable-length digital delay line by means of a first-order allpass filter.
The structure illustrated in Fig. 4.30 implements the following difference equation

$$y(n) = a_1 [x(n - M) - y(n - 1)] + x(n - M - 1) \approx x(n - M - D) \tag{4.36}$$

Note, however, that the structure in Fig. 4.30 is not a direct-form structure and thus it
may not be apparent that the underlying difference equation is in fact Eq. (4.36).
The coefficients of the interpolating allpass filter can be computed applying one of
the techniques presented in Section 3.4. The simplest and the most useful allpass design
technique for audio signal processing is the maximally-flat group delay, or Thiran,
design (see Section 3.4.3).
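The difference equation (4.36) can be tried out with the first-order Thiran coefficient, which has the well-known closed form a1 = (1 - D)/(1 + D). The sketch below assumes zero initial conditions and treats D as the delay of the allpass section itself (so that the total delay is M + D); the function names are illustrative.

```python
import numpy as np

def thiran1(delay):
    """Coefficient of the first-order maximally flat (Thiran) allpass filter."""
    return (1.0 - delay) / (1.0 + delay)

def allpass_delay_line(x, M, D):
    """y(n) = a1 [x(n - M) - y(n - 1)] + x(n - M - 1), zero initial state."""
    a1 = thiran1(D)
    y = np.zeros(len(x))
    for n in range(len(x)):
        x_m = x[n - M] if n - M >= 0 else 0.0
        x_m1 = x[n - M - 1] if n - M - 1 >= 0 else 0.0
        y_prev = y[n - 1] if n >= 1 else 0.0
        y[n] = a1 * (x_m - y_prev) + x_m1
    return y

# With D = 1 the coefficient is zero and the structure is a pure delay of M + 1.
impulse = np.zeros(16)
impulse[0] = 1.0
y_imp = allpass_delay_line(impulse, 3, 1.0)

# With D = 0.5 a low-frequency sinusoid is delayed by about 3.5 samples.
n = np.arange(400)
w = 0.05
y_sin = allpass_delay_line(np.sin(w * n), 3, 0.5)
err = np.max(np.abs(y_sin[100:] - np.sin(w * (n[100:] - 3.5))))
print("steady-state error:", err)
```

Changing D only changes the single coefficient a1, which is what makes this structure attractive for a variable-length delay line.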
4.5.2 Implementing Interpolation and Deinterpolation with Allpass Filters
Figure 4.31 illustrates the first-order allpass filter structure that realizes an output at a
fractional point of a delay line. This system is comparable to that presented in Fig. 4.30.
The principal difference between these structures is that in Fig. 4.31 we assume that the
delay line continues after the fractional point and the allpass filter merely approximates
the signal value at one point along that delay line, not at its end point as in Fig. 4.30.
When the filter coefficient $a_1$ has been chosen in an appropriate way, the output $y(n)$ of
this structure approximates the value of signal $x(n)$ at point $M + D$ along the delay line,
that is

$$y(n) \approx x(n - M - D) \tag{4.37}$$

The structure shown in Fig. 4.31 directly implements the following difference equation:

$$y(n) = a_1 [x(n - M) - y(n - 1)] + x(n - M - 1) \tag{4.38}$$

This time we use the first-order allpass structure with two unit delay elements and a
minimum number of multiplications. This choice is appropriate since one of the unit
delays can be shared with the delay line to be processed and the other unit delay is
reserved for the recursive part of the filter. The allpass structure with a minimum number of delay elements would not be more efficient since the recursion prevents the
sharing of the unit delay with the delay line. For different structures for the implementation of first-order allpass filters, see Mitra and Hirano (1974).
Next we develop a recursive version of deinterpolation using an allpass filter. The
allpass deinterpolation scheme is clarified using a first-order allpass filter as an example. In Fig. 4.32, an implementation of a fractional delay input to a delay line is shown.
This allpass filter structure is the transpose of that in Fig. 4.31, i.e., an allpass
deinterpolator may be realized using the transpose structure of the allpass interpolator.
This is analogous to the FIR filter structures studied earlier.

Fig. 4.31 A first-order allpass interpolator is used for the implementation of an output
at a fractional point of a delay line.

Fig. 4.32 A first-order allpass deinterpolator implements an input at a fractional point
of a delay line.
The transfer functions of the first-order allpass interpolator and deinterpolator,
respectively, are

$$A_i(z) = \frac{a_1 + z^{-1}}{1 + a_1 z^{-1}} \tag{4.39}$$

$$A_d(z) = \frac{\bar{a}_1 + z^{-1}}{1 + \bar{a}_1 z^{-1}} \tag{4.40}$$

Note that the filter coefficients $a_1$ and $\bar{a}_1$ are not equal in these structures and they do
not, in general, have a simple relation. In allpass interpolation the coefficient should be
chosen so that the filter approximates the delay $D + d$, where $d$ is the fractional delay.
The coefficient $\bar{a}_1$ of the allpass deinterpolator, however, should be chosen so that the
allpass filter approximates the delay $D + \bar{d}$, where $\bar{d}$ is the complementary fractional
delay.

4.6 Interpolated Waveguide Model: Allpass Filter Approach


Here we derive the allpass filter counterpart for the interpolated scattering junction of
two tubes.
4.6.1 Derivation of the Allpass Fractional Delay Junction
Figure 4.33a illustrates the ideal interpolated scattering junction where the fractional
delay elements $z^{-d}$ and $z^{-\bar{d}}$ are explicitly shown. Figure 4.33b shows a modified version
of this junction. Now the four FD elements have been pushed through the adders and
branch nodes. This is legal because linear time-invariant systems commute.
In Fig. 4.33b, there are altogether eight FD elements. The FD elements $z^{-d}$ and $z^{-\bar{d}}$
can be combined into a single unit delay element (since $d + \bar{d} = 1$), both in the upper
and in the lower line, as presented in Fig. 4.33c. The two FD elements $z^{-d}$ on the left of
Fig. 4.33b can be combined into a single block $z^{-2d}$, and it is possible to do the same for
the two CFD blocks on the right.
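The combinations used in this derivation follow directly from the additivity of delays in cascade. As a brief worked summary, writing the complementary fractional delay as $\bar{d} = 1 - d$ (the overbar is our notation here):

```latex
\begin{align*}
  z^{-d}\, z^{-\bar{d}} &= z^{-(d + \bar{d})} = z^{-1}
      && \text{(FD and CFD in series form one unit delay)} \\
  z^{-d}\, z^{-d} &= z^{-2d}
      && \text{(two FD elements on the left)} \\
  z^{-\bar{d}}\, z^{-\bar{d}} &= z^{-2\bar{d}} = z^{-(2 - 2d)}
      && \text{(two CFD elements on the right)}
\end{align*}
```

Only the last two products remain fractional, so only they need to be approximated by FD filters.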

Fig. 4.33 Derivation of the FD scattering junction employing commutation. (a) The
original FD scattering junction. (b) The FD elements have been pushed
through the branch nodes and adders. (c) The structure that is implemented
with two first-order allpass filters.
The block diagram presented in Fig. 4.33c is one possible allpass implementation of
the FD junction. Other configurations can also be derived but this one has some desirable properties. We realize the double fractional delay elements using allpass filters
$A_D(z)$ and $A_{\bar{D}}(z)$. Consequently, the desired delay $D$ and the complementary total delay $\bar{D}$
are defined by

$$D = 2d \tag{4.41}$$

$$\bar{D} = 2\bar{d} = 2 - D \tag{4.42}$$

This implies that when the noninteger part $d$ of the position of the junction can have
values in the interval [0, 1], the delays implemented by the two allpass filters will be in
the interval [0, 2].
Let us consider implementation of the allpass FD junction using the Thiran allpass
filter that was discussed in Sections 3.4.3 and 3.4.4. The choice of the optimal range for
the delay parameter $D$ was studied in Section 3.4.4. For example, the first-order Thiran
interpolator yields best results when the delay varies between $D_0 = 0.42$ and $D_0 + 1 = 1.42$.
For Thiran allpass filters of different order, the optimal range is also different.
However, an easy rule of thumb for the delay parameter is to maintain it in the range
[N − 0.5, N + 0.5], where N is the order of the allpass filter.
The implementation of the allpass FD junction described above needs special care,
since both the allpass filters used in this structure must be able to approximate fractional
delays within two unit delays. In the case of the first-order Thiran allpass filter, the
junction can be implemented as shown in Fig. 4.33c only when the junction is located
0.25–0.75 samples away from the nearest unit delay. Only then is the delay of the
allpass filters $A_D(z)$ and $A_{\bar{D}}(z)$ within the desired range [0.5, 1.5].
When the delay $d$ is smaller than 0.25 or larger than 0.75, the delay $D$ approximated
by the filter $A_D(z)$ goes beyond the desired range. In order to change the delay of the
reflection function, we may add or subtract unit delays to or from that allpass filter
simply by changing its input point one unit delay in either direction. If the input point of
the allpass filter $A_D(z)$ is moved one unit delay to the left in Fig. 4.33c (i.e., the input of
the allpass filter is taken one sample interval earlier), then one sample of delay must be
added to $D$ in order to maintain the overall delay. Similarly, if the input point is
switched one unit delay to the right (i.e., the input to the allpass filter is delayed by one
unit delay), one sample of delay must be subtracted from the filter's nominal delay $D$.
Another possibility is to change the location of the filter's output point, i.e., the point
where the output of the allpass filter $A_D(z)$ is added to the lower delay line in Fig. 4.33c.
These are two alternative methods that help to maintain the nominal delay of the allpass
filter within the desired range. One of these techniques must be used for the complementary allpass filter $A_{\bar{D}}(z)$ as well.
The above discussion concerned the first-order Thiran allpass filter. However, similar procedures (a change of the location of the input or output point) should be used with
higher-order filters to maintain the delay of the filter in its best range. In the case of higher-order allpass filters, one or more unit delays must usually be added to the filter's nominal delay value
because these filters cannot implement an arbitrarily small delay. Remember that the
Thiran allpass filter is unstable for delay values $D < N - 1$. For example, the second-order Thiran filter cannot approximate a delay smaller than one sample.
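The bookkeeping described above for the first-order case can be sketched as a small helper. It assumes the desired range [0.5, 1.5] and a sign convention in which a shift of -1 means that the input point of the allpass filter is taken one unit delay earlier; the function name and the convention are illustrative choices, not part of the original structure.

```python
def allpass_delay_and_shift(d, lo=0.5, hi=1.5):
    """Return (D, shift) for the reflection-path allpass filter.

    d     : fractional position of the junction, 0 <= d <= 1
    D     : nominal delay given to the allpass filter, kept within [lo, hi]
    shift : -1 if the input point is moved one unit delay earlier,
            +1 if it is moved one unit delay later, 0 otherwise
    The total delay of the path, D + shift, always equals 2 d.
    """
    D = 2.0 * d
    shift = 0
    if D < lo:
        D += 1.0      # input taken one sample earlier: the filter adds the sample back
        shift = -1
    elif D > hi:
        D -= 1.0      # input delayed by one sample: the filter delay is reduced
        shift = 1
    return D, shift

for d in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(d, allpass_delay_and_shift(d))

# The complementary filter can be handled the same way with 1 - d in place of d.
```

Moving the output point instead of the input point would lead to the same arithmetic.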
4.6.2 Approximation Errors due to an Allpass Fractional Delay Junction
Let us consider the approximation error due to allpass interpolators. In the structure of
Fig. 4.33c, there are no FD elements in the straight path in either direction. Thus there
is no approximation error in the transmission function through this allpass FD junction.
The only limitation that this solution brings about is that the point where the transmission occurs is not explicitly implemented and the signal value at that point cannot be
directly obtained. At the open end of a tube this causes an error in the delay of the output signal (radiated sound). This does not affect the formant structure or other important properties of the signal and can be easily avoided, if desired, by incorporating an
additional FD filter at the output.
The reflection from the FD junction suffers from an approximation error since the
double FDs in Fig. 4.33b are realized using allpass filter approximations $A_D(z)$ and
$A_{\bar{D}}(z)$. When the MF (Thiran) allpass FD filters are used, the nature of this error can be
predicted from the phase delay responses given in Section 3.4.3.

Fig. 4.34 Transfer function of a two-tube system realized using first-order Thiran allpass filters (solid line). For comparison, the simulation result obtained with
ideal interpolation (dashed line) is also given. The location of the junction is
d = 0.25. (Axes: magnitude in dB versus frequency in kHz.)
The approximation error is zero when the nominal delay of the filter is 1.0. This
implies that the approximation error disappears when the junction is located halfway
between two unit delays, i.e., d = 0.5. This is in sharp contrast with the FIR FD junctions,
where the approximation error is largest when d = 0.5.
For other values of fractional delay, the approximation error is nonzero, and its effect
can be characterized as follows. At very low frequencies the junction (impedance discontinuity) is located very nearly in the correct position, but with increasing frequency
the junction tends to move towards some erroneous location. This is because the Thiran
allpass filter approximates the delay accurately at low frequencies but poorly at high
frequencies. We may assume that at low frequencies the formant structure of a tube
model implemented with these allpass filters is similar to the ideal one, whereas at
higher frequencies the center frequencies and amplitudes of formants are incorrect. The
following example confirms that this is actually the case.
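The frequency dependence of the delay error can be made concrete by evaluating the phase delay of the first-order Thiran allpass filter. The sketch below assumes the closed-form coefficient a1 = (1 - D)/(1 + D) and evaluates the filter on a grid of frequencies below the Nyquist limit; the names are illustrative.

```python
import numpy as np

def thiran1(delay):
    """Coefficient of the first-order Thiran allpass filter."""
    return (1.0 - delay) / (1.0 + delay)

def phase_delay(a1, omega):
    """Phase delay (in samples) of A(z) = (a1 + z^-1) / (1 + a1 z^-1)."""
    z1 = np.exp(-1j * omega)
    A = (a1 + z1) / (1.0 + a1 * z1)
    return -np.unwrap(np.angle(A)) / omega

omega = np.linspace(0.05, 0.9 * np.pi, 200)
pd_exact = phase_delay(thiran1(1.0), omega)   # nominal delay 1.0 (d = 0.5)
pd_half = phase_delay(thiran1(0.5), omega)    # nominal delay 0.5 (d = 0.25)

print("D = 1.0: min/max phase delay", pd_exact.min(), pd_exact.max())
print("D = 0.5: phase delay at low and high frequency", pd_half[0], pd_half[-1])
```

With a nominal delay of 1.0 the filter reduces to a unit delay and is exact at all frequencies, while for a nominal delay of 0.5 the phase delay drifts upward with frequency, which corresponds to the apparent movement of the junction.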
4.6.3 Simulation of a Two-Tube Model with Allpass Filters
In order to compare the properties of allpass and FIR implementations of the interpolated waveguide model, a two-tube model with one FD junction was simulated. The
example system is the same as the one presented in Section 4.3.7. The total length of the
tube system is eight unit delays. The junction between the tubes is located between the
third and fourth unit delays, and the reflection coefficient is 0.5. One end of the tube is
assumed to be closed and the other one open. To simplify the example, the terminations
have been approximated with purely resistive loads, $r_1 = 0.9$ and $r_2 = 0.9$.

Fig. 4.35 Transfer function of a two-tube system realized using first-order Thiran allpass filters (solid line). For comparison, the simulation result obtained with
ideal interpolation (dashed line) is also given. The location of the junction is
d = 0.75. (Axes: magnitude in dB versus frequency in kHz.)
Figures 4.34 and 4.35 show the magnitude of the transfer function of this system for
two junction positions. The solid line has been computed using first-order MF allpass
filters and the dashed line was obtained with ideal interpolators. The sampling rate is 22
kHz. In Fig. 4.34, the location of the FD junction is d = 0.25, corresponding to delay
parameters $D = 0.5$ and $\bar{D} = 1.5$. In Fig. 4.35, the junction is situated at d = 0.75 between
the unit delays, which means that $D = 1.5$ and $\bar{D} = 0.5$. These can be considered the
worst-case situations since the delay values are on the edge of the range.
The transfer functions of the ideal and approximative system are nearly identical at
low frequencies but at high frequencies the deviation is quite large. However, the magnitude error of the fourth formant (at about 4.7 kHz) is only 1 dB in Fig. 4.34 and 2 dB
in Fig. 4.35. These figures should be compared with Figs. 4.24–4.27, which have been
computed with FIR interpolators. It can be concluded that the result obtained with the
first-order allpass filters is clearly better (at low and middle frequencies) than with the
linear interpolators and about as good as with the third-order Lagrange interpolators.
The computational complexity of the allpass filter implementation is less than that of
the FIR filter implementation: a first-order allpass filter is implemented with one multiplication and two additions and a second-order allpass filter can be realized with two
multiplications and three additions, whereas a third-order FIR filter requires four
multiplications and three additions. Two filters per FD junction are needed both in the
case of allpass and FIR filters. Thus the allpass filter appears to be an attractive choice
for the implementation of fractional delay waveguide models.

Fig. 4.36 An example of an alternative FD junction structure that at first seems to be
correct. Deeper analysis reveals that an allpass implementation of this structure will not yield a stable waveguide model.
4.6.4 Alternative Allpass Filter Structure
Figure 4.33 presents one FD junction structure that can be used for implementing an
interpolated waveguide model. Now we briefly discuss another structure that is perhaps
more obvious than that of Fig. 4.33, and at first seems to be quite correct. Nevertheless,
this structure is nonrealizable.
In Fig. 4.36a we present a one-multiplier scattering junction that is located between
the sampling points of a waveguide. The FD and CFD elements are explicitly shown. It
is advantageous to try to remove these FD elements from the waveguide. Then the FD
junction can be connected to an arbitrary point on the waveguide and it is not needed to
cut the delay lines by adding a fractional delay filter in the middle of the unit delay
chains. One way to achieve this is to commute the FD elements with the adder and
branch nodes of the scattering junction. In Fig. 4.36b, all four FD elements have been
pushed inside the scattering junction. Those copies of the FD and CFD elements that
were left in the delay line chains, have been combined into unit delay elements. In
addition, one extra unit delay element from the right-hand side has been pushed into the
junction and has been combined with the CFD element. This helps to realize an allpass
junction with better accuracy. The structure of Fig. 4.36b is aimed at an implementation using
a first-order allpass filter. The fractional delays $z^{-d}$ would then be implemented with
two allpass filters whose delay parameters vary within [0, 1]. The delay
parameters of the other two allpass filters (approximating the transfer functions $z^{-(\bar{d}+1)}$)
would vary in the range [1, 2].
Unfortunately this design proves to be useless. Let us consider the transmission
function through the junction from the left to the right. We denote the allpass filter
approximations for the FD and CFD elements by $A_D(z)$ and $A_{\bar{D}}(z)$, respectively. The
transmission function then consists of two parallel paths: the direct path $z^{-2}$ and the path through the multiplier, $r A_D(z) A_{\bar{D}}(z)$. We would like this to be an allpass
function or at least a passive function. However, the approximation errors in the two
allpass filters ensure that the phase function of $r A_D(z) A_{\bar{D}}(z)$ is nonlinear, and although
this transfer function and that of the other signal path, $z^{-2}$, are both allpass functions (the product of two allpass filters is also an allpass filter), the resulting transmission function does not have this property. In fact, a sum of two allpass filters is in general not an allpass function: the differences in the phase functions of the summed functions lead to cancellation or emphasis of different frequencies in its magnitude spectrum. This approach can be and has been used for designing lowpass and highpass filters, for example.
The above explanation implies that the transmission functions through the FD junction of Fig. 4.36b are not guaranteed to be either allpass or passive. This may endanger
the stability of the overall waveguide system, which consists of several feedback loops.
For this reason, the structure is useless, although in theory it has been derived in a
correct way.
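The argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the two signal paths are assumed to add with weight r, both allpass filters are assumed to be first-order Thiran designs with coefficient (1 - D)/(1 + D), and all function names are hypothetical. It evaluates the magnitude of the transmission function on the unit circle and reports how far it departs from a constant.

```python
import cmath
import math


def thiran_first_order(delay):
    """First-order Thiran allpass coefficient for a delay in samples (assumed design)."""
    return (1.0 - delay) / (1.0 + delay)


def allpass_response(a, omega):
    """Frequency response of A(z) = (a + z^-1) / (1 + a z^-1) at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1.0 + a * z_inv)


def transmission(d, r, omega):
    """Assumed transmission z^-2 + r * A_D(z) * A_Dbar(z) through the junction."""
    a_fd = allpass_response(thiran_first_order(d), omega)         # approximates z^-d
    a_cfd = allpass_response(thiran_first_order(2.0 - d), omega)  # approximates z^-((1-d)+1)
    return cmath.exp(-2j * omega) + r * a_fd * a_cfd


def magnitude_ripple(d, r, points=512):
    """Difference between the largest and smallest magnitude over 0..pi."""
    mags = [abs(transmission(d, r, math.pi * k / points)) for k in range(points + 1)]
    return max(mags) - min(mags)


if __name__ == "__main__":
    for d in (0.3, 0.5, 1.0):
        print(d, magnitude_ripple(d, r=0.5))
```

For d = 1 both approximations are exact in this sketch and the ripple vanishes; for intermediate values of d the magnitude becomes frequency dependent, which is the loss of the allpass property described above.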

4.7 Allpass Fractional Delay Finger-Hole Model


Section 4.4 was concerned with implementation of a finger-hole model using FIR FD
filters. Now we present a corresponding model realized using allpass filters (Välimäki,
1995b). It is worth noting that this is merely one possible configuration. Others exist,
but the one discussed here turns out to be efficient in terms of
computational cost. For an example of another approach, see Cook (1995).
The derivation follows the same guidelines as above in the case of the FD two-port
scattering junction. We begin with the finger-hole junction where there are four reflection filters as illustrated in Fig. 4.37a: the two filters R(z) simulate the reflection of the
volume velocity waves and the two filters 1 + R(z) simulate the transmission of volume
velocity through the finger hole. It is assumed that this finger hole is located at an arbitrary position along the waveguide bore model. Thus one of the unit delays of the digital waveguide has been split into two parts and the hole has been positioned between
them. The fractional delays $d$ and $\bar{d}$ are explicitly shown in Fig. 4.37a.
We proceed by pushing the four FD elements inside the junction. When an FD element is pushed through a branch or summation node, this element has to be appended
into each input path of that branch. This has been done in Fig. 4.37b. It is seen that now
there are altogether eight FD elements.
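The commutation rule used here can be written out explicitly. The identities below assume ideal fractional delays and linear, time-invariant nodes; the signal names $X_1(z)$ and $X_2(z)$ are illustrative only and do not appear in the figure.

```latex
z^{-d}\,\bigl[X_1(z) + X_2(z)\bigr] = z^{-d}X_1(z) + z^{-d}X_2(z),
\qquad
z^{-d}\,z^{-\bar{d}} = z^{-(d+\bar{d})} = z^{-1},
\quad \bar{d} = 1 - d .
```

The first identity is why one FD element in front of a node becomes one element on each path entering that node, and the second is why an FD element and a complementary FD element on the same path merge into a unit delay.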
Fig. 4.37 Derivation of the finger-hole model structure where the fractional delays are
approximated using allpass filters. a) The finger-hole junction located at an
arbitrary point (between unit delays) of a digital waveguide. b) The FD elements have been pushed inside the junction. c) The final structure that can be
implemented using two identical reflection filters R(z) and two allpass filters.

The FD elements that are on the same path can be combined: the fractional delays $z^{-d}$
and complementary FDs $z^{-\bar{d}}$ can be combined into single unit delay elements (since $d + \bar{d} = 1$) in the upper and in the lower lines (see Fig. 4.37c). The two FD elements $z^{-d}$ on
the left of Fig. 4.37b can be combined into a single block $z^{-2d}$, while the two CFD elements $z^{-\bar{d}}$ on the right can be combined into $z^{-2\bar{d}}$.
These double FD elements are implemented using allpass filters $A_D(z)$ and $A_{\bar{D}}(z)$.
Consequently, for these allpass filters the desired delay $D$ and complementary total
delay $\bar{D}$ are defined by $D = 2d$ and $\bar{D} = 2\bar{d}$. This implies that when the noninteger part $d$
of the position of the junction is varied in the interval [0, 1], the delays approximated by
these two allpass filters have values in the interval [0, 2]. Also shown in Fig. 4.37c is an
efficient implementation of transmission functions 1 + R(z) as a sum of the direct signal
and the output of the filter R(z).
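A minimal sketch of these two ingredients follows. It is not the structure of Fig. 4.37c itself: the class `OnePoleReflection`, its coefficients, and both function names are hypothetical, and only the relations D = 2d, complementary delay 2(1 - d), and the computation of 1 + R(z) as the direct signal plus the output of R(z) are taken from the text.

```python
class OnePoleReflection:
    """Hypothetical one-pole reflection filter R(z) = b0 / (1 + a1 z^-1)."""

    def __init__(self, b0, a1):
        self.b0 = b0
        self.a1 = a1
        self.y_prev = 0.0  # y[n-1]

    def tick(self, x):
        y = self.b0 * x - self.a1 * self.y_prev
        self.y_prev = y
        return y


def allpass_delays(d):
    """Delays approximated by the two allpass filters for a junction offset d in [0, 1]."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in [0, 1]")
    return 2.0 * d, 2.0 * (1.0 - d)  # (D, complementary D); each lies in [0, 2]


def reflect_and_transmit(reflection_filter, x):
    """Return (R{x}, (1 + R){x}) while running the reflection filter only once."""
    reflected = reflection_filter.tick(x)
    return reflected, x + reflected


if __name__ == "__main__":
    print(allpass_delays(0.3))
    r_filter = OnePoleReflection(b0=-0.2, a1=-0.5)  # illustrative coefficients only
    print(reflect_and_transmit(r_filter, 1.0))
```

Reusing the output of R(z) for the transmission path is what keeps the count at two reflection filters instead of four.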
The unit delay element in the middle of Fig. 4.37c adjusts the delay between the signals from the upper and lower delay line before their addition. Here we have simplified
the structure considerably by neglecting the FD elements $z^{-d}$ and $z^{-\bar{d}}$ that should precede
the adder in the two signal paths. It is believed that this approximation does not affect
the properties of the finger-hole model. In effect we have only neglected a small delay
at the output of the system. This trick does, however, reduce the number of additions
and multiplications, since it saves the implementation of one more fractional delay
filter.
The finger-hole model of Fig. 4.37c can be implemented using first-order allpass filters
$A_D(z)$ and $A_{\bar{D}}(z)$. The total computational load of the finger-hole model is thus not very
large since only two one-pole reflection filters and two first-order allpass filters need to
be implemented.

4.8 Conclusions
In this chapter, implementation techniques for fractional delay waveguide filters have
been discussed. Interpolation is a standard method for estimating the value of a signal
between its known sample values. A novel technique, deinterpolation, was introduced.
It can be interpreted as an inverse operation of interpolation in the sense that interpolation is used for computing the output at an arbitrary point of a discrete-time delay line,
whereas deinterpolation is used for locating a corresponding input point. Both operations can be implemented using recursive or nonrecursive digital filters, but deinterpolation is more intuitively implemented using an FIR filter structure. The use of allpass
filters for interpolation and deinterpolation was also discussed.
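The relation between the two operations can be sketched with the simplest FIR case, first-order (linear) Lagrange interpolation. The function names are hypothetical, and the sketch assumes that deinterpolation superimposes the input onto the neighbouring delay-line samples using the same coefficients that interpolation uses for reading.

```python
import math


def interpolate_read(line, position):
    """Read the delay line at a fractional position (linear interpolation)."""
    i = math.floor(position)
    frac = position - i
    return (1.0 - frac) * line[i] + frac * line[i + 1]


def deinterpolate_write(line, position, value):
    """Add `value` into the delay line at a fractional position.

    The value is spread over the two neighbouring samples with the
    same weights that interpolation would use at that position.
    """
    i = math.floor(position)
    frac = position - i
    line[i] += (1.0 - frac) * value
    line[i + 1] += frac * value


if __name__ == "__main__":
    delay_line = [0.0] * 6
    deinterpolate_write(delay_line, 2.25, 1.0)
    print(delay_line)
    print(interpolate_read([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 2.25))
```

In this sketch the weights written by `deinterpolate_write` sum to the input value, so the signal inserted at a fractional point is preserved in total while its position is encoded in the split between the two samples.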
Interpolation and deinterpolation provide means for creating spatially continuous
digital waveguide models. The simplest examples of such systems are a variable-length
delay line and a variable-length waveguide. These and the more elaborate waveguide
structures are called fractional delay waveguide filters. They are constructed of arbitrarily many digital waveguides that are connected to each other at fractional points. This
approach was applied to the time-domain modeling of acoustic tubes and realization of
two-port and three-port fractional delay junctions. Both FIR and allpass fractional delay
filters were employed for this purpose.
The properties of the FIR and allpass techniques were compared. The results can be
summarized in the following way: FIR structures are conceptually simpler since the
locations of filter taps have a direct relation to the impulse response of the filter and
thus also to the locations of the interpolated input and output points along a digital
waveguide. The use of maximally-flat FIR FD filters or maximally-flat group-delay allpass filters leads to a low-frequency approximation of an ideal tube structure.
Oversampling by a small integral factor is required to obtain a reasonable quality of
approximation. The benefits of allpass filter structures are an identically flat magnitude
response in the FD approximation and a smaller approximation error with a lower-order
filter than that obtainable with an FIR filter structure with a comparable number of
arithmetic operations.

5 Conclusions and Further Research


5.1 Summary and Conclusions
In this thesis, methods have been presented for modeling of acoustic tubes by means of
fractional delay waveguide filters. Chapter 2 has reviewed the methodology of digital
waveguide modeling and related techniques. The analysis/synthesis method for string
instruments presented in Section 2.1 is a new technique. A fresh field of physical
modeling, waveguide simulation of conical acoustic tubes, was also discussed in Section 2.4. This problem has not previously been treated this extensively and may be considered a new area of research. Conical waveguide models find applications both in
musical acoustics, where many wind instruments, such as the saxophone and the bassoon, have a conical bore, and in speech synthesis, where more accurate modeling of
the vocal tract profile is achieved when using conical tube sections instead of cylindrical ones.
The major contributions of this work were presented in Chapters 3 and 4. In Chapter
3, the theory of fractional delays has been studied and several FIR and IIR filter approximations for the fractional delay have been examined. For waveguide modeling the
most useful FD approximations seem to be Lagrange interpolation, which is implemented using an FIR filter, and Thiran interpolation, which is implemented using an
allpass filter. Formerly unpublished results for both of these filters were presented. A
novel implementation technique for time-varying Lagrange interpolation was developed
in Section 3.3.7.
Implementation of time-varying digital filters was studied in Section 3.7. Formerly
published transient elimination techniques for recursive digital filters were reviewed
and a new method was proposed. This technique is computationally more efficient than
the former approaches and can suppress the transient as much as needed.
Deinterpolation, an inverse operation for interpolation, was formulated in Chapter
4. When deinterpolation is used together with interpolation, it is possible to accurately
adjust the length of a digital waveguide or locate a junction of waveguides between the
unit samples of the delay lines. Implementation of interpolation and deinterpolation
using FIR and allpass FD filters was investigated.
Moreover, Chapter 4 tackled the modeling of cylindrical acoustic tubes by means of
FD waveguide methods. A novel model for the human vocal tract was proposed. The
main improvement over the traditional Kelly-Lochbaum speech production model is
that now the length of each tube section is independent of the sample interval.
Furthermore, the length of every tube section can be arbitrarily adjusted, since the junctions of the tube sections are located on the delay line using interpolation and deinterpolation. This also solves the classical problem of how to adjust the length of the KL
model.
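As a rough illustration of why fractional delays are needed, the one-way propagation delay of a tube section can be expressed in samples and split into an integer and a fractional part. The function name, the default sound speed of 343 m/s, and the example dimensions are assumptions made for this sketch, not values taken from the thesis.

```python
import math


def tube_delay_samples(length_m, sample_rate_hz, sound_speed_m_s=343.0):
    """Split the one-way delay of a tube section into integer and fractional samples."""
    total = length_m * sample_rate_hz / sound_speed_m_s
    integer_part = math.floor(total)
    return integer_part, total - integer_part


if __name__ == "__main__":
    # Hypothetical example: a 17 cm section at a 44.1 kHz sample rate.
    print(tube_delay_samples(0.17, 44100.0))
```

Only when the fractional part happens to be zero can the section be represented by unit delays alone; in every other case the remainder has to be realized by interpolation and deinterpolation.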

The FIR filter implementation of an FD junction is perhaps more intuitive than the
allpass filter approach. This is due to the equivalence of the coefficient vector and
impulse response of the FIR filter. In Section 4.3 it was shown how the FD KL junction
can be efficiently implemented by FIR filters when half of the multiplications are combined. Approximation errors due to the FD filters were also analyzed. The
reflection and transmission functions of the FD KL junction were derived and, as an
example, the magnitude response of a two-tube model was studied when first- and third-order Lagrange interpolating filters were used.
Modeling of a side branch at an arbitrary point of a cylindrical tube was also studied
in Section 4.4. The main application considered was simulation of finger holes of
woodwind instruments. Another potential application for this structure is waveguide
modeling of the connection of the human oral and nasal tract.
In Sections 4.6 and 4.7, allpass filter models for fractional delay junctions of two and
three tubes were developed. These structures have not been studied in the DSP literature
before.

5.2 Suggestions for Future Work


This thesis studies the theoretical possibilities of fractional delay waveguide modeling
of the acoustics of wind instruments and the human vocal tract. There are plenty of
practical applications for the results of this study. Future work includes development of
an FD waveguide model of a finger hole in a conical tube, and also application of the
FD techniques to other waveguide algorithms. For example, in waveguide synthesis of
plucked string instruments, the pluck position could be controlled arbitrarily using deinterpolation. Fractional delay filtering techniques can as well be applied to two-dimensional waveguide systems that model a vibrating membrane or three-dimensional systems that can be used, e.g., for room simulation. Digital reverberators, flangers, and
phasers are based on delay lines and thus the utilization of FD techniques could offer
additional possibilities in terms of improved accuracy or exciting new timbres.
The interpolated waveguide tube model can be used for creating a new kind of
articulatory speech synthesizer. This approach seems to be well motivated. For example, Mrayati et al. (1988) have proposed a speech production model where the vocal
tract is divided into segments of different length. The FD waveguide techniques could
be used for implementing such a model.
A fascinating future research project is to implement time-varying fractional delay
waveguide models, especially using allpass filters. Maeda (1977), Strube (1982), and
Liljencrants (1985) have proposed techniques that allow the use of time-varying reflection coefficients in digital waveguide tube models. A new transient elimination method
was introduced in this work, and using it together with formerly known techniques of
time-varying waveguide modeling, it should be possible to realize well-behaved models whose properties can change while they are playing.
Another interesting field of further research is measurement of musical instruments
and estimation of parameters of a digital waveguide model (Välimäki et al., 1995c;
Karjalainen et al., 1995b). Measurements are also needed if the radiational properties of
sound sources are to be incorporated in a digital waveguide model (Huopaniemi et al.,
1994; Karjalainen et al., 1995a).
Design of FIR fractional delay filters can be considered a well-exploited problem.
The same is not true for IIR FD filters; merely a subclass of recursive FD filters, the allpass filters, has been thoroughly studied (Laakso et al., 1994). The author is aware of
only one study on fractional delay approximation using pole-zero filters that are not allpass filters (Tarczynski and Cain, 1994).
For real-time applications, it may be interesting to study fractional delay filters,
both FIR and IIR, that produce a very small total delay. The FIR filters discussed in
this study have a total delay of approximately N/2 samples (where N is the filter order),
since they interpolate between their centermost state variables. Also, the allpass filters
studied here have a large total delay, typically about N samples.
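The N/2 figure for FIR filters can be illustrated with the classical Lagrange interpolation formula h(n) = prod over k != n of (D - k)/(n - k), n = 0, ..., N. The sketch below uses a hypothetical function name and is not the implementation developed in Chapter 3. It shows that a third-order filter interpolating between its two centermost taps has a total delay D of about 1.5 samples.

```python
def lagrange_fd_coefficients(order, delay):
    """FIR coefficients h(0..order) of a Lagrange fractional delay filter.

    `delay` is the total delay in samples; the approximation is best
    when it lies close to order / 2.
    """
    coefficients = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coefficients.append(h)
    return coefficients


if __name__ == "__main__":
    # Third-order filter delaying by 1.5 samples (midway between the center taps).
    print(lagrange_fd_coefficients(3, 1.5))
```

Asking the same filter for a much smaller delay, for example D = 0.2, moves the interpolation point away from the center taps, where Lagrange interpolation is known to be less accurate. This is the trade-off that makes low-delay FD filters a separate design problem.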
Fractional delay filters have started to become popular in several areas of digital signal processing (see the very beginning of Chapter 3 for references). There is no doubt
that many more signal processing problems can be solved more efficiently, or better
than before, when someone realizes that fractional delays are of help. Potential future
research projects in this area are, for example, irrational sampling rate conversion or
time-varying fractional resampling of discrete-time signals using allpass or recursive
FD filters.
The two problems mentioned above are also examples of applications where the
new transient elimination method proposed in this thesis may be employed. Other possible uses can be found in many different kinds of filtering problems where the parameters of a recursive filter need to be changed during operation, such as in equalization of
audio signals.
Nowadays it is not feasible to use very high-order FD filters in real-time applications, such as in waveguide synthesis, at least when only a single digital signal processor is
available. However, as the computing power of signal processing hardware increases it
will become possible to utilize higher-order interpolators and deinterpolators and still
run the application in real time. This implies that the difference between the practical
FD approximation and the ideal fractional delay will be reduced. At the same time it
will become more and more realistic to talk about virtually continuous processing of
discrete-time signals and systems.

References
Adrien, J. M. 1991. The missing link: modal synthesis. In G. De Poli, A. Piccialli, and C. Roads (eds.)
Representations of Musical Signals. Cambridge, Massachusetts, MIT Press, pp. 267-297.
Agulló, J., Cardona, S., and Keefe, D. H. 1994. Time-domain measurements of reflection functions for
discontinuities in wind-instrument air columns. Proc. Stockholm Music Acoustics Conf. (SMAC 93),
Stockholm, Sweden, July 28-August 1, 1993, pp. 465-469.
Agulló, J., Cardona, S., and Keefe, D. H. 1995. Time-domain deconvolution to measure reflection functions for discontinuities in waveguides. J. Acoust. Soc. Am., vol. 97, no. 3, March, pp. 1950-1957.
Aldroubi, A., Unser, M., and Eden, M. 1992. Cardinal spline filters: stability and convergence to the ideal
sinc interpolator. Signal Processing, vol. 28, no. 2, August, pp. 127-138.
Alkhairy, A., Christian, K., and Lim, J. 1991. Design of FIR filters by complex Chebyshev approximation.
Proc. IEEE Int. Conf. Acoust. Speech Signal Processing (ICASSP'91), Toronto, Canada, May 14-17,
1991, vol. 3, pp. 1985-1988.
Armstrong, J., and Strickland, D. 1993. Symbol synchronization using signal samples and interpolation.
IEEE Trans. Communications, vol. 41, no. 2, February, pp. 318-321.
Ayers, R. D., Lowell, J. E., and Mahgerefteh, D. 1985. The conical bore in musical acoustics. Am. J.
Phys., vol. 53, no. 6, June, pp. 528-537.
Benade, A. H. 1960. On the mathematical theory of woodwind finger holes. J. Acoust. Soc. Am., vol. 32,
pp. 1591-1608.
Benade, A. H. 1976. Fundamentals of Musical Acoustics. London, Oxford University Press. 596 p. Also:
1990. New York, Dover Publications.
Beranek, L. L. 1954. Acoustics. Cambridge, Massachusetts, The Acoustical Society of America. 491 p.
Reprinted in 1986.
Bonder, L. J. 1983a. The n-tube formula and some of its consequences. Acustica, vol. 52, no. 4, March,
pp. 216-226.
Bonder, L. J. 1983b. Equivalency of lossless n-tubes. Acustica, vol. 53, no. 4, August, pp. 193-200.
Borin, G., de Poli, G., Puppin, S., and Sarti, A. 1990. Generalizing the physical model timbral class.
Proc. Colloquium on Physical Modeling, Grenoble, France, September 1990.
Bradley, K., Cheng, M., and Stonick, V. L. 1995. Automated analysis and computationally efficient synthesis of acoustic guitar strings and body. Proc. 1995 IEEE ASSP Workshop on Applications of Signal
Processing to Audio and Acoustics, New Paltz, New York, October 15-18, 1995, pages not numbered.
de Bruin, G., and van Walstijn, M. 1995. Physical models of wind instruments: a generalized excitation
coupled with a modular tube simulation platform. J. New Music Research, vol. 24, no. 2, June, pp. 148-163.
Cadoz, C., Luciani, A., and Florens, J. 1984. Responsive input devices and sound synthesis by simulation
of instrumental mechanisms: the CORDIS system. Computer Music J., vol. 8, no. 3, pp. 60-73. Reprinted
in C. Roads (ed.) 1989. The Music Machine. Cambridge, Massachusetts, MIT Press, pp. 495-508.
Cain, G. D., and Yardim, A. 1994. The tunable fractional delay filter: passport to fine-grain delay estimation. Proc. Fourth Cost 229 Workshop on Adaptive Methods and Emergent Techniques for Signal
Processing and Communications, Ljubljana, Slovenia, April 5-7, 1994, pp. 9-24. Also: Electrotechnical
Review, Ljubljana, Slovenia, vol. 61, no. 4, pp. 232-242.
Cain, G. D., Murphy, N. P., and Tarczynski, A. 1994. Evaluation of several FIR fractional-sample delay
filters. Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'94), Adelaide, Australia, April
19-22, 1994, vol. 3, pp. 621-624.
Cain, G. D., Yardim, A., and Henry, P. 1995. Offset windowing for FIR fractional-sample delay. Proc.
IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP'95), Detroit, Michigan, May 9-12,
1995, vol. 2, pp. 1276-1279.
Caussé, R., Kergomard, J., and Lurton, X. 1984. Input impedance of brass musical instruments - comparison between experiment and numerical models. J. Acoust. Soc. Am., vol. 75, no. 1, January, pp. 241-254.
Chaigne, A., Askenfelt, A., and Jansson, E. V. 1990. Temporal synthesis of string instrument tones.
Speech Transmission Laboratory - Quarterly Progress and Status Report (STL-QPSR), no. 4, October-December.
Stockholm, Sweden: Royal Institute of Technology (KTH), pp. 81-100.
Chaigne, A. 1992. On the use of finite difference for musical synthesis - Application to plucked string
instruments. J. d'Acoustique, vol. 5, no. 2, April, pp. 181-211.
Chaigne, A., and Askenfelt, A. 1994. Numerical simulations of piano strings. I. A physical model for a
struck string using finite difference methods. J. Acoust. Soc. Am., vol. 95, no. 2, February, pp. 1112-1118.
Chan, C. F., and Leung, S. H. 1991. A vocoder using high-order LPC filter with very few non-zero coefficients. Proc. Second European Conf. Speech Communications and Technology (EUROSPEECH'91),
Genova, Italy, September 24-26, 1991, vol. 2, pp. 839-842.
Coltman, J. W. 1979. Acoustical analysis of the Boehm flute. J. Acoust. Soc. Am., vol. 65, no. 2, February, pp. 499-506.
Cook, P. R. 1988. Implementation of single reed instruments with arbitrary bore shapes using digital
waveguide filters. Stanford, California, Stanford University, Dept. of Music, CCRMA, Tech. Report no.
STAN-M-50, May 1988. 7 p.
Cook, P. R. 1989. Synthesis of the singing voice using a physically parametrized model of the human
vocal tract. Proc. 1989 Int. Computer Music Conf. (ICMC'89), Columbus, Ohio, October 2-5, 1989, pp. 69-72.
Cook, P. R. 1990. Identification of Control Parameters in an Articulatory Vocal Tract Model, With
Applications to the Synthesis of Singing Voice. Ph.D. dissertation, Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-68, December, 1990. 171 p.
Cook, P. R. 1991. TBone: an interactive waveguide brass instruments synthesis workbench for the NeXT
machine. Proc. 1991 Int. Computer Music Conf. (ICMC'91), Montreal, Canada, October 16-20, 1991,
pp. 297-299.
Cook, P. R. 1992. A meta-wind-instrument physical model, and a meta-controller for real time performance control. Proc. 1992 Int. Computer Music Conf. (ICMC'92), San Jose, California, October 14-18,
1992, pp. 273-276.
Cook, P. R. 1993. SPASM: a real time vocal tract model editor/controller and singer: the companion
software synthesis system. Computer Music J., vol. 17, no. 1, spring, pp. 30-44.
Cook, P. R. 1995. Integration of physical modeling for synthesis and animation. Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3-7, 1995, pp. 525-528.
Crochiere, R. E., and Rabiner, L. R. 1983. Multirate Digital Signal Processing. Englewood Cliffs, New
Jersey, Prentice-Hall, Inc. 411 p.
Cucchi, S., Desinan, F., Parladori, G., and Sicuranza, G. 1991. DSP implementation of arbitrary sampling
frequency conversion for high quality sound application. Proc. 1991 IEEE Int. Conf. Acoust., Speech,
Signal Processing (ICASSP'91), Toronto, Canada, May 14-17, 1991, vol. 5, pp. 3609-3612.
Davis, P. J. 1963. Interpolation and Approximation. New York, Dover. 393 p.
Eckel, G., Iovino, F., and Caussé, R. 1995. Sound synthesis by physical modelling with Modalys. Proc.
Int. Symp. Musical Acoustics (ISMA'95), Dourdan, France, July 2-6, pp. 479-482.
Erup, L., Gardner, F. M., and Harris, R. A. 1993. Interpolation in digital modems - part II: implementation and performance. IEEE Trans. Communications, vol. 41, no. 6, June, pp. 998-1008.
Fant, G. 1960. Acoustic Theory of Speech Production. The Hague, The Netherlands, Mouton & Co.
Farrow, C. W. 1988. A continuously variable digital delay element. Proc. 1988 IEEE Int. Symp. on Circuits and Systems (ISCAS'88), Espoo, Finland, June 7-9, 1988, vol. 3, pp. 2641-2645.
Fettweis, A. 1971. A general theorem for signal-flow networks, with applications. Archiv für Elektronik
und Übertragungstechnik, vol. 25, pp. 557-561. Reprinted in L. R. Rabiner and C. M. Rader (eds.) 1972.
Digital Signal Processing. New York, IEEE Press, pp. 126-130.
Fettweis, A. 1972. Digital filters related to classical structures. Archiv für Elektronik und Übertragungstechnik, vol. 25, February, pp. 79-89.
Fettweis, A. 1986. Wave digital filters: theory and practice. Proc. IEEE, vol. 74, no. 2, February, pp. 270-327.
Flanagan, J. L. 1972. Speech Analysis, Synthesis and Perception. Second, Expanded Edition. Berlin,
Springer-Verlag. 444 p.
Fletcher, N. H., and Rossing, T. D. 1991. The Physics of Musical Instruments. New York, Springer-Verlag. 620 p.
Florens, J.-L., and Cadoz, C. 1991. The physical model: modeling and simulating the instrumental universe. In G. De Poli, A. Piccialli, and C. Roads (eds.) Representations of Musical Signals. Cambridge,
Massachusetts, MIT Press, pp. 227-68.
Frank, W., and Lacroix, A. 1986. Improved vocal tract models for speech synthesis. Proc. 1986 IEEE Int.
Conf. Acoust., Speech, Signal Processing (ICASSP'86), Tokyo, Japan, April 7-11, 1986, vol. 3, pp. 2011-2014.
Gardner, F. M. 1993. Interpolation in digital modems - part I: fundamentals. IEEE Trans. Communications, vol. 41, no. 3, March, pp. 501-507.
Gilbert, J., Kergomard, J., and Pollack, J. D. 1990. On the reflection functions associated with discontinuities in conical bores. J. Acoust. Soc. Am., vol. 87, no. 4, April, pp. 1773-1780.
Gray, A. H. Jr., and Markel, J. D. 1973. Digital ladder and lattice filter synthesis. IEEE Trans. Audio and
Electroacoust., vol. AU-21, no. 6, December, pp. 491-500.
Gray, A. H. Jr., and Markel, J. D. 1975. A normalized digital filter structure. IEEE Trans. Acoust.,
Speech, Signal Processing, vol. ASSP-23, no. 3, June, pp. 268-277.
Gradshteyn, I. S., and Ryzhik, I. M. 1994. Tables of Integrals, Series, and Products. Fifth Edition. San
Diego, California, Academic Press.
Hamming, R. W. 1983. Digital Filters. Second Edition. Englewood Cliffs, New Jersey, Prentice-Hall, pp. 244-245.
Harris, F. J. 1978. On the use of windows for harmonic analysis with the discrete Fourier transform.
Proc. IEEE, vol. 66, no. 1, January, pp. 51-83.
Hermanowicz, E. 1992. Explicit formulas for weighting coefficients of maximally flat tunable FIR
delayers. Electronics Letters, vol. 28, no. 20, 24th September, pp. 1936-1937.
Hildebrand, F. B. 1974. Introduction to Numerical Analysis. Second Edition. New York, McGraw-Hill.
Also: 1987. New York, Dover. 669 p.
Hiller, L., and Ruiz, P. 1971. Synthesizing musical sounds by solving the wave equation for vibrating
objects: parts I and II. J. Audio Eng. Soc., vol. 19, no. 6, pp. 462-470 and vol. 19, no. 7, pp. 542-550.
Hirschman, S. E. 1991. Digital Waveguide Modeling and Simulation of Reed Woodwind Instruments.
Engineering thesis. Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no.
STAN-M-72, July 1991. 253 p.
Hou, H. S., and Andrews, H. C. 1978. Cubic splines for image interpolation and digital filtering. IEEE
Trans. Acoust., Speech, Signal Processing, vol. 26, no. 6, November, pp. 508-517.
Huopaniemi, J., Karjalainen, M., Välimäki, V., and Huotilainen, T. 1994. Virtual instruments in virtual
rooms - a real-time binaural room simulation environment for physical models of musical instruments.
Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12-17, 1994, pp. 455-462.
Jackson, L. B. 1989. Digital Filters and Signal Processing. Second Edition. Norwell, Massachusetts,
Kluwer Academic Publishers. 410 p.
Jaffe, D., and Smith, J. O. 1983. Extensions of the Karplus-Strong plucked string algorithm. Computer
Music J., vol. 7, no. 2, summer, pp. 56-69. Reprinted in C. Roads (ed.) 1989. The Music Machine.
Cambridge, Massachusetts, MIT Press, pp. 481-494.
Jaffe, D. 1995. Ten criteria for evaluating synthesis techniques. Computer Music J., vol. 19, no. 1, spring,
pp. 76-87.
Jánosy, Z., Karjalainen, M., and Välimäki, V. 1994. Intelligent synthesis control with applications to a
physical model of the acoustic guitar. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus,
Denmark, September 12-17, 1994, pp. 402-406.
Jerri, A. J. 1977. The Shannon sampling theorem - its various extensions and applications: a tutorial
review. Proc. IEEE, vol. 65, no. 11, November, pp. 1565-1596.
Jot, J.-M., and Chaigne, A. 1991. Digital delay networks for designing artificial reverberators. Presented
at the 90th AES Convention, Paris, France, February 19-22, 1991, preprint no. 3030.
Jot, J.-M., Larcher, V., and Warusfel, O. 1995. Digital signal processing issues in the context of binaural
and transaural stereophony. Presented at the 98th AES Convention, Paris, France, February 25-28, 1995,
preprint no. 3980.
Jury, E. I. 1964. Theory and Application of the z-Transform Method. New York, Wiley, p. 298.
Kabasawa, Y., Ishizaka, K., and Arai, Y. 1983. Simplified digital model of the lossy vocal tract and vocal
cords. Proc. 11th Int. Congr. Acoust., Paris, France, July 19-27, 1983, vol. 4, pp. 175-178.
Karam, L. J., and McClellan, J. H. 1995. Complex Chebyshev approximation for FIR filter design. IEEE
Trans. Circuits Syst. - II: Analog and Digital Signal Processing, vol. 42, no. 3, March, pp. 207-216.
Karjalainen, M., and Laine, U. K. 1991. A model for real-time sound synthesis of guitar on a floating-point signal processor. Proc. 1991 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'91),
Toronto, Canada, May 14-17, 1991, vol. 5, pp. 3653-3656.
Karjalainen, M., Laine, U. K., Laakso, T. I., and Välimäki, V. 1991. Transmission-line modeling and
real-time synthesis of string and wind instruments. Proc. 1991 Int. Computer Music Conf. (ICMC'91),
Montreal, Canada, October 16-20, 1991, pp. 293-296.
Karjalainen, M., and Välimäki, V. 1993. Model-based analysis/synthesis of the acoustic guitar. Proc.
Stockholm Music Acoustics Conf. (SMAC 93), Stockholm, Sweden, July 28-August 1, 1993, pp. 443-447.
Karjalainen, M., Välimäki, V., and Jánosy, Z. 1993. Towards high-quality synthesis of the guitar and string instruments. Proc. 1993 Int. Computer Music Conf. (ICMC'93), Tokyo, Japan, September 10–15, 1993, pp. 56–63.
Karjalainen, M., Huopaniemi, J., and Välimäki, V. 1995a. Direction-dependent physical modeling of musical instruments. Proc. 15th Int. Congr. Acoustics (ICA'95), Trondheim, Norway, June 26–30, 1995, vol. 3, pp. 451–454.
Karjalainen, M., Välimäki, V., Hernoux, B., and Huopaniemi, J. 1995b. Explorations of wind instruments using digital signal processing and physical modeling techniques. Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3–7, 1995, pp. 509–516. A revised version will be published in the Journal of New Music Research, vol. 24, no. 4, December 1995.
Karplus, K., and Strong, A. 1983. Digital synthesis of plucked-string and drum timbres. Computer Music J., vol. 7, no. 1, summer, pp. 43–55. Reprinted in Curtis Roads (ed.) 1989. The Music Machine. Cambridge, Massachusetts, MIT Press, pp. 467–479.
Keefe, D. H. 1982. Theory of the single woodwind tone hole. J. Acoust. Soc. Am., vol. 72, no. 3, September, pp. 676–687.
Keefe, D. H. 1990. Woodwind air column models. J. Acoust. Soc. Am., vol. 88, no. 1, July, pp. 35–51.
Kelly, J. L. Jr., and Lochbaum, C. C. 1962. Speech synthesis. Proc. Fourth Int. Congr. Acoust., Copenhagen, Denmark, September 1962, Paper G42, pp. 1–4.
Kinsler, L. E., and Frey, A. R. 1950. Fundamentals of Acoustics. New York, John Wiley & Sons. 516 p.
Ko, C. C., and Lim, Y. C. 1988. Approximation of a variable-length delay line by using tapped delay line processing. Signal Processing, vol. 14, no. 4, June, pp. 363–369.
Kootsookos, P. J., and Williamson, R. C. 1995. FIR approximation of fractional sample delay systems. To be published in the IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing.
Kreyszig, E. 1988. Advanced Engineering Mathematics. Sixth Edition. New York, John Wiley & Sons. 1294 p. and appendices.
Kroon, P., and Atal, B. S. 1991. On the use of pitch predictors with high temporal resolution. IEEE Trans. Signal Processing, vol. 39, no. 3, March, pp. 733–735.
Kuisma, T. M. 1994. Computational Modelling of the Vocal Tract and Speech Synthesis by Using Fractional Delay Filters. M. Sc. thesis (in Finnish). Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, June 14, 1994. 75 p.
Laakso, T. I., Välimäki, V., Karjalainen, M., and Laine, U. K. 1992. Real-time implementation techniques for a continuously variable digital delay in modeling musical instruments. Proc. 1992 Int. Computer Music Conf. (ICMC'92), San Jose, California, October 14–18, 1992, pp. 140–141.
Laakso, T. I., Välimäki, V., Karjalainen, M., and Laine, U. K. 1994. Crushing the Delay – Tools for Fractional Delay Filter Design. Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, Report no. 35, October 1994. 46 p. and figures. A revised version entitled "Splitting the unit delay – tools for fractional delay filter design" will be published in the IEEE Signal Processing Magazine, vol. 13, no. 1, January 1996.
Laakso, T. I., Välimäki, V., and Henriksson, J. 1995a. Tunable downsampling using fractional delay filters with applications to digital TV transmission. Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP'95), Detroit, Michigan, May 9–12, 1995, vol. 2, pp. 1304–1307.
Laakso, T. I., Saramäki, T., and Cain, G. D. 1995b. Asymmetric Dolph-Chebyshev, Saramäki, and transitional windows for fractional delay FIR filter design. Proc. 38th Midwest Symposium on Circuits and Systems (MWSCAS'95), Rio de Janeiro, Brazil, August 13–16, 1995. To be published in December 1995.


Lacroix, A., and Makai, B. 1979. A novel vocoder concept based on discrete time acoustic tubes. Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP'79), Washington, DC, April 2–4, 1979, pp. 73–76.
Laine, U. K. 1982. Modelling of lip radiation impedance in z-domain. Proc. 1982 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'82), Paris, France, May 3–5, 1982, vol. 3, pp. 1992–1995. Reprinted in Laine (1989).
Laine, U. K. 1988. Digital modelling of a variable-length acoustic tube. Proc. 1988 Nordic Acoustical Meeting (NAM'88), Tampere, Finland, June 15–17, 1988, pp. 165–168. Reprinted in Laine (1989).
Laine, U. K. 1989. Studies on Modelling of Vocal Tract Acoustics with Applications to Speech Synthesis. Doctoral thesis. Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, Report no. 32, March 1989.
Lang, M., and Laakso, T. I. 1994. Simple and robust method for the design of allpass filters using least-squares phase error criterion. IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 41, no. 1, January, pp. 40–48.
Laroche, J., and Meillier, J.-L. 1994. Multi-channel excitation/filter modeling of percussive sounds with application to the piano. IEEE Trans. Speech and Audio Processing, vol. 2, no. 2, April, pp. 329–344.
Legge, K. A., and Fletcher, N. H. 1984. Nonlinear generation of missing modes on a vibrating string. J. Acoust. Soc. Am., vol. 76, no. 1, July, pp. 5–12.
Liljencrants, J. 1985. Speech Synthesis with a Reflection-Type Line Analog. Doctoral dissertation. Dept. of Speech Communication and Music Acoustics, Royal Institute of Technology, Stockholm, Sweden, August 1985.
Liu, G.-S., and Wei, C.-H. 1990. Programmable fractional sample delay filter with Lagrange interpolation. Electronics Letters, vol. 26, no. 19, 13th September, pp. 1608–1610.
Liu, G.-S., and Wei, C.-H. 1992. A new variable fractional sample delay filter with nonlinear interpolation. IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 39, no. 2, February, pp. 123–126.
Maeda, S. 1977. On a simulation method of dynamically varying vocal tract: reconsideration of the Kelly–Lochbaum model. Proc. Symp. Articulatory Modeling and Phonetics, Grenoble, France, July 10–12, 1977, pp. 281–288.
Maeda, S. 1982. A digital simulation method of the vocal-tract system. Speech Communication, vol. 1, no. 3–4, December, pp. 199–229.
Marques, J. S., Tribolet, J. M., Trancoso, I. M., and Almeida, L. B. 1989. Pitch prediction with fractional delays in CELP coding. Proc. European Conf. Speech Communications and Technology (EUROSPEECH'89), Paris, France, September 1989, vol. 2, pp. 509–512.
Marques, J. S., Trancoso, I. M., Tribolet, J. M., and Almeida, L. B. 1990. Improved pitch prediction with fractional delays in CELP coding. Proc. 1990 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'90), Albuquerque, New Mexico, April 3–6, 1990, vol. 2, pp. 665–668.
Martínez, J., and Agulló, J. 1988. Conical bores. Part I: reflection functions associated with discontinuities. J. Acoust. Soc. Am., vol. 84, no. 5, November, pp. 1613–1619.
Martínez, J., Agulló, J., and Cardona, S. 1988. Conical bores. Part II: multiconvolution. J. Acoust. Soc. Am., vol. 84, no. 5, November, pp. 1620–1627.
Matsui, K., Pearson, S. D., Hata, K., and Kamai, T. 1991. Improving naturalness of text-to-speech synthesis using natural glottal source. Proc. 1991 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'91), Toronto, Canada, May 25, 1991, vol. 2, pp. 769–772. Also: Research Report no. 3, Speech Technology Laboratory, Santa Barbara, California, September 1992, pp. 21–27.


McIntyre, M. E., and Woodhouse, J. 1979. On the fundamentals of bowed string dynamics. Acustica, vol. 43, no. X, pp. 83–108.
McIntyre, M. E., Schumacher, R. T., and Woodhouse, J. 1983. On the oscillations of musical instruments. J. Acoust. Soc. Am., vol. 74, no. 5, November, pp. 1325–1345.
Medan, Y. 1991. Using super resolution pitch in waveform speech coders. Proc. 1991 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'91), Toronto, Canada, May 25, 1991, vol. 1, pp. 633–636.
Meyer, P., Wilhelms, R., and Strube, H. W. 1989. A quasiarticulatory speech synthesizer for the German language running in real time. J. Acoust. Soc. Am., vol. 86, no. 2, August, pp. 523–539.
Minocha, S., Dutta Roy, S. C., and Kumar, B. 1993. A note on the FIR approximation of a fractional sample delay. Int. J. Circuit Theory and Appl., vol. 21, no. 3, May–June, pp. 265–274.
Mitra, S. K., and Hirano, K. 1974. Digital all-pass networks. IEEE Trans. Circuits and Systems, vol. CAS-21, no. 5, September, pp. 688–700.
Moorer, J. A. 1979. About this reverberation business. Computer Music J., vol. 3, no. 2, pp. 13–28. Reprinted in C. Roads and J. Strawn (Eds.) 1985. Foundations of Computer Music, Cambridge, Massachusetts, MIT Press, pp. 605–639.
Morse, P. M., and Ingard, U. K. 1968. Theoretical Acoustics. Princeton, New Jersey, Princeton University Press. 927 p.
Mourjopoulos, J. N., Kyriakis-Bitzaros, E. D., and Goutis, C. E. 1990. Theory and real-time implementation of time-varying digital audio filters. J. Audio Eng. Soc., vol. 38, no. 7/8, July/August, pp. 523–536.
Mrayati, M., Carre, R., and Guerin, B. 1988. Distinctive regions and modes: a new theory of speech production. Speech Communication, vol. 7, no. 3, pp. 257–286.
Nikolaidis, S. S., Mourjopoulos, J. N., and Goutis, C. E. 1993. A dedicated processor for time-varying digital audio filters. IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 40, no. 7, July, pp. 452–455.
Oetken, G., Parks, T. W., and Schüßler, H. W. 1975. New results in the design of digital interpolators. IEEE Trans. Acoust., Speech, Signal Processing, vol. 23, no. 3, June, pp. 301–309.
Oetken, G. 1979. A new approach for the design of digital interpolating filters. IEEE Trans. Acoust., Speech, Signal Processing, vol. 27, no. 6, December, pp. 637–643.
Oppenheim, A. V., and Schafer, R. W. 1975. Digital Signal Processing. Englewood Cliffs, New Jersey, Prentice-Hall, 585 p.
Parks, T. W., and Burrus, C. S. 1987. Digital Filter Design. New York, John Wiley & Sons. 342 p.
Preuss, K. 1989. On the design of FIR filters by complex approximation. IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, no. 5, May, pp. 702–712.
Proakis, J. G., and Manolakis, D. G. 1992. Digital Signal Processing: Principles, Algorithms, and Applications (Second Edition). New York, Macmillan. 969 p.
Pyfer, M. F., and Ansari, R. 1987. The design and application of optimal FIR fractional-slope phase filters. Proc. 1987 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'87), Dallas, Texas, April 6–9, 1987, vol. 2, pp. 896–899.
Rabenstein, R. 1988. Minimization of transient signals in recursive time-varying digital filters. Circuits, Systems, and Signal Processing, vol. 7, no. 3, pp. 345–359.
Rabiner, L. R. 1977. On the use of autocorrelation analysis for pitch detection. IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 25, no. 1, February, pp. 24–33.


Rabiner, L. R., and Schafer, R. W. 1978. Digital Processing of Speech Signals. Englewood Cliffs, New Jersey, Prentice-Hall, 512 p.
Regalia, P. A. 1993. Special filter designs. In Sanjit K. Mitra and James F. Kaiser (eds.). Handbook for Digital Signal Processing. New York, John Wiley & Sons. pp. 907–980.
Roberts, R. A., and Mullis, C. T. 1987. Digital Signal Processing. Reading, Massachusetts, Addison-Wesley. 578 p.
Rocchesso, D. 1993. Multiple feedback delay networks for sound processing. Proc. X Colloquio di Informatica Musicale, Milan, Italy, December 2–4, 1993, pp. 202–209.
Rocchesso, D., and Turra, F. 1993. A real time clarinet model on Mars workstation. Proc. X Colloquio di Informatica Musicale, Milan, Italy, December 2–4, 1993, pp. 210–213.
Rocchesso, D., and Smith, J. O. 1994. Circulant feedback delay networks for sound synthesis and processing. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12–17, 1994, pp. 378–381.
Rodet, X. 1984. Time domain formant-wave-function synthesis. Computer Music J., vol. 8, no. 3, fall, pp. 15–31.
Rodet, X., Potard, Y., and Barrière, J.-B. 1984. The Chant project: from synthesis of the singing voice to synthesis in general. Computer Music J., vol. 8, no. 3, fall, pp. 15–31. Reprinted in C. Roads (ed.) 1989. The Music Machine. Cambridge, Massachusetts, MIT Press, pp. 449–466.
Savioja, L., Rinne, T. J., and Takala, T. 1994. Simulation of room acoustics with a 3-D finite difference mesh. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12–17, 1994, pp. 463–466.
Schafer, R. W., and Rabiner, L. R. 1973. A digital signal processing approach to interpolation. Proc. IEEE, vol. 61, no. 6, June, pp. 692–702.
Schroeder, M. R. 1962. Natural sounding artificial reverberation. J. Audio Eng. Soc., vol. 10, no. 3, July, pp. 219–223.
Schroeter, J., and Sondhi, M. M. 1994. Techniques for estimating vocal-tract shapes from the speech signal. IEEE Trans. Speech and Audio Processing, vol. 2, no. 1, January, pp. 133–150.
Schulist, M. 1990. Improvements of a complex FIR filter design algorithm. Signal Processing, vol. 20, no. 1, May, pp. 81–90.
Serra, X. 1989. A System for Sound Analysis/Transform/Synthesis Based on a Deterministic plus Stochastic Decomposition. Ph.D. dissertation. Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-58, October 1989. 151 p.
Shannon, C. E. 1949. Communication in the presence of noise. Proc. IRE, vol. 37, no. 1, January, pp. 10–21.
Sivanand, S., Yang, J. F., and Kaveh, M. 1991. Focusing filters for wide-band direction finding. IEEE Trans. Signal Processing, vol. 39, no. 2, February, pp. 437–445.
Sivanand, S. 1992. Variable-delay implementation using digital FIR filters. Proc. 1992 Int. Conf. Signal Processing Theory and Applications (ICSPAT'92), Boston, Massachusetts, November 2–5, 1992, pp. 1238–1243.
Smith, J. O. 1983. Techniques for Digital Filter Design and System Identification with Application to the Violin. Ph.D. dissertation. Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-14, June 1983. 260 p.


Smith, J. O., and Gossett, P. 1984. A flexible sampling-rate conversion method. Proc. 1984 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'84), San Diego, California, March 19–21, 1984, vol. 2, pp. 19.4.1–4.
Smith, J. O., and Friedlander, B. 1985. Adaptive interpolated time-delay estimation. IEEE Trans. Aerospace and Electronic Systems, vol. 21, no. 3, March, pp. 180–199.
Smith, J. O. 1985. A new approach to digital reverberation using closed waveguide networks. Proc. 1985 Int. Computer Music Conf. (ICMC'85), Vancouver, Canada, August 19–22, 1985, pp. 47–53. Also: Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-31, July 1985. 7 p. Reprinted in Smith (1987).
Smith, J. O. 1986. Efficient simulation of the reed-bore and bow-string mechanisms. Proc. 1986 Int. Computer Music Conf. (ICMC'86), The Hague, The Netherlands, October 20–24, 1986, pp. 275–280. Reprinted in Smith (1987).
Smith, J. O. 1987. Music Applications of Digital Waveguides. Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-39, May 27, 1987. 181 p.
Smith, J. O., and Serra, X. 1987. PARSHL: An analysis/synthesis program for non-harmonic sounds based on a sinusoidal representation. Proc. 1987 Int. Computer Music Conf. (ICMC'87), Urbana, Illinois, August 1987, pp. 290–297.
Smith, J. O. 1990. Efficient Yet Accurate Models for Strings and Air Columns using Sparse Lumping of Distributed Losses and Dispersion. Stanford, California, Stanford University, Dept. of Music, CCRMA Tech. Report no. STAN-M-67, December 1990. 15 p.
Smith, J. O. 1991. Waveguide simulation of non-cylindrical acoustic tubes. Proc. 1991 Int. Computer Music Conf. (ICMC'91), Montreal, Canada, October 16–20, 1991, pp. 304–307.
Smith, J. O. 1992a. Bandlimited interpolation – introduction and algorithm. Unpublished manuscript, November 5, 1992.
Smith, J. O. 1992b. Physical modeling using digital waveguides. Computer Music J., vol. 16, no. 4, winter, pp. 74–87.
Smith, J. O. 1993a. Efficient synthesis of stringed musical instruments. Proc. 1993 Int. Computer Music Conf. (ICMC'93), Tokyo, Japan, September 10–15, 1993, pp. 64–71.
Smith, J. O. 1993b. Structured sampling synthesis – automated construction of physical modeling synthesis parameters and tables from recorded sounds. CCRMA Associates Presentation, Stanford University, Stanford, California, May 1993.
Smith, J. O., and Rocchesso, D. 1994. Connections between feedback delay networks and waveguide networks for digital reverberation. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12–17, 1994, pp. 376–377.
Smith, J. O., and Van Duyne, S. A. 1995. Commuted piano synthesis. Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3–7, 1995, pp. 335–342.
Smith, J. O. 1995a. Discrete-time modeling of acoustic systems. Manuscript in preparation. Preliminary version published for the CCRMA Associates Conference, Stanford University, Stanford, California, May 1995. 110 p.
Smith, J. O. 1995b. Physical modeling synthesis update. Unpublished manuscript. To be published in Computer Music J., vol. 20, no. 2, summer 1996.
Sondhi, M. M. 1983. An improved vocal tract model. Proc. 11th Int. Congr. Acoust., Paris, France, July 19–27, 1983, vol. 4, pp. 167–170.
Sondhi, M. M., and Schroeter, J. 1987. A hybrid time-frequency domain articulatory speech synthesizer. IEEE Trans. Acoust., Speech, Signal Processing, vol. 35, no. 7, July, pp. 955–966.


Strube, H. W. 1975. Sampled-data representation of a nonuniform lossless tube of continuously variable length. J. Acoust. Soc. Am., vol. 57, no. 1, January, pp. 256–257.
Strube, H. W. 1982. Time-varying wave digital filters for modeling analog systems. IEEE Trans. Acoust., Speech, Signal Processing, vol. 30, no. 6, December, pp. 864–868.
Sullivan, C. S. 1990. Extending the Karplus–Strong algorithm to synthesize electric guitar timbres with distortion and feedback. Computer Music J., vol. 14, no. 3, fall, pp. 26–37.
Tarczynski, A., and Cain, G. D. 1994. Design of IIR fractional-sample delay filters. Proc. Second Int. Symp. DSP for Communications Systems (DSPCS'94), Adelaide, Australia, April 26–29, 1994, pages not numbered.
Tarczynski, A., Kozinski, W., and Cain, G. D. 1994. Sampling rate conversion using fractional-sample delay. Proc. 1994 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'94), Adelaide, Australia, April 19–22, 1994, vol. 3, pp. 285–288.
Thiran, J.-P. 1971. Recursive digital filters with maximally flat group delay. IEEE Trans. Circuit Theory, vol. 18, no. 6, November, pp. 659–664.
Unser, M., Aldroubi, A., and Eden, M. 1992. Polynomial spline signal approximations: filter design and asymptotic equivalence with Shannon's sampling theorem. IEEE Trans. Information Theory, vol. 38, no. 1, January, pp. 95–103.
Unser, M., Aldroubi, A., and Eden, M. 1993a. B-spline signal processing: Part I – theory. IEEE Trans. Signal Processing, vol. 41, no. 2, February, pp. 821–833.
Unser, M., Aldroubi, A., and Eden, M. 1993b. B-spline signal processing: Part II – efficient design and applications. IEEE Trans. Signal Processing, vol. 41, no. 2, February, pp. 834–848.
Välimäki, V., Karjalainen, M., Jánosy, Z., and Laine, U. K. 1992a. A real-time DSP implementation of a flute model. Proc. 1992 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'92), San Francisco, California, March 23–26, 1992, vol. 2, pp. 249–252.
Välimäki, V., Laakso, T. I., Karjalainen, M., and Laine, U. K. 1992b. A new computational model for the clarinet. Proc. 1992 Int. Computer Music Conf. (ICMC'92), San Jose, California, October 14–18, 1992, Addendum (pages not numbered).
Välimäki, V. 1992. Sound Synthesis of the Flute by Means of a Computational Model. M. Sc. thesis (in Finnish). Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, December 1992. 109 p.
Välimäki, V., Karjalainen, M., and Laakso, T. I. 1993a. Fractional delay digital filters. Proc. 1993 IEEE Int. Symp. on Circuits and Systems (ISCAS'93), Chicago, Illinois, May 3–6, 1993, vol. 1, pp. 355–358.
Välimäki, V., Karjalainen, M., and Laakso, T. I. 1993b. Modeling of woodwind bores with finger holes. Proc. 1993 Int. Computer Music Conf. (ICMC'93), Tokyo, Japan, September 10–15, 1993, pp. 32–39.
Välimäki, V., Karjalainen, M., and Kuisma, T. 1994a. Articulatory control of a vocal tract model based on fractional delay waveguide filters. Proc. 1994 Int. Symp. Speech, Image Processing and Neural Networks (ISSIPNN'94), Hong Kong, April 13–16, 1994, vol. 2, pp. 571–574.
Välimäki, V., Karjalainen, M., and Kuisma, T. 1994b. Articulatory speech synthesis based on fractional delay waveguide filters. Proc. 1994 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'94), Adelaide, Australia, April 19–22, 1994, vol. 1, pp. 585–588.
Välimäki, V. 1994a. New structures for fractional delay filtering. Proc. Tampere University of Technology Symp. Signal Processing 94, Tampere, Finland, May 20, 1994, pp. 104–107.
Välimäki, V. 1994b. Fractional Delay Waveguide Modeling of Acoustic Tubes. Espoo, Finland, Helsinki University of Technology, Faculty of Electrical Engineering, Laboratory of Acoustics and Audio Signal Processing, Report no. 34, July 1994. 133 p.


Välimäki, V., and Karjalainen, M. 1994a. Digital waveguide modeling of wind instrument bores constructed of truncated cones. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12–17, 1994, pp. 423–430.
Välimäki, V., and Karjalainen, M. 1994b. Improving the Kelly–Lochbaum vocal tract model using conical tube sections and fractional delay filtering techniques. Proc. 1994 Int. Conf. Spoken Language Processing (ICSLP'94), Yokohama, Japan, September 18–22, 1994, vol. 2, pp. 615–618.
Välimäki, V., Huopaniemi, J., Karjalainen, M., and Jánosy, Z. 1995a. Physical modeling of plucked string instruments with application to real-time sound synthesis. Presented at the 98th AES Convention, Paris, France, February 25–28, 1995, preprint no. 3956, 52 p. Submitted to the Journal of the Audio Engineering Society.
Välimäki, V. 1995a. A new filter implementation strategy for Lagrange interpolation. Proc. IEEE Int. Symp. Circuits and Systems (ISCAS'95), Seattle, Washington, April 29–May 3, 1995, vol. 1, pp. 361–364.
Välimäki, V., and Karjalainen, M. 1995. Implementation of fractional delay waveguide models using allpass filters. Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP'95), Detroit, Michigan, May 9–12, 1995, vol. 2, pp. 1524–1527.
Välimäki, V. 1995b. Fractional delay waveguide filters for modeling acoustic tubes. Proc. Second Int. Conf. Acoustics and Music Research (CIARM'95), Ferrara, Italy, May 19–21, 1995, pp. 41–46.
Välimäki, V., Laakso, T. I., and Mackenzie, J. 1995b. Time-varying fractional delay filters. Proc. Finnish Signal Processing Symp. (FINSIG'95), Espoo, Finland, June 2, 1995, pp. 118–122.
Välimäki, V., Karjalainen, M., and Huopaniemi, J. 1995c. Measurement, estimation, and modeling of wind instruments using DSP techniques. Proc. 15th Int. Congr. Acoustics (ICA'95), Trondheim, Norway, June 26–30, 1995, vol. 3, pp. 513–516.
Välimäki, V., Laakso, T. I., and Mackenzie, J. 1995d. Elimination of transients in time-varying allpass fractional delay filters with application to digital waveguide modeling. Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3–7, 1995, pp. 327–334.
Välimäki, V., and Takala, T. 1995. Virtual musical instruments – natural sound with physical models. Submitted to Organised Sound, vol. 1, no. 1, April 1996 (special issue on sounds and sources).
Van Duyne, S. A., and Smith, J. O. 1993. Physical modeling with the 2-D digital waveguide mesh. Proc. 1993 Int. Computer Music Conf. (ICMC'93), Tokyo, Japan, September 10–15, 1993, pp. 40–47.
Van Duyne, S. A., and Smith, J. O. 1994. A simplified approach to modeling dispersion caused by stiffness in strings and plates. Proc. 1994 Int. Computer Music Conf. (ICMC'94), Aarhus, Denmark, September 12–17, 1994, pp. 407–410.
Van Duyne, S. A., and Smith, J. O. 1995. Developments for the commuted piano. Proc. 1995 Int. Computer Music Conf. (ICMC'95), Banff, Canada, September 3–7, 1995, pp. 319–326.
van Walstijn, M., and de Bruin, G. 1995. Conical waveguide filters. Proc. Second Int. Conf. Acoustics and Music Research (CIARM'95), Ferrara, Italy, May 19–21, 1995, pp. 47–54.
Verhelst, W., and Nilens, P. 1986. A modified-superposition speech synthesizer and its applications. Proc. 1986 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'86), Tokyo, Japan, April 7–11, 1986, vol. 3, pp. 2007–2010.
Wright, G. T. H., and Owens, F. J. 1993. An optimized multirate sampling technique for the dynamic variation of the vocal tract length in the Kelly–Lochbaum speech synthesis model. IEEE Trans. Speech and Audio Processing, vol. 1, no. 1, January, pp. 109–113.
Wu, H. Y., Badin, P., Cheng, Y. M., and Guerin, B. 1987a. Vocal tract simulation: implementation of continuous variations of the length in a Kelly–Lochbaum model, effects of area function spatial sampling. Proc. 1987 IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP'87), Dallas, Texas, April 6–9, 1987, vol. 1, pp. 9–12.


Wu, H. Y., Badin, P., Cheng, Y. M., and Guerin, B. 1987b. Continuous variation of the vocal tract length in a Kelly–Lochbaum type speech production model. Proc. Eleventh Int. Congr. Phonetic Sciences (XIth ICPhS), Tallinn, Estonia, August 1–7, 1987, vol. 2, pp. 340–343.
Zetterberg, L. H., and Zhang, Q. 1988. Elimination of transients in adaptive filters with application to speech coding. Signal Processing, vol. 15, no. 4, December, pp. 419–428.
Zölzer, U., Redmer, B., and Bucholtz, J. 1993. Strategies for switching digital audio filters. Presented at the 95th AES Convention, New York, October 7–10, 1993, preprint no. 3714, 6 p. and 15 figures.
Zölzer, U., and Boltze, T. 1994. Interpolation algorithms: theory and application. Presented at the 97th AES Convention, San Francisco, California, November 10–13, 1994, preprint no. 3898, 22 p.

Appendix A
Bandlimited Squared Integral Error of FIR FD Filters
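The least squared (LS) error function $E_{LS}$ is the $L_2$ norm (integrated squared magnitude) of the error frequency response $E(e^{j\omega})$. In the following, we derive the bandlimited version of the error function $E_{LS}$, that is

$$E_{LS} = \frac{1}{\pi}\int_0^{\alpha\pi} \left|E(e^{j\omega})\right|^2 d\omega = \frac{1}{\pi}\int_0^{\alpha\pi} \left|H(e^{j\omega}) - H_{id}(e^{j\omega})\right|^2 d\omega \tag{A.1}$$

where $0 < \alpha \le 1$. The error function can be rewritten using vectors and matrices, i.e.

$$\begin{aligned}
E_{LS} &= \frac{1}{\pi}\int_0^{\alpha\pi} \left[\mathbf{h}^T\mathbf{e} - H_{id}(e^{j\omega})\right]\left[\mathbf{h}^T\mathbf{e} - H_{id}(e^{j\omega})\right]^* d\omega \\
&= \frac{1}{\pi}\int_0^{\alpha\pi} \left[\mathbf{h}^T\mathbf{C}\mathbf{h} - 2\mathbf{h}^T\,\mathrm{Re}\{H_{id}(e^{j\omega})\,\mathbf{e}^*\} + \left|H_{id}(e^{j\omega})\right|^2\right] d\omega
\end{aligned} \tag{A.2}$$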

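Here $\mathbf{h} = [h(0)\ h(1)\ \cdots\ h(N)]^T$ is the coefficient vector of the FIR filter and $\mathbf{e}$ is

$$\mathbf{e} = \left[1\ \ e^{-j\omega}\ \cdots\ e^{-jN\omega}\right]^T \tag{A.3}$$

and we have defined the matrix $\mathbf{C}$ as

$$\mathbf{C} = \mathrm{Re}\{\mathbf{e}\mathbf{e}^H\} = \begin{bmatrix}
1 & \cos(\omega) & \cdots & \cos(N\omega) \\
\cos(\omega) & 1 & \cdots & \cos[(N-1)\omega] \\
\vdots & & \ddots & \vdots \\
\cos(N\omega) & \cos[(N-1)\omega] & \cdots & 1
\end{bmatrix} \tag{A.4}$$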

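The squared error can be expressed as

$$E_{LS} = \mathbf{h}^T\mathbf{P}\mathbf{h} - 2\mathbf{h}^T\mathbf{p}_1 + p_0 \tag{A.5a}$$

where we have used the following matrices and vectors

$$\mathbf{P} = \frac{1}{\pi}\int_0^{\alpha\pi} \mathbf{C}\, d\omega \tag{A.5b}$$

$$\mathbf{p}_1 = \frac{1}{\pi}\int_0^{\alpha\pi} \left[\mathrm{Re}\{H_{id}(e^{j\omega})\}\,\mathbf{c} - \mathrm{Im}\{H_{id}(e^{j\omega})\}\,\mathbf{s}\right] d\omega \tag{A.5c}$$

$$\mathbf{c} = [1\ \ \cos(\omega)\ \cdots\ \cos(N\omega)]^T \tag{A.5d}$$

$$\mathbf{s} = [0\ \ \sin(\omega)\ \cdots\ \sin(N\omega)]^T \tag{A.5e}$$

$$p_0 = \frac{1}{\pi}\int_0^{\alpha\pi} \left|H_{id}(e^{j\omega})\right|^2 d\omega = \frac{1}{\pi}\int_0^{\alpha\pi} d\omega = \alpha \tag{A.5f}$$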

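The elements of matrix $\mathbf{P}$ can be elaborated as

$$P_{k,l} = \frac{1}{\pi}\int_0^{\alpha\pi} C_{k,l}\, d\omega = \frac{1}{\pi}\int_0^{\alpha\pi} \cos[(k-l)\omega]\, d\omega = \frac{\sin[(k-l)\alpha\pi]}{(k-l)\pi} = \alpha\,\mathrm{sinc}[(k-l)\alpha] \tag{A.6}$$

for $k, l = 1, 2, \ldots, N+1$, where $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$. Also, with the ideal response $H_{id}(e^{j\omega}) = e^{-j\omega D}$, the elements of vector $\mathbf{p}_1$ can be expressed as

$$\begin{aligned}
p_{1,k} &= \frac{1}{\pi}\int_0^{\alpha\pi} \left[\cos(D\omega)\cos(k\omega) + \sin(D\omega)\sin(k\omega)\right] d\omega \\
&= \frac{1}{\pi}\int_0^{\alpha\pi} \cos[(D-k)\omega]\, d\omega = \frac{\sin[(D-k)\alpha\pi]}{(D-k)\pi} \\
&= \alpha\,\mathrm{sinc}[(D-k)\alpha]
\end{aligned} \tag{A.7}$$

for $k = 0, 1, \ldots, N$, so that $p_{1,k}$ is the element that multiplies $h(k)$. Here we used the well-known trigonometric identities

$$\cos(a)\cos(b) = \frac{1}{2}\left[\cos(a-b) + \cos(a+b)\right]$$

$$\sin(a)\sin(b) = \frac{1}{2}\left[\cos(a-b) - \cos(a+b)\right]$$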

The second term of (A.5a) can be further elaborated as

$$2\mathbf{h}^T\mathbf{p}_1 = 2\alpha\sum_{n=0}^{N} h(n)\,\mathrm{sinc}[(n-D)\alpha] \tag{A.8}$$

Finally, the bandlimited squared frequency response error can be expressed as a function of filter coefficients:

$$E_{LS} = \mathbf{h}^T\mathbf{P}\mathbf{h} - 2\alpha\sum_{n=0}^{N} h(n)\,\mathrm{sinc}[(n-D)\alpha] + \alpha \tag{A.9}$$

where the elements of matrix $\mathbf{P}$ are given by (A.6).
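As a sanity check on the derivation, the closed form (A.9) can be compared with a direct numerical integration of (A.1). The following is a minimal sketch, not part of the original work: the function names are hypothetical, the ideal response is assumed to be $H_{id}(e^{j\omega}) = e^{-j\omega D}$, and sinc is taken as $\sin(\pi x)/(\pi x)$.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def ls_error_closed_form(h, D, alpha):
    """Evaluate (A.9) for coefficients h(0..N), delay D and bandwidth alpha."""
    n1 = len(h)
    # Quadratic term h^T P h with P[k][l] = alpha * sinc((k - l) * alpha), see (A.6)
    quad = sum(h[k] * alpha * sinc((k - l) * alpha) * h[l]
               for k in range(n1) for l in range(n1))
    # Cross term 2 h^T p1, see (A.8)
    cross = 2.0 * alpha * sum(h[n] * sinc((n - D) * alpha) for n in range(n1))
    return quad - cross + alpha

def ls_error_numeric(h, D, alpha, steps=5000):
    """Midpoint-rule evaluation of (A.1), assuming H_id = exp(-j*w*D)."""
    dw = alpha * math.pi / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * dw
        # Real and imaginary parts of H(e^{jw}) - H_id(e^{jw})
        re = sum(hn * math.cos(n * w) for n, hn in enumerate(h)) - math.cos(D * w)
        im = -sum(hn * math.sin(n * w) for n, hn in enumerate(h)) + math.sin(D * w)
        total += (re * re + im * im) * dw
    return total / math.pi
```

For instance, the two-tap linear interpolator `h = [0.5, 0.5]` with `D = 0.5` and `alpha = 0.5` can be passed to both functions, and the two results should agree to within the integration error.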

Appendix B
Effective Length of the Infinite Impulse Response
As a measure for the length of the impulse response of a recursive filter, we define the effective length $N_P$ as the number of impulse response samples $h(n)$ which bring about P% or more of its total energy, i.e.,

$$E_P = \sum_{n=0}^{N_P} h^2(n) \ge \frac{P}{100}\,E \tag{B.1}$$

where $n$ is the discrete time index and $E$ is the total energy of the impulse response defined as

$$E = \sum_{n=0}^{\infty} h^2(n) = \frac{1}{\pi}\int_0^{\pi} \left|H(e^{j\omega})\right|^2 d\omega \tag{B.2}$$

and $H(e^{j\omega})$ is the frequency response of the system with the impulse response $h(n)$. The length $N_P$ is the value of time index $n$ in (B.1) at which P% of the energy of the impulse response has arrived. The choice of the value for P depends on the application.
In the following we derive closed-form expressions for the P% energy length of the
first and second-order all-pole filters. The result for the second-order case is particularly
interesting because higher order filters are often realized as a cascade or parallel connection of second-order sections.
First-Order All-Pole Filter
The transfer function of the first-order all-pole filter is

$$H(z) = \frac{1}{1 + a_1 z^{-1}} \tag{B.3}$$

where $a_1$ is real-valued and $|a_1| < 1$. The impulse response of this filter is

$$h(n) = \begin{cases} 0 & \text{for } n < 0 \\ (-a_1)^n & \text{for } n \ge 0 \end{cases} \tag{B.4}$$

The sum of squares of the impulse response (B.4) can be expressed in closed form as

$$E_P = \sum_{n=0}^{N_P} h^2(n) = \sum_{n=0}^{N_P} \left(a_1^2\right)^n = \frac{1 - \left(a_1^2\right)^{N_P+1}}{1 - a_1^2} \tag{B.5}$$

and the total energy is obtained as a limit as $N_P \to \infty$ to be

$$E = \frac{1}{1 - a_1^2} \tag{B.6}$$

For a given P, $N_P$ can be solved from (B.1) by substituting the equations (B.5) and (B.6), that is

$$N_P = \left\lceil \frac{\log(1 - P/100)}{\log(a_1^2)} - 1 \right\rceil = \left\lfloor \frac{\log(1 - P/100)}{\log(a_1^2)} \right\rfloor \tag{B.7}$$

where $\lceil\cdot\rceil$ and $\lfloor\cdot\rfloor$ denote the ceiling and floor operations, which correspond to rounding upwards and downwards (the two forms coincide unless the ratio is exactly an integer). Note that a practical value for $N_P$ must be an integer. The base of the logarithm in (B.7) is arbitrary, e.g., 10. Figure B.1 shows the effective length $N_P$ of the one-pole filter as a function of the filter coefficient $a_1$ for P = 90%, 95%, and 99%.

[Figure B.1: effective length (in samples) plotted against the filter coefficient $a_1$.]

Fig. B.1 Effective length $N_P$ (in samples) of the first-order all-pole filter as a function of the filter coefficient $a_1$. The curves are for the values P = 90% (solid line), P = 95% (dashed line), and P = 99% (dotted line).
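The closed form (B.7) can be checked against a direct accumulation of the impulse response energy as in (B.1). The sketch below is only illustrative: the function names are hypothetical, and the ceiling form of (B.7) is used, clamped at zero for very small coefficients.

```python
import math

def effective_length_first_order(a1, P):
    """N_P of H(z) = 1/(1 + a1 z^-1) from the ceiling form of (B.7)."""
    if not 0.0 < abs(a1) < 1.0:
        raise ValueError("requires 0 < |a1| < 1")
    ratio = math.log(1.0 - P / 100.0) / math.log(a1 * a1)
    return max(0, math.ceil(ratio - 1.0))

def effective_length_by_summation(a1, P):
    """Smallest N whose partial energy sum reaches P% of the total, per (B.1)."""
    total = 1.0 / (1.0 - a1 * a1)        # total energy, (B.6)
    partial, n = 0.0, 0
    while True:
        partial += (a1 * a1) ** n        # h(n)^2 = (a1^2)^n, from (B.4)
        if partial >= P / 100.0 * total:
            return n
        n += 1
```

Calling both functions with, for example, `a1 = 0.9` and `P = 90` gives a quick way to confirm that the logarithmic formula and the brute-force sum select the same sample index.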


Second-Order All-Pole Filter


The transfer function of the second-order purely recursive filter can be written as

$$H(z) = \frac{1}{1 - 2r\cos(\theta)z^{-1} + r^2 z^{-2}} \tag{B.8}$$

The poles of this filter are at $z = re^{\pm j\theta}$, and for stability it is required that $r < 1$. The corresponding impulse response can be expressed as

$$h(n) = \begin{cases} 0, & n < 0 \\ \dfrac{r^n}{\sin\theta}\sin[(n+1)\theta], & n \ge 0 \end{cases} \tag{B.9}$$

The total energy of $h(n)$ can be given by (Jury, 1964)

$$E = \frac{1}{\pi}\int_0^{\pi}\left|H(e^{j\omega})\right|^2 d\omega = \frac{1 + r^2}{\left(1 - r^2\right)\left(1 + r^4 - 2r^2\cos 2\theta\right)} \tag{B.10}$$

and the cumulative energy $E_P$ by

$$E_P = \sum_{n=0}^{N_P} h^2(n) = E - E_{100-P} \tag{B.11}$$

where

$$E_{100-P} = \frac{1}{\sin^2\theta}\sum_{n=N_P+1}^{\infty} r^{2n}\sin^2[(n+1)\theta] \le \frac{1}{\sin^2\theta}\sum_{n=N_P+1}^{\infty} r^{2n} = \frac{r^{2(N_P+1)}}{(1 - r^2)\sin^2\theta} \tag{B.12}$$

Solving for $N_P$ gives an upper bound as

$$N_P \le N_{P,\mathrm{sim}} = \frac{\log\left[(1 - r^2)\,E_{100-P}\sin^2(\theta)\right]}{\log(r^2)} - 1 \tag{B.13}$$

Equation (B.12) overestimates the residual energy $E_{100-P}$ and consequently the estimate for $E_P$ (B.11) becomes too small. This is because it has been assumed that $\sin^2[(n+1)\theta] \equiv 1$. Thus, Eq. (B.13) gives a value too large for $N_P$.
The estimate in (B.13) for $N_P$ of the second-order recursive filter can be improved by using $\sin^2 x = \frac{1}{2}(1 - \cos 2x)$ and by writing (B.12) in the form $E_{100-P} = G_{P,1} - G_{P,2}$ with

$$G_{P,1} = \frac{r^{2(N_P+1)}}{2(1 - r^2)\sin^2(\theta)} \tag{B.14}$$

and

$$G_{P,2} = \frac{r^{2(N_P+1)}}{2\sin^2(\theta)}\sum_{n=0}^{\infty} r^{2n}\cos(2n\theta + \theta_P) \tag{B.15}$$

where $\theta_P = 2(N_P + 2)\theta$. Using $\cos(x + y) = \cos(x)\cos(y) - \sin(x)\sin(y)$ and the following sum formulas (from (Gradshteyn and Ryzhik, 1964), Eq. 1.447)

$$\sum_{k=0}^{\infty} p^k\sin(k\theta) = \frac{p\sin\theta}{1 - 2p\cos(\theta) + p^2} \tag{B.16}$$

$$\sum_{k=0}^{\infty} p^k\cos(k\theta) = \frac{1 - p\cos\theta}{1 - 2p\cos(\theta) + p^2} \tag{B.17}$$

we obtain (B.15) in the form

$$G_{P,2} = \frac{r^{2(N_P+1)}}{2\sin^2(\theta)}\,\frac{F(\theta_P)}{1 - 2r^2\cos(2\theta) + r^4} \tag{B.18}$$

with

$$F(\theta_P) = \cos(\theta_P)\left[1 - r^2\cos(2\theta)\right] - \sin(\theta_P)\,r^2\sin(2\theta) \tag{B.19}$$
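The step from the infinite sum (B.15) to the closed form (B.18) with (B.19) can be verified numerically. The following sketch uses hypothetical function names and truncates the infinite sum after a fixed number of terms, which is an assumption that is harmless for $r$ well below 1.

```python
import math

def g2_direct(r, theta, NP, terms=2000):
    """G_{P,2} from the (truncated) infinite sum in (B.15)."""
    theta_P = 2 * (NP + 2) * theta
    s = sum(r ** (2 * n) * math.cos(2 * n * theta + theta_P) for n in range(terms))
    return r ** (2 * (NP + 1)) / (2 * math.sin(theta) ** 2) * s

def g2_closed(r, theta, NP):
    """G_{P,2} from the closed form (B.18) with F given by (B.19)."""
    theta_P = 2 * (NP + 2) * theta
    F = (math.cos(theta_P) * (1 - r * r * math.cos(2 * theta))
         - math.sin(theta_P) * r * r * math.sin(2 * theta))
    denom = 1 - 2 * r * r * math.cos(2 * theta) + r ** 4
    return r ** (2 * (NP + 1)) / (2 * math.sin(theta) ** 2) * F / denom

def residual_energy(r, theta, NP):
    """E_{100-P} = G_{P,1} - G_{P,2}, using (B.14) and (B.18)."""
    g1 = r ** (2 * (NP + 1)) / (2 * (1 - r * r) * math.sin(theta) ** 2)
    return g1 - g2_closed(r, theta, NP)
```

With, say, `r = 0.9`, `theta = 0.3 * math.pi` and `NP = 5`, the direct and closed-form values of $G_{P,2}$ should agree closely, and `residual_energy` should match the tail of the energy sum in (B.12).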

It is seen that F( P ) depends on P = 2(N P + 2) in a random-like manner. However,


we can determine bounds for F( P ) and thus for N P by considering P as a continuous
variable in the range [0, 2] and finding the minimum and maximum of F( P ). The
locations of these extrema are
r 2 cos(2 ) 1
P1 = arctan 2
and P2 = P1 +
r sin(2 )

(B.20)

from which one yields the maximum Fmax and the other the minimum Fmin . Using
(B.14) and (B.17), the upper bound for N P can be solved as
NP, max

log 2E100 P sin 2 ( ) log(Gmax )

=
log(r 2 )

(B.21)

1
Fmax
2
2
1 2r cos(2 ) + r 4
1 r

(B.22)

with
Gmax =

In a similar way, the lower bound is obtained as


N P,min

log 2E100 P sin 2 ( ) log(Gmin )


=
1
log(r 2 )

(B.23)

where Gmin is obtained from (B.22) by replacing Fmax by Fmin .


Figure B.2 illustrates the P% energy length $N_P$ (P = 90) of the second-order all-pole filter as a function of $\theta$ (r = 0.9). The simple approximation given by (B.13) together with the minimum and maximum estimates (B.21) and (B.23) are presented. For comparison, Fig. B.2 also shows the result of a numerical simulation.
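The comparison between the bounds and a numerical simulation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the impulse response is taken to be $h(n) = r^n \sin[(n+1)\theta]/\sin\theta$, the residual energy $E_{100-P}$ is taken to be $(1 - P/100)$ times the total energy of that response, and the residual energy is written as $G_{P,1} - G_{P,2}$ with $F$ as in (B.19). The names `length_bounds` and `simulate` are hypothetical. The two extreme values of $F$ are evaluated at a stationary point of $F$ and at that point plus $\pi$, and the bounds are ordered by their computed values.

```python
import math

def length_bounds(r, theta, residual):
    """Return (lower, upper) real-valued bounds on N_P.

    `residual` is the residual energy E_{100-P} (assumed absolute, not a
    fraction). Valid for 0 < r < 1 and theta not a multiple of pi.
    """
    r2 = r * r
    d = 1.0 - 2.0 * r2 * math.cos(2.0 * theta) + r2 * r2

    def f(phi):  # F(phi_P) of Eq. (B.19)
        return (math.cos(phi) * (1.0 - r2 * math.cos(2.0 * theta))
                - math.sin(phi) * r2 * math.sin(2.0 * theta))

    # Stationary point of F; the denominator is negative for every r < 1.
    phi1 = math.atan(r2 * math.sin(2.0 * theta)
                     / (r2 * math.cos(2.0 * theta) - 1.0))
    f_a, f_b = f(phi1), f(phi1 + math.pi)
    f_max, f_min = max(f_a, f_b), min(f_a, f_b)

    def solve(g):  # N_P for a given factor G = 1/(1 - r^2) - F/d
        return ((math.log(2.0 * residual * math.sin(theta) ** 2)
                 - math.log(g)) / math.log(r2) - 1.0)

    n_a = solve(1.0 / (1.0 - r2) - f_max / d)
    n_b = solve(1.0 / (1.0 - r2) - f_min / d)
    return min(n_a, n_b), max(n_a, n_b)

def simulate(r, theta, percent=90.0, n_samples=3000):
    """Brute-force N_P and the residual energy target from the closed-form h(n)."""
    e = [(r**n * math.sin((n + 1) * theta) / math.sin(theta)) ** 2
         for n in range(n_samples)]
    total = sum(e)
    residual = (1.0 - percent / 100.0) * total
    acc = 0.0
    for n, v in enumerate(e):
        acc += v
        if total - acc <= residual:
            return n, residual
    return n_samples - 1, residual

if __name__ == "__main__":
    r = 0.9
    for k in range(1, 10):
        theta = 0.1 * k * math.pi
        n_sim, residual = simulate(r, theta)
        lo, hi = length_bounds(r, theta, residual)
        print(f"theta/pi = {0.1 * k:.1f}: {lo:.2f} <= {n_sim} <= {hi:.2f} (rounded up)")
```

Because the simulated length is an integer while the bounds are real-valued, the simulated value should be compared with the upper bound rounded up to the next integer.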

[Figure B.2: plot; vertical axis "Effective Length (in samples)", horizontal axis "Pole Angle (rad/pi)".]

Fig. B.2. The effective length $N_P$ (P = 90%) of the second-order all-pole filter as a function of parameter $\theta$ (r = 0.9). The solid line gives the actual value for $N_P$ (obtained by numerical simulation) and the dashed lines present its upper and lower limit. The uppermost curve results from the simple approximation (B.13).

Closed-form formulas for the effective length of infinite-length impulse responses of first- and second-order recursive systems were derived above. These measures are based on the idea of determining when P% of the total energy of the impulse response has arrived. Formulas for higher-order systems are more complicated, but results can also be obtained by numerical simulation.
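A numerical simulation of this kind can be sketched in Python for an all-pole filter of any order. The sketch assumes the transfer function $1/(1 + a_1 z^{-1} + \dots + a_M z^{-M})$, a stable filter, and a simulation horizon `n_samples` long enough to capture practically all of the energy. It takes $N_P$ as the smallest sample index at which the cumulative energy reaches P% of the total. The function names are illustrative.

```python
def impulse_response(a, n_samples):
    """Impulse response of 1 / (1 + a[0] z^-1 + a[1] z^-2 + ...)."""
    h = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        # Recursive part: subtract the weighted past output samples.
        feedback = sum(a[k] * h[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
        h.append(x - feedback)
    return h

def effective_length(a, percent=90.0, n_samples=4000):
    """Smallest n such that h(0..n) holds `percent` % of the total energy."""
    h = impulse_response(a, n_samples)
    total = sum(v * v for v in h)
    target = percent / 100.0 * total
    cumulative = 0.0
    for n, v in enumerate(h):
        cumulative += v * v
        if cumulative >= target:
            return n
    return n_samples - 1

if __name__ == "__main__":
    # One-pole filter with pole radius 0.9.
    print(effective_length([-0.9]))
    # Second-order filter with r = 0.9 and pole angle pi/2.
    print(effective_length([0.0, 0.81]))
```

The same routine applies to higher-order denominators by passing a longer coefficient list.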
Simple Estimation of the Effective Length
The effective length of an infinite impulse response can also be estimated based on the time constant of the filter. This approach is often used in analog electronics. The time constant $\tau$ of the first-order all-pole filter is defined by

$$
r^n = e^{-n/\tau} \tag{B.24}
$$

where $r$ is the radius of the pole, i.e., $r = |a_1|$. Now the time constant can be solved as

$$
\tau = -\frac{1}{\ln(r)} \tag{B.25}
$$

A simple approximation for the time constant can be derived by considering the Taylor series of the function $\ln(r)$, that is

$$
\ln(r) = (r - 1) - \frac{1}{2}(r - 1)^2 + \frac{1}{3}(r - 1)^3 - \dots \tag{B.26}
$$

When we truncate this series after the first term, we obtain the following approximation for the time constant of the one-pole filter:

$$
\tau \approx \frac{1}{1 - r} \tag{B.27}
$$

This is also an approximation for the effective length of the impulse response of a second-order all-pole filter. This is because the envelope of its impulse response decays as $r^n$ [see Eq. (B.9)].
This simple approximation for the time constant can be used for estimating the length of the impulse response. For example, the impulse response has decayed about 60 dB in $7\tau$ samples. More accurate estimates for the effective length of impulse responses can be computed using the formulas that were derived above utilizing the cumulated energy of the impulse response.
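The time-constant estimate is easy to evaluate numerically. The following Python sketch computes the exact time constant (B.25), its approximation (B.27), and the level of the envelope $r^n$ in decibels. The function names are illustrative and $0 < r < 1$ is assumed.

```python
import math

def time_constant(r):
    """Exact time constant of Eq. (B.25): tau = -1 / ln(r)."""
    return -1.0 / math.log(r)

def time_constant_approx(r):
    """Approximation of Eq. (B.27): tau ~ 1 / (1 - r)."""
    return 1.0 / (1.0 - r)

def envelope_decay_db(n, r):
    """Level of the envelope r**n relative to its initial value, in dB."""
    return 20.0 * n * math.log10(r)

if __name__ == "__main__":
    for r in (0.9, 0.99, 0.999):
        tau = time_constant(r)
        print(r, tau, time_constant_approx(r), envelope_decay_db(7 * tau, r))
```

The approximation improves as $r$ approaches 1, and after $7\tau$ samples the envelope has decayed by $140/\ln(10) \approx 60.8$ dB regardless of $r$.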

J. O. Smith, personal communication, November 1995.

Index
advance time 122, 124, 125, 126, 127
asymmetric window functions 75
binomial coefficient 87, 108
binomial window function 88
CCRMA 17
Chaigne, A. 18
complementary fractional delay 86, 129,
130
CORDIS 18
Cramer's rule 84
cross-fading method 117
decimation 104, 129
deinterpolation 21, 128–134, 155
deinterpolation 21
digital delay line 66
digital differentiator 60
digital waveguide 17, 26, 63
direct form I structure 104
direct form II structure 121
discrete-time allpass filter 105
discrete-time Fourier transform 66
Farrow structure 103
floor function 30, 41, 67
fractional delay 20, 21, 22, 24
fractional spatial index 141
generalized Karplus–Strong model 28
Gibbs phenomenon 74, 75
greatest integer function 67
group delay 69, 106
Horner's method 98
impulse-invariant method 40, 54, 55, 56,
57, 63
input-switching method 118, 119
inverse filtering 32
inverse interpolation 129
IRCAM 18, 19
Karplus–Strong algorithm 17, 27
Kelly–Lochbaum model 34, 35, 42, 137
Kronecker delta 84, 90
linear interpolation 86, 98, 99, 103
loop filter 27, 30
loop gain 31, 92
modal synthesis 18
Modalys 19
Parseval's relation 72
phase delay 69, 106
physical modeling 17–19
piecewise parabolic interpolation 103
plane waves 34
reflection coefficient 37, 39, 44
Rodet, X. 18
sampling theorem 68
scattering 36
Shannon, C. E. 68
short-time Fourier transform 30
sinc function 68
Smith, J. O. 17, 25, 27, 30, 48, 62, 63
source-filter model 17, 18
speech synthesis 17, 34, 119
splines 75, 103
tangent-line interpolation 102
Tarczynski bound 71
Thiran allpass filter 107–113, 123–126
transmission coefficient 37
uniform tube 151
Vandermonde matrix 97
variable-length digital delay line 115,
158
variable-length digital waveguide 135
wave digital filters 62
waveguide synthesis 19
Yamaha VL-1 19
ZZ method 119, 121


October 6, 1995
