

MARCH 2011


The Official Publication of The Embedded Systems Conferences and

Estimating with least squares fit · Better preemption · What's new in watchdog timers

Use the Leading Embedded Development Tools: the complete development environment for ARM®, Cortex™, 8051, and C166 microcontroller designs.


Certified and Deployed Technology

The INTEGRITY RTOS is deployed and certified to:

Railway: EN 50128 SWSIL 4, certified: 2010
Security: EAL6+ High Robustness, certified: 2008
Medical: FDA Class III, approved: 2007
Industrial: IEC 61508 SIL 3, certified: 2006
Avionics: DO-178B Level A, certified: 2002

Copyright © 2011 Green Hills Software, Inc. Green Hills, the Green Hills logo and INTEGRITY are trademarks of Green Hills Software, Inc. in the U.S. and/or internationally. All other trademarks are the property of their respective owners.

toolbox 9
Can you give me an estimate?
On the road to the Kalman filter: The job of the least squares fit is to give a best estimate of the unknown values.

break points 33
Watchdogs redux
Vendors and researchers are trying some new tricks with watchdog timers. Here are some notable attempts to improve the old dog.

#include 5
Unintended acceleration

parity bit 7

Cover Feature 14
Think static analysis cures all ills? Think again.
A window into software Q/A in the automotive industry. Static code analysis has been around as long as software itself, but you'd swear from current tradeshows that it was just invented. Here's how to choose the right code-analysis tools for your project.

25
Lower the overhead in RTOS scheduling
Research shows that preemption-threshold scheduling helps to mitigate the deadline-vs.-overhead tradeoff faced by developers of real-time systems.

Conferences:
May 2–5, 2011, San Jose, CA
ESC Chicago, June 6–8, 2011, Chicago, IL
ESC India, July 20–22, 2011, Bangalore, India
ESC Boston, July 20–22, 2011, Boston, MA

EMBEDDED SYSTEMS DESIGN (ISSN 1558-2493) print; (ISSN 1558-2507 PDF-electronic) is published 10 times a year as follows: Jan/Feb, March, April, May, June, July/August, Sept., Oct., Nov., Dec. by the EE Times Group, 600 Harrison Street, 5th floor, San Francisco, CA 94107, (415) 947-6000. Please direct advertising and editorial inquiries to this address. SUBSCRIPTION RATE for the United States is $55 for 10 issues. Canadian/Mexican orders must be accompanied by payment in U.S. funds with additional postage of $6 per year. All other foreign subscriptions must be prepaid in U.S. funds with additional postage of $15 per year for surface mail and $40 per year for airmail. POSTMASTER: Send all changes to EMBEDDED SYSTEMS DESIGN, EE Times/ESD, PO Box #3609, Northbrook, IL 60065-3257. For customer service, telephone toll-free (847) 559-7597. Please allow four to six weeks for change of address to take effect. Periodicals postage paid at San Francisco, CA and additional mailing offices. EMBEDDED SYSTEMS DESIGN is a registered trademark owned by the parent company, EE Times Group. All material published in EMBEDDED SYSTEMS DESIGN is copyright © 2010 by EE Times Group. All rights reserved. Reproduction of material appearing in EMBEDDED SYSTEMS DESIGN is forbidden without permission.


When Your Company's Success, And Your Job, Are On The Line - You Can Count On Express Logic's ThreadX® RTOS

Express Logic has completed 14 years of successful business operation, and our flagship product, ThreadX, has been used in over 800 million electronic devices and systems, ranging from printers to smartphones, from single-chip SoCs to multiprocessors. Time and time again, when leading manufacturers put their company on the line, when their engineering team chooses an RTOS for their next critical product, they choose ThreadX.

Our ThreadX RTOS is rock-solid, thoroughly field-proven, and represents not only the safe choice, but the most cost-effective choice when your company's product simply must succeed. Its royalty-free licensing model helps keep your BOM low, and its proven dependability helps keep your support costs down as well. ThreadX repeatedly tops the time-to-market results reported by embedded developers like you. All the while, Express Logic is there to assist you with enhancements, training, and responsive telephone support.

Join leading organizations like HP, Apple, Marvell, Philips, NASA, and many more who have chosen ThreadX for use in over 800 million of their products – because their products are too important to rely on anything but the best. Rely on ThreadX, when it really counts!

Contact Express Logic to find out more about our ThreadX RTOS, FileX® file system, NetX™ Dual IPv4/IPv6 TCP/IP stack, USBX™ USB Host/Device/OTG stack, and our new PrismX™ graphics toolkit for embedded GUI development. Also ask about our TraceX® real-time event trace and analysis tool, and StackX™, our patent-pending stack size analysis tool that makes stack overflows a thing of the past. And if you're developing safety-critical products for aviation, industrial or medical applications, ask about our new Certification Pack™ for ThreadX.

[Book advertisement: ThreadX, Second Edition, by Edward L. Lamie (Newnes), now with ARM, Coldfire, and MIPS architectures.]

For a free evaluation copy, visit • 1-888-THREADX

Copyright © 2010, Express Logic, Inc. ThreadX, FileX, and TraceX are registered trademarks, and NetX, USBX, PrismX, StackX, and Certification Pack are trademarks of Express Logic, Inc. All other trademarks are the property of their respective owners.

#include
By Ron Wilson, Editorial Director

Unintended acceleration

As we were preparing this month's issue for publication, the U.S. National Highway Traffic Safety Administration released an enormous report on its investigation into unintended acceleration in Toyota cars. It all seems very familiar, if you remember the Audi 5000. New electronic throttle control reaches the market. Reports of failure emerge from a few users. The vendor denies any problem. A few engineers quietly report having reproduced the problem. But intensive publicly-funded investigation finds nothing.

What makes this report particularly interesting is that the NHTSA called in an evaluation team from NASA to do the heavy lifting. And that team included a software evaluation group. While the hardware folks were shaking, baking, and irradiating cars and car parts, the software team had at the Engine Controller Module code for the four-cylinder 2005 Camry—all 280K lines of ANSI C. The team's report could be a case study for Mark Pitchford's cover story.

NASA's team applied static source-code analysis, formal logic model checking, and algorithm analysis through simulation. The report states "The team's experience is that there is no single analysis technique today that can reliably intercept all vulnerabilities, but that it is strongly recommended to deploy a range of different leading tools."

For code analysis, the team used Coverity, CodeSonar, and Bell Labs' Uno to identify "common coding defects and suspicious coding patterns." They also used CodeSonar to compare Toyota's code against a Jet Propulsion Lab coding standard.

For model checking, the team used open-source Spin and Swarm. Here the tale gets more interesting. To use a formal model checker, you first have to write formal models. The team decided to build models only for those software modules they believed could be culprits—so the formal analysis depended upon human judgment of possible fault modes.

The algorithm analysis started with—once again—building models, this time in Matlab. This process started with reading Toyota documentation and talking with Toyota engineers, and then progressed to analyzing the source code and finally testing the models against actual Camrys. Once the NASA team was satisfied with the models, they explored failure scenarios in Simulink and checked delays with AbsInt aiT.

Some conclusions suggest themselves. First, there are no silver bullets: effective debug means using everything you've got. Second, even when it's grounded in exhaustive and formal techniques, an evaluation is circumscribed by the evaluators' beliefs about the possible behavior of the system. Third, there is no certainty. Despite Toyota's great care in developing their code, NASA's analysis found significant errors, including serious underestimates of delays in the multiprocessing system. But the investigation could not link those errors to any proposed mechanism for unintended acceleration. Contrary to what you probably read in the papers, the NASA Executive Summary stated "Because proof that the ETCS-i caused the reported UAs [unintended accelerations] was not found does not mean it could not occur."

Ron Wilson is the editorial director of design publications at UBM Electronics.
—Ron Wilson

Masthead: Editorial Director: Ron Wilson, (415) 947-6317 · Managing Editor: Susan Rambo · Acquisitions/Newsletter Editor, Site Editor: Bernard Cole · Contributing Editors: Michael Barr, John Canosa, Jack W. Crenshaw, Jack G. Ganssle, Dan Saks, Larry Mittag · Art Director: Debee Rommel · Production Director: Donna Ambrosino · Subscriptions Customer Service (Print): Embedded Systems Design, PO Box #3609, Northbrook, IL 60065-3257, (847) 559-7597 · Article Reprints, E-prints: Mike O'Brien, Wright's Reprints, (877) 652-5295 (toll free), (281) 419-5725 ext. 117, fax (281) 419-5712 · David Blaza, (415) 947-6929 · Associate Publisher/Sales North America: Bob Dumas, (516) 562-5742 · Editorial Review Board: Michael Barr, Jack W. Crenshaw, Jack G. Ganssle, Bill Gatliff, Nigel Jones, Niall Murphy, Dan Saks, Miro Samek

Corporate—EE Times Group: Paul Miller, Chief Executive Officer; Felicia Hamerman, Group Marketing Director; Brent Pearson, Chief Information Officer; Jean-Marie Enjuto, Financial Director; Amandeep Sandhu, Manager Audience Engagement; Barbara Couchois, Vice President Sales Ops. Corporate—UBM LLC: Marie Myers, Senior Vice President; Pat Nohilly, Senior Vice President, Strategic Development and Business

| embedded systems design | MARCH 2011 5

Innovators build networks with explosive speed and exceptional intelligence.

How do innovators build networks of stunning speed and intelligence, while keeping costs squarely under control?
They work with Wind River. Our advanced networking solutions give leading network equipment manufacturers the
packet acceleration, hardware optimization, and system reliability they need to deliver breakthrough performance and
greater value—from the core to the consumer and everywhere in between. All while reducing their costs and cutting
their time-to-market so they can focus on innovation to create a truly competitive edge.

Please visit to see how Wind River customers have delivered breakthrough performance and greater value. INNOVATORS START HERE.
parity bit

Dreaming of push-button synthesis

You can look at the C-to-Si problem (Ron Wilson, "C to silicon. Really?" January/February 2011) as being much the same as the "How do I get my C to run parallel" problem—in other words, hardware descriptions are fine-grained parallel processing versions of an algorithm. Rather than RTL, you need to refine your software down to the level of small finite state machines communicating asynchronously, and then give that to your logic synthesis tool (for ASIC or FPGA).

The push-button idea is not dead, but a lot of the work needs to be done by the software (compiler) guys rather than the EDA guys.

Since SystemC has been around for some time and the dream that you describe is not yet real, is an intermediate step of any value? A lot has been said about the inherent parallelism in HDL, but is it really ever done to the max? In order to do things in parallel other than step counters, a lot of resource can be gobbled up by implementing arrays in registers and having adders all over the place, so a commonsense look may say that Utopia is not really worth the price.

Although it is generally accepted that pipelining is a must, a small number of CPU registers must be allocated by the compiler, and memory must be accessed one word at a time (maybe both an instruction and data together). These things lead to a lot of serialization in the CPU and then the need to carve out things that can be accelerated. But let's think about what is available on an FPGA. Internal multiport memory can be used to achieve parallel access to two operands and write the previous result in the same cycle. Existing CPUs do not use that capability, but I can show a way to take C code and do an iteration of a basic "for" loop in about 11 clock cycles at about the typical fmax of the FPGA. Yes, custom HDL will go faster, but not much, because the cycle count is so small and the clock is running at the typical rate for the chip.
—KarlS

Push-button synthesis, even for something as mature as RTL synthesis, is not a realistic expectation if you are targeting production silicon. This is why Cadence focused much of the "C-to-Silicon Compiler" development on a graphical design environment where hardware engineers (yes, they are still needed!) can examine the Control/Dataflow Graph, the critical path from embedded RTL Compiler annotated to the SystemC, etc., and make the desired adjustments. Engineers need to optimize the synthesis of the SystemC for the end-application. The TI OMAP example points out that when they create the platform and subsystems, they aren't sure how they will be used in the future. If the source of a block/subsystem is SystemC, you can generate RTL that is targeted for different product goals. But you still need engineers to drive high-level synthesis correctly, just as you still need engineers to drive RTL synthesis, floorplanning, and place and route, even though so much automation has been added.

Coremark and multicore

One interesting thing about the CoreMark benchmark (Shay Gal-On and Markus Levy, "CoreMark: A realistic way to benchmark CPU performance," January/February 2011, is that its performance is almost linearly scalable with the number of cores in a multicore processor. So the TilePro 64-core processor comes in at the top of the list, with a CoreMark/MHz rating of more than three times its closest competitor. Of course, many (most?) real-world applications don't scale linearly with core count because of shared resources such as memory and I/O.

Demystifying constructors

We found an effective way to get around the problem of constructors being called automatically when you did not expect it (Dan Saks, "Demystifying constructors," January/February 2011, Make them private. That disables new and prevents some terrible casts from happening. The class then provides a function to make an object (or objects):

x = class.newobj(...)

All objects are then explicitly created, making the construction sequences much more visible.
—cdhmanning

We welcome your feedback. Letters to the editor may be edited. Send your comments to Ron Wilson or post a comment online, under the article you wish to discuss. We edit letters and posts for brevity.

By Jack W. Crenshaw

Can you give me an estimate?

On the road to the Kalman filter: The job of the least squares fit is to give an estimate of the unknown coefficients of the mental model.

About 18 months ago, I wrote a column on the least squares fit ("Why all the math,", where I emphasized the technique to fit a curve to noisy data. It's not the only way to do that, of course. If you've ever worked with any embedded system that had a sensor measuring a real-world parameter, you know all about dealing with noise. The usual way is to pass the signal through a low-pass filter, which can be anything from a simple first-order lag to a finite impulse response (FIR) or infinite-impulse response (IIR) filter with scores, if not hundreds, of terms.

So what's the difference between a least squares fit and a low-pass filter? They both extract a signal from the noise, right?

One clue to the difference lies in the word, signal. As soon as you say the word, you invoke an image of something like a voltage that's changing with time. When you're processing a signal, you don't have the luxury of waiting until you've got a bunch of data points. You have to accept and process the data as it's coming in. Most days, we call such processing real time.

By contrast, the term least squares invokes an image of batch processing, where you operate on a whole set of data items after they've been collected. The notion of time is not explicitly involved. Indeed, when you're applying the least squares fit, you can shuffle the data indiscriminately, and the technique will still give you the same solution.

As far as we know, the least squares fit was invented by Carl Friedrich Gauss in 1795, at the ripe old age of 18. Its first application came in 1801, when he used it to predict the motion of the minor planet, Ceres. I think it's safe to say that Gauss's analysis was anything but real time. Unless, of course, you accept a clock rate measured in days or weeks.

The distinction between a filter and a least squares fit gets a lot more fuzzy if we set up the least squares fit so that we can process the data sequentially, as it comes in. That concept, sequential processing, is going to be our main focus in this and future columns.

But there's another distinction between filtering and fitting that's much more fundamental and profound than the way you process the data. That distinction lies in what you know—or think you know—about the process that generates the data.

When I'm developing a system to filter noise out of a signal, I don't have any idea what's happening in the real world to create that signal. At the level of the analog-to-digital (A/D) conversion, it's just a voltage that comes from somewhere, corrupted by noise. My job is only to extract the signal from the noise.

The least squares fit is different. We apply it when we think we know something about the process that generated the data. If I'm applying a linear regression to a data set, it's because I think that one element of the set's data pair depends on the other. And not just depends, but has a linear relationship, which I can graph as a straight line. The purpose of the least squares fit is not just to filter the noise, but to discover the coefficients of that straight-line relationship.

Of course, the regression doesn't have to be linear. Using least squares, I can fit a quadratic, or a polynomial of any order. I can fit a logarithmic relationship, or a sum of logarithmic terms. I can even fit a sum of sine waves (in which case I've done a Fourier analysis). It really doesn't matter what we think the relationship is; it only matters that we think that there is one.

The point is, when I'm applying a least squares fit I'm doing more than just filtering noise. I have a mental model of what's going on in the system that's generating the data. Presumably, that model includes coefficients whose values are unknown. The job of the least squares fit is to give me a best estimate of those coefficients. For obvious reasons, this process is often called state estimation, and it's this discipline that will occupy our interest in the rest of this column, and several more. If things work out right, the series will culminate in that dream of all estimators, the justly famous Kalman Filter. I'll leave it to you to ponder why the pinnacle technique of state estimation is called, not an estimator, but a filter.

Jack Crenshaw is a systems engineer and the author of Math Toolkit for Real-Time Programming. He holds a PhD in physics from Auburn University.

programmer’s toolbox
YOUR SCORE WAS MERELY AVERAGE

In this new series, I expect to cover a lot of ground, with math that's going to get heavier and heavier as we go along. The last thing I want to do is to leave some of you behind in the starting blocks. So in the spirit of No Reader Left Behind, I'm going to begin with the most ridiculously simple example I can think of. If things go as planned, I'll continue to escalate in such small steps that, like the proverbial frog simmering in the proverbial stew pot, you'll be cooking along with state estimation without really noticing how you got there. I'll start by defining a set of 10 numbers:

    y = [4, 2, 5, 3, 2, 4, 2, 6, 3, 4]   (1)

Your challenge is to compute their average value. Hey, you know how to do this. First, add all the numbers:

    sum = 4 + 2 + 5 + 3 + 2 + 4 + 2 + 6 + 3 + 4 = 35   (2)

Then divide by the number of terms:

    y̅ = 35/10 = 3.5   (3)

There. That wasn't so hard, was it? Note the bar over the y, which is the symbol usually used to indicate an average value. It's not the only symbol for the average. Another one is 〈y〉, which is a little harder to miss. I actually prefer 〈y〉, precisely because it's harder to miss. But we'll stick with the "bar" notation here. Note also that, while y is a set of values, y̅ is only a single scalar number.

We can generalize the process with a little math notation (stay with me, now):

    y̅ = (1/10)(y_1 + y_2 + … + y_10)   (4)

or, more generally, for a set of N values:

    y̅ = (1/N)(y_1 + y_2 + … + y_{N−1} + y_N)   (5)

Even better, the shorthand notation of the summation symbol:

    y̅ = (1/N) Σ_{i=1}^{N} y_i   (6)

Careful, now. I can see some of your eyes starting to glaze over, and your pupils shrinking down to pinpoints. This isn't rocket science (yet). Equation 6 says nothing at all that isn't said by Equation 5. It's just a shorter way of saying it. Because mathematicians and physicists are a lazy lot, we tend to go for the shorter way when given the chance. Stated in words, the summation sign of Equation 6 says "let the index i take on all the integer values from 1 through N. For each i, add the term y_i to the sum."

If the summation sign makes you nervous, just remember that you can always expand it back out into the explicit sum of Equation 5. It just takes a little longer to write, that's all.

One last point: The average value of any set is often called its mean. Even more specifically, its arithmetic mean (because there are other kinds of means). Why introduce another term? Can it be that we're so lazy we'd rather write four letters than seven? I guess it must be so. Deal with it.

SOME DO IT SEQUENTIALLY

The problem with taking an average using Equation 6 (or 5) is that the computation is basically a batch process. To add up all those numbers, we have to have them all available. In an embedded computer, this means that we must keep the potentially huge set of terms stored in memory.

Is there a better way? Of course. To see how, let's look at how the average value changes as new data comes in. Let m_n be the mean of the first n values of the set. As the data comes in, we'll have:

    m_1 = y_1/1 = y_1
    m_2 = (y_1 + y_2)/2
    m_3 = (y_1 + y_2 + y_3)/3
    m_4 = (y_1 + y_2 + y_3 + y_4)/4
    etc.   (7)

But each of the sums in these equations is merely the sum in the previous mean, plus one new term. We can write:

    m_1 = y_1/1 = y_1
    m_2 = (1·m_1 + y_2)/2
    m_3 = (2·m_2 + y_3)/3
    m_4 = (3·m_3 + y_4)/4
    etc.   (8)

And, in general:

    m_{n+1} = (n·m_n + y_{n+1})/(n + 1)   (9)

As you can see, we don't need to keep any of the old elements of y around. In fact, we don't need to keep any of the old elements of m around, either. We only need the latest value of m, plus the newest element of y. The new value of m can overwrite the old value. In a software implementation, we only need to have two persistent, scalar variables: the past value of m, plus the integer counter, n.

Better yet, let's not keep the old value of m, but the running sum. This lets us avoid the multiplication by n. Let:

    s_n = Σ_{i=1}^{n} y_i   (10)

Then:

    m_{n+1} = (s_n + y_{n+1})/(n + 1)   (11)

Writing the software is almost easier than describing what it must do. Listing 1 shows a snippet of code that works in either C or C++. It's not perfect, and it's not production quality. Because it has static variables, it won't work if you're asking it to find more than one average
Semiconductors and electronic components for design engineers.

The most advanced, multilingual, multicurrency mobile site for design engineers. Compatible with more than 25 mobile platforms, no one supports more phones and tablets than Mouser. Get What's Next now at | The Newest Products For Your Newest Designs

Mouser and Mouser Electronics are registered trademarks of Mouser Electronics, Inc. Other products, logos, and company names mentioned herein may be trademarks of their respective owners.
Listing 1: Compute the average.

/* Compute a running average of a set of input values
 *
 * Jack W. Crenshaw
 *
 * For illustration only. Will only work if used for a single
 * data set.
 * For production, consider a C++ class
 */
double Avg(double x)
{
    static int n = 0;
    static double s = 0;
    ++n;        /* count this sample */
    s += x;
    return s/n;
}

per program. If there were ever a case for a C++ class, this is it. That task, I'm leaving "as an exercise for the student." But the code I've shown should give you the idea.

WE NEED A VARIANCE

That task—of computing an average—wasn't too hard, was it? I hope it didn't tax your brains too much. Here's your next challenge: For the same data set, compute the variance and standard deviation.

Whoa! You want me to . . . what? Did we cover that in class?

Not yet, but we'll do it now. In general, the data in our sample data set y tells us more than just its mean, or average, value. It also tells us something about the reliability of each data element. In other words, it gives us insight into how much noise hides inside the data. It gives us the statistics.

In Figure 1, I've plotted the data set and also the mean value. As you can see, the data items themselves bounce around quite a bit. As often happens, none of them are actually equal to the mean value. For each value of n, the value in the set differs from the mean by an amount called the residual.

    e_i = y_i − y̅   (12)

It would be nice if we had some single scalar measure—we might even call it a variance—of the quality of the data. Clearly, this measure would have to involve all the members of the data set. We could try adding all the residuals together, or computing their average value, but that wouldn't work. Because the residuals can be positive or negative, they could cancel each other, leaving us with a false impression. If, for example, the data values alternated around the mean, then the sum of any pair—or all the pairs—would be zero, and so could be the average residual. That would tell us nothing about the real quality of the data set.

A measure that does work, though, is to take the sum of the squares of the residuals. This, of course, is the same measure that's used in linear regression and similar curve fits. Can we say "least squares"? Duh. More precisely, let's define variance as the mean of the squared residuals:

    V = (1/N) Σ_{i=1}^{N} e_i² = (1/N) Σ_{i=1}^{N} (y_i − y̅)²   (13)

Once we have the variance, the standard deviation is easy. It's just the square root of the variance:

    σ = √V   (14)

This quantity, standard deviation, is supposed to be a measure of the amount of noise in our signal. Take a look at Figure 2. This is the same graph as Figure 1, but I've added the two horizontal, dashed lines a distance σ above and below the average value. You get the impression that new measurements of y are usually going to lie in the band between these two limits. Are the limits absolute? Certainly not. As you can see, five of the 10 points lie outside the band. Even so, these lines do seem to say something about the "scatter" in the data, don't they? From the defining equations, it's clear that the size of σ is going to depend on this scatter. If all the values y_i are nearly equal, then σ will be small, and the band will be more narrow. In the limit, when there

Figure 1: A plot of the data set and the mean value. Figure 2: Example with error bounds. Figure 3: The sequential process. (Each plots the data values y against sample number n.)

is no noise, or scatter, at all, all the Listing 2 Compute the mean and variance.
measurements will be equal, the value
of σ will go to zero, and so the band /* Compute a running average and variance of a set of input
* values
will have zero width. *
You’ll note that I haven’t said any- * Jack W. Crenshaw
thing, so far, about probabilities, or *
* For illustration only. Will only work if used for a single
distribution functions, or any of those * data set.
terms that relate to statistics. And I * For production, consider a C++ class
won’t, in this column. We’ll have plen- */
ty of time for that, later. For now, you void AvgVar(double x, double & mean, double & var)
only need to get the concept that the {
standard deviation is a measure of the static int n = 0;
static double sum = 0;
amount of scatter, or noise, in the static double sumSq = 0;
sum += x;
sumSq += x*x;
We started this discussion by consider- mean = sum/n;
ing the batch processing of the average, var = sumSq/n - mean*mean;
assuming that we had the entire data set
as embodied in y. If we look at the
problem of computing the variance
from the same perspective, it clearly Because we know that y៮ itself contains a N
isn’t much harder. First, sum all the ele- summation, it’s tempting to substitute ∑y 2
= s N = Ny (18)
i =1
ments of y, and divide by N to get the that sum here. But that would be a mis-
mean y៮. Then use that in Equation 12 to take; it leads to more complexity, while Substitute this into Equation 17 to get:
calculate the residuals term by term. Square them, add them, and divide by N to get the variance.

The question is, can we do the same thing for variance that we did for the mean? Can we come up with a sequential process for the variance? At first glance, it may seem unlikely. The problem is that all the residuals, from $e_1$ through $e_N$, depend on the mean, which changes each time we add even one more data element. From Equation 13, it sure looks like we must recalculate the entire variance from scratch.

But looks can be deceiving. You can see the process with more clarity if we expand the polynomial in Equation 13. Write:

$$V = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^2 - 2y_i\bar{y} + \bar{y}^2\right) \tag{15}$$

Now let's split up the three terms so that we sum them separately:

$$V = \frac{1}{N}\left(\sum_{i=1}^{N}y_i^2 - \sum_{i=1}^{N}2y_i\bar{y} + \sum_{i=1}^{N}\bar{y}^2\right) \tag{16}$$

we're looking for simplicity. The real trick is to recognize that, regardless how we calculate $\bar{y}$, it's just a number by the time it shows up in Equation 16. It's not an indexed number, and it isn't changing as we generate the sums. As far as the summations are concerned, it's a mere constant, and we can boost it out of the summation process. Doing that, we can write:

$$V = \frac{1}{N}\left(\sum_{i=1}^{N}y_i^2 - 2\bar{y}\sum_{i=1}^{N}y_i + \bar{y}^2\sum_{i=1}^{N}1\right) \tag{17}$$

Check that last sum. Is that cool, or what? It simply says that we should add 1 to itself N times. Guess what result we'll get. That's right, it's N. The second sum is interesting, also. Does that sum:

$$\sum_{i=1}^{N}y_i$$

look familiar? It's the same sum that's used to generate the mean in the first place. See Equations 6 and 10. From them, we can see that $\sum_{i=1}^{N}y_i = N\bar{y}$, so:

$$V = \frac{1}{N}\left[\sum_{i=1}^{N}y_i^2 - 2\bar{y}\left(N\bar{y}\right) + N\bar{y}^2\right] \tag{18}$$

Or:

$$V = \frac{1}{N}\sum_{i=1}^{N}y_i^2 - 2\bar{y}^2 + \bar{y}^2 \tag{19}$$

Or simply:

$$V = \frac{1}{N}\sum_{i=1}^{N}y_i^2 - \bar{y}^2 \tag{20}$$

We're now in a position to calculate the variance, and therefore the standard deviation, sequentially. We'll have to maintain another sum; call it:

$$u_n = \sum_{i=1}^{n}y_i^2 \tag{21}$$

Each time we get a new value of $y_i$, we can now compute the running sum $s_n$ as before, plus the running sum of squares, $u_n$. From $s_n$, compute the latest value of $\bar{y}$. Plug that into Equation 20 to

CONTINUED ON PAGE 32

| embedded systems design | MARCH 2011

cover feature

Static code analysis has been around as long as software itself, but you’d swear from
current tradeshows that it was just invented. Here’s how to choose the right
code-analysis tools for your project.

Think static analysis cures all ills? Think again.

Static analysis (or static code analysis) is a field full of contradictions and misconceptions. It's been around as long as software itself, but you'd swear from current trade shows that it was just invented. Static analysis checks the syntactic quality of high-level source code, and yet, as you can tell from listening to the recent buzz, its findings can be used to predict dynamic behavior. It is a precision tool in some contexts and yet in others, it harbors approximations.

With this extent of contradiction, it's hard to believe that all of these statements are accurate. Static analysis, a generic term, only indicates that the analysis of the software is performed without executing the code. So, simple peer review of source code fits the definition just as surely as the latest tools with their white papers full of various incantations of technobabble.

There isn't much point in any such analysis existing in isolation since even if code is perfectly written, it's only correct if it meets project requirements. It's therefore also important to understand how well any such analysis fits within the development lifecycle.

No analysis is good or bad simply by virtue of being static or dynamic in nature. It follows that each analysis tool is

neither good nor bad or perhaps more pertinently, appropriate or inappropriate just because they're statically or dynamically based. It's, then, important to look past the subtle advertising and self-congratulatory white paper proclamations to consider the relevant merits and demerits of static analysis and its ability to predict dynamic behavior. Can a solid static-analysis engine bypass the need for dynamic analysis? In this article, I explore current technologies and explain how static analysis predicts dynamic behavior. This article will help developers understand which method to use under which circumstances.

We'll look specifically at five key attributes of analysis tools, shown in the sidebars. The first three are attributes of static-analysis tools. Notably, these attributes don't comprehensively describe the categories of static-analysis tools, and many tools include more than one of these attributes.

The issue of what's static and dynamic analysis is further confused when there is a requirement to predict dynamic behavior. At that point, the dynamic analysis of code that has been compiled, linked, and executed offers an alternative to the prediction of dynamic behavior through static analysis. Dynamic-analysis tools involve the compilation and execution of the source code either in its entirety or on a piecemeal basis. Again, while many different approaches can be included, these characteristics complete the list of the five key attributes that form the fundamental "toolbox of techniques."

Test-tool vendors offer a plethora of combinations of these key static and dynamic attributes and claim their particular combination to be invaluable to the efficient and effective development of software.
Despite their lofty claims, no single vendor touts an offering that embraces all of these attributes. And, attempting to apply every one of the five techniques through a combination of tools would usually be prohibitively expensive both in terms of capital investment for the tools and labor costs for software testing.

KEY ATTRIBUTES OF STATIC-ANALYSIS TOOLS

1. Automated code review—automates the peer review process to enforce coding rules dictating coding style and naming conventions and to restrict commands available for developers to a safe subset. Code review doesn't predict dynamic behavior, except to the extent that code written in accordance with coding standards can be expected to include fewer flaws that might lead to dynamic failure.

2. Formally-defined language—(such as SPARK Ada) defines desired component behavior and individual run-time requirements. This may be in the form of specially formatted comments in the native language that are ignored by a standard compiler but can be statically analyzed to show that the program is "well-formed," consistent with the design information included in its annotations, and has certain properties specified in those annotations. The annotations therefore make it possible to precisely predict dynamic behavior via static analysis. The Larch/C++ approach is similar in concept and uses a predicate-oriented interface language.

3. Prediction of dynamic behavior through static analysis—models the high-level code to predict the probable behavior of the executable that would be generated from it. This approach builds an approximate mathematical model of the code and then simulates all possible execution paths through that model, mapping the flow of logic on those paths coupled with how and where data objects are created, used, and destroyed. This approximation is used to predict anomalous dynamic behavior that could possibly result in vulnerabilities, execution failure, or data corruption at run time.

Given that none of the vendors are keen to highlight where their own offering falls short, some insight into how to reach such a decision on your own would surely be useful. By considering which attributes are most appropriate for a particular situation, you then know which product to choose.

Although vendors make presentations assuming developers are to work on a virgin project where they can pick and choose what they like, that's often not the case. Many development projects enhance legacy code, interface to existing applications, are subject to the development methods of client organizations and their contractual obligations, or are restricted by time and budget.

The underlying direction of the organization for future projects also influences choices:

• Is this a quick fix for a problem project in the field? Is the search for a software-test tool that will resolve a mystery and occasional run-time error crash in final test?

• Maybe there is a development on the order books that involves legacy code requiring a one-off change for an age-old client, but which is unlikely to be used beyond that. Perhaps you have existing legacy code and want to raise the quality of software development on an ongoing basis for new developments and/or the existing code base.

Or perhaps there is a new project to consider, but the lessons learned from past problems suggest that ongoing enhancement of the software development process would be beneficial.

To address your particular situation, it's initially useful to consider how each of the five key attributes fits into the development process. The diagram in Figure 1 superimposes the different analysis techniques on a traditional "V" development model. Obviously, your particular project may use another development model. In truth, the analysis is model agnostic. A similar representation could be conceived for any other development process model—waterfall, iterative, agile, and so forth. The extent to which it is desirable to cover all elements of the development cycle depends very much on the initial state of development and the de-

During the coding phase, the application of coding standards and hence the use of automated code review is the least controversial of the five attributes. Many tools help with this phase of development, so product selection hinges on:

• Support of specific standards you need to comply with (such as MISRA, IEC 62304, CERT C, or internal company or project standards). Note that where tools claim to cover the same standard, the number of rules checked for that standard will vary from one tool to another.

• How easily you can adopt the tool into your development process.

HOW THE FIVE ARE USED
I describe tools slightly out of the order from which they appear in the sidebars. Automated code review can be applied whether the code under development is for a new project, an enhancement, or a new application using existing code. With legacy applications, automated code review is particularly strong for presenting the logic and layout of such code in order to establish an understanding of how it works with a view to further development. On the other hand, with new development the analysis can begin as soon as any code is written—no need to wait for a compilable code set, let alone a complete system.

Formally-defined languages (or formal methods) are labor intensive and, although they have their benefits, tend to be limited to highly safety-critical applications where functional integrity is absolutely paramount over any financial consideration (such as flight-control systems). Even then, the alternatives outlined here often prove to be more financially prudent and offer similar levels of quality. Unlike the prediction of dynamic behavior through static analysis, the use of Design by Contract principles often in the form of specially-formatted comments in the high-level code can accurately formalize and validate expected run-time behavior of source code. Such an approach requires a formal and structured development process, textbook style, and uncompromising precision. What's more, applying such an approach to legacy code would involve a complete rewrite of it.

Unit testing and execution tracing focus on the behavior of an executing application and are therefore aspects of dynamic analysis. Unit, integration, and system analyses use code compiled and executed in a similar environment to that which is being used by the application under development. Unit testing traditionally employs a bottom-up testing strategy in which units are tested and then integrated with other test units. In the course of such testing, individual test paths can be examined (execution tracing) to establish the most comprehensive coverage analysis. Clearly it's not necessary

4. Execution tracing (or code coverage analysis)—details which parts of compiled and linked code have been executed, often by means of software instrumentation probes that are automatically added to the high-level source code before compilation.

5. Unit testing—snippets of software code are compiled, linked, and built in order that test data (also called "vectors") can be specified and checked against expectations. Unit testing can be extended to include the automatic definition of test vectors by the unit test tool itself.
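The automated code review described above can be illustrated with a toy checker that scans each source line against a rule set; the two rules here are invented for the example and stand in for the hundreds that a MISRA-class tool would apply:

```python
import re

# Two invented rules standing in for a project coding standard:
# each entry is (rule id, description, regex that flags a violation).
RULES = [
    ("R1", "banned unsafe function gets()", re.compile(r"\bgets\s*\(")),
    ("R2", "octal constant, easily misread", re.compile(r"\b0[0-7]+\b")),
]

def review(source):
    """Return (line number, rule id, description) for every rule
    violation, line by line, with no need to compile the code."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule_id, description, pattern in RULES:
            if pattern.search(line):
                findings.append((lineno, rule_id, description))
    return findings
```

Scanning the two-line fragment `gets(buf);` / `int x = 017;` reports R1 on line 1 and R2 on line 2. Because nothing is executed, the check can run on a single file the moment it is written, which is exactly why this attribute fits so early in the coding phase.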

to have a complete code set available in order to initiate tests such as these.

Unit testing is complemented by functional testing, a form of top-down testing. Functional testing executes functional test cases, perhaps in a simulator or in a target environment, at system or subsystem level. Clearly, these dynamic approaches test not only the source code, but also the compiler, linker, development environment, and potentially even target hardware. When the functionality of the code is the primary concern, you have little alternative but to deploy dynamic analysis. Unit test or system test must deploy dynamic analysis to prove that the software actually does what it is meant to do.

Alternatives do exist when robustness testing is the key concern, however. The static prediction of dynamic behavior works well for existing code or less rigorously-developed applications. It doesn't rely on a formal development approach and can simply be applied to the source code as it stands, even when you have no in-depth knowledge of the code. That ability makes this methodology very appealing for a development team in a fix—perhaps when timescales are short, but catastrophic and unpredictable run-time errors keep coming up during system test.

Prediction of dynamic behavior via static analysis can take a number of forms, but it commonly uses the software source code as a model to predict the behavior of that code when it's executed. Each operation is statically evaluated against a superset of the whole range of operating conditions that can occur during program execution. The developer can then analyze the whole data set applicable to the

[Figure 1: The five fundamental test-tool attributes directly relate to the specific development stages of design, code, test, and verification. The diagram superimposes formally-defined language, automated code review, prediction of dynamic behavior through static analysis, execution tracing, and unit testing on a "V" model running from requirements analysis and specification through detailed software design and coding, and back up through unit/module test, software integration, and acceptance and maintenance.]
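The "superset of the whole range of operating conditions" idea can be illustrated with a toy interval analysis in Python; the class and the example expression are mine, not taken from any real tool:

```python
class Interval:
    """Over-approximates every value a variable might hold as [lo, hi]."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Subtraction widens: the worst case pairs each bound of one
        # operand with the opposite bound of the other.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def may_contain_zero(self):
        return self.lo <= 0 <= self.hi

def check_denominator(denom):
    """Flag a division whose denominator interval could include zero."""
    return "possible divide-by-zero" if denom.may_contain_zero() else "safe"
```

With x in [1, 5], the expression x - x + 1 always evaluates to 1 at run time, yet its interval is [-3, 5] because the analysis ignores the correlation between the two occurrences of x. The checker therefore reports a divide-by-zero that can never happen, exactly the kind of "false positive" that such over-approximation produces.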

code under test, rather than discrete data points. Because this approach works with a model, developers can potentially use this technique before any code can be built and run. Such an advantage makes it appear to offer a universal solution.

There is, however, a major downside. The code itself is not executing, but instead is being used as the basis for a mathematical model. As proven by the works of Church, Gödel, and Turing in the 1930s, it is an unfortunate fact that the resulting mathematical model is always an approximation, and what's more, even with future enhancements or entirely new developments that will always be the case for the static prediction of dynamic behavior. It will never be possible for the model to be a precise representation of the code because such representation is mathematically insoluble for all but the most trivial examples. In other words, the goal of finding every defect in a nontrivial program is unreachable unless approximations are included, which by definition will lead to "false positive" warnings.

The complexity of the mathematical model also increases disproportionately to the size of the code sample under analysis. This is often addressed by the application of simpler mathematical modeling for larger code samples in order to keep the processing time for the analysis within reasonable bounds. But the simplifications can increase the number of these "false positives," which has a significant impact on the time required to interpret results. This trade-off can make the whole static-prediction approach unusable for complex applications.

DYNAMIC APPROACH
Another alternative to the static prediction of dynamic behavior involves the automatic definition of test vectors using unit-test tools and hence performing dynamic analysis. Unlike the static prediction of dynamic behavior, this dynamic approach does not and can never analyze the whole data set applicable to the

"The intent of this specific practice is to maintain the bidirectional traceability of requirements for each level of product decomposition. When the requirements are managed well, traceability can be established from the source requirement to its lower-level requirements and from the lower-level requirements back to their source. Such bidirectional traceability helps determine that all source requirements have been completely addressed and that all lower level requirements can be traced to a valid source. Requirements traceability can also cover the relationships to other entities such as intermediate and final work products, changes in design documentation, and test plans." (ISO 26262 standard)
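The "likely problem values" that such automatic test-vector definition targets can be sketched as a simple generator; the heuristic below (the bounds, their neighbours, and zero) is invented for illustration, and real tools typically also mine the code's own branch conditions:

```python
def boundary_vectors(lo, hi):
    """Candidate test inputs for an integer parameter declared to lie
    in [lo, hi]: each bound, the values just inside and just outside
    it, plus zero when it is in range -- classic boundary conditions."""
    candidates = {lo - 1, lo, lo + 1, hi - 1, hi, hi + 1}
    if lo <= 0 <= hi:
        candidates.add(0)
    return sorted(candidates)
```

boundary_vectors(0, 10) yields [-1, 0, 1, 9, 10, 11]: the out-of-range neighbours probe robustness, while the in-range values probe ordinary functionality.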

code under test. It does, however, involve the intelligent analysis of likely problem values such as boundary and inflection conditions, and specifically includes both them and other target values as it automatically generates the test vectors. It's significant that this technique is deploying most (or potentially all) of the development environment and hence testing something that reflects the finished product much more accurately than the static prediction of dynamic behavior. Despite that advantage, it too can be deployed very early in the development cycle.

WHAT ARE WE TESTING HERE?
Perhaps the most telling point with regards to the testing of dynamic behavior—whether by static or dynamic analysis—is precisely what is being tested. Intuitively, a mathematical model with inherent approximations suggests far more room for uncertainty compared with code being compiled and executed in its native target environment.

If the requirement is for a quick-fix solution for some legacy code that will find most problems without involving a deep understanding of the code, the prediction of dynamic behavior via static analysis has merit. Similarly, this approach offers quick results for completed code that is subject to occasional dynamic failure in the field.

However, if you need to prove not only the functionality and robustness of the code but also provide a logical and coherent development environment along with an integrated and progressive development process, it makes more sense to use dynamic unit and system testing. Dynamic unit and system testing enable you to prove that the code is robust and does what it should do in the environment where it will ultimately operate. As soon as the process becomes a critical factor, the case for an extensive dynamic element to test is compelling.

REQUIREMENTS MANAGEMENT AND TRACEABILITY
Most test tools ignore the requirements element of software development. That is reflected in Figure 1, in that none of the five key attributes directly covers requirement traceability at all. But the fact is that static and dynamic analyses fail to prove compliance with the functional requirements of the system. Even the best static and dynamic analyses will not prove that the software fulfills its requirements.

Widely accepted as a development best practice, requirements traceability ensures that all requirements are implemented and that all development artifacts can be traced back to one or more requirements. Most static and dynamic analysis vendors fail to provide what is needed by modern standards such as the automotive industry's draft standard ISO/DIS 26262 or the medical industry's IEC 62304 requiring bidirectional traceability. These standards place constant emphasis on the need for the derivation of one development tier from the one above it.

Such an approach lends itself to the model of a continuous and progressive use first of automated code review, followed by unit test and subsequently system test with its execution tracing capability to ensure that all code functions just as the requirements dictate, even on the target hardware itself—a requirement for the more stringent levels of most such standards.

While this is and always has been a laudable principle, last-minute changes of requirements or code made to correct problems identified during test tend to put such ideals in disarray. Despite good intentions, many projects fall into a pattern of disjointed software development in which requirements, design, implementation, and testing artifacts are produced from
isolated development phases. Such isolation results in tenuous links among requirements, the development stages, and the development teams.

The traditional view of software development shows each phase flowing into the next, perhaps with feedback to earlier phases, and a surrounding framework of configuration management and process (such as Agile and the Rational Unified Process). Traceability is assumed to be part of the relationships between phases. However, the reality is that while each individual phase may be conducted efficiently, the links between development tiers become increasingly poorly maintained over the duration of projects.

The answer to this conundrum lies in the requirements traceability matrix (RTM), shown in Figure 2, which sits at the heart of any project even if it's not identified as such. Whether the links are physically recorded and managed, they still exist. For example, a developer creates a link simply by reading a design specification and using that to drive the implementation.

This alternative view of the development landscape illustrates the importance that should be attached to the RTM. Due to this fundamental centrality, it's vital that project managers place sufficient priority on investing in tooling for RTM construction. The RTM must also be represented explicitly in any lifecycle model to emphasize its importance, as Figure 3 illustrates. With this elevated focus, the RTM is constructed and maintained efficiently and accurately.

[Figure 2: The RTM sits at the heart of the project, defining and describing the interaction between the design, code, test, and verification stages of development. Project managers manage requirements, assign verification and debug tasks, and track defects; software engineers map requirements to the architecture and the model or design specification; development and build engineers implement requirements in the code base, mapping code to design; and test engineers verify requirements against test cases, with each group feeding defects back into the central requirements traceability matrix.]

When the RTM becomes the center of the development process, it has an impact on all stages of design from high-level requirements through to target-based deployment.

• The Tier 1 high-level requirements might consist of a definitive statement of the system to be developed. This tier may be subdivided depending on the scale and complexity of the system.

• Tier 2 describes the design of the system level defined by Tier 1. Above all, this level must establish links or traceability with Level 1 and begin the process of constructing the RTM. It involves the capture of low-level requirements that are specific to the design and implementation and have no impact on the functional criteria of the system.

• Tier 3's implementation refers to the source/assembly code devel-

[Figure 3: The requirements traceability matrix (RTM) plays a central role in a development lifecycle model. Artifacts at all stages of development—Tier 1 high-level requirements, Tier 2 model/design, Tier 3 implementation (source/assembly code), Tier 4 host-based verification, and Tier 5 target-based verification—are linked directly to the requirements matrix, and changes within each phase automatically update the RTM so that overall development progress is evident from design through coding and test.]
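Stripped to essentials, a bidirectional RTM is a pair of mappings; this minimal sketch (all names invented) records links and answers both trace directions, forward coverage and backward validity:

```python
from collections import defaultdict

class RTM:
    """Requirements traceability matrix: bidirectional links between
    requirement IDs and development artifacts such as design items,
    code units, and test cases."""
    def __init__(self):
        self.down = defaultdict(set)  # requirement -> artifacts
        self.up = defaultdict(set)    # artifact -> requirements

    def link(self, requirement, artifact):
        self.down[requirement].add(artifact)
        self.up[artifact].add(requirement)

    def uncovered(self, requirements):
        """Source requirements with no artifact: not yet addressed."""
        return {r for r in requirements if not self.down[r]}

    def orphans(self, artifacts):
        """Artifacts that trace back to no valid source requirement."""
        return {a for a in artifacts if not self.up[a]}
```

Because every link is recorded in both directions at once, a change in any phase immediately shows up in both the coverage and the validity reports, which is the property the standards cited above demand.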

oped in accordance with Tier 2. Verification activities include code rule checking and quality analysis. Maintenance of the RTM presents many challenges at this level as tracing requirements to source code files may not be specific enough and developers may need to link to individual functions. In many cases, the system is likely to involve several functions. Traceability of those functions back to Tier 2 requirements includes many-to-few relationships. It's very easy to overlook one or more of these relationships in a manually managed matrix.

• In Tier 4 host-based verification, formal verification begins. Once code has been proven to meet the relevant coding standards using automated code review, unit, integration and system tests may be included in a test strategy that may be top-down, bottom up, or a combination of both. Software simulation techniques help create automated test harnesses and test case generators as necessary, and execution histories provide evidence of the "testedness" of the code. Such testing could be supplemented with robustness testing if required, perhaps by means of the automatic definition of unit test vectors, or through the use of the static prediction of dynamic behavior. Test cases from Tier 4 should be repeatable at Tier 5 if required. At this stage, we confirm that the software is functioning as intended within its development environment, even though there is no guarantee it will work when in its target environment. However, testing in the host environment first allows the time-consuming target test to merely confirm that the tests remain sound in the target environment.

• Tier 5's target-based verification represents the on-target testing element of formal verification. This frequently consists of a simple confirmation that the host-based verification performed previously can be duplicated in the target environment, although some tests may only be applicable in that environment itself.

Where reliability is paramount and budgets permit, the static analysis of dynamic behavior with its "full range" data sets would undoubtedly provide a complementary tool for such an approach. However, dynamic analysis would remain key to the process.

WHICH APPROACHES SHOULD YOU CHOOSE?
Each of the five key test tool attributes has merit. There is a sound argument which supports traditional formal methods, but the development overheads for such an approach and the difficulty involved in applying it retrospectively to existing code limits its usefulness to the highly safety-critical market.

Automated code review checks for the adherence to coding standards, and is likely to be useful in almost all development environments. Of the remaining approaches, dynamic analysis techniques provide a test environment much more representative of the final application than static predictions of dynamic analysis and offers the means to provide functional

Where requirements traceability is key within a managed and controlled development environment, the progressive nature of automated code review followed by unit, integration, and system test aligns well within the overall tiered concept of most modern standards. It also fulfills the frequent requirement or recommendation to exercise the code in its target environment.

Where robustness testing is considered desirable and justified, it can be provided by means of the automatic definition of unit-test vectors, or through the use of the static prediction of dynamic behavior. Each of these techniques has its own merits, with the former exercising code in its target environment, and the latter providing a means to exercise the full data set rather than discrete test vectors. Where budgetary constraints permit, these mutually exclusive benefits could justify the application of both techniques. Otherwise, the multifunctional nature of the unit-test tool makes it a cost-effective approach.

If there is a secondary desire to evolve corporate processes towards the current best practice, both automated code review and dynamic analysis techniques have a key role to play in requirements management and traceability, with the latter being essential to show that the code meets its functional objectives.

If the aim is to find a pragmatic solution to cut down on the number of issues displayed by a problem application in the field, each of the robustness techniques—that is, the static analysis of dynamic behavior and the automatic definition of unit-test vectors—has the potential to isolate tricky problems in an efficient manner. ■

Mark Pitchford is a field applications engineer specializing in software test with LDRA. He has over 25 years' experience in software development for engineering applications, the majority of which have involved the extension of existing code bases.


Research shows that preemption-threshold scheduling helps to mitigate the deadline-vs.-overhead tradeoff faced by developers of real-time systems.

Lower the overhead in RTOS scheduling

Engineers creating real-time embedded applications typically use a real-time operating system (RTOS) to develop a system as a collection of independent tasks or threads, while maintaining responsiveness to time-critical events. These threads must run periodically or in response to some event and must complete before a certain deadline. Hard real-time systems demand that deadlines be met with absolute certainty, even under worst-case conditions. However, the need to meet deadlines can also result in increased overhead, conflicting with a real-time system's demand for low overhead. One technology—preemption-threshold scheduling (PTS)—reduces overhead while maintaining the ability to meet deadlines under worst-case conditions. In this article, I explain what PTS is, how it works, and how PTS addresses the real-time programming challenges facing embedded system designers.

Preemption-threshold scheduling reduces the number of context switches by inhibiting preemption for a thread-specific range of priorities.1 PTS preserves the ability to meet deadlines 100% of the time, with fewer high-overhead context switches. Developers can elect where and when to use PTS and otherwise allow traditional preemptive scheduling to occur. PTS gives them control over the responsiveness of each system thread and enables them to guarantee that all threads meet their individual deadlines.

PTS also offers other valuable benefits for resource-constrained systems. PTS is optimal among all preemption-limiting

methods for minimizing a system's total stack memory requirements—critical for systems with many threads. PTS also provides a useful mechanism for avoiding priority inversion.

WHY SMARTER PREEMPTION?
Most RTOSes use fully-preemptive schedulers where a task can preempt any lower-priority task at any time. Preemption allows critical processing to occur sooner than less-critical processing, delaying other processing until time is available. An interrupt service routine (ISR) is an example of such time-critical processing, but the same concept is typically extended to cover all tasks in the system. Preemption reduces the response time of critical processing in the system. A shorter response time can be used to achieve a more responsive system or more sophisticated processing. Alternatively, the processor clock speed can be reduced to save power, energy, or cost (through the use of a less expensive processor).

However, preemption has drawbacks:

• Each preemption incurs processing overhead. The scheduler must decide which task to run, and then swap processor states. Modern embedded processors typically have some hardware support to reduce this delay, but it isn't eliminated completely.

• Since preemptions can occur asynchronously among tasks, the system needs enough RAM to fit all worst-case stacks simultaneously. This can be prohibitively expensive for systems with little RAM or many tasks. This stack space must be large enough to accommodate worst-case function call nesting, local variable allocation, ISR activity, and possibly the task control block and context. Adding features such as a GUI, embedded web server, and TCP/IP stack will add multiple threads, each needing its own stack. Even in systems with larger amounts of RAM, running out of memory can be a common problem. Embedded systems usually evolve over time, with each new generation of software accreting a new layer onto the existing code base. This is also known as "The Software Gas Law." Hence, over a long enough time the data memory requirements can fill available memory, making RAM a precious resource. This makes the use of real-time kernels for low-end embedded systems particularly hard if not impossible.

• Systems with caches suffer from cache pollution by other tasks ("cache-related preemption delay") unless precautions are taken. Unfortunately, these precautions reduce performance. As the preempting Task B executes, it can evict data and instructions of the preempted Task A. When Task A resumes execution, it runs more slowly due to increased cache misses, which greatly complicates the worst-case timing analysis, since the preemption can typically occur at any place within Task A. Worst-case timing analysis is critical to ensuring real-time systems will meet their deadlines.

• Many embedded systems use dynamic voltage and frequency scaling to perform processing with as little energy or power as possible. Each task has its optimal frequency/voltage operating point and requires certain peripheral devices to be powered up. Allowing preemption introduces the possibility that the RTOS needs to transition to a new operating voltage and frequency, which further increases delays. Preempted tasks may have been using peripherals that need to remain powered during the preemption to maintain their state. Trying to guarantee the system will meet real-time deadlines even with delays to change the state of power management makes the analysis even more difficult.

To summarize, preemption improves system responsiveness at the cost of greater memory requirements and more difficult timing analysis and performance optimization.

Preemption need not be an all-or-nothing design decision. Many systems can in fact meet all deadlines without full preemption among tasks. In fact, there is a spectrum of possible approaches, ranging from nonpreemptable to fully preemptable. Researchers have been actively investigating this field and determining how to mathematically model the resulting systems in order to understand the design trade-offs better.

Table 1. Definitions of terms.

  Term    Definition
  DVS     Dynamic voltage scaling
  DM      Deadline monotonic (scheduling)
  EDF     Earliest deadline first (scheduling)
  ISR     Interrupt service routine
  MPTAA   Maximal preemption-threshold assignment algorithm
  PCP     Priority ceiling protocol
  PIP     Priority inheritance protocol
  PTS     Preemption-threshold scheduling
  RM      Rate monotonic (scheduling)
  SRP     Stack resource policy
  SRPT    Stack resource policy with preemption-thresholds
  WCET    Worst-case execution time
  WCRT    Worst-case response time

PTS is a promising technique for limiting preemptions.2 PTS tries to minimize preemptions as much as possible while preserving the system's schedulability. (Real-time analysis techniques enable us to determine mathematically if a set of tasks is schedula-


ble—all of its tasks will meet their deadlines in all possible scheduling situations—a fundamental requirement for many embedded systems.)

Researchers have been trying to understand how and when preemptions can be reduced or even eliminated. Baker introduced the Stack Resource Policy (SRP).3 Gai et al. extended it to support PTS, resulting in the stack resource policy with preemption-thresholds (SRPT).4 SRP and SRPT prevent priority inversions and deadlocks and bound the maximum time any task can be blocked.

Normal fully-preemptive scheduling assigns a single priority to each task. This priority determines two aspects of the task's behavior—whether it can preempt another task and whether it can be preempted by another task. PTS assigns a separate priority to each aspect: The nominal priority determines whether this task can preempt other tasks, and the preemption-threshold sets this task's effective priority while executing.

When the task begins execution, its priority is raised to its preemption-threshold. In this way, all the tasks with priorities less than or equal to the preemption-threshold of the executing task cannot preempt it. PTS effectively creates groups of tasks that are not allowed to preempt each other.

It's easily seen that fully-preemptive and fully-nonpreemptive scheduling policies are special boundary cases of PTS. By assigning the preemption-threshold of each task equal to its priority, PTS simplifies to a fully-preemptive scheduling policy. By assigning the preemption-threshold of each task equal to the system's highest priority, PTS simplifies to a fully-nonpreemptive policy.

ASSIGNING PREEMPTION-THRESHOLDS
Preemption between tasks can be limited to occur only when necessary to maintain system schedulability. Tasks that run nonpreemptively with respect to each other can be mapped into the same run-time thread and share the same run-time stack, minimizing the memory requirements and other preemption overheads.

Preemption-threshold assignment works by starting with a default, fully-preemptive system—each task's preemption-threshold is equal to its preemption level. This is also called the identity assignment.

Listing 1. Modified maximal preemption-threshold assignment algorithm.

Algorithm 1 MPTAA(T, Π)
 1: Γ = ΓI  /* initialize preemption-thresholds to identity assignment */
 2: for i = N down to 1 do
 3:   j = i + 1;
 4:   while (schedulable == TRUE and γi < N) do
 5:     γi = γi + 1;
 6:     /* check the schedulability of the affected task */
 7:     schedulable = is_task_schedulable(Tj);
 8:     if (schedulable == FALSE) then
 9:       γi = γi − 1;  /* undo the increase that broke schedulability */
10:     end if
11:     j = j + 1;
12:   end while
13:   schedulable = TRUE;
14: end for
15: return Γ;

Starting with the lowest-priority task (setting i to N in Line 2 of Listing 1), each task's preemption-threshold γi is incremented and the system is evaluated for schedulability. The preemption-threshold γi is incremented until any further increase would make the system unschedulable. Wang and Saksena developed the heuristic shown in Listing 1, which can always find a feasible preemption-threshold assignment if one exists, with a time complexity of O(N² · q(N)), where q(N) is the complexity of the schedulability-checking function. We call this algorithm the maximal preemption-threshold assignment algorithm (MPTAA) because it will always find the preemption-threshold assignment that is larger than any other feasible preemption-threshold assignment.

This algorithm can be used for both fixed-priority as well as dynamic-priority schemes. The procedure is_task_schedulable() evaluates the schedulability of the system using either level-i busy period analysis for fixed-priority systems, or by computing the maximal blocking a task can endure without violating its schedulability for dynamic-priority schemes.

The MPTAA was analyzed and was shown to always find the maximal preemption-threshold assignment if one exists. This means the preemption-thresholds will be as large as possible. Since in our case we always start with a feasible assignment (the identity assignment), the maximal preemption-threshold assignment always exists (in the worst case being equal to the identity assignment) and will always be found by the MPTAA.

PTS REDUCES NUMBER OF PREEMPTIONS
In 1999, Wang and Saksena evaluated how effectively PTS reduces the number of preemptions required to meet deadlines.2 They found that PTS reduced preemptions by 5% to 32% when compared with fully-preemptive scheduling.

They evaluated randomly-generated periodic task sets. Each task set consists of a number of tasks, each with a computation time Ci and period Ti (which is also its deadline). The period is chosen randomly from the range 1 to MaxPeriod using a uniform probability distribution function. The computation time Ci is then defined with a uniform probability distribution function. This simulation approach allows us to see the behavior of different scheduling approaches across a range of workloads and evaluate their sensitivity to different factors.

Wang and Saksena measured the number of context switches occurring in a 100,000 time-unit execution for

Figure 1. Average percentage reduction in number of preemptions for PTS as compared with pure preemptive scheduling (Wang and Saksena, 1999). Percentage reduction in preemptions (0–45%) versus number of tasks (5–50), for MaxPeriod values of 10, 20, 50, 100, 500, and 1,000.

Figure 2. Number of context switches under PTS compared with preemptive scheduling (Jejurikar and Gupta, 2004). Normalized number of context switches and normalized effective context switches (0–0.8) versus percentage of the processor utilization at maximum speed (10–100%).

each of the workloads. The results are presented in Figure 1 and show two interesting trends. PTS does reduce the number of preemptions significantly, depending on the number of tasks and the MaxPeriod parameter. First, the number of tasks in a workload matters because it determines how long the system runs before a task finishes, allowing another task to run without preempting the first task. For constant processor utilization, more tasks means the processor's busy time is broken into smaller pieces, reducing the time to wait. Second, the MaxPeriod parameter matters because a broader range of periods (and hence deadlines) provides more opportunities to limit preemptions while still meeting deadlines. A smaller range reduces preemption reduction because task deadlines are more closely synchronized.

In 2004, Jejurikar and Gupta evaluated using PTS in a cache-based system in conjunction with dynamic voltage scaling to reduce system energy requirements.5 They found that PTS significantly reduced the number of context switches required. Randomly-generated periodic task sets were used, each with 10 to 20 tasks, and deadlines equaled periods. Periods ranged over a factor of 10x.

The authors counted two types of context switches (actual and effective) and compared them against those for a fully-preemptive scheduling approach. Effective context switches occur when a task begins or ends execution, and are important because they directly affect cache performance by disrupting the locality of memory references.

Jejurikar and Gupta (Figure 2) showed that on average, PTS reduced the number of context switches by 90% from the fully-preemptive approach, essentially independent of processor utilization. The number of effective context switches was reduced by 19% to 29%, improving memory system performance by about the same amount.

PTS REDUCES STACK SPACE REQUIREMENTS
Threads in a nonpreemptive group can share stack space since their execution is not interleaved. Run-to-completion threads and some blocking threads qualify. Baker's Stack Resource Policy calculates the possible maximum blocking times of tasks, allowing a system designer to determine whether deadlines can be met.3 SRP involves modifying the scheduler to not start executing a task if its preemption level is not high enough. With this, the task will not block partway through, since it doesn't start running unless all resources are available.

Figure 3. Stack space required for PTS with a fixed-priority policy rises with both task period variability and system utilization. Normalized stack utilization (25–55%) versus system utilization (55–90%), for σP = 25%, 50%, and 75% (Ghattas and Dean, 2007).

Figure 4. Stack space required for PTS with a dynamic-priority policy depends mostly on task period variability. Normalized stack utilization (25–55%) versus system utilization (50–90%), for σP = 25%, 50%, and 75%.

My own research group and others have evaluated how PTS can reduce stack space requirements. We showed that PTS can be used with any scheduling algorithm (such as rate monotonic and earliest deadline first) and it will always render the smallest total stack space requirements6 when using the maximal preemption-threshold assignment algorithm described by Wang and Saksena.2 This further extends their finding that the algorithm minimizes the number of separate stacks required, which does not necessarily minimize total stack space. Gai extended SRP to support preemp-

tion-thresholds and showed that it minimizes total stack space requirements.4

We investigated workload characteristics that affect the stack size reduction achievable through PTS. For example, the optimal stack utilization for some workloads through the use of PTS can be as small as 20% of the stack space utilization required by the fully-preemptive version of the system, while others need 80%. We simulated and analyzed 40,000 randomly-generated systems of 10 threads each to better understand PTS. The results shown in Figure 3 are summarized below.

We first investigated the effect of the system utilization on the optimal stack utilization required with PTS. Using the MPTAA, the optimal stack space required by each system was computed and normalized to the stack space required by the fully-preemptive version of the system. The average normalized stack utilizations were then plotted as a function of the overall system utilization and the standard deviation in the task periods. The results are shown in Figures 3 and 4 for the fixed-priority and the dynamic-priority schemes, respectively.

First, consider the fixed-priority scheme in Figure 3. At low utilizations, the system's stack space requirements might be less than 30% of those of a fully-preemptive system. However, at higher utilization levels, the variation in the tasks' periods increasingly affects the savings attainable. With σP = 25%, the savings are only slightly dependent on the utilization level. On the other hand, as the variance and standard deviation of the period increases, the savings attainable decrease significantly at higher utilizations. This is because task WCETs are larger, so it is much harder to maintain the system's schedulability while minimizing preemptions. This becomes very apparent at high system utilizations where there is far less slack time.

Second, consider the dynamic-priority scheme in Figure 4. The normalized stack space utilizations in this case are much more uniform across all utilization levels. This is because the dynamic-priority scheme (EDF in this case) is much more adaptive to the workload characteristics and has a higher schedulable utilization than a fixed-priority scheme such as RM. This more-efficient scheduling allows more preemption limiting to occur before schedulability is lost.

Another interesting property is the distribution of the 40,000 systems among the different normalized stack space utilization levels. To evaluate this property, we constructed histograms showing the percentage of systems versus the normalized (optimal) stack space utilization achievable by each system. Figures 5 and 6 show this distribution for overall system utilization levels of 60%, 70%, 80%, and 90%. Again we can see that the workloads scheduled with the dynamic-priority schemes depend less on the system utilization level than those in a fixed-priority scheme.

In Part 2 of this series, online at, I show how PTS offers advantages in the schedulability of real-time systems while affording reduced overhead, and other benefits as well. ■

Figure 5. PTS and a fixed-priority policy dramatically reduce the stack space required, even for high-utilization systems. Percent of systems (0–14%) versus normalized memory requirements (20–70%), for Usys = 60%, 70%, 80%, and 90%.

Figure 6. PTS and a dynamic-priority policy reduce the stack space required even more than a fixed-priority policy. Percent of systems (0–14%) versus normalized memory requirements (20–70%), for Usys = 60%, 70%, 80%, and 90%.

Alexander G. Dean, PhD, is associate professor in the Department of Electrical and Computer Engineering at North Carolina State University and a member of its Center for Efficient, Scalable and Reliable Computing. He received his BS EE from the University of Wisconsin and MS and PhD ECE degrees from Carnegie Mellon. His research focus in embedded systems includes compiling for concurrency and performance, energy-efficient use of COTS processors, memory allocation, and benchmarking for real-time systems and robust embedded system design.

ENDNOTES:
1. Preemption-Threshold is a trademark of Express Logic, Inc.
2. Wang, Yun and Manas Saksena. "Scheduling fixed-priority tasks with preemption threshold," Real-Time Computing Systems and Applications (RTCSA '99), 1999. http://ieeexplore.
3. Baker, T. P. "Stack-based scheduling of real-time processes," Proceedings of the 11th Real-Time Systems Symposium, 1990. http://ieeex- =128747&isnumber=3599.
4. Gai, Paolo, Giuseppe Lipari, and Marco Di Natale. "Minimizing Memory Utilization of Real-Time Task Sets in Single and Multi-Processor Systems-on-a-Chip," 22nd IEEE Real-Time Systems Symposium (RTSS'01), 2001. http://doi.ieeecomputer
5. Jejurikar, Ravindra and Rajesh Gupta. "Integrating Preemption Threshold Scheduling and Dynamic Voltage Scaling for Energy Efficient Real-Time Systems," Proceedings of the Ninth International Conference on Real-Time Computing Systems and Applications, 2004. http://dar.
6. Ghattas, Rony and Alexander G. Dean. "Preemption Threshold Scheduling: Stack Optimality, Enhancements and Analysis," Proceedings of the 13th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2007), IEEE, 2007. A version of this article is at ghattas-Preemption.pdf.
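To recap the mechanism at the heart of this article: under PTS, a ready task preempts the running task only when its nominal priority exceeds the running task's preemption-threshold. A minimal sketch of that rule in C follows; the task structure and function names are illustrative assumptions, not the API of any particular RTOS.

```c
#include <stddef.h>

/* Hypothetical task descriptor (illustration only).
   Convention: a higher number means a higher priority. */
typedef struct {
    int nominal_priority;      /* priority used to request the CPU */
    int preemption_threshold;  /* effective priority while running */
} task_t;

/* PTS preemption rule: a ready task may preempt the running task only
   if its nominal priority is strictly greater than the running task's
   preemption-threshold (the raised, effective priority it holds while
   executing).  Returns nonzero if a context switch should occur. */
int pts_should_preempt(const task_t *running, const task_t *ready)
{
    if (running == NULL)
        return 1;  /* CPU idle: always dispatch the ready task */
    return ready->nominal_priority > running->preemption_threshold;
}
```

Setting each task's threshold equal to its own priority restores fully-preemptive scheduling, while setting every threshold to the system's highest priority makes the task set fully nonpreemptive, matching the two boundary cases described in the article.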

programmer’s toolbox small, integer-only computers. Better yet, compute in dou-
ble precision, where the maximum value is a staggering
from page 13
get a running value of V and therefore of σ. How big is that? Imagine that you’re sampling a signal at
As before, it’s almost easier to show you the code than it the rather high (for embedded systems) rate of 1000 Hz.
is to describe the process. See Listing 2. Imagine also that you’ve chosen your units wisely, so that all
Next, look at Figure 3, a graph like Figure 2 only you see the measurements are of order 1. How long will it take for
the sequential values of the average and the error band, and your filter to overflow?
the way they evolve as new data comes in.
Perhaps you’ve seen graphs with error bars showing the It’s 10308÷1000=10305 sec = 3×10297yr.
expected deviations from each measurement. That’s what I’m
trying to display here, only Microsoft Excel won’t let me That’s 10287 times the current age of the universe!
show error bars. So the large black dashes will have to do. This range is head and shoulders different from, say, the
Y2K problem. The range of a double precision number is so
CAVEAT CALCULATOR staggeringly large, I think it’s safe to say we won’t see the
I need to mention one or two practical problems. Take a look sums overflow in our lifetimes.
at the running sums in Equations 10 and 21. See any poten- So one acceptable solution to the overflow problem is
tial problems? simply to ignore it. As long as you’re using floating point—or
That’s right. The two sums can grow without limit. If even better, double precision—arithmetic, the overflow is
you process the input signal at a high enough rate, and for simply not going to happen.
Another problem in the calculation is more challenging.
It lies in Equation 20. Suppose that the average value of the

! How long will it take for your filter to

overflow? That’s 10287 times the current
data set, y, are reasonably large, but the noise is very small.
The residuals in Equation 12 are all going to be small, as is
their sum in Equation 13.

! age of the universe! This range is head

and shoulders different from, say, the
But we aren’t summing residuals in Equation 20—we’re
summing the squares of the measurements themselves. This
sum can be a whole lot larger. For that reason, we can end up,

in Equation 20, subtracting nearly equal numbers and there-
Y2K problem. It’s safe to say we won’t by introducing a loss of precision.
see the sums overflow in our lifetimes. There is no quick cure for this problem. If you need the
benefits of sequential processing, you’ll have to accept that
your end result may not be as accurate as you might like.
long enough, sooner or later both sums are going to over- You can take quite a number of steps to fix the problem,
flow. Presumably, the sum of the squares is going to over- but none are straightforward, and all are very much problem
flow first. dependent. You might choose, for example, to reference the
We can do certain things to alleviate the problem. First, measurement to some expected value, other than zero. In that
note carefully that the measurements yi can have units. If Alpha Centauri example, a baseline of four light-years might
you’re measuring the distance to Alpha Centauri and using help.
units of millimeters, you’re going to have overflow problems Like most problems, this one can be solved but don’t ex-
a lot sooner. One thing you can do is to choose your units pect your solution to one problem to work in all possible
more carefully and normalize them to some standard value— other problems. In the real world of processing real-world
Astronomical Unit (AU), for example. signals, you can expect to tailor the solution to the problem.
Or you might split up the computation into stages. At How surprising is that?
some point, where the sums are starting to get out of hand,
you might just store away all the values to date and start a COMING SOON
new set of computations. This month, I’ve shown you how to calculate the mean, vari-
The easiest thing to do is to normalize all the numbers by ance, and standard deviation of a set of measurements. I
the number of samples. That is, maintain running averages showed you how to do it the classical way, processing all the
rather than running sums. We can do that easily, as shown in data in batch mode, and I’ve shown you the tricks to do the
Equation 9. same calculation sequentially, as the data comes in. I stu-
Interestingly enough, one approach to the problem is to diously avoided talking about statistics, so we clearly have
ignore it. This potential for overflow is an excellent argu- more things to talk about. I’ll do that next month. See you
ment for computing everything in floating point, even in then. ■

32 MARCH 2011 | embedded systems design |
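The column's Listing 2 isn't reproduced in this excerpt. A minimal C sketch of the mitigations it describes — keep running averages rather than running sums, and optionally reference each measurement to a nonzero baseline to reduce the loss of precision from summing squares of large values — might look like this. The names and structure are assumptions for illustration, not the author's code.

```c
#include <stddef.h>

/* Running estimates of mean and variance that store averages, not
   sums, so the state never grows with the sample count.  A fixed
   baseline (an expected value such as the four light-year example)
   can be subtracted from every sample before squaring. */
typedef struct {
    double baseline;  /* expected value subtracted from each sample */
    double mean;      /* running mean of (y - baseline)             */
    double mean_sq;   /* running mean of (y - baseline)^2           */
    size_t n;         /* samples processed so far                   */
} running_stats_t;

void rs_init(running_stats_t *s, double baseline)
{
    s->baseline = baseline;
    s->mean = 0.0;
    s->mean_sq = 0.0;
    s->n = 0;
}

void rs_add(running_stats_t *s, double y)
{
    double d = y - s->baseline;
    s->n++;
    /* Incremental average: m += (x - m)/n keeps a running mean
       instead of an unbounded running sum. */
    s->mean    += (d     - s->mean)    / (double)s->n;
    s->mean_sq += (d * d - s->mean_sq) / (double)s->n;
}

double rs_mean(const running_stats_t *s)
{
    return s->baseline + s->mean;
}

/* Population variance: mean of squares minus square of the mean,
   computed on baseline-referenced values to limit cancellation. */
double rs_variance(const running_stats_t *s)
{
    return s->mean_sq - s->mean * s->mean;
}
```

Because the state holds averages of baseline-referenced values, neither term grows with the number of samples, and the subtraction in rs_variance() operates on small quantities when the baseline is chosen near the expected signal level.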

By Jack G. Ganssle
break points
Watchdogs redux
N early a decade ago I wrote a
series of three articles about
watchdog timers (WDT). It
was my contention at the time that
most WDTs were poorly designed.
command. Try to write to it with any-
thing else in the MSB and the system
will reset. The lower bits have a vari-
ety of configuration information that
disables/enables the WDT, selects the
Too many still are.1 clock source, or even switches the
I won’t repeat my arguments here WDT to act as a simple timer.
since they’re available online on Em- What are the odds that crashed ( code will issue exactly a 0x5a in the
4024495). However, I stated that a upper byte? Pretty slim, of course.
WDT is the last line of defense But not zero, and probably a lot less
against product failure. Designed cor- than zero. A lot of move instructions
rectly, the system is guaranteed to re- will be in the code, and some will be
cover from a crash; anything else may followed by an ADD, which is what
result in a ticked-off customer. Or 0x5a represents. Much better would
loss of an expensive mission. Or, for be a configuration that allows one to
the unfortunate users, injury and access the WDT configuration only
death. once. Or perhaps only once after re-
Remember that when a program suming from a low-power mode.
crashes, it generally runs off and
starts executing code from some ran-
dom address. Rarely does the applica- ! Vendors and researchers
are trying some new
Then there’s Freescale’s newish
…execution actually stop; if it does stop, that's usually only after it executes a lot of incorrect instructions.

So what's the state of the art today? A complete survey is impossible, but here are a few data points.

Texas Instruments' MSP430 family is composed of a wide range of nifty very low power 16-bit microcontrollers. The documentation shows a very impressive-looking WDT block diagram (see Figure 13-1 in TI's PDF), but the reality is less thrilling.2 At any time, the code can turn the protection mechanism off, which means a crashed program running rogue code can issue an instruction that disables the WDT. The system crashes and never recovers.

The MSP430's instruction set is refreshingly simple, generally using a single word to represent, for instance, a MOV instruction. A nice feature is that to change any WDT setting one uses a MOV with the upper byte set to 0x5a; the lower 8 bits contain the desired configuration.

Then there's Freescale's 32-bit Coldfire+ line, like the MCF51Qx.3 Instead of "watchdog," Freescale prefers the awkward phrase "Computer Acting Properly" (COP). But it does offer a very intriguing feature. In general, one pets the watchdog, uh, COP, by writing 0x55 and then 0xaa to the control register. But in one mode, that sequence must be sent in the last 25% of the COP timeout period. A premature write results in a reset. The odds of an errant program getting the timing Goldilocks-correct (not too often, nor too infrequently) are tiny.

The part also generates a reset if any attempt is made to execute an illegal instruction. That's somewhat different from most CPUs, which issue an illegal op-code interrupt. I rather like Freescale's approach, since interrupt handlers are not guaranteed to work if the code crashes: the stack may be blown, the PC corrupt (on some CPUs a fault is taken if the PC is odd), or the vector base register changed.
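The MSP430's password scheme is easy to model on a host machine. The sketch below borrows the WDTPW and WDTHOLD names from TI's device headers, but the shadow register and reset flag are invented for illustration; it simply demonstrates the rule that any write lacking the 0x5a password forces a reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WDTPW    0x5A00u  /* password: upper byte must be 0x5a */
#define WDTHOLD  0x0080u  /* setting that stops the watchdog   */

static bool     reset_requested;  /* models the reset (PUC)        */
static uint16_t wdt_settings;     /* models the lower, config byte */

/* Model of the MSP430 rule: any write to WDTCTL whose upper byte
 * is not the 0x5a password immediately resets the part. */
static void wdtctl_write(uint16_t value)
{
    if ((value & 0xFF00u) != WDTPW)
        reset_requested = true;          /* wrong password: reset    */
    else
        wdt_settings = value & 0x00FFu;  /* lower 8 bits take effect */
}
```

A random rogue write has a 255-in-256 chance of tripping the reset, which is the nice part; yet a deliberate `wdtctl_write(WDTPW | WDTHOLD)` still disables protection entirely, which is exactly the weakness noted above.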

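The Coldfire+ COP's windowed 0x55/0xaa sequence can be modeled the same way. Everything here — the timeout constant, the tick counter, the register interface — is illustrative rather than Freescale's actual implementation; it just shows why an errant program is unlikely to land the service sequence inside the last 25% of the period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-side model of a windowed COP. The counter expiring would also
 * reset the part; that path is not modeled here. */
#define COP_TIMEOUT 1000u                            /* ticks per period */
#define WINDOW_OPEN (COP_TIMEOUT - COP_TIMEOUT / 4u) /* last 25% only    */

static uint32_t cop_count;       /* ticks elapsed since last good service */
static bool     cop_reset;
static bool     saw_first_byte;  /* 0x55 already written this sequence    */

static void cop_write(uint8_t v)
{
    if (cop_count < WINDOW_OPEN) {   /* too early: premature write resets */
        cop_reset = true;
        return;
    }
    if (!saw_first_byte && v == 0x55u) {
        saw_first_byte = true;
        return;
    }
    if (saw_first_byte && v == 0xAAu) {  /* correct 0x55, 0xaa pair */
        saw_first_byte = false;
        cop_count = 0;                   /* dog is fed */
        return;
    }
    cop_reset = true;                    /* wrong value or wrong order */
}
```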
suggests that it’s a good idea they monitor addresses. They simply
to fill unused flash at link time suck in compressed trace data from
with an illegal op code, and on the processor’s serial trace data port.
power-up fill all of RAM with a The paper talks about using an ARM
similar instruction, so that er- CPU, but other parts also support
rant code waltzing through various kinds of serial trace data,
memory is likely to generate a and there’s even a standard
reset. named Nexus-5001, which is
Another nice touch is that the IEEE–ISTO 5001–2003.5
reset pin is open drain and is assert- ARM supplies a bewildering ar-
ed when any of these errors occur. Tie ray of debugging IP, including a
it to the peripheral reset inputs. Even if macrocell that sends trace data out
wandering code issues output instruc- just two pins. The so-called Program
tions their potentially scrambled little Flow Trace (PFT) is described at
brains will be straightened out. ARM’s infocenter.6 It can be set up in
STMicroelectronics has a line of a zillion different configurations, but
Cortex-M3 devices. The M3 has be-
come extremely popular for lower-end
embedded devices, and ST’s STM32F
! The WDT cannot be
disabled once enabled—
will at the very least emit compressed
information that lets one track
changes in program flow. For instance,
is representative of these parts (al-
though the WDT is an ST add-on and
does not necessarily mirror other ven- ! good thinking, folks!
But oddly, the other
the execution of a branch instruction
pushes address data out. So does
Branch with Link, which is ARM’s way

dors’ implementations). The STM32F configuration registers of calling a subroutine. Similar in-
has two different protection mecha- structions in Thumb mode do the
nisms. An “Independent Watchdog” is can be changed at will. same.
a pretty vanilla design that has little There’s a lot to like about the
going for it other than ease of use. But can be changed at will, which can Shankar and Lysecky’s approach. First,
their Window Watchdog offers more make the watchdog behave incorrectly. modern processors are barricaded be-
robust protection. When a countdown hind caches, pipelines, and speculative
timer expires, a reset is generated, A NOVEL APPROACH execution logic. Even if you’re using a
which can be impeded by reloading The latest issue of IEEE Embedded Sys- CPU that has address pins (as opposed
the timer. Nothing special there. But if tems Letters has an article that practi- to a self-contained microcontroller or
the reload happens too quickly, the cally grabbed me by the throat.4 Titled IP on an FPGA), those address lines
system will also reset. In this case “too “Control Focused Soft Error Detection simply don’t match anything that is
quickly” is determined by a value one for Embedded Applications” by going on in the processor’s grey cells.
programs into a control register. Karthik Shankar and Roman Lysecky, Second, pins are valuable. By
Another cool feature: it can gener- it’s a bit academic and rather a slog to squeezing addresses through a couple
ate an interrupt just before resetting. get through. But the authors have of debug pins few of these valuable re-
Write a bit of code to snag the inter- come up with a fascinating twist on sources are consumed. And logic is
rupt and you can take some action to, the concept of watchdog timers. In cheaper than pins; adding circuitry to
for instance, put the system in a safe fact, they don’t use the word “watch- decompress the trace stream and make
state or to snapshot data for debugging dog,” and no timers are involved. sense of it may cost very little. In the
purposes. ST suggests using the inter- The idea is quite simple and at first paper, the authors state that the over-
rupt service routine (ISR) to reload glance not particularly novel: monitor head for their particular approach was
the watchdog—that is, kick the dog so the addresses the processor issues and only 3% of the silicon area of an
a reset does not occur. Don’t take their compare those against a profile ob- ARM1156T.
advice. If the program crashes, the in- tained during development. If the Third, most watchdogs use ad hoc
terrupt handlers may very well contin- CPU goes to an unexpected location, approaches that just are not reliable
ue to function normally. And using an take some sort of remedial action. Do indicators of system operation. This
ISR to reload the WDT invalidates the the same if it doesn’t issue an expected new approach lets the designer decide
entire reason for a window watchdog. address. The authors go further and just how fine-grained the monitoring
The WDT cannot be disabled once compare against the number of ex- should be. Check an occasional call…
enabled—good thinking, folks! But pected loop iterations and the like. or watch every instance of every
oddly, the other configuration registers But what got my attention is how branch and call. | embedded systems design | MARCH 2011 35

Fourth, there's no overhead in the system's firmware. Zilch. Unlike traditional watchdog approaches, which require one to seed the code with instructions to periodically kick the dog to keep it from resetting the system, address profiling is transparent to the software.

There are a few caveats, of course. The logic needed to make sense of the address information is substantial, and is probably impractical unless implemented in an FPGA. Building such a monitor would be a lot of work. But it needs to be done only once, and can ever after be used in a succession of products. I can see this as packaged IP sold by a third party, or as an open-source project.

Remember, though, there's no guarantee that your ARM CPU will have the trace logic needed. Every vendor is free to include the IP or not.

GOING DEEPER
I haven't thought out all of the implications or possible ways to actually use this idea in a real embedded system, but here are a few ideas.

One problem with the proposed approach is that every recompile will shift all of the system's addresses, requiring the developer to re-profile the code to determine where the branches are. An alternative is to monitor just function calls, but use a level of indirection. Build a table of jump instructions that lives at a fixed address in memory; each entry corresponds to a particular function. Make calls through that table. Then monitor those jumps. One would have to be careful that the compiler didn't optimize the jumps away.

This does mean the software changes a bit, of course. But more than a few of us use these jump tables anyway. They can aid debugging and sometimes simplify on-the-go firmware updating.

ARM's program flow trace also, very intriguingly, sends address information out when a DSB, DMB, or ISB instruction is executed. DSB (Data Synchronization Barrier) holds up program flow until all of the instructions before it complete; DMB (Data Memory Barrier) ensures that memory accesses before it complete before one after it starts; and ISB (Instruction Synchronization Barrier) flushes the instruction pipeline, ensuring that the instructions after the ISB are fetched from memory or the cache.

Why are these interesting instructions? ARM mandates that at the very least a DMB be issued before locking or unlocking a mutex, and before incrementing or decrementing a semaphore. One could monitor these, just like watching branches, to ensure that the code is properly going through all of the activities mediated by the RTOS. In fact, monitoring only these actions may be simpler than monitoring addresses, because program flow can change quite a lot even from a simple change (requiring re-profiling), but resource management tends to change much less frequently.

As for the DSB and ISB instructions, I'm not quite sure how they could offer useful watchdog information, but something makes me think they could offer some interesting possibilities.

If you're building an electronic toothbrush, watchdogs are probably not terribly important. But an automated reset helps boost consumer confidence in our products' quality. Everyone hates the "remove batteries and wait 30 seconds" dance.

Many vendors are putting more thought into their WDT designs; some are doing a pretty good job. But we have a long way to go, and the wise developer will apply sound engineering practices to this often-neglected part of the system. The article I cited shows that some ingenious approaches are being used. Consider adding a bit of hardware support if robustness is an important requirement. ■

ENDNOTES:
1. Ganssle, Jack. "Born to Fail," December 12, 2002.
2. Texas Instruments. "MSP430x5xx/MSP430x6xx Family User's Guide," Literature Number SLAU208H, June 2008, revised December 2010.
3. Freescale Semiconductor. "MCF51QM128 Reference Manual: Supports the MCF51QM32, MCF51QM64, and MCF51QM128," Document Number MCF51QM128RM, Rev. 0, November 11, 2010. 32bit/doc/ref_manual/MCF51QM128RM.pdf
4. Shankar, Karthik and Roman Lysecky. "Control Focused Soft Error Detection for Embedded Applications," IEEE Embedded Systems Letters, Vol. 2, No. 4, December 2010.
5. The Nexus 5001 Forum—A Program of the IEEE–ISTO.
6. ARM. "CoreSight Program Flow Trace Architecture Specification, v1.0." ARM, 2008. Available at http://infocenter.arm.com/help/topic/com.arm.doc.ihi0035a/IHI0035A_coresight_pft_architecture_spec.pdf.
