Abstract. In this paper we compare three systems for tracing and debugging Haskell programs: Freja, Hat and Hood. We evaluate their usefulness in practice by applying them to a number of moderately complex programs in which errors had deliberately been introduced. We identify the strengths and weaknesses of each system and then form ideas on how the systems can be improved further.
1 Introduction
The lack of tools for tracing and debugging has deterred software developers from using functional languages [13]. Conventional debuggers for imperative languages give the user access to otherwise invisible information about a computation by allowing the user to step through the program computation, stop at given points and examine variable contents. This tracing method is unsuitable for lazy functional languages, because their evaluation order is complex, function arguments are usually unwieldy, large unevaluated expressions, and generally computation details do not match the user's high-level view of functions mapping values to values.
In the middle of the 1980s a wave of research into tracing methods for lazy functional languages started and has been increasing since. In this paper we compare the tracing systems that (a) cover a large subset of a standard lazy functional language, namely Haskell 98 [9], (b) are publicly available and (c) are still actively developed. Freja¹ [7, 5] is a system that creates an evaluation dependency tree as trace, a structure based on the idea of declarative/algorithmic debugging from the logic programming community. Hat² [12, 11] creates a trace that shows the relationships between the redexes (mostly function applications) reduced by the computation. The most recent system, Hood³ [2], enables the programmer to observe the data structures at given program points. It can basically be used like print statements in imperative languages, but the lazy evaluation order is not affected and functions can be observed as well.
¹ http://www.ida.liu.se/~henni
² http://www.cs.york.ac.uk/fp/ART
³ http://www.haskell.org/hood
In this paper we compare Freja 1.1, Hat 1.0 and the July 2000 release of Hood. We evaluate the systems in practice by applying them to a number of moderately complex programs in which errors are deliberately introduced. Tracing systems are interactively used tools. In this paper we concentrate on the usefulness of the systems for the programmer. Runtime and space usage measurements are reported in other papers [5, 6, 11]. We do not aim for a quantitative comparison to crown a winner. Only with a large number of programmers could we have obtained statistically valid data about, for example, how long it takes to locate a specific error with a specific system. Even these data depend, for example, on how well the programmers are trained for a system, especially because the systems are rather different. Our aim is to explore the design space of tracers and gain insights for the future development of tracing and debugging systems.
Our experiments highlight and sometimes even uncover previously unnoticed similarities and distinguishing features of the three systems. The experiments enable us to evaluate the usefulness of system features and lead us to new ideas for how the current systems can be improved or even be combined.
The paper is structured as follows. Section 2 gives a short introduction to each of the three systems. Section 3 compares the systems with respect to their approach to tracing, design and implementation. Section 4 reports on our practical experiments and the insights they gave us into the systems' distinguishing properties and their usefulness. Section 5 briefly describes other systems for tracing and debugging. Section 6 concludes.
To give an idea about what the three tracing systems provide and how they are used we give a short introduction here. Because all three systems are still under rapid development we try to avoid details that may change soon.
We demonstrate the use of each system with the following example program.⁴
head (x:xs) = x
Note that the evaluation in Section 4 is based on experiments with far larger programs.
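The traces shown in Sections 2.1 to 2.3 suggest the shape of the complete example program. The following is a minimal sketch under that assumption, not the authors' listing: the name `result` stands in for main (sidestepping the differing types that the systems expect of main), the type signatures are added, and the deliberately faulty definition of last is inferred from the debugging sessions.

```haskell
import Prelude hiding (head, last)

-- Assumed name: `result` plays the role of the example's main.
result :: (Int, Int)
result = let xs = [4*2, 3+6] in (head xs, last xs)

head :: [a] -> a
head (x:xs) = x

-- Deliberately faulty (inferred from the traces): the first equation also
-- matches a one-element list, so the second equation is never reached and
-- last is eventually applied to [].
last :: [a] -> a
last (x:xs) = last xs
last [x]    = x
```

Under these assumptions, demanding the first component of `result` yields 8, while demanding the second component aborts with a pattern-match failure, which is the error the three sessions track down.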
⁴ Freja actually expects main to be of type String and the other two systems expect it to be of type IO (). Here we abstract from the details of input/output.
2.1 Freja
Freja is a compiler for a subset of Haskell 98. A debugging session consists of the user answering a sequence of questions. Each question concerns a reduction of a redex (that is, a function application) to a value. The user has to answer yes if the reduction is correct with respect to his intentions, and no otherwise. In the end the debugger states which reduction is the cause of the observed faulty behaviour, that is, which function definition is incorrect.
The first question always asks if the reduction of the function main to the result value of the program is correct. If the question about the reduction of a function application is answered with no, then the next question concerns a reduction for evaluating the right-hand side of the definition of this function. Freja can be used rather similarly to a conventional debugger. The input no means "step into current function call" and the input yes means "go on to next function call". If the reduction of a function application is incorrect but all reductions for the evaluation of the function's right-hand side are correct, then the definition of this function must be incorrect for the given arguments.
The following is a debugging session with Freja for our example program. The symbol ⊥ represents an error and the symbol ? represents an expression that has never been evaluated and whose value hence cannot have influenced the computation.
main ⇒ (8, ⊥)      no
4*2 ⇒ 8            yes
head [8,?] ⇒ 8     yes
last [8,?] ⇒ ⊥     no
last [?] ⇒ ⊥       no
last [] ⇒ ⊥        yes
Bug located! Erroneous reduction: last [?] ⇒ ⊥
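The question-and-answer procedure described above can be sketched as a search over a tree of reductions. This is only an illustration of the principle, not Freja's implementation: the type `EDT`, the functions `locate` and `userSays`, and the hand-built tree `exampleEDT` are assumptions made for this sketch, reductions are kept as plain strings, and `_|_` is an ASCII stand-in for the error value.

```haskell
import Data.Maybe (mapMaybe)

-- A node is one reduction (shown as text) together with the reductions
-- performed while evaluating the right-hand side of the applied function.
data EDT = Node String [EDT]

-- `correct` plays the role of the user: True for "yes", False for "no".
-- Returns the erroneous reduction, if there is one.
locate :: (String -> Bool) -> EDT -> Maybe String
locate correct (Node r children)
  | correct r = Nothing
  | otherwise = case mapMaybe (locate correct) children of
      (bug : _) -> Just bug   -- step into the first incorrect child
      []        -> Just r     -- all children correct: r itself is erroneous

-- Assumed tree for the example program; "_|_" denotes the error value.
exampleEDT :: EDT
exampleEDT =
  Node "main => (8,_|_)"
    [ Node "4*2 => 8" []
    , Node "head [8,?] => 8" []
    , Node "last [8,?] => _|_"
        [ Node "last [?] => _|_"
            [ Node "last [] => _|_" [] ] ]
    ]

-- Answers as in the session: only these three reductions are judged wrong.
userSays :: String -> Bool
userSays r = r `notElem`
  [ "main => (8,_|_)", "last [8,?] => _|_", "last [?] => _|_" ]
```

With these definitions, `locate userSays exampleEDT` returns the reduction of last applied to the one-element list, matching the outcome of the session. A reduction is blamed exactly when it is judged incorrect and none of its children is.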
2.2 Hat
Hat consists of a modified version of the nhc98 Haskell compiler⁵ and a separate browser program. A program compiled for tracing executes as usual except that alongside the normal computation it builds a redex trail in heap and instead of terminating at the end it waits for the browser to connect to it. The browser shows the output of the program. The user selects a part of it and asks the browser for its parent redex. The parent redex of an expression is the redex that through its own reduction created the expression. Each part of the redex has again a parent redex which the browser shows on demand. A trail ends at the function (redex) main, which has no parent. Debugging with Hat works by going from a faulty output or error message backwards until the error is located.
⁵ http://www.cs.york.ac.uk/fp/nhc98
The browser has a graphical user interface which we do not discuss here. Basically the system is used as follows to locate the error in our example program. The program aborts with an error message and the browser directly shows its parent redex: last []. The user is surprised that the function last is ever called with an empty list as argument and asks the browser for the parent redex of last []. The answer, last (3+6:[]), makes clear that the definition of last is not correct for a single-element list. The browser presents the redex trail as shown in the following figure. To demonstrate how the parent of a subexpression is presented (4*2 is the parent of 8), more of the redex trail is shown than is needed for locating the error.
last []
last (3+6:[])
last (8:3+6:[])
5 4*2
main
The browser can also show where in the program text, for example, last is called with the argument [] in the equation for last (x:xs).
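The backward browsing described above amounts to repeatedly following parent links. A minimal sketch, assuming the simplified trail of the example: the names `parentOf` and `trailFrom` are hypothetical, expressions are kept as plain text, and the link from 4*2 to main is an assumption read off the simplified figure rather than something the browser output states.

```haskell
-- Parent links of the (simplified) redex trail for the example program,
-- transcribed by hand; expressions are kept as plain text.
parentOf :: String -> Maybe String
parentOf "last []"         = Just "last (3+6:[])"
parentOf "last (3+6:[])"   = Just "last (8:3+6:[])"
parentOf "last (8:3+6:[])" = Just "main"
parentOf "8"               = Just "4*2"
parentOf "4*2"             = Just "main"   -- assumed link
parentOf _                 = Nothing       -- main has no parent

-- Follow parent links backwards until a redex without parent is reached.
trailFrom :: String -> [String]
trailFrom e = e : maybe [] trailFrom (parentOf e)
```

Starting from the redex that raised the error, `trailFrom "last []"` lists the chain of applications of last back to main. Unlike Freja, nothing in this structure says where the error is; the user has to recognise the suspicious redex along the way.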
2.3 Hood
Hood currently is simply a Haskell library. A user annotates some expressions in a program with the combinator observe, which is defined in the library. While the program is running, information about the values of the annotated expressions is recorded. After program termination the user can view for each annotation the observed values.
We annotate the argument of last in our example program:
main = let xs = [4*2, 3+6]
       in (head xs, last (observe "last arg" xs))
When the modified program terminates it gives us the following information:
-- last arg
  _ : _ : []
The symbol _ represents an unevaluated expression. Note that the first element of the list xs is evaluated by the program, but not by the function last.
To gain more insight into how the program works we observe the function last, including all its recursive calls:
last = observe "last" last'
So last is called with an empty list. We draw the conclusion that last applied to the one-element list caused this erroneous call, but strictly the information provided by Hood does not imply this.
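[Figure 1: the EDT for the example program; its nodes include the reductions of main, last [?] and last [].]
[Figure 2: the redex trail for the example program; its nodes include main, (,), 4 * 2, 8, head, 3 + 6, last, the two : nodes and [].]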
Freja creates an Evaluation Dependency Tree (EDT) as trace. Each node of the tree is a reduction as shown in the browser. The tree is basically the derivation/proof tree for a call-by-value reduction with miraculous stops where expressions are not needed for the result. The call-by-value structure ensures that the tree structure reflects the program structure and that arguments are maximally evaluated. Figure 1 shows the EDT for our example program of Section 2. The symbol ⊥ represents the value of the error message.
Hat creates a redex trail as trace. A redex trail is a directed graph of value nodes and redex nodes. Each node, except the node for main, has an arrow to its parent redex node. Because subexpressions of a redex may have different parents or may be shared, redex nodes may contain arrows to nodes of their subexpressions. Figure 2 shows the redex trail for our example program of Section 2. Dotted arrows point to subexpressions. Both dashed and solid arrows denote the parent relationship. (8,⊥) is the result value of the computation. As in Freja, ⊥ represents the value of the error message.
The graphs of the two trace structures are laid out to stress their similarity. All arrows of the EDT are also present in the redex trail but point in the opposite direction. If the redex trail held information about which parent relations correspond to reductions (these are shown as solid arrows), then the EDT could be constructed from the redex trail (however, see also the next paragraph and Section 4.1 about free variables). In contrast, the redex trail contains more information than the EDT, because it additionally links every value with its parent redex and describes how expressions are shared.
The redex trail shown in Figure 2 is a simplified version of the one that is really created by Hat. The real redex trail has an additional node xs with parent main and children 4 * 2, 3 + 6, the two : nodes and []. That is, the redex trail also records the reduction of the let expression. The whole let expression is a redex, but in the redex trail it is represented by the defined variable xs. Similarly a node xs ⇒ [8,?] that records the reduction of the let expression could be added to the EDT. So recording a let reduction is an option for both the EDT and the redex trail, and the implementors of Freja and Hat made different decisions with respect to this option. On the one hand recording let reductions leads to larger traces with an unusual kind of redex. On the other hand it enables more fine-grained tracing (cf. Section 4.3).
Because Hood observations contain values as they are demanded in a given context, whereas both the EDT and the redex trail contain values in their most evaluated form, it is not possible to gain Hood observations from either the EDT or the redex trail. Conversely, even observing every subexpression of a program with Hood would not enable us to construct an EDT or redex trail, because there is no information about the relations between the observations.
The developers of the three systems made different choices about the level at which they implemented the creation of the trace. In Freja the trace is created in the heap directly by modified instructions of the abstract graph reduction machine. Hat transforms the original Haskell program into another Haskell program. Running the compiled transformed program yields the redex trail in addition to the normal result. Finally, in Hood the trace is created as a side effect by the combinator observe, which is defined in a Haskell library.
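To make the library-level approach concrete, the following sketch shows how a combinator can record a trace as a side effect using unsafePerformIO. It is not Hood's implementation: `observeShow`, `observations` and `readObservations` are hypothetical names, and the sketch is deliberately cruder than Hood in one important respect. It logs the fully shown value, so inspecting the log may force evaluation that the program itself never demanded, whereas Hood records only the parts that were demanded.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global log of (label, shown value) pairs; NOINLINE keeps it a single cell.
{-# NOINLINE observations #-}
observations :: IORef [(String, String)]
observations = unsafePerformIO (newIORef [])

-- Hypothetical stand-in for an observe-like combinator: returns its argument
-- unchanged and, as a side effect, logs it when the result is demanded.
{-# NOINLINE observeShow #-}
observeShow :: Show a => String -> a -> a
observeShow label x = unsafePerformIO $ do
  modifyIORef observations ((label, show x) :)
  return x

-- Read back everything logged so far, oldest first.
readObservations :: IO [(String, String)]
readObservations = fmap reverse (readIORef observations)
```

Because the combinator has the type of an identity function with an extra label, it can be wrapped around any expression of a showable type without changing the program's result. An annotation that is never demanded leaves no entry in the log.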
The level of implementation has direct effects on the portability to different Haskell systems. Hood can be used with different Haskell systems, because the library only requires a few non-standard functions such as unsafePerformIO which are provided by every Haskell system⁶. The transformation of Hat is currently integrated into the nhc98 compiler but could be separated. A transformed program uses a few non-standard unsafe functions to improve performance. Furthermore, some extensions of the Haskell run-time system are required to retain access to the result after termination or interruption and to connect to the browser. Finally, Freja is a Haskell system of its own. Adding its low-level trace creation mechanism to any other Haskell system would require a major rewriting of this system.
Hat can produce partial traces by limiting the length of the redex trails. Because a redex trail is browsed backwards, the system prunes away those redexes that are further than a certain length away from the live program data or output. Hat does not provide any mechanism like re-execution in Freja to recreate a pruned part of the redex trail.
Requiring less heap space may reduce garbage collection time, but Hat still spends the time for constructing the whole trace whereas Freja does not need to spend time on trace construction after construction of an upper part of an EDT.
Differences between the systems directly raise several questions. Is it desirable to add a feature of one system to another system? Does an alternative design decision make sense? How far is a distinguishing feature inherent to a system, possibly determined by its implementation method or its tracing model? Because the design space for a tracer is huge, it is sensible to evaluate system features in practice early. We applied the three systems to a number of programs in which errors had deliberately been introduced. The errors caused all three kinds of faulty observable behaviour mentioned earlier: wrong output, abortion with error message and non-termination.
Our evaluation experiments use the following protocol: At least two programmers are involved. First the author of a correctly working program explains how the program works. Then one programmer secretly introduces several deliberate errors into the program, of a kind undetected by the compiler. Given the faulty program, the other programmers use a tracing system to locate and fix all the errors, thinking aloud and taking notes as they do so. All the participants are experienced Haskell programmers.
The programs used in the experiments are of moderate complexity. The largest program, PsaCompiler, a compiler for a toy language, consists of 900 lines in 13 modules and performs 20,000 reductions for the input we provided. The longest running program, Adjoxo, an adjudicator for noughts and crosses (tic tac toe), consists of only 100 lines but performs up to 830,000 reductions for our inputs. In our choice of programs we were restricted by the subset of Haskell that Freja supports. For example, Freja does not implement classes and unfortunately not even every Freja program is a valid Haskell program. Freja had been applied to a mini compiler with 16 million reductions [6] and Hat had been applied to a version of nhc98 with 14,000 lines and 5.2 million reductions and a chess end-game program with 20 million reductions [11]. These papers give performance figures but do not indicate how easy debugging programs of this size is. We cannot make such statements either, but our programs are definitely beyond toy examples and of a size often occurring in practice. Our programs also do not perform monadic input/output. Freja does not implement it and Hat only supports a few operations. It would be interesting to see if Hood's ability to show the return value of an executed input/output action is sufficient in practice.
4.1 Readability of Expressions
In contrast to our preliminary fears that the expressions shown by the browsers (reductions, redexes and values) would be too large to be comprehensible, for our programs they are mostly of moderate size and easily readable.
As we will discuss in Section 4.2 the user of a tracing system not only views the trace but also the program. Nonetheless in Freja and Hat informative variable (function) names that convey the semantics of the variable well substantially reduce the need for viewing the program and thus speed up the debugging process.
Functions. In Haskell, functions are first-class citizens and hence function values may appear for example as arguments in redexes or inside data structures.
For the representation of function values, Hood deviates from the principle of showing Haskell-like expressions. It shows function values as finite mappings from arguments to results. Because the mapping contains only expressions that were demanded during the computation, the representation is short in most cases. However, for functions that are called often and especially for higher-order functions the representation is unwieldy. The representation requires some time to get used to. In return, it permits a rather abstract, denotational view of program semantics which is useful for determining the correctness of part of a program.
In Freja and Hat a function value is shown as a function name, a λ-abstraction, or as a partial application of a function name or a λ-abstraction. Function names and their partial applications are easily readable but λ-abstractions are not. Neither system shows a λ-abstraction as it is written in the program; each represents it by a new symbol: <lambda#n> for a number n in Freja and (\) in Hat. Both systems can show the full λ-abstraction on demand. However, because of the necessary additional step and because λ-abstractions are often large expressions, reading expressions involving λ-abstractions is hard. We conjecture that with Freja or Hat debugging programs that make substantial use of λ-abstractions, as commonly done for stylised abstractions such as continuation passing, higher-order combinators and monads, is rather difficult. Our programs hardly use stylised abstractions. In fact, PsaCompiler uses only named functions, even in the definitions of its parser combinators, where most Haskell programmers would use λ-abstractions. During tracing, Freja and Hat show very readable expressions for PsaCompiler.
Free Variables. Both λ-abstractions and the definition bodies of locally defined functions often contain free variables. To answer a question in Freja the values of such free variables must be known. Hence Freja shows this information in a where clause. The following question from an evaluation experiment demonstrates that this information usually adds considerably to the comprehensibility of a question:
tableRead
  "y"
  (TableImp
    (newTableFunction
      where
        newIndex = "x",
        newEntry = 1,
        oldTableFunction = implTableEmpty))
=>
Just 1
The correct answer is obviously no.
Hat does not show the values of free variables. This information can be obtained only indirectly by following the chain of parent redexes of such a function. To realise that a function has free variables and to see the corresponding arguments of parent redexes it is necessary to follow links to the program source.
In Hood an observation of a locally defined function can be misleading. The observation is really for a family of different functions, with different values for free variables. In our experiments one observation of a local function moveval is presented as follows:
-- moveval
  { ... , \ 8 -> Draw, ... }
  { ... , \ 8 -> Win, ... }
operations are completely different. A major difference between the systems is the time the user has to spend thinking about what to do next, and the effort required to do it. For example, the time required in Hood for deciding where to add observe annotations, modifying the program (discussed further in Section 4.4), recompiling the program and re-executing it is substantially higher than answering a question or selecting an expression for viewing its parent. Furthermore, the amount of data produced by a single observe annotation is usually substantial.
Guidance and Strategies. Freja asks questions which the user has to answer whereas in both other systems the user also has to ask the right questions. Freja guides the user towards the error.
Hat at least starts with the program output, an error message or the last evaluated redex in an interrupted program, and the main operation is to choose a subexpression and ask for its parent. There are usually many subexpressions to choose from and the system never states that an error has been located at a given position in the program. Wrong parts in the output or wrong arguments in redexes are candidates for further enquiry. Nonetheless, for the less experienced user it is easy to get lost examining an irrelevant region of the redex trail.
Hood gives complete freedom to observe any value in the program. The initial choice of what to observe is difficult and often seems arbitrary. In general Hood users apply a top-down strategy in their placement of observe combinators if the faulty behaviour does not point to any program location, for example when the program does not terminate. Then the questions the Hood user asks are similar to those asked by Freja. If, on the other hand, the position where the observable fault is caused can be identified, for example when the program aborts with an error message occurring only once in the program, then a Hood user tries to apply a bottom-up strategy reminiscent of Hat.
Our programs contain several errors. Users of Hat and Hood locate the errors in the same order, because they always locate the error that causes the observed faulty behaviour. In contrast, the questions of Freja sometimes lead to the location of a different error. It is possible to tackle a specific faulty behaviour by answering some questions incorrectly, but that requires care. One may easily steer into irrelevant regions of the EDT.
General Usability. Hat with its complex browser has the steepest learning curve for a new user. In contrast, the principle of questions and answers of Freja is easy to grasp and Hood has the advantage of using the idea of print statements, which are well known from imperative languages. Hence a mode that would hide some features from the beginner seems desirable for Hat.
Information Used. A Hood user has to modify the program and hence look at it. Sometimes just the process of searching for a good placement of observe reveals the error. Users of Freja and Hat, especially the former, tend to neglect the program. As long as the user knows the intended meaning of functions he can use Freja without ever looking at the program. This does however imply that the user does not try to follow Freja's reasoning and to understand how the finally located error actually caused the observed faulty behaviour. Redexes as shown by Hat are not intended to be the only source of information for locating an error. Viewing the program part where a redex is created gives valuable context information and at the end the program is needed to locate the error. Both Freja and Hat provide quick access to the part of the program relating to the current question or redex. Nonetheless, it seems worthwhile to test if automatically showing the relevant part of the program when a new question or parent is shown would improve usability.
In contrast to the other two systems Hat also gives information about which expressions are shared. This information is useful in some cases, usually when expressions are shared unexpectedly.
A trace of Hood is a set of observations. The trace unfortunately contains no information about the relations between these observations. Hence, with a few exceptions, we observe functions to obtain at least a relation between arguments and result. In particular, the representation of an observed function shows clearly which (part of an) argument is not demanded by the function for determining its result. This feature is helpful for locating errors.
the same error. A Hood user obviously also tries to use information about wrong subexpressions but it is not easy to decide where to place the next observe combinator.
Reduction of Information. In Hood, the user determines the size of the trace by the placement of observe combinators. It is, however, sometimes not easy to foresee how large an observation will be. The trusting mechanisms in Freja and Hat not only save space but also reduce the amount of information presented to the user. The ability of the Freja browser to dynamically trust a function and thus avoid further questions about it is useful. For Hat a corresponding feature seems desirable. In Freja, sometimes a question is repeated, because the same reduction is performed again. Hence memoisation of questions and their answers is desirable. It would also be useful to be able to generalise an answer, to avoid a series of very similar questions all requiring the same answer.
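The suggested memoisation of questions can be sketched as follows. This is an illustration of the idea only, not a feature of Freja: the names `Answers` and `askAll` are hypothetical, questions are represented by the text of the reduction, and the user is modelled as a pure function from question to answer.

```haskell
-- Answers given so far, keyed by the text of the reduction.
type Answers = [(String, Bool)]

-- Put each question to `ask` only if it has not been answered before.
-- Returns the collected answers and how many questions were really asked.
askAll :: (String -> Bool) -> [String] -> (Answers, Int)
askAll ask = foldl step ([], 0)
  where
    step (known, asked) q = case lookup q known of
      Just _  -> (known, asked)                    -- repeated reduction: reuse answer
      Nothing -> ((q, ask q) : known, asked + 1)   -- new reduction: ask the user
```

For a sequence in which the same reduction occurs twice, the count returned is one less than the length of the sequence. Generalising an answer, as also suggested above, would need a coarser key than the exact text of the reduction, for instance the function name alone.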
Runtime Overhead. With respect to the time overhead caused by the creation of traces the low-level implementation of Freja pays off. The overhead is not noticeable. In contrast, in Hat traced computations are more than ten times slower. For some inputs Adjoxo seems to be non-terminating but it is only slow! We experience the same with Hood when we observe at positions that are computed very often and that lead to large observations. So in Hood the time overhead is considerable but it is only proportional to the amount of observed data.
⁸ http://www.haskell.org/ghc
CAFs. A constant applicative form (CAF) is a top-level variable of arity zero, in other words a top-level function without arguments. Its value is computed on demand and shared by its users. Both Freja and Hat take the view that a CAF has no parent. Hence the trace of a program in Freja is generally not a single EDT but a set of EDTs, an EDT for each CAF including main. These EDTs are sorted so that a CAF only uses those CAFs about which questions have already been asked and which are hence known to be free of errors. Unfortunately one of our experiment programs contains 35 CAFs. We have to confirm the correctness of evaluation for all CAFs before reaching the question about main, although none of these CAFs are related to any of the errors. Freja can be instructed to start with the question about main. However, that implies stating that the evaluation of all CAFs is correct, which may not be the case and may thus lead Freja to give a wrong error location. An alternative definition of the EDT could imply that all users of a CAF are its parents. Then a question about a CAF would be asked only if it were relevant, and memoisation of the question and its answer could avoid asking the same question when another reduction using the CAF were investigated.
For Hat a corresponding modification without losing sharing of CAFs seems to be more difficult, because the redex trail is browsed by going backwards from an expression to its unique parent. In our experiments the fact that a CAF has no parent in a redex trail is not noticeable, because none of the introduced errors concerns CAFs. However, programs can be constructed where this lack of information hinders locating an error:
nats :: [Int]
nats = 0 : map succ nats
Guards, cases and ifs. In Haskell the selection of an equation of a definition may not only be determined by pattern matching but may also depend on the value of a guard:
test :: (a -> Bool) -> a -> Maybe a
test p x | p x       = Just x
         | otherwise = Nothing
In Freja the reduction of a guard (p x) is a child of the reduction of the function (test). Redex trails are, however, traversed backwards from the result value (Just x or Nothing). To hold the information about the reduction of a guard, redex trails have an additional sort of redexes. In the example, if the first equation were chosen, then the value Just x would have the parent | True C test p x, and if the second equation were chosen, then the value Nothing would have the parent | True C | False C test p x. By asking for the parents of the truth values True and False in the redexes, the user can obtain information about the evaluation of the guards.
Similarly, Hat uses special redexes for case and if expressions. On the one hand, these special redexes complicate the system. On the other hand, they are useful for large function definitions. The special redexes enable more fine-grained tracing up to the level of guards, cases and ifs, whereas Freja only identifies a whole function reduction as faulty. Similar to the situation for locally defined variables it is possible to extend the definition of Freja's EDT by special nodes for guard, case and if reductions. For Hat, special redexes for these reductions are important to make parts of the redex trail reachable by backward traversal that otherwise would be unreachable.
Booth and Jones [3, 1] sketch a system which creates a trace quite similar to an EDT. The main difference is that a parent node is only connected directly to one child. All sibling nodes are connected with each other according to the structure of the definition body of the parent node. Thus the trace has the nice property that all connecting arrows denote equality, unlike the arrows in an EDT or a redex trail. The authors describe a browser which gives more freedom in traversing the trace than the questions of Freja.
There also exist several systems for showing the actual computation sequence of a lazy functional program. Section 2.2 of [14], Chapter 11 of [5] and Chapter 2 and Section 7.5 of [8] review a large number of tracing and debugging systems for lazy functional languages.
We could not include any of these systems in our experiments, because they exist only as limited prototypes that are not publicly available.
We have compared and evaluated the tracing and debugging systems Freja, Hat and Hood by applying them to a number of programs.
Tracing and debugging systems for lazy functional languages have made considerable progress in recent years: all three systems prove to be effective tools for debugging our programs. Though none of our programs is very large, some of them are large enough to show that the scope of application for the tools goes well beyond easy exercises. Unfortunately the practical usability of Hat and especially Freja is currently limited by the fact that they do not support full Haskell 98.
Each of the tracing tools takes a unique approach with specific strengths. In particular, Freja has a systematic fault-finding procedure; Hat starts at the observed error and enables exploring backwards the history of every subexpression; Hood observes the data flow at specific program points by need.
Based on our experiments we identify in Section 4 the strengths but also the weaknesses of each system. For some weaknesses we already suggest improvements, often based on the convincing solutions of the problems in other systems. Other weaknesses are linked either to the tracing method or the implementation, which we discuss in Section 3. Hence they are more difficult to address and require further research. For example, Freja cannot take advantage of the common case that only a subexpression of a reduction is wrong, Hat is slow and Hood gives almost no indication of how values are related. We claim that an integration of Freja into Hat is feasible whereas Hood's approach is rather different from the approaches of the other two systems.
Finally, good tools are not sufficient for debugging. The user needs advice on how to effectively use each system; a strategy needs to be developed for Hat and especially for Hood, but even Freja would benefit from advice on how to employ its advanced features. Also a strategy for using several systems together, taking advantage of their respective strengths, is desirable.
Acknowledgments
We thank Henrik Nilsson and Jan Sparud for taking part in the evaluation experiments and making valuable observations. The work reported in this paper was supported by the Engineering and Physical Sciences Research Council of the United Kingdom under grant number GR/M81953.
References
1. Simon P. Booth and Simon B. Jones. Walk backwards to happiness – debugging by time travel. Technical Report CSM-143, Department of Computer Science and Mathematics, University of Stirling, 1997. This paper was presented at the 3rd International Workshop on Automated Debugging (AADEBUG'97), hosted by the Department of Computer and Information Science, Linköping University, Sweden, May 1997.
2. Andy Gill. Debugging Haskell by observing intermediate data structures. In Proceedings of the 4th Haskell Workshop, 2000. Technical report of the University of Nottingham.
3. Simon B. Jones and Simon P. Booth. Towards a purely functional debugger for functional programs. In Proceedings Glasgow Workshop on Functional Programming 1995, Ullapool, Scotland, July 1995.
4. Lee Naish and Tim Barbour. Towards a portable lazy functional declarative debugger. In Proc. 19th Australasian Computer Science Conference, January 1996.
5. Henrik Nilsson. Declarative Debugging for Lazy Functional Languages. PhD thesis, Linköping, Sweden, May 1998.
6. Henrik Nilsson. Tracing piece by piece: affordable debugging for lazy functional languages. In Proceedings of the 1999 ACM SIGPLAN International Conference on Functional Programming, pages 36–47. ACM Press, 1999.
7. Henrik Nilsson and Jan Sparud. The evaluation dependence tree as a basis for lazy functional debugging. Automated Software Engineering: An International Journal, 4(2):121–150, April 1997.
8. Alastair Penney. Augmenting Trace-based Functional Debugging. PhD thesis, Department of Computer Science, University of Bristol, September 1999.
9. Simon L. Peyton Jones, John Hughes, et al. Haskell 98: A non-strict, purely functional language. http://www.haskell.org, February 1999.
10. Bernard Pope. Buddha: A declarative debugger for Haskell. Technical report, Dept. of Computer Science, University of Melbourne, Australia, June 1998. Honours Thesis.
11. Jan Sparud and Colin Runciman. Complete and partial redex trails of functional computations. In C. Clack, K. Hammond, and T. Davie, editors, Selected papers from 9th Intl. Workshop on the Implementation of Functional Languages (IFL'97), pages 160–177. Springer LNCS Vol. 1467, September 1997.
12. Jan Sparud and Colin Runciman. Tracing lazy functional computations using redex trails. In H. Glaser, P. Hartel, and H. Kuchen, editors, Proc. 9th Intl. Symposium on Programming Languages, Implementations, Logics and Programs (PLILP'97), pages 291–308. Springer LNCS Vol. 1292, September 1997.
13. Philip Wadler. Functional programming: Why no one uses functional languages. SIGPLAN Notices, 33(8):23–27, August 1998. Functional programming column.
14. R. D. Watson. Tracing Lazy Evaluation by Program Transformation. PhD thesis, Southern Cross, Australia, October 1996.