
Freja, Hat and Hood - A Comparative
Evaluation of Three Systems for Tracing and
Debugging Lazy Functional Programs

Olaf Chitil, Colin Runciman and Malcolm Wallace

University of York, UK
{olaf,colin,malcolm}@cs.york.ac.uk

Abstract. In this paper we compare three systems for tracing and
debugging Haskell programs: Freja, Hat and Hood. We evaluate their
usefulness in practice by applying them to a number of moderately complex
programs in which errors had deliberately been introduced. We identify
the strengths and weaknesses of each system and then form ideas on how
the systems can be improved further.

1 Introduction
The lack of tools for tracing and debugging has deterred software developers from
using functional languages [13]. Conventional debuggers for imperative languages
give the user access to otherwise invisible information about a computation by
allowing the user to step through the program computation, stop at given points
and examine variable contents. This tracing method is unsuitable for lazy
functional languages, because their evaluation order is complex, function arguments
are usually unwieldy, large unevaluated expressions and generally computation
details do not match the user's high-level view of functions mapping values to
values.
In the middle of the 1980s a wave of research into tracing methods for lazy
functional languages started and has been growing since. In this paper we
compare the tracing systems that (a) cover a large subset of a standard lazy
functional language, namely Haskell 98 [9], (b) are publicly available and (c) are
still actively developed. Freja¹ [7, 5] is a system that creates an evaluation
dependency tree as trace, a structure based on the idea of declarative/algorithmic
debugging from the logic programming community. Hat² [12, 11] creates a trace
that shows the relationships between the redexes (mostly function applications)
reduced by the computation. The most recent system, Hood³ [2], enables the
programmer to observe the data structures at given program points. It can basically
be used like print statements in imperative languages, but the lazy evaluation
order is not affected and functions can be observed as well.

¹ http://www.ida.liu.se/~henni
² http://www.cs.york.ac.uk/fp/ART
³ http://www.haskell.org/hood

In this paper we compare Freja 1.1, Hat 1.0 and the July 2000 release of Hood. We
evaluate the systems in practice by applying them to a number of moderately
complex programs in which errors are deliberately introduced. Tracing systems
are interactively used tools. In this paper we concentrate on the usefulness of
the systems for the programmer. Runtime and space usage measurements are
reported in other papers [5, 6, 11]. We do not aim for a quantitative comparison
to crown a winner. Only with a large number of programmers could we have
obtained statistically valid data about, for example, how long it takes to locate
a specific error with a specific system. Even these data depend, for example,
on how well the programmers are trained for a system, especially because the
systems are rather different. Our aim is to explore the design space of tracers
and gain insights for the future development of tracing and debugging systems.
Our experiments highlight and sometimes even uncover previously unnoticed
similarities and distinguishing features of the three systems. The experiments
enable us to evaluate the usefulness of system features and lead us to new ideas
for how the current systems can be improved or even be combined.
The paper is structured as follows. Section 2 gives a short introduction to
each of the three systems. Section 3 compares the systems with respect to their
approach to tracing, design and implementation. Section 4 reports on our
practical experiments and the insights they gave us into the systems' distinguishing
properties and their usefulness. Section 5 briefly describes other systems for
tracing and debugging. Section 6 concludes.

2 Learn Three Systems in Three Minutes

To give an idea about what the three tracing systems provide and how they are
used we give a short introduction here. Because all three systems are still under
rapid development we try to avoid details that may change soon.
We demonstrate the use of each system with the following example program.⁴

main = let xs = [4*2, 3+6] :: [Int]
       in (head xs, last xs)

head (x:xs) = x

last (x:xs) = last xs
last [x] = x

Note that the evaluation in Section 4 is based on experiments with far larger
programs.

⁴ Freja actually expects main to be of type String and the other two systems expect
it to be of type IO (). Here we abstract from the details of input/output.
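
The example program can be tried out in any Haskell system with small adjustments. The following is a minimal sketch under our own assumptions: the Prelude versions of head and last are hidden so that the definitions from the text can be used, and the tuple is bound to a name result (not part of the original program) instead of main, so that its components can be inspected separately.

```haskell
import Prelude hiding (head, last)

-- The faulty example program. `result` stands in for the `main` of the
-- text (an assumption made here, so that the lazy tuple can be inspected).
result :: (Int, Int)
result = let xs = [4*2, 3+6] :: [Int]
         in (head xs, last xs)

head :: [a] -> a
head (x:_) = x

-- Deliberately faulty, as in the text: the first equation also matches a
-- one-element list, so the second equation is never reached.
last :: [a] -> a
last (_:xs) = last xs
last [x]    = x
```

Because the tuple is built lazily, fst result evaluates to 8 without touching the faulty function, whereas demanding snd result aborts with a pattern-match failure in last.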

2.1 Freja
Freja is a compiler for a subset of Haskell 98. A debugging session consists of the
user answering a sequence of questions. Each question concerns a reduction of a
redex (that is, a function application) to a value. The user has to answer yes,
if the reduction is correct with respect to his intentions, and no otherwise. In
the end the debugger states which reduction is the cause of the observed faulty
behaviour, that is, which function definition is incorrect.
The first question always asks if the reduction of the function main to the
result value of the program is correct. If the question about the reduction of
a function application is answered with no, then the next question concerns a
reduction for evaluating the right-hand side of the definition of this function.
Freja can be used rather similarly to a conventional debugger. The input no
means "step into current function call" and the input yes means "go on to
next function call". If the reduction of a function application is incorrect but all
reductions for the evaluation of the function's right-hand side are correct, then
the definition of this function must be incorrect for the given arguments.
The following is a debugging session with Freja for our example program.
The symbol ⊥ represents an error and the symbol ? represents an expression
that has never been evaluated and whose value hence cannot have influenced
the computation.
main ⇒ (8,⊥)         no
4*2 ⇒ 8              yes
head [8,?] ⇒ 8       yes
last [8,?] ⇒ ⊥       no
last [?] ⇒ ⊥         no
last [] ⇒ ⊥          yes
Bug located! Erroneous reduction: last [?] ⇒ ⊥
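
The question-and-answer scheme described above can be sketched in a few lines. This is only an illustration of the search strategy, not Freja's implementation: the type EDT, the functions locate and debug, and the representation of reductions as strings are all assumptions made for this sketch, and _|_ is used as the ASCII spelling of ⊥.

```haskell
-- One node of the tree of reductions: the reduction as text, plus the
-- reductions performed to evaluate the right-hand side.
data EDT = Node { reduction :: String, children :: [EDT] }

-- The user's judgement of a reduction: True means "yes, correct".
type Oracle = String -> Bool

-- Search below a reduction that was already judged incorrect. The
-- erroneous reduction is one that is incorrect although all of its child
-- reductions are correct.
locate :: Oracle -> EDT -> String
locate correct (Node r cs) =
  case filter (not . correct . reduction) cs of
    []      -> r
    (c : _) -> locate correct c

-- Start with the question about the root reduction.
debug :: Oracle -> EDT -> Maybe String
debug correct t
  | correct (reduction t) = Nothing
  | otherwise             = Just (locate correct t)

-- The reductions of the example session.
exampleEDT :: EDT
exampleEDT =
  Node "main => (8,_|_)"
    [ Node "4*2 => 8" []
    , Node "head [8,?] => 8" []
    , Node "last [8,?] => _|_"
        [ Node "last [?] => _|_"
            [ Node "last [] => _|_" [] ] ]
    ]

-- The answers given in the example session.
exampleOracle :: Oracle
exampleOracle r =
  r `elem` ["4*2 => 8", "head [8,?] => 8", "last [] => _|_"]
```

With these definitions, debug exampleOracle exampleEDT yields Just "last [?] => _|_", the erroneous reduction reported in the session.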

2.2 Hat
Hat consists of a modified version of the nhc98 Haskell compiler⁵ and a separate
browser program. A program compiled for tracing executes as usual except that
alongside the normal computation it builds a redex trail in the heap and instead of
terminating at the end it waits for the browser to connect to it. The browser
shows the output of the program. The user selects a part of it and asks the
browser for its parent redex. The parent redex of an expression is the redex that
through its own reduction created the expression. Each part of the redex has
again a parent redex which the browser shows on demand. A trail ends at the
function (redex) main, which has no parent. Debugging with Hat works by going
from a faulty output or error message backwards until the error is located.

⁵ http://www.cs.york.ac.uk/fp/nhc98

The browser has a graphical user interface which we do not discuss here.
Basically the system is used as follows to locate the error in our example program.
The program aborts with an error message and the browser directly shows its
parent redex: last []. The user is surprised that the function last is ever
called with an empty list as argument and asks the browser for the parent redex
of last []. The answer, last (3+6:[]), makes clear that the definition of last
is not correct for a single element list. The browser presents the redex trail as
shown in the following figure. To demonstrate how the parent of a subexpression
is presented (4*2 is the parent of 8), more of the redex trail is shown than is
needed for locating the error.
last []
last (3+6:[])
last (8:3+6:[])
    4*2
main
The browser can also show where in the program text, for example, last is
called with the argument [] in the equation for last (x:xs).
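
The backwards navigation can be pictured as following parent links. The following is a simplified sketch and not Hat's actual data structure: the type Trail and the function ancestors are hypothetical, expressions are plain strings, and links to subexpressions (such as the one from 8 to 4*2) are left out.

```haskell
-- An expression together with a link to its parent redex; main has none.
data Trail = Trail { expr :: String, parent :: Maybe Trail }

-- All redexes met when repeatedly asking for the parent.
ancestors :: Trail -> [String]
ancestors t = expr t : maybe [] ancestors (parent t)

-- The chain of parents from the example, starting at main.
mainRedex, lastFull, lastOne, lastEmpty :: Trail
mainRedex = Trail "main" Nothing
lastFull  = Trail "last (8:3+6:[])" (Just mainRedex)
lastOne   = Trail "last (3+6:[])" (Just lastFull)
lastEmpty = Trail "last []" (Just lastOne)
```

Here ancestors lastEmpty lists last [], last (3+6:[]), last (8:3+6:[]) and main, in the order in which a user would meet them in the browser.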

2.3 Hood
Hood currently is simply a Haskell library. A user annotates some expressions
in a program with the combinator observe, which is defined in the library.
While the program is running, information about the values of the annotated
expressions is recorded. After program termination the user can view for each
annotation the observed values.
We annotate the argument of last in our example program:
main = let xs = [4*2, 3+6]
       in (head xs, last (observe "last arg" xs))
When the modified program terminates it gives us the following information:
-- last arg
_ : _ : []
The symbol _ represents an unevaluated expression. Note that the first element
of the list xs is evaluated by the program, but not by the function last.
To gain more insight into how the program works we observe the function
last, including all its recursive calls:
last = observe "last" last'

last' (x:xs) = last xs
last' [x] = x
The value of the function is shown as a finite mapping of arguments to results:
-- last
{ \ (_ : _ : []) -> throw <Exception>
, \ (_ : []) -> throw <Exception>
, \ [] -> throw <Exception>
}

So last is called with an empty list. We draw the conclusion that last
applied to the one element list caused this erroneous call, but strictly the
information provided by Hood does not imply this.

3 Comparison in Principle

At first sight the three systems do not seem to have anything in common
except the goal of aiding debugging. However, all three systems take a two-phase
approach: while the program is running, information about the computation
process is collected. After termination of the program the collected information
is viewed in some kind of browser. In Freja, the browser is the part that asks
the questions, in Hat the program that lets the user view parents and in Hood
the part that prints the observations. This approach should not be confused
with classical post-mortem debugging where only the final state of the
computation can be viewed. Having a trace that describes aspects of a full computation
enables new forms of exploring program behaviour and locating errors which
should make these systems also interesting for strict functional languages or
even non-functional languages.
All three systems are suitable for programs that show any of the three kinds
of possible faulty observable behaviour: wrong output, abortion with an error
message, non-termination. In the latter case the program can be interrupted and
subsequently the trace can be viewed.

3.1 Values and Evaluation

All three systems are source-level tracers. They mostly show Haskell-like
expressions which are built from functions, data constructors and constants of the
program. To improve comprehensibility, all three systems show values instead of
arbitrary expressions as far as possible. Hood only shows values anyway. Both
Freja and Hat show an argument in a redex not as it was passed in the actual
computation but as a value. Only (a part of) an argument that was never
evaluated is shown as an unevaluated redex in Hat (3+6 in the previous example)
whereas Freja and Hood represent it by a special symbol (? in Freja and _ in
Hood). Freja and Hat show an expression only up to a given depth (for example
map succ (0 : succ 0 : ■) in Hat; ■ represents the elided subexpression). A
subexpression beyond that depth is only shown on demand. None of the systems
changes the usual observable behaviour of a program. In particular, they do not
force the evaluation of expressions that are not needed by the program.
However, the systems differ in that Hood shows values as far evaluated as
they are demanded in the context of the observation position whereas both Freja
and Hat show how far values are evaluated in the whole computation, including
the effect of sharing. Hence in the previous example Freja and Hat show the first
element of the list argument in the first call of last as 8 whereas Hood only
represents that element by _.

main ⇒ (8,⊥)
  4*2 ⇒ 8
  head [8,?] ⇒ 8
  last [8,?] ⇒ ⊥
    last [?] ⇒ ⊥
      last [] ⇒ ⊥

Fig. 1. Evaluation Dependency Tree

Fig. 2. Redex Trail

3.2 Trace Structures

In Hood a trace is a set of observations. These observations are shown in full to
the user. In contrast, Freja and Hat each create a single large trace structure
for a program run. It is impossible to show such a trace in full to the user. The
browser of each system permits the programmer to walk through the structure,
always seeing only a small local part of the whole trace.

Freja creates an Evaluation Dependency Tree (EDT) as trace. Each node
of the tree is a reduction as shown in the browser. The tree is basically the
derivation/proof tree for a call-by-value reduction with miraculous stops where
expressions are not needed for the result. The call-by-value structure ensures
that the tree structure reflects the program structure and that arguments are
maximally evaluated. Figure 1 shows the EDT for our example program of
Section 2. The symbol ⊥ represents the value of the error message.
Hat creates a redex trail as trace. A redex trail is a directed graph of value
nodes and redex nodes. Each node, except the node for main, has an arrow to
its parent redex node. Because subexpressions of a redex may have different
parents or may be shared, redex nodes may contain arrows to nodes of their
subexpressions. Figure 2 shows the redex trail for our example program of
Section 2. Dotted arrows point to subexpressions. Both dashed and solid arrows
denote the parent relationship. (8,⊥) is the result value of the computation. As
in Freja, ⊥ represents the value of the error message.
The graphs of the two trace structures are laid out to stress their similarity.
All arrows of the EDT are also present in the redex trail but point in the
opposite direction. If the redex trail held information about which parent relations
correspond to reductions (these are shown as solid arrows), then the EDT could
be constructed from the redex trail (however, see also the next paragraph and
Section 4.1 about free variables). In contrast, the redex trail contains more
information than the EDT, because it additionally links every value with its parent
redex and describes how expressions are shared.
The redex trail shown in Figure 2 is a simplified version of the one that is
really created by Hat. The real redex trail has an additional node xs with parent
main and children 4 * 2, 3 + 6, the two (:) nodes and []. That is, the redex trail
also records the reduction of the let expression. The whole let expression is a
redex, but in the redex trail it is represented by the defined variable xs. Similarly
a node xs ⇒ [8,?] that records the reduction of the let expression could be
added to the EDT. So recording a let reduction is an option for both the
EDT and the redex trail and the implementors of Freja and Hat made different
decisions with respect to this option. On the one hand recording let reductions
leads to larger traces with an unusual kind of redex. On the other hand it enables
more fine-grained tracing (cf. Section 4.3).
Because Hood observations contain values as they are demanded in a given
context, whereas both the EDT and the redex trail contain values in their most
evaluated form, it is not possible to gain Hood observations from either the EDT
or the redex trail. Conversely, even observing every subexpression of a program
with Hood would not enable us to construct an EDT or redex trail, because
there is no information about the relations between the observations.

3.3 Implementation and Portability

Each system consists of two parts, the browser and a part for the generation of
the trace. We will discuss the browsers in Section 4.

The developers of the three systems made different choices about the level at
which they implemented the creation of the trace. In Freja the trace is created in
the heap directly by modified instructions of the abstract graph reduction
machine. Hat transforms the original Haskell program into another Haskell program.
Running the compiled transformed program yields the redex trail in addition to
the normal result. Finally, in Hood the trace is created as a side effect by the
combinator observe, which is defined in a Haskell library.
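
To illustrate how a library combinator can record a trace as a side effect without disturbing lazy evaluation, here is a minimal sketch for lists only. It is not Hood's implementation: the names eventLog, observeList and observations, and the format of the logged events, are assumptions made for this example. The only non-standard function it relies on is unsafePerformIO.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global log of observation events, most recent first.
{-# NOINLINE eventLog #-}
eventLog :: IORef [String]
eventLog = unsafePerformIO (newIORef [])

-- Wrap a list so that every demand for one of its constructors is logged
-- as a side effect. The elements are passed through untouched, so the
-- observer itself never forces them.
observeList :: String -> [a] -> [a]
observeList label = go (0 :: Int)
  where
    go i xs = unsafePerformIO $ do
      let event c =
            modifyIORef eventLog ((label ++ " @" ++ show i ++ ": " ++ c) :)
      case xs of
        []       -> event "[]" >> return []
        (y : ys) -> event "_ : _" >> return (y : go (i + 1) ys)

-- The recorded events in the order in which they occurred.
observations :: IO [String]
observations = fmap reverse (readIORef eventLog)
```

Computing length (observeList "xs" [undefined, undefined]) logs two cons cells and the empty list but never evaluates an element, which mirrors the observation _ : _ : [] of Section 2.3.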
The level of implementation has direct effects on the portability to different
Haskell systems. Hood can be used with different Haskell systems, because the
library only requires a few non-standard functions such as unsafePerformIO
which are provided by every Haskell system⁶. The transformation of Hat is
currently integrated into the nhc98 compiler but could be separated. A transformed
program uses a few non-standard unsafe functions to improve performance.
Furthermore, some extensions of the Haskell run-time system are required to retain
access to the result after termination or interruption and to connect to the
browser. Finally, Freja is a Haskell system of its own. Adding its low-level trace
creation mechanism to any other Haskell system would require a major rewriting
of this system.

3.4 Reduction of Trace Size

In Hood the trace consists only of the observations of annotated expressions.
Hence its size can be controlled by the choice of annotations⁷. In contrast, both
Freja and Hat construct traces of the complete computation in the heap.
To reduce the size of the trace, both Freja and Hat enable marking of
functions or whole modules as trusted. The reduction of a trusted function itself is
recorded in the trace, but not the reductions performed to evaluate the
right-hand side of its definition. The details of the trusting mechanisms of both
systems are non-trivial, because the evaluation of untrusted functions which are
passed to trusted higher-order functions has to be recorded in the trace.
Usually at least the Haskell Prelude is trusted.
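
The effect of trusting on the shape of a trace can be sketched as a pruning of a complete tree of reductions. This is an illustration of the idea only: Freja and Hat avoid recording the reductions in the first place, and the type EDT, the function prune and the example tree are assumptions made here.

```haskell
-- A reduction node: the function that was applied, the reduction as
-- text, and the reductions performed for the right-hand side.
data EDT = Node { function :: String, shown :: String, children :: [EDT] }
  deriving (Eq, Show)

-- Keep the reduction of a trusted function but drop the reductions below
-- it, except for reductions of untrusted functions, which can occur when
-- an untrusted function is passed to a trusted higher-order function.
prune :: (String -> Bool) -> EDT -> EDT
prune trusted (Node f s cs)
  | trusted f = Node f s (concatMap untrustedBelow cs)
  | otherwise = Node f s (map (prune trusted) cs)
  where
    untrustedBelow c
      | trusted (function c) = concatMap untrustedBelow (children c)
      | otherwise            = [prune trusted c]

-- A hypothetical program that maps an untrusted function double over a
-- list with the trusted Prelude function map.
example :: EDT
example =
  Node "main" "main => [2,4]"
    [ Node "map" "map double [1,2] => [2,4]"
        [ Node "double" "double 1 => 2" []
        , Node "map" "map double [2] => [4]"
            [ Node "double" "double 2 => 4" []
            , Node "map" "map double [] => []" [] ] ] ]
```

After prune (== "map") example, the outer map reduction has exactly the two double reductions as children; the recursive calls of map have disappeared.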
To further reduce the space consumption, both Freja and Hat support the
construction of partial traces. In Freja, first only an upper part of the EDT may
be constructed during program execution. When the user reaches the edge of
the constructed part of the EDT in the browser, this part is deleted and the
whole program is re-executed, this time constructing the part of the EDT that
can be reached next by the questions. So, except for the time delay caused by
re-execution, the user has the impression that the whole EDT is present.

⁶ The version of Hood which can handle not only terminating programs but also those
that abort with an error message or do not terminate requires the non-standard
exception library supplied with the Glasgow Haskell compiler.
⁷ A variant of Hood allows the annotated running program to write observed events
directly to a file, so that the trace does not need to be kept in primary memory.
However, to obtain observations, the events in the file need to be sorted. Hence the
browser for displaying observations reads the complete file and thus has problems
with large observations.

Hat can produce partial traces by limiting the length of the redex trails.
Because a redex trail is browsed backwards, the system prunes away those redexes
that are further than a certain length away from the live program data or
output. Hat does not provide any mechanism like re-execution in Freja to recreate
a pruned part of the redex trail.
Requiring less heap space may reduce garbage collection time, but Hat still
spends the time for constructing the whole trace whereas Freja does not need to
spend time on trace construction after construction of an upper part of an EDT.

4 Evaluation of the Systems

Differences between the systems directly raise several questions. Is it desirable
to add a feature of one system to another system? Does an alternative design
decision make sense? How far is a distinguishing feature inherent to a system,
possibly determined by its implementation method or its tracing model? Because
the design space for a tracer is huge, it is sensible to evaluate system features
in practice early. We applied the three systems to a number of programs in
which errors had deliberately been introduced. The errors caused all three kinds
of faulty observable behaviour mentioned earlier: wrong output, abortion with
an error message and non-termination.
Our evaluation experiments use the following protocol: At least two
programmers are involved. First the author of a correctly working program explains how
the program works. Then one programmer secretly introduces several deliberate
errors into the program, of a kind undetected by the compiler. Given the faulty
program, the other programmers use a tracing system to locate and fix all the
errors, thinking aloud and taking notes as they do so.
All the participants are experienced Haskell programmers.
The programs used in the experiments are of moderate complexity. The
largest program, PsaCompiler, a compiler for a toy language, consists of 900
lines in 13 modules and performs 20,000 reductions for the input we provided.
The longest running program, Adjoxo, an adjudicator for noughts and crosses
(tic tac toe), consists of only 100 lines but performs up to 830,000 reductions
for our inputs. In our choice of programs we were restricted by the subset of
Haskell that Freja supports. For example, Freja does not implement classes and
unfortunately not even every Freja program is a valid Haskell program. Freja
had been applied to a mini compiler with 16 million reductions [6] and Hat had
been applied to a version of nhc98 with 14,000 lines and 5.2 million reductions
and a chess end-game program with 20 million reductions [11]. These papers give
performance figures but do not indicate how easy debugging programs of this
size is. We cannot make such statements either, but our programs are definitely
beyond toy examples and of a size often occurring in practice. Our programs also
do not perform monadic input/output. Freja does not implement it and Hat only
supports a few operations. It would be interesting to see if Hood's ability to show
the return value of an executed input/output action is sufficient in practice.

4.1 Readability of Expressions
In contrast to our preliminary fears that the expressions shown by the browsers
(reductions, redexes and values) would be too large to be comprehensible, for
our programs they are mostly of moderate size and easily readable.
As we will discuss in Section 4.2 the user of a tracing system not only views
the trace but also the program. Nonetheless in Freja and Hat informative variable
(function) names that convey the semantics of the variable well substantially
reduce the need for viewing the program and thus considerably speed up the
debugging process.

Unevaluated Expressions. Freja shows unevaluated expressions as ? and the
undefined value as ⊥. This property makes expressions even shorter and more
readable. This also holds for Hood. Only in some cases more information would
be desirable for better orientation. In Hat the display of the unevaluated redexes
in full sometimes obscures higher level properties, for example the length of a
list. All in all our observations suggest that unevaluated expressions should be
collapsed into a symbol by default but should be viewable on demand.
Hood shows even less of a value than Freja, because it only shows the part
demanded in a given context. Note that this amount of information would suffice
for answering the questions of Freja. Because Hat is not based on questions, it is
less clear if showing only demanded values would be suitable for it. Finally, the
fact that Freja and Hat show values to the extent to which they are evaluated in
the whole computation whereas Hood shows them to the extent to which they
are demanded is closely linked to the respective implementations of the systems
and thus not easily changeable.

Functions. In Haskell, functions are first-class citizens and hence function values
may appear for example as arguments in redexes or inside data structures.
For the representation of function values, Hood deviates from the principle
of showing Haskell-like expressions. It shows function values as finite mappings
from arguments to results. Because the mapping contains only expressions that
were demanded during the computation, the representation is short in most
cases. However, for functions that are called often and especially for
higher-order functions the representation is unwieldy. The representation requires some
time to get used to. In return, it permits a rather abstract, denotational view of
program semantics which is useful for determining the correctness of part of a
program.
In Freja and Hat a function value is shown as a function name, a λ-abstraction,
or as a partial application of a function name or a λ-abstraction. Function names
and their partial applications are easily readable but λ-abstractions are not. Both
systems do not show a λ-abstraction as it is written in the program but
represent it by a new symbol: <lambda#n> for a number n in Freja and (\) in Hat.
Both systems can show the full λ-abstraction on demand. However, because
of the necessary additional step and because λ-abstractions are often large
expressions, reading expressions involving λ-abstractions is hard. We conjecture
that with Freja or Hat debugging programs that make substantial use of
λ-abstractions, as commonly done for stylised abstractions such as continuation
passing, higher-order combinators and monads, is rather difficult. Our programs
hardly use stylised abstractions. In fact, PsaCompiler uses only named functions,
even in the definitions of its parser combinators, where most Haskell
programmers would use λ-abstractions. During tracing, Freja and Hat show very readable
expressions for PsaCompiler.
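
The two styles can be contrasted on a small parser combinator. This sketch is not taken from PsaCompiler; the type Parser and the names item, thenLambda and thenNamed are hypothetical. Both combinators behave identically, but in a trace the first would appear as an anonymous λ-abstraction whereas the second would appear under the name both.

```haskell
-- A parser consumes a prefix of the input and may fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

-- Accept any single character.
item :: Parser Char
item = Parser step
  where
    step (c : cs) = Just (c, cs)
    step []       = Nothing

-- Sequencing written with a lambda-abstraction.
thenLambda :: Parser a -> Parser b -> Parser (a, b)
thenLambda p q = Parser (\s ->
  case runParser p s of
    Nothing        -> Nothing
    Just (a, rest) ->
      case runParser q rest of
        Nothing         -> Nothing
        Just (b, rest') -> Just ((a, b), rest'))

-- The same combinator written with a named local function.
thenNamed :: Parser a -> Parser b -> Parser (a, b)
thenNamed p q = Parser both
  where
    both s =
      case runParser p s of
        Nothing        -> Nothing
        Just (a, rest) ->
          case runParser q rest of
            Nothing         -> Nothing
            Just (b, rest') -> Just ((a, b), rest')
```

Note that both has the free variables p and q, which leads to the issue discussed next.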

Free Variables. Both λ-abstractions and the definition bodies of locally defined
functions often contain free variables. To answer a question in Freja the values of
such free variables must be known. Hence Freja shows this information in a where
clause. The following question from an evaluation experiment demonstrates that
this information usually adds to the comprehensibility of a question considerably:
tableRead
  "y"
  (TableImp
    (newTableFunction
      where
        newIndex = "x",
        newEntry = 1,
        oldTableFunction = implTableEmpty))
=>
Just 1
The correct answer is obviously no.
Hat does not show the values of free variables. This information can be
obtained only indirectly by following the chain of parent redexes of such a function.
To realise that a function has free variables and to see the corresponding
arguments of parent redexes it is necessary to follow links to the program source.
In Hood an observation of a locally defined function can be misleading. The
observation is really for a family of different functions, with different values for
free variables. In our experiments one observation of a local function moveval is
presented as follows:
-- moveval
{ ..., \ 8 -> Draw, ... }
{ ..., \ 8 -> Win, ... }

4.2 Locating an Error

With all three systems we successfully locate all errors in our programs. For
locating an error in our largest program we answer between 10 and 30 questions
in Freja, look at 0 to 6 parents in Hat and add observe up to 3 times for
Hood. The relation between these numbers is typical. However, the numbers
cannot be compared directly to determine speed of use, because the counted
operations are completely different. A major difference between the systems
is the time the user has to spend thinking about what to do next, and the
effort required to do it. For example, the time required in Hood for deciding
where to add observe annotations, modifying the program (discussed further in
Section 4.4), recompiling the program and re-executing it is substantially higher
than answering a question or selecting an expression for viewing its parent.
Furthermore, the amount of data produced by a single observe annotation is
usually substantial.

Guidance and Strategies. Freja asks questions which the user has to answer
whereas in both other systems the user also has to ask the right questions. Freja
guides the user towards the error.
Hat at least starts with the program output, an error message or the last
evaluated redex in an interrupted program and the main operation is to choose
a subexpression and ask for its parent. There are usually many subexpressions
to choose from and the system never states that an error has been located at a
given position in the program. Wrong parts in the output or wrong arguments in
redexes are candidates for further enquiry. Nonetheless, for the less experienced
user it is easy to get lost examining an irrelevant region of the redex trail.
Hood gives the complete freedom to observe any value in the program. The
initial choice of what to observe is difficult and often seems arbitrary. In general
Hood users apply a top-down strategy in their placement of observe
combinators, if the faulty behaviour does not point to any program location, for example
when the program does not terminate. Then the questions the Hood user asks
are similar to those asked by Freja. If, on the other hand, the position where
the observable fault is caused can be identified, for example when the program
aborts with an error message occurring only once in the program, then a Hood
user tries to apply a bottom-up strategy reminiscent of Hat.
Our programs contain several errors. Users of Hat and Hood locate the
errors in the same order, because they always locate the error that causes the
observed faulty behaviour. In contrast, the questions of Freja sometimes lead to
the location of a different error. It is possible to tackle a specific faulty behaviour
by answering some questions incorrectly, but that requires care. One may easily
steer into irrelevant regions of the EDT.

General Usability. Hat with its complex browser has the steepest learning
curve for a new user. In contrast, the principle of questions and answers of
Freja is easy to grasp and Hood has the advantage of using the idea of print
statements, which are well-known from imperative languages. Hence a mode that
would hide some features from the beginner seems desirable for Hat.

Information Used. A Hood user has to modify the program and hence look
at it. Sometimes just the process of searching for a good placement of observe
reveals the error. Users of Freja and Hat, especially the former, tend to neglect
the program. As long as the user knows the intended meaning of functions he
can use Freja without ever looking at the program. This does however imply
that the user does not try to follow Freja's reasoning and to understand how the
finally located error actually caused the observed faulty behaviour. Redexes as
shown by Hat are not intended to be the only source of information for locating
an error. Viewing the program part where a redex is created gives valuable
context information and at the end the program is needed to locate the error.
Both Freja and Hat provide quick access to the part of the program relating
to the current question or redex. Nonetheless, it seems worthwhile to test if
automatically showing the relevant part of the program when a new question or
parent is shown would improve usability.
In contrast to the other two systems Hat also gives information about which
expressions are shared. This information is useful in some cases, usually when
expressions are shared unexpectedly.
A trace of Hood is a set of observations. The trace unfortunately contains no
information about the relations between these observations. Hence, with a few
exceptions, we observe functions to obtain at least a relation between arguments
and result. In particular, the representation of an observed function shows clearly
which (part of an) argument is not demanded by the function for determining
its result. This feature is helpful for locating errors.

Wrong Subexpressions. Often, in the questions posed by Freja, a specific
subexpression of a result is wrong. For example in the following question the 1
in the second list element should be a 2. But there is no way to give Freja this
information. We can only confirm or refute the reduction as a whole.
translateStatement
  (TableImp
    (newTableFunction
      where
        newIndex = "y",
        newEntry = 2,
        oldTableFunction = newTableFunction
          where
            newIndex = "x",
            newEntry = 1,
            oldTableFunction = implTableEmpty))
  ?
  (Assignment "x" (Compound (Var "x") Minus (Var "y")))
=>
_Tuple_2 [Lod 1,Lod 1,Sb,Sto 1] 4
In contrast, the redex trail contains the parent of every subexpression. A
Hat user seldom asks for the parent of a complete expression but usually for the
parent of some subexpression. We believe that this is the major reason why we
look at far fewer parents with Hat than we answer questions of Freja for locating
the same error. A Hood user obviously also tries to use information about wrong
subexpressions but it is not easy to decide where to place the next observe
combinator.

Reduction of Information In Hood, the user determines the size of the trace
by the placement of observe combinators. It is, however, sometimes not easy to
foresee how large an observation will be. The trusting mechanisms in Freja and
Hat not only save space but also reduce the amount of information presented to
the user. The ability of the Freja browser to dynamically trust a function and
thus avoid further questions about it is useful. For Hat a corresponding feature
seems desirable. In Freja, sometimes a question is repeated, because the same
reduction is performed again. Hence memoisation of questions and their answers
is desirable. It would also be useful to be able to generalise an answer, to avoid
a series of very similar questions all requiring the same answer.
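
The memoisation of questions suggested above can be sketched as follows. This is not how Freja is implemented; it is a minimal illustration under the assumption that a question can be identified by the text of the reduction shown to the user. All names (Question, Answers, askMemo, askAll) are hypothetical, and the oracle function stands in for the user.

```haskell
import qualified Data.Map as Map

-- A question is identified by the text of the displayed reduction;
-- the answer says whether the user considers the reduction correct.
type Question = String
type Answers = Map.Map Question Bool

-- Answer a question from the cache if possible. Only a new question
-- is passed to the oracle, and its answer is then recorded.
askMemo :: (Question -> Bool) -> Answers -> Question -> (Bool, Answers)
askMemo oracle cache q =
  case Map.lookup q cache of
    Just known -> (known, cache)
    Nothing ->
      let answer = oracle q
      in (answer, Map.insert q answer cache)

-- Answer a sequence of questions, threading the cache through. The
-- size of the final cache is the number of oracle consultations.
askAll :: (Question -> Bool) -> [Question] -> ([Bool], Int)
askAll oracle = go Map.empty
  where
    go cache [] = ([], Map.size cache)
    go cache (q : qs) =
      let (answer, cache') = askMemo oracle cache q
          (rest, consulted) = go cache' qs
      in (answer : rest, consulted)
```

With such a cache a repeated reduction costs the user nothing. Generalising an answer to a family of similar questions would need a coarser key than the full question text, which this sketch does not attempt.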

Runtime Overhead With respect to the time overhead caused by the creation
of traces the low-level implementation of Freja pays off. The overhead is not
noticeable. In contrast, in Hat traced computations are more than ten times slower.
For some inputs adjoxo seems to be non-terminating but it is only slow! We
experience the same with Hood when we observe at positions that are computed
very often and that lead to large observations. So in Hood the time overhead is
considerable but it is only proportional to the amount of observed data.

Compiler Messages A helpful error message from a compiler can reduce the
need for a tracer. If a function is called with an argument for which no matching
equation exists, then the aborting program gives the function name if it was
compiled with the Glasgow Haskell compiler (see footnote 8), but not if it was
compiled with Freja or nhc98. However, in that case Hat directly shows the function
with its arguments whereas Freja requires the answers to numerous questions before
locating the error.

4.3 Redexes and Language Constructs

A computation does not only consist of reductions of function applications. We
noted already in Section 3.2 for let expressions that there are other kinds of
redexes. This aspect only concerns Freja and Hat, because Hood only shows
values.

Footnote 8: http://www.haskell.org/ghc

CAFs A constant applicative form (CAF) is a top-level variable of arity zero,
in other words a top-level function without arguments. Its value is computed on
demand and shared by its users. Both Freja and Hat take the view that a CAF
has no parent. Hence the trace of a program in Freja is generally not a single
EDT but a set of EDTs, an EDT for each CAF including main. These EDTs are
sorted so that a CAF only uses those CAFs about which questions have already
been asked and which are hence known to be free of errors. Unfortunately one of
our experiment programs contains 35 CAFs. We have to confirm the correctness
of evaluation for all CAFs before reaching the question about main, although
none of these CAFs are related to any of the errors. Freja can be instructed
to start with the question about main. However, that implies stating that the
evaluation of all CAFs is correct, which may not be the case and thus lead Freja
to give a wrong error location. An alternative definition of the EDT could imply
that all users of a CAF are its parents. Then a question about a CAF would be
asked only if it were relevant and memoisation of the question and its answer
could avoid asking the same question when another reduction using the CAF
were investigated.
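
As a small illustration of a CAF with several users, consider the following sketch. The names squares, nthSquare and squaresBelow are hypothetical and do not come from our experiment programs.

```haskell
-- A CAF: a top-level definition of arity zero. Its value is computed
-- on demand and shared by all its users.
squares :: [Integer]
squares = map (^ 2) [0 ..]

-- Two independent users of the CAF. Under the alternative EDT
-- definition discussed above, reductions of both functions would
-- count as parents of squares.
nthSquare :: Int -> Integer
nthSquare n = squares !! n

squaresBelow :: Integer -> [Integer]
squaresBelow limit = takeWhile (< limit) squares
```

With the current definition, a question about squares is asked before any question about its users, whether or not squares is related to the error.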
For Hat a corresponding modification without losing sharing of CAFs seems
to be more difficult, because the redex trail is browsed by going backwards from
an expression to its unique parent. In our experiments the fact that a CAF
has no parent in a redex trail is not noticeable, because none of the introduced
errors concerns CAFs. However, programs can be constructed where this lack
of information hinders locating an error:
nats :: [Int]
nats = 0 : map succ nats

main = print (last nats)


The computation of this program does not terminate. When the programmer
interrupts the computation, Hat may show map succ (0 : succ 0 : ) as
next redex to be evaluated. The parent of this redex is nats, which has no
parent. The error may well be that the programmer intended to call another
function than last in the definition of main, but unfortunately the redex last
nats is unreachable.
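
For comparison, here is a minimal sketch of consumers that the programmer might have intended instead of last. This is purely an assumption for illustration; firstNats and natAt are hypothetical names. Both demand only a finite prefix of nats and therefore terminate.

```haskell
nats :: [Int]
nats = 0 : map succ nats

-- Demands only the first n elements of the infinite list.
firstNats :: Int -> [Int]
firstNats n = take n nats

-- Demands the list only up to index i (i must be non-negative).
natAt :: Int -> Int
natAt i = nats !! i
```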
We stated in Section 3.2 that Hat has a special kind of redex for locally
defined variables of arity zero (defined in let expressions and where clauses).
The parent of such a variable redex is the redex that created the definition
and not, as for function application redexes, the redex that created the
application. So, as for CAFs, redexes may become unreachable.
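
The following sketch shows a locally defined variable of arity zero whose definition and use belong to different reductions. The names normalise, peak and scale are hypothetical. As we read the description above, the parent of the variable redex peak would be the reduction of normalise xs, which created the definition, and not a reduction of scale x, where peak is used.

```haskell
normalise :: [Double] -> [Double]
normalise xs = map scale xs
  where
    -- peak has arity zero; it is evaluated at most once and shared
    -- by every call of scale.
    peak = maximum xs
    -- scale is a local function; each of its calls uses peak.
    scale x = x / peak
```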

Guards, cases and ifs In Haskell the selection of an equation of a definition
may not only be determined by pattern matching but may also depend on the
value of a guard:
test :: (a -> Bool) -> a -> Maybe a
test p x | p x       = Just x
         | otherwise = Nothing
In Freja the reduction of a guard (p x) is a child of the reduction of the function
(test). Redex trails are, however, traversed backwards from the result value
(Just x or Nothing). To hold the information about the reduction of a guard,
redex trails have an additional sort of redexes. In the example, if the first equation
were chosen, then the value Just x would have the parent | True ◁ test p
x, and if the second equation were chosen, then the value Nothing would have
the parent | True ◁ | False ◁ test p x. By asking for the parents of the
truth values True and False in the redexes, the user can obtain information
about the evaluation of the guards.
Similarly, Hat uses special redexes for case and if expressions. On the one
hand, these special redexes complicate the system. On the other hand, they are
useful for large function definitions. The special redexes enable more fine-grained
tracing, down to the level of guards, cases and ifs, whereas Freja only identifies
a whole function reduction as faulty. Similar to the situation for locally defined
variables it is possible to extend the definition of Freja's EDT by special nodes
for guard, case and if reductions. For Hat, special redexes for these reductions
are important to make parts of the redex trail reachable by backward traversal
that otherwise would be unreachable.
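
To make the three constructs concrete, the sketch below repeats test with a guard and adds two equivalent formulations with case and if. The names testCase and testIf are hypothetical. In each variant the result Just x or Nothing depends on the value of the condition p x, which is the kind of dependency the special redexes are meant to record.

```haskell
-- Selection by a guard, as in the example above.
test :: (a -> Bool) -> a -> Maybe a
test p x
  | p x       = Just x
  | otherwise = Nothing

-- The same selection written with a case expression.
testCase :: (a -> Bool) -> a -> Maybe a
testCase p x = case p x of
  True  -> Just x
  False -> Nothing

-- The same selection written with an if expression.
testIf :: (a -> Bool) -> a -> Maybe a
testIf p x = if p x then Just x else Nothing
```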

4.4 Modification of the Program

Whereas Freja and Hat are applied to the original program, requiring only special
compilation, Hood is based on modifying the program. Sometimes the introduction
of the observe combinator requires modifications which are non-trivial, if
an operator is observed (because of its infix position) or if not a specific call but
all calls of a function are observed as in our example in Section 2.3. Furthermore,
the main function has to be modified and the library has to be imported in every
program module that uses its entities. Most importantly, a data type can only
be observed if it is an instance of a class Observable. Some of our experiment
programs define many data types; because we want to observe most of them, we
have to write many instance definitions. Writing these instance definitions is easy
but time consuming. Additionally, all these modifications potentially introduce
new errors in the program and also make the program less readable.
On the other hand it might be useful to leave the modifications for Hood in
the program. They could be en-/disabled during compilation by a preprocessor
flag for a debug mode. Then most modifications, especially writing instances of
the class Observable, require only a one-time effort. The observe combinator
may even be placed to observe the main data structures of the program. Thus
debugging is integrated more closely into program development. In contrast,
Freja and Hat cannot save any information from a tracing session for future
versions of the program.
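
A preprocessor-controlled debug mode could look like the following minimal sketch. It rests on several assumptions: the symbol DEBUG and the function sumSquares are hypothetical, the debug build is assumed to use the hood package with its module Debug.Hood.Observe, and the stand-in definitions of observe and runO are ours, not part of Hood.

```haskell
{-# LANGUAGE CPP #-}

#ifdef DEBUG
-- Assumption: the hood package is installed; the module
-- Debug.Hood.Observe exports observe and runO.
import Debug.Hood.Observe (observe, runO)
#else
-- Stand-ins without any tracing effect, used in normal builds.
observe :: String -> a -> a
observe _ x = x

runO :: IO a -> IO ()
runO action = action >> return ()
#endif

-- The observation stays in the source in both build modes.
sumSquares :: [Int] -> Int
sumSquares xs = sum (observe "squares" (map square xs))
  where
    square x = x * x
```

A main function would wrap its action in runO, for example runO (print (sumSquares [1, 2, 3])). Compiling with the option -DDEBUG then selects the observing version, and compiling without it selects the stand-ins.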

5 Other Tracers and Debuggers

Buddha [4, 10] is a tracing system which like Freja constructs an EDT. Its
implementation is based on a source-to-source transformation, but unlike the
transformation of Hat this transformation is not purely syntax-directed but requires
type information. Buddha is still actively developed.

Booth and Jones [3, 1] sketch a system which creates a trace quite similar to
an EDT. The main difference is that a parent node is only connected directly
to one child. All sibling nodes are connected with each other according to the
structure of the definition body of the parent node. Thus the trace has the nice
property that all connecting arrows denote equality, unlike the arrows in an EDT
or a redex trail. The authors describe a browser which gives more freedom in
traversing the trace than the questions of Freja.

There also exist several systems for showing the actual computation sequence
of a lazy functional program. Section 2.2 of [14], Chapter 11 of [5] and Chapter
2 and Section 7.5 of [8] review a large number of tracing and debugging systems
for lazy functional languages.

We could not include any of these systems in our experiments, because there
are only limited prototypes, not publicly available.

6 Summary and Conclusions

We have compared and evaluated the tracing and debugging systems Freja, Hat
and Hood by applying them to a number of programs.

Tracing and debugging systems for lazy functional languages have made
considerable progress in recent years: all three systems prove to be effective tools
for debugging our programs. Though none of our programs is very large, some
of them are large enough to show that the scope of application for the tools
goes well beyond easy exercises. Unfortunately the practical usability of Hat
and especially Freja is currently limited by the fact that they do not support
full Haskell 98.

Each of the tracing tools takes a unique approach with specific strengths. In
particular, Freja has a systematic fault-finding procedure; Hat starts at the
observed error and enables exploring backwards the history of every subexpression;
Hood observes the data flow at specific program points by need.

Based on our experiments we identify in Section 4 the strengths but also the
weaknesses of each system. For some weaknesses we already suggest improvements,
often based on the convincing solutions of the problems in other systems.
Other weaknesses are linked either to the tracing method or the implementation,
which we discuss in Section 3. Hence they are more difficult to address
and require further research. For example, Freja cannot take advantage of the
common case that only a subexpression of a reduction is wrong, Hat is slow
and Hood gives almost no indication of how values are related. We claim that
an integration of Freja into Hat is feasible whereas Hood's approach is rather
different from the approaches of the other two systems.

Finally, good tools are not sufficient for debugging. The user needs advice on
how to effectively use each system; a strategy needs to be developed for Hat and
especially for Hood, but even Freja would benefit from advice on how to employ
its advanced features. Also a strategy for using several systems together, taking
advantage of their respective strengths, is desirable.

Acknowledgments

We thank Henrik Nilsson and Jan Sparud for taking part in the evaluation
experiments and making valuable observations. The work reported in this paper
was supported by the Engineering and Physical Sciences Research Council of
the United Kingdom under grant number GR/M81953.

References

1. Simon P. Booth and Simon B. Jones. Walk backwards to happiness - debugging
   by time travel. Technical Report CSM-143, Department of Computer Science
   and Mathematics, University of Stirling, 1997. This paper was presented at
   the 3rd International Workshop on Automated Debugging (AADEBUG'97), hosted
   by the Department of Computer and Information Science, Linköping University,
   Sweden, May 1997.
2. Andy Gill. Debugging Haskell by observing intermediate data structures. In
   Proceedings of the 4th Haskell Workshop, 2000. Technical report of the
   University of Nottingham.
3. Simon B. Jones and Simon P. Booth. Towards a purely functional debugger for
   functional programs. In Proceedings Glasgow Workshop on Functional
   Programming 1995, Ullapool, Scotland, July 1995.
4. Lee Naish and Tim Barbour. Towards a portable lazy functional declarative
   debugger. In Proc. 19th Australasian Computer Science Conference, January 1996.
5. Henrik Nilsson. Declarative Debugging for Lazy Functional Languages. PhD thesis,
   Linköping, Sweden, May 1998.
6. Henrik Nilsson. Tracing piece by piece: affordable debugging for lazy functional
   languages. In Proceedings of the 1999 ACM SIGPLAN International Conference
   on Functional Programming, pages 36-47. ACM Press, 1999.
7. Henrik Nilsson and Jan Sparud. The evaluation dependence tree as a basis for lazy
   functional debugging. Automated Software Engineering: An International Journal,
   4(2):121-150, April 1997.
8. Alastair Penney. Augmenting Trace-based Functional Debugging. PhD thesis,
   Department of Computer Science, University of Bristol, September 1999.
9. Simon L. Peyton Jones, John Hughes, et al. Haskell 98: A non-strict, purely
   functional language. http://www.haskell.org, February 1999.
10. Bernard Pope. Buddha: A declarative debugger for Haskell. Technical report, Dept.
    of Computer Science, University of Melbourne, Australia, June 1998. Honours
    Thesis.
11. Jan Sparud and Colin Runciman. Complete and partial redex trails of functional
    computations. In C. Clack, K. Hammond, and T. Davie, editors, Selected papers
    from 9th Intl. Workshop on the Implementation of Functional Languages (IFL'97),
    pages 160-177. Springer LNCS Vol. 1467, September 1997.
12. Jan Sparud and Colin Runciman. Tracing lazy functional computations using redex
    trails. In H. Glaser, P. Hartel, and H. Kuchen, editors, Proc. 9th Intl. Symposium
    on Programming Languages, Implementations, Logics and Programs (PLILP'97),
    pages 291-308. Springer LNCS Vol. 1292, September 1997.
13. Philip Wadler. Functional programming: Why no one uses functional languages.
    SIGPLAN Notices, 33(8):23-27, August 1998. Functional programming column.
14. R. D. Watson. Tracing Lazy Evaluation by Program Transformation. PhD thesis,
    Southern Cross, Australia, October 1996.
