
Smart programming languages,

smart program analysis

Varmo Vene

Institute of Cybernetics at TUT


&
University of Tartu
Introduction
A quote from the classics
Everyone knows that debugging is twice as hard as
writing a program in the first place. So if you’re as
clever as you can be when you write it, how will you
ever debug it?
Brian Kernighan, P.J. Plauger
"The Elements of Programming Style", 2nd ed., 1978.

30 years later...
we still often spend more time on debugging and testing
than on actual programming;
despite that, the software we use and/or develop
has bugs (sometimes quite serious ones).
Possible reasons
Human imperfection
To err is human, to forgive divine.
(Alexander Pope, 1688–1744)

Laws of nature
Program testing can be used to show the presence of
bugs, but never to show their absence!
(Edsger Dijkstra, 1970)

Imperfection of tools
The most effective debugging tool is still careful
thought, coupled with judiciously placed print statements.
(Brian Kernighan, 1979)
A goal of Semantics (among others)
To develop programming tools that give strong guarantees
about properties of programs.
– E.g., guarantee the absence of certain kinds of errors.
Proactive tools
– E.g., program extraction.
Preventive tools
– E.g., programming languages with powerful type systems.
Retroactive tools
– E.g., static program analyzers.
Outline
Total Functional Programming
– Inductive types
– Comonadic recursion
– Recursive coalgebras
– Mendler-style recursion
Goblint
– Path-sensitivity
– Concurrent analysis
Working Group and Plans
Total Functional Programming
In the total functional programming paradigm, all programs
terminate.
In particular, there is no general recursion.
Instead, only restricted forms of recursion that are
guaranteed to terminate are allowed.
Usually, these are simple iteration or primitive recursion over
inductive types.
Sometimes corecursive definitions of coinductive types
are also allowed.
Although the paradigm is not Turing complete, most interesting
programs are in principle expressible in it.
Inductive Types and Iteration
Categorically, inductive types (such as natural numbers,
lists, trees, etc) are initial algebras of endofunctors.
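The most basic form of recursion (known as iteration or fold)
corresponds to the unique homomorphism property of initial
algebras:
$$
\begin{array}{ccc}
F(\mu F) & \xrightarrow{\;\mathsf{in}\;} & \mu F \\
{\scriptstyle Ff}\big\downarrow & & \big\downarrow{\scriptstyle \exists!\, f = \mathsf{fold}(\varphi)} \\
FA & \xrightarrow{\;\forall \varphi\;} & A
\end{array}
$$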

By duality, coinductive types (streams and various other infinite
and potentially infinite structures) are terminal coalgebras,
and the basic form of corecursion (known as coiteration)
arises from the unique homomorphism property.
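As a rough illustration of iteration, the sketch below encodes fold over natural numbers in C. The names `fold_nat`, `nat_step`, `add_two`, `double_nat` and `check_double_nat` are hypothetical, and C's type system does not enforce totality; the code only mirrors the shape of the scheme (a value for zero plus a step for successor).

```c
#include <assert.h>
#include <stddef.h>

/* Step function: turns the result for n into the result for n + 1. */
typedef long (*nat_step)(long acc);

/* fold_nat(n, zero, succ) applies succ exactly n times to zero.
   The loop bound is fixed before the loop starts, so the function
   terminates whenever succ does. */
long fold_nat(unsigned n, long zero, nat_step succ)
{
    assert(succ != NULL);
    long acc = zero;
    for (unsigned i = 0; i < n; i++)
        acc = succ(acc);
    return acc;
}

long add_two(long acc) { return acc + 2; }

/* Doubling by iteration: double(0) = 0, double(n + 1) = double(n) + 2. */
long double_nat(unsigned n) { return fold_nat(n, 0, add_two); }

/* Usage example. */
void check_double_nat(void)
{
    assert(double_nat(0) == 0);
    assert(double_nat(21) == 42);
}
```

The caller supplies only the two algebra components (`zero` and `succ`); the recursion itself lives in `fold_nat`, so user code cannot introduce non-termination through the recursion pattern.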
Comonadic Recursion
In a series of papers (Uustalu & Vene, 1996–98) we introduced
several new (co)recursion schemes capturing primitive corecursion,
course-of-value (co)recursion, etc.
All of them shared strong similarities, but differed in concrete
details.
In (Uustalu & Vene & Pardo, 2000) we established a generic
many-in-one recursion scheme parametrized by a recursive
call pattern represented by a comonad with a distributive
law.
The new scheme covers most of the previously known recursion
schemes as instances of the comonadic one.
Recursive Coalgebras
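The algebra structure $\mathsf{in}_F : F(\mu F) \to \mu F$ is an isomorphism.
In fact, the essential properties of a recursion scheme depend
more on its inverse, a coalgebra!
In (Capretta & Uustalu & Vene, 2004) we defined the notion
of recursive coalgebras: a coalgebra $\alpha : A \to FA$ is recursive
if every algebra $\varphi : FB \to B$ admits a unique $f$ with
$f = \varphi \circ Ff \circ \alpha$.
$$
\begin{array}{ccc}
FA & \xleftarrow{\;\alpha\;} & A \\
{\scriptstyle Ff}\big\downarrow & & \big\downarrow{\scriptstyle \exists!\, f} \\
FB & \xrightarrow{\;\forall \varphi\;} & B
\end{array}
$$
The notion generalizes well-founded recursion and has its
origin in (Osius, 1970).
We identified a number of ways of constructing recursive
coalgebras and generalized comonadic recursion to this
setting.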
Mendler-style recursion
Programming with recursors defined by properties such as
initiality, comonadic recursion, etc. is cumbersome.
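E.g., functions defined by iteration must have the following
form:
$$
\begin{array}{ccc}
F(\mu F) & \xrightarrow{\;\mathsf{in}\;} & \mu F \\
{\scriptstyle Ff}\big\downarrow & & \big\downarrow{\scriptstyle \exists!\, f} \\
FA & \xrightarrow{\;\forall \varphi\;} & A
\end{array}
$$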
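In (Uustalu & Vene, 1996, 2000, 2002) we considered an
alternative form:
$$
\begin{array}{ccc}
F(\mu F) & \xrightarrow{\;\mathsf{in}\;} & \mu F \\
 & {\scriptstyle \forall \Phi:\ \Phi(f)}\searrow & \big\downarrow{\scriptstyle \exists!\, f} \\
 & & A
\end{array}
$$
where $\Phi : \forall X.\,(X \to A) \to (FX \to A)$, that is,
$f \circ \mathsf{in} = \Phi(f)$.
The idea originates from (Mendler, 1987)
and extends to other recursion schemes.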
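To see how the Mendler-style scheme reads in practice, here is a small worked instance for natural numbers, i.e. $FX = 1 + X$ and $A = \mathbb{N}$. It writes $\Phi$ for the polymorphic function of type $\forall X.\,(X \to A) \to (FX \to A)$ from the scheme. The constructor names $\mathsf{zero}$ and $\mathsf{succ}$ and the doubling function are illustrative assumptions, not taken from the cited papers.

```latex
% Mendler-style definition of doubling.
% X is abstract: rec : X -> N is the only way to make a recursive call.
\begin{align*}
\Phi &: \forall X.\,(X \to \mathbb{N}) \to (1 + X \to \mathbb{N}) \\
\Phi(\mathit{rec})(\mathsf{inl}\,\ast) &= 0 \\
\Phi(\mathit{rec})(\mathsf{inr}\,x) &= \mathit{rec}(x) + 2 \\[4pt]
% Unfolding f . in = Phi(f) with in(inl *) = zero and in(inr n) = succ n:
f(\mathsf{zero}) &= \Phi(f)(\mathsf{inl}\,\ast) = 0 \\
f(\mathsf{succ}\,n) &= \Phi(f)(\mathsf{inr}\,n) = f(n) + 2
\end{align*}
```

Because $X$ is abstract inside $\Phi$, the argument $\mathit{rec}$ can only be applied to the direct subterm $x$, which is why the definition terminates even though it reads like a general recursive one.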
The scheme looks quite similar to general recursion,
and is hence (hopefully) more intuitive.
But termination is still guaranteed.
I.e., we have termination checking by type checking.
Ongoing and further work
Corecursive algebras (with V. Capretta)
Mendler-style vs. Circular proofs (with R. Cockett)
...
To make Total FP fly!
Goblint
What is Goblint?
Goblint is a static analyzer for POSIX-threaded C programs.
Focused on detecting multiple-access data races.
Integrates with the Eclipse C development environment.
Aims to be sound (i.e., it must detect all errors, but may give
false alarms).
Aims to be efficient enough to analyze
medium-to-large scale programs ( 100 kLOC).
Aims to be precise enough to analyze
medium-to-large scale programs ( 100 kLOC).
(Vojdani & Vene, 2007)
Main conflicts
Soundness vs. C
Efficiency vs. Precision
Soundness vs. C
Restrict to the "safe" subset of C:
no setjmp and longjmp;
no dynamic data structures;
no recursion;
...
Not as bad as it looks: we can still handle these constructs,
but do not guarantee soundness.
Efficiency vs. Precision


We adopt standard data-flow analysis techniques, but
use a functional approach to distinguish calling contexts;
use dynamically adjustable path-sensitive analysis;
use global-invariant-based concurrent analysis.
Goblint: Path-sensitivity
man gcc on “-Wuninitialized”
These warnings are made optional because GCC is not smart
enough to see all the reasons why the code might be correct
despite appearing to have an error . . .

Here is another common case:


int save_y;
if (change_y) save_y = y, y = new_y;
...
if (change_y) y = save_y;
This has no bug because "save_y" is used only if it is set.
What is the problem?


There are 4 potential execution paths.
Only 2 are logically possible.
We need to distinguish execution paths.
In general, there are an infinite number of paths!
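The path count above can be checked with a small enumeration. The sketch below is only an illustration: `classify_path`, `count_paths` and `check_path_counts` are hypothetical names, and it assumes that the elided code (`...`) never modifies `change_y`.

```c
#include <assert.h>
#include <stdbool.h>

/* One syntactic path = one choice (taken / not taken) per `if`. */
struct path_info {
    bool feasible;      /* both tests saw the same value of change_y */
    bool reads_uninit;  /* save_y is read although it was never written */
};

struct path_info classify_path(bool first_taken, bool second_taken)
{
    struct path_info p;
    /* Assumption: the elided code ("...") does not modify change_y. */
    p.feasible = (first_taken == second_taken);
    /* save_y is written only in the first branch, read only in the second. */
    p.reads_uninit = second_taken && !first_taken;
    return p;
}

/* Count paths, optionally restricted to feasible and/or faulty ones. */
int count_paths(bool feasible_only, bool bad_only)
{
    int n = 0;
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            struct path_info p = classify_path(a != 0, b != 0);
            if (feasible_only && !p.feasible) continue;
            if (bad_only && !p.reads_uninit) continue;
            n++;
        }
    }
    return n;
}

/* Usage example. */
void check_path_counts(void)
{
    assert(count_paths(false, false) == 4); /* syntactic paths */
    assert(count_paths(true, false) == 2);  /* logically possible paths */
    assert(count_paths(false, true) == 1);  /* the path behind the warning */
    assert(count_paths(true, true) == 0);   /* no feasible path is faulty */
}
```

The single faulty path is exactly one of the two infeasible ones, which is why an analysis that merges all paths reports a warning while the program is in fact correct.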
Our solution
We only track the paths that are relevant to the analysis
result.
In this example, paths are relevant when their sets of
uninitialized variables differ.
In general, relevance depends on the user analysis...
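One way to picture this idea is a set of abstract states in which two states are kept apart only when their sets of uninitialized variables differ, and are joined otherwise. The sketch below is a simplified illustration under that assumption, not Goblint's implementation; all names (`state`, `state_set`, `add_state`, `analyze_example`, `analyze_merged`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SAVE_Y = 1 };                       /* bit for the variable save_y */
enum tri { NO = 0, YES = 1, UNKNOWN = 2 }; /* abstract value of change_y */

struct state {
    unsigned uninit;    /* possibly uninitialized variables (bitmask) */
    enum tri change_y;  /* what is known about change_y on this path */
};

#define MAX_STATES 8
struct state_set { struct state s[MAX_STATES]; size_t n; };

/* States with equal uninit sets are joined; others are kept apart. */
void add_state(struct state_set *set, struct state st)
{
    for (size_t i = 0; i < set->n; i++) {
        if (set->s[i].uninit == st.uninit) {
            if (set->s[i].change_y != st.change_y)
                set->s[i].change_y = UNKNOWN;
            return;
        }
    }
    assert(set->n < MAX_STATES);
    set->s[set->n++] = st;
}

/* Path-sensitive run over the two-`if` example; returns the warning count. */
int analyze_example(size_t *states_between)
{
    struct state_set mid = { .n = 0 };
    /* First if: the taken branch writes save_y, the other leaves it unset. */
    add_state(&mid, (struct state){ .uninit = 0,      .change_y = YES });
    add_state(&mid, (struct state){ .uninit = SAVE_Y, .change_y = NO  });
    /* An unrelated branch yields the same two states: no growth. */
    add_state(&mid, (struct state){ .uninit = 0,      .change_y = YES });
    add_state(&mid, (struct state){ .uninit = SAVE_Y, .change_y = NO  });
    *states_between = mid.n;

    int warnings = 0;
    for (size_t i = 0; i < mid.n; i++) {
        /* The second if reads save_y only where its branch is reachable. */
        bool reachable = (mid.s[i].change_y != NO);
        if (reachable && (mid.s[i].uninit & SAVE_Y))
            warnings++;
    }
    return warnings;
}

/* Path-insensitive baseline: everything joined into a single state. */
int analyze_merged(void)
{
    struct state joined = { .uninit = SAVE_Y, .change_y = UNKNOWN };
    return (joined.change_y != NO && (joined.uninit & SAVE_Y)) ? 1 : 0;
}
```

With this setup the path-sensitive run keeps two states between the two `if` statements and reports no warning, while the fully merged state produces the false alarm described in the GCC manual.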
Goblint: Concurrent Analysis
State explosion
Precise concurrent analysis leads to state explosion.
E.g., if there are two threads with 10 instructions each, then
there are 184756 possible interleavings (the binomial
coefficient C(20, 10))!
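The figure can be reproduced by counting the ways to choose which of the 20 steps belong to the first thread. A minimal sketch, assuming straight-line threads whose instructions are atomic; the names `interleavings` and `check_slide_figure` are hypothetical.

```c
#include <assert.h>

/* Number of interleavings of two straight-line threads with m and n
   atomic instructions: the binomial coefficient C(m + n, m). */
unsigned long long interleavings(unsigned m, unsigned n)
{
    unsigned long long result = 1;
    for (unsigned i = 1; i <= m; i++) {
        /* Invariant: after this step result == C(n + i, i),
           so the division is always exact. */
        result = result * (n + i) / i;
    }
    return result;
}

/* Usage example: the case quoted in the text. */
void check_slide_figure(void)
{
    assert(interleavings(10, 10) == 184756ULL);
}
```

The count grows quickly with thread length, which is what makes exploring every interleaving impractical.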

Global-invariant-based concurrent analysis
Separate shared (i.e., global) and local variables.
Compute a single invariant for the global state.
Essentially, join all possible values at all program points.
Now all threads can be analyzed sequentially.
Very imprecise for the base domain, but works well with user
domains like lock-sets.
(Seidl & Vene & Müller-Olm, 2003).
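To make the lock-set remark concrete, the sketch below keeps one flow-insensitive fact per global variable: the set of locks held at every access seen so far. It is a simplified illustration under stated assumptions, not Goblint's implementation. All names are hypothetical, and locks are assumed to be numbered 0 to 31 so that a lock set fits in a bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t lockset;     /* bit i set = lock i is held */
#define ALL_LOCKS UINT32_MAX  /* starting value: no access seen yet */

/* The single invariant kept for one global variable. */
struct global_info {
    lockset common;     /* locks held at every access seen so far */
    bool written;       /* at least one access was a write */
    unsigned accesses;  /* number of recorded accesses */
};

void global_init(struct global_info *g)
{
    assert(g != NULL);
    g->common = ALL_LOCKS;
    g->written = false;
    g->accesses = 0;
}

/* Join one access, from any thread and any program point. */
void record_access(struct global_info *g, lockset held, bool is_write)
{
    assert(g != NULL);
    g->common &= held;  /* intersection of the lock sets */
    g->written = g->written || is_write;
    g->accesses++;
}

/* Potential race: several accesses, one a write, no common lock. */
bool may_race(const struct global_info *g)
{
    assert(g != NULL);
    return g->accesses > 1 && g->written && g->common == 0;
}
```

Each thread can be analyzed on its own and simply calls `record_access`. The result is sound only in the "may" direction: an empty common lock set signals a possible race, not a certain one.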
Ongoing and further work
Equality analysis of addresses (with H. Seidl);
Scalability improvements;
Adding new analyses (e.g., variable initialization,
open-use-close analysis, etc.);
Better handling of external functions;
...

Additional information
Goblint has an open-source license.
You can download it from the web:
http://goblin.at.mt.ut.ee/goblint/tracker/
Working Group and Plans
Programming Languages and Systems at EXCS
Senior staff
Keiko Nakata (IOC) Jaan Penjam (IOC)
Härmel Nestra (UT) Tarmo Uustalu (IOC)
Hellis Tamm (IOC) Varmo Vene (IOC/UT)
PhD students
Ando Saabas (IOC) Vesal Vojdani (UT)
Jevgeni Kabanov (UT) Andres Toom (IOC)
Aivar Annamaa (UT) Martin Pettai (UT)
Best friend
Peeter Laud (CybAS)
Other research directions
Comonadic data-flow (Uustalu, Vene)
Proof transformation (Saabas, Uustalu)
Automata theory (Tamm, Penjam)
Transfinite semantics (Nestra)
Domain specific languages in Java (Kabanov)
Code generation for data-flow (Toom)
