
5/23/2011

Path-Based Static Analysis
Lecture 18, CS 295

Overview
So far we have looked at summary-based static analysis
For each function f, compute a description of the effect of f
  Mapping of inputs to outputs

Example à la Saturn:


  (flag, *file1, Closed) -> (true, *file1, Open)
  If flag is true and file1 is closed on function entry, then file1 is open on function exit
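
One way to picture such a summary is as a lookup table from entry conditions to exit conditions. The sketch below encodes only the single case from the slide. The names `SUMMARY_F` and `apply_summary` are hypothetical, and this is not meant to reflect Saturn's actual representation.

```python
# Hypothetical encoding of a function summary as a map from
# (flag value, state of *file1 on entry) to (flag value, state of *file1 on exit).
# Only the one case given on the slide is filled in.
SUMMARY_F = {
    (True, "Closed"): (True, "Open"),
}

def apply_summary(summary, flag, file_state):
    """Return the exit (flag, file state), or None if no case of the summary matches."""
    return summary.get((flag, file_state))
```

A caller can then be analyzed by looking up its current state in the table instead of re-analyzing the body of f.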

Prof. Aiken

CS 295

Lecture 18


Discussion
Summaries can be hard to compute
  Must account for all paths through the function
  Summary language generally must be quite expressive

Another Approach
An alternative to summaries is to perform path-based analysis
Analyze just one path at a time
  Conceptually simpler
  And often simpler to implement


Checking Paths

Issues
There can be a lot of paths
  N conditionals -> up to 2^N paths
  Loops, recursive functions
Leave these issues aside for now
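
The 2^N bound can be seen by treating each path through N sequential conditionals as one tuple of N branch decisions. A minimal sketch, where the function name `enumerate_paths` is an assumption made for illustration:

```python
from itertools import product

def enumerate_paths(num_conditionals):
    """Each path through N sequential if-statements is one tuple of N branch decisions."""
    return list(product([True, False], repeat=num_conditionals))

# Three conditionals give eight candidate paths, ten give 1024.
paths = enumerate_paths(3)
```

Loops and recursion are not modeled here; they make the set of paths unbounded rather than merely exponential.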


Finite State Properties (Again)
For specifications, use FSMs
For this lecture, files
  Two states: Open, Closed
  An Open file can be Closed
  A Closed file can be Opened
  Other transitions are errors

A First Algorithm

For each path, track the state of each file
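
A minimal sketch of this first algorithm for a single file, assuming a path is given as a list of `"open"` and `"close"` events. The names `TRANSITIONS` and `check_path` are hypothetical.

```python
# File FSM: open is legal only on a Closed file, close only on an Open file.
# Any other transition is an error.
TRANSITIONS = {
    ("Closed", "open"): "Open",
    ("Open", "close"): "Closed",
}

def check_path(path, initial_state):
    """Run the FSM along one path (a list of 'open'/'close' events).

    Returns the final state, or 'Error' as soon as an illegal transition occurs.
    """
    state = initial_state
    for event in path:
        state = TRANSITIONS.get((state, event), "Error")
        if state == "Error":
            break
    return state
```

For example, `check_path(["open", "close"], "Closed")` ends in `"Closed"`, while `check_path(["close"], "Closed")` reports `"Error"`.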


Applying the Simple Algorithm


Assume file is initially in Open state

What Went Wrong?
Some of these paths are invalid
  No execution can follow these paths


An Invalid Path

A Second Algorithm
Keep track of branch decisions on paths
Abstract state is now a pair
  (predicate, file state)
Predicate is a conjunction of branch conditions
  If predicate is false, path is infeasible
File state is as before
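
A minimal sketch of the second algorithm. The example program is an assumption: it is modeled on the `dump` and `p` tests that the slides refer to, since the original figure is not available. The predicate is kept as a map from branch variable to the value the path requires, and a path is pruned when it needs a variable to be both true and false. This simple contradiction check stands in for a real theorem prover.

```python
from itertools import product

# Hypothetical program (an assumption, not taken from the slides' figure):
#   if (dump) open(f);   if (p) x = 0; else x = 1;   if (dump) close(f);
# Each statement is (branch variable, file event if true, file event if false).
PROGRAM = [("dump", "open", None), ("p", None, None), ("dump", "close", None)]

TRANSITIONS = {("Closed", "open"): "Open", ("Open", "close"): "Closed"}

def analyze(program, initial_state):
    """Explore every sequence of branch decisions, pruning infeasible ones.

    Returns a list of (predicate, final file state) for the feasible paths.
    """
    results = []
    for decisions in product([True, False], repeat=len(program)):
        predicate, state, feasible = {}, initial_state, True
        for (var, on_true, on_false), taken in zip(program, decisions):
            if predicate.setdefault(var, taken) != taken:
                feasible = False  # predicate requires var and not var: it is false
                break
            event = on_true if taken else on_false
            if event is not None and state != "Error":
                state = TRANSITIONS.get((state, event), "Error")
        if feasible:
            results.append((predicate, state))
    return results
```

Starting from `"Closed"`, only four of the eight candidate paths survive, and none of them reaches `"Error"`. The path that skips the open but performs the close is exactly the kind of invalid path that the predicate rules out.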


Tracking Predicates

Discussion
This works
  Modulo unresolved issues with loops/recursion
Requires
  A theorem prover
    Something that can deduce whether a predicate is false
  A way of accurately modeling branch predicates
A hard problem in general
  Branch predicates can be arbitrary code
  But many predicates are easy in practice

The Problem
The main problem is there are too many paths
In practice, this approach has not proven to be scalable
  Exponential blow-up in number of paths is real
  Can't extend this approach to large programs

An Observation
Some of the paths in our example are irrelevant to the property of interest
  Consider the test on p


Irrelevant Predicates

Discussion
We want something in between the naive approach with no predicates and modeling all predicates
Just want to model predicates relevant to the property


The Idea
Give up on analyzing paths completely independently
  Now analyze all paths in a function simultaneously
At points where paths join
  Join all abstract states where the information for the file is the same
Note: number of possible abstract states is now limited by the number of FSM states

The Join Operation
The join operation is special
New abstract states: (p1 ∧ p2 ∧ ... ∧ pn, S)
First component is a list of predicates
  Implicitly conjoined


The Join Operation (Cont.)
Idea: Join drops any predicates not in common
Example: Join[(p1 ∧ p2 ∧ p3, S), (p1 ∧ p4 ∧ p3, S)] = (p1 ∧ p3, S)
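
A minimal sketch of the join, assuming each abstract state is a pair of a set of predicate names and an FSM state. States with the same FSM state are merged by intersecting their predicate sets, and states with different FSM states are kept apart. The function name `join` and the string encoding of predicates are assumptions made for illustration.

```python
def join(states):
    """Merge abstract states (predicates, fsm_state) that share an FSM state,
    keeping only the predicates common to all of them."""
    merged = {}
    for predicates, fsm_state in states:
        predicates = frozenset(predicates)
        if fsm_state in merged:
            merged[fsm_state] &= predicates  # drop predicates not in common
        else:
            merged[fsm_state] = predicates
    return {(preds, s) for s, preds in merged.items()}

# The slide's example: only p1 and p3 are common to both states.
result = join([({"p1", "p2", "p3"}, "S"), ({"p1", "p4", "p3"}, "S")])
```

Because at most one abstract state survives per FSM state, the number of states after a join is bounded by the size of the FSM.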

Example


Discussion
Key to path-sensitive analysis is maintaining correlations
  Between branches and the property of interest
The pattern involving dump is very common
  As are more elaborate variations
But not all correlations are useful
  Some turn out to have no useful information at all
ESP tries to locally infer the useful correlations

What is Lost?

What is Lost?
If a correlation is established
  Before the property state is affected
  And ultimately does affect the property state
ESP will not track it and will lose information

Another Example


Back to Recursion
Consider the following example

  foo(x, y) {
    if (x == 0) return;
    open(y);
    close(y);
    foo(x - 1, y);
  }

Comments
As in any static analysis, recursion/looping introduces recursive constraints
Need an initial estimate for the solution, which can then be iteratively improved
  Convergence is guaranteed for ESP, as there are only finitely many possibilities altogether
  Either one of them is a solution or there is no solution

Comments
ESP uses summary edges to capture recursive constraints
Essentially, break cycle by assigning some initial value to the result of a recursive function
  Iterate to find true value
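
The iteration can be sketched for the recursive function `foo(x, y)` from the recursion example. This is an illustration of the fixed-point idea only, not ESP's summary-edge machinery: the value of `x` is ignored, so both branches are treated as possible, and the names `step` and `solve_foo` are hypothetical. The initial estimate for the result of the recursive call is the empty set of exit states.

```python
TRANSITIONS = {("Closed", "open"): "Open", ("Open", "close"): "Closed"}

def step(state, event):
    """Apply one file event; illegal transitions (and Error itself) lead to Error."""
    return TRANSITIONS.get((state, event), "Error")

def solve_foo(entry_state):
    """Iterate to a fixed point for
         foo(x, y) { if (x == 0) return; open(y); close(y); foo(x - 1, y); }
    summary[s] is the set of file states possible on exit when foo is entered in s."""
    states = ["Open", "Closed", "Error"]
    summary = {s: set() for s in states}  # initial estimate: no exit states known
    changed = True
    while changed:
        changed = False
        for s in states:
            exits = {s}  # branch x == 0: return immediately
            after_body = step(step(s, "open"), "close")
            exits |= summary[after_body]  # recursive call uses the current estimate
            if exits != summary[s]:
                summary[s], changed = exits, True
    return summary[entry_state]
```

Each pass can only add exit states, and there are finitely many FSM states, so the loop must terminate. Entering with the file `"Closed"` yields only `"Closed"` on exit, while entering with it `"Open"` also admits `"Error"`.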

Aliasing
Tricky aliasing is a problem in real code
Example:

  tmp = foo.field;
  f = tmp->file;
  open(tmp->file);
  close(tmp->file);
  if (foo.field->file == NULL)


Aliasing (Cont.)
Like all sound analysis systems, ESP incorporates alias analysis
  Context-sensitive
  Flow-insensitive
Property checking must be done for every expression in an alias equivalence class

What About Multiple Values?
What if a program, say, opens 3 files?
ESP is run 3 times
  Once for each file
  Or rather, each alias equivalence class with a file
Allows branches that affect other files to be ignored during analysis
  Separate analysis of each is more efficient than simultaneous analysis of all


Results
Verified file handling in gcc
  140,000 LOC
  600+ file manipulation calls
Strong guarantee
  Did not just fail to find any bugs
  Proved the program will always correctly handle files, regardless of input

Experience
ESP went on to become a production tool inside Microsoft
  Used on many core Windows projects
  Generally considered very successful
But used primarily as a bug finder
  Alias analysis not precise enough to limit false positives on truly large programs


Discussion
ESP is simpler than the summary- and constraint-based systems we have discussed
  Only reasons about paths
  Simple model of program state
But
  Complexity is hidden in
    Theorem prover
    Alias analysis
  Probably the two weakest links

Discussion (Cont.)
ESP is a global analysis system
Much of the apparent simplicity is because there is no need to construct sophisticated function summaries
  The alias analysis is also global
Does require the entire program
  Cannot easily be used on a library in isolation
