
C Programming Pitfalls

Uri Goren Kernel Stateful Enforcement I/S December 2005


2005 Check Point Software Technologies Ltd. Proprietary & Confidential

Part I: C Macros
And how they can damage your code


Problems Caused by Macros


This presentation shows several ways in which macros can cause unexpected problems in your code.

Goals:
- Make you scared. You should think twice before using macros.
- If you do write macros, help you do it safely.
- If you run into macro-related trouble, help you figure out what's wrong.

I have suggested solutions to each problem. But:
- Each solution solves just one problem. You may need to combine several solutions.
- Some solutions don't completely solve even one problem.
- Some can't be implemented in some cases.
- Some solutions contradict other solutions.
- Some solutions make your code ugly.

So DON'T use this as a programming guide. Just use it to know the risks.

Operator Order #1
Here's a nice macro:

    #define double(x) x+x

Let's try to use it:

    int a=5;
    printf("a=%d a*6=%d\n", a, double(a)*3);

Code after preprocessing:

    printf("a=%d a*6=%d\n", a, a+a*3);

Result: 20 instead of 30. Solution:

    #define double(x) (x+x)


Operator Order #2
Let's try this one:

    #define sqr(x) (x*x)

And use it:

    int a=3, b=2;
    printf("sqr(a+b)=%d\n", sqr(a+b));

Code after preprocessing:

    printf("sqr(a+b)=%d\n", (a+b*a+b));

Result: 11 instead of 25. Solution:

    #define sqr(x) ((x) * (x))
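Both operator-order problems are easy to reproduce in a small test file. The following is a minimal sketch; the macro and function names (TWICE_BAD, SQR_BAD and so on) are made up for illustration and are not the names used on the slides.

```c
/* Hypothetical names: the _BAD variants mirror the unparenthesized macros. */
#define TWICE_BAD(x) x+x
#define TWICE_OK(x)  ((x)+(x))
#define SQR_BAD(x)   (x*x)
#define SQR_OK(x)    ((x)*(x))

/* Expands to a+a*3: the caller's multiplication binds tighter than the +. */
int twice_bad_times3(int a) { return TWICE_BAD(a)*3; }
int twice_ok_times3(int a)  { return TWICE_OK(a)*3; }

/* Expands to (a+b*a+b): the argument is pasted in without parentheses. */
int sqr_bad_of_sum(int a, int b) { return SQR_BAD(a+b); }
int sqr_ok_of_sum(int a, int b)  { return SQR_OK(a+b); }
```

With a=5 the two "twice" variants give 20 and 30; with a=3, b=2 the two "sqr" variants give 11 and 25, matching the results on the slides.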


Multiple Evaluation
Here's the fixed version of the previous macro:

    #define sqr(x) ((x)*(x))

How many squares need to be added to get 1000?

    int n=1, sum=0;
    while (sum<1000) sum += sqr(n++);

After preprocessing:

    while (sum<1000) sum += ((n++)*(n++));

Result: n is incremented twice in each loop. Solutions:
- Don't pass parameters with side effects to macros.
- Don't evaluate twice. This may be hard without using braces, which isn't possible when the macro returns a value. With gcc, it's possible. But we're writing cross-platform.
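One way to avoid double evaluation, where the compiler allows it, is to use a function instead of a macro. The sketch below assumes a C99 compiler (for static inline); sqr_int and squares_needed are hypothetical names. Note that the macro version is worse than a wrong count: modifying n twice in one expression without a sequence point is undefined behavior in C.

```c
/* Hypothetical helper: a function evaluates its argument exactly once. */
static inline int sqr_int(int x) { return x * x; }

/* Counts how many squares 1*1 + 2*2 + ... are needed to reach limit. */
int squares_needed(int limit)
{
    int n = 1, sum = 0, count = 0;

    while (sum < limit) {
        sum += sqr_int(n++);   /* n is incremented once per iteration */
        count++;
    }
    return count;
}
```

squares_needed(1000) returns 14, since 1 + 4 + ... + 169 is 819 and adding 196 gives 1015.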

Multiple / Late Evaluation


We implement pools, and have this macro:

    #define move(srcpool, dstpool, num) { \
        move_the_elements;                \
        srcpool->size -= num;             \
        dstpool->size += num;             \
    }

Let's move everything from pool1 to pool2:

    move(pool1, pool2, pool1->size);

Code after preprocessing:

    move_the_elements;
    pool1->size -= pool1->size;    /* Set to 0: OK */
    pool2->size += pool1->size;    /* Add 0: Bug! */

Result: pool2->size isn't changed. Solution:

    #define move(srcpool, dstpool, num) { \
        int move_num = num;               \
        move_the_elements;                \
        srcpool->size -= move_num;        \
        dstpool->size += move_num;        \
    }
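The late-evaluation bug can be checked with a stripped-down pool. This is a sketch only: struct pool, MOVE_BAD and MOVE_OK are hypothetical stand-ins, and the actual element moving is left out so that only the size bookkeeping remains.

```c
struct pool { int size; };   /* hypothetical stand-in for the real pool type */

/* num is evaluated late, after the source size has already changed. */
#define MOVE_BAD(src, dst, num) { (src)->size -= (num); (dst)->size += (num); }

/* num is captured once, before either size is touched. */
#define MOVE_OK(src, dst, num) { int move_num = (num); \
    (src)->size -= move_num; (dst)->size += move_num; }

/* Moves "everything" from a pool of src_size into a pool of dst_size. */
int dst_size_after_bad_move(int src_size, int dst_size)
{
    struct pool p1 = { src_size }, p2 = { dst_size };
    MOVE_BAD(&p1, &p2, p1.size);
    return p2.size;
}

int dst_size_after_ok_move(int src_size, int dst_size)
{
    struct pool p1 = { src_size }, p2 = { dst_size };
    MOVE_OK(&p1, &p2, p1.size);
    return p2.size;
}
```

Moving 7 elements into a pool of 3 leaves the destination at 3 with the first macro and at 10 with the second.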


Variable Name Conflicts


Here's one:

    #define add_square(x, a) { /* Add a*a to x */ \
        int s = a; /* Avoid multiple eval */      \
        x += s*s;                                 \
    }

Let's try to use it:

    int s, t;
    add_square(s, t);

After preprocessing:

    { int s = t; s += s*s; }

Result: The caller's s is shadowed, and isn't changed. Solutions:
- Don't define variables in macros. But it contradicts previous suggestions.
- Compiler warning about shadowing (as if we look at them).
- Use more descriptive names. This only reduces the chances.
- Use names nobody uses, with many ___underscores. But everybody uses names which nobody uses.


Flow Control #1
Here's a useful macro:

    #define dbgprint(msg) if (dbgflag) printf(msg);

And let's use it:

    if (x==3) dbgprint("very bad\n"); else dbgprint("very good\n");

After preprocessing:

    if (x==3) if (dbgflag) printf("very bad\n");; else if (dbgflag) printf("very good\n");;

Result: because of the extra ;, the compiler won't relate the else to the if, which gives a compilation error.

Solutions:
- Make sure to use braces around the conditional statement, even if there's only one. But as the macro writer, you can't assure this.
- Omit the ; from the macro and leave it to the caller.


Flow Control #2
Look at this debugging macro:

    #define dbgprint(msg) if (dbgflag) printf(msg)

Suppose we use it this way:

    if (x==3) dbgprint("very bad\n"); else dbgprint("very good\n");

After preprocessing:

    if (x==3) if (dbgflag) printf("very bad\n"); else if (dbgflag) printf("very good\n");

Result: whose else is it? The else will be related to if (dbgflag), not to if (x==3), which gives wrong results.

Solutions:
- Use braces with the if. But the macro writer doesn't control it.
- Use one of these structures:

    #define dbgprint(msg) if (!dbgflag) {} else printf(msg)
    #define dbgprint(msg) do {if (dbgflag) printf(msg);} while(0)
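The dangling else can be observed without reading debug output. In this sketch, a counter named logged stands in for printf so the effect is testable; DBG_NAIVE, DBG_SAFE and the run_ functions are hypothetical names. Compilers may warn about the ambiguous else in run_naive, which is exactly the point.

```c
static int dbgflag;
static int logged;   /* stands in for printf output so the result is testable */

#define DBG_NAIVE(n) if (dbgflag) logged += (n)
#define DBG_SAFE(n)  do { if (dbgflag) logged += (n); } while (0)

int run_naive(int flag, int x)
{
    dbgflag = flag;
    logged = 0;
    if (x == 3)
        DBG_NAIVE(1);
    else                 /* binds to the macro's hidden if (dbgflag) */
        DBG_NAIVE(100);
    return logged;
}

int run_safe(int flag, int x)
{
    dbgflag = flag;
    logged = 0;
    if (x == 3)
        DBG_SAFE(1);
    else                 /* binds to if (x == 3), as intended */
        DBG_SAFE(100);
    return logged;
}
```

With debugging on and x equal to 5, run_naive logs nothing because the else branch belongs to the inner if, while run_safe logs 100.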


Macro / Symbol Name Collision


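In some header, we define a nice macro:

    #define sum(a, b) ((a)+(b))

In some unrelated code, which just happens to include this header:

    int sum(int x, int y);

Result: the function name sum is treated as a macro invocation, which gives a compilation error. (A function-like macro is expanded only when its name is followed by an opening parenthesis, so a plain variable such as int sum = 3; would still compile.)

Solution: prefix the name with the component name:

    #define cp_math_sum(a, b) ((a)+(b))

Quite annoying: the name is too long.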


Part II: Integer Arithmetic Pitfalls


Preview
Do we really understand integers?
- We take mathematical operations for granted.
- We assume that things work just like we learned in elementary school.
- We take integer arithmetic as something basic that doesn't require any bothering.

Question: Which integers satisfy the condition (x == -x)?
- In normal math, there's only one: 0.
- In fact there are two (with 32-bit integers): 0 and 0x80000000. Check for yourself.

Conclusion: integers are not as simple as you may think.

In this presentation you will find:
- Many functions and code segments doing integer arithmetic.
- All are mathematically correct: if integers were simple numbers, they would give correct results.
- All are buggy: they fail because of how integers work.
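The "check for yourself" can be done with a few lines. This sketch uses uint32_t so the negation is well-defined wrap-around arithmetic; negating the most negative signed int is formally undefined in C, which is one more reason to test the bit pattern as unsigned. The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when x equals its own negation in 32-bit wrap-around arithmetic. */
int is_own_negation(uint32_t x)
{
    uint32_t neg = (uint32_t)(0u - x);   /* well-defined modulo 2^32 */
    return x == neg;
}
```

Only 0 and 0x80000000 pass the test; every other 32-bit value fails it.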


Intermediate Results
When evaluating a complex expression, there are intermediate results.
- Normally, we ignore them: we look at the big picture.
- Intermediate results have their types and their data range. They are not stored in an arbitrary size and precision.
- It's just as if you had declared them explicitly:

    int x = a + b * c;

  is the same as:

    int temp1 = b * c;    /* Assuming that b, c are ints */
    int x = a + temp1;

What if the intermediate result wraps, but the final result doesn't?
- With addition and subtraction, it's usually OK. In (1-2)+3, though 1-2 is 0xffffffff, we eventually get 2, which is correct.
- With multiplication and division, it usually isn't. In 2GB * 3 / 10, 2GB * 3 will overflow, and division won't fix it.
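The two cases can be compared side by side. The sketch below uses fixed-width unsigned types so the wrap-around is well-defined; the function names are hypothetical, and the 64-bit variant is one possible remedy rather than something the slides prescribe.

```c
#include <stdint.h>

/* (1 - 2) + 3: the intermediate wraps to 0xffffffff, the sum wraps back. */
uint32_t add_sub_wraps(void)
{
    uint32_t diff = (uint32_t)(1u - 2u);
    return (uint32_t)(diff + 3u);
}

/* x * 3 / 10 with a 32-bit intermediate: the product wraps before dividing. */
uint32_t three_tenths_narrow(uint32_t x)
{
    uint32_t product = (uint32_t)(x * 3u);
    return product / 10u;
}

/* Same formula with a 64-bit intermediate, so the product cannot wrap. */
uint32_t three_tenths_wide(uint32_t x)
{
    uint64_t product = (uint64_t)x * 3u;
    return (uint32_t)(product / 10u);
}
```

For x = 0x80000000 (2GB) the narrow version returns 214748364, while the wide version returns the expected 644245094.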

Unsigned Integers - Loop Counter

Here's a nice loop:

    int a[SIZE];
    unsigned int pos;
    for (pos=SIZE-1; pos>=0; pos--) a[pos] = 555;

Any problem?
- pos >= 0 is a meaningless condition!
- pos will go down from SIZE-1 to 0, then to 0xffffffff. This is still not negative.
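One common way to count down with an unsigned index is to test before decrementing. This is a sketch with a hypothetical function name; it takes the array and its size as parameters instead of relying on SIZE.

```c
/* Sets a[size-1] .. a[0] to value, walking backward with an unsigned index. */
unsigned int fill_backward(int *a, unsigned int size, int value)
{
    unsigned int pos, writes = 0;

    /* pos is compared with 0 first and decremented afterwards,
       so the body sees size-1 down to 0 and the loop then stops. */
    for (pos = size; pos-- > 0; ) {
        a[pos] = value;
        writes++;
    }
    return writes;
}
```

The loop also behaves correctly for size 0, where the original form would start at 0xffffffff.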


Unsigned Integers - Subtraction #1

Here's a nice macro:

    #define DIFF(x,y) (x-y)
    #define PDIFF(x,y) (DIFF(x,y) > 0 ? DIFF(x,y) : 0)

Very nice and simple: it returns the difference, if it's positive. Now let's use it:

    unsigned int x=3, y=5;
    printf("%d\n", PDIFF(x,y));

We get -2!!!
- x, y are unsigned, so (x-y) is unsigned.
- (x-y) > 0 is the same as (x-y) != 0.

Unsigned Integers - Subtraction #2

Let's fix the macro above:

    #define PDIFF(x,y) ((int)(DIFF(x,y)) > 0 ? (int)(DIFF(x,y)) : 0)

Now, does it work?

    unsigned int x = 3U*1024*1024*1024 + 3;
    unsigned int y = 3;
    printf("%u\n", PDIFF(x,y));

We get 0!
- x is greater than y, but (x-y), when viewed as a signed integer, is negative.
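Both subtraction slides fail because they subtract first and ask questions later. A sketch of the opposite order, with a hypothetical function name, compares the operands before subtracting:

```c
#include <stdint.h>

/* Positive difference of two unsigned values: compare first, subtract second. */
uint32_t pdiff_u32(uint32_t x, uint32_t y)
{
    return (x > y) ? (uint32_t)(x - y) : 0u;
}
```

This returns 0 for (3, 5) and the full 3GB difference for the second example, with no cast to a signed type involved.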


Unsigned Integers - Overflow

Take a look at this function:

    int f(unsigned int x, unsigned int y)
    {
        if (x+y > 1000) return TRUE;
        return FALSE;
    }

We expect it to return FALSE only if both x and y are pretty small. Now how about this:

    unsigned int a = 2U * 1024 * 1024 * 1024;
    if (!f(a, a)) printf("boom!\n");

Obviously, a is very large, so a+a must also be large. However, a+a equals 0! The function will return FALSE.
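The check can be rewritten so the sum is never computed. This is a minimal sketch; sum_exceeds is a hypothetical name, and the limit is passed as a parameter rather than hard-coded to 1000.

```c
/* Returns 1 when x + y would exceed limit, without ever computing x + y. */
int sum_exceeds(unsigned int x, unsigned int y, unsigned int limit)
{
    if (x > limit)
        return 1;
    return y > limit - x;   /* limit - x cannot wrap because x <= limit */
}
```

With x = y = 2GB the function now reports that the limit is exceeded.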

Signed Integers - Boundaries #1

Boundary checking is important. As in:

    int f(int x)
    {
        static int y[SIZE] = { ... };
        if (x >= SIZE) return ERROR;
        return y[x];
    }

But does it check the argument properly? How about f(-1)?
- The error won't be caught.


Signed Integers - Boundaries #2

Here's a lovely function:

    int dy(int x)
    {
        static int y[SIZE] = { ... };
        if (x<0 || x+1>=SIZE) return ERROR;
        return (y[x+1] - y[x]);
    }

This time, we carefully check the arguments, so we won't exceed the array boundaries. Do we?

How about dy(0x7fffffff)? The condition is now:

    if (0x7fffffff < 0 || 0x80000000 >= SIZE) return ERROR;

0x80000000 is negative! It's not greater than SIZE. The function will not catch the error!

Using Constants #1
Consider the following function:

    int big_enough(int size)
    {
        return (size > sizeof(int));
    }

Tells you whether a given size is big enough. Obviously, a negative size is not big enough. Or is it?

big_enough(-1) will return true!
- When comparing, size is converted to the unsigned type of sizeof (u_int on 32-bit platforms).
- So we're comparing 0xffffffff with 4: certainly big enough!

Constants have a type, and can be signed/unsigned:
- Signed constants, e.g. 100, 100L.
- Unsigned constants, e.g. 100U, sizeof(anything).


Using Constants #2
Here's another function:

    int big_enough2(int size)
    {
        return ((size - sizeof(int)) > 100);
    }

Tells you if the size is big enough for something. Again, negative sizes are surely not big enough. And again:

big_enough2(-1) will return true!
- When subtracting sizeof(int), the result is unsigned.
- So we're comparing 0xfffffffb with 100: certainly big enough!
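One way to repair both functions is to reject the problematic inputs before any mixed signed/unsigned arithmetic happens. This is a sketch with hypothetical _fixed names; the explicit casts are there to make the conversions visible, not because the compiler needs them.

```c
#include <stddef.h>

int big_enough_fixed(int size)
{
    if (size < 0)
        return 0;                        /* negative sizes are never big enough */
    return (size_t)size > sizeof(int);   /* both operands unsigned, on purpose */
}

int big_enough2_fixed(int size)
{
    if (size < 0 || (size_t)size < sizeof(int))
        return 0;                        /* the subtraction below would wrap */
    return (size_t)size - sizeof(int) > 100;
}
```

Both functions now return false for -1, and still return true for sizes that really are large enough.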


Calculating Average
Here's a simple exercise:
- Write a function to calculate the average of two numbers.
- Actually, two exercises: signed and unsigned.

Solutions:

    unsigned int u_avg(unsigned int x, unsigned int y)
    {
        return (x + y) / 2;
    }

    int s_avg(int x, int y)
    {
        return (x + y) / 2;
    }

Now let's check if it works:

Unsigned:

    unsigned int x = 2U * 1024 * 1024 * 1024;
    printf("%u\n", u_avg(x, x));

We get 0! x+x equals 0, so (x+x)/2 does also.

Signed:

    int x = 0x7fffffff;
    printf("%d\n", s_avg(x, x));

We get -1! x+x equals 0xfffffffe, which is -2. So (x+x)/2 is -1.
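Both averages can be computed without the overflowing sum. The sketch below shows two different techniques under hypothetical names: halving before adding for the unsigned case, and a wider intermediate for the signed case. It assumes fixed-width 32-bit inputs and that a 64-bit type is available.

```c
#include <stdint.h>

uint32_t u_avg_safe(uint32_t x, uint32_t y)
{
    /* Halve each operand first; add back the 1 lost when both are odd. */
    return x / 2 + y / 2 + (x & y & 1u);
}

int32_t s_avg_safe(int32_t x, int32_t y)
{
    /* Sum in 64 bits, then divide; truncates toward zero like (x + y) / 2. */
    return (int32_t)(((int64_t)x + y) / 2);
}
```

The average of 2GB and 2GB is now 2GB, and the average of 0x7fffffff with itself is 0x7fffffff.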


Percentage #1
How much is 30% out of something?
- That's easy. Can you program it?

Sure. Let's do it nice, clean and modular:

    #define PCT(p) (p / 100)
    int f(int x) { return PCT(30) * x; }

Oops, it never works.
- 30 / 100 is 0. We always return 0.


Percentage #2
The last percentage function was stupid. Let's write a better one:

    int f(int x) { return x * 30 / 100; }

Now it works. Always?

Company X is worth $143,165,600. I have 30% of the shares. What's my fortune?
- Using the above function we get $7. Not very exciting.
- Why? x * 30 is more than 4G, so it wraps around. Division can't fix it any more.

Percentage #3
Writing a percentage function can't be that hard. This time, we'll do it right:

    int f(int x) { return x / 100 * 30; }

Let's see how it does with the last example:
- f(143165600) returns 42,949,680. I like this one much better.

But how about something easier?
- How much is 30% of 10? f(10) returns 0!
- 10 / 100 is 0.


Percentage #4
Here's a more general percentage function:

    int p(int x, unsigned int p)
    {
        if (x>1000 || x<-1000 || p>100) return OUT_OF_RANGE;
        return x * p / 100;
    }

We don't support large x, so we can't overflow.

But how much is 50% of -30? Let's try p(-30, 50):
- We get 42949657. How come?
- x is signed, p is unsigned (makes sense).
- In C, it means x*p is unsigned. We put -1500 in an unsigned integer: it wraps around.
- Division treats it as a large positive number, and returns a smaller positive number.


Percentage #5
Here's a harder question: what percentage is 30 out of 50? Or generally, what percentage is x out of y?

Here are all the simple ways to calculate it:

(x / y * 100)
- Returns 0 when x < y (all normal cases).

(x * 100 / y)
- Overflows when x is large (what's 50M out of 80M?)

x / (y / 100)
- Crashes when y is small (what's 5 out of 8?)
- Inaccurate when y is not very large (what's 500 of 599?)

100 / y * x
- Inaccurate when y is less than 100. 0 when y is more than 100.
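The percentage slides share one root cause: the multiplication needs more room than the operands. A possible approach, sketched below with hypothetical function names, is to multiply in a 64-bit intermediate and keep every operand signed. It assumes a 64-bit type is available and that the final result fits in 32 bits; the -1 returned for a zero denominator is an arbitrary choice for this sketch.

```c
#include <stdint.h>

/* pct percent of x, multiplying before dividing in a 64-bit intermediate. */
int32_t percent_of(int32_t x, int32_t pct)
{
    return (int32_t)((int64_t)x * pct / 100);
}

/* What percentage x is of y; returns -1 for y == 0 (hypothetical error value). */
int32_t percent_ratio(int32_t x, int32_t y)
{
    if (y == 0)
        return -1;
    return (int32_t)((int64_t)x * 100 / y);
}
```

With these, 30% of 143,165,600 is 42,949,680, 30% of 10 is 3, 50% of -30 is -15, and 50M out of 80M is 62 (the result is truncated, not rounded).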


Signed/Unsigned Division
Here's a nice function:

    int cut(int x, unsigned int factor) { return x / factor; }

How much is a half of -6? Try cut(-6, 2):
- We get 2147483645. How come?
- x is signed, factor is unsigned.
- So before dividing, x is converted to unsigned: we get 4G-6.
- After dividing, we get 2G-3. Converting it back to signed keeps it a large positive.


Bit Fields
Bit fields are very nice: they save memory. Here's a program that uses them:

    struct x {
        int flag:1;
        int count:31;
    };

    const char *flag_set(struct x *s)
    {
        const char *n[] = { "FALSE", "TRUE" };
        return n[s->flag];
    }

What happens if we set flag to 1 and call flag_set?
- It returns an invalid string!
- flag is signed, so it can get either 0 or -1. So our program returns n[-1].

This example is platform dependent.
- In Solaris cc, bit fields are unsigned (unless explicitly signed). This program works fine on Solaris (if compiled with cc).
- In gcc (on all platforms), bit fields are signed.
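Declaring the bit fields as unsigned int removes the platform dependence, because an unsigned 1-bit field holds 0 or 1 on every compiler. A minimal sketch, with hypothetical struct and function names:

```c
struct flags {
    unsigned int flag:1;     /* explicitly unsigned: holds 0 or 1 everywhere */
    unsigned int count:31;
};

const char *flag_name(const struct flags *s)
{
    static const char *const names[] = { "FALSE", "TRUE" };
    return names[s->flag];   /* index is always 0 or 1 */
}
```

Setting flag to 1 now selects "TRUE" regardless of the compiler's default bit-field signedness.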


Shifting
Check out this function:

    void printb(int x, char *buf)
    {
        buf[0] = '\0';
        for (; x != 0; x >>= 1)
            strcat(buf, (x & 1) ? "1" : "0");
    }

It creates a string with x in binary (reversed). How about printb(0x80000000, buf)?
- We expect a lot of zeros, ending with 1.
- The function will loop infinitely (until it crashes)!
- Shifting right a signed integer duplicates the high order bit! 0x80000000 >> 1 == 0xc0000000. (8=1000b, c=1100b).
- Shift right is like division: it preserves the sign.
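Using an unsigned parameter makes the loop terminate, because an unsigned right shift always brings in a zero bit. (For negative signed values the result of >> is implementation-defined in C; duplicating the sign bit is what most compilers do.) A sketch with a hypothetical function name:

```c
#include <stdint.h>
#include <string.h>

/* Writes x in binary, least significant bit first.
   buf needs room for 32 digits plus the terminating NUL. */
void printb_u(uint32_t x, char *buf)
{
    buf[0] = '\0';
    for (; x != 0; x >>= 1)            /* unsigned shift brings in a 0 bit */
        strcat(buf, (x & 1u) ? "1" : "0");
}
```

For 0x80000000 this produces 31 zeros followed by a single 1, as originally expected.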


Conclusions
Be aware. Remember that:
- Your code may not mean what you think it means.
- The variable's type and valid range are important.
- Intermediate data has a type and valid range. This is especially important with multiplication and division.

Unsigned integers are more dangerous:
- The wrap around value (0) is closer to the expected values.
- A single unsigned makes the whole expression unsigned.

Test your arithmetic:
- Copy the arithmetic into a simple test program, and test all cases. It is much easier to cover all possible values this way.
- Even code that seems simple and correct may surprise you.

Use types explicitly:
- When types matter, don't let the compiler cast automatically. Cast yourself, to make things clear.
- Use variables for intermediate results, even when not needed. This may remind you of the intermediate values' importance.

Part III: Miscellaneous C Pitfalls


Uri Goren August 2005


Alignment
Consider the following function:

    char buf[SIZE];
    void write_num(int off, int num)
    {
        int *p = (int *)&buf[off];
        *p = num;
    }

It writes a number at a given offset within a buffer. What if the offset isn't a multiple of 4?
- Intel based platforms will work a bit slower.
- Sun Sparc (Solaris): crash!

So pay attention to alignment.
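A portable way to store an integer at an arbitrary byte offset is to copy its bytes with memcpy, which has no alignment requirement. This is a sketch: SIZE is a hypothetical value, read_num is added only to make the result checkable, and neither function validates that off leaves room for an int inside the buffer.

```c
#include <string.h>

#define SIZE 64                 /* hypothetical buffer size */
static char buf[SIZE];

/* Copies num into buf at any byte offset; caller keeps off in range. */
void write_num(int off, int num)
{
    memcpy(&buf[off], &num, sizeof num);
}

/* Reads the value back the same way, again without alignment assumptions. */
int read_num(int off)
{
    int num;
    memcpy(&num, &buf[off], sizeof num);
    return num;
}
```

Odd offsets such as 1 or 7 now work on strict-alignment platforms as well.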



Operator Precedence
We all know the precedence of some operators:
- Multiplication and division before addition and subtraction. a * b + c is the same as (a * b) + c.
- Assignment after almost everything: a = x + y is the same as a = (x + y). Not (a = x) + y!

But do we always know the precedence?

    a + b << 2
    a^b&c
    a>3?c=d:x=y

You can find the full precedence table easily.
- Don't do it! When you're not 100% sure, use parentheses.
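For the curious, the first two puzzles can be settled with a test program, which is the approach the conclusions of Part II recommend anyway. The function names below are hypothetical; compilers may warn about the missing parentheses, which supports the advice above.

```c
/* a + b << 2 groups as (a + b) << 2: addition binds tighter than shift. */
int add_then_shift(int a, int b)
{
    return a + b << 2;
}

/* a ^ b & c groups as a ^ (b & c): & binds tighter than ^. */
int and_then_xor(int a, int b, int c)
{
    return a ^ b & c;
}
```

With a=1, b=2 the first returns 12 (not 9), and with a=1, b=3, c=2 the second returns 3 (not 2).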


The Importance of Prototypes


Prototypes are great, but optional.
- They allow the compiler to catch more errors.
- Omitting them just causes a warning.
- The code works fine without them. Most of the time.

Look at this case:

    /* char *get_name(int id);   No prototype! */
    printf("%s\n", get_name(MY_ID));

Will this work?
- On 64bit platforms, the returned value will be assumed int. The higher 32 bits will be ignored.
- If the string is located above 4GB, it will crash.

Sometimes we get away with it.
- In Solaris 64bit kernel, all global and static variables are located below 4GB.
- The problem is when returning a pointer to dynamic memory.

Arrays with Offset -1


Normally, we can access only non-negative array offsets. But look at this trick:

    int _a[SIZE+1];
    int *a = &_a[1];

Now we can access a[-1] to a[SIZE-1]. But it will fail under two conditions:
- The index is an unsigned variable.
- The index is of a type smaller than a pointer: u_char or u_short on 32 bits. On 64 bits, also u_int.

So this trick should be done carefully (or not at all).


Part IV: Examples from Our Code


Array With a Negative Index


Here's an array, defined in fwdrv.c:

    static struct fwiftab _fwiftab[MAXIFP+1];
    struct fwiftab *fwiftab = &_fwiftab[1];

This should allow access to fwiftab[-1]. But what if the index is unsigned?
- It will crash on 64bit platforms.
- It will crash if the index is u_char or u_short.

In practice:
- It's always called with a signed int.
- -1 is possible only on Nokia, which isn't 64bit.
- We're lucky.


Macro Affecting Flow Control


Here's a very useful kernel macro:

    #define FW_ASSERT(caller, cond, msg) { \
        if (fw_assert_on && !(cond)) { \
            kdprintf("FW-1: %s: %s (%s:%d)\n", \
                caller, msg, __FILE__, __LINE__); \
            fw_panic(msg); \
        } \
    }

What happens when used in an if?

    if (x > 0) FW_ASSERT(rname, y > 0, "too small"); else printf("OK\n");

This will not compile!
- The semicolon after FW_ASSERT will break the if statement.
- So use FW_ASSERT carefully.


Macro Parameter Names


Look what I've found in fwlddist.c:

    #define FWSYNC_FCU_SET_TMOUT_TTL(timeout, ttl) \
        do { \
            info.timeout = &(timeout); \
            info.ttl = &(ttl); \
        } while (0)

The macro was written carefully:
- do {} while(0) is used: it works fine with if-else.
- Parentheses around all parameters.

One real bug:
- timeout and ttl are both parameters and structure members.
- FWSYNC_FCU_SET_TMOUT_TTL(3, 4) won't compile.

In practice:
- It's called many times, always with variables named timeout and ttl.
- This is the only case where the macro can work.


Possible Overflow
Here's a piece of code from fwatom.c:

    u_int fw_hmem_size_new, fw_hmem_maxsize_new;
    ...
    if (fw_hmem_size_new * 2 > fw_hmem_maxsize_new)
        fw_hmem_size_new = fw_hmem_maxsize_new / 2;

Makes sure that the new size doesn't exceed half the new limit. Both sizes are in bytes.

But what if the size is 2GB or more?
- fw_hmem_size_new * 2 will wrap around.
- The size won't be decreased.

In practice:
- The size can't be more than 2GB minus something. This is because we currently can't use more than 2GB.
- The bug is just around the corner.
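The wrap-around can be avoided by dividing the limit instead of multiplying the size. For unsigned integers, size * 2 > max holds exactly when size > max / 2, so the check below is equivalent wherever the original does not overflow. This is a sketch with a hypothetical function name, not the actual fwatom.c code.

```c
#include <stdint.h>

/* Returns size limited to half of maxsize, with no multiplication to overflow. */
uint32_t clamp_to_half(uint32_t size, uint32_t maxsize)
{
    if (size > maxsize / 2)
        size = maxsize / 2;
    return size;
}
```

A size of 2GB against a limit of 1000 is now reduced to 500 instead of slipping through.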


Wrong Parameter Checking


A function from fwdrv.c:

    char *
    fw_func_getname(int func_id)
    {
        if (func_id < fwfuncs.nfunc)
            return fwfuncs.funcdesc[func_id].funcname;
        return NULL;
    }

What if func_id is negative?
- It will return a bad pointer.

In practice:
- func_id isn't negative, unless there's another bug.
- The string is used only if debug is enabled.
