C Programming Pitfalls

Uri Goren Kernel Stateful Enforcement I/S December 2005

2005 Check Point Software Technologies Ltd. Proprietary & Confidential
Part I: C Macros
And how they can damage your code
Problems Caused by Macros

This presentation shows several ways in which macros can cause unexpected problems in your code.

Goals:
- Make you scared. You should think twice before using macros.
- If you do write macros, help you do it safely.
- If you run into macro-related trouble, help you figure out what's wrong.

I have suggested solutions to each problem. But:
- Each solution solves just one problem. You may need to combine several solutions.
- Some solutions don't completely solve even one problem.
- Some can't be implemented in some cases.
- Some solutions contradict other solutions.
- Some solutions make your code ugly.

So DON'T use this as a programming guide. Just use it to know the risks.
Operator Order #1
Here's a nice macro:

    #define double(x) x+x

Let's try to use it:

    int a=5;
    printf("a=%d a*6=%d\n", a, double(a)*3);

Code after preprocessing:

    printf("a=%d a*6=%d\n", a, a+a*3);

Result: 20 instead of 30. Solution:

    #define double(x) (x+x)
Operator Order #2
Let's try this one:

    #define sqr(x) (x*x)

And use it:

    int a=3, b=2;
    printf("sqr(a+b)=%d\n", sqr(a+b));

Code after preprocessing:

    printf("sqr(a+b)=%d\n", (a+b*a+b));

Result: 11 instead of 25. Solution:

    #define sqr(x) ((x) * (x))
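The two fixes above combine into one rule of thumb: parenthesize every parameter and the whole macro body. The sketch below puts the broken and fixed forms side by side. The macro and helper names (DOUBLE_BAD, DOUBLE_OK, SQR_BAD, SQR_OK and the wrapper functions) are hypothetical, chosen only for this illustration.

```c
/* Unparenthesized forms, as on the slides (names are illustrative). */
#define DOUBLE_BAD(x) x+x
#define SQR_BAD(x)    (x*x)

/* Fixed forms: each parameter and the whole body are parenthesized. */
#define DOUBLE_OK(x)  ((x)+(x))
#define SQR_OK(x)     ((x)*(x))

/* Small wrappers that make the two expansions easy to compare. */
static int double_times3_bad(int a)  { return DOUBLE_BAD(a)*3; } /* a+a*3       */
static int double_times3_ok(int a)   { return DOUBLE_OK(a)*3;  } /* (a+a)*3     */
static int sqr_sum_bad(int a, int b) { return SQR_BAD(a+b); }    /* a+b*a+b     */
static int sqr_sum_ok(int a, int b)  { return SQR_OK(a+b);  }    /* (a+b)*(a+b) */
```

With a=5 the first pair gives 20 and 30; with a=3, b=2 the second pair gives 11 and 25, matching the slides.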
Multiple Evaluation
Here's the fixed version of the previous macro:

    #define sqr(x) ((x)*(x))

How many squares need to be added to get 1000?

    int n=1, sum=0;
    while (sum<1000)
        sum += sqr(n++);

After preprocessing:

        sum += ((n++)*(n++));

Result: n is incremented twice in each loop iteration (strictly, modifying n twice in one expression is undefined behavior in C). Solutions:
- Don't pass parameters with side effects to macros.
- Don't evaluate twice. This may be hard without using braces, which isn't possible when the macro returns a value.
  With gcc, it's possible (using statement expressions). But we're writing cross-platform.
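Where a function can be used instead of a macro, the double evaluation disappears, because a function argument is evaluated exactly once. This is a minimal sketch, assuming a C99 compiler (for static inline). The names sqr_fn and squares_needed are hypothetical and are not part of the original slides.

```c
/* A function evaluates its argument exactly once, so side effects are safe. */
static inline int sqr_fn(int x)
{
    return x * x;
}

/* Counts how many squares (1*1, 2*2, ...) must be added to reach `limit`. */
static int squares_needed(int limit)
{
    int n = 1, sum = 0, count = 0;

    while (sum < limit) {
        sum += sqr_fn(n++);   /* n is incremented once per iteration */
        count++;
    }
    return count;
}
```

For a limit of 1000 this counts 14 squares (1 + 4 + ... + 196 = 1015).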
Multiple / Late Evaluation

We implement pools, and have this macro:

    #define move(srcpool, dstpool, num) { \
        move_the_elements;                \
        srcpool->size -= num;             \
        dstpool->size += num;             \
    }

Let's move everything from pool1 to pool2:

    move(pool1, pool2, pool1->size);

Code after preprocessing:

    move_the_elements;
    pool1->size -= pool1->size;    /* Set to 0: OK */
    pool2->size += pool1->size;    /* Add 0: Bug!  */

Result: pool2->size isn't changed. Solution:

    #define move(srcpool, dstpool, num) { \
        int move_num = num;               \
        move_the_elements;                \
        srcpool->size -= move_num;        \
        dstpool->size += move_num;        \
    }
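The same effect can be had with a function, since the argument is copied by value before either pool is modified. The sketch below assumes a hypothetical struct pool with only a size field. Moving the elements themselves is left out, as on the slide.

```c
/* Hypothetical pool type: only the field used by the example. */
struct pool {
    int size;
};

/* Moves `num` elements; `num` is copied by value before any field changes. */
static void pool_move(struct pool *src, struct pool *dst, int num)
{
    /* (moving the elements themselves is omitted in this sketch) */
    src->size -= num;
    dst->size += num;
}
```

Calling pool_move(&pool1, &pool2, pool1.size) evaluates pool1.size once, at the call, so pool2 grows by the full amount.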
Variable Name Conflicts

Here's one:

    #define add_square(x, a) {  /* Add a*a to x */        \
        int s = a;              /* Avoid multiple eval */ \
        x += s*s;                                         \
    }

Let's try to use it:

    int s, t;
    add_square(s, t);

Code after preprocessing:

    { int s = t; s += s*s; }

Result: the caller's s is shadowed, and isn't changed. Solutions:
- Don't define variables in macros. But this contradicts previous suggestions.
- Enable compiler warnings about shadowing (as if we look at them).
- Use more descriptive names. This only reduces the chances.
- Use names nobody uses, with many ___underscores. But everybody uses names which nobody uses.
Flow Control #1
Here's a useful macro:

    #define dbgprint(msg) if (dbgflag) printf(msg);

And let's use it:

    if (x==3)
        dbgprint("very bad\n");
    else
        dbgprint("very good\n");

Code after preprocessing:

    if (x==3)
        if (dbgflag) printf("very bad\n");;
    else
        if (dbgflag) printf("very good\n");;

Result: because of the extra ;, the compiler won't relate the else to the if: compilation error. Solutions:
- Make sure to use braces around the conditional statement, even if there's only one. But as the macro writer, you can't ensure this.
- Omit the ; from the macro and leave it to the caller.
Flow Control #2
Look at this debugging macro:

    #define dbgprint(msg) if (dbgflag) printf(msg)

Suppose we use it this way:

    if (x==3)
        dbgprint("very bad\n");
    else
        dbgprint("very good\n");

Code after preprocessing:

    if (x==3)
        if (dbgflag) printf("very bad\n");
    else
        if (dbgflag) printf("very good\n");

Result: whose else is it? The else will be related to if (dbgflag), not to if (x==3): wrong results. Solutions:
- Use braces with the if. But the macro writer doesn't control it.
- Use one of these structures:

    #define dbgprint(msg) if (!dbgflag) {} else printf(msg)
    #define dbgprint(msg) do {if (dbgflag) printf(msg);} while(0)
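To see the do { } while (0) form working inside an unbraced if/else, here is a small sketch. To keep it checkable, the macro records the message in a variable (last_msg) instead of calling printf. The names DBGPRINT, last_msg and report are hypothetical.

```c
#include <stddef.h>

static int dbgflag = 0;
static const char *last_msg = NULL;   /* stands in for printf() output */

/* The do { } while (0) wrapper makes the macro behave as one statement. */
#define DBGPRINT(msg) do { if (dbgflag) last_msg = (msg); } while (0)

/* Returns the message that was "printed", or NULL if debugging is off. */
static const char *report(int x)
{
    last_msg = NULL;
    if (x == 3)
        DBGPRINT("very bad");
    else
        DBGPRINT("very good");
    return last_msg;
}
```

With dbgflag off, nothing is recorded for any x. With it on, x==3 records "very bad" and any other x records "very good", so the else belongs to the outer if as intended.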
Macro / Symbol Name Collision

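In some header, we define a nice macro:

    #define sum(a, b) ((a)+(b))

In some unrelated code, which just happens to include this header:

    int sum(const int *values, int count);

Result: sum( is treated as a macro invocation: compilation error. (A plain variable such as int sum = 3; is safe here, because a function-like macro is expanded only when its name is followed by a parenthesis. An object-like macro would break that too.) Solution: prefix the name with the component name:

    #define cp_math_sum(a, b) ((a)+(b))

Quite annoying: the name is too long.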
Part II: Integer Arithmetic Pitfalls
Preview
Do we really understand integers?
- We take mathematical operations for granted. We assume that things work just like we learned in elementary school.
- We take integer arithmetic as something basic, which doesn't require any attention.

Question: which integers satisfy the condition (x == -x)?
- In normal math, there's only one: 0.
- In fact there are two: 0 and 0x80000000 (for a 32-bit int). Check for yourself.

Conclusion: integers are not as simple as you may think.

In this presentation you will find many functions and code segments doing integer arithmetic.
- All are mathematically correct: if integers were simple numbers, they would give correct results.
- All are buggy: they fail because of how integers work.
Intermediate Results
When evaluating a complex expression, there are intermediate results. Normally, we ignore them: we look at the big picture.
- Intermediate results have their types, and their data range. They are not stored in an arbitrary size and precision.
- It's just as if you had declared them explicitly:

    int x = a + b * c;

  is the same as:

    int temp1 = b * c;    /* Assuming that b, c are ints */
    int x = a + temp1;

What if the intermediate result wraps, but the final result doesn't?
- With addition and subtraction, it's usually OK. In (1-2)+3, though 1-2 is 0xffffffff, we eventually get 2: correct.
- With multiplication and division, it usually isn't. In 2GB * 3 / 10, 2GB * 3 will overflow, and division won't fix it.
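One common way around a wrapping intermediate result is to do the multiplication in a wider type. The sketch below contrasts the two for the 2GB * 3 / 10 case. It assumes a platform where unsigned int is 32 bits wide, and the function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit arithmetic: x * 3 wraps before the division. */
static uint32_t three_tenths_32(uint32_t x)
{
    return x * 3 / 10;
}

/* The intermediate product is held in 64 bits, so it cannot wrap. */
static uint32_t three_tenths_64(uint32_t x)
{
    return (uint32_t)((uint64_t)x * 3 / 10);
}
```

For x = 2GB (2147483648), the 32-bit version yields 214748364 while the 64-bit version yields the expected 644245094.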
Unsigned Integers - Loop Counter
Here's a nice loop:

    int a[SIZE];
    unsigned int pos;
    for (pos=SIZE-1; pos>=0; pos--)
        a[pos] = 555;

Any problem?
- pos >= 0 is a meaningless condition!
- pos will go down from SIZE-1 to 0, then to 0xffffffff. This is still non-negative, so the loop never ends and writes outside the array.
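A downward loop with an unsigned counter can be written so that the test happens before the decrement. This is a sketch of that idiom. SIZE is given an arbitrary illustrative value, and fill_backwards is a hypothetical name.

```c
#define SIZE 8   /* illustrative size */

static int a[SIZE];

/* Fills a[] from the last element down to a[0] using an unsigned counter. */
static void fill_backwards(int value)
{
    unsigned int pos;

    /* Test before decrementing: the body sees SIZE-1, ..., 1, 0. */
    for (pos = SIZE; pos-- > 0; )
        a[pos] = value;
}
```

When pos reaches 0 the comparison fails and the loop ends. The counter still wraps after that last test, but it is never used as an index.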
Unsigned Integers - Subtraction #1
Here's a nice macro:

    #define DIFF(x,y) (x-y)
    #define PDIFF(x,y) (DIFF(x,y) > 0 ? DIFF(x,y) : 0)

Very nice and simple: returns the difference, if it's positive. Now let's use it:

    unsigned int x=3, y=5;
    printf("%d\n", PDIFF(x,y));

We get -2!!!
- x,y are unsigned, so (x-y) is unsigned.
- (x-y) > 0 is the same as (x-y) != 0.
Unsigned Integers - Subtraction #2
Let's fix the macro above:

    #define PDIFF(x,y) ((int)(DIFF(x,y)) > 0 ? (int)(DIFF(x,y)) : 0)

Now, does it work?

    unsigned int x = 3U*1024*1024*1024 + 3;
    unsigned int y = 3;
    printf("%d\n", PDIFF(x,y));

We get 0!
- x is greater than y, but (x-y), when viewed as a signed integer, is negative.
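Both subtraction problems go away if the operands are compared before subtracting, so the difference is never computed when it would wrap. This is a minimal sketch with a hypothetical function name, pdiff.

```c
/* Positive difference of two unsigned values: x - y if x > y, else 0. */
static unsigned int pdiff(unsigned int x, unsigned int y)
{
    return (x > y) ? x - y : 0;
}
```

pdiff(3, 5) is 0, and for x = 3G+3, y = 3 the result is the full 3G rather than 0 (assuming a 32-bit unsigned int).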
Unsigned Integers - Overflow
Take a look at this function:

    int f(unsigned int x, unsigned int y)
    {
        if (x+y > 1000)
            return TRUE;
        return FALSE;
    }

We expect it to return FALSE only if both x and y are pretty small. Now how about this:

    unsigned int a = 2U * 1024 * 1024 * 1024;
    if (!f(a, a))
        printf("boom!\n");

Obviously, a is very large, so a+a must also be large. However, a+a equals 0! The function will return FALSE.
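The check can be rearranged so that the sum is never formed. The sketch below is one possible rewrite. The function name sum_exceeds and the explicit limit parameter are assumptions made for this illustration.

```c
/* Returns 1 if x + y would exceed `limit`, without ever computing x + y. */
static int sum_exceeds(unsigned int x, unsigned int y, unsigned int limit)
{
    if (x > limit)
        return 1;
    return y > limit - x;   /* limit - x cannot wrap, because x <= limit */
}
```

With both arguments equal to 2G and a limit of 1000, this returns 1 where the original function returns FALSE.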
Signed Integers - Boundaries #1
Boundary checking is important. As in:

    int f(int x)
    {
        static int y[SIZE] = { ... };
        if (x >= SIZE)
            return ERROR;
        return y[x];
    }

But does it check the argument properly? How about f(-1)? The error won't be caught.
Signed Integers - Boundaries #2
Here's a lovely function:

    int dy(int x)
    {
        static int y[SIZE] = { ... };
        if (x<0 || x+1>=SIZE)
            return ERROR;
        return (y[x+1] - y[x]);
    }

This time, we carefully check the arguments, so we won't exceed the array boundaries. Do we?

How about dy(0x7fffffff)? The condition is now:

    if (0x7fffffff < 0 || 0x80000000 >= SIZE) return ERROR;

0x80000000 is negative! It's not greater than SIZE. The function will not catch the error!
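One way to keep the bounds check from overflowing is to move the arithmetic to the constant side of the comparison. The sketch below does that. SIZE, ERROR and the table contents are illustrative values, and dy_checked is a hypothetical name.

```c
#define SIZE  4      /* illustrative */
#define ERROR (-1)   /* illustrative error code */

static int dy_checked(int x)
{
    static const int y[SIZE] = { 1, 4, 9, 16 };

    /* Compare x against SIZE-1 instead of computing x+1, which may overflow. */
    if (x < 0 || x >= SIZE - 1)
        return ERROR;
    return y[x + 1] - y[x];
}
```

Here x+1 is computed only after x is known to be a small non-negative value, so dy_checked(0x7fffffff) returns ERROR.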
Using Constants #1
Consider the following function:

    int big_enough(int size)
    {
        return (size > sizeof(int));
    }

It tells you whether a given size is big enough. Obviously, a negative size is not big enough. Or is it?

big_enough(-1) will return true!
- When comparing, size is converted to u_int.
- So we're comparing 0xffffffff with 4: certainly big enough!

Constants have a type, and can be signed/unsigned:
- Signed constants: e.g. 100, 100L.
- Unsigned constants: e.g. 100U, sizeof(anything).
Using Constants #2
Here's another function:

    int big_enough2(int size)
    {
        return ((size - sizeof(int)) > 100);
    }

It tells you if the size is big enough for something. Again, negative sizes are surely not big enough. And again, big_enough2(-1) will return true!
- When subtracting sizeof(int), the result is unsigned.
- So we're comparing 0xfffffffb with 100: certainly big enough!
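A simple defence in both cases is to reject negative values while the variable is still signed, and only then compare it with the unsigned result of sizeof. This is a minimal sketch. The name big_enough_checked is hypothetical.

```c
#include <stddef.h>

/* Rejects negative sizes before the value is compared with an unsigned one. */
static int big_enough_checked(int size)
{
    if (size < 0)
        return 0;
    return (size_t)size > sizeof(int);
}
```

The explicit cast documents the conversion that the compiler would otherwise perform silently.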
Calculating Average
Here's a simple exercise: write a function to calculate the average of two numbers. Actually two exercises: signed and unsigned.

Solutions:

    unsigned int u_avg(unsigned int x, unsigned int y)
    {
        return (x + y) / 2;
    }

    int s_avg(int x, int y)
    {
        return (x + y) / 2;
    }

Now let's check if it works.

Unsigned:

    unsigned int x = 2U * 1024 * 1024 * 1024;
    printf("%u\n", u_avg(x, x));

We get 0! x+x equals 0, so (x+x)/2 does also.

Signed:

    int x = 0x7fffffff;
    printf("%d\n", s_avg(x, x));

We get -1! x+x equals 0xfffffffe, which is -2. So (x+x)/2 is -1.
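Two possible overflow-free averages are sketched below. The unsigned one halves each operand before adding, and the signed one uses a 64-bit intermediate sum. The function names are hypothetical, and the fixed-width types from stdint.h are an assumption not made on the slides.

```c
#include <stdint.h>

/* Unsigned average without computing x + y: halve first, then fix the carry. */
static uint32_t u_avg_safe(uint32_t x, uint32_t y)
{
    return x / 2 + y / 2 + (x & y & 1);
}

/* Signed average using a 64-bit intermediate sum (rounds toward zero). */
static int32_t s_avg_safe(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x + y) / 2);
}
```

The (x & y & 1) term adds back the 1 that is lost when both operands are odd.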
Percentage #1
How much is 30% out of something? That's easy. Can you program it? Sure. Let's do it nice, clean and modular:

    #define PCT(p) (p / 100)

    int f(int x)
    {
        return PCT(30) * x;
    }

Oops, it never works. 30 / 100 is 0. We always return 0.
Percentage #2
The last percentage function was stupid. Let's write a better one:

    int f(int x)
    {
        return x * 30 / 100;
    }

Now it works. Always?

Company X is worth $143,165,600. I have 30% of the shares. What's my fortune?
- Using the above function, we get $7. Not very exciting.
- Why? x * 30 is more than 4G, so it wraps around. Division can't fix it any more.
Percentage #3
Writing a percentage function can't be that hard. This time, we'll do it right:

    int f(int x)
    {
        return x / 100 * 30;
    }

Let's see how it does with the last example:
- f(143165600) returns 42,949,680. I like this one much better.

But how about something easier?
- How much is 30% of 10? f(10) returns 0!
- 10 / 100 is 0.
Percentage #4
Here's a more general percentage function:

    int p(int x, unsigned int p)
    {
        if (x>1000 || x<-1000 || p>100)
            return OUT_OF_RANGE;
        return x * p / 100;
    }

We don't support large x, so we can't overflow. But how much is 50% of -30? Let's try p(-30, 50):
- We get 42949657. How come?
- x is signed, p is unsigned (makes sense).
- In C, it means x*p is unsigned. We put -1500 in an unsigned integer: it wraps around.
- Division treats it as a large positive number, and returns a smaller positive number.
Percentage #5
Here's a harder question: what percentage is 30 out of 50? Or generally, what percentage is x out of y?

Here are all the simple ways to calculate it:
- (x / y * 100): Returns 0 when x < y (all normal cases).
- (x * 100 / y): Overflows when x is large (what's 50M out of 80M?)
- x / (y / 100): Crashes when y is small (what's 5 out of 8?). Inaccurate when y is not very large (what's 500 of 599?)
- 100 / y * x: Inaccurate when y is less than 100. 0 when y is more than 100.
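Most of the percentage problems above come from a 32-bit intermediate product, or from mixing signed and unsigned operands. One possible approach is to multiply first, in a signed 64-bit type, and divide last. The sketch assumes 32-bit inputs and that the final result fits in 32 bits (for percent_of, pct between 0 and 100 is enough). The names percent_of and percent_ratio are hypothetical.

```c
#include <stdint.h>

/* pct percent of x, computed with a 64-bit signed intermediate product. */
static int32_t percent_of(int32_t x, int32_t pct)
{
    return (int32_t)((int64_t)x * pct / 100);
}

/* What percentage x is of y (y must not be 0); rounds toward zero. */
static int32_t percent_ratio(int32_t x, int32_t y)
{
    return (int32_t)((int64_t)x * 100 / y);
}
```

With these, 30% of 143,165,600 is 42,949,680, 30% of 10 is 3, 50% of -30 is -15, and 30 out of 50 is 60.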
Signed/Unsigned Division
Here's a nice function:

    int cut(int x, unsigned int factor)
    {
        return x / factor;
    }

How much is a half of -6? Try cut(-6, 2):
- We get 2147483645. How come?
- x is signed, factor is unsigned.
- So before dividing, x is converted to unsigned: we get 4G-6.
- After dividing, we get 2G-3. Converting it back to signed keeps it a large positive number.
Bit Fields
Bit fields are very nice: they save memory. Here's a program that uses them:

    struct x {
        int flag:1;
        int count:31;
    };

    const char *flag_set(struct x *s)
    {
        const char *n[] = { "FALSE", "TRUE" };
        return n[s->flag];
    }

What happens if we set flag to 1 and call flag_set?
- It returns an invalid string! flag is signed, so it can hold either 0 or -1. So our program returns n[-1].
- This example is platform dependent. In Solaris cc, bit fields are unsigned (unless explicitly signed), so this program works fine on Solaris (if compiled with cc).
- In gcc (on all platforms), bit fields are signed.
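Declaring the bit fields explicitly unsigned removes the dependence on the compiler's default. This is a minimal sketch. The names struct flags and flag_name are hypothetical.

```c
struct flags {
    unsigned int flag:1;    /* explicitly unsigned: holds 0 or 1 */
    unsigned int count:31;
};

/* Returns "TRUE" or "FALSE" according to the flag bit. */
static const char *flag_name(const struct flags *s)
{
    static const char *const names[] = { "FALSE", "TRUE" };
    return names[s->flag];
}
```

With an unsigned one-bit field, storing 1 reads back as 1 on any conforming compiler, so the array index is always 0 or 1.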
Shifting
Check out this function:

    void printb(int x, char *buf)
    {
        buf[0] = '\0';
        for (; x != 0; x >>= 1)
            strcat(buf, (x & 1) ? "1" : "0");
    }

It creates a string with x in binary (reversed). How about printb(0x80000000, buf)?
- We expect lots of zeros, ending with 1.
- The function will loop infinitely (until it crashes)!
- Shifting right a signed integer duplicates the high-order bit! 0x80000000 >> 1 == 0xc0000000. (8=1000b, c=1100b).
- Shift right is like division: it preserves the sign.
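If the value is shifted as an unsigned integer, zeros are always shifted in from the left and the loop terminates. The sketch below assumes a 32-bit unsigned int and a caller-supplied buffer of at least 33 bytes. The name printb_unsigned is hypothetical.

```c
#include <string.h>

/* Writes the bits of x into buf, least significant bit first.
 * buf must have room for 33 bytes (32 digits plus the terminator). */
static void printb_unsigned(unsigned int x, char *buf)
{
    buf[0] = '\0';
    for (; x != 0; x >>= 1)          /* unsigned shift always brings in 0 */
        strcat(buf, (x & 1) ? "1" : "0");
}
```

For 0x80000000 this produces 31 zeros followed by a single 1, as originally expected.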
Conclusions
Be aware. Remember that:
- Your code may not mean what you think it means.
- The variable's type and valid range are important.
- Intermediate data has a type and valid range. This is especially important with multiplication and division.

Unsigned integers are more dangerous:
- The wrap-around value (0) is closer to the expected values.
- A single unsigned makes the whole expression unsigned.

Test your arithmetic:
- Copy the arithmetic into a simple test program, and test all cases. It is much easier to cover all possible values this way.
- Even code that seems simple and correct may surprise you.

Use types explicitly:
- When types matter, don't let the compiler cast automatically. Cast yourself, to make things clear.
- Use variables for intermediate results, even when not needed. This may remind you of the intermediate values' importance.
Part III: Miscellaneous C Pitfalls

Uri Goren August 2005
Alignment
Consider the following function:

    char buf[SIZE];

    void write_num(int off, int num)
    {
        int *p = (int *)&buf[off];
        *p = num;
    }

It writes a number at a given offset within a buffer. What if the offset isn't a multiple of 4?
- Intel-based platforms will work a bit slower.
- Sun Sparc (Solaris): crash!

So pay attention to alignment.
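A portable way to store an int at an arbitrary byte offset is to copy its bytes with memcpy, which has no alignment requirement. This is a minimal sketch. SIZE is an illustrative value, read_num is a hypothetical companion function, and bounds checking is left to the caller.

```c
#include <string.h>

#define SIZE 64   /* illustrative buffer size */

static char buf[SIZE];

/* Stores num at any byte offset; memcpy has no alignment requirement.
 * The caller must ensure 0 <= off <= SIZE - sizeof(int). */
static void write_num(int off, int num)
{
    memcpy(&buf[off], &num, sizeof num);
}

/* Reads back an int stored at the given byte offset. */
static int read_num(int off)
{
    int num;

    memcpy(&num, &buf[off], sizeof num);
    return num;
}
```

Compilers typically turn such fixed-size memcpy calls into plain loads and stores where the platform allows unaligned access.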

Operator Precedence
We all know the precedence of some operators:
- Multiplication and division before addition and subtraction. a * b + c is the same as (a * b) + c.
- Assignment after almost everything: a = x + y is the same as a = (x + y). Not (a = x) + y!

But do we always know the precedence?

    a + b << 2
    a^b&c
    a>3?c=d:x=y

You can find the full precedence table easily. Don't do it! When you're not 100% sure, use parentheses.
The Importance of Prototypes

Prototypes are great, but optional.
- They allow the compiler to catch more errors.
- Omitting them just causes a warning.
- The code works fine without them. Most of the time.

Look at this case:

    /* char *get_name(int id);  No prototype! */
    printf("%s\n", get_name(MY_ID));

Will this work?
- On 64-bit platforms, the returned value will be assumed to be an int. The higher 32 bits will be ignored.
- If the string is located above 4GB, it will crash.

Sometimes we get away with it.
- In the Solaris 64-bit kernel, all global and static variables are located below 4GB.
- The problem is when returning a pointer to dynamic memory.
Arrays with Offset -1

Normally, we can access only non-negative array offsets. But look at this trick:

    int _a[SIZE+1];
    int *a = &_a[1];

Now we can access a[-1] to a[SIZE-1]. But it will fail, under two conditions:
- The index is an unsigned variable.
- The index is of a type smaller than a pointer: u_char or u_short on 32 bits. On 64 bits, also u_int.

So this trick should be done carefully (or not at all).
Part IV: Examples from Our Code
Array With a Negative Index

Here's an array, defined in fwdrv.c:

    static struct fwiftab _fwiftab[MAXIFP+1];
    struct fwiftab *fwiftab = &_fwiftab[1];

This should allow access to fwiftab[-1]. But what if the index is unsigned?
- It will crash on 64-bit platforms.
- It will crash if the index is u_char or u_short.

In practice:
- It's always called with a signed int.
- -1 is possible only on Nokia, which isn't 64-bit.
- We're lucky.
Macro Affecting Flow Control

Here's a very useful kernel macro:

    #define FW_ASSERT(caller, cond, msg) {                  \
        if (fw_assert_on && !(cond)) {                      \
            kdprintf("FW-1: %s: %s (%s:%d)\n",              \
                     caller, msg, __FILE__, __LINE__);      \
            fw_panic(msg);                                  \
        }                                                   \
    }

What happens when used in an if?

    if (x > 0)
        FW_ASSERT(rname, y > 0, "too small");
    else
        printf("OK\n");

This will not compile! The semicolon after FW_ASSERT will break the if statement. So use FW_ASSERT carefully.
Macro Parameter Names

Look what I've found in fwlddist.c:

    #define FWSYNC_FCU_SET_TMOUT_TTL(timeout, ttl) do { \
        info.timeout = &(timeout);                      \
        info.ttl = &(ttl);                              \
    } while (0)

The macro was written carefully:
- do {} while(0) is used: works fine with if-else.
- Parentheses around all parameters.

One real bug:
- timeout and ttl are both parameters and structure members.
- FWSYNC_FCU_SET_TMOUT_TTL(3, 4) won't compile.

In practice:
- It's called many times, always with variables named timeout and ttl.
- This is the only case where the macro can work.
Possible Overflow
Here's a piece of code from fwatom.c:

    u_int fw_hmem_size_new, fw_hmem_maxsize_new;
    ...
    if (fw_hmem_size_new * 2 > fw_hmem_maxsize_new)
        fw_hmem_size_new = fw_hmem_maxsize_new / 2;

It makes sure that the new size doesn't exceed half the new limit. Both sizes are in bytes.

But what if the size is 2GB or more?
- fw_hmem_size_new * 2 will wrap around.
- The size won't be decreased.

In practice:
- The size can't be more than 2GB minus something. This is because we currently can't use more than 2GB.
- The bug is just around the corner.
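The wrap-around can be avoided by dividing the limit instead of multiplying the size. The sketch below shows that rearrangement on plain unsigned values. The function name clamp_to_half is hypothetical, and this is not the actual fix applied in fwatom.c.

```c
/* Clamps size to at most half of maxsize, without multiplying size by 2. */
static unsigned int clamp_to_half(unsigned int size, unsigned int maxsize)
{
    if (size > maxsize / 2)
        size = maxsize / 2;
    return size;
}
```

For unsigned integers, size > maxsize / 2 selects exactly the same values as size * 2 > maxsize whenever the product does not wrap, and it stays correct when it would.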
Wrong Parameter Checking

A function from fwdrv.c:

    char *
    fw_func_getname(int func_id)
    {
        if (func_id < fwfuncs.nfunc)
            return fwfuncs.funcdesc[func_id].funcname;
        return NULL;
    }

What if func_id is negative? It will return a bad pointer.

In practice:
- func_id isn't negative, unless there's another bug.
- The string is used only if debug is enabled.
