
C Programming :: Introduction

C is a simple programming language with few keywords and a relatively simple to
understand syntax.
C is also useless. C itself has no input/output commands, no support for strings as
a fundamental (atomic) data type, and no useful math functions built in.
Because C is useless by itself, it requires the use of libraries. This increases the
complexity of C. The issue of standard libraries is resolved through the use of ANSI
libraries and other methods.
C Programming :: Hello, World
Let's give a go at a very simple program that prints out "Hello World" to
standard out (usually your monitor). We'll call our little program hello.c.
#include <stdio.h>

main() {
printf("Hello, world!\n");
return 0;
}

What's all this junk just to print out Hello, World? Let's see what's
happening:
#include <stdio.h> - Tells the compiler to include this header file for compilation.
o What is a header file? They contain prototypes and other compiler/pre-
processor directives. Prototypes are basic abstract function definitions. More
on these later...
o Some common header files are stdio.h, stdlib.h, unistd.h, math.h.
main() - This is a function, in particular the main block.
{ } - These curly braces are equivalent to stating "block begin" and "block end".
These can be used at many places, such as if and switch.
printf() - Ah... the actual print statement. Thankfully we have the header file stdio.h!
But what does it do? How is it defined?
return 0 - What's this? Who knows!
Seems like trying to figure all this out is just way too confusing. Let's break
things up one at a time:
The return 0 statement. Seems like we are trying to give something back, and it is
an integer. Maybe if we modified our main function definition: int main() Ok, now
we are saying that our main function will be returning an integer! So remember, you
should always explicitly declare the return type on the function!
Something is still a little fishy... I remember that 0 implied false... so isn't it
returning an int that signifies a bad result? By convention, returning 0 from main
reports success, but that is not obvious to a reader. Thankfully there is a simple
solution to this. Let's add #include <stdlib.h> to our includes. Let's change our
return statement to return EXIT_SUCCESS;. Now it makes sense!
Let's take a look at printf. Hmm... I wonder what the prototype for printf is. Utilizing
the man pages we see that printf is: int printf(const char *format,
...); printf returns an int. The man pages say that printf returns the number of
characters printed. Now you wonder, who cares? Why should you care about this? It
is good programming practice to ALWAYS check return values. It will not only
make your program more readable, but in the end it will make your programs less
error prone. But in this particular case, we don't really need it. So we cast the
function's return to (void). fprintf and fflush are the only other functions where you
should do this (exit is declared void, so it has no return value to discard). More on
this later when we get to I/O. For now, let's just void the return value.
What about documentation? We should probably doc some of our code so that
other people can understand what we are doing. Comments in the C89 standard are
noted by: /* */. The comment begins with /* and ends with */.
Let's see our new improved code!
#include <stdio.h>
#include <stdlib.h>

/* Main Function
* Purpose: Controls program, prints Hello, World!
* Input: None
* Output: Returns Exit Status
*/
int main(int argc, char **argv) {
(void)printf("Hello, world!\n");
return EXIT_SUCCESS;
}
Much better! The KEY POINT of this whole introduction is to show you the
fundamental difference between correctness and understandability. Both
code samples produce the exact same output, "Hello, world!". However,
only the latter is readable enough to be understandable. All code will have
bugs. If you sacrifice readability with reduced (or no) comments and cryptic
lines, the burden is shifted and magnified when your code needs to be maintained.

Document what you can. Complex data types, function calls that may not be
obvious, etc. Good documentation goes a long way!

In the introduction, we discussed very simple C, now it is time for us to
move ahead and explore the basics of C programming. If you do not
understand the concepts explained in the Introduction, do not proceed. Make
sure you understand completely the topics covered in the introduction before
you dive into C.
Operations :: Relational Operators
You probably are familiar with < and the > relational operators from mathematics.
The same principles apply in C when you are comparing two objects.
There are six possibilities in C: <, <=, >, >=, !=, and ==. The first four are
self-explanatory, the != stands for "not equal to" and == is "equal to".
Here we can point out the difference between syntax and semantics. a = b is
different from a == b. Most C compilers will allow both statements to be used in
conditionals like if, but they have two completely different meanings. Make sure your
assignment operators are where you want them to be and your relationals where you
want relational comparisons!
Operations :: Logical Operators
Logical operators simulate boolean algebra in C.
The logical operators are &&, || and !. The related operators &, | and ^ are bitwise:
they combine the individual bits of their operands rather than truth values.
For example, && is used to combine two tests with AND: x != 0 && y != 0
Expressions involving && and || undergo Short-Circuit Evaluation. Take the
above example into consideration. If x != 0 evaluates to false, the whole statement
is false regardless of the outcome of y != 0, so y != 0 is never evaluated. The
bitwise operators always evaluate both operands. This can be a good thing or a bad
thing depending on the context. (See Weiss pg. 51-52).
Operations :: Arithmetic Operators
There are other operators available. Two arithmetic operators that are used
frequently are ++ and --. You can place these in front of or behind a
variable. ++ specifies increment, -- specifies decrement. If the operator is
placed in front, it is prefix; if it is placed behind, it is postfix. Prefix means
the variable is incremented before its value is used in the surrounding expression;
postfix means the old value is used and the increment happens afterwards.
These are important considerations when using these operators.
C allows *= += /= -= operators. For example:
int i = 5;

i *= 5;
The int i would have the value 25 after the operation.
For a full listing of operators, reference Weiss pg. 51.
Basic C :: Conditionals
if used with the above relational and logical operators allows for conditional
statements. You can start blocks of code using { and }. if's can be coupled
with else keyword to handle alternative outcomes.
The ? : operator can be a shorthand method for signifying (if expression) ?
(evaluate if true) : (else evaluate this). For example, you can use this in a return
statement or a printf statement for conciseness. Beware! This reduces the readability
of the program... see Introduction. This does not in any way speed up execution
time.
The switch statement allows for quick if-else checking. For example, if you wanted
to determine what the char x was and have different outcomes for certain values of
x, you could simply switch x and run cases. Some sample code:
switch ( x ) {
    case 'a': /* Do stuff when x is 'a' */
        break;
    case 'b':
    case 'c':
    case 'd': /* Fallthrough technique... cases b,c,d all use this code */
        break;
    default:  /* Handle cases when x is not a,b,c or d. ALWAYS have a default case!!! */
        break;
}
Basic C :: Looping
You can loop (jumping for those assembly junkies) through your code by using
special loop keywords.
These include while, for, and do while.
The while loops until the expression specified is false. For example while (x <
4) will loop while x is less than 4.
The syntax for for is different. Here's an example: for (i = 0; i < n; i++, z++).
This code will loop until i is equal to n. The first expression specifies initializing
conditions, the second expression is like the while expression: continue the for loop
until this expression no longer evaluates to true. The third expression allows for
adjustment of loop control variables or other variables. These expressions can be
empty, e.g. for (; i < n; i++) does not specify initializing code.
do while is like a "repeat-until" in Pascal. This is useful for loops that must be
executed at least once. Some sample code would be:
do {
/* do stuff */
} while (statement);
Basic C :: Types, Type Qualifiers, Storage Classes
int, char, float, double are the fundamental data types in C.
Type modifiers include: short, long, unsigned, signed. Not all combinations of
types and modifiers are available.
Type qualifiers include the keywords: const and volatile. The const qualifier
tells the compiler that the variable must not be modified through that name once it
has been initialized. A compiler may place such a variable in a read-only data area
of memory, but this is not guaranteed (technically the variable can still be
modified). volatile is used less frequently and tells the compiler that this value
can be modified outside the control of the program.
Storage classes include: auto, extern, register, static.
The auto keyword places the specified variable into the stack area of memory. This
is usually implicit in most variable declarations, e.g. int i;
The extern keyword makes the specified variable access the variable of the same
name from some other file. This is very useful for sharing variables in modular
programs.
The register keyword suggests to the compiler to place the particular variable in
the fast register memory located directly on the CPU. Most compilers these days (like
gcc) are so smart that suggesting registers could actually make your program
slower.
The static keyword is useful for extending the lifetime of a particular variable. If
you declare a static variable inside a function, the variable remains even after the
function call is long gone (the variable is placed in the alterable area of memory).
The static keyword is overloaded. It is also used to declare variables to be private
to a certain file only when declared with global variables. static can also be used
with functions, making those functions visible only to the file itself.
A string is NOT a type directly supported by C. You, therefore, cannot "assign" stuff
into strings. A string is defined by ANSI as an array (or collection) of characters. We
will go more in-depth with strings later...
Basic C, Operations, Types Review
Relational and Logical operators are used to compare expressions.
Conditional statements allow for conditional execution of expressions using if,
else, switch.
Loops allow you to repeatedly do things until a stopping point is reached.
Types, Storage Classes, and Type Qualifiers are used to modify the particular type's
scope and lifetime.
Now that we have an understanding of the very basics of C, it is time now to
turn our focus over to making our programs not only run correctly but also
more efficient and more understandable.
Functions :: The Basics
Why should we make functions in our programs when we can just do it all under
main? Weiss (pg. 77) has a very good analogy that I'll borrow :) Think for a minute
about high-end stereo systems. These stereo systems do not come in an all-in-one
package, but rather come in separate components: pre-amplifier, amplifier,
equalizer, receiver, cd player, tape deck, and speakers. The same concept applies to
programming. Your programs become modularized and much more readable if they
are broken down into components.
This type of programming is known as top-down programming, because we first
analyze what needs to be broken down into components. Functions allow us to
create top-down modular programs.
Each function consists of a name, a return type, and a possible parameter list. This
abstract definition of a function is known as its interface. Here are some sample
function interfaces:
char *strdup(char *s)
int add_two_ints(int x, int y)
void useless(void)
The first function header takes in a pointer to a string and outputs a char pointer.
The second header takes in two integers and returns an int. The last header doesn't
return anything nor take in parameters.
Some programmers like to separate return types from their function names to facilitate
easier readability and searchability. This is just a matter of taste. For example:
int
add_two_ints(int x, int y)
A function can return a single value to its caller in a statement using the
keyword return. The return value must be the same type as the return type
specified in the function's interface.
Functions :: Prototypes
In the introduction, we touched on function prototypes. To recap, what are function
prototypes? Function prototypes are abstract function interfaces. These function
declarations have no bodies; they just have their interfaces.
Function prototypes are usually declared at the top of a C source file, or in a
separate header file (see Appendix: Creating Libraries).
For example, if you wanted to grab command line parameters for your program, you
would most likely use the function getopt. But since this function is not part of ANSI
C, you must declare the function prototype, or you will get implicit declaration
warnings when compiling with our flags. So you can simply prototype getopt(3) from
the man pages:
/* This section of our program is for Function
Prototypes */
int getopt(int argc, char * const argv[], const char
*optstring);
extern char *optarg;
extern int optind, opterr, optopt;
So if we declared this function prototype in our program, we would be telling the
compiler explicitly what getopt returns and its parameter list. What are
those extern variables? Recall that extern creates a reference to variables across
files, or in other words, it creates file global scope for those variables in that
particular C source file. That way we can access these variables that getopt modifies
directly. More on getopt in the next section about Input/Output.
Functions :: Functions as Parameters
This is a little more advanced section on functions, but is very useful. Take this for
example:
int applyeqn(int F(int), int x, int max, int min) {
    int itmp;

    itmp = F(x) + min;
    itmp = itmp - max;
    return itmp;
}
What does this function do if we call it with applyeqn(square, x, y, z);? What
happens is that the int F(int) is a reference to the function that is passed in as a
parameter. Note that we pass the function's name, square, and not the result of a
call such as square(x). Thus inside applyeqn where there is a call to F, it actually
is a call to square! This is very useful if we have one set function, but wish to
vary the input according to a particular function. So if we had a different function
called cube we could change how we call applyeqn by calling the function by
applyeqn(cube, x, y, z);.
Functions :: The Problem
So now you must be thinking... Wow! Functions are great! I can do anything with
functions! WRONG. There are four major ways that parameters are passed into
functions. The two that we should be concerned with are Pass by Value and Pass
by Reference. In C, all parameters are passed by value.
So you're saying so what? It makes a big difference. In simplistic terms, functions in
C create copies of the passed in variables. These variables remain on the stack for
the lifetime of the function and then are discarded, so they do not affect the
inputs! This is important. Let's repeat it again. Passed in arguments will remain
unchanged. Let's use this swapping function as an example:
void swap(int x, int y) {
int tmp = 0;

tmp = x;
x = y;
y = tmp;
}
If you were to simply pass in parameters to this swapping function that swaps two
integers, this would fail horribly. You'll just get the same values back.
But thankfully, you can circumvent this pass by value limitation in C by simulating
pass by reference. Pass by reference changes the values that are passed in when the
function exits. This isn't how C works technically but can be thought of in the same
fashion. So how do you avoid pass by value side effects? By using pointers and in
some cases using macros. We will discuss pointers in detail later.
The C Preprocessor :: Overview
The C Preprocessor is not part of the compiler, but is a separate step in the
compilation process. In simplistic terms, a C Preprocessor is just a text substitution
tool. We'll refer to the C Preprocessor as the CPP.
All preprocessor lines begin with #. This listing is from Weiss pg. 104. The
unconditional directives are:
o #include - Inserts a particular header from another file
o #define - Defines a preprocessor macro
o #undef - Undefines a preprocessor macro
The conditional directives are:
o #ifdef - If this macro is defined
o #ifndef - If this macro is not defined
o #if - Test if a compile time condition is true
o #else - The alternative for #if
o #elif - #else and #if in one statement
o #endif - End preprocessor conditional
Other directives include:
o # - Stringization, replaces a macro parameter with a string constant
o ## - Token merge, creates a single token from two adjacent ones
Some examples of the above:
#define MAX_ARRAY_LENGTH 20
Tells the CPP to replace instances of MAX_ARRAY_LENGTH with 20. Use #define for
constants to increase readability. Notice the absence of the ;.
#include <stdio.h>
#include "mystring.h"
Tells the CPP to get stdio.h from System Libraries and add the text to this file. The
next line tells CPP to get mystring.h from the local directory and add the text to the
file. This is a difference you must take note of.
#undef MEANING_OF_LIFE
#define MEANING_OF_LIFE 42
Tells the CPP to undefine MEANING_OF_LIFE and define it for 42.
#ifndef IROCK
#define IROCK "You wish!"
#endif
Tells the CPP to define IROCK only if IROCK isn't defined already.
#ifdef DEBUG
/* Your debugging statements here */
#endif
Tells the CPP to do the following statements if DEBUG is defined. This is useful if you
pass the -DDEBUG flag to gcc. This will define DEBUG, so you can turn debugging on
and off on the fly!
The C Preprocessor :: Parameterized Macros
One of the powerful functions of the CPP is the ability to simulate functions using
parameterized macros. For example, we might have some code to square a number:
int square(int x) {
return x * x;
}
We can instead rewrite this using a macro:
#define square(x) ((x) * (x))
A few things you should notice. First, in square(x) the left parenthesis must
"cuddle" with the macro identifier, with no space between them. The next thing that
should catch your eye are the parentheses surrounding the x's. These are
necessary... what if we used this macro as square(1 + 1)? Imagine if the macro
didn't have those parentheses? It would become ( 1 + 1 * 1 + 1 ). Instead of our
desired result of 4, we would get 3. Thus the added parentheses will make the
expression ( (1 + 1) * (1 + 1) ). This is a fundamental difference between macros
and functions. You don't have to worry about this with functions, but you must
consider this when using macros.
Remember that pass by value vs. pass by reference issue earlier? I said that you could
get around this by using a macro. Here is swap in action when using a macro:
#define swap(x, y) { int tmp = x; x = y; y = tmp; }
Now we have swapping code that works. Why does this work? It's because the CPP
just simply replaces text. Wherever swap is called, the CPP will replace the macro
call with the defined text. We'll go into how we can do this with pointers later.
Input/Output and File I/O
With most of the basics of C under our belts, let's focus now on grabbing
Input and directing Output. This is essential for many programs that might
require command line parameters, or standard input.
I/O :: printf(3)
printf(3) is one of the most frequently used functions in C for output.
The prototype for printf(3) is:
int printf(const char *format, ...);
printf takes in a formatting string and the actual variables to print. An example of
printf is:
int x = 5;
char str[] = "abc";
char c = 'z';
float pi = 3.14;

printf("\t%d %s %f %s %c\n", x, str, pi, "WOW", c);
The output of the above would be:
        5 abc 3.140000 WOW z
Let's see what's happening. The \t signifies an escape sequence, specifically, a
tab. Then the %d is a conversion specification that is filled in by the variable x.
The %s matches with the string and the %f matches with the float. The default
precision for %f is 6 places after the decimal point. %f works for both floats and
doubles. For long doubles, use %Lf. The second %s matches with the "WOW", and
the %c tells printf to output the char c. The \n signifies a newline.
For a listing of escape sequences, see Weiss pg. 183.
You can format the output through the formatting line.. By modifying the conversion
specification, you can change how the particular variable is placed in output. For
example:
printf("%10.4d", x);
Would print this:
      0005
The . allows for precision. This can be applied to floats as well. The number 10 is
the minimum field width: 0005 is right-justified in a field 10 characters wide, so six
spaces come first and the 5 lands in the tenth column. You can also add + right
after % to make the number explicitly output as +0005. A - right after % does
something different: it left-justifies the output within the field. Note that
neither flag changes the value of x. In other words, using %-10.4d will not
output -0005.
%e is useful for outputting floats and doubles using scientific notation. Use %Le for
long doubles.
I/O :: scanf(3)
scanf(3) is useful for grabbing things from input. Beware though, scanf isn't the
greatest function that C has to offer. Some people brush off scanf as a broken
function that shouldn't be used often.
The prototype for scanf is:
int scanf( const char *format, ...);
Looks similar to printf, but doesn't completely behave like printf does. Take for
example:
scanf("%d", x);
You'd expect scanf to read in an int into x. But scanf requires that you specify the
address to where the int should be stored. Thus you specify the address-of operator
(more on this when we get to pointers). Therefore,
scanf("%d", &x);
will put an int into x correctly.
Simple enough, eh? Think again. scanf's major "flaw" is its inability to digest
incorrect input. If scanf is expecting an int and your standard in keeps giving it a
string, scanf will keep trying at the same location. If you looped scanf, this would
create an infinite loop. Take this example code:
int x, args;

for ( ; ; ) {
printf("Enter an integer bub: ");
if (( args = scanf("%d", &x)) == 0) {
printf("Error: not an integer\n");
continue;
} else {
if (args == 1)
printf("Read in %d\n", x);
else
break;
}
}
The code above will fail. Why? It's because scanf isn't discarding bad input. So
instead of using just continue;, we have to add a line before it to digest input. We
can use a function called digestline().
void digestline(void) {
    scanf("%*[^\n]");   /* Skip to the End of the Line */
    scanf("%*1[\n]");   /* Skip One Newline */
}
This function is taken from Weiss pg. 341. Using assignment suppression, we can
use * to suppress anything contained in the set [^\n]. This skips all characters until
the newline. The next scanf allows one newline character read. Thus we can digest
bad input!
The section on scanf by Weiss is excellent. Section 12.2, pgs. 336-342.
File I/O :: fgets(3)
One of the alternatives to scanf/fscanf is fgets. The prototype is:
char *fgets(char *s, int size, FILE *stream);
fgets reads in at most size - 1 characters from the stream and stores them in the
buffer that s points to. The string is automatically null-terminated.
fgets stops reading in characters if it reaches an EOF or a newline. If a newline is
read, it is stored in the buffer too.
Now that you've read characters of interest from a stream, what do you do with the
string? Simple! Use sscanf, see below.
File I/O :: sscanf(3)
To scan a string for a format, the sscanf library call is handy. Its prototype:
int sscanf(const char *str, const char *format, ...);
sscanf works much like fscanf except it takes a character pointer instead of a file
pointer.
Using the combination of fgets/sscanf instead of scanf/fscanf you can avoid the
"digestion" problem (or bug, depending on who you talk to :) ).
File I/O :: fprintf(3)
It is sometimes useful also to output to different streams. fprintf(3) allows us to do
exactly that.
The prototype for fprintf is:
int fprintf(FILE *stream, const char *format, ...);
fprintf takes in a special pointer called a file pointer, signified by FILE *. It then
accepts a formatting string and arguments. The only difference between fprintf and
printf is that fprintf can redirect output to a particular stream. These streams can
be stdout, stderr, or a file pointer. More on file pointers when we get to fopen.
An example:
fprintf(stderr, "ERROR: Cannot malloc enough memory.\n");
This outputs the error message to standard error.
File I/O :: fscanf(3)
fscanf(3) is basically a streams version of scanf. The prototype for fscanf is:
int fscanf( FILE *stream, const char *format, ...);
File I/O :: fflush(3)
Sometimes it is necessary to forcefully flush a buffer to its stream. If a program
crashes, sometimes the stream isn't written. You can do this by using the fflush(3)
function. The prototype for fflush is:
int fflush(FILE *stream);
Not very difficult to use, specify the stream to fflush.
File I/O :: fopen(3), fclose(3), and File Pointers
fopen(3) is used to open streams. This is most often used with opening files for
input. fopen's prototype is:
FILE *fopen (const char *path, const char *mode);
fopen returns a file pointer and takes in the path to the file as well as the mode to
open the file with. Take for example:
FILE *Fp;

Fp = fopen("/home/johndoe/input.dat", "r");
This will open the file in /home/johndoe/input.dat for reading. Now you can use
fscanf commands with Fp. For example:
fscanf(Fp, "%d", &x);
This would read in an integer from the input.dat file. If we opened input.dat with the
mode "w", then we could write to it using fprintf:
fprintf(Fp, "%s\n", "File Streams are cool!");
To close the stream, you would use fclose(3). The prototype for fclose is:
int fclose( FILE *stream );
You would just give fclose the stream to close the stream. Remember to do this for
all file streams, especially when writing to files!
I/O and File I/O :: Return Values
We have been looking at I/O functions without any regard to their return values. This
is bad. Very, very bad. So to make our lives easier and to make our programs
behave well, let's write some macros!
Let's write some wrapper macros for these functions. First let's create a meta-
wrapper function for all of our printf type functions:
#define ERR_MSG( fn ) { (void)fflush(stderr); \
    (void)fprintf(stderr, __FILE__ ":%d:" #fn ": %s\n", \
                  __LINE__, strerror(errno)); }
#define METAPRINTF( fn, args, exp ) if( fn args exp ) ERR_MSG( fn )
This will create an ERR_MSG macro to handle error messages. The METAPRINTF is
the meta-wrapper for our printf type functions. So let's define our printf type
macros:
#define PRINTF(args) METAPRINTF(printf, args, < 0)
#define FPRINTF(args) METAPRINTF(fprintf, args, < 0)
#define SCANF(args) METAPRINTF(scanf, args, < 0)
#define FSCANF(args) METAPRINTF(fscanf, args, < 0)
#define FFLUSH(args) METAPRINTF(fflush, args, < 0)
Now we have our wrapper functions. Because args is sent to METAPRINTF, we need
two sets of parentheses when we use the PRINTF macro. Examples on using the
wrapper function:
PRINTF(("This is so cool!"));
FPRINTF((stderr, "Error bub!"));
Now you can put this code into a common header file and be able to use these
convenient macros and still be able to check for return values! (Make sure you have
included the string.h and errno.h headers, which declare strerror and errno.)
Note: We did not write macros for fopen and fclose. You must manually check for
return values on those functions.
Other I/O Functions
There are many other Input/Output functions, such as fputs, getchar, putchar,
ungetc. Refer to the man pages on these functions or in the Weiss text.
Command Line Arguments and Parameters :: getopt(3)
I'm sure you've run the ls -l command before. ls -l *.c would display all c files
with extended information. These parameters and arguments can be handled by your
c program through getopt(3).
We have already seen getopt, but now let's actually make some code that makes this
function useful. Let's see the prototype again:
int getopt(int argc, char * const argv[], const char
*optstring);
extern char *optarg;
extern int optind, opterr, optopt;
In order for us to utilize argc and argv, we must allow these as parameters on our
main() function:
int main(int argc, char **argv)
Now that we have everything set up, let's get this show on the road. Here's an
example of using getopt:
int ich;

while ((ich = getopt (argc, argv, "ab:c")) != -1) {
    switch (ich) {
    case 'a': /* Flags/Code when -a is specified */
        break;
    case 'b': /* Flags/Code when -b is specified */
        /* The argument passed in with b is specified by optarg */
        break;
    case 'c': /* Flags/Code when -c is specified */
        break;
    default: /* Code when an unrecognized option is given */
        break;
    }
}

if (optind < argc) {
    printf ("non-option ARGV-elements: ");
    while (optind < argc)
        printf ("%s ", argv[optind++]);
    printf ("\n");
}
This code might be a bit confusing if taken in all at once.
o The first step is to get getopt to pass an int into ich. The options allowed are
specified by the "ab:c". The colon following b allows b to have arguments,
e.g. -b gradient. Thus, optarg will contain the string "gradient".
o ich is then switched to check for the parameters.
o The ending conditional if (optind < argc) { checks for arguments passed
in without an accompanying parameter. optind is the current index in the list
of arguments passed in through the argv-list. argc is the total number of
arguments passed in.
o So if we had a program called "junk" and we called it from the command
prompt as ./junk -b gradient yeehaw the variables would look like:
  Variable                        Contains
  ------------------------------  ----------
  argc                            4
  argv[0]                         "./junk"
  argv[1]                         "-b"
  argv[2]                         "gradient"
  argv[3]                         "yeehaw"
  optarg at case 'b'              "gradient"
  optind after while getopt loop  3
Input/Output and File I/O Review
printf and scanf can be used for Input and Output, while the "f versions" of these can
be used with any stream. Make sure you check the return values!
You can grab command line arguments and parameters through getopt.

Functions and C Preprocessor Review
Functions allow for modular programming. You must remember that all parameters
passed into function in C are passed by value!
The C Preprocessor allows for macro definitions and other pre-compilation directives.
It is just a text substitution tool that runs before the actual compilation begins.
Pointers :: Definition
Pointers provide an indirect method of accessing variables. The reason why some
people have difficulty understanding the concept of a pointer is that they are usually
introduced without some sort of analogy or easily understood example.
For our simple to understand example, let's think about a typical textbook. It will
usually have a table of contents, some chapters, and an index. Suppose we have a
Chemistry textbook and would like to find more information on the noble gases.
What one would typically do instead of flipping through the entire text, is to consult
the index in the back. The index would direct us to the page(s) on which we can read
more on noble gases. Conceptually, this is how pointers work!
A pointer is simply a reference containing a memory address. In our example, the
noble gas entry in the index would list page numbers for more information. This is
analogous to a pointer reference containing the memory address of where the real
data is actually stored!
You may be wondering, what is the point of this (no pun intended)? Why don't I just
make all variables without the use of pointers? It's because sometimes you can't.
What if you needed an array of ints, but didn't know the size of the array
beforehand? What if you needed a string, but it grew dynamically as the program ran?
What if you need variables that are persistent through function use without declaring
them global (remember the swap function)? They are all solved through the use of
pointers. Pointers are also essential in creating larger custom data structures, such
as linked lists.
So now that you understand how pointers work, let's define them a little better.
o A pointer when declared is just a reference. DECLARING A POINTER DOES
NOT CREATE ANY SPACE FOR THE POINTER TO POINT TO. We will
tackle this dynamic memory allocation issue later.
o As stated prior, a pointer is a reference to an area of memory. This is known
as a memory address. A pointer may point to dynamically allocated memory
or a variable declared within a block.
o Since a pointer contains memory addresses, the size of a pointer typically
corresponds to the word size of your computer. You can think of a "word" as
how much data your computer can access at once. Typical machines today
are 32- or 64-bit machines. 8-bits per byte equates to 4- or 8-byte pointer
sizes. More on this later.
Pointers :: Declaration and Syntax
Pointers are declared by using the * in front of the variable identifier. For example:
int *ip;
float *fp = NULL;
This declares a pointer, ip, to an integer. Let's say we want ip to point to an integer.
The second line declares a pointer to a float, but initializes the pointer to point to
the NULL pointer. The NULL pointer points to a place in memory that cannot be
accessed. NULL is useful when checking for error conditions and many functions
return NULL if they fail.
int x = 5;
int *ip;

ip = &x;
We first encountered the & operator in the I/O section. Here the & operator
specifies the address-of x. Thus, the pointer ip points to x because it was assigned the
address of x. This is important. You must understand this concept.
This brings up the question, if pointers contain addresses, then how do I get the
actual value of what the pointer is pointing to? This is solved through the * operator.
The * dereferences the pointer to the value. So,
printf("%d %d\n", x, *ip);
would print 5 5 to the screen.
There is a critical difference between a dereference and a pointer declaration:
int x = 0, y = 5, *ip = &y;

x = *ip;
The statement int *ip = &y; is different than x = *ip;. The first statement does
not dereference, the * signifies to create a pointer to an int. The second statement
uses a dereference.
Remember the swap function? We can now simulate call by reference using
pointers. Here is a modified version of the swap function using pointers:
#include <stdlib.h> /* for EXIT_SUCCESS */

void swap(int *x, int *y) {
    int tmp;

    tmp = *x;
    *x = *y;
    *y = tmp;
}

int main() {
    int a = 2, b = 3;

    swap(&a, &b);
    return EXIT_SUCCESS;
}
This snip of swapping code works. When you call swap, you must give the address-of
a and b, because swap is expecting a pointer.
Why does this work? It's because you are giving the address-of the variables. This
memory does not "go away" or get "popped off" after the function swap ends. The
changes within swap change the values located in those memory addresses.
Pointers :: Pointers and const Type Qualifier
The const type qualifier can make things a little confusing when it is used with
pointer declarations.
The below example is from Weiss pg. 132:
const int * const ip; /* The pointer ip is const and what it points at is const */
int * const ip;       /* The pointer ip is const */
const int * ip;       /* What ip points at is const */
int * ip;             /* Nothing is const */
As you can see, you must be careful when specifying the const qualifier when using
pointers.
Pointers :: void Pointers
A void pointer can be assigned any pointer value. It is sometimes necessary to
store/copy/move pointers without regard to the type they reference.
You cannot dereference a void pointer without first converting it to another pointer type.
Functions such as malloc, free, and memset utilize void pointers.
Pointers :: Pointers to Functions
Earlier, we said that you can pass functions as parameters into functions. This was
essentially a reference, or pointer, passed into the function.
There is an alternative way to declare and pass in functions as parameters into
functions. It is discussed in detail in Weiss, pgs. 135-136.
Pointers :: Pointer Arithmetic
C is one of the few languages that allows pointer arithmetic. In other words, you
actually move the pointer reference by an arithmetic operation. For example:
int x = 5, *ip = &x;

ip++;
On a typical 32-bit machine, *ip would be 5 after initialization.
But ip++; increments the pointer by sizeof(int), which is 32 bits or 4 bytes. So ip would
now point at whatever is in the 4 bytes after x. Dereferencing ip at that point is an
error, because that memory is not part of x.
Pointer arithmetic is very useful when dealing with arrays, because arrays and
pointers share a special relationship in C. More on this when we get to arrays!
Pointers Review
Pointers are an indirect reference to something else. They are primarily used to
reference items that might dynamically change size at run time.
Pointers have special operators, & and *. The & operator gives the address-of a
variable. The * dereferences the pointer (when not used in a pointer declaration
statement).
You must be careful when using const type qualifier. You have to also be cautious
about the void pointer.
C allows pointer arithmetic, which gives the programmer the freedom to move the
pointer using simple arithmetic. This is very powerful, yet can lead to disaster if not
used properly.


Arrays
You must understand the concepts discussed in the previous pointers section before
proceeding.
Arrays are a collection of items (e.g. ints, floats, chars) whose memory is allocated in
a contiguous block of memory.
Arrays and pointers have a special relationship. This is because arrays use pointers
to reference memory locations. Therefore, most of the time, pointer and array
references can be used interchangeably.
Arrays :: Declaration and Syntax
A simple array of 5 ints would look like:
int ia[5];
This would effectively make an area in memory (if available) for ia, which is 5 *
sizeof(int). We will discuss sizeof() in detail in Dynamic Memory Allocation.
Basically sizeof() returns the size of what is being passed. On a typical 32-bit
machine, sizeof(int) returns 4 bytes, so we would get a total of 20 bytes of
memory for our array.
How do we reference areas of memory within the array? By using the [ ] we can
effectively "dereference" those areas of the array to return values.
printf("%d ", ia[3]);
This would print the fourth element in the array to the screen. Why the fourth? This
is because array elements are numbered from 0.
Note: You cannot initialize an array using a variable. ANSI C does not allow this. For
example:
int x = 5;
int ia[x];
The above example is illegal. ANSI C restricts the array size in a declaration to be
constant. So is this legal?
int ia[];
No. The array size is not known at compile time.
How can we get around this? By using macros we can also make our program more
readable!
#define MAX_ARRAY_SIZE 5
/* .... code .... */

int ia[MAX_ARRAY_SIZE];
Now if we wanted to change the array size, all we'd have to do is change the define
statement!
But what if we don't know the size of our array at compile time? That's why we have
Dynamic Memory Allocation. More on this later...
Can we initialize the contents of the array? Yes!
int ia[5] = {0, 1, 3, 4};
int ia[ ] = {0, 2, 1};
Both of these work. The first one, ia, is 20 bytes long: the first 16 bytes are initialized
to 0, 1, 3, 4 and the remaining element is set to 0. The second one is also valid, 12 bytes
initialized to 0, 2, 1. (Examples on a typical 32-bit machine).
Arrays :: Relationship with Pointers
So what's up with all this pointers are related to arrays junk? This is because an
array name is just a pointer to the beginning of the allocated memory space.
This causes "problems" in C, as the Limitations sub-section will show.
Let's take this example and analyze it:
int ia[6] = {0, 1, 2, 3, 4, 5};                        /* 1 */
int *ip;                                               /* 2 */

ip = ia;            /* equivalent to ip = &ia[0]; */   /* 3 */
ip[3] = 32;         /* equivalent to ia[3] = 32;  */   /* 4 */
ip++;               /* ip now points to ia[1]     */   /* 5 */
printf("%d ", *ip); /* prints 1 to the screen     */   /* 6 */
ip[3] = 52;         /* equivalent to ia[4] = 52   */   /* 7 */
Ok, so what's happening here? Let's break this down one line at a time. Refer to the
line numbers on the side:
1. Initialize ia
2. Create ip: a pointer to an int
3. Assign ip pointer to ia. This is effectively assigning the pointer to point to the
first position of the array.
4. Assign the fourth position in the array to 32. But how? ip is just a pointer?!?!
But what is ia? Just a pointer! (heh)
5. Use pointer arithmetic to move the pointer over in memory to the next block.
Pointer arithmetic automatically scales the step by sizeof(int), the size of
the type ip points to.
6. Prints ia[1] to the screen, which is 1
7. Sets ia[4] to 52. Why the fifth position? Because ip points to ia[1] from the
ip++ line.
Now it should be clear. Pointers and arrays have a special relationship because
arrays are actually just a pointer to a block of memory!
Arrays :: Multidimensional
Sometimes it's necessary to declare multidimensional arrays. In C, multidimensional
arrays are row major. In other words, the first bracket specifies number of rows.
Some examples of multidimensional array declarations:
int igrid[2][3] = { {0, 1, 2}, {3, 4, 5} };
int igrid[2][3] = { 0, 1, 2, 3, 4, 5 };
int igrid[ ][4] = { {0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9} };
int igrid[ ][2];
The first three examples are valid, the last one is not. As you can see from the first
two examples, the braces are optional. The third example shows that the number of
rows does not have to be specified in an array initialization.
This seems simple enough. But what if we stored pointers in our arrays? This would
effectively create a multidimensional array! Since reinforcement of material is key to
learning it, let's go back to getopt. Remember the variable argv? It can be declared
in the main function as either **argv or *argv[]. What does **argv mean? It looks
like we have two pointers or something. This is actually a pointer to a pointer.
The *argv[] means the same thing, right? Imagine (pardon the crappy graphics
skills):
argv
+---+
| 0 | ---> "./junk"
+---+
| 1 | ---> "-b"
+---+
| 2 | ---> "gradient"
+---+
| 3 | ---> "yeehaw"
+---+
So what would argv[0][1] be? It would be the character '/'. Why is this? It's because
strings are just an array of characters. So in effect, we have a pointer to the actual
argv array and a pointer at each argv location to each string. A pointer to a pointer.
We will go more in depth into strings later.
Arrays :: Limitations
Because the name of an array represents just a pointer to the beginning of the array, we
have some limitations or "problems."
1. No Array Out of Bounds Checking. For example:
int ia[2] = {0, 1};

printf("%d ", ia[2]);
The above code compiles, but its behavior is undefined: it may print junk or
segfault, because you are trying to look at an area of memory not inside the
array memory allocation.
2. Array Size Must be Constant or Known at Compile-time. See Arrays ::
Declaration and Syntax.
3. Arrays Cannot be Copied or Compared. Why? Because they are pointers. See
Weiss pg. 149 for a more in-depth explanation.
4. Array Index Type must be Integral.
Another limitation comes with arrays being passed into functions. Take for example:
void func(int ia[])
void func(int *ia)
Both are the same declaration (you should know why by now). But why would this
cause problems? Because only the pointer to the array is passed in, not the whole
array. So what if you mistakenly did a sizeof(ia) inside func? Instead of returning
the sizeof the whole array, it would return the size of a pointer which corresponds to
the word size of the computer.
Arrays Review
Arrays are, in simple terms, just a pointer! Remember that!
There isn't much to review. Remember arrays have limitations because they are
inherently just a pointer. Wait we mentioned that already. :)
Dynamic Memory Allocation
Now that we have a firm grasp on pointers, how can we allocate memory at run-time
instead of compile time? ANSI C provides five standard functions that help you
allocate memory on the heap.
Dynamic Memory Allocation :: sizeof()
We have already seen this in the array section. To recap, sizeof() returns
the size, as a size_t, of the item passed in. So on a typical 32-bit machine, sizeof(int)
returns 4 bytes. size_t is just an unsigned integer type.
sizeof() is helpful when using malloc or calloc calls. Note that sizeof() does not
always return what you may expect (see below).
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
The prototype for malloc(3) is:
void *malloc(size_t size);
malloc takes in a size_t and returns a void pointer. Why does it return a void
pointer? Because it doesn't matter to malloc what type this memory will be used
for.
Let's see an example of how malloc is used:
int *ip;

ip = malloc(5 * sizeof(int));
Pretty simple. sizeof(int) returns the sizeof an integer on the machine, multiply by
5 and malloc that many bytes.
Wait... we're forgetting something. AH! We didn't check for return values. Here's
some modified code:
#define INITIAL_ARRAY_SIZE 5
/* ... code ... */
int *ip;

if ((ip = malloc(INITIAL_ARRAY_SIZE * sizeof(int))) == NULL) {
    (void)fprintf(stderr, "ERROR: Malloc failed");
    (void)exit(EXIT_FAILURE); /* or return EXIT_FAILURE; */
}
Now our program properly prints an error message and exits gracefully if malloc
fails.
calloc(3) works like malloc, but initializes the memory to zero if possible. The
prototype is:
void *calloc(size_t nmemb, size_t size);
Refer to Weiss pg. 164 for more information on calloc.
bzero(3) sets the first n bytes of the memory that s points to to zero. Note that bzero
is a BSD function, not part of ANSI C. Prototype:
void bzero(void *s, size_t n);
If you need to set the value to some other value (or just as a general alternative to
bzero), you can use memset:
void *memset(void *s, int c, size_t n);
where you can specify c as the value to fill for n bytes of pointer s.
Dynamic Memory Allocation :: realloc(3)
What if we run out of allocated memory during the run-time of our program and
need to give our collection of items more memory?
Enter realloc(3), its prototype:
void *realloc(void *ptr, size_t size);
realloc takes in the pointer to the original area of memory to enlarge and how
much the total size should be.
So let's give it a try:
ip = realloc(ip, sizeof(ip) + sizeof(int)*5);
Now we have some more space through adding the sizeof the complete array and an
additional 5 spaces for ints... STOP! This is NOT how you use realloc. Again. The
above example is wrong. Why?
First, sizeof(ip) does not give the size of the allocated space originally allocated by
malloc (or a previous realloc). Using sizeof() on a pointer only returns the sizeof the
pointer, which is probably not what you intended.
Also, what happens if the realloc on ip fails? ip gets set to NULL, and the previously
allocated memory to ip now has no pointer to it. Now we have allocated memory just
floating in the heap without a pointer. This is called a memory leak. This can
happen from sloppy realloc's and not using free on malloc'd space.
So what is the correct way? Take this code for example:
int *tmp;

if ((tmp = realloc(ip, sizeof(int) * (INITIAL_ARRAY_SIZE + 5))) == NULL) {
    /* Possible free on ip? Depends on what you want */
    fprintf(stderr, "ERROR: realloc failed");
} else {
    ip = tmp; /* only replace ip once realloc has succeeded */
}
Now we are creating a temporary pointer to try a realloc. If it fails, then it isn't a big
problem as we keep our ip pointer on the original memory space. Also, note that we
specified the real size of our original array and now are adding 5 more ints (so
4 bytes * (5+5) = 40 bytes, on a typical 32-bit machine).
Dynamic Memory Allocation :: free(3)
Now that we can malloc, calloc, and realloc we need to be able to free the
memory space if we have no use for it anymore. Like we mentioned above, any
memory space that loses its pointer or isn't free'd is a memory leak.
So what's the prototype for free(3)? Here it is:
void free(void *ptr);
free simply takes in a pointer to free. Not challenging at all. Note that free can take
in NULL, as specified by ANSI.
Dynamic Memory Allocation :: Multi-dimensional Structures
It's nice that we can create a "flat" structure, like an array of 100 doubles. But
what if we want to create a 2D array of doubles at runtime? This sounds like a
difficult task, but it's actually simple!
As an example, let's say we are reading in x, y, z coordinates from a file of
unknown length. The incorrect method to approach this task is to create an
arbitrarily large 2D array with hopefully enough rows or entries. Instead of leaving
our data structure to chance, let's just dynamically allocate, and re-allocate on the
fly.
First, let's define a few macros to keep our code looking clean:
#define oops(s) { perror((s)); exit(EXIT_FAILURE); }
#define MALLOC(s,t) if(((s) = malloc(t)) == NULL) { oops("error: malloc() "); }
#define INCREMENT 10
The MALLOC macro simply takes in the pointer (s) to the memory space to be allocated
and the number of bytes to allocate (t). oops is called when malloc fails; it prints
the error message for the error code and exits the program. INCREMENT is the default
amount of memory to allocate when we run out of allocated space.
On to the dynamic memory allocation!
double **xyz;
int i;

MALLOC(xyz, sizeof(double *) * INCREMENT);
for (i = 0; i < INCREMENT; i++) {
MALLOC(xyz[i], sizeof(double) * 3);
}
What's going on here? Our double pointer, xyz is our actual storage 2D array. We
must use a double pointer, because we are pointing to multiple pointers of doubles!
If this sounds confusing, think of it this way. Instead of each array entry having a
real double entry, each array position contains a pointer to another array of doubles!
Therefore, we have our desired 2D array structure.
The first MALLOC call instructs malloc to create 10 double pointers in the xyz array.
So each of these 10 array positions now holds an uninitialized pointer to a
double. The for loop goes through each array position and creates a
new array at each position to three doubles, because we want to read in x, y, z
coordinates for each entry. The total space we just allocated is 10 spaces of 3
doubles each. So we've just allocated 30 double spaces.
What if we run out of space? How do we reallocate?
double **tmp;
int current_size, n;

/* clip ... other code */

if (current_size >= n) {
    if ((tmp = realloc(xyz, sizeof(double *) * (n + INCREMENT))) == NULL) {
        oops("realloc() error! ");
    }
    for (i = n; i < n + INCREMENT; i++) {
        MALLOC(tmp[i], sizeof(double) * 3);
    }
    n += INCREMENT;
    xyz = tmp;
}
What's going on here? Suppose our file of x, y, z coordinates is longer than 10 lines.
On the 11th line, we'll invoke the realloc(). n is the current number of rows
allocated. current_size indicates the number of rows we are working on (in our
case, the expression would be 10 >= 10). We instruct realloc to reallocate space
for xyz of (double *) type, or double pointers of the current size (n) plus
the INCREMENT. This will give us 10 additional entries. Remember NEVER reallocate
to the same pointer!!
If realloc() succeeds, then we need to allocate space for the double array of size 3
to hold the x, y, z coordinates in the new xyz realloc'd array. Note the for loop,
where we start and end. Then we cleanup by providing our new max array size
allocated (n) and setting the xyz double pointer to the newly realloc'd and malloc'd
space, tmp.
Not as difficult as you might have imagined it to be, right? What if we're done with
our array? We should free it!
for (i = 0; i < n; i++) {
free(xyz[i]);
}
free(xyz);
The above code free's each entry in the xyz array (the actual double pointers to real
data) and then we free the pointer to a pointer reference. The statements cannot be
reversed, because you'll lose the pointer reference to each 3-entry double array!
Dynamic Memory Allocation Review
You have powerful tools you can use when allocating memory dynamically: sizeof,
malloc, calloc, realloc, and free.
Take precautions when using the actual memory allocation functions for memory
leaks, especially with realloc. Remember, always check for NULL with malloc! Your
programs will thank you for it.
Strings
We have discussed arrays previously, but we have not discussed them in depth in
the context of character arrays. These character arrays are referred to as strings.
Again, strings are not directly supported in C. Let's try that again, there is no
direct string support in C.
So how do we emulate strings in C? By correctly creating string constants or properly
allocating space for a character array we can get some string action in C.
Strings :: Declaration and Syntax
Let's see some examples of string declarations:
char str[5] = {'l', 'i', 'n', 'u', 'x'};
char str[6] = {'l', 'i', 'n', 'u', 'x', '\0'};
char str[3];
char str[ ] = "linux";
char str[5] = "linux";
char str[9] = "linux";
All of the above declarations are legal. But which ones don't work? The first one is a
valid declaration, but will cause major problems because it is not null-terminated.
The second example shows a correct null-terminated string. The special escape
character \0 denotes string termination. The fifth example also suffers the same
problem. The fourth example, however does not. This is because the compiler will
determine the length of the string and automatically initialize the last character to a
null-terminator.
Strings :: Dynamic Memory Allocation
This stuff is much the same as the previous section. You must be careful to allocate
one additional space to contain the null-terminator.
For example:
char *s;

if ((s = malloc(sizeof(char) * 5)) == NULL) {
/* ERROR Handling code */
}
strcpy(s, "linux");
printf("%s\n", s);
This is a bug. strcpy copies six bytes, the five characters of "linux" plus the
null-terminator, into space allocated for only five, so the terminator is written past
the end of the memory allocated for s. The behavior is undefined: the program may print
junk or crash. The simple solution would be to add 1 to the malloc call.
You must be particularly careful when using malloc or realloc in combination
with strlen. strlen returns the size of a string minus the null-terminator. More on
strlen on the next sub-section.
A final note: What's wrong with the following code:
char s1[ ] = "linux";
char *s2;

strcpy(s2, s1);
Remember that simply declaring a pointer does not create any space for the pointer
to point to (remember that?).
Strings :: string.h Library
You can add support for string operations via the string.h library. (Note: If you
understand everything that has gone on by now, you should be able to code most of
the functions in string.h!)
Below is a listing of prototypes for commonly used functions in string.h:
size_t strlen(const char *s);
char *strdup(const char *s);
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
char *strcat(char *dest, const char *src);
char *strncat(char *dest, const char *src, size_t n);
int strcmp(const char *s1, const char *s2);
int strncmp(const char *s1, const char *s2, size_t n);
/* atoi and atof are declared in stdlib.h, not string.h */
int atoi(const char *nptr);
double atof(const char *nptr);
See Weiss pg. 486 (Appendix D.14) for a full string.h listing. Weiss Appendix D is
your friend. Use it!
Strings Review
Strings are just character arrays. Nothing more, nothing less.
Strings must be null-terminated if you want to properly use them.
Remember to take into account null-terminators when using dynamic memory
allocation.
The string.h library has many useful functions.
Most of the I/O involved with strings was covered in the Input/Output and File I/O
section. If you still don't grasp strings, read Weiss Chapter 8.
Structures
A structure in C is a collection of items of different types. You can think of a structure
as a "record" in Pascal or a class in Java without methods.
Structures, or structs, are very useful in creating data structures larger and more
complex than the ones we have discussed so far. We will take a cursory look at some
more complex ones in the next section.
Structures :: Declaration and Syntax
So how is a structure declared and initialized? Let's look at an example:
struct student {
char *first;
char *last;
char SSN[10];
float gpa;
char **classes;
};

struct student student_a, student_b;
Another way to declare the same thing is:
struct {
char *first;
char *last;
char SSN[10];
float gpa;
char **classes;
} student_a, student_b;
As you can see, the tag immediately after struct is optional. But in the second case,
if you wanted to declare another struct later, you couldn't.
The "better" method of initializing structs is:
struct student_t {
char *first;
char *last;
char SSN[10];
float gpa;
char **classes;
} student, *pstudent;
Now we have created a student_t student and a student_t pointer. The pointer
allows us greater flexibility (e.g. Create lists of students).
How do you go about initializing a struct? You could do it just like an array
initialization. But be careful, you can't initialize this struct at declaration time
because of the pointers.
But how do we access fields inside of the structure? C has a special operator for this
called the "member of" operator, denoted by . (period). For example, to assign the SSN
of student_a:
strcpy(student_a.SSN, "111223333");
Structures :: Pointers to Structs
Sometimes it is useful to assign pointers to structures (this will be evident in the
next section with self-referential structures). Declaring pointers to structures is
basically the same as declaring a normal pointer:
struct student *student_a;
But how do we dereference the pointer to the struct and its fields? You can do it in
one of two ways, the first way is:
printf("%s\n", (*student_a).SSN);
This would get the SSN in student_a. Messy and the readability is horrible! Is there a
better way? Of course, programmers are lazy! :)
To dereference, you can use the infix operator: ->. The above example using the
new operator:
printf("%s\n", student_a->SSN);
If we malloc'd space for the structure for *student_a could we start assigning things
to pointer fields inside the structure? No. You must malloc space for each individual
pointer within the structure that is being pointed to.
Structures :: typedef
There is an easier way to define structs or you could "alias" types you create. For
example:
typedef struct {
char *first;
char *last;
char SSN[10];
float gpa;
char **classes;
} student;

student student_a;
Now we get rid of those silly struct tags. You can use typedef for non-structs:
typedef long int *pint32;

pint32 x, y, z;
x, y and z are all pointers to long ints. typedef is your friend. Use it.
Structures :: Unions
Unions are declared in the same fashion as structs, but have a fundamental
difference. Only one item within the union can be used at any time, because the
memory allocated for each item inside the union is in a shared memory location.
Why you ask? An example first:
struct conditions {
    float temp;
    union {
        float wind_chill;
        float heat_index;
    } feels_like;
} today;
As you know, wind_chill is only calculated when it is "cold" and heat_index when it is
"hot". There is no need for both. So when you specify the temp in today, feels_like
only has one value, either a float for wind_chill or a float for heat_index.
Types inside of unions are unrestricted, you can even use structs within unions.
Structures :: Enumerated Types
What if you wanted a series of constants without creating a new type? Enter
enumerated types. Say you wanted an "array" of months in a year:
enum e_months {JAN=1, FEB, MAR, APR, MAY, JUN, JUL,
AUG, SEP, OCT, NOV, DEC};
typedef enum e_months month;

month currentmonth;
currentmonth = JUN; /* same as currentmonth = 6; */
printf("%d\n", currentmonth);
We are enumerating the months in a year into a type called month. You aren't
creating a type, because enumerated types are simply integers. Thus the printf
statement uses %d, not %s.
If you notice the first month, JAN=1 tells C to make the enumeration start at 1
instead of 0.
Note: This would be almost the same as using:
#define JAN 1
#define FEB 2
#define MAR 3
/* ... etc ... */
Structures :: Abilities and Limitations
You can create arrays of structs.
Structs can be copied or assigned.
The & operator may be used with structs to show addresses.
Structs can be passed into functions. Structs can also be returned from functions.
Structs cannot be compared!
Structures Review
Structures can store non-homogenous data types into a single collection, much like
an array does for common data (except it isn't accessed in the same manner).
Pointers to structs have a special infix operator: -> for dereferencing the pointer.
typedef can help you clear your code up and can help save some keystrokes.
Enumerated types allow you to have a series of constants much like a series
of #define statements.
Advanced Data Structures
In the previous section, we mentioned that you can create pointers to structures.
The Data Structures presented here all require pointers to structs, or more
specifically they are self-referential structures.
These self-referential structures contain pointers within the structs that refer to
another identical structure.
Advanced Data Structures :: Linked Lists
Linked lists are the most basic self-referential structures. Linked lists allow you to
have a chain of structs with related data.
So how would you go about declaring a linked list? It would involve a struct and a
pointer:
struct llnode {
<type> data;
struct llnode *next;
};
The <type> signifies data of any type. This is typically a pointer to something,
usually another struct. The next line is the next pointer to another llnode struct.
Another more convenient way using typedef:
typedef struct list_node {
<type> data;
struct list_node *next;
} llnode;

llnode *head = NULL;
Note that even though the typedef is specified, the next pointer within the struct must
still have the struct tag!
There are two ways to create the root node of the linked list. One method is to
create a head pointer and the other way is to create a dummy node. It's usually
easier to create a head pointer.
Now that we have a node declaration down, how do we add or remove from our
linked list? Simple! Create functions to do additions, removals, and traversals.
o Additions: A sample Linked list addition function:
void add(llnode **head, <type> data_in) {
    llnode *tmp;

    if ((tmp = malloc(sizeof(*tmp))) == NULL) {
        ERR_MSG(malloc);
        (void)exit(EXIT_FAILURE);
    }
    tmp->data = data_in;
    tmp->next = *head;
    *head = tmp;
}

/* ... inside some function ... */
llnode *head = NULL;
<type> *some_data;
/* ... initialize some_data ... */

add(&head, some_data);
What's happening here? We created a head pointer, and then sent
the address-of the head pointer into the add function which is expecting a
pointer to a pointer. We send in the address-of head. Inside add, a tmp
pointer is allocated on the heap. The data pointer on tmp is moved to point to
the data_in. The next pointer is moved to point to the head pointer (*head).
Then the head pointer is moved to point to tmp. Thus we have added to
the beginning of the list.
o Removals: You traverse the list, querying the next struct in the list for the
target. If you get a match, set the current node's next pointer to the
target's next pointer. Don't forget to free the node you are
removing (or you'll get a memory leak)! You need to take into consideration if
the target is the first node in the list. There are many ways to do this (e.g.
recursively). Think about it!
o Traversals: Traversing a list is simple, just query the data part of the node for
pertinent information as you move from next to next. There are different
methods for traversing trees (see Trees).
What about freeing the whole list? You can't just free the head pointer! You have to
free the list. A sample function to free a complete list:
void freelist(llnode *head) {
    llnode *tmp;

    while (head != NULL) {
        free(head->data); /* Don't forget to free memory within the list! */
        tmp = head->next;
        free(head);
        head = tmp;
    }
}
Now we can rest easy at night because we won't have memory leaks in our lists!
Advanced Data Structures :: Stacks
Stacks are a specific kind of linked list. They are referred to as LIFO or Last In First
Out.
Stacks have specific adds and removes called push and pop. Pushing nodes onto
stacks is easily done by adding to the front of the list. Popping is simply removing
from the front of the list.
It would be wise to give return values when pushing and popping from stacks. For
example, pop can return the struct that was popped.
Advanced Data Structures :: Queues
Queues are FIFO or First In First Out. Think of a typical (non-priority) printer queue:
The first jobs submitted are printed before jobs that are submitted after them.
Queues aren't more difficult to implement than stacks. By creating a tail
pointer you can keep track of both the front and the tail ends of the list.
This allows you to enqueue onto the tail of the list, and dequeue from the front of
the list.
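The front and tail pointers can be kept together in a small struct. The following is a minimal sketch under that assumption; the queue and qnode types and the enqueue and dequeue signatures are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Hypothetical types for this sketch. */
typedef struct qnode {
    int value;
    struct qnode *next;
} qnode;

typedef struct queue {
    qnode *front;   /* dequeue happens here */
    qnode *tail;    /* enqueue happens here */
} queue;

/* Add a value at the tail. Returns 0 on success, -1 if malloc fails. */
int enqueue(queue *q, int value) {
    qnode *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    if (q->tail == NULL)
        q->front = n;        /* queue was empty: n is also the front */
    else
        q->tail->next = n;
    q->tail = n;
    return 0;
}

/* Remove the value at the front. Stores it in *out and returns 0,
   or returns -1 if the queue is empty. */
int dequeue(queue *q, int *out) {
    qnode *n = q->front;
    if (n == NULL)
        return -1;
    *out = n->value;
    q->front = n->next;
    if (q->front == NULL)
        q->tail = NULL;      /* queue became empty: reset the tail too */
    free(n);
    return 0;
}
```

Start with both pointers set to NULL (for example, queue q = { NULL, NULL };). The only subtle part is keeping the tail pointer in sync when the queue becomes empty or stops being empty.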
Advanced Data Structures :: Hash Tables
So what's the problem with linked lists? Their efficiency isn't that great. (In Big-O
notation, finding an item in a linked list is O(n).) Is there a way to speed up data
structures? Enter hash tables. Hash tables provide O(1) average-case access while
having the ability to grow dynamically. The key to a well-performing hash table is
understanding the data that will be inserted into it. By custom tailoring an array of
pointers, you can have O(1) access.
But you are asking, how do you know where a certain data piece is within the
array? This is accomplished through a key. A key is based off the data; the
simplest ones involve applying a modulus to a certain piece of information within the
data. The general rule is that if a key sucks, the hash table sucks.
What about collisions (i.e. the same key for two different pieces of information)?
There are many ways to resolve this, but the most popular way is through separate
chaining. You can create a linked list from the array position to hold multiple data
pieces, if necessary.
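To make the key and the chaining concrete, here is a minimal sketch of a hash table that stores an int value under an int id. Everything named here (TABLE_SIZE, hnode, htable, hash_key, ht_put, ht_get) is an assumption for illustration, and the modulus key is only reasonable if the ids are spread fairly evenly.

```c
#include <stdlib.h>

#define TABLE_SIZE 7   /* hypothetical size; a prime spreads keys better */

/* One entry in a bucket's chain. */
typedef struct hnode {
    int id;
    int value;
    struct hnode *next;
} hnode;

/* The table is just an array of pointers to chains. */
typedef struct htable {
    hnode *buckets[TABLE_SIZE];
} htable;

/* The key: a modulus applied to one piece of the data (the id). */
unsigned hash_key(int id) {
    return (unsigned)id % TABLE_SIZE;
}

/* Insert or update. Returns 0 on success, -1 if malloc fails. */
int ht_put(htable *t, int id, int value) {
    unsigned k = hash_key(id);
    hnode *n;

    for (n = t->buckets[k]; n != NULL; n = n->next) {
        if (n->id == id) {       /* already stored: update in place */
            n->value = value;
            return 0;
        }
    }
    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->id = id;
    n->value = value;
    n->next = t->buckets[k];     /* collision: chain onto the bucket */
    t->buckets[k] = n;
    return 0;
}

/* Look up an id. Returns 1 and fills *value_out if found, else 0. */
int ht_get(const htable *t, int id, int *value_out) {
    const hnode *n;

    for (n = t->buckets[hash_key(id)]; n != NULL; n = n->next) {
        if (n->id == id) {
            *value_out = n->value;
            return 1;
        }
    }
    return 0;
}
```

With TABLE_SIZE set to 7, the ids 1 and 8 land in the same bucket, so both end up on one chain. If every id collided like that, the table would behave like a single linked list again, which is what "if a key sucks, the hash table sucks" means in practice.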
Weiss provides a more in-depth study on hash tables, section 10.3, pg. 271-279.
Advanced Data Structures :: Trees
Another variation of a linked list is a tree. A simple binary tree involves having two
types of "next" pointers, a left and a right pointer. Each step down the tree splits
the remaining data into two different paths, so a lookup in a balanced tree takes
O(log n) steps instead of O(n), while keeping a uniform data structure. But trees can
degrade into linked list efficiency (for example, when items are inserted in sorted
order).
There are different types of trees; some popular ones are self-balancing. AVL trees
are a typical type of tree that moves nodes around so that the tree stays balanced:
the heights of the two subtrees of any node never differ by more than 1.
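Here is a minimal sketch of a plain (not self-balancing) binary search tree, mainly to show how insertion order affects the height. The tnode type and the function names are assumptions for this example; an AVL tree would add rebalancing on top of something like this.

```c
#include <stdlib.h>

/* Hypothetical tree node for this sketch. */
typedef struct tnode {
    int key;
    struct tnode *left;    /* keys smaller than this node's key */
    struct tnode *right;   /* keys larger than this node's key */
} tnode;

/* Insert a key. Returns 1 if inserted, 0 if the key was already
   present, -1 if malloc fails. */
int tree_insert(tnode **root, int key) {
    while (*root != NULL) {
        if (key == (*root)->key)
            return 0;
        root = (key < (*root)->key) ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return -1;
    (*root)->key = key;
    (*root)->left = NULL;
    (*root)->right = NULL;
    return 1;
}

/* Returns 1 if the key is in the tree, 0 otherwise. */
int tree_contains(const tnode *root, int key) {
    while (root != NULL) {
        if (key == root->key)
            return 1;
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}

/* Number of nodes on the longest path from the root down to a leaf. */
int tree_height(const tnode *root) {
    int l, r;
    if (root == NULL)
        return 0;
    l = tree_height(root->left);
    r = tree_height(root->right);
    return 1 + (l > r ? l : r);
}
```

Inserting 4, 2, 6, 1, 3, 5, 7 in that order gives a tree of height 3, while inserting 1 through 7 in sorted order gives height 7, which is just a linked list leaning to the right.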
If you want more information on trees or self-balancing trees, you can query google
about this.
Advanced Data Structures Review
Linked lists, stacks, queues, hash tables, and trees are all different types of data
structures that can help accommodate almost any type of data.
Other data structures exist such as graphs. That is beyond the scope of this tutorial.
If you want a more in-depth look at the data structures discussed here, refer to
Weiss chapter 10, pg. 257-291 and chapter 11 pg. 311-318 for information on binary
search trees.
For more information on recursive functions, see Weiss chapter 11, pg. 294-311.
Make and Makefiles Overview
Make allows a programmer to easily keep track of a project by keeping the built
programs current with their separate sources. Make can automate various tasks
for you: not only compiling the proper parts of the project's source tree, but
also other tasks such as cleaning directories, organizing output,
and even debugging.
Make and Makefiles :: An Introduction
If you had a program called hello.c, you could simply run make hello in that
directory, and make's built-in rules would call cc (gcc) with the -o hello option,
even without a Makefile. But this isn't why make is such a nice tool
for program building and management.
The power and ease of use of make come from the Makefile.
Make parses the Makefile for rules and, according to the parameters you give
make, executes those rules.
Rules take the following form:
target_name : prerequisites ...
	command
	...
The target is the parameter you give make. For example make clean would cause
make to carry out the target_name called clean. If there are any prerequisites to
process, then make will do those before proceeding. The commands would then be
executed under the target.
NOTE: Each command line must begin with a TAB character (not spaces)!
Examples? The following Makefile is a very simple one taken from the GNU
make manual:
edit : main.o kbd.o command.o display.o \
       insert.o search.o files.o utils.o
	cc -o edit main.o kbd.o command.o display.o \
	           insert.o search.o files.o utils.o

main.o : main.c defs.h
	cc -c main.c
kbd.o : kbd.c defs.h command.h
	cc -c kbd.c
command.o : command.c defs.h command.h
	cc -c command.c
display.o : display.c defs.h buffer.h
	cc -c display.c
insert.o : insert.c defs.h buffer.h
	cc -c insert.c
search.o : search.c defs.h buffer.h
	cc -c search.c
files.o : files.c defs.h buffer.h command.h
	cc -c files.c
utils.o : utils.c defs.h
	cc -c utils.c
clean :
	rm edit main.o kbd.o command.o display.o \
	   insert.o search.o files.o utils.o
Now if you change just kbd.c, make will only recompile kbd.c into its object file and
then relink all of the object files to create edit. Much easier than recompiling the
whole project!
But that's still too much stuff to write! Use make's smarts to deduce commands. The
above example re-written (taken from GNU make manual):
objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

edit : $(objects)
	cc -o edit $(objects)

main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
search.o : defs.h buffer.h
files.o : defs.h buffer.h command.h
utils.o : defs.h

.PHONY: clean
clean :
	rm edit $(objects)
So what changed? Now we have a variable called objects containing all of our object
files, so that the edit target only requires this variable. You may also notice that
all of the .c files and compile commands are missing from the object file rules. This
is because make deduces that each object file is built from the C source file of the
same name, and will automatically compile that source file with cc -c. What about
the .PHONY target? Let's say you actually have a file called "clean". If you had just a
clean target without the .PHONY, make would see that the file exists and has no
prerequisites, consider it up to date, and never run the rm command. To avoid this,
you can use the .PHONY target. It is rare to have a file called "clean" in the target
directory... but who knows if you might have one?
Make and Makefiles :: Beyond Simple
You can include other Makefiles by using the include directive.
You can create conditional syntax in Makefiles, using ifdef, ifeq, ifndef, ifneq.
You can create variables inside of Makefiles, like the $(objects) above.
Let's use a different example. The hypothetical source tree:
moo.c
/ \
--- ---
/ \
foo.c bar.c
/ \
------- -------
/ \ / \
baz.c loop.h dood.c shazbot.c
/ \
------- -------
/ \ / \
mop.c <libgen.h> woot.c defs.h
Let's create a more complex, yet easier to maintain Makefile for this project:
# Source, Executable, Includes, Library Defines
INCL   = loop.h defs.h
SRC    = moo.c foo.c bar.c baz.c dood.c shazbot.c mop.c woot.c
OBJ    = $(SRC:.c=.o)
LIBS   = -lgen
EXE    = moolicious

# Compiler, Linker Defines
CC      = /usr/bin/gcc
CFLAGS  = -ansi -pedantic -Wall -O2
LIBPATH = -L.
LDFLAGS = -o $(EXE) $(LIBPATH) $(LIBS)
CFDEBUG = -ansi -pedantic -Wall -g -DDEBUG $(LDFLAGS)
RM      = /bin/rm -f

# Compile and Assemble C Source Files into Object Files
%.o: %.c
	$(CC) -c $(CFLAGS) $*.c

# Link all Object Files with external Libraries into Binaries
$(EXE): $(OBJ)
	$(CC) $(LDFLAGS) $(OBJ)

# Objects depend on these Header Files
$(OBJ): $(INCL)

# Create a gdb/dbx Capable Executable with DEBUG flags turned on
debug:
	$(CC) $(CFDEBUG) $(SRC)

# Clean Up Objects, Executables, Dumps out of source directory
clean:
	$(RM) $(OBJ) $(EXE) core a.out
Now we have a clean and readable Makefile that can manage the complete source
tree. (Remember to use tabs on the command lines!)
You can manipulate lots and lots of things with Makefiles. I cannot possibly cover
everything in-depth in this short tutorial. You can navigate between directories
(which can have separate Makefiles/rules), run shell commands, and perform various
other tasks with make.
Make and Makefiles :: Where to go from here?
A lot of this tutorial references the GNU make manual. If you have any questions
about Make, the GNU manual should cover it.
GNU autoconf is a tool for automatically generating configure scripts from a
configure.in file (named configure.ac in newer versions). These configure scripts can
automatically set up Makefiles in conjunction with GNU automake and GNU m4. These
tools are way beyond the scope of this document.
Look in GNU's manual repository for more information on these tools.
CVS is another tool that may be useful for very large projects. CVS stands for
Concurrent Versions System and it allows you to record the history of your source
files. CVS stores the base source and then stores the differences for each version.
CVS also protects the code in a multi-developer effort from accidental
overwriting... in other words, code-insulation. More information can be found in
the CVS documentation.
Debugging Techniques
Now that we have learned the basics of Makefiles, we can look into debugging
our code in conjunction with Makefiles.
As an introduction we will be using three debugging techniques:
1. Non-interactive
2. GNU gdb
3. dbx
Debugging Techniques :: Non-interactive
You can debug your code by placing #ifdef DEBUG and
corresponding #endif statements around debug code. For example:
#ifdef DEBUG
    PRINTF(("Variables Currently Contain: %d, %f, %s\n",
            *pi, *pf[1], str));
#endif
You can specify a DEBUG define at compile time by issuing gcc with the
-DDEBUG command option.
Note: This can be even further simplified into a single command called DPRINTF, so
you don't even have to write the #ifdef #endif directives! How? Look at
the Programming Tips and Tricks section (Quick Debugging Statements).
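A small sketch of the #ifdef DEBUG technique in a complete function is shown below. The PRINTF definition here is an assumption (the tutorial's own macro is defined in the I/O section), and sum_to is a hypothetical function used only to have something to trace.

```c
#include <stdio.h>

/* Assumed definition for this sketch: the double parentheses at the call
   site let a one-argument macro forward a whole printf argument list. */
#define PRINTF(args) printf args

/* Add up the integers 1..n, tracing each step when DEBUG is defined. */
int sum_to(int n) {
    int i;
    int total = 0;

    for (i = 1; i <= n; i++) {
        total += i;
#ifdef DEBUG
        PRINTF(("i = %d, total = %d\n", i, total));
#endif
    }
    return total;
}
```

Compiled normally, the trace lines are removed by the preprocessor and cost nothing at run-time. Compiled with gcc -DDEBUG, every pass through the loop prints the current values.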
Debugging Techniques :: GNU gdb
gdb is a powerful program for tracking down Segmentation Faults and Core Dumps. It
can be used for a variety of other debugging purposes as well.
First thing you must do is compile with the -g option and without any optimization
(i.e. no -O2 flag).
Once you do that, you can run gdb <exe>, where <exe> is the name of the
executable.
gdb should load with the executable to run on. Now you can
create breakpoints where you want the execution to stop. This can be specified
with the line number in the corresponding C source file. For example: break
376 would instruct gdb to stop at line 376.
You can now run the program by issuing the run command. If your program requires
command-line options or parameters, you can specify them with the run command.
For example: run 4 -s Doc! where 4, -s, and Doc! are the parameters.
The program should run until the breakpoint or exit on a failure. If it fails before the
breakpoint you need to re-examine where you should specify the break. Repeat the
breakpoint step and rerun. If your program stops and shows you the breakpoint line,
then you can step into the function.
To step into the function use the step command. NOTE: Do not step into system
library calls (e.g. printf). You can use the command next over these types of calls or
over local function calls you don't wish to step into. You can repeat the last
command by simply pressing enter.
You can use the continue command to tell gdb to continue executing until the next
breakpoint or it finishes the program.
If you want to peek at variables, you can issue the print command on the variable.
For example: print mystruct->data.
You can also set variables using the set command. For example:
set mystruct->data = 42.
The ptype command can tell you what type a particular variable is.
The commands instruction attaches a list of gdb commands to a breakpoint. For
example, commands 1 will let you enter any number of other commands (one per line,
ending with "end"), and gdb will run those commands each time breakpoint 1 is hit.
The clear command tells gdb to remove the breakpoint at a specified line or function;
the delete command removes a breakpoint by its number.
The list command can tell you where you are at in the particular code block.
You can specify breakpoints not only with lines but with function names.
For more information on other commands, you can issue the help command inside
gdb.
Debugging Techniques :: dbx
dbx is a multi-threaded program debugger. This program is great for tracking down
memory leaks. dbx is not found on linux machines (it can be found on Solaris or
other *NIX machines).
Run dbx with the executable, as with gdb. Now you can set arguments with runargs.
After doing that, issue the check -memuse command. This will check for memory
use. If you want to also check for access violations, you can use the
check -all command.
Run the program using the run command. If you get any access violations or
memory leaks, dbx will report them to you.
Run the help command if you need to understand other commands or similar gdb
commands.
Debugging Techniques :: Other Debuggers
With Gnome 1.4, there is a program called MemProf. It is a memory profiler that
can detect leaks. Although I have not personally used it, it could be a great graphical
tool to use in finding those nasty memory leaks!
strace is another program that can trace a program, by logging the system calls it
makes and the signals it receives. Although the output is much harder to parse than
that of the other programs, this can be very useful in tracking down
problems with your code.
Creating Libraries
If you have a bunch of files that contain just functions, you can turn these source
files into libraries that can be used statically or dynamically by programs. This is
good for program modularity, and code re-use. Write Once, Use Many.
A library is basically just an archive of object files.
Creating Libraries :: Static Library Setup
First thing you must do is create your C source files containing any functions that will
be used. Your library can contain multiple object files.
After creating the C source files, compile the files into object files.
To create a library:
ar rc libmylib.a objfile1.o objfile2.o objfile3.o
This will create a static library called libmylib.a. Rename the "mylib" portion of the
library to whatever you want.
Next:
ranlib libmylib.a
This creates an index inside the library. That should be it! If you plan on copying the
library, remember to use the -p option with cp to preserve permissions.
Creating Libraries :: Static Library Usage
Remember to prototype your library function calls so that you do not get implicit
declaration errors.
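One common way to supply those prototypes is a header file that ships with the library. The sketch below shows the idea in a single listing; in a real project each marked part would live in its own file. The names mylib.h, mylib_add, and mylib_clamp are hypothetical, chosen only for this example.

```c
/* ---- mylib.h: hypothetical header that programs #include ---- */
#ifndef MYLIB_H
#define MYLIB_H

/* Prototypes: these tell the compiler the argument and return types,
   so callers do not trigger implicit declaration errors. */
int mylib_add(int a, int b);
int mylib_clamp(int value, int low, int high);

#endif /* MYLIB_H */

/* ---- objfile1.c: compiled to objfile1.o and archived in libmylib.a ---- */
int mylib_add(int a, int b) {
    return a + b;
}

/* ---- objfile2.c: compiled to objfile2.o and archived in libmylib.a ---- */
int mylib_clamp(int value, int low, int high) {
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```

A program that uses the library would then start with #include "mylib.h" and call mylib_add or mylib_clamp like any other function.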
When linking your program to the libraries, make sure you specify where the library
can be found, and list the library after the object files that use it (the linker
resolves symbols from left to right):
gcc -o foo foo.o -L. -lmylib
The -L. piece tells gcc to look in the current directory in addition to the other library
directories for finding libmylib.a.
You can easily integrate this into your Makefile (even the Static Library Setup part)!
Creating Libraries :: Shared Library Setup
Creating shared or dynamic libraries is simple also. Using the previous example, to
create a shared library:
gcc -fPIC -c objfile1.c
gcc -fPIC -c objfile2.c
gcc -fPIC -c objfile3.c
gcc -shared -o libmylib.so objfile1.o objfile2.o objfile3.o
The -fPIC option is to tell the compiler to create Position Independent Code (create
libraries using relative addresses rather than absolute addresses because these
libraries can be loaded multiple times). The -shared option is to specify that an
architecture-dependent shared library is being created. However, not all platforms
support this flag.
Now we have to compile the actual program using the libraries:
gcc -o foo foo.o -L. -lmylib
Notice that this is the same kind of command used to link against a static library.
Although the program is compiled in the same way, none of the actual library code is
inserted into the executable, hence the dynamic/shared library.
Note: You can automate this process using Makefiles!
Creating Libraries :: Shared Library Usage
Since a program that uses static libraries already has the library code compiled into
it, it can run on its own. A program that uses shared libraries loads them at
run-time, so it needs to know where the shared libraries are stored.
What's the advantage of creating executables using Dynamic Libraries? The
executable is much smaller than with static libraries. If it is a standard library that
can be installed, there is no need to compile it into the executable at compile time!
The key to making your program work with dynamic libraries is
the LD_LIBRARY_PATH environment variable. To display this variable, at a shell:
echo $LD_LIBRARY_PATH
This will display the variable if it is already defined. If it isn't, you can create a
wrapper script for your program to set this variable at run-time. Depending on your
shell, simply use the setenv (tcsh, csh) or export (bash, sh, etc.) commands. If you
already have LD_LIBRARY_PATH defined, make sure you add to the variable rather than
overwrite it! For example:
setenv LD_LIBRARY_PATH /path/to/library:${LD_LIBRARY_PATH}
would be the command you would use if you had tcsh/csh and already had an
existing LD_LIBRARY_PATH. If you didn't have it already defined, just remove
the : and everything right of it. An example with bash shells:
export LD_LIBRARY_PATH=/path/to/library:${LD_LIBRARY_PATH}
Again, remove the : and the stuff right of it if you don't already have an
existing LD_LIBRARY_PATH.
If you have administrative rights to your computer, you can install the particular
library to the /usr/local/lib directory and permanently add an LD_LIBRARY_PATH into
your .tcshrc, .cshrc, .bashrc, etc. file.
Programming Tips and Tricks
Below is a listing of a few tips you can use when you are programming. This
article has been translated into Serbo-Croatian.
Quick Commenting
Sometimes you may find yourself trying to comment out blocks of code which have
comments within them. Because C does not allow nested comments, you may find
that the inner */ comment end prematurely terminates your comment block. You can
utilize the C Preprocessor's #if directive to circumvent this:
#if 0
/* This code here is the stuff we want commented */
if (a != 0) {
    b = 0;
}
#endif
Quick Debugging Statements
In the C Preprocessor section, we mentioned that you could turn on and off
Debugging statements by using a #define. Expanding on that, it is even more
convenient if you write a macro (using the PRINTF() macro from the I/O section):
#ifdef DEBUG
#define DPRINTF(s) PRINTF(s)
#else
#define DPRINTF(s)
#endif
Now you can have DPRINTF(("Debugging statement")); for debugging statements!
This can be turned on and off using the -DDEBUG gcc flag.
Quick man Lookup in vim or emacs
In vim, move your cursor over the standard library function call you want to look up,
or any other word that might be in the man pages. Press K (capital k).
In emacs, open up your .emacs file and add this line:
(global-set-key [(f1)] (lambda () (interactive)
  (manual-entry (current-word))))
Now you can load up emacs, put the cursor on the word in question, and press
the F1 key to load up the man page on it. You can replace the F1 key with anything
you wish.