
C

programming
THE TUTORIAL


Thomas Gabriel


Copyright 2002,2016
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording or otherwise, without the written permission of the
author. For information regarding permissions, write to ynbbook@gmail.com or thomas.gabrielfr@gmail.com.

ISBN: 978-2-9551114-2-0


Library of Congress
Cataloging-in-Publication Data
Thomas Gabriel
C Programming: The Tutorial

Cover Design: Najat Younsi/Thomas Gabriel.


Disclaimer:
Even though the author and the publisher have taken care in the preparation of this book, they assume no responsibilities
for errors or omissions that might have crept into it, and make no expressed or implied warranty of any kind. No
liability is assumed for damages or negative consequences coming from the use of the information or programs
contained within the book.

The examples contained within the book are intended for learning purposes not to be used as-is in professional
environments.

Contact: ynbbook@gmail.com or thomas.gabrielfr@gmail.com


Trademarks:
BSD is a trademark of University of California, Berkeley, USA
Solaris and NFS are registered trademarks of Oracle Corporation
AIX is a registered trademark of International Business Machines Corporation
POSIX is a registered trademark of The Institute of Electrical and Electronic Engineers, Inc.
UNIX is a registered trademark of The Open Group
Linux is a registered trademark of Linus Torvalds.
X Window is a trademark of the Massachusetts Institute of Technology
Microsoft Windows and MS-DOS are trademarks of Microsoft Corporation
HP-UX is a registered trademark of Hewlett-Packard Company


Release 1.1







To Catherine for whom my love goes beyond the words for expressing it

CONTENTS

PART I C PROGRAMMING
CHAPTER I OVERVIEW
I.1 Introduction
I.2 The very first step
I.3 Variables
I.4 Comments
I.5 Operations
I.6 Control flow
I.7 Functions
I.8 Macros
I.9 Line continuation
I.10 Portability
CHAPTER II BASIC TYPES AND VARIABLES
II.1 Introduction
II.2 Numeral systems
II.3 Data representation
II.4 Literals
II.5 Variables
II.6 Basic types
II.7 Types of constants
II.8 Type qualifiers
II.9 Aliasing types
II.10 Compatible types
II.11 Conversions
II.12 Exercises
CHAPTER III ARRAYS, POINTERS AND STRINGS
III.1 Introduction
III.2 Arrays
III.3 Pointers
III.4 Strings
III.5 Arrays are not pointers
III.6 malloc(), realloc() and calloc()
III.7 Emulating multidimensional arrays with pointers
III.8 Array of pointers, pointer to array and pointer to pointer
III.9 Variable-length arrays and variably modified types
III.10 Creating types from array and pointer types
III.11 Qualified pointer types
III.12 Compatible types
III.13 Data alignment
III.14 Conversions
III.15 Exercises
CHAPTER IV OPERATORS
IV.1 Introduction
IV.2 Arithmetic operators
IV.3 Relational operators
IV.4 Equality operators
IV.5 Logical operators
IV.6 Bitwise operators
IV.7 Address and dereferencing operators
IV.8 Increment and decrement operators
IV.9 lvalue
IV.10 Assignment operators
IV.11 Ternary conditional operator
IV.12 Comma operator
IV.13 Operator precedence
IV.14 Type conversion
IV.15 Constant expressions
IV.16 Exercises
CHAPTER V CONTROL FLOW
V.1 Introduction
V.2 Statements
V.3 if statement
V.4 continue
V.5 break
V.6 goto
V.7 Nested loops
V.8 Exercises
CHAPTER VI USER-DEFINED TYPES
VI.1 Introduction
VI.2 Enumerations
VI.3 Structures
VI.4 unions
VI.5 Alignments
VI.6 Compatible types
VI.7 Conversions
VI.8 Exercises
CHAPTER VII FUNCTIONS
VII.1 Introduction
VII.2 Definition
VII.3 Function calls
VII.4 Return statement, part1
VII.5 Function declarations
VII.6 Scope of identifiers
VII.7 Storage duration
VII.8 Compound literals
VII.9 Object initializations
VII.10 Return statement, part2
VII.11 Default argument promotions
VII.12 Function type compatibility
VII.13 Conversions
VII.14 Call-by-value
VII.15 Call-by-reference
VII.16 Passing arrays
VII.17 Variable-length arrays and variably modified types
VII.18 Type qualifiers
VII.19 Recursive functions
VII.20 Pointer to function
VII.21 Understanding C declarations
VII.22 Pointers to functions as structure members
VII.23 functions and void *
VII.24 Parameters declared as void *
VII.25 Side effects
VII.26 Compound statements
VII.27 Inline functions and macros
VII.28 Variable number of parameters
VII.29 Some useful macros
VII.30 main() function
VII.31 exit() function
VII.32 Exercises
CHAPTER VIII C MODULES
VIII.1 Introduction
VIII.2 Overview
VIII.3 Writing Source Files
VIII.4 Header Files
VIII.5 Separate Compilation
VIII.6 Declaration, definition, initialization and prototype
VIII.7 Scope of user-defined types
VIII.8 Default argument promotions
VIII.9 Compatible structure, union and enumerated types
VIII.10 An example
VIII.11 Encapsulation
VIII.12 Exercise
CHAPTER IX INTERNATIONALIZATION
IX.1 Locales
IX.2 Categories
IX.3 setlocale
IX.4 localeconv()
IX.5 Character encodings
IX.6 Terminal settings
IX.7 strcoll() and strxfrm()
IX.8 Conversion functions
IX.9 Functions manipulating wide characters
CHAPTER X INPUT/OUTPUT
X.1 Introduction
X.2 Files
X.3 closing a file
X.4 Reading a file
X.5 Writing to a file
X.6 Position indicator
X.7 Managing errors
X.8 Buffers
X.9 freopen()
X.10 Standard input, standard output, standard error
X.11 Removing a file
X.12 Renaming a file
X.13 Temporary files
X.14 Wide and Multibyte I/O functions
X.15 Exercises
CHAPTER XI STANDARD C LIBRARY
XI.1 Introduction
XI.2 <assert.h>
XI.3 <ctype.h>: character handling functions
XI.4 <errno.h>
XI.5 <math.h>
XI.6 <stdarg.h>
XI.7 <stdbool.h>
XI.8 <stddef.h>
XI.9 <stdio.h>
XI.10 <stdint.h>
XI.11 <stdlib.h>
XI.12 <string.h>
XI.13 <time.h>
XI.14 <signal.h>
XI.15 <setjmp.h>
XI.16 <wctype.h>: wide character handling functions
XI.17 <wchar.h>
CHAPTER XII C11
XII.1 Introduction
XII.2 Generic selection
XII.3 Exclusive open mode
XII.4 Anonymous unions and structures
XII.5 Static assertion
XII.6 No-return functions
XII.7 Complex
XII.8 Alignment
XII.9 Bounds-checking functions
PART II TOOLS
CHAPTER XIII COMPILING C PROGRAMS
XIII.1 Introduction
XIII.2 Compilation Phases
XIII.3 Preprocessing
XIII.4 Lexical analysis
XIII.5 Syntax analysis
XIII.6 Semantic analysis
XIII.7 Assembly code
XIII.8 Assembly
XIII.9 Linking
XIII.10 Compilers and Interpreters
XIII.11 Compiler Driver
XIII.12 Compiling C Programs
XIII.13 GNU gcc
XIII.14 Writing Source Files
XIII.15 Header Files
XIII.16 Separate compilation
XIII.17 Warning Messages
XIII.18 Libraries
CHAPTER XIV MAKEFILE
XIV.1 Introduction
XIV.2 Invocation
XIV.3 Makefile
XIV.4 Rules
XIV.5 Dependency graph
XIV.6 Macros
XIV.7 Implicit rules
XIV.8 Controlling make behavior
XIV.9 Recursive make
XIV.10 Using multiple rules for one target
XIV.11 Multiple targets in the same rule
XIV.12 Continuation line
XIV.13 Compiling C programs with make
XIV.14 Dependency graph
CHAPTER XV PROGRAMMING TOOLS
XV.1 Introduction
XV.2 Lint and splint
XV.3 Time
XV.4 Prof and gprof
XV.5 GDB
XV.6 Maintaining file versions

LIST OF FIGURES
Figure II1 Byte ordering: Big-endian and Little-endian
Figure II2 Piece of data in main memory
Figure II3 Symbolic representation of a variable
Figure II4 Ones complement
Figure II5 Twos complement
Figure II6 Padding bits
Figure II7 Ranges of normalized and denormalized floating-point numbers
Figure II8 Binary floating-point representation
Figure III1 Memory layout of the array age[5]
Figure III2 Representation of the array age after initialization
Figure III3 Two-dimension array arr[2][3] viewed as a table
Figure III4 Memory layout of a two-dimension array arr[2][3]
Figure III5 Three-Dimensional array arr[2][2][3] in a matrix representation
Figure III6 Memory layout of the three-Dimensional array arr[2][2][3]
Figure III7 Representation of a pointer
Figure III8 Relationship between a pointer and the object it references
Figure III9 Memory allocation with malloc()
Figure III10 Representation of a pointer to int
Figure III11 Pointers p and q referencing the same object
Figure III12 Initialization of an array with a string literal
Figure III13 Initialization of a pointer with a string literal
Figure III14 Representation of an array and a pointer
Figure III15 Pointer to pointer to int: int **p
Figure III16 Pointer to pointer to strings
Figure III17 Representation of char arr[2][3]
Figure III18 Representation of char **arr
Figure III19 Representation of char (*arr)[3]
Figure III20 Representation of char *arr[2]
Figure III21 Pointer to array and pointer to int
Figure IV1 Bitwise NOT
Figure IV2 Bitwise left shift
Figure IV3 Bitwise right shift
Figure IV4 Bitwise AND
Figure IV5 Bitwise OR
Figure IV6 Bitwise XOR
Figure IV7 Integer conversion rank
Figure V1 continue statement
Figure V2 break statement
Figure V3 goto statement
Figure VI1 Linked list
Figure VI2 Tree data structure
Figure VI3 Example of padding bytes inside structures
Figure VI4 Example of padding bytes in unions
Figure VII1 Function call
Figure VII2 Scope overlaps
Figure VII3 Call-by-value
Figure VII4 Call-by-reference
Figure VIII1 Simplified view of compilation steps
Figure VIII2 Objects
Figure VIII3 External linkage
Figure VIII4 Structure student_node
Figure IX1 UTF-8 encoding for
Figure IX2 Setting character encoding for Gnome
Figure IX3 Setting character encoding for KDE: steps 1 and 2
Figure IX4 Setting character encoding for KDE: steps 3 and 4
Figure X1 Data transfer between stream and file
Figure XI1 ISO 8601 Week
Figure XI2 E and O modifiers used by strftime()
Figure XIII1 Compilation Phases
Figure XIII2 Interpreter
Figure XIII3 Compiler
Figure XIII4 Virtual Machine
Figure XIII5 Gcc steps
Figure XIII6 Linking Object Files
Figure XIII7 Building an executable
Figure XIII8 Using a Static Library
Figure XIII9 Three Processes Using the Same Functions
Figure XIII10 Example of Project Organization
Figure XIII11 Processes Sharing the Same Library
Figure XIII12 Mapping Shared Libraries into process address spaces
Figure XIV1 Dependency graph showing relationship between files
Figure XIV2 Dependency graph showing target f depending on targets f1 and f2
Figure XIV3 Recursive make processing from the top target up to the leaves
Figure XIV4 Dependency tree showing relationship between targets and prerequisites
Figure XIV5 Compilation steps of C source files
Figure XIV6 Tree showing dependencies between the executable and the source files
Figure XIV7 Dependency tree of our project
Figure XIV8 Directory hierarchy of our project
Figure XV1 GDB launched within GNU emacs
Figure XV2 SCCS directory hierarchy
Figure XV3 Adding two branches from delta 1.2
Figure XV4 Derivation Graph of SCCS Versions
Figure XV5 Derivation Graph of RCS Versions
Figure XV6 Introducing two branches from revision 2.4

LIST OF TABLES
Table II1 Meaning of the number 2512 in base 10
Table II2 Meaning of the number 7EFF in base 16
Table II3 Meaning of the number 7761 in base 8
Table II4 Meaning of the number 1101 in base 2
Table II5 Printing literals with printf()
Table II6 Escape Sequences
Table II7 Integer types
Table II8 Range of unsigned integers
Table II9 Range of integers using the signed magnitude representation
Table II10 Range of integers using the ones complementation representation
Table II11 Range of integers using the twos complementation representation
Table II12 ASCII coded character set (ANSI X3.4-1986)
Table II13 Basic character set
Table II14 Trigraphs
Table II15 Digraphs
Table II16 Character types
Table II17 Short types
Table II18 Int types
Table II19 Long types
Table II20 Long long types
Table II21 Boundaries of Integer types
Table II22 Example of values for floating-point numbers
Table II23 Some minimum limits defined in float.h
Table II24 Some maximum limits defined in float.h
Table II25 Examples of compatible types
Table II26 Conversion to signed integer types
Table II27 Conversion to unsigned integer types
Table II28 Conversion to real floating-point types
Table III1 Declarations mixing arrays and pointers
Table III2 Examples of implementation of a dynamic three-dimensional array
Table III3 Explicit conversions on pointer and arithmetic types
Table III4 Assignment conversions on pointer and arithmetic types
Table IV1 Arithmetic operators
Table IV2 Relational Operators
Table IV3 Equality Operators
Table IV4 Logical operators
Table IV5 Logical AND
Table IV6 Logical OR
Table IV7 Bitwise operators
Table IV8 Bitwise AND
Table IV9 Bitwise OR
Table IV10 Bitwise XOR
Table IV11 Compound assignments
Table IV12 Operator precedence in decreasing order
Table VII1 Explicit conversions
Table VII2 Implicit conversions
Table VII3 Declaration of functions returning a pointer to a function
Table VII4 Declaration of pointers to functions
Table VIII1 C Types
Table VIII2 Type of definition and linkage of inline functions
Table VIII3 Scope and storage duration of identifiers
Table VIII4 Storage-class specifiers, scopes, definitions, declarations and linkage
Table IX1 Locale categories
Table IX2 Members of the structure lconv
Table IX3 UTF-8 encoding
Table X1 Available modes for fopen()
Table X2 Specifiers of fscanf()
Table X3 Expected types of arguments for fscanf()
Table X4 Examples with fscanf()
Table X5 Flags for fprintf()
Table X6 Specifiers for fprintf()
Table X7 Types of the arguments passed to fprintf()
Table X8 fseek(): reference position
Table X9 Byte and wide-characters I/O functions
Table X10 Differences between fprintf() and fwprintf()
Table X11 Modifier l used with %c in fprintf() and fwprintf()
Table X12 Modifier l used with %s in fprintf() and fwprintf()
Table X13 Differences between fscanf() and fwscanf()
Table X14 Conversion for %c and %lc performed by fscanf() and fwscanf()
Table X15 Conversion for %s and %ls performed by fscanf() and fwscanf()
Table XI1 Some data type models
Table XI2 Conversion specifiers for strftime()
Table XII1 C11 new open modes
Table XIII1 Static and shared library comparison
Table XIV1 Dynamic macros
Table XIV2 Special targets
Table XIV3 Make options
Table XV1 GDB break points
Table XV2 GDB enable/disable
Table XV3 GDB subcommands for resuming execution
Table XV4 GDB print command
Table XV5 Displaying variables
Table XV6 Frame-related subcommands
Table XV7 SCCS commands
Table XV8 SCCS keywords
Table XV9 RCS keywords







PREFACE





Introduction
The C language was born in 1972 during the development of the Unix operating system at
Bell Labs. Building on the B language (created by Ken Thompson in 1969), Dennis Ritchie
designed the C language in order to redevelop the Unix operating system, which until then
had been written in assembly language. The goal of the researchers at BTL (Bell Labs) was to
build a portable operating system.

In 1978, Brian Kernighan and Dennis Ritchie released the renowned book The C
Programming Language. The version of the language it describes is known as K&R C. In
1989, the very first standard specification of the C language, known as C89 or ANSI C, was
released by the American National Standards Institute (ANSI). In 1990, ANSI C became an
international standard: ISO/IEC 9899:1990, or C90. Therefore, ANSI C, C89 and C90 refer
to the same C standard. In 1995, some minor features (an amendment
called ISO/IEC 9899/AMD1:1995) and corrections were added to C90: to distinguish it
from other C standards, it is referred to as C90 Amendment 1 or C95 (sometimes called
C94). In 1999, a new international C standard, adding a great number of new features and
corrections, was published under the label ISO/IEC 9899:1999. It is commonly called
C99. At the time this book is written, the current C standard, released in 2011, is ISO/IEC
9899:2011 or C11.

The book is mainly focused on C99. As a matter of fact, the philosophy of the language has
not changed over the years; the different standards corrected errors, introduced new features,
and refined some concepts without altering the core of the language. Throughout the book,
we will learn the C language as described by C90, together with the extensions brought by
C95 and C99. As far as C11 is concerned, a chapter has been dedicated to it in order to
introduce the handiest features that can be used by newcomers to the C language.

A standard C program, though the language was closely connected to the UNIX operating
system at its inception, can be compiled on any operating system and any computer
provided you have the right compiler on your machine. A C program is a human-readable
program that cannot be executed as-is by a computer. Therefore, a translator is necessary
to convert a program written in a human-understandable programming language into a
machine-executable program. This is the role of a compiler. Logically, a book about C
standards should be independent of the operating system, the hardware and the compiler,
and compilation should therefore not be broached in the book. However, since the C language
is tied to the C compiler, you cannot learn C programming without understanding the basics of
compilation! For this reason, two chapters dealing with compilation have been added.
As we cannot cover all the operating systems and compilers, we only talk about the GNU
compiler called gcc on UNIX and Linux operating systems. The rationale is that anyone can
easily and freely install a virtual machine running a GNU/Linux operating system and
directly install in it a great number of free and valuable GNU tools. Furthermore, to help
new C programmers improve their programs and correct errors in them, a chapter
briefly describing some tools closes the book.

Audience
Throughout the book, we will suppose that the reader already knows the basics of
operating systems. This book is suitable for users who wish to learn the standard C
language. It is intended neither for people who have never used a computer nor for those
who already have a good knowledge of the C language and are searching for a reference
manual.

This book does not aim to explain in detail all the features of the C standards because this
is not compatible with learning a programming language smoothly. For example, threads,
introduced by C11, are not described in the book because the topic is not suitable for
beginners: an entire book would be necessary for such a subject. The book attempts to
give a strong foundation by detailing the core of the C language. The essential themes are
thoroughly explained with simplicity, through numerous examples and figures. Trickier
aspects of the C standards are examined in several locations with different perspectives to
enable the reader to assimilate the concepts.

This book explains with simple but progressive examples the essentials of the C language
as described by the C standards C90, C95, C99 and C11.

This book is the third of a series. Two other books are also available:
o The UNIX & Linux Operating Systems: The Tutorial
o UNIX & Linux Shell Scripting: The Tutorial

Organization
The book is composed of two parts and fifteen chapters. The first part describes the C
language, the second one explains how to compile C programs, and introduces some
useful programming tools. The first part is independent from the operating system while
the second one is intended for users working on UNIX or Linux operating systems.

PART I C PROGRAMMING
Chapter 1 Overview
Chapter 2 Basic types and Variables
Chapter 3 Arrays, Pointers and Strings
Chapter 4 Operators
Chapter 5 Control Flow
Chapter 6 User-defined Types
Chapter 7 Functions
Chapter 8 C Modules
Chapter 9 Internationalization
Chapter 10 Input/Output
Chapter 11 Standard C Library
Chapter 12 C11

PART II TOOLS
Chapter 13 Compiling C Programs
Chapter 14 Makefile
Chapter 15 Programming Tools

Conventions
Throughout the book, the following conventions are used:
o Explanations appear in Liberation serif font.
o Definitions, syntaxes and synopsis are embedded within a white rectangle:
float variable_name = val;

o Examples are placed within a blue rectangle.

$ pwd
/users/michael
$ cd /etc
$ pwd
/etc

o Algorithms are enclosed within a salmon-colored rectangle


While there is input data
For each record read

.
ENDFOR
ENDWHILE

o We will use the following typographical conventions to present command syntaxes and
examples:

How to work with the book


Throughout the book, our examples are compiled on UNIX and Linux operating systems.
If you work on another operating system or use a compiler other than the GNU Compiler
gcc, please adapt the given compilation commands with your working environment.

If you are working on a Microsoft operating system and would like to type the examples as
they are shown, you could install a hypervisor [1] and then create a virtual machine
running one of the following operating systems:

o A GNU/Linux Distribution such as CentOS, OpenSUSE, Fedora, Ubuntu
o A BSD distribution such as NetBSD, FreeBSD, OpenBSD
o A UNIX distribution: Oracle Solaris.

Do not hesitate to tinker with the given examples to understand how they work. However,
please do not log in to a system as a user with an administrative role to test the examples.
In all cases, use a machine dedicated to tests or training: do not work on a
production machine.

Let us see how you have to deal with the examples that we propose in the book.
Suppose the following example is given:
$ cat first_program.c
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n");
    return 0;
}
$ gcc -o prog first_program.c
$ ./prog
This is my first C program

To test such an example, first, open a terminal. The last line of your terminal then looks
like this:
$

Every line of the terminal starts with a text known as a prompt printed by the shell. You
should not type it: here, it appears as $. Then, perform the following tasks:
o In a text editor, type the following text and save it as first_program.c:
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n");
    return 0;
}

o Compile the source file with gcc by running the following command:
$ gcc -o prog first_program.c

o Then execute it by typing ./prog followed by <ENTER>:

$ ./prog


Now let us give some recommendations to set up a programming environment on your
computer. If the tools we propose are not suitable for you, feel free to choose others
that meet your preferences. Unless specified otherwise, the examples presented throughout
the book can be compiled on any operating system. On your computer, you can compile
and run the C programs proposed in the book whatever the operating system, provided you
have installed a compiler on it beforehand.

Remember that in the book, our examples are compiled and executed on UNIX and
Linux operating systems. If your computer is running a UNIX operating system or a
UNIX-like operating system (such as Linux or BSD systems), you can write or modify C
programs with a text editor such as vi, vim, emacs, gvim, or gedit. If your computer is
running a Microsoft Windows operating system, you can write or modify your programs with a
text editor such as notepad, notepad++, XEmacs, or gvim.

Throughout the book, to show the contents of a text file, we invoke the command cat
(remember we will work on Linux and UNIX operating systems) followed by the name of
the file. Thus, the following example displays the contents of the file main.c:
$ cat main.c
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n");
    return 0;
}


A compiler is a utility designed to translate a text file written in a programming language
to a binary file (which can be then executed). Throughout the book, we will work with the
GNU compiler gcc to compile our C programs but nothing prevents you from using the
compiler of your choice.

On UNIX operating systems, and UNIX-like operating systems (Linux, BSD systems),
you can freely download and install gcc if not already present on your system. On IBM
AIX systems, you may use IBM XL C. On Oracle Solaris, you could use Oracle Solaris
Studio.

On Microsoft Windows operating systems, you can download and install MinGW, Cygwin,
Pelles C or Microsoft Visual Studio.


If you are working with an Integrated Development Environment (IDE) such as Microsoft
Visual Studio or Oracle Solaris Studio, the text editor, the compiler and programming
tools such as a debugger are already integrated within the software.

About the author


A graduate of a French engineering school, specialized in systems and networks, the author
worked as an IT consultant for several leading international companies. He started his career
by developing software on UNIX systems and Microsoft operating systems before
becoming a partner of Sun Microsystems for more than ten years; he worked as a system
architect in charge of designing robust architectures for customers in large environments,
writing specific tools on demand for the customers, training users

FEEDBACK
Any comments, questions or suggestions for improving the book are welcome. Please
send them to ynbbook@gmail.com or thomas.gabrielfr@gmail.com.

PART I
C PROGRAMMING

CHAPTER I OVERVIEW
I.1 Introduction
This chapter gives you a glance at C programming, the objective being to enter the
C world smoothly and ease the learning of the next chapters. After learning to write very
simple programs, we will take our microscope to go through C programming in detail in
the subsequent chapters.

I.2 The very first step


Depending on the complexity of the C program you intend to develop, one or more
text files could compose it. They can be read and modified by any text editor such as vi,
emacs, notepad, Notepad++, or gedit. A file that contains C code (composed of C
instructions) is known as a source file (source code).

Though a C program can be composed of several files, we will start working with a single
source file. Let us write a very simple program (called first_program.c) that just outputs to
the screen the sentence "This is my first C program".
$ cat first_program.c
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n");
    return 0;
}

Though it is quite simple, there are many things to say about this program. First, before
explaining each line, we are going to compile it. What does it mean? Compiling a C
program means translating a human-readable program into a computer-executable file. Thus,
your small program stored in the file first_program.c cannot be executed as it is by your
computer. Since your computer does not speak the C language, you have to use a
particular tool, known as a compiler, that not only understands the C language and
translates it into a language understandable by the computer (machine language) but also
writes the result in a specific format that can be managed by the operating system. A compiler
is a complex tool that actually is a suite of utilities performing many tasks ranging from
C preprocessing to the output of the binary file. The compilation steps will be fully
described in the second part of the book. For now, we will simply call compiler the utility
that produces the system-executable binary file.


Let us use the GNU compiler gcc to generate the binary file that we then execute:
$ gcc first_program.c
$ ./a.out
This is my first C program

Above, we invoked the gcc utility with no option, which generated a binary file with the
default name a.out. To give a specific name to the output file, just specify the -o option as
shown below:
$ gcc -o prog1 first_program.c
$ ./prog1
This is my first C program

Explanations:
o We invoked the gcc utility with the -o option to specify the name of the output binary file. If
you omit this option, gcc will produce a binary file with the name a.out.
o The last argument of the first command is the name of the file holding the C code you
have written.
o The second command (i.e. ./prog1) executes the binary file.

You may encounter several issues when trying to compile your program. The first one is
that the gcc compiler is not installed at all on your system. In this case, just install it, and go
on.

The second one is that the gcc tool is installed on your system but is not in a directory listed in
the PATH environment variable:
$ gcc -o prog1 first_program.c
/usr/bin/ksh: gcc: not found [No such file or directory]
$ which gcc
no gcc in /usr/bin /usr/sbin
$ PATH=$PATH:/opt/freeware/bin
$ which gcc
/opt/freeware/bin/gcc
$ gcc -o prog1 first_program.c

Explanations:
o First command: we invoked gcc but it failed
o Second command: we invoked the which command, which confirmed that gcc was
not in any directory listed in the PATH variable.

o Third command: we added to the environment variable PATH the directory in which the
gcc command can be found. In our example, the gcc tool was installed in /opt/freeware/bin.
o Fourth command: we invoked again the which command that showed the directory in
which gcc was located.
o Fifth command: we compiled successfully our C program.

Another issue you could meet is a typo in your C program:
$ gcc -o prog1 first_program.c
first_program.c: In function 'main':
first_program.c:5:1: error: expected ';' before '}' token

Don't be afraid of that: this will often happen in your long life as a C programmer;
fortunately, compilers will tell you where the problem is and give you enough details to
correct it. In our example, we forgot a semicolon as shown below:
$ cat first_program.c
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n")
    return 0;
}

So far, we have learned to generate, from our C program, a binary file that can be executed
by the computer. Now, let's go back to our C code:
$ cat first_program.c
#include <stdio.h>

int main(void) {
    printf("This is my first C program\n");
    return 0;
}

First, you can notice our program name has the .c extension. This is not compulsory but it
is highly recommended to use the .c extension for your C source files. You will understand
why soon. The .c extension is an indicator for us (and everyone reading our program)
telling: "this is a text file, holding a human-readable program written in the C language".

C code is made of a set of actions, known as statements, telling the computer what to
do. In our C code, we can see two main components:
o #include <stdio.h>.

o The main() function and its code.



The #include statement is not actually a C statement but a preprocessor directive. For now,
we can consider the preprocessor to be part of the compiler itself. A preprocessor
directive is just an instruction (an action) meant for the compiler. Here, the directive #include
tells the compiler to copy the contents of the file stdio.h to the place where the directive is
found before actually compiling the source file. Everything happens as if the contents of
stdio.h were actually present in the source file. Later, we will fully explain why we do that.
For now, you just have to know that the stdio.h file contains information about the I/O routine
printf() allowing us to display our text. Files included in that way are known as header files:
their names have the .h extension. Don't worry, this is not relevant yet. We are just learning
to make our first step.

The second part of the program is the main() function. First, do you know what a
function is? A function is another name for a subroutine or routine. If you have never
programmed in your life, those words do not help much. A function is just a named
set of statements telling the computer what to do. For example, the function sum2numbers()
could be composed of two statements: the first one sums the numbers you give it and the
second one displays the result on the screen. Functions are very important because not
only will they save you time, but they also simplify and lighten your programs dramatically.
Instead of writing the same code several times in your program, you could write it only
once as a function and then call it each time you need it. In our example, we called the
printf() function that is provided by the C library. A library is a set of functions written by
you or someone else that can be incorporated into your programs. Hence you can call
printf() each time you need to display text without having to write code for that: it has
already been done for you, just call it.

You may have noticed that we have appended parentheses () to the names referring to
functions: it is our way to indicate we are talking about a function. Thus, throughout the
book, we do not write myfunc but myfunc() if we are referring to a function.

Remember that any C program must contain one and only one main() function. Otherwise,
your program will not be compiled. The compilation of the following code fails because
there is no main() function:
$ cat dummy_program_2.c
#include <stdio.h>

void display(void) {
    printf ("This is my first C program\n");
}
$ gcc dummy_program_2.c
Undefined                       first referenced
 symbol                             in file
main                                /usr/lib/crt1.o
ld: fatal: symbol referencing errors. No output written to prog1
collect2: ld returned 1 exit status

The reason why the main() function is required is that it is the function executed directly
when the program is run [2]. This implies that the main() function is the core of your
program or, put another way, its scheduler or conductor.

You have noticed the main() function is composed of three parts:
o int
o main(void)
o {
    printf ("This is my first C program\n");
    return 0;
}


The third part of the main() function is known as a block or a function body. It is composed
of statements enclosed between braces ({}). The left brace indicates the beginning of the
statements and the right brace terminates the set of statements of the function. Take note
that the braces can be alone on a line or share it with statements. Generally, the left brace
is on the same line as the function name or alone, while the right brace is alone as in the
following example:
$ cat first_program.c
#include <stdio.h>

int main(void)
{
    printf ("This is my first C program\n");
    return 0;
}

In our example, the body of the main() function contains the statement
printf ("This is my first C program\n") displaying the text This is my first C program on the
screen. Remember that any C statement must end with a semicolon [3]. I am sure you have
noticed the strange symbol \n at the end of the text to be displayed. It means newline; that
is, after displaying the text, the cursor goes to the next line. Try out the same example
without \n.

The second part of the function indicates three things:

o The identifier (name of the function) that is main


o The type of the identifier is a function. This is indicated by the parentheses.
o The arguments that can be passed to it, specified between parentheses. We will not talk
about them now. When a function accepts no argument, it takes the keyword void as in
our example.

The first part of the main() function (i.e. int) is the type of the return value of the function.
In the C language, a function can return something (i.e. a value) or nothing. When it
returns something, you have to specify the type of the value it returns (we will explain C
types later). In the main() function, if you do not specify a return value, the default value 0
is returned (C99 and C11). Remember that the main() function always returns an
integer [4] and you cannot change that. The rationale is that, originally, every command
under the UNIX system terminated with an integer known as an exit status, telling the
UNIX shell whether it had ended successfully or not. Consequently, we have to
specify an exit status (ranging from 0 to 255) for our program. This can be accomplished
through the return statement as shown below:
$ cat first_program_ok.c
#include <stdio.h>

int main(void) {
    printf ("This is my first C program\n");
    return 0;
}

The value of 0 as a return value tells the operating system that our program ends with the
value 0 (In UNIX, Linux, and BSD systems, 0 means OK, any other value indicates a
failure). If we compile it and then run it on a Linux box, we would get something like this:
$ gcc -o prog_ok first_program_ok.c
$ ./prog_ok
This is my first C program
$ echo $?
0

We could specify any return value ranging from 0 to 255:


$ cat first_program_ko.c
#include <stdio.h>

int main(void) {
    printf ("This is my first C program\n");
    return 10;
}

If we compile it and then run it:


$ gcc -o prog_ko first_program_ko.c
$ ./prog_ko
This is my first C program
$ echo $?
10

As you have guessed, under the shell [5], $? shows the exit status of the last command you
have executed. Normally the last statement of the main() function should be something like
return return_value.

Though a default value is automatically set if no return statement is found in the main()
function, make sure you specify a return value yourself, so that you keep control of the
behavior of your code. If you do not specify a return value in the main() function, the
compiler will do it for you: C99 or C11 compilers set it to 0 [6].

It is worth noting that since the C language can be used on other operating systems, a
successful exit status may be a value different from 0. For this reason, the macros
EXIT_SUCCESS and EXIT_FAILURE have been specified (in the header file stdlib.h). We will
explain later what a macro is. For now, consider a macro to be a symbolic name representing
a value. On the UNIX system (and UNIX-like systems), EXIT_SUCCESS is a synonym for 0
and EXIT_FAILURE is a synonym for 1. Since those macros are defined in the header file
stdlib.h, you have to include it if you wish to use them. Thus, the program can be rewritten
as follows:
$ cat first_program.c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    printf ("This is my first C program\n");
    return EXIT_SUCCESS;
}

As you have noticed, the body of the main() function is composed of two statements, each
ended by a semicolon. Although the C standard allows you to put several statements on the
same line, which saves space, it is always better to write readable code and therefore to
avoid placing several statements on the same line. When writing C code, your goal
is not to save space but to be readable. For example, our first program could have been
written in two lines like this:

$ cat first_program.c
#include <stdio.h>
int main(void) {printf("This is my first C program\n");return 0;}

In summary, a C program, whatever its complexity, has at least one source file (the main
source file) that looks like this:
#include

int main(void) {

return retval;
}

The main source file is sometimes called main.c to indicate that it holds the main() function,
but you can give it any name.

I.3 Variables
Whatever the complexity of your program, you will need to store data coming from
outside the program itself, or from computations, for later use. The best way to
store data temporarily, while the program is running, is to use variables. A variable is
just a piece of the computer's memory storing a value. Since a program may have several
variables, how do we distinguish them? Simply by giving them names. If we give the name X
to a variable and fill it with a value, we can use it again just by calling it by its name.

A variable can be viewed as a box. In C, before you can work with a variable, you have
to specify the size of your box: in a way, you tell the compiler to reserve a piece of
memory of a certain size that you intend to use later. For example, if you think
you will work with big numbers (let us say 167900765456709876477890), it is wise to ask
for a bigger box than if you plan to work with small numbers (let us say numbers ranging
from 0 to 999). If you request a small box and put more in it than it can hold, you will
get unexpected behavior.

So, a variable is characterized by its name and its size. The name allows us to set or get a
value. The variable's size ensures that we will have enough space in the computer's
memory to store our values. Over time, a variable may hold different values, but always
values of the same kind. This is the reason why a variable has a type indicating what it is
supposed to store. The C language has a number of predefined types described by the C
standard, but also user-defined types. We first start with some basic types defined by the
C standard.

As said earlier, before working with a variable, you have to specify its name and its type
as shown below:
$ cat prog_var1.c
#include <stdlib.h>

int main(void) {
    int age;
    return EXIT_SUCCESS;
}

Explanation:
o At the very first line, we include the header file stdlib.h in order to use the macro
EXIT_SUCCESS.
o int is the type of the variable age. The type int denotes the set of integral numbers [7],
such as 1, 20, -6, 0 or -3, that we are going to use.
o age is the identifier (name) of the variable. A variable name is composed of letters, digits
and underscores but cannot start with a digit.

In the example prog_var1.c, we tell the compiler that we want to store a number into the
variable age. This guarantees that while the program is running we will have a piece of
memory in which we can store a number that may vary over time. Next, we can give a
value to the variable:
$ cat prog_var2.c
#include <stdlib.h>

int main(void) {
    int age;
    age = 44;
    return EXIT_SUCCESS;
}

Here the equals sign (known as the assignment operator) allows us to give a value to a
variable. Above we put the integer value 44 into the age variable. The example could also
have been written like this:
$ cat prog_var3.c
#include <stdlib.h>

int main(void) {
    int age = 44;
    return EXIT_SUCCESS;
}

Above, the number 44 on the right side of the equals sign is said to be an integer literal or
integer constant. The word literal means that the value is written as is in the source code:
it is known and fixed at compilation time, even before the program runs.

What if we displayed the contents of the age variable?
$ cat prog_var4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int age = 44;

    printf ("age variable=%d\n", age);
    return EXIT_SUCCESS;
}

Explanations:
o The statement int age = 44 reserves a memory space called age that will store an integer,
and initializes the age variable with the value 44.
o The printf statement displays the text age variable= followed by the contents of the age
variable. %d is called a specifier: it tells printf() the type of its argument (here age) so that
it can display it correctly.

Let us compile and run it:
$ gcc -o prog_var4 prog_var4.c
$ ./prog_var4
age variable=44

The printf() function can display several arguments. Its general syntax is given below:
printf(fmt, arg1, arg2)

The very first argument, fmt, is known as a format: it gives the type of the
subsequent arguments. The format appears between double quotes and is composed of text
and specifiers. A specifier is a letter preceded by the % symbol, expressing how the
corresponding argument should be interpreted. For example, %d is used to display an
integer, %s a text and %f a floating-point number.

The following example displays the contents of the variables X and Y:
$ cat prog_var5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = 10;
    int Y = 20;

    printf ("First argument=%d and Second Argument=%d\n", X, Y);
    return EXIT_SUCCESS;
}

$ gcc -o prog_var5 prog_var5.c
$ ./prog_var5
First argument=10 and Second Argument=20

The next example displays two variables of different types: the first one is a negative
integer and the second is a floating-point number:
$ cat prog_var6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = -10;
    float Z = 3.14;

    printf ("X holds %d\nZ holds %f\n", X, Z);
    return EXIT_SUCCESS;
}
$ gcc -o prog_var6 prog_var6.c
$ ./prog_var6
X holds -10
Z holds 3.140000

Here, we can add two notes:


o The format of the printf() function contains \n twice, indicating that a newline is inserted
after displaying the value of each variable. Thus, you could also have written the previous
example like this:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = -10;
    float Z = 3.14;

    printf ("X holds %d\n", X);
    printf ("Z holds %f\n", Z);
    return EXIT_SUCCESS;
}

o You cannot swap the places of X and Z while keeping the specifiers as they are.
Otherwise, you will get undefined behavior. If you swap the places of the variables,
you must also swap the corresponding specifiers as shown below:
$ cat prog_var7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = -10;
    float Z = 3.14;

    printf ("Z holds %f\nX holds %d\n", Z, X);
    return EXIT_SUCCESS;
}
$ gcc -o prog_var7 prog_var7.c
$ ./prog_var7
Z holds 3.140000
X holds -10


The third basic type we would like to introduce is the string. A string is a series of
characters forming a logical unit. In C, it can be declared as char *. Consider the following
example:
$ cat prog_var8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *my_text="This is my first program";

    printf ("%s\n", my_text);
    return EXIT_SUCCESS;
}
$ gcc -o prog_var8 prog_var8.c
$ ./prog_var8
This is my first program

Explanations:
o The main() function is composed of three statements. The first one declares the variable
my_text, the second one displays it and the third one returns the exit status.
o The statement char *my_text="This is my first program" tells two things: the variable my_text
is supposed to hold a series of characters and it stores the text This is my first program. On
the left side of the equals sign, we can see the name of the variable and its type. On the
right side of the equals sign lies its value (a string literal), that is, This is my first program
enclosed between double quotes. Double quotes are not part of the value assigned to the
variable; they are only delimiters for the string literal: the first double quote starts the
string and the second one terminates it. Obviously, this implies that if you do not close a
string, writing only one double quote, you will get an error as in the example below:
$ cat prog_var8_err.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *my_text="This is my first program;

    printf ("%s\n", my_text);
    return EXIT_SUCCESS;
}
$ gcc -o prog_var8_err prog_var8_err.c
prog_var8_err.c: In function 'main':
prog_var8_err.c:4:18: warning: missing terminating " character
prog_var8_err.c:4:4: error: missing terminating " character
prog_var8_err.c:6:4: warning: initialization makes pointer from integer without a cast


So far, we have only assigned a literal to a variable. Fortunately, you can store the
contents of a variable into another variable: you assign a variable to another variable as
shown below:
$ cat prog_var9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = -3;
    int Y = X;

    printf ("X=%d and Y=%d\n", X, Y);
    return EXIT_SUCCESS;
}
$ gcc -o prog_var9 prog_var9.c
$ ./prog_var9
X=-3 and Y=-3

In our example, we placed the contents of the X variable into the variable Y. The equals
sign allows giving a value to a variable: the container, known as an lvalue, is on the left
side of the equals sign and the contents are on the right side. On the right side, you can
place a literal or another variable.

Once declared (a variable must be declared only once), a variable can be reused as often as
you wish as shown below:
$ cat prog_var10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int X = 0;
    printf ("X=%d\n", X);

    X = 1;
    printf ("X=%d\n", X);

    X = 2;
    printf ("X=%d\n", X);

    return EXIT_SUCCESS;
}
$ gcc -o prog_var10 prog_var10.c
$ ./prog_var10
X=0
X=1
X=2

I.4 Comments
Comments within a program are of great importance, particularly if it is large or complex.
They are used to describe statements, functions, algorithms... They are ignored by the
compiler. You have two ways to write comments:
o The characters /* introduce a comment that ends with the characters */. It can span
several lines. Comments enclosed between /* and */ can be used anywhere, even within
statements.
o The characters // introduce a comment that ends at the end of the line (when you press
the <ENTER> key). This form was introduced by C99.

Here is a program containing examples of comments:
#include <stdio.h>
#include <stdlib.h>

/*
The program shows examples of comments
*/
int main(void /* Comment: no parameter used */ ) {
    // This comment is held in a single line
    // This is another single-line comment

    /* This comment
       spans over
       several lines
    */
    int nb = 10; // nb is a variable
    int x = 7; /* x is also a variable */

    x = 10 + /* dummy comment */ 8;

    return EXIT_SUCCESS;
}

I.5 Operations
Most of the operations in the C language are quite natural and easy to understand but, as
we will see later, you must pay attention to the types of variables and literals. Let us
start with the basic arithmetic operations: addition, subtraction, multiplication and division.
The example below adds two integers:
$ cat prog_add1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int p = 1 + 2;

    printf ("p=%d\n", p);
    return EXIT_SUCCESS;
}
$ gcc -o prog_add1 prog_add1.c
$ ./prog_add1
p=3

Explanation:
o The statement int p = 1 + 2 performs three different actions:
   - It declares the variable p as an integer.
   - It computes the sum of the two integer literals 1 and 2. The parameters (here the
     literals 1 and 2) appearing on either side of the + operator are known as operands.
     An operand is an argument of an operator.
   - It assigns the result of the operation 1 + 2 to the p variable.
o The printf() function displays the p variable that holds the value 3.

Here again, we used the assignment operator (equals sign) to store the output of an
operation into a variable. The operation appears on the right side of the operator. Of
course, you can sum several operands as below:
$ cat prog_add2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int p = 1 + 2 + 3;
    printf ("p=%d\n", p);
    return EXIT_SUCCESS;
}
$ gcc -o prog_add2 prog_add2.c
$ ./prog_add2
p=6

The same + operator can operate with integers as well as with floating-point numbers. The
following example adds floating-point numbers:
$ cat prog_add3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 3.14 + 1;

    printf ("x=%f\n", x);
    return EXIT_SUCCESS;
}
$ gcc -o prog_add3 prog_add3.c
$ ./prog_add3
x=4.140000


The subtraction operation works in the same way (the operator is the minus sign -):
$ cat prog_sub.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int p = 1 - 2;
    printf ("p=%d\n", p);
    return EXIT_SUCCESS;
}
$ gcc -o prog_sub prog_sub.c
$ ./prog_sub
p=-1

For the multiplication operation, the operator is the asterisk *:
$ cat prog_mult.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 3.14 * 2;
    printf ("x=%f\n", x);
    return EXIT_SUCCESS;
}

$ gcc -o prog_mult prog_mult.c
$ ./prog_mult
x=6.280000

We finish with the division operation, which uses the slash symbol / as an operator:
$ cat prog_div.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 2.1/3.2;
    printf ("x=%f\n", x);
    return EXIT_SUCCESS;
}
$ gcc -o prog_div prog_div.c
$ ./prog_div
x=0.656250

The C operations seem to be obvious, working as you learned in your math courses... but
this is not quite the case. There remain many things to say about them in
the next chapters. Here is a taste of the strangeness of the C language:
$ cat prog_div2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 2/3;
    printf ("x=%f\n", x);
    return EXIT_SUCCESS;
}
$ gcc -o prog_div2 prog_div2.c
$ ./prog_div2
x=0.000000

No, it is not an error! The result of the operation 2/3, as we coded it, is actually 0! You
may have expected something like 0.666667. We will explain why later.

I.6 Control flow


So far, we have worked with sequential statements: statements are executed in order of
appearance. It often happens that we want to execute one or more actions only if specific
conditions are met, or that we want some tasks to be accomplished several times until some
condition evaluates to true (or false). With no specific mechanism, your program always
runs in the same way, always produces the same output and cannot adapt to input data.
Fortunately, the C standard defines several statements that allow you to perform actions
according to the circumstances: they are known as control flow statements.

Let us have a look at the if statement. In this chapter, we briefly describe only the
following two forms:
if (condition) {
    statement_list;
}

if (condition) {
    statement_list;
} else {
    else_statement_list;
}

Where:
o condition is an expression. As we describe the C language, we will give more and more
details about C expressions. Here, condition is an expression that can evaluate to true or
false such as x > 8.
o statement_list is a set of statements, each terminated with a semicolon. Generally,
there is one statement per line, but you could write several statements on the same line.
For clarity, statements are usually separated by one or more newlines (after the semicolon).
o else_statement_list is a set of statements, each terminated with a semicolon.
o Blanks and newlines can be placed before and after the left and right braces. They have
no effect.
o Blanks and newlines can be placed before and after the left and right parentheses. They
have no effect.

The first form is composed of two parts: if (condition) and { statement_list; }. The first part is
composed of the keyword if and a condition between parentheses: its task is to evaluate the
expression condition: if it is true, the second part of the statement is executed. The second
piece of the if statement is known as a block or body of the if statement: it consists of a set
of statements embedded in braces that are executed only if the expression condition is true.

The second form is composed of four parts:
o if (condition)
o { statement_list; }
o else
o { else_statement_list; }

The first two parts are identical to the first form and have the same meaning. The last two
parts complete the first form: they mean if condition is not true (represented by the keyword
else) the block of else is executed. That is, if condition is true, the first block is executed,
otherwise the second one is executed.

Now, let us talk a little bit about relational expressions to help us better understand how
the if statement works. A relational expression is an expression that compares two values
and returns a value (0 for false or 1 for true). Here are some relational expressions:

o A > B: returns 1 (which means true) if A is greater than B. Otherwise, it returns 0 (false).
o A < B: returns 1 (true) if A is less than B. Otherwise, it returns 0 (false).
o A == B: returns 1 (true) if A is equal to B. Otherwise, it returns 0 (false).

Consider the following example:
$ cat prog_cflow1.c
 1 #include <stdio.h>
 2
 3 int main(void) {
 4     int num;
 5     int rval;
 6
 7     printf("Please, enter an integer less than or equal to 9: ");
 8     scanf("%d", &num);
 9
10     if (num > 9) {
11         printf("Failure, the number is too big\n");
12         rval = 1;
13     } else {
14         printf("OK, the number is the requested range\n");
15         rval = 0;
16     }
17
18     return rval;
19 }

Explanation:
o Line 4: the num variable is declared as an integer. It will store a number read from the
keyboard.
o Line 5: the rval variable is declared as an integer. It will hold the return value of the main()
function.
o Line 7: the printf() function displays a text prompting the user to enter an integral number
less than or equal to 9.
o Line 8: the scanf() function reads the number the user has typed, and stores it into the num
variable. The function will be described later. Here, we use it just to get the number that
the user has typed. The ampersand (&) before the num variable will be explained when we
talk about pointers.
o Line 10: the if...else statement is a control flow statement, more specifically a
conditional statement. It means: if the variable num holds a value greater than 9 (num > 9),
then lines 11 and 12 are executed. Otherwise, lines 14 and 15 are executed. As you have
noticed, the statement has two parts, if and else, each one having its own block [8].
o Line 11: it displays the message Failure, the number is too big. This is the first statement of
the if block. If the condition num > 9 is true, this line and the next one are executed.
o Line 12: this is the second statement of the if block. The rval variable is set to 1. The rval
variable holds the return value of the main() function.
o Line 13: This line tells two things. First, the if block ends with the right curly brace.
Secondly, the alternative introduced by the reserved word else starts.
o Line 14: this line is the first statement of the else block. It is run only if the condition of
the if statement is not met. That is, only if the variable num stores a number less than or
equal to 9.
o Line 15: this is the second statement of the else block. The rval variable is set to 0. The
rval variable holds the return value of the main() function.
o Line 16: end of the else block.
o Line 18: the return value of the main() function appears here.
o Line 19: the right brace ends the block of the main() function.

Now, compile it and run it:
$ gcc -o prog_cflow1 prog_cflow1.c
$ ./prog_cflow1
Please, enter an integer less than or equal to 9: 10
Failure, the number is too big
$ echo $?
1

Above, we typed the number 10: the number is out of range. Let us run the program again,
but this time we type the integer 8:
$ ./prog_cflow1
Please, enter an integer less than or equal to 9: 8
OK, the number is the requested range
$ echo $?
0

Now, suppose we wanted the user to type a non-negative integral number less than or equal
to 9 (in other words, a decimal digit). In this case, our if condition is composed of two
conditions: num >= 0 and num <= 9. Since both sub-conditions must be true at the same time,
we have to use the AND operator represented by the && symbol. Thus, the condition num
>= 0 && num <= 9 is true only if the sub-condition num >= 0 is true and the sub-condition num
<= 9 is also true. This means that if one of the sub-conditions is false, the condition num >= 0
&& num <= 9 is also false. Here is the program:

$ cat prog_cflow2.c
 1 #include <stdio.h>
 2
 3 int main(void) {
 4     int num,rval;
 5
 6     printf("Please, enter an integer in the range [0,9]: ");
 7     scanf("%d", &num);
 8
 9     if (num >= 0 && num <= 9) {
10         printf("OK, the number is the range [0,9]\n");
11         rval = 0;
12     } else {
13         printf("Failure, the number is out of range\n");
14         rval = 1;
15     }
16
17     return rval;
18 }

If we compile it and run it:


$ gcc -o prog_cflow2 prog_cflow2.c
$ ./prog_cflow2
Please, enter an integer in the range [0,9]: -1
Failure, the number is out of range
$ ./prog_cflow2
Please, enter an integer in the range [0,9]: 3
OK, the number is the range [0,9]
$ ./prog_cflow2
Please, enter an integer in the range [0,9]: 10
Failure, the number is out of range

If you have a look at our C source code in prog_cflow2.c, more specifically line 4, you can
see a new way of declaring several variables of the same type. The statement int num,rval is
the same as:
int num;
int rval;

The second type of control flow statement is the loop. A loop is a block (i.e. a group of one
or more statements) executed several times. The C language has three loop statements. Let
us have a look at the while loop: the statement starts with the reserved word while; it allows
running a block as long as a condition is true. The following example displays the ten
decimal digits:
$ cat prog_cflow3.c
 1 #include <stdio.h>
 2 #include <stdlib.h>
 3 int main(void) {
 4     int i = 0;
 5
 6     printf("Displaying digits:\n");
 7
 8     while ( i < 10 ) {
 9         printf ("%d\n", i);
10         i = i + 1;
11     }
12
13     return EXIT_SUCCESS;
14 }

Explanation:
o Line 4: we declare the i variable as an integer, initialized to the value 0. It stores the
current digit that will be displayed.
o Line 8: the loop statement starts with the reserved word while. It is composed of two
parts. The first one is the condition and the second one is the body of the while loop. The
condition must be met in order to execute the statements in the block (i.e. loop body)
between the pair of curly braces. The condition is checked; if it is true, the block is
executed and the condition is checked again. This process continues until the condition
becomes false, which causes the loop to end. Here, the condition i < 10 is true as long as
the variable i holds a value less than 10.
o Line 9: the variable i is output to the screen.
o Line 10: the i variable is incremented. At the first iteration, i holds 0 before that
statement. After executing the statement, i holds 1: i = 0 + 1. Then, the while
condition i < 10 is checked again and, since it is still true (the condition 1 < 10 is true), the
block is executed again: the i variable (holding 1) is displayed and then incremented: i = 1
+ 1. And so on. This process is repeated until i holds a value greater than 9. After the
tenth iteration, i holds 10 and therefore the condition i < 10 becomes false, which ends the
loop without running the body of the while loop again.
o Line 11: the right curly brace ends the while block.

After compiling our program, we run it to obtain this:
$ gcc -o prog_cflow3 prog_cflow3.c
$ ./prog_cflow3

Displaying digits:
0
1
2
3
4
5
6
7
8
9

The while loop looks like the if statement. The latter is executed once if the condition is
true. The former is executed as long as the condition is true.

I.7 Functions
C source code is composed of statements telling the computer what to do. In the same
way as a writer groups sentences into paragraphs, a C programmer gathers statements to
form blocks. Thus, as we saw, a block can be the body of a conditional statement (e.g. if
statement), or of a loop. There is another way to use a block in order to make it reusable.

A function is a named block that can accept input arguments (as if they were part of the
block) and may return a value. This is a very interesting feature since not only does it
allow multiple executions of the same block but the block itself can also depend on input
values.

Let us start by explaining the return value of a function. The return value of a function is
the value given to the return statement. When the return statement is met, the function
terminates and control goes back to the point where it was called.
$ cat prog_func1.c
 1 #include <stdio.h>
 2 #include <stdlib.h>
 3
 4 float pi_func(void) {
 5     return 3.14;
 6 }
 7
 8 int main(void) {
 9     float x = pi_func();
10     printf("The return value is %f\n", x);
11     return EXIT_SUCCESS;
12 }
$ gcc -o prog_func1 prog_func1.c
$ ./prog_func1
The return value is 3.140000

Explanation:
o Line 4: we declare the pi_func() function. It takes no input argument (void) and returns a
floating-point number (type float).
o Lines 4-6: the body of the function starts at line 4 (with the left curly brace) and ends at
line 6 (with the right curly brace). Line 5 holds the single statement of the function:
return 3.14. So, it does nothing but return the number 3.14.
o Line 8: the main() function starts at line 8 and ends at line 12. Its block is made up of
three statements.
o Line 9: the x variable is declared as a floating-point number and is initialized to the
return value of the pi_func() function. We can note that on the left side of the equals sign is
the variable x (the container) and on the right side lies the function call (the contents). We
tell the computer to execute a function just by specifying its name followed by parentheses.
In our example, x = pi_func() calls the function pi_func() that is then executed. The statements
of the pi_func() function are executed until a return statement is found or until the block
terminates with the right curly brace. Here, the function returns the value 3.14. Then, the
value 3.14 is assigned to the x variable.
o Line 10: the printf() function shows the value of the x variable.
o Line 12: end of the main() function.

This C source file prog_func1.c is equivalent to:
$ cat prog_func2.c
#include <stdio.h>
#include <stdlib.h>

float pi_func(void) {
    return 3.14;
}

int main(void) {
    printf("The return value is %f\n", pi_func());
    return EXIT_SUCCESS;
}
$ gcc -o prog_func2 prog_func2.c
$ ./prog_func2

The return value is 3.140000

You can pass values to functions. What does that actually mean? It means you can provide a
function with initialized variables as if they were declared in its block. Look at the
function show_arg():
$ cat prog_func3.c
 1 #include <stdio.h>
 2 #include <stdlib.h>
 3
 4 void show_arg(int n) {
 5     printf("Argument is %d\n", n);
 6 }
 7
 8 int main(void) {
 9     show_arg(5);
10     show_arg(-4);
11     return EXIT_SUCCESS;
12 }
$ gcc -o prog_func3 prog_func3.c
$ ./prog_func3
Argument is 5
Argument is -4

Explanation:
o Line 4: the show_arg() function takes one argument n of type int and returns no value.
When a function returns nothing, the reserved word void is used. It tells the compiler and
anyone wishing to call it: "Do not make an assignment, no value is returned."
You have noticed that, unlike what we have seen so far, our show_arg() function has a
declaration of a variable inside the parentheses. This means that we can pass data to the
function: the integer variable n will be set to the value that you pass to the function
when you invoke it.
o Line 5: We display the value of the variable n that was passed.
o Line 8-12: we define the main() function.
o Line 9: we invoke the function show_arg() with the value 5. Everything happens as if, in the
block of the show_arg() function, we had written the statement int n = 5. The show_arg()
function is executed and displays the value of the provided argument n: show_arg(5) displays
Argument is 5 on the screen.
o Line 10: we invoke the function show_arg() with the value -4. Everything happens as if the
statement int n = -4 were part of the body of the show_arg() function. The show_arg() function
executes and displays the value of the provided argument n: show_arg(-4) displays the text
Argument is -4 on the screen.

I.8 Macros
Besides the features of the C language, the C pre-compiler has some interesting facilities
such as directives. We will explain in detail how to work with the pre-compiler directives
later in the book. For now, we can consider a directive as a task performed by the compiler
before it actually starts compiling a program. One of the most important directives is
#define, which creates macros. It is used as follows:
#define macro_name macro_definition

It creates a kind of alias, called macro_name, for a series of characters macro_definition. When
the pre-compiler meets the name macro_name in the source code (outside string literals), it
simply replaces it with macro_definition. Here is an example:
$ cat macro1.c
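#include <stdio.h>
#include <stdlib.h>

#define NAME_MAX_LEN 64
#define ARRAY_LEN 128

int main(void) {
   /* Inside the string literals the names are kept as-is; only the
      second argument of each call is replaced by the macro definition */
   printf("NAME_MAX_LEN=%d\n", NAME_MAX_LEN);
   printf("ARRAY_LEN=%d\n", ARRAY_LEN);

   return EXIT_SUCCESS;
}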
$ gcc -o macro1 macro1.c
$ ./macro1
NAME_MAX_LEN=64
ARRAY_LEN=128

The #define directives are usually placed after the #include directives. Unlike a variable,
a macro cannot be assigned a new value while the program runs.

I.9 Line continuation


The newline character (generated when you hit the <ENTER> key) ends a line: it is the
end-of-line indicator. The C language allows statements to span several lines as if they
were written on the same line. This can be done by using the backslash character \ at the
end of each intermediate line as in the following example:
$ cat line_continuation.c

#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf("This line \
spans over \
three lines\n");

   return EXIT_SUCCESS;
}
$ gcc -o line_continuation line_continuation.c
$ ./line_continuation
This line spans over three lines

It is often used with long macros.


I.10 Portability
I.10.1 Undefined, unspecified and implementation-defined behaviors
Some behaviors are not fully ruled by the C standard: it leaves them to the compiler
(called the implementation by the C standard). Undefined behaviors must be avoided, while
unspecified and implementation-defined behaviors must be used with care in order to get
the expected results.
o Undefined behaviors: when some errors occur, the standard imposes no requirement and the
compiler is free to choose how to manage them: it may generate an error, ignore them or
produce arbitrary results. For example, signed integer overflow is an undefined behavior.
o Unspecified behaviors: the C standard offers several possibilities and lets the compiler
choose among them. The choice does not have to be described by the documentation of the
compiler. For example, when a function is called, the evaluation order of the arguments is
unspecified, such as in f(x+1, y*2, z).
o Implementation-defined behaviors: some unspecified behaviors implemented by the
compiler are required to be documented; they are called implementation-defined
behaviors. For example, the number of bits composing a byte.

I.10.2 Compliance
A C program is said to be strictly conforming if it uses only the features and libraries
described by the C standard and does not depend on undefined, unspecified or
implementation-defined behaviors. Such a program is portable.

The C standard considers two kinds of environments: translating environments and
executing environments. A translating environment is a system that allows compiling C
programs for an executing environment. An executing environment is a system that runs
programs compiled in a translating environment. An environment can be both a translating
and an executing environment.

The C standard distinguishes two kinds of executing environments: hosted environments
and freestanding environments.

A hosted environment is an operating system having several facilities, such as files, that
can be used by the program. A compiler used in a translating environment to generate a
binary program for a hosted environment is called a hosted implementation by the C
standard. It is said to be conforming if it can compile any strictly conforming program.

A freestanding environment does not have all the facilities usually found in operating
systems. An example of a freestanding environment is the firmware that manages an embedded
system [9] dedicated to specialized tasks. A freestanding environment is not a complete
operating system but a basic and specialized environment. In such conditions, a
conforming C program running in a freestanding environment can use only a subset of the
features defined by the C standard. A compiler used in a translating environment to
generate a binary program for a freestanding environment is called a freestanding
implementation by the C standard. It is said to be conforming if it can compile any strictly
conforming program that does not use the complex types and uses only a limited set of
libraries corresponding to the header files <float.h>, <stdint.h>, <limits.h>, <iso646.h>, <stdarg.h>,
<stdbool.h>, and <stddef.h>.

As far as we are concerned, throughout the book, we will work on an operating system,
that is both a hosted environment and a translating environment, to build and run our
programs.

Throughout the book, we will invoke gcc with the options -std=standard -pedantic, where
standard is c90, c99 or c11. Unless specified otherwise, when compiling our programs, we
will use C99 as the default standard: most of our programs will be compiled with the
options -std=c99 -pedantic. You could also add the option -Wall, which provides useful
warnings when compiling.


CHAPTER II BASIC TYPES AND VARIABLES

II.1 Introduction
In the previous chapter, we took a glance at what a C program looks like. Although it is
tempting to think that the C concepts are quite easy to grasp, and therefore easy to use,
there are nevertheless many subtle aspects that you will find out as we move along through
the book.

This chapter does not cover user-defined types, structures, unions, arrays and pointers.
Those types are derived from basic types. We will talk again about variables and types later
in the book. For now, let us go deeper into two notions seen in the previous chapter: basic
types and variables.

When you write a program, whatever the language used, you tell the computer what tasks
it has to accomplish. There are two kinds of tasks: complex and elementary. Complex tasks
are made up of elementary tasks. For example, in the same way as the task "do the
housework" is composed of several basic actions (cleaning the floor, washing the dishes,
dusting), a program is made up of basic statements.

Statements act upon data in order to produce a specific output. We can enumerate two
kinds of data:
o Data that is already known at the time you write the program. It is then present within
the program in the form of literals, also known as constants.
o Data that is not known before running the program. This kind of data is dynamic: it
varies over time and each run may produce a different result. It can come from a
calculation within the program or from outside through I/O functions.

Both can be stored within a piece of the computer's memory known as a variable. Let us
start with an introduction to numeral systems before broaching basic types.

II.2 Numeral systems


A numeral system is a conventional way to express numbers. In computing, four numeral
systems are commonly used: the binary, octal, decimal and hexadecimal systems. All of them
use a positional notation. That is, if n is a number, in base b, it is expressed as
$n = d_0 \times b^0 + d_1 \times b^1 + \dots + d_p \times b^p$.

A base b is composed of b digits. In base b, a number written WXYZ means
$W \times b^3 + X \times b^2 + Y \times b + Z$ (we consider here that the most significant
digit is the left-most digit as in our usual writing of decimal numbers). Thus, a digit d in
position p (counting from 0, from the right) means $d \times b^p$. In a base b, a number
written $d_p d_{p-1} \dots d_0$ means
$d_p \times b^p + d_{p-1} \times b^{p-1} + \dots + d_0 \times b^0$, where
$d_0, d_1, \dots, d_p$ are digits ranging from 0 through b-1.

Using the same logic, the fractional part of a floating-point number can be written
$f_1 \times b^{-1} + \dots + f_p \times b^{-p}$, where $f_1, \dots, f_p$ are digits ranging
from 0 through b-1.

In our following discussions, we will append a subscript to numbers to specify their base
when there may be ambiguity. For example, $101_2$ is a binary number (base 2) while
$101_{10}$ is a decimal number (base 10).

II.2.1 Decimal numeral system


A decimal numeral system is a system whose base is 10. The base 10 is composed of 10
digits denoted by 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. Any number in base 10 is composed of
those digits.

As an example, consider the number $123_{10}$ in base 10. It actually means
$1 \times 10^2 + 2 \times 10^1 + 3 \times 10^0$. Similarly, in base 10, the number
$2512 = 2 \times 10^3 + 5 \times 10^2 + 1 \times 10^1 + 2 \times 10^0$ (see Table II-1). The
right-most digit is the least-significant digit and the left-most digit is the
most-significant digit. Starting from the right, the first digit, in position 0, is
multiplied by $10^0$ (that evaluates to 1). The second one, in position 1, is multiplied by
$10^1$ (that evaluates to 10). The third one, in position 2, is multiplied by $10^2$, and
so on.

Table II-1 Meaning of the number 2512 in base 10


What about numbers with a fractional part? The same rule applies. Consider the number
$0.123_{10}$: it can be written $1 \times 10^{-1} + 2 \times 10^{-2} + 3 \times 10^{-3}$.

II.2.2 Hexadecimal Number System


The hexadecimal number system is a base 16 number system. The hexadecimal system is
composed of 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (or a), B (or b), C (or c), D (or d),
E (or e), F (or f). For example, the hexadecimal number 7EFF actually means
$7 \times 16^3 + E \times 16^2 + F \times 16^1 + F \times 16^0$. Since E and F represent
respectively 14 and 15 in the decimal number system, 7EFF can be written, in the decimal
number system, as $7 \times 16^3 + 14 \times 16^2 + 15 \times 16^1 + 15 \times 16^0 = 32511$.

Table II-2 Meaning of the number 7EFF in base 16

II.2.3 Octal Number System


The octal number system is a base 8 number system. The octal system is composed of 8
digits: 0, 1, 2, 3, 4, 5, 6, 7. For example, the octal number 7761 actually means
$7 \times 8^3 + 7 \times 8^2 + 6 \times 8^1 + 1 \times 8^0$, which is 4081 in the decimal
number system.

Table II-3 Meaning of the number 7761 in base 8

II.2.4 Binary Number System


The binary number system is a base 2 number system working exactly in the same
manner as the base 10 number system.

The binary system is composed of two digits: 0 and 1. Thus, the binary number $1101_2$
actually means $1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$ (13 in decimal).

Table II-4 Meaning of the number 1101 in base 2

From the computer's perspective, any piece of data is a series of 0 and 1. The computer
understands only the base 2 number system and stores data using this base. This means
that our base 10 number $2512_{10}$ ($100111010000_2$) is actually composed of twelve digits
in the binary number system and the number $5_{10}$ ($101_2$) requires three binary digits in
the base 2 number system.

To write the fractional part of a binary number, we use the same rule. Consider the binary
number $0.101_2$: it can be written $1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}$.
In base 10, $0.101_2 = 1/2 + 0/4 + 1/8 = 0.625$.


So that your program can store your data, you have to tell it their size and what
they exactly are (integers, floating-point numbers, characters) by using types: a C type
defines both at once. A number of basic types are described by the C standard. Once you
understand how to use them, you will be able to define your own types. For now, let us
examine how data are actually represented by a computer.

II.3 Data representation


II.3.1 Byte
C programmers do not need to know how data is internally represented within a computer
because the C standard is designed to be independent from hardware. In this section, we just
give a simplified overview of data representation, which is enough to understand C types.
Whatever the types of values you use, internally, they are represented by a series
of bits (the smallest unit of storage) that can be 0 or 1. However, the representation
depends on the type of the piece of data. For example, floating-point numbers (such as 3.14)
and integers (such as 123) have different representations because they represent two
distinct entities. Computers store data in a fixed number of bits, representing their size,
according to their type.

The computer's memory is broken into chunks, called memory locations, each of which is
assigned an index, called an address, that allows accessing it. When the computer needs to
access a piece of data stored in memory, it specifies its address. The size of the smallest
addressable memory unit, called a byte, depends on the architecture of the processor. In
older computers, a byte could be any size such as 6 bits or 13 bits. Most modern
computers use 8-bit bytes [10], though a few computers still use other sizes.

Modern computers can directly address a byte or a group of bytes at a time. A program
cannot directly access individual bits but only a byte or a group of bytes (for example 2
bytes, 4 bytes or 8 bytes). When a program accesses memory, it specifies an address that
identifies a memory location that can be a byte or a group of bytes. The address of a group
of bytes is the address of the byte that has the lowest address (base address).

In C, the size of a byte is specified by CHAR_BIT (defined in the header file limits.h) and the
size of any type is a multiple of a byte.


II.3.1.1 Endianness

Figure II-1 Byte ordering: Big-endian and Little-endian

In computers, there are two ways to organize the bytes of values fitting in several bytes,
depending on the processor architecture: big-endian or little-endian [11]. Consider the
number 2937782621, written AF 1B 01 5D in hexadecimal and represented by four bytes: how
should it be stored? It can be read as AF 1B 01 5D (left-to-right reading) or as 5D 01 1B AF
(right-to-left reading): which byte is read first, the most significant byte (AF) or the
least significant byte (5D)? That is, from the computer's perspective, either the most
significant byte (MSB) is stored at the lowest address (big-endian) or the least significant
byte (LSB) is stored at the lowest address (little-endian) (see Figure II-1).

Do not confuse the way a value is internally represented with the way to write numbers in
the C language. In C, numbers are read from left to right as you usually read them in
everyday life.

II.4 Literals
A literal is just a constant [12] value known before the startup of the program. In the
book, we will use the terms literals and constants as synonyms. There are four kinds of
basic constants:
o Integer constants
o Floating constants
o String constants
o Character constants

Table II-5 shows the specifiers you have to use to display the basic literals described in
the next sections.

Table II-5 Printing literals with printf()

II.4.1 Integer constants


An integer constant does not contain a radix point (a period). You can express an
integer constant in base 10 (decimal), base 16 (hexadecimal) and base 8 (octal):
o Base-10 integer constants (commonly used) such as 19. A decimal number is composed
of decimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. A decimal constant starts with a digit
different from 0. If it starts with 0, it is treated as an octal number.
o Hexadecimal constants (base-16 notation) such as 0xFA. A hexadecimal number is
composed of the hexadecimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (or a), B (or b), C (or c),
D (or d), E (or e) and F (or f). Hexadecimal constants start with 0X or 0x followed by
hexadecimal digits.
o Octal constants (base-8 notation) such as 020. An octal number is composed of octal
digits: 0, 1, 2, 3, 4, 5, 6, and 7. Octal constants start with 0 followed by octal digits.

An integer constant (whatever the notation used: base 10, base 8 or base 16) can be
displayed by printf():
o The %d or %i specifier displays the constant in base 10
o The specifier %x or %X displays the constant in base 16. The specifier %x uses lowercase letters while the specifier %X uses uppercase letters.
o The %o specifier displays the integer constant in octal base.

Of course, most of the time, you will work with decimal numbers (base 10) as you usually
do it in your daily life, but it also happens that you need to work with hexadecimal
notation or octal notation. Whether you work with the base of 10, 16 or 8, it is the same
for the computer. The example below displays the integer constants 10 (decimal number),
0xFA (hexadecimal number), and 020 (octal number) in decimal, hexadecimal and octal
bases:
$ cat literals_1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("Dec Hex Oct\n");
   printf ("%d %X %o\n", 10, 10, 10);       /* Decimal number */
   printf ("%d %X %o\n", 0xFA, 0xFA, 0xFA); /* Hexadecimal number */
   printf ("%d %X %o\n", 020, 020, 020);    /* Octal number */

   return EXIT_SUCCESS;
}
$ gcc -o lit1 -std=c99 -pedantic literals_1.c
$ ./lit1
Dec Hex Oct
10 A 12
250 FA 372
16 10 20

As you can see, the output is not neatly presented. Let us introduce here a way to make
the display a little more readable: a modifier, as its name implies, alters the way the
printf() function shows data:
$ cat literals_2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   /* Each value is displayed with at least four characters */
   printf ("%4s %4s %4s\n", "Dec", "Hex", "Oct");
   printf ("%4d %4X %4o\n", 10, 10, 10);
   printf ("%4d %4X %4o\n", 0xFA, 0xFA, 0xFA);
   printf ("%4d %4X %4o\n", 020, 020, 020);

   return EXIT_SUCCESS;
}
$ gcc -o lit2 -std=c99 -pedantic literals_2.c
$ ./lit2
 Dec  Hex  Oct
  10    A   12
 250   FA  372
  16   10   20

The number 4, known as a width, before the specifier is a modifier telling printf() to display
the value with at least four characters. If the number of characters of the value is greater
than or equal to 4, all of its characters are displayed, but if it is less than 4, spaces are
placed before the value. Thus, 10 is prefixed with two additional spaces while 250 is
prefixed with only one.

You have noticed that the output is right aligned [13]. If you prefer a left-alignment, use
the minus modifier just before the width 4:
$ cat literals_3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   /* The minus sign requests a left alignment within the width */
   printf ("%-4s %-4s %-4s\n", "Dec", "Hex", "Oct");
   printf ("%-4d %-4X %-4o\n", 10, 10, 10);
   printf ("%-4d %-4X %-4o\n", 0xFA, 0xFA, 0xFA);
   printf ("%-4d %-4X %-4o\n", 020, 020, 020);

   return EXIT_SUCCESS;
}

$ gcc -o lit3 -std=c99 -pedantic literals_3.c
$ ./lit3
Dec  Hex  Oct
10   A    12
250  FA   372
16   10   20

II.4.2 String literals


A string literal (string constant) is a series of characters such as "Hello world". It can be
displayed by printf() using the %s specifier. A string literal is enclosed in double quotation
marks. The following example displays the three string literals "Dec", "Hex" and "Oct":
$ cat literals_4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%s %s %s\n", "Dec", "Hex", "Oct");
   return EXIT_SUCCESS;
}
$ gcc -o lit4 -std=c99 -pedantic literals_4.c
$ ./lit4
Dec Hex Oct

A string literal starts with a double quotation mark and ends with a double quotation mark. Each time
you wish to write a string literal, first type in two double quotes and then place your text between them.


If you forget the second double quote in a string literal, the compiler will detect it:
$ cat literals_5.c

#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%s %s %s\n", "Dec", "Hex, "Oct");
   return EXIT_SUCCESS;
}
$ gcc -o lit5 -std=c99 -pedantic literals_5.c
literals_5.c: In function 'main':
literals_5.c:5:40: error: expected ')' before 'Oct'
literals_5.c:5:43: warning: missing terminating " character
literals_5.c:5:40: error: missing terminating " character
literals_5.c:9:1: error: expected ';' before '}' token

Above, the compiler met the first error at line 5: the "Hex literal has only one double
quote.

II.4.3 Floating-point literals


A floating-point constant can take two forms. In its simplest form, it is composed of two
groups of digits separated by the radix point (known as a significand) such as 1.718. The
second form corresponds to the scientific notation for floating-point numbers, which
consists of a significand followed by an exponent part. The exponent part is composed of
a base and an exponent. In base 10, the base is represented by e or E; the exponent part is
then of the form en or En. For example, the number $1.718 \times 10^2$ is expressed, in C, as
1.718e2. C99 allows using the scientific notation in hexadecimal: the number starts with 0x
or 0X, and the base is represented by p or P, which means 2. For example, the number
0x1.5p2 means $(1 + 5 \times 16^{-1}) \times 2^2 = 5.25$.

You have three printing formats for floating-point literals with printf():
o by using the specifier %f: the number is displayed in the format [-]i.f, where i and f
are decimal integer numbers.
o by using the specifier %e, %g, %E or %G: %e displays a floating-point number in
scientific decimal notation (the decimal base e appears in lowercase) while %g is either
%e or %f depending on the value and the precision of the number (see Chapter X section
X.5.5). The specifiers %E and %G are equivalent to %e and %g respectively: they just
display the base in uppercase. The decimal scientific notation is of the form [-]i.fen (with
%e) or [-]i.fEn (with %E) where i, f, and n are made of decimal digits.
o by using the specifier %a or %A, which displays a floating-point number in scientific
hexadecimal notation. With the specifier %a, hexadecimal digits and the base are in
lowercase while with %A they are in uppercase. The hexadecimal scientific notation is of
the form [-]0xi.fpn (with %a) or [-]0Xi.fPn (with %A), where i and f are made of
hexadecimal digits and n is a decimal number.



The following example displays the floating-point constant 3.14159.
$ cat literals_6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%f\n", 3.14159);
   return EXIT_SUCCESS;
}
$ gcc -o lit6 -std=c99 -pedantic literals_6.c
$ ./lit6
3.141590

The following example displays only two digits of the fractional part of the floating-point
literal 3.14159:
$ cat literals_7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%.2f\n", 3.14159);
   return EXIT_SUCCESS;
}
$ gcc -o lit7 -std=c99 -pedantic literals_7.c
$ ./lit7
3.14

You have noticed that we used the printf() format %.2f. As you can guess, it tells the
function to display the floating-point number with only two digits after the decimal point.
In the printf() format, the number 2 after the point and before the f letter is called a
precision. In addition, we could also specify a width. In the following example, the width
is 6, which adds extra spaces if the number of characters to display is less than 6:
$ cat literals_8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%6.2f\n", 3.14159);
   return EXIT_SUCCESS;
}
$ gcc -o lit8 -std=c99 -pedantic literals_8.c
$ ./lit8
  3.14

Two leading spaces are added (right alignment by default) so that at least six characters
are displayed (the length of 3.14 is four characters). If you place a minus sign after the
percent sign, you request a left alignment (two trailing spaces are added):
$ cat literals_9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("[%-6.2f]\n", 3.14159);
   return EXIT_SUCCESS;
}
$ gcc -o lit9 -std=c99 -pedantic literals_9.c
$ ./lit9
[3.14  ]

We used brackets to show the trailing spaces. We will say much more about the printf()
function when we talk about the I/O functions (see Chapter X sections X.5.5 and
X.10.3.3).

The following example displays the number 0.1 in scientific notation, in decimal and
hexadecimal:
$ cat literals_10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   float x = 0.1;

   printf("x=%e (decimal), %a (hexadecimal)\n", x, x);

   return EXIT_SUCCESS;
}
$ gcc -o literals_10 -std=c99 -pedantic literals_10.c
$ ./literals_10
x=1.000000e-01 (decimal), 0x1.99999a0000000p-4 (hexadecimal)


The following example displays the variables f1 and f2 of type float with different
formatting:

$ cat literals_11.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   float f1 = 0x1.5p2;
   float f2 = 5.25; // 0x1.5p2 = (1 + 5 * 1/16) * 4 = 5.25

   printf("Decimal:\n");
   printf("f1=%e (%E)\n", f1, f1);
   printf("f2=%e (%E)\n", f2, f2);

   printf("\nHexadecimal:\n");
   printf("f1=%a (%A)\n", f1, f1);
   printf("f2=%a (%A)\n", f2, f2);

   return EXIT_SUCCESS;
}
$ gcc -o literals_11 -std=c99 -pedantic literals_11.c
$ ./literals_11
Decimal:
f1=5.250000e+00 (5.250000E+00)
f2=5.250000e+00 (5.250000E+00)

Hexadecimal:
f1=0x1.5000000000000p+2 (0X1.5000000000000P+2)
f2=0x1.5000000000000p+2 (0X1.5000000000000P+2)

II.4.4 Character literals


The last literal we are going to describe is the character literal or character constant. A
character literal such as 'c' can be displayed by printf() using the %c specifier. A character
literal is a symbol enclosed between single quotation marks. The following example
displays the six character constants 'h', 'e', 'l', 'l', 'o', '!'.
$ cat literals_10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf ("%c%c%c%c%c%c\n", 'h', 'e', 'l', 'l', 'o', '!');
   return EXIT_SUCCESS;
}
$ gcc -o lit10 -std=c99 -pedantic literals_10.c
$ ./lit10
hello!

As not all characters are printable, there is another way to represent some character
literals: escape sequences. Escape sequences are special in the sense that they do not
represent themselves. They denote special characters that are not printable but have an
effect when output. For example, the escape sequence \n denotes the newline character. The
following example displays three escape sequences: \v (vertical tab), \t (horizontal tab)
and \b (backspace):
$ cat literals_11.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf("a\tb\tc\v\bC\tD\n");
   return EXIT_SUCCESS;
}
$ gcc -o lit11 -std=c99 -pedantic literals_11.c
$ ./lit11
a       b       c
                C       D

Explanation:
o a\tb\tc displays the character a, then a tab, then the character b followed by a tab and
the letter c.
o \v\b displays the vertical tab (jump to the next line) followed by a backspace (move
left one character in order to be placed just under the letter c).
o C\tD displays the letter C followed by a tab and the letter D.

Table II-6 lists the escape sequences you can use with the printf() function (it is unlikely
that you will often use all of them).

Table II-6 Escape Sequences


Suppose now we would like to display this text: The string delimiter is ". How can we do
that, since a double-quote is a string delimiter? The C language defines the backslash
character \ as an escape character that removes the special meaning of the character
following it. Thus, to display a double-quote, you just have to place a backslash in front
of it (\") as shown below:
$ cat literals_12.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf("The string delimiter is \"\n");
   return EXIT_SUCCESS;
}
$ gcc -o lit12 -std=c99 -pedantic literals_12.c
$ ./lit12
The string delimiter is "

Now, we are going to talk about another way to work with character literals. Any character
is in fact an integer constant whose value depends on the coded character set used. We
can view a coded character set as a table that maps each character to a unique integer
number representing its code value (the topic will be broached in this chapter and in
Chapter IX). The coded character set depends on the language that is used by your
program. In English, ASCII [14] is an example of a coded character set.

You have two ways to work with a character through its code value: by using an octal or a
hexadecimal number. An octal code starts with \ followed by one to three octal digits
(each in the range [0-7]). A hexadecimal code starts with \x followed by hexadecimal
digits (each in the range [0-9A-F]), typically two. For example, in ASCII and Unicode, the
letter A has the code value 65 (101 in octal, 41 in hexadecimal) and the double-quote has
the code value 34 (042 in octal, 22 in hexadecimal) as shown below:
$ cat literals_13.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf("Octal Code 101=\101 or Hex Code 0x41=\x41\n");
   printf("Octal Code 042=\042 or Hex Code 0x22=\x22\n");
   return EXIT_SUCCESS;
}
$ gcc -o lit13 -std=c99 -pedantic literals_13.c

On our computer, we get this:
$ ./lit13
Octal Code 101=A or Hex Code 0x41=A
Octal Code 042=" or Hex Code 0x22="

To find the ASCII code of a character (in the range [0-127]), you can search the internet
or use the little program below:
$ cat literals_14.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5    int i=0;
6
7    while (i < 128) {
8       printf("%d=0x%02X=0%03o=%c\n", i,i,i,i);
9       i=i+1;
10    }
11
12    return EXIT_SUCCESS;
13 }
$ gcc -o lit14 -std=c99 -pedantic literals_14.c

$ ./lit14

Explanation:
o Line 5: We declared the variable i as an integer. It will store the character code. We also
initialized the i variable to 0 because the very first code in ASCII is 0.
o Line 7: The while loop goes through all the 128 characters. The loop ends when
the i variable reaches the value 128.
o Line 8: The printf() function displays the i variable as a decimal number (%d), as a
hexadecimal number (%02X), as an octal number (%03o) and as a character (%c).
o Line 9: At the end of the while body, the i variable is incremented.

Several characters, known as control characters (some of which are written with escape
sequences), are not printable.

You may have noticed the modifiers in the printf() format for displaying the hexadecimal
and octal numbers: %02X and %03o. The format %02X means we want to display a
hexadecimal number with at least two digits; if there are fewer than two digits, printf() adds
a leading 0: the number F appears as 0F. Do not confuse %02X with %2X: the first one adds
leading zeroes while the second one adds leading spaces if the number of characters to be
displayed is less than two. Likewise, %03o tells printf() to display a number in octal
representation with at least three digits, adding leading zeroes if required: the octal number
7 appears as 007.

In our example literals_14.c, the i variable was an integer representing the code of a character
we printed using the printf() specifier %c. In C, as a character is an integer, to display the
code of a character [15], just use the %d, %X or %o specifier as shown below:
$ cat literals_15.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   printf("Code of the character %c is %d\n", 'A', 'A');
   return EXIT_SUCCESS;
}
$ gcc -o lit15 -std=c99 -pedantic literals_15.c
$ ./lit15
Code of the character A is 65

II.5 Variables

Figure II-2 Piece of data in main memory

II.5.1 What is a variable?

A variable (also known as an object in the book) is a named piece of memory storing a
value. When you execute a program, it becomes a process [16] to which the operating
system loans the processor in order to execute it. Then, the processor executes the
statements of the program and stores required data in main memory (also known as RAM)
and registers. Each manipulated piece of data is stored at a specific memory address. In
Figure II-2, we can see that the character 'A' (decimal code 65, or $1000001_2$ in binary
notation) is stored at address 3 ($011_2$ in binary notation) in an imaginary computer.

In order to use several times the same value, programmers declares symbolic names,
variables, representing pieces of memory into which data can be stored. Thus (see Figure
II3), we could define the variable letter into which we would store the character literal A.
To retrieve the value held by a variable, just use its name. Thanks to variables, you do not
have to deal with computers addresses or registers but only identifiers.

Figure II-3 Symbolic representation of a variable

A variable can be viewed as a box in which we can store a value. The C language defines
several kinds of boxes (variables) able to hold small or big numbers, integers,
floating-point numbers, collections of characters, and so on. Before talking about types, let us
examine how a piece of data is represented.

II.5.2 Data size


It is obvious that you will have to manipulate several kinds of pieces of data in your
programs. In every project on which you will work, you will have to make a design of the
real world and then implement it. For example, suppose you want to create your own
database storing a list of persons for a given purpose. The last names and first names could
be implemented as a string, the age as an integer, the height as a floating-point number,
and the gender as a single character.

We might imagine a variable that could hold any type of value, as in Perl or AWK, but
this is not the case in the C language. A C variable has a single type that cannot change
after being declared. The C language was designed to be closer to the human language and
much more convenient than the machine language or the assembly language. However, it was also
designed to be very efficient and therefore, in a way, close to the machine language.

When you declare a variable, you must know the interval of the values that it could hold.
Since a computer works only with the digits 0 and 1, known as bits, whatever the value held in
a variable, it is ultimately stored in memory and registers as a binary number consisting of a
specified number of bits. If you know the minimum and maximum values that can be held
in a variable, you can determine its type. For example, the biggest value of an ASCII
character is 127 and the lowest is 0. Therefore, a variable holding an ASCII character can
be represented by seven bits. Why? A group of seven bits can represent 2⁷ (=128) different
values: from 0000000 through 1111111 (2⁷-1=127). So, a non-negative integer (known as an
unsigned integer) in the range [0,127] can be represented by seven bits. In the same way,
an integer that can be positive, zero or negative (known as a signed integer) in the range
[-63,63] can also be represented by seven bits. Both ranges [0,127] and [-63,63] hold
integers and both can be represented by 7 bits. The C language allows you to be more
specific: an integer type can be signed or unsigned.

II.5.3 Declarations
As said earlier, a variable is a chunk of the computer's memory having a certain size
expressed in bytes. Before using a variable, you must declare it by a statement known as a
declaration:
type variable_name;

Where:
o type is either a user-defined type, a system-defined type or a C type (defined by the C
standard).
o variable_name is an identifier composed of letters (lowercase or uppercase), digits
and underscores. However, it cannot start with a digit.
o The statement ends with a semicolon (;).

The declaration of a variable means several things:
o It defines the size of the variable, telling the compiler the amount of memory that
must be reserved to store the value held in the variable.
o It gives the variable a name by which it can be identified.
o It allows using the same variable in several different files: in the C language, a program
may be composed of several source files containing the C code. We will say more about it
when we talk about modular programming.

Up to C95, variables had to be declared at the beginning of a block, before any
statement. As of C99, the declarations of variables can be placed anywhere within a
block. In the following example, we declare the variable f of type float and the variable k of
type int:
$ cat variable_declaration.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int k = 10;
  printf("k=%d\n", k);

  float f = 3.14;  /* declared after a statement: allowed as of C99 */
  printf("f=%f\n", f);

  return EXIT_SUCCESS;
}
$ gcc -o variable_declaration -std=c99 -pedantic -Wall variable_declaration.c
$ ./variable_declaration
k=10
f=3.140000

However, many programmers keep the traditional habit of grouping
declarations at the beginning of blocks so that they are easy to locate.

Let us start with the basic types defined by the C standard. Other types such as arrays,
structures, unions, pointers and functions, called derived types, are described later in the
book.

II.6 Basic types


The C language defines two main basic types: integer and floating types. In the C
language, a type determines three things: the kind of value (integer or
floating-point number) and hence its representation, its bit-length, and the range of
allowed values.

II.6.1 Integer types


There are several integer types that can be split into two groups: signed and unsigned
integers. Signed integers represent integral numbers that can be negative, 0, or positive.
Unsigned integers can be 0 or positive. Integer numbers can be represented in one byte,
two bytes, four bytes, and so on. Each signed integer type has an unsigned counterpart: signed
char/unsigned char, signed int/unsigned int, and so on. Take note that a signed integer type and an
unsigned integer type are two different types. The range of non-negative values represented by a
signed type is a subset of the range of values represented by the corresponding unsigned type.

An integer is a number with no fractional part such as 1, 128, or 41526. The C standard
defines several kinds of integer types (called standard integer types):
o Integer types fitting in at least 8 bits denoted by char
o Integer types fitting in at least 16 bits denoted by short
o Integer types fitting in at least 16 bits denoted by int
o Integer types fitting in at least 32 bits denoted by long
o Integer types fitting in at least 64 bits denoted by long long

In all cases, whatever the machines on which you will work and whatever the sizes of the
types, the compilers enforce the following rule: size of long long types ≥ size of long types
≥ size of int types ≥ size of short types ≥ size of char types.

Moreover, the reserved words signed and unsigned can be used to specify whether an integer is
signed or unsigned. The keyword signed indicates values can be negative, zero or positive
while the keyword unsigned states the values are positive or zero.

The number of bits, excluding the sign bit and padding bits, used to represent an integer is
called the precision. The number of bits, including the sign bit and excluding the padding
bits, used to represent an integer is called the width. The size of a number is the width plus
the padding bits.

Table II-7 lists the C standard integer types we are going to describe in the next sections.

Table II-7 Integer types


In addition to standard integer types, implementations can define other integer types. They
are called extended integer types.

II.6.1.1 Integer encoding
In order to have a better understanding of the integer bounds enforced by the C standard,
in this section, we describe some representations of integers. The C standard dictates that
integers have a binary representation but does not impose a specific way to represent them
internally (encoding).

For the sake of clarity, in our discussions in the following sections, we will work with the
big-endian representation.

II.6.1.1.1 Unsigned integers

Unsigned integers can take a positive value or 0. Their representation is quite simple.
Suppose our computer has a big-endian processor, and the unsigned short type is represented
by 2 bytes. The decimal number 44827 (0xAF1B) stored in a variable of type unsigned short
would be represented like this:
10101111 00011011

In hexadecimal, the number takes the form AF 1B. The first byte AF corresponds to the
binary number 10101111 and the second byte 1B to 00011011.

The most significant byte, 10101111 (AF), occupies the lowest address and the next
byte, 00011011 (1B), lies at the next address. The number is interpreted as:
o First byte: 1×2¹⁵ + 0×2¹⁴ + 1×2¹³ + 0×2¹² + 1×2¹¹ + 1×2¹⁰ + 1×2⁹ + 1×2⁸
o Second byte: 0×2⁷ + 0×2⁶ + 0×2⁵ + 1×2⁴ + 1×2³ + 0×2² + 1×2¹ + 1×2⁰

Integer size    Range
8 bits          [0, +255]
16 bits         [0, +65535]
32 bits         [0, +2³²-1]
64 bits         [0, +2⁶⁴-1]
n bits          [0, +2ⁿ-1]
Table II-8 Range of unsigned integers


II.6.1.1.2 Signed integers

The internal representation of signed integers is not as simple as that of unsigned integers
because of the sign. They have a different encoding. How can negative integers be
represented? There are several ways to encode signed integers but the C standard specifies
three possibilities:
o the signed magnitude representation
o the ones' complement
o the two's complement

II.6.1.1.2.1 Signed magnitude representation

In this format, the most significant bit is reserved for the sign, while the remaining bits are
used to represent the absolute value (magnitude) of the number. If the number is positive,
the sign bit is set to 0. If negative, it is set to 1. However, this representation has a
drawback: 0 has two representations! In a big-endian representation, the value of 0 would
be represented by 00000000 (+0) or 10000000 (-0). For this reason, other representations
of signed integers are used.

Suppose integers fit in n bits: 1 bit for the sign and n-1 bits for the magnitude. Therefore:
o 2ⁿ⁻¹-1 positive integers can be represented
o 2ⁿ⁻¹-1 negative integers can be represented
o 0 has two representations
o The largest magnitude is 2ⁿ⁻¹-1.

Integer size    Range
8 bits          [-127, +127]
16 bits         [-32767, +32767]
32 bits         [-(2³¹-1), +2³¹-1]
64 bits         [-(2⁶³-1), +2⁶³-1]
n bits          [-(2ⁿ⁻¹-1), +2ⁿ⁻¹-1]
Table II-9 Range of integers using the signed magnitude representation


II.6.1.1.2.2 Ones' complement

In this representation, the most significant bit is also reserved for the sign (0 means
positive and 1 negative) while the remaining bits are used to represent the absolute value
of the number but here, positive and negative values are not expressed in the same way.
o Positive values are written as described for unsigned integers. For example, the integer
+5 represented by 1 byte has the absolute value 000 0101. Then, as it is positive, it is
written as 0000 0101.
o Negative values use the ones' complement. The magnitude bits of a negative number are
computed from the magnitude of the corresponding positive number by applying the
ones' complement: every occurrence of 0 is turned to 1 and 1 to 0. For example, since the
absolute value of 5 is 000 0101, the magnitude bits of -5 are 111 1010. Then, by adding the
sign bit, -5 is written 1111 1010.

Consider the number 0001 1101. The most significant bit is 0: it is a positive integer. Its
absolute value is 001 1101. Then, its value is +29.

Consider the number 1110 0010. The most significant bit is 1: it is a negative integer. Its
magnitude bits are 110 0010. Therefore, its value is -001 1101, that is -29 (see Figure II-4).

Now, can you find out the number represented by 1111 1111? As the most significant bit is
1, the number is negative. Its magnitude bits are 111 1111, which means 000 0000. The number
is -0. Here again, in that representation, 0 has two representations: 0000 0000 and 1111
1111.

Figure II-4 Ones' complement


Integer size    Range
8 bits          [-127, +127]
16 bits         [-32767, +32767]
32 bits         [-(2³¹-1), +2³¹-1]
64 bits         [-(2⁶³-1), +2⁶³-1]
n bits          [-(2ⁿ⁻¹-1), +2ⁿ⁻¹-1]
Table II-10 Range of integers using the ones' complement representation


II.6.1.1.2.3 Two's complement

In the two's complement representation, the most significant bit is also reserved for the
sign (0 for + and 1 for -) while the remaining bits are used to represent the absolute value
of the number. Here again, positive and negative values are not expressed in the same way.
o Positive values are written as described for unsigned integers. For example, the integer
+5 represented by 1 byte has the absolute value 000 0101. Then, as it is positive, it is
written 0000 0101.
o Negative values use the two's complement. The magnitude bits of a negative number are
computed from the magnitude of the corresponding positive number by applying the
two's complement, that is, the ones' complement plus one. For example, as the absolute
value of +5 is 000 0101, the magnitude bits of -5 are 111 1010 + 1 = 111 1011. Then,
by adding the sign bit, -5 is written 1111 1011.

Take note that if you apply the same formula to the magnitude bits of a negative integer,
you get the magnitude of the corresponding positive number. As an example, let us
consider the number 1110 0011. The most significant bit is 1: it is a negative integer. Its
magnitude bits are 110 0011. The magnitude of the corresponding positive number is 001
1100+1=001 1101. The number is -29 (see Figure II-5).

Figure II-5 Two's complement







In the two's complement representation, 0 has a single bit pattern: 0000 0000. This allows
representing the number -128 as 1000 0000.

Suppose integers fit in n bits: 1 bit for the sign and n-1 bits for the magnitude. Therefore:
o 2ⁿ⁻¹-1 positive integers can be represented
o 2ⁿ⁻¹ negative integers can be represented
o 0 has a single representation
o The largest magnitude for a positive number is 2ⁿ⁻¹-1.
o The largest magnitude for a negative number is 2ⁿ⁻¹.

Integer size    Range
8 bits          [-128, +127]
16 bits         [-32768, +32767]
32 bits         [-2³¹, +2³¹-1]
64 bits         [-2⁶³, +2⁶³-1]
n bits          [-2ⁿ⁻¹, +2ⁿ⁻¹-1]
Table II-11 Range of integers using the two's complement representation


It is interesting to note that computers using the two's complement can represent the value
-128 in a signed char.

Most systems use the two's complement scheme.

II.6.1.2 Character representation
II.6.1.2.1 Character encoding

In this section, we will not have a lengthy discussion about character encodings but a
short introduction to some concepts related to the character representation. We will talk
again about those concepts in Chapter IX Section IX.5.

Each language is composed of a set of characters: letters, digits, word-separators (such as
the space character), punctuation marks, mathematical symbols and other symbols. Human
beings identify a symbol through its graphical representation while a computer, working
only with binary numbers, identifies a symbol by its binary representation.

To represent the different languages all over the world, several kinds of character sets are
used (such as ASCII, and the Unicode character set called Universal Character Set or
UCS). A character set, also known as a repertoire, is just a collection of characters
representing symbols used by a set of languages. A coded character set is a character set
in which each character is associated with an integer number called a code point. For example,
in ASCII and Unicode, the letter A has the decimal value 65 while in EBCDIC, it is
mapped to the decimal value 193.

A coded character set is not sufficient for a computer to work with characters. For a
computer to interpret a character properly, a binary representation (encoding) of the
code point is required. A character encoding, also called a code page, is a mapping
between code points and their binary representation. Here are some examples of character
encodings:
o ANSI X3.4-1986 is the ASCII character encoding, which can be used for the English
language.
o ISO/IEC 8859-1 (known as Latin-1), which was used for languages such as German,
Swahili, Spanish, and English. It is an extension of ANSI X3.4-1986.
o ISO/IEC 8859-15 (also known as Latin-9), which can be used for languages such as French.
It is a revision of ISO/IEC 8859-1 that replaces a few characters (adding the euro sign, for
example).
o Windows-1252, used in Microsoft operating systems, is almost the same as ISO/IEC 8859-1.
o The Unicode character encodings UTF-8, UTF-16 and UTF-32 can be used with any
language. They can encode any character of the Unicode character set.

Take note that the same code point may have different encodings. For example, a character
of the Unicode character set is represented by one to four bytes in UTF-8, two
or four bytes in UTF-16 and four bytes in UTF-32.

Table II-12 ASCII coded character set (ANSI X3.4-1986)


The C standard distinguishes two kinds of character sets: the character set used to write a
C program (called source character set) and the character set used as the program
executes (called execution character set). For us, throughout the book, both
character sets are the same since we write, compile and execute our programs in the same
environment, but if you cross-compile your program, the execution character set may be
different from the source character set. Cross-compiling means you compile a program for
another platform. For example, you may write a program using UTF-8 and cross-compile
it for a target platform using the JIS character encoding. In the book, we will not talk
about cross-compiling.

Table II-13 Basic character set


Both character sets, the source character set and the execution character set, include a
collection of basic characters forming a basic character set (95 symbols) sketched in
Table II-13. Additional characters depending on the character set used, called extended
characters (such as accented letters), may be used. An extended character set is a character set
composed of basic characters and extended characters. The default character set of a C
program is the basic character set.

Furthermore, the C standard requires the execution character set to include the null
character (all of whose bits are set to 0), which terminates a string, along with four control
characters: alert (\a), backspace (\b), carriage return (\r) and newline (\n). The newline
character indicates the end of a line.

Any character of a basic character set fits in one byte whatever the character encoding
used. The code point for each character depends on the character encoding. Computers
come with one or more character encodings allowing them to deal with characters of the local
language and possibly other languages. For a given language, there are several character
encodings available (when learning the C language, you do not have to care about it). For
example, the character encoding ISO/IEC 8859-1, which is an extension of the ASCII
character encoding, also referred to as Latin-1, was used for several European languages.
The character encoding UTF-8, also compatible with the ASCII character encoding, can
also be used by a computer to represent characters of those languages. In Chapter IX, we
will learn how to work with locales.

Our environment, using UTF-8, represents the letter A by the integer 65 as shown by the
following example:
$ cat charset1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
  printf("%c has code %d\n", 'A', 'A');
  return EXIT_SUCCESS;
}
$ gcc -o charset1 -std=c99 -pedantic charset1.c
$ ./charset1
A has code 65

Never assume a character is bound to a specific code point (code value). In summary, on a
computer, a character is associated with an integer value having a specific binary
representation depending on the character encoding. As far as we are concerned, until
Chapter IX, we will work with the basic character set, each element of which fits in a single
byte.

II.6.1.2.2 Trigraphs

As some character sets do not include some characters needed to write a C program, the C
standard defines sequences of three characters (Table II-14), known as trigraphs, each of which
is replaced by one character when the program is compiled. A trigraph is composed of two
question marks ?? followed by a third character.

Trigraph    Replacement character
??=         #
??(         [
??/         \
??)         ]
??'         ^
??<         {
??!         |
??>         }
??-         ~
Table II-14 Trigraphs


C94 introduced sequences of two characters, known as digraphs, which are more practical than
trigraphs and are also replaced by one character by the compiler.

Digraph     Replacement character
<:          [
:>          ]
<%          {
%>          }
%:          #
%:%:        ##
Table II-15 Digraphs


To prevent the substitution of a trigraph (that is, to avoid having three successive characters
forming a trigraph), a backslash must be used. The following example displays some
trigraphs.
$ cat trigraph1.c
#include <stdio.h>
??=include <stdlib.h>

int main(void) ??<
  char trigraph;

  trigraph='??='; printf("?\?= replaced by %c\n", trigraph);
  trigraph='??('; printf("?\?( replaced by %c\n", trigraph);
  trigraph='??!'; printf("?\?! replaced by %c\n", trigraph);
  trigraph='??>'; printf("?\?> replaced by %c\n", trigraph);
  trigraph='??-'; printf("?\?- replaced by %c\n", trigraph);

  return EXIT_SUCCESS;
??>
$ gcc -o trigraph1 -std=c99 -pedantic trigraph1.c
$ ./trigraph1
??= replaced by #
??( replaced by [
??! replaced by |
??> replaced by }
??- replaced by ~

The backslash character \ preceding a character removes its special meaning. If a character
has no special meaning, the backslash is ignored. For example, to print the backslash
character \, we precede it with another backslash:
$ cat trigraph2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  printf("\?\?/ replaced by %c\n", '\??/');
  return EXIT_SUCCESS;
}
$ gcc -o trigraph2 -std=c99 -pedantic trigraph2.c
$ ./trigraph2
??/ replaced by \

Normally, you will not have to use trigraphs and digraphs unless your keyboard cannot
represent those characters.

II.6.1.3 Padding bits
Data is stored in one or more bytes. A byte is composed of a specific number of bits. Most
of the time, all bits of each byte are used to represent data but it may happen that not all
bits are used: some of them may be ignored as if they did not exist. They are called
padding bits. Padding bits do not contribute to the value (Figure II-6). For example, a
32-bit type (i.e. size of 32 bits) may be represented by 31 bits (width of 31 bits) with one
padding bit: only 31 bits are used for encoding values.

Figure II-6 Padding bits


In C, operations deal with values. That is, padding bits are invisible to programmers and
normally you do not have to worry about them if your programs conform to the C
standard.

II.6.1.4 Size, width, and precision
The precision of an integer is the number of bits used to represent its magnitude,
excluding padding bits. The width of an integer is the number of bits used to represent
its magnitude and its sign, excluding padding bits: for a signed integer, width=precision+1.
The size of an integer is the number of bits used to represent its magnitude and its sign,
including padding bits: size=width + padding bits. The size of a value or a type, expressed in
bytes, is yielded by the operator sizeof.

II.6.1.5 Character types
Three types of integers, known as character types, represented by at least 8 bits are defined
by the C standard:
o char: it can be signed or unsigned depending on the implementation. This is known as
plain char.
o signed char: the minimum range is [-127,127].
o unsigned char: the minimum range is [0,255].

Take note that even though the size of a char is commonly 8 bits (i.e. 1 octet), on
some computers it could be 9, 12, 16 bits, and so on. The C standard says only that its
bit-length must be at least 8 bits. We can infer that to write a C program that would work
on every machine (i.e. a portable program), we should ensure that our values of type char
are in the range [-127,127] if they are signed or [0,255] if unsigned. Likewise, since the char
type can be signed or unsigned depending on the compiler, a portable program should use
values in the range [0,127]: this range is common to signed char and unsigned char.

In the following example, we display the values of an unsigned char variable called i and a
char variable called j.
$ cat char1.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5   unsigned char i = 255;
6   char j = 255;
7
8   printf("i=%d j=%d\n", i, j);
9   return EXIT_SUCCESS;
10 }

What do you think such a program will produce? The answer is: it depends. Let us compile it
with gcc on our computer:
$ gcc -o char1 char1.c
$ ./char1
i=255 j=-1

As you can see, the j variable (char type) appears as -1. This means that the value 255 did
not fit in the type: on our computer, with gcc, the char type is treated as a signed type.
In other words, on our computer, the char type behaves like signed char. On another computer,
or with another compiler, we may have a different result. Compilers have options giving
you more warnings while compiling:
$ gcc -o char1 -std=c99 -pedantic char1.c
char1.c: In function 'main':
char1.c:6:3: warning: overflow in implicit constant conversion

In the example above, the options -std=c99 -pedantic tell the compiler to be compliant with
the C99 standard and to provide warnings if a program is not compliant: in our example,
line 6 must be reviewed.

Compilers have an option to treat the char type as unsigned char:
$ gcc -o char1 -std=c99 -pedantic -funsigned-char char1.c
$ ./char1
i=255 j=255

Or as signed char:
$ gcc -o char1 -std=c99 -pedantic -fsigned-char char1.c
char1.c: In function 'main':
char1.c:6:3: warning: overflow in implicit constant conversion

You should force the compiler to translate char as signed or unsigned char only if you have fully
understood how all char variables are used in the program. However, it is better to use the
right types without relying on such compiler options. This means you have to know the range
of values that can be taken by your variables in order to use the right type.

We said character types are small integers fitting in one byte but, as a matter of fact, they are
used for variables holding characters, not for working with small integer numbers. The
term character, within the book, has two meanings depending on the context in which it is
used. In C, a character is an object of character type (unsigned char, char or signed char) fitting
in one byte. For a given human language (Japanese, German, French), characters are
symbols forming words and sentences: for example, the letter z is a character. The characters
of a language cannot be represented by just any character set. For example, ASCII describes
the characters used in English and their corresponding 7-bit codes (integer numbers). The
following example shows the mapping between a code value and a character (Unicode
encoding UTF-8):
$ cat char2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char c1='&';
  char c2=38;

  printf("c1: code is %d, character is %c\n", c1, c1);
  printf("c2: code is %d, character is %c\n", c2, c2);
  return EXIT_SUCCESS;
}
$ gcc -o char2 -std=c99 -pedantic char2.c
$ ./char2
c1: code is 38, character is &
c2: code is 38, character is &

Table II-16 Character types


Character types always fit in a byte, whose size depends on the implementation. A byte is the
smallest amount of the computer's memory that can be addressed. For this reason, the C
language defines it as a unit of memory for storing data. The sizes of other types are
multiples of a byte. The sizeof operator returns the size of a type or a given variable. In the C
language, sizeof(char) always returns 1 (one byte) as shown below:
$ cat char3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  printf("Size of char is %zu.\n", sizeof(char));  /* %zu: specifier for size_t */
  return EXIT_SUCCESS;
}
$ gcc -o char3 -std=c99 -pedantic char3.c
$ ./char3
Size of char is 1.


In a given human language, such as French, a certain number of symbols (characters) are
used. ASCII is not enough for representing all characters used by all languages. For
example, a character such as ñ, used in Spanish, or é, used in French, is not present in ASCII
but in other character sets. More than seven bits are required for representing the characters
of most languages. Hence, a character of a given language may actually need more than
one byte (multibyte characters) and then may not be storable in type char.

In C, the type unsigned char is different from other types in that its encoding is a pure binary
representation as stated by C99. "Pure representation" means there are no hidden bits: all
bits are part of the number. This is the only type having this property. For example, in
some computers, an integer composed of n bits may have some bits unused (padding bits).
In such computers, the value is computed silently ignoring the padding bits. Programmers
do not have to be aware of that. For an unsigned char, this is not permitted: all bits are part of
the number. This feature is interesting: thanks to the type unsigned char, programmers can
have access to all the bits of an object.

II.6.1.6 Short types
The following integer types represented by at least 16 bits can be used:
o short (or short int): same as signed short.
o signed short (or signed short int): the smallest allowed range is [-(2¹⁵-1), 2¹⁵-1] (i.e.
[-32767,+32767]).
o unsigned short (or unsigned short int): the smallest allowed range is [0, 2¹⁶-1] (i.e. [0,65535]).

Table II-17 Short types


In the following example, we show the biggest values that can be held by variables of
type signed short and unsigned short on our computer:
$ cat short1.c
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main(void) {
  short x = pow(2,15)-1;
  unsigned short y = pow(2,16)-1;

  printf("max signed short value=%d\nmax unsigned short value=%u\n", x, y);
  return EXIT_SUCCESS;
}
$ gcc -o short1 -std=c99 -pedantic short1.c
$ ./short1
max signed short value=32767
max unsigned short value=65535

The following example is the same as the previous one except that the values we set are
too big (hence the warning message overflow in implicit constant conversion):
$ cat short2.c
1 #include <stdio.h>
2 #include <math.h>
3 #include <stdlib.h>
4 int main(void) {
5   short x = pow(2,15);
6   unsigned short y = pow(2,16);
7
8   printf("max signed short value=%d\nmax unsigned short value=%u\n", x, y);
9   return EXIT_SUCCESS;
10 }
$ gcc -o short2 -std=c99 -pedantic short2.c
short2.c: In function 'main':
short2.c:5:3: warning: overflow in implicit constant conversion
short2.c:6:3: warning: overflow in implicit constant conversion

In our example, we have introduced something new: the pow() math function. In the C
language, there is no power operator: to compute x to the power of y (xʸ), programmers call
the function pow(x,y). The function is declared in the header file math.h that is included by
the directive #include <math.h>. In our example, pow(2,15) means 2¹⁵.

II.6.1.7 int types
The following integer types are represented by at least 16 bits and have a bit-length greater
than or equal to the bit-length of the short type:
o int: same as signed int.
o signed int: the minimum range is [-(2¹⁵-1), 2¹⁵-1] (i.e. [-32767,+32767]).
o unsigned int: the minimum range is [0, 2¹⁶-1] (i.e. [0,65535]).

Usually, the int type is represented by 32 bits while the short type fits in 16 bits. However,
never assume the bit-length of the int type is 32 bits on all computers.

Table II-18 Int types


In the following example, we display the size (expressed in bytes) of the i variable of
type int:
$ cat int1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int i;
  printf("size of i is %zu\n", sizeof i);
  return EXIT_SUCCESS;
}
$ gcc -o int1 -std=c99 -pedantic int1.c
$ ./int1
size of i is 4

On our machine, the type int is represented by 4 bytes (32 bits). This number is given by
the sizeof operator. It is very useful since it returns the size of a type as well as the size of
an object. The following example displays the size of char, short and int types:
$ cat int2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  printf("char=%zu byte(s)\n", sizeof(char));
  printf("short=%zu bytes\n", sizeof(short));
  printf("int=%zu bytes\n", sizeof(int));
  return EXIT_SUCCESS;
}
$ gcc -o int2 -std=c99 -pedantic int2.c
$ ./int2
char=1 byte(s)
short=2 bytes
int=4 bytes

The sizeof operator can be applied to a type name or a variable name. If the operand is a
variable, you can omit the parentheses but if the operand is a type name, you must put
parentheses around it.

The sizeof operator returns a number of bytes (a byte is not necessarily 8 bits). In C, a byte means
sizeof(char), that is, the smallest amount of memory that the computer can address: the macro
CHAR_BIT, defined in the limits.h header file, gives the number of bits in a byte.

The following example shows the biggest values of an int and an unsigned int on our
computer:
$ cat int3.c
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

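int main(void) {
    int x = pow(2,31)-1;           /* biggest int on our machine */
    int y = x + 1;                 /* signed overflow: undefined behavior */
    unsigned int z = pow(2,32)-1;  /* biggest unsigned int on our machine */

    printf ("x=%d\ny=%d\nz=%u\n", x, y, z);
    return EXIT_SUCCESS;
}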
$ gcc -o int3 -std=c99 -pedantic int3.c
$ ./int3
x=2147483647
y=-2147483648
z=4294967295

Explanations:
o The statement int x = pow(2,31)-1 declares the x variable as an int and initializes it to 2^{31}-1.
o The statement int y = x + 1 declares the y variable as type int and sets its value to the
contents of the x variable plus 1. That is, y should hold the value 2^{31}.
o Since the size of an int is 32 bits on our machine, the value we gave to the y variable was
too big to be represented: a signed overflow occurred, and its behavior is undefined. This was
shown by the printf() function that displayed the contents of the variable x, then y. We can see
the x variable was correctly printed while y was not (because of the overflow).
o We can also see that the z variable (unsigned int type) was correctly printed. It held the
biggest value for an unsigned int type on our computer. Notice that we used the %u
specifier in printf() to display it.


II.6.1.8 Long types
The following integer types are represented by at least 32 bits and have a bit-length greater
than or equal to the bit-length of type int:
o long: same as long int.
o long int: same as signed long int.
o signed long int: the minimum range is [-(2^{31}-1), 2^{31}-1] (i.e. [-2147483647, 2147483647]).
o unsigned long int: the minimum range is [0, 2^{32}-1] (i.e. [0, 4294967295]).

Table II-19 Long types


The following example displays the size of the type long:
$ cat long1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("long=%zu bytes\n", sizeof(long));  /* sizeof yields a size_t: use %zu */
    return EXIT_SUCCESS;
}
$ gcc -o long1 -std=c99 -pedantic long1.c
$ ./long1
long=4 bytes

The following example shows the biggest values of long and unsigned long types on our
computer (held in the variables x and z):
$ cat long2.c
1 #include <stdio.h>
2 #include <math.h>
3 #include <stdlib.h>
4 int main(void) {
5     long x = pow(2,31)-1;
6     long y = pow(2,31);
7     unsigned long z = pow(2,32) - 1;
8
9     printf ("x=%ld\ny=%ld\nz=%lu\n", x, y, z);
10     return EXIT_SUCCESS;
11 }
$ gcc -o long2 -std=c99 -pedantic long2.c
long2.c: In function main:
long2.c:6:3: warning: overflow in implicit constant conversion
$ ./long2
x=2147483647
y=2147483647
z=4294967295

Above, the x and z variables (holding the biggest values respectively for types long and
unsigned long on our computer) were correctly printed while the y variable was not: the value
2^{31} does not fit in a long on our computer (overflow), as the compiler warned for line 6.

II.6.1.9 Long long types
The long long types were introduced in C99. The following integer types, represented by at
least 64 bits and having a bit-length greater than or equal to the bit-length of the type long,
can be used [17]:
o long long: same as signed long long int
o long long int: same as signed long long int
o signed long long: same as signed long long int
o signed long long int: the minimum range is [-(2^{63}-1), 2^{63}-1] (i.e. [-9223372036854775807,
9223372036854775807])
o unsigned long long: same as unsigned long long int
o unsigned long long int: the minimum range is [0, 2^{64}-1] (i.e. [0, 18446744073709551615])

Table II-20 Long long types


The following example displays the size of a long long type:
$ cat llong1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("long long=%zu bytes\n", sizeof(long long));  /* sizeof yields a size_t: use %zu */
    return EXIT_SUCCESS;
}
$ gcc -o llong1 -std=c99 -pedantic llong1.c
$ ./llong1
long long=8 bytes

The following example shows the biggest values of long long and unsigned long long types on
our computer (held in the x and z variables):
$ cat llong2.c
1 #include <stdio.h>
2 #include <math.h>
3 #include <stdlib.h>
4
5 int main(void) {
6     long long x = pow(2,63)-1;
7     long long y = pow(2,63);
8     unsigned long long z = pow(2,64)-1;
9
10     printf ("x=%lld\ny=%lld\nz=%llu\n", x, y, z);
11     return EXIT_SUCCESS;
12 }
$ gcc -o llong2 -std=c99 -pedantic llong2.c
llong2.c: In function main:
llong2.c:7:5: warning: overflow in implicit constant conversion
$ ./llong2
x=9223372036854775807
y=9223372036854775807
z=18446744073709551615

The y variable did not contain the expected value because of an overflow: the value 2^{63} does
not fit in a long long on our computer, as the compiler warned for line 7.

II.6.1.10 Boolean type
The Boolean type _Bool, introduced in C99, is an integer type that can store only two
values: 0, meaning false, and 1, meaning true. In C, the value 0 is considered false,
while any other value is treated as true. Thus, in C, the values 2 and -5 are both
considered true as shown below:
$ cat bool1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    if ( 2 ) {
        printf ("2 is TRUE\n") ;
    } else {
        printf ("2 is FALSE\n") ;
    }

    if ( 0 ) {
        printf ("0 is TRUE\n") ;
    } else {
        printf ("0 is FALSE\n") ;
    }

    if ( -5 ) {
        printf ("-5 is TRUE\n") ;
    } else {
        printf ("-5 is FALSE\n") ;
    }

    return EXIT_SUCCESS;
}
$ gcc -o bool1 -std=c99 -pedantic bool1.c
$ ./bool1
2 is TRUE
0 is FALSE
-5 is TRUE

Here is an example using two Boolean variables b1 and b2, showing that the value 0 is a
synonym for false while 1 is a synonym for true:
$ cat bool2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    _Bool b1 = 0;
    _Bool b2 = 1;

    if ( b1 ) {
        printf ("b1 is TRUE\n") ;
    } else {
        printf ("b1 is FALSE\n") ;
    }

    if ( b2 ) {
        printf ("b2 is TRUE\n") ;
    } else {
        printf ("b2 is FALSE\n") ;
    }

    return EXIT_SUCCESS;
}
$ gcc -o bool2 -std=c99 -pedantic bool2.c
$ ./bool2
b1 is FALSE
b2 is TRUE

If you attempt to assign a number different from 0 to a Boolean variable, it will take the
value 1:
$ cat bool3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    _Bool b1 = 0;
    _Bool b2 = 12;   /* nonzero: stored as 1 */
    _Bool b3 = -7;   /* nonzero: stored as 1 */

    printf ("b1=%d\n", b1) ;
    printf ("b2=%d\n", b2) ;
    printf ("b3=%d\n", b3) ;
    return EXIT_SUCCESS;
}
$ gcc -o bool3 -std=c99 -pedantic bool3.c
$ ./bool3
b1=0
b2=1
b3=1

The C language defines a macro called bool, in stdbool.h, that expands to _Bool. Thus, our
previous example can also be written like this:
$ cat bool4.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {
    bool b1 = 0;
    bool b2 = 12;
    bool b3 = -7;

    printf ("b1=%d\n", b1) ;
    printf ("b2=%d\n", b2) ;
    printf ("b3=%d\n", b3) ;
    return EXIT_SUCCESS;
}
$ gcc -o bool4 -std=c99 -pedantic bool4.c
$ ./bool4
b1=0
b2=1
b3=1

Though not often used, you can work with the macros true (expanded to 1) and false
(expanded to 0) defined in the header file stdbool.h:
$ cat bool5.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {
    bool b1 = true;
    bool b2 = false;

    printf ("b1=%d\n", b1) ;
    printf ("b2=%d\n", b2) ;

    if ( b1 == true ) {
        printf ("b1 is TRUE\n") ;
    } else {
        printf ("b1 is FALSE\n") ;
    }

    if ( b2 == true) {
        printf ("b2 is TRUE\n") ;
    } else {
        printf ("b2 is FALSE\n") ;
    }

    return EXIT_SUCCESS;
}
$ gcc -o bool5 -std=c99 -pedantic bool5.c
$ ./bool5
b1=1
b2=0
b1 is TRUE
b2 is FALSE

In the following example, we initialize the Boolean variables with expressions (see
Chapter IV):
$ cat bool6.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {
    int x = 5;
    bool b1 = x > 0; /* true */
    bool b2 = x < 10; /* true */

    printf ("b1=%d\n", b1) ;
    printf ("b2=%d\n", b2) ;

    return EXIT_SUCCESS;
}
$ gcc -o bool6 -std=c99 -pedantic bool6.c
$ ./bool6
b1=1
b2=1

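Though the Boolean type is an integer type, a conversion to it does not truncate the value: when
you assign any value different from 0 (even a fractional one) to a variable of type _Bool, it takes
the value 1. For example, 0.2 becomes 1 when stored in a bool but 0 when stored in an int: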
$ cat bool7.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void) {
    bool b = 0.2;   /* different from 0: stored as 1 */
    int i = 0.2;    /* fractional part discarded: stored as 0 */

    printf ("b=%d\n", b) ;
    printf ("i=%d\n", i) ;

    return EXIT_SUCCESS;
}
$ gcc -o bool7 -std=c99 -pedantic bool7.c
$ ./bool7
b=1
i=0



II.6.1.11 Limits
So far, we have talked about the different integer types defined by the C standard. Through
examples, we displayed the maximum values that can be held by variables depending
on their integer types, but we have not yet explained where the boundaries are defined.

The boundaries of integers (see Table II-21) are defined in the header file limits.h [18]. Limits
are not held in variables but are expressed in the form of macros. For now, you can view a
macro as an alias. For example, the directive #define CHAR_BIT 8 makes the symbolic name
CHAR_BIT (a macro) an alias for the number 8.

Table II-21 Boundaries of Integer types


The following C program displays the limits of integer types defined by your system.
$ cat limits_int.c
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>

int main(void) {
    printf ("CHAR_BIT=%d\n", CHAR_BIT);

    printf ("====CHAR====\n");
    printf ("SCHAR_MIN=%d (minimum value for signed char)\n", SCHAR_MIN);
    printf ("SCHAR_MAX=%d (maximum value for signed char)\n", SCHAR_MAX);
    printf ("UCHAR_MAX=%u (maximum value for unsigned char)\n", UCHAR_MAX);
    printf ("CHAR_MIN=%d (minimum value for char)\n", CHAR_MIN);
    printf ("CHAR_MAX=%d (maximum value for char)\n", CHAR_MAX);

    printf ("\n====SHORT====\n");
    printf ("SHRT_MIN=%d (minimum value for signed short)\n", SHRT_MIN);
    printf ("SHRT_MAX=%d (maximum value for signed short)\n", SHRT_MAX);
    printf ("USHRT_MAX=%u (maximum value for unsigned short)\n", USHRT_MAX);

    printf ("\n====INT====\n");
    printf ("INT_MIN=%d (minimum value for int)\n", INT_MIN);
    printf ("INT_MAX=%d (maximum value for int)\n", INT_MAX);
    printf ("UINT_MAX=%u (maximum value for unsigned int)\n", UINT_MAX);

    printf ("\n====LONG====\n");
    printf ("LONG_MIN=%ld (minimum value for long)\n", LONG_MIN);
    printf ("LONG_MAX=%ld (maximum value for long)\n", LONG_MAX);
    printf ("ULONG_MAX=%lu (maximum value for unsigned long)\n", ULONG_MAX);

    printf ("\n====LONG LONG====\n");
    printf ("LLONG_MIN=%lld (minimum value for long long)\n", LLONG_MIN);
    printf ("LLONG_MAX=%lld (maximum value for long long)\n", LLONG_MAX);
    printf ("ULLONG_MAX=%llu (maximum value for unsigned long long)\n", ULLONG_MAX);
    return EXIT_SUCCESS;
}

Of course, you have noticed that, in the second line, we included the limits.h header file since it
contains the limits. If we compile and run the program, we obtain this on our computer:
$ gcc -o limits_val -std=c99 -pedantic limits_int.c
$ ./limits_val
CHAR_BIT=8
====CHAR====
SCHAR_MIN=-128 (minimum value for signed char)
SCHAR_MAX=127 (maximum value for signed char)
UCHAR_MAX=255 (maximum value for unsigned char)
CHAR_MIN=-128 (minimum value for char)
CHAR_MAX=127 (maximum value for char)

====SHORT====
SHRT_MIN=-32768 (minimum value for signed short)
SHRT_MAX=32767 (maximum value for signed short)
USHRT_MAX=65535 (maximum value for unsigned short)

====INT====
INT_MIN=-2147483648 (minimum value for int)
INT_MAX=2147483647 (maximum value for int)
UINT_MAX=4294967295 (maximum value for unsigned int)

====LONG====
LONG_MIN=-2147483648 (minimum value for long)
LONG_MAX=2147483647 (maximum value for long)
ULONG_MAX=4294967295 (maximum value for unsigned long)

====LONG LONG====
LLONG_MIN=-9223372036854775808 (minimum value for long long)
LLONG_MAX=9223372036854775807 (maximum value for long long)
ULLONG_MAX=18446744073709551615 (maximum value for unsigned long long)


II.6.1.12 Overflow
II.6.1.12.1 Unsigned integers

Whatever the operations involving unsigned integers, there is no overflow. This implies
that if you assign to a variable of an unsigned integer type a value v (that may result from
an expression) less than the minimum value or greater than the maximum value, it will
still have a defined value. The actual value will be v modulo (umax+1), where umax is the
maximum value of the unsigned integer type. Thus, the value of the variable always
ranges from 0 through umax.

Let us consider a variable of type unsigned int. Its maximum value is UINT_MAX. If you
attempt to assign it the value UINT_MAX + 1, it will store the value (UINT_MAX + 1) modulo
(UINT_MAX+1), which yields 0. If you attempt to assign the value UINT_MAX + 2, it will store the
value (UINT_MAX + 2) modulo (UINT_MAX+1), which yields 1. If you attempt to assign the value
UINT_MAX + 3, it will store the value (UINT_MAX + 3) modulo (UINT_MAX+1), which yields 2:
$ cat unsigned_overflow.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

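int main(void) {
    unsigned int max1 = UINT_MAX + 1;   /* wraps around to 0 */
    unsigned int max2 = UINT_MAX + 2;   /* wraps around to 1 */
    unsigned int max3 = UINT_MAX + 3;   /* wraps around to 2 */

    /* %u is the specifier for unsigned int */
    printf("max1=%u max2=%u max3=%u\n", max1, max2, max3);

    return EXIT_SUCCESS;
}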
$ gcc -o unsigned_overflow -std=c99 -pedantic unsigned_overflow.c
$ ./unsigned_overflow
max1=0 max2=1 max3=2

Let us give a quick explanation of the mathematical modulo operator. In C, it is denoted by the symbol
%. The division of an integer n by a nonzero integer q can be written n = p * q + r, where p is the
integer quotient and r is the remainder such that |r| < |q|. The result of the modulo operation n mod q
(in C, it is written n % q) is the remainder r: n % q = r. For example, as 6 = 1 * 4 + 2, then 6 % 4 = 2.

Of course, for nonnegative integers, if n < q, then n % q = n, and if n = q, then n % q = 0.


II.6.1.12.2 Signed integers

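When an operation on signed integers produces a value less than the minimum value or greater
than the maximum value of its type, an overflow occurs and the behavior is undefined. When an
out-of-range value is merely converted to a signed integer type (for example by an assignment),
the result depends on the implementation.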

II.6.2 Real floating types


In a computer, any value is stored in a fixed number of bits according to its type. Real
numbers as mathematics defines them cannot be stored in a computer's memory because a
real number may have an infinite number of digits (for example π). Instead, in computing,
we work with floating-point numbers. The adjective floating means the decimal point can
have different positions (not fixed): the number 3.14 can also be written as 314 * 10^{-2} or
31.4 * 10^{-1} (the decimal point takes different positions). A floating-point number is
composed of three parts: the sign, the significand (sometimes referred to as a mantissa)
and the exponential part, which may be omitted, composed of the base representing a
numeral system and an exponent:
significand × base^{e}

In the decimal system, base is 10. In the binary system, base is 2. In the hexadecimal system, base is 16.
Consider the decimal number -31.4 * 10^{-1}:
o The sign is negative.
o The significand is 31.4.
o The exponential part is 10^{-1}.

The C language has two kinds of floating types: real floating types and complex (since
C99). Real floating types are finite real numbers. The C language defines three kinds of
real floating types: float, double and long double. The values represented by the type float are a
subset of the set of values represented by the type double. The values represented by the
type double are a subset of the set of values represented by the type long double.

The C standard does not enforce the way to represent floating-point numbers. Thus, the
number of bits representing the significand and the exponent is defined by the
implementation. The header file float.h contains a list of macros representing the radix
(base of the numeral system in which floating-point numbers are represented), the number
of digits of the significand (known as the precision), the minimum and maximum
values for the exponent, and so on. Each implementation defines its own values, which are greater
than or equal to the minimum values and less than or equal to the maximum values specified by
the C standard.

II.6.2.1 float
In C, a variable of type float is declared like this:
float variable_name;

Declaring a variable gives it a name and specifies the type of data it contains and
its size. You can also initialize a variable at the same time as its declaration (known
as a definition):
float variable_name = val;

o The semicolon (;) at the end of the statement is mandatory.
o The keyword float is at the beginning of the statement. It cannot be used for naming a
variable or a function. It is recognized as a special word denoting a type.
o Spaces around the equals sign and the semicolon are allowed.
o One or more spaces after the keyword float are required.
o variable_name is the name of the variable used to identify it.
o val can be a variable, a floating-point constant, or an integer constant. More generally, it
is an arithmetic expression (see Chapter IV).

To display a double or a float with printf(), you have three ways:
o by using %f: the number is displayed in the format [-]i.f, where i is the integral part and f
the fractional part of the number.
o by using the specifier %e, %g, %E or %G: %e displays a floating-point number in
scientific decimal notation (the exponent letter e appears in lowercase) while %g is either %e or %f
depending on the value and the precision of the number. The specifiers %E and %G are
equivalent to %e and %g respectively: they just display the exponent letter in uppercase.
o by using the specifier %a or %A that displays a floating-point number in scientific
hexadecimal notation.

The following example displays the variable x initialized with the floating constant
3.14159:
$ cat float1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 3.14159;
    printf("x=%f\n", x);

    return EXIT_SUCCESS;
}
$ gcc -o float1 -std=c99 -pedantic float1.c
$ ./float1
x=3.141590

Explanations:
o The statement float x = 3.14159 declares the x variable as type float and initializes it to the
value 3.14159.
o The statement printf("x=%f\n", x) displays the x variable.

There are two ways to display and initialize a floating-point number: with or without an
exponent part. The following example initializes the x variable by using the exponential
notation:
$ cat float2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 1.52e-3;
    printf("x (%%f)=%f\n", x);
    printf("x (%%e)=%e\n", x);
    printf("x (%%g)=%g\n", x);

    return EXIT_SUCCESS;
}
$ gcc -o float2 -std=c99 -pedantic float2.c
$ ./float2
x (%f)=0.001520
x (%e)=1.520000e-03
x (%g)=0.00152

Explanations:
o The statement float x = 1.52e-3 sets the x variable of type float to a floating-point literal by
using the exponential notation (1.52 × 10^{-3}).
o The first printf() function displays x with no exponent part (%f specifier).
o The second printf() function displays x with an exponent part (%e specifier).
o The third printf() function displays the variable x. The %g specifier refers to the most
appropriate format (either %f or %e).
o To display the % symbol, you have to precede it with another %. Otherwise, it is
considered a specifier. Hence, %%f appears as %f.

In C, a floating-point number that is too big to be represented is considered an infinite
number denoted by a special value called infinity (+infinity or -infinity) as shown below:
$ cat float3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 1e900; /* value too big => +infinity */
    float y = -1e900; /* value too big => -infinity */

    printf("%%f: x=%f and y=%f \n", x, y);
    printf("%%e: x=%e and y=%e \n", x, y);
    printf("%%g: x=%g and y=%g \n", x, y);

    return EXIT_SUCCESS;
}
$ gcc -o float3 -std=c99 -pedantic float3.c
float3.c: In function main:
float3.c:5:4: warning: floating constant exceeds range of double
float3.c:6:4: warning: floating constant exceeds range of double

$ ./float3
%f: x=Inf and y=-Inf
%e: x=Inf and y=-Inf
%g: x=Inf and y=-Inf


II.6.2.2 double
The type double is similar to type float with more digits to represent the significand and the
exponent. A variable of type double is declared like this:
double variable_name;

You could also initialize a variable at the same time as its declaration (definition):
double variable_name = val;

o The semicolon at the end of the statement is mandatory.
o The keyword double is at the beginning of the statement. It cannot be used for naming a
variable or a function. It is recognized as a special word denoting a type.
o Spaces around the equals sign and the semicolon are allowed.
o One or more spaces after the keyword double are required.
o val can be a variable, a floating-point constant, or an integer constant. More generally, it
is an arithmetic expression (expressions are broached in Chapter IV).

The type double can be used exactly in the same way as the type float. The difference is the
type double is a superset of the type float. The set of values represented by the type double
contains the set of values representable by the type float. The following example shows
that a variable of type double can hold bigger floating numbers than if it was of type float:
$ cat double1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 1.52e135;   /* too big for a float on our computer */
    printf("x (%%e)=%e\n", x);
    printf("x (%%g)=%g\n", x);

    double y = 1.52e135;
    printf("y (%%e)=%e\n", y);
    printf("y (%%g)=%g\n", y);

    return EXIT_SUCCESS;
}
$ gcc -o double1 -std=c99 -pedantic double1.c
$ ./double1
x (%e)=Inf
x (%g)=Inf
y (%e)=1.520000e+135
y (%g)=1.52e+135

On our computer, the number 1.52 * 10^{135} is too big to be held by the variable x of type float:
printf() displays it as Inf (infinity), while it fits in the variable y of type double.

The following example shows that the type double allows a better accuracy than the type float.
Two variables of type float and double are assigned a floating constant that is an
approximation of π. Neither variable can hold that many digits, so both values are
rounded to the nearest representable floating-point number.
$ cat double2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    double dbl_pi = 3.141592653589793238462643383279;
    float flt_pi = 3.141592653589793238462643383279;

    printf("literal =3.141592653589793238462643383279\n");
    printf("dbl_pi =%.30lf\n", dbl_pi);
    printf("flt_pi =%.30f\n", flt_pi);

    return EXIT_SUCCESS;
}
$ gcc -o double2 -std=c99 -pedantic double2.c
$ ./double2
literal =3.141592653589793238462643383279
dbl_pi =3.141592653589793115997963468544
flt_pi =3.141592741012573242187500000000

The type double has a precision greater than or equal to the precision of the type float. On our
computer, the double variable has fifteen correct digits after the decimal point while the float
variable has six. Section II.6.2.6 will explain why.

II.6.2.3 long double

The type long double can be used in the same way as the types double and float. A variable of
type long double is declared like this:
long double variable_name;

The C language allows you to initialize a variable at the same time as its declaration:
long double variable_name = val;

o The semicolon at the end of the statement is mandatory.
o The keywords long double are at the beginning of the statement.
o Spaces around the equals sign and the semicolon are allowed.
o One or more spaces after the keywords long double are required.
o val can be a variable, a floating-point constant, or an integer constant. More generally, it
is an arithmetic expression (see Chapter IV).

To display a long double with printf(), you have three ways:
o by using %Lf: the number is displayed in the format [-]i.f, where i is the integral part and f
the fractional part of the number.
o by using %Le, %Lg, %LE or %LG: %Le displays a floating-point number in scientific
decimal notation (the exponent letter e appears in lowercase) while %Lg is either %Le or %Lf
depending on the value and the precision of the number. %LE and %LG are equivalent to
%Le and %Lg respectively: they just display the exponent letter in uppercase.
o by using %La or %LA that displays a floating-point number in scientific hexadecimal
notation.

The type long double works in the same way as the types float and double. It is a superset of
the double type. The following example tries to display the number π with 30 digits after the
decimal point after storing it into the dbl_pi variable having the type double and into the
ldbl_pi variable of type long double:
$ cat ldbl1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    double dbl_pi = 3.141592653589793238462643383279;
    /* The L suffix makes the constant a long double instead of a double */
    long double ldbl_pi = 3.141592653589793238462643383279L;

    printf("literal =3.141592653589793238462643383279\n");
    printf("dbl_pi =%.30f\n", dbl_pi);
    printf("ldbl_pi =%.30Lf\n", ldbl_pi);

    return EXIT_SUCCESS;
}
$ gcc -o ldbl1 -lm -std=c99 -pedantic ldbl1.c
$ ./ldbl1
literal =3.141592653589793238462643383279
dbl_pi =3.141592653589793115997963468544
ldbl_pi =3.141592653589793238512808959406

The long double type has a precision greater than or equal to that of the type double. On our
computer, the double variable has fifteen correct digits after the decimal point while the long
double variable has eighteen.

The range of values represented by the long double type is greater than or equal to that of the
type double. In the following example, on our operating system, the number 10^{3000} assigned
to a variable of type double is treated as infinite while it can be represented by the type long
double.
$ cat ldbl2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    double dbl = 1e3000 ;
    long double ldbl = 1e3000L;   /* the L suffix makes the constant a long double */

    printf("dbl =%f\n", dbl);
    printf("ldbl =%Lg\n", ldbl);

    return EXIT_SUCCESS;
}
$ ./ldbl2
dbl =Inf
ldbl =1e+3000


II.6.2.4 Infinity
Floating-point numbers that are too large to be represented by a real floating type are
considered infinite. In the following example, the floating-point numbers 10^{5000} and
-10^{5000} cannot be represented by the type float: they are treated as +infinity and -infinity:
$ cat float_infinite.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float x = 1e5000 ;    /* too big => +infinity */
    float y = -1e5000 ;   /* too big => -infinity */

    printf("x=%f and y=%f\n", x, y);

    return EXIT_SUCCESS;
}
$ gcc -o float_infinite -std=c99 -pedantic float_infinite.c
float_infinite.c: In function main:
float_infinite.c:5:4: warning: floating constant exceeds range of double
float_infinite.c:6:4: warning: floating constant exceeds range of double
$ ./float_infinite
x=Inf and y=-Inf


II.6.2.5 NaN
Operations or functions dealing with floating-point numbers may yield special values
known as NaN. NaNs (Not a Number) represent undefined values. There can be several NaNs
whose values depend on the implementation. For example, the square root of -1, sqrt(-1),
produces NaN. The following operations also produce NaN: 0/0, infinity/infinity, infinity - infinity, 0 * infinity. Here is an example:
$ cat float_NaN.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void) {
    double v = 1E900; /* Infinite */
    double u = 1E-900; /* 0 */
    double w = v * 0; /* NaN */
    double x = v / v; /* NaN */
    double y = v - v; /* NaN */
    double z = u/u; /* NaN */

    printf("square root(-1): sqrt(-1)=%f\n", sqrt(-1));

    printf("v=%f u=%f\n", v, u);
    printf("v*0=%f\n", w);
    printf("v/v=%f\n", x);
    printf("v-v=%f\n", y);
    printf("u/u=0/0=%f\n", z);

    return EXIT_SUCCESS;
}
$ gcc -o float_NaN -std=c99 -pedantic -lm float_NaN.c
float_NaN.c: In function main:
float_NaN.c:6:4: warning: floating constant exceeds range of double
float_NaN.c:7:4: warning: floating constant truncated to zero
$ ./float_NaN
square root(-1): sqrt(-1)=-NaN
v=Inf u=0.000000
v*0=-NaN
v/v=-NaN
v-v=-NaN
u/u=0/0=-NaN


II.6.2.6 Floating-point limits
In scientific notation, a floating-point number is composed of three parts: a sign, a
significand and an exponent part. The significand is made up of an integer part, the radix
point, and a fractional part. The exponent part may be omitted such as in the number 3.14
(instead of 3.14 * 10^{0}). A floating-point number has the form ± m × b^{e}, where:
o ± is the sign. It can be positive or negative.
o m is the significand (sometimes referred to as a mantissa). It is a number with a
fractional part.
o b represents the base or radix. In the base 10 number system, b is 10. In the binary
number system, b is 2. Generally, systems work with base 2 but nothing prevents them from
using another base.
o e is the exponent. It is an integer that can be positive, zero or negative.

As our computer has a finite memory and therefore stores floating-point numbers in a fixed
bit-length memory chunk, how could the number 3.14 be stored? Should it be stored as
0.314 * 10^{1} or 314 * 10^{-2}? How many bits should be reserved for the significand and how
many bits for the exponent?

The first issue is that a floating-point number may be written in several ways: 3.14,
31.4 × 10^{-1}, 0.314 × 10^{1}... That is why a floating-point number is normalized so as to have a
single representation of the number. The normalization of a number depends on the
representation that is adopted. For example, a normalized floating-point number could
start with 0, followed by the radix point followed by a nonzero digit such as 0.314 × 10^{1}.

In order to store a floating-point number, a specific representation must be used [19]. There
exist several representations of floating-point numbers. The most widely used is described
by the standard IEEE 754, also referred to as ISO/IEC/IEEE 60559. To understand the
limits of the C language, defined in the header file float.h, we have to resort to a
representation of floating-point numbers. Otherwise, they would appear cryptic. In the
following section, we resort to the examples of floating-point representation given by the
C99 standard, deriving from the representations described by the standard IEEE 754.

II.6.2.7 Example of representation
A floating-point number could be represented as follows (see the beginning of the chapter
about numeral systems):
fnb = sign × m × b^{e}
where m = d1×b^{-1} + d2×b^{-2} + ... + dn×b^{-n}
where emin ≤ e ≤ emax
where 0 ≤ di ≤ b-1
Where:
o sign is the sign of the floating-point number (±).
o b is the radix. In the decimal numeral system, b is 10. In binary base, b is 2. In C99, it is
denoted by the macro FLT_RADIX.
o d1, d2, ..., dn are digits expressed in the base b number system. They are in the range of
the natural numbers [0, b-1]. For example, in base 2, they can be either 0 or 1. In base
10, the digits are in the integral interval [0, 9].
o n is the number of digits of the significand, known as the precision. The C99 standard
represents it by the macro FLT_MANT_DIG for the type float, DBL_MANT_DIG for the type
double, LDBL_MANT_DIG for the type long double.
o e is the exponent within the integral range [emin, emax]. The values emin and emax depend
on the implementation and the floating type. In C99, emin is called FLT_MIN_EXP for the
type float, DBL_MIN_EXP for the type double, LDBL_MIN_EXP for the type long double. emax is
called FLT_MAX_EXP for the type float, DBL_MAX_EXP for the type double, LDBL_MAX_EXP
for the type long double.

For example, in base 10, the number 3.14 can be represented as 0.314 × 10^{1} = (3×10^{-1} + 1×10^{-2} + 4×10^{-3}) × 10^{1}. It is composed of:
o The sign +.
o The significand 0.314: d1=3, d2=1, d3=4 and 0 ≤ di ≤ 9. Its precision is 3.
o The exponent 1.
o The base 10.

A variable of real floating type can take several kinds of values:
o Finite floating-point numbers:
  - If the floating-point number fnb is not zero and d1 > 0, the number is said to be
    normalized.
  - If the floating-point number fnb is not zero, d1=0 and e = emin, the number is said
    to be denormalized. Denormalized numbers (also called subnormal) are too small to
    be represented as normalized numbers. They can be used to represent very small
    floating-point numbers.
o Infinite numbers: +infinity and -infinity. The values depend on the implementation.
o NaN (Not a number) representing an undetermined value. There can be several kinds of
NaN whose values depend on the implementation.

What is the difference between normalized and denormalized floating-point numbers? The
normalized form ensures a single way to represent a finite floating-point number: the very
first significant digit d1 is different from 0. The denormalized form is used to represent
numbers too small to be represented by the normalized form: the first digit d1 is 0, which
yields the loss of one digit of precision. In our representation, a normalized floating-point
number takes the form 0.d1d2d3... × b^{e}. For example, the number -827.6 takes the
normalized form -0.8276 × 10^{3} composed of:
o The sign is -.
o The significand is 0.8276: d1=8, d2=2, d3=7 and d4=6. Its precision is 4.
o The exponent is 3.
o The base is 10.

Likewise, a binary number [20] such as 101.11₂ can be normalized. With one nonzero digit
before the radix point (the convention used by IEEE 754), it takes the form 1.0111₂ × 2^{2}:
o The sign is +.
o The significand is 1.0111₂.
o The precision is 5: d1=1, d2=0, d3=1, d4=1, d5=1.
o The exponent is 2.
o The radix is 2.


How do you think we could convert the binary number 101.11 into a decimal number?

101.11₂ = 1*2^{2} + 0*2^{1} + 1*2^{0} + 1*2^{-1} + 1*2^{-2} = 5 + 0.75 = 5.75.

So, the binary number 101.11 has the normalized form 1.0111₂ × 2^{2} and stands for 5.75 in the decimal number system.

In Figure II-7, we have represented the intervals for normalized and denormalized
numbers. In our representation, the bounds can be computed easily; they are given below:
NFLPmax = b^{emax} × (1 - b^{-n})
NFLPmin = b^{emin-1}
DFLPmax = b^{emin-1} × (1 - b^{-n+1})
DFLPmin = b^{emin-n}

Where:
o NFLPmax is the maximum normalized floating-point number. It represents the largest
representable finite number. In C, it is represented by the macro FLT_MAX for the type
float, DBL_MAX for the type double and LDBL_MAX for the type long double.
o NFLPmin is the minimum normalized floating-point number. It represents the smallest
representable number without losing precision. In C, it is denoted by the macro FLT_MIN
for the type float, DBL_MIN for the type double and LDBL_MIN for the type long double.
o DFLPmax is the maximum denormalized floating-point number. It is not specified in C.
o DFLPmin is the minimum denormalized floating-point number. It represents the smallest
representable number but with precision loss. It is not specified in C.

Figure II-7 Ranges of normalized and denormalized floating-point numbers

If the base is 2:
\[ NFLP_{max} = 2^{e_{max}} (1 - 2^{-n}) \]
\[ NFLP_{min} = 2^{e_{min}-1} \]

\[ DFLP_{max} = 2^{e_{min}-1} (1 - 2^{-n+1}) \]
\[ DFLP_{min} = 2^{e_{min}-n} \]

A normalized floating-point number is in the range [-NFLPmax, -NFLPmin] U [NFLPmin,
NFLPmax]. A denormalized floating-point number is in the range [-DFLPmax, -DFLPmin] U
[DFLPmin, DFLPmax].

Not all normalized and denormalized floating-point numbers can be represented because
the number of digits for the significand is finite while a real number can
have any number of significant digits. Figure II-7 shows several bounds: NFLPmin,
NFLPmax, DFLPmin and DFLPmax. A real floating-point number with a precision m > n (n
being the largest precision defined by the system according to the floating type) cannot be
represented exactly and is then rounded to the nearest representable floating-point number.
A floating-point number whose absolute value is greater than NFLPmax cannot be represented
either (overflow): it is considered as infinite. A floating-point
number whose absolute value is less than NFLPmin is not a normalized number (underflow) but can be
approximated by a denormalized number with precision loss. A
floating-point number whose absolute value is less than DFLPmin is not representable at all.

Let us compute DFLPmax, DFLPmin, NFLPmax and NFLPmin. We are going to play with
mathematics. The significand of a normalized number takes the form \(d_1 b^{-1} + d_2 b^{-2} + \ldots + d_n b^{-n}\) where \(d_1 > 0\). The maximum
normalized floating-point number NFLPmax is equal to:
\[ b^{e_{max}} \left( (b-1) b^{-1} + (b-1) b^{-2} + \ldots + (b-1) b^{-n} \right) \]

The minimum normalized floating-point number NFLPmin is equal to:
\[ b^{e_{min}} \left( 1 \times b^{-1} + 0 \times b^{-2} + \ldots + 0 \times b^{-n} \right) = b^{e_{min}} \times b^{-1} = b^{e_{min}-1} \]

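In mathematics, the geometric series \(1 + q + q^{2} + \ldots + q^{n}\) equals \((1 - q^{n+1})/(1 - q)\) for \(q \neq 1\), which implies:
\[ 1 + r^{-1} + r^{-2} + \ldots + r^{-n} = 1 + \frac{1}{r} + \left(\frac{1}{r}\right)^{2} + \ldots + \left(\frac{1}{r}\right)^{n} = \frac{1 - 1/r^{n+1}}{1 - 1/r} \]

So, we can write:

\[
\begin{aligned}
(b-1) b^{-1} + (b-1) b^{-2} + \ldots + (b-1) b^{-n}
 &= (b-1)\, b^{-1} \left( 1 + \frac{1}{b} + \frac{1}{b^{2}} + \ldots + \frac{1}{b^{n-1}} \right) \\
 &= (b-1)\, b^{-1}\, \frac{1 - b^{-n}}{1 - b^{-1}} \\
 &= (b-1)\, \frac{1 - b^{-n}}{b - 1} \\
 &= 1 - b^{-n}
\end{aligned}
\]

Then, \(NFLP_{max} = b^{e_{max}} (1 - b^{-n})\).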



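Let us move on and compute the maximum and minimum denormalized floating-point numbers, respectively denoted
by DFLPmax and DFLPmin.

\[
\begin{aligned}
DFLP_{max} &= b^{e_{min}} \left( (b-1) b^{-2} + \ldots + (b-1) b^{-n} \right) \\
 &= b^{e_{min}} (b-1)\, b^{-2} \left( 1 + \frac{1}{b} + \frac{1}{b^{2}} + \ldots + \frac{1}{b^{n-2}} \right) \\
 &= b^{e_{min}} (b-1)\, b^{-2}\, \frac{1 - b^{-n+1}}{1 - b^{-1}} \\
 &= b^{e_{min}} (b-1)\, b^{-1}\, \frac{1 - b^{-n+1}}{b - 1} \\
 &= b^{e_{min}}\, b^{-1} (1 - b^{-n+1})
\end{aligned}
\]

\[ DFLP_{max} = b^{e_{min}-1} (1 - b^{-n+1}) \]

\[ DFLP_{min} = b^{e_{min}} \left( 0 \times b^{-2} + \ldots + 1 \times b^{-n} \right) = b^{e_{min}-n} \]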

Figure II-8 Binary floating-point representation


The C99 standard specifies another value represented by the macro FLT_EPSILON for the
type float, DBL_EPSILON for the type double and LDBL_EPSILON for the type long double. Let us call
it epsilon. It is the difference between 1 and the smallest representable number greater than 1.
In other words, it is the smallest significand step (with no order of magnitude: the exponent is set to 0)
such that 1 + epsilon > 1. With our representation, its value would be:
\[ \text{epsilon} = b^{1-n} \]

For a positive floating-point number v no greater than epsilon/2, with the usual round-to-nearest mode, 1 + v = 1!



Let us compute epsilon. Written with i digits:

\[ 1 + \text{epsilon} = 1 + d_1 b^{-1} + \ldots + d_i b^{-i} \]

The normalized form of that number is:
\[ 1 + \text{epsilon} = \left( b^{-1} + d_1 b^{-2} + \ldots + d_i b^{-i-1} \right) \times b \]
The smallest such number greater than \(1 = b^{-1} \times b\) is obtained with \(d_1 = 0, d_2 = 0, \ldots, d_i = 1\)
and \(i + 1 = n\), because n is the maximum number of digits for a significand (precision).
Then, \(i = n - 1\) and \(\text{epsilon} = b^{-(n-1)} = b^{1-n}\).


Table II-22 shows examples of binary floating-point representation for the types float and
double.

Table II-22 Example of values for floating-point numbers


II.6.2.8 Limits
The C language does not impose a specific representation for floating-point numbers: the base
(radix) and the sizes of the exponent and the significand are left to implementations. Table
II-23 and Table II-24 describe some limits represented by macros defined in the header
file float.h. Macros beginning with FLT apply to type float. Macros beginning with DBL apply
to type double. Macros beginning with LDBL apply to type long double.

Table II-23 Some minimum limits defined in float.h

Table II-24 Some maximum limits defined in float.h


The following program displays the limits listed in Table II-23 and Table II-24 for the type
float:
$ cat float_max.c
#include <stdio.h>
#include <float.h>
#include <stdlib.h>

int main(void) {
  printf("FLT_RADIX=%d\n", FLT_RADIX);
  printf("FLT_MANT_DIG=%d\n", FLT_MANT_DIG);
  printf("FLT_MIN_EXP=%d\n", FLT_MIN_EXP);
  printf("FLT_MAX_EXP=%d\n", FLT_MAX_EXP);
  printf("FLT_MIN_10_EXP=%d\n", FLT_MIN_10_EXP);
  printf("FLT_MAX_10_EXP=%d\n", FLT_MAX_10_EXP);
  printf("FLT_MIN=%e\n", FLT_MIN);
  printf("FLT_MAX=%e\n", FLT_MAX);
  printf("FLT_DIG=%d\n", FLT_DIG);
  printf("FLT_EPSILON=%e\n", FLT_EPSILON);

  return EXIT_SUCCESS;
}

On our computer, after compiling the program, we get this:
$ gcc -o float_max -std=c99 -pedantic float_max.c
$ ./float_max
FLT_RADIX=2
FLT_MANT_DIG=24
FLT_MIN_EXP=-125
FLT_MAX_EXP=128
FLT_MIN_10_EXP=-37
FLT_MAX_10_EXP=38
FLT_MIN=1.175494e-38
FLT_MAX=3.402823e+38
FLT_DIG=6
FLT_EPSILON=1.192093e-07

The following program displays the limits listed in Table II-23 and Table II-24 for the type
double:
$ cat dbl_max.c
#include <stdio.h>
#include <float.h>
#include <stdlib.h>

int main(void) {
  printf("FLT_RADIX=%d\n", FLT_RADIX);
  printf("DBL_MANT_DIG=%d\n", DBL_MANT_DIG);
  printf("DBL_MIN_EXP=%d\n", DBL_MIN_EXP);
  printf("DBL_MAX_EXP=%d\n", DBL_MAX_EXP);
  printf("DBL_MIN_10_EXP=%d\n", DBL_MIN_10_EXP);
  printf("DBL_MAX_10_EXP=%d\n", DBL_MAX_10_EXP);
  printf("DBL_MIN=%e\n", DBL_MIN);
  printf("DBL_MAX=%e\n", DBL_MAX);
  printf("DBL_DIG=%d\n", DBL_DIG);
  printf("DBL_EPSILON=%e\n", DBL_EPSILON); /* %e, not %Le: the value is a double */

  return EXIT_SUCCESS;
}

If we run it on our computer, we get this:
$ ./dbl_max
FLT_RADIX=2
DBL_MANT_DIG=53
DBL_MIN_EXP=-1021
DBL_MAX_EXP=1024
DBL_MIN_10_EXP=-307
DBL_MAX_10_EXP=308
DBL_MIN=2.225074e-308
DBL_MAX=1.797693e+308
DBL_DIG=15
DBL_EPSILON=2.220446e-16

The following program displays the limits listed in Table II-23 and Table II-24 for the type
long double:
$ cat ldbl_max.c
#include <stdio.h>
#include <float.h>
#include <stdlib.h>

int main(void) {
  printf("FLT_RADIX=%d\n", FLT_RADIX);
  printf("LDBL_MANT_DIG=%d\n", LDBL_MANT_DIG);
  printf("LDBL_MIN_EXP=%d\n", LDBL_MIN_EXP);
  printf("LDBL_MAX_EXP=%d\n", LDBL_MAX_EXP);
  printf("LDBL_MIN_10_EXP=%d\n", LDBL_MIN_10_EXP);
  printf("LDBL_MAX_10_EXP=%d\n", LDBL_MAX_10_EXP);
  printf("LDBL_MIN=%Le\n", LDBL_MIN);
  printf("LDBL_MAX=%Le\n", LDBL_MAX);
  printf("LDBL_DIG=%d\n", LDBL_DIG);
  printf("LDBL_EPSILON=%Le\n", LDBL_EPSILON);

  return EXIT_SUCCESS;
}

If we run it on our computer, we get this:
$ ./ldbl_max
FLT_RADIX=2
LDBL_MANT_DIG=64
LDBL_MIN_EXP=-16381
LDBL_MAX_EXP=16384
LDBL_MIN_10_EXP=-4931
LDBL_MAX_10_EXP=4932
LDBL_MIN=3.362103e-4932
LDBL_MAX=1.189731e+4932
LDBL_DIG=18
LDBL_EPSILON=1.084202e-19


As floating-point numbers have an internal binary representation in computers, the decimal
floating-point numbers you use may actually be approximations. Consider the decimal
floating-point numbers 0.5 and 0.125: their binary representations are 0.1 (\(0.5 = 1 \times 2^{-1}\)) and 0.001
(\(0.125 = 0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}\)) respectively. Both numbers are accurately represented in binary.
Now, consider the number 0.1: in binary, it is written 0.0001100110011... Whatever the
precision adopted, the decimal floating-point number 0.1 will never be represented
accurately in base 2. Therefore, we have four kinds of issues with floating-point
numbers:
o A floating-point number with too many digits cannot be represented
accurately: it is approximated.
o A floating-point number with a magnitude too large (such as \(10^{9999}\)) cannot be
represented: it is considered infinite.
o A floating-point number with a magnitude too small (such as \(10^{-9999}\)) cannot be
represented: it is considered 0.
o A decimal floating-point number may be approximated if FLT_RADIX is not 10 (it is usually
2).


If a floating-point number, expressed in base 10, has a precision greater than FLT_DIG (for
float), DBL_DIG (for double), or LDBL_DIG (for long double), there may be a loss of accuracy.

Consider the following example:
$ cat float_limit1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float x = 3.1415926535;
  printf("x set to 3.1415926535. x=%.10f\n", x);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit1 float_limit1.c
$ ./float_limit1
x set to 3.1415926535. x=3.1415927410

In our example, the x variable is set to a decimal floating-point literal (3.1415926535) with a
precision of 11, which is greater than FLT_DIG. The literal is converted to a
binary number (if FLT_RADIX is 2, which is generally the case) with a precision of
FLT_MANT_DIG and rounded if required before being stored into the variable. This means
we may not get exactly the same number: there may be a loss of accuracy. There
is no loss if the floating-point number has a precision less than or equal to FLT_DIG
digits. In the following example, the literals have more digits than that, and only the
leading digits (at least FLT_DIG of them) are preserved:
$ cat float_limit2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float x;

  x = 3.14159;
  printf("x set to 3.14159. x=%f\n", x);

  x = 33.14159;
  printf("x set to 33.14159. x=%f\n", x);

  x = 333.14159;
  printf("x set to 333.14159. x=%f\n", x);

  x = 3333.14159;
  printf("x set to 3333.14159. x=%f\n", x);

  x = 33333.14159;
  printf("x set to 33333.14159. x=%f\n", x);

  x = 333333.14159;
  printf("x set to 333333.14159. x=%f\n", x);

  x = 3333333.14159;
  printf("x set to 3333333.14159. x=%f\n", x);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit2 -std=c99 -pedantic float_limit2.c
$ ./float_limit2
x set to 3.14159. x=3.141590
x set to 33.14159. x=33.141590
x set to 333.14159. x=333.141602
x set to 3333.14159. x=3333.141602
x set to 33333.14159. x=33333.140625
x set to 333333.14159. x=333333.156250
x set to 3333333.14159. x=3333333.250000

The example shows that the larger the magnitude of a floating-point number, the fewer
significant digits remain for its fractional part, which can even be lost entirely, as
shown below:
$ cat float_limit3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float f = 8888888.125;
  float g = 8888888.225;

  printf("%f-%f=%g\n", g, f, g-f);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit3 -std=c99 -pedantic float_limit3.c
$ ./float_limit3
8888888.000000-8888888.000000=0

The least significant digits of the integral part may be discarded and the number may be
rounded, as shown by the following example:
$ cat float_limit4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float f = 777777777; /* precision of 9 */

  printf("777777777=%f\n", f);
  printf("777777777=%e\n", f);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit4 -std=c99 -pedantic float_limit4.c
$ ./float_limit4
777777777=777777792.000000
777777777=7.777778e+08

When a number is too big to be held in a variable of type float, it takes the symbolic value
Inf (or -Inf):
$ cat float_limit5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float x = 10e+130;
  float y = -10e+130;

  printf("x=%f\ny=%f\n", x, y);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit5 -lm -std=c99 -pedantic float_limit5.c
$ ./float_limit5
x=Inf
y=-Inf

It is possible to have numbers less than FLT_MIN. They are denormalized numbers. In the
following example, we display a number less than FLT_MIN:
$ cat float_limit6.c
#include <stdio.h>
#include <stdlib.h>
#include <float.h>

int main(void) {
  float x = FLT_MIN*0.01;

  printf("FLT_MIN=%e\n", FLT_MIN);
  printf("FLT_MIN*0.01=%e\n", x);

  return EXIT_SUCCESS;
}
$ gcc -o float_limit6 -std=c99 -pedantic float_limit6.c
$ ./float_limit6
FLT_MIN=1.175494e-38
FLT_MIN*0.01=1.175493e-40

The decimal floating-point number 1.25 has a precision of 3 while the decimal floating-point number
1.250 has a precision of 4. Mathematically, they are equal but there is a subtle distinction: the first notation indicates we
are sure that the least significant digit is 5, while the digits afterwards are unknown and therefore not written. The second
notation shows our quantity is known accurately with three digits after the decimal point.

II.6.3 Complex types


In mathematics, a complex number takes the form:
a + i b

Where a and b are real numbers, and i is the imaginary unit, equal to \(\sqrt{-1}\)
(i.e. \(i^{2} = -1\)). The real
number a is called the real part of the complex number and b the imaginary part. An
imaginary number is a complex number with no real part, having the form: i b. In C, real
floating types and complex types are called floating types.

In C (as of C99), the complex type is called _Complex, and the imaginary type is called
_Imaginary. However, in practice, these names are not often used because the header file complex.h
defines more natural type names: complex and imaginary.


The header file complex.h defines several useful functions and macros:
o complex, which expands to _Complex. You can then define a variable holding a complex
number as complex or _Complex. Both are equivalent.
o imaginary, which expands to _Imaginary. Thus, you can define a variable holding an imaginary
number as imaginary or _Imaginary. Both are equivalent.
o _Imaginary_I and _Complex_I (imaginary unit), which expand to a constant i such that \(i^{2} = -1\).
o I (representing the imaginary unit), which expands to _Complex_I or _Imaginary_I. If
_Imaginary_I is not implemented, it expands to _Complex_I.

The imaginary type may not be supported on your system. Accordingly, the macros
imaginary and _Imaginary_I would not be defined.

As a matter of fact, there are three kinds of complex types:
o float _Complex (same as float complex if you include complex.h): real and imaginary parts are of
type float.
o double _Complex (same as double complex if you include complex.h) : real and imaginary parts
are of type double.
o long double _Complex (same as long double complex if you include complex.h) : real and
imaginary parts are of type long double.

Likewise, if the imaginary type is implemented, three kinds of imaginary types can be
used:
o float _Imaginary (same as float imaginary if you include complex.h)
o double _Imaginary (same as double imaginary if you include complex.h)
o long double _Imaginary (same as long double imaginary if you include complex.h)

To get the real part of a complex number, use the functions crealf(), creal() or creall(),
defined in complex.h, whose prototypes are given below:
float crealf(float complex z);
double creal(double complex z);
long double creall(long double complex z);

If you declare a variable of type float complex, call the function crealf(). If you declare a
variable of type double complex, call the function creal(). If you declare a variable of type
long double complex, call the function creall().

To get the imaginary part of a complex number, use the functions cimagf(), cimag() or cimagl(),
defined in complex.h, whose prototypes are shown below:
float cimagf(float complex z);
double cimag(double complex z);
long double cimagl(long double complex z);

Not all compilers support complex types.



For example:
$ cat complex.c
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>

int main(void) {
  double complex z1 = 1 + 2*I;
  double complex z2 = 1.1 + 2.2*I;
  double complex z3 = z1 + z2;

  printf("z1=%f + %f i\n", creal(z1), cimag(z1));
  printf("z2=%f + %f i\n", creal(z2), cimag(z2));
  printf("z3=%f + %f i\n", creal(z3), cimag(z3));

  return EXIT_SUCCESS;
}
$ gcc -o complex -std=c99 -pedantic complex.c
$ ./complex
z1=1.000000 + 2.000000 i
z2=1.100000 + 2.200000 i
z3=2.100000 + 4.200000 i

II.7 Types of constants


We talked about constants but said hardly anything about their type. If it is obvious that the
constant 12 is an integer, we could wonder what kind of integer type it has: int, unsigned int,
long...?

It is worth noting that integer and floating constants are never negative numbers. The minus sign
before arithmetic constants is treated as a unary operator (see Chapter IV Section IV.2.2)
that is not part of the constant. For example, when you write int v = -12, the integer constant
is 12, not -12, while the variable v actually holds a negative value (-12).

II.7.1 Character constants


A character constant such as 'Z' has type int. An object of type char can hold any basic
character as a positive integer. While a basic character fits in one byte, an extended character
may be represented by more than one byte. For example, in UCS, the character €
has the integer value 0x20AC. The character encoding UTF-8 represents it by three bytes:
0xE2, 0x82, and 0xAC. Basic characters can be represented by a character type (char, signed char
or unsigned char) while extended characters (such as €), described in Chapter IX, are
represented by one or more bytes (multibyte characters) or as a wide character (wchar_t).

II.7.2 Integer constants


The C language defines a list of suffixes for integer constants specifying their type: u or U
for unsigned, l or L for long, ll or LL for long long. The suffix u or U can be combined with l
(or L) and ll (or LL), which leads to several possibilities. According to C99:
o No suffix:
If a decimal integer constant has no suffix, the first integer type that can hold it is
used according to the following order:
int, long, long long
If a hexadecimal or octal integer constant has no suffix, the first integer type that
can hold it is used according to the following order:
int, unsigned int, long, unsigned long, long long, unsigned long long

o Suffix U:
If a decimal, hexadecimal or octal integer constant has the suffix U, the first integer
type that can hold it is used according to the following order:
unsigned int, unsigned long, unsigned long long

o Suffix L:
If a decimal integer constant has the suffix L, the first integer type that can hold it is
used according to the following order:
long, long long
If a hexadecimal or octal integer constant has the suffix L, the first integer type that
can hold it is used according to the following order:
long, unsigned long, long long, unsigned long long

o Suffix UL:
If a decimal, hexadecimal or octal integer constant has the suffix UL, the first
integer type that can hold it is used according to the following order:
unsigned long, unsigned long long

o Suffix LL:
If a decimal integer constant has the suffix LL, the first integer type that can hold it is:
long long
If a hexadecimal or octal integer constant has the suffix LL, the first integer type
that can hold it is used according to the following order:
long long, unsigned long long

o Suffix ULL:
If a decimal, hexadecimal or octal integer constant has the suffix ULL, the first
integer type that can hold it is:
unsigned long long

For example, the integer constants 12, 0xFA and 012 have type int. The integer constant 12U has
type unsigned int. The integer constant 12LL has type long long.

II.7.3 Floating constants


Real floating constants can be of type float, double or long double. Suffixes can be appended to
floating constants to specify their type: f (or F) for float, l (or L) for long double. With no
suffix, a floating constant is of type double. Here are some floating constants: 1.0, 1., 3.14e1,
3.1e-2, 2.8f, 2.618e-2L.

II.8 Type qualifiers


The C language specifies three kinds of type qualifiers: const, volatile and restrict[21]. A type
without a qualifier is called an unqualified type, such as int or float. A type with a qualifier is
called a qualified type: const int, volatile int, const volatile int... The restrict qualifier applies
only to pointer types (for example, int * restrict). A
type can be qualified with one, two or three qualifiers in any order. A qualifier does not
change the representation of a type but the way it is used. For example, an object of type
const int has the same representation as an int but it is used as a read-only object.

II.8.1 Const
So far, our variables could be altered at any time. In some cases, programmers do not
want their variables to be modified. The C language defines the type qualifier const, which tells
the compiler that the variable it qualifies cannot be modified once created. The const
qualifier can be placed before or after the type it qualifies. Such a variable is not an actual
constant such as 16, 1.2, or "hello".

For example:
$ cat const1.c
#include <stdlib.h>

int main(void) {
  float const pi = 3.14;
  pi = 3.1459;

  return EXIT_SUCCESS;
}
$ gcc -o const1 -std=c99 -pedantic const1.c
const1.c: In function 'main':
const1.c:5:3: error: assignment of read-only variable 'pi'

The compilation failed because we tried to modify the variable pi declared as read-only
with the qualifier const. What happens if we do not initialize it at declaration time?
$ cat const2.c
#include <stdlib.h>

int main(void) {
  float const pi;

  pi = 3.14;

  return EXIT_SUCCESS;
}
$ gcc -o const2 -std=c99 -pedantic const2.c
const2.c: In function 'main':
const2.c:6:3: error: assignment of read-only variable 'pi'

We get the same error. So, do not forget to initialize your const variable at the time of
declaration.

The const qualifier can also be placed before the type it qualifies:
$ cat const3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  const float pi = 3.14;

  printf("pi=%f\n", pi);
  return EXIT_SUCCESS;
}
$ gcc -o const3 -std=c99 -pedantic const3.c
$ ./const3
pi=3.140000

II.8.2 Volatile
Though not often used, the type qualifier volatile may be useful in some circumstances. It
tells the compiler to avoid performing any optimization related to volatile variables
because they may be altered by external routines other than the pieces of code containing
them (by a hardware component or a thread).

What does it actually mean? Most of the time, in a C program, a variable is modified by a
single routine in a predictable way. For this reason, the compiler may perform
optimizations, which allow the program to run faster. For example, some variables
do not have to be read from memory each time they are used, as in the following code:
int flag=0;

while (flag == 0)
  ;
printf("Flag=%d\n", flag);

The compiler, considering that the flag variable is not modified between its initialization and the
while loop, could optimize it like this:
int flag=0;

while (1)
  ;
printf("Flag=%d\n", flag);

It makes sense. Most of the time, the compiler is right, but optimizations may
cause an unexpected behavior of the program if variables are also modified by an element
external to the program (such as a hardware component or a thread). By qualifying a variable
as volatile, you tell the compiler to read its value from memory each time the variable is accessed,
with no such optimization.

Volatile variables are also used when the functions setjmp() and longjmp() are invoked (see
section XI.15).

II.9 Aliasing types


The C language allows creating new types (broached in Chapter VI) and aliasing existing
types. The typedef keyword lets you create a synonym for an existing type:
typedef existing_type_name new_name;

Both types are the same and are treated the same way. In the following example, we
create an alias for the type int:
$ cat alias_type.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
  typedef int myinteger;
  myinteger i = 10;

  printf("i=%d\n", i);

  return EXIT_SUCCESS;
}

II.10 Compatible types


We will talk again about compatible types: later, we will complete the definition when we
broach pointers, arrays, structures, unions and functions. Two types are said to be
compatible if they are the same. Two compatible types with the same qualifiers (whatever
the order of the qualifiers) are also compatible. In Table II-25, types within the same cell are
compatible types.

Table II-25 Examples of compatible types


Two compatible types with the same qualifiers are compatible: const volatile int is compatible
with volatile const int. Two types with different qualifiers are not compatible: const volatile int is
not compatible with const int. As a corollary, an unqualified type is not compatible with a
qualified type: for example, const int is not compatible with the type int.

II.11 Conversions
II.11.1 Assignment
As explained earlier, a variable is characterized by its name, its type and the value it holds.
The name of the variable identifies an object that is a memory area of the computer,
identified by an address, holding a value. The type of the variable defines the way the
piece of data it holds is represented, the range of values allowed and the operations that
can be applied to it. The value is the contents of the variable, depending on its type. This means
that you cannot store just any value in a variable. At any time, you can assign a value to a variable
as follows:
varname=val;

Where:
o varname is the identifier of the variable composed of letters, underscores and digits,
starting with a letter or an underscore.
o val is an expression. An expression is a combination of functions, operations, literals and
variables. Later in the book, we will talk about expressions, and functions. For now, let
us just imagine val as a literal or another variable.


Take note that in C, the equals sign (=) is an assignment operator (it is not a comparison
operator). The variable, which is an lvalue (an object that can store a value), is on the left side
of the equals sign while the value to be stored, sometimes called an rvalue, is on
the right side.

A value or a variable (object) has an implicit or an explicit type. Literals have an implicit
type. A variable has an explicit type given at the time of its declaration. If the type of the
value val to assign (on the right side of =) is the same as that of the variable varname (on the
left side of =), there is no conversion. The value val is just copied into the variable,
replacing its older value. If the type of the variable is different from the type of the value
val to assign, the value is converted to the type of the variable before being copied into the
variable. Such an operation is known as an implicit conversion or implicit cast.

A variable can appear on the left side or on the right side of the equals sign. When a
variable appears on the left side of the assignment operator =, it means the programmer
wants to set it: it is then used as a container. When it appears on the right side, it is used for
its value: the variable is then replaced by its contents.

A variable is an lvalue, meaning it refers to an object (memory block). If you attempt to
assign a value to a literal, you will get an error at compilation time:
$ cat assig1.c
#include <stdio.h>

int main(void) {
  17 = 1;
}
$ gcc -o assig1 -std=c99 -pedantic assig1.c
assig1.c: In function 'main':
assig1.c:4:2: error: lvalue required as left operand of assignment

The integer constant 17 does not refer to an object. An object has a memory location that
you can access through its name or its address. Literals have no such memory address. They are
loaded into registers when used but have no memory address that you can deal with.

In the following example, we assign the integer variable x the value of 31:
$ cat assig2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int x;
  x = 31;
  printf("x=%d\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o assig2 -std=c99 -pedantic assig2.c
$ ./assig2
x=31

In the following example, we assign the integer variable x the value of the variable y:
$ cat assig3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int x;
  int y;
  y = 31;
  x = y;
  printf("x=%d\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o assig3 -std=c99 -pedantic assig3.c
$ ./assig3
x=31

The contents of a variable may vary over time, and can be altered as many times as you
wish:
$ cat assig4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int x;

  x = 31;
  printf("x=%d\n", x);

  x = 407;
  printf("x=%d\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o assig4 -std=c99 -pedantic assig4.c
$ ./assig4
x=31
x=407

You cannot assign just any value to a variable. The type of the value you assign to a variable
must be compatible or allowed (explained in the next section). The following example
generates a diagnostic (here, a warning) because we try to assign a string to a variable of type int:
$ cat assig5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int x;

  x = "hello";
  printf("x=%d\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o assig5 -std=c99 -pedantic assig5.c
assig5.c: In function 'main':
assig5.c:6:4: warning: assignment makes integer from pointer without a cast

So far, we have assigned values that have a type compatible with the variables. Since the
value on the right side of the assignment operator (=) may be converted to the type of the
variable, some questions naturally arise: what happens if we try to assign a floating-point
value to a variable of an integer type? What happens if we assign a negative floating-point
value to a variable of type unsigned int? And so on. The answers are in the next sections.

II.11.2 Implicit and explicit cast


In C, a value of a certain type can be converted to another type. Depending on the types,
there may be constraints but as far as arithmetic types are concerned, a value of any
arithmetic type can be converted to any arithmetic type. In this chapter, the conversions
we describe are only between arithmetic types. Most of them are quite natural.

The C language has two kinds of type conversions, also known as casts. An implicit
conversion (implicit cast) is automatically performed in some expressions (such as
additions; expressions are described in Chapter IV), in
assignments, and when passing arguments to functions (described in Chapter VII). An
explicit conversion, also known as an explicit cast, is requested by the programmer. The
following example shows an implicit conversion performed by the assignment operation:

$ cat type_conv1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int x;

  x = 31.2;

  printf("x=%d\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o type_conv1 -std=c99 -pedantic type_conv1.c
$ ./type_conv1
x=31

It worked as expected: the floating constant 31.2 (of type double) is automatically converted to int before being
assigned to the variable x. Thus, the fractional part is discarded, only keeping the integer
part after the conversion. Now, run this:
$ cat type_conv2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  float x;

  x = 31;
  printf("x=%f\n", x);
  return EXIT_SUCCESS;
}
$ gcc -o type_conv2 -std=c99 -pedantic type_conv2.c
$ ./type_conv2
x=31.000000

Here again it works as expected, the integer literal 31 is automatically cast to type float
(31.0) before being assigned to the variable x.

The C language allows another type of conversion known as an explicit conversion or
explicit cast. The implicit type conversion is automatically done. The explicit cast acts in
the same way except that the conversion task is controlled by the programmer. To cast
explicitly a value or a variable to type newtype, place before it the new type name newtype
between parentheses:

(newtype)rval

Where:
o newtype is a type name to which the value of the expression rval will be converted.
o rval is an expression evaluating to a value. It can be a function, an operation, a literal, a
variable or a combination of all of them.

Normally, the explicit cast operator is used when a type conversion is required while the
compiler cannot perform it automatically. Let us consider the following example:
$ cat type_conv3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int a = 3;
  int b = 2;

  float c = a / b;

  printf("a/b=%d/%d=%f\n", a, b, c);
  return EXIT_SUCCESS;
}
$ gcc -o type_conv3 -std=c99 -pedantic type_conv3.c
$ ./type_conv3
a/b=3/2=1.000000

In the example above, we declared the variables a and b as type int. We also declared the
variable c as float, which is assigned the resulting value of the division a/b. As we will find out
in Chapter IV, an arithmetic operation returns a value of an integer type if all of its operands have an
integer type. It returns a floating-point value if either operand has a floating-point type.
For this reason, the division a/b did not return 1.5 as expected but 1. Since all of its
operands have type int, the division returns an integral value: the fractional part is
discarded. Obviously, you can tell the compiler you do not want to get only the integer
part of a division but a floating-point number by using the cast operator. In the following
example, we cast the variable a to float, which causes the division to return a real floating-point value:
$ cat type_conv4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

int a = 3;
int b = 2;

float c = (float)a / b;

printf("a/b=%d/%d=%f\n", a, b, c);
return EXIT_SUCCESS;
}
$ gcc -o type_conv4 -std=c99 -pedantic type_conv4.c
$ ./type_conv4
a/b=3/2=1.500000

We could also have cast the variable b to float, which would have yielded the same output.
The following example shows implicit conversions and explicit casts:
$ cat type_conv5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
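float v = 1/3; /* int division gives 0, then implicit conversion to float */
float w = 1/3.0; /* no explicit cast: double division, result converted to float */
float x = 1.0/3; /* no explicit cast: double division, result converted to float */
float y = (float)1/3; /* explicit cast */
float z = 1/(float)3; /* explicit cast */

printf("v=%f\nw=%f\nx=%f\ny=%f\nz=%f\n", v, w, x, y, z);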
return EXIT_SUCCESS;
}
$ gcc -o type_conv5 -std=c99 -pedantic type_conv5.c
$ ./type_conv5
v=0.000000
w=0.333333
x=0.333333
y=0.333333
z=0.333333

Explanations:
o float v = 1/3 declares the variable v as float and assigns it the result of the operation 1/3.
As both operands are of type int, the result is of type int: the expression 1/3 evaluates
to 0. It is then implicitly converted to float before being assigned to the variable v.
o In the statement float w = 1/3.0, there is no explicit cast. The literal 3.0 has type double, so
the operand 1 is implicitly converted to double and the division 1/3.0 has type double. The
result is then implicitly converted to float when it is assigned to w.
o Similarly, in the statement float x = 1.0/3, there is no explicit cast: the operand 1.0 of type
double causes the operation 1.0/3 to be evaluated as a double.
o The statement float y = (float)1/3 uses an explicit cast. Only the integer number 1 is
converted to float, which causes the other operand to be converted as well and the division
to be carried out in floating point.
o The statement float z = 1/(float)3 also uses an explicit cast. Only the integer number 3 is
converted to float, which causes the whole expression to be of type float before being
computed.

Converting a value may change its representation, that is, the bit pattern that represents
it. For example, converting a value of type float to type int leads to a representation
change. Programmers do not usually have to be aware of such representation changes.

II.11.3 Conversion to integer types


II.11.3.1 Conversion to Boolean type
A value of any arithmetic type can be converted to a Boolean type _Bool. If the value to
convert is 0, the Boolean value will be 0 after conversion. Otherwise, it will be 1. There is
no overflow.

II.11.3.2 Conversion to a signed integer
A value of any arithmetic type (we call it source value) can be converted to a signed
integer (target type). There are two cases:
o The target signed integer type is too small to represent the value. That is, the source
value is out of the range of the values that can be represented by the target signed integer.
o The target signed integer type is large enough to represent the value. That is, the source
value is in the range of the values that can be represented by the target signed integer.

In this section, we will call val the original value (source value), int_val its integral part if it
is a floating-point number, tgt_max the maximum value of the target signed integer type and
tgt_min the minimum value of the target signed integer type.

Table II26 Conversion to signed integer types


If the original value has an integer type and the target signed integer type is too small to
represent it (val > tgt_max or val < tgt_min), an overflow occurs: the range of values that
can be represented by the target signed integer type does not contain the original value.
The C standard does not specify the resulting value: for an integer source it is
implementation-defined (or an implementation-defined signal is raised), and for a
floating-point source the behavior is undefined. Either way, the result cannot be relied upon.
In the following example, the values of the variables sh1 and sh2 cannot be relied upon:
$ cat conv2signed_int1.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void) {
signed short sh1 = INT_MAX; /* overflow */
signed short sh2 = 9876543210.123456; /* overflow */

return EXIT_SUCCESS;
}
$ gcc -o conv2signed_int1 -std=c99 -pedantic conv2signed_int1.c
conv2signed_int1.c: In function 'main':
conv2signed_int1.c:6:4: warning: overflow in implicit constant conversion
conv2signed_int1.c:7:4: warning: overflow in implicit constant conversion

If the original value has an integer type and the target signed integer type is large enough
to represent it (tgt_min ≤ val ≤ tgt_max), the value obtained after conversion is the same.

If the source value has a floating-point type, the fractional part is discarded. If the integral
part of the original value (int_val) is within the range of values that can be represented by
the target signed integer type (tgt_min ≤ int_val ≤ tgt_max), the target value is the integral
value. Otherwise, an overflow occurs and the behavior is undefined. Here is an
example:
$ cat conv2signed_int2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int ui = 10;
double f = 19.123456;
signed short sh1 = ui; /* conversion to signed short */
signed short sh2 = f; /* conversion to signed short */

printf("sh1=%d sh2=%d\n", sh1, sh2);
return EXIT_SUCCESS;
}
$ gcc -o conv2signed_int2 -std=c99 -pedantic conv2signed_int2.c
$ ./conv2signed_int2
sh1=10 sh2=19


II.11.3.3 Conversion to an unsigned integer
A value of any arithmetic type can be converted to an unsigned integer. In this section, we
will call val the original value, int_val its integral part if it is a floating-point number, umax
the maximum value of the target unsigned integer type.

First, let us consider only original values that are positive. If the original value has an
integer type:
o If the original value is outside the range of the values that can be represented by the
target unsigned integer type (val > umax), the value obtained after conversion is the
original value modulo the maximum value of the unsigned integer type plus one (val %
(umax+1)). The result is always defined.
o If the value is within the range of the values that can be represented by the target
unsigned integer type (0 ≤ val ≤ umax), the value obtained after conversion is the same as
the original value.

What happens if a negative integer value is converted to an unsigned integer type? The
original value v is converted to ( v + p*(umax+1) ) % (umax+1), where p is a positive integer such
that v + p*(umax+1) ≥ 0. Consider the following example:
$ cat conv2unsigned_int1.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void) {
int i = -1;
int j = -10;

unsigned int ui1 = i;
unsigned int ui2 = j;

printf("UINT_MAX=%u ui1=%u ui2=%u\n", UINT_MAX, ui1, ui2);
return EXIT_SUCCESS;
}
$ gcc -o conv2unsigned_int1 -std=c99 -pedantic conv2unsigned_int1.c
$ ./conv2unsigned_int1
UINT_MAX=4294967295 ui1=4294967295 ui2=4294967286

The value -10 (of type int) is converted to ( -10 + 1*(4294967295+1) ) modulo
(4294967295+1)= 4294967286 modulo 4294967296 = 4294967286.

The same rule applies for a longer target unsigned integer:
$ cat conv2unsigned_int2.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void) {
int j = -10;
unsigned long long ull = j;

printf("ULLONG_MAX=%llu u1=%llu\n", ULLONG_MAX, ull);
return EXIT_SUCCESS;

}
$ gcc -o conv2unsigned_int2 -std=c99 -pedantic conv2unsigned_int2.c
$ ./conv2unsigned_int2
ULLONG_MAX=18446744073709551615 u1=18446744073709551606

In the example above, the value -10 is converted to (-10 + 1*(18446744073709551615+1))
modulo (18446744073709551615+1) = 18446744073709551606 modulo
18446744073709551616 = 18446744073709551606.

If the source value has a floating-point type, the fractional part is discarded:
o If the integral part of the original value is within the range of the values that can be
represented by the target unsigned integer type (0 ≤ int_val ≤ umax), the resulting value
obtained after conversion is the integral part of the original value.
o If the integral part is not within the range that can be represented by the target
unsigned integer type (int_val < 0 or int_val > umax), the behavior is undefined.
Implementations often perform modulo operations as for integer values, but you cannot rely
on it.

Table II27 Conversion to unsigned integer types

II.11.4 Conversion to floating-point types


A value of any arithmetic type can be converted to a floating-point type. There are several
cases described in Table II28.

Table II28 Conversion to real floating-point types

II.12 Exercises
Exercise 1. Display the size of the types int and long.
Exercise 2. Why can the value -128 be represented by the type signed char on some systems
(we suppose it is represented by eight bits)?
Exercise 3. Why is the operation x = 1+10e-30 equivalent to x = 1 on some systems (x is of
type float)?
Exercise 4. What would be the result of the operation x = (unsigned int)-1?

CHAPTER III ARRAYS, POINTERS AND STRINGS

III.1 Introduction
In the previous chapter, we learned to work with variables and basic types. So far, a
variable can hold only one value at a time. Suppose you need to create a program that
reads a file containing information about one thousand people, and that you need to store
some pieces of data about all of them in order to process them. Let us say you
want to store the names, surnames and ages: how many variables are needed? 3000! Can
you imagine declaring 3000 variables and working with them?

Fortunately, the C language has two other very useful types that ease programming: arrays
and pointers. Though they are similar and often interchangeable, they are different and
must not be confused.

III.2 Arrays
An array is an object composed of a set of items having the same type. An array is identified
by a name composed of underscores, letters and digits, starting with an underscore or a
letter. We can distinguish two kinds of arrays: one-dimensional arrays and multidimensional
arrays.

III.2.1 One-dimensional array


III.2.1.1 Declaration
Before being used, an array must be declared as shown below so that a memory block is
allocated for the items it contains:
arr_type arr_name[n];

Where:
o arr_type is a user-defined type or a C standard type (int, long, float, array, pointer).
User-defined types will be discussed later.
o arr_name is the name of the array.
o n is a positive integer number indicating the number of elements the array stores. It
represents the length of the array. More generally, n can be an integer constant expression,
that is, an expression that evaluates to an integer constant (see Chapter IV Section IV.14).
An expression is a simple value, an operation or a combination of operations (Chapter
IV). For example, you could declare an array as arr[2+4+1], which is equivalent to arr[7]: the
expression 2+4+1 evaluates to an integer constant (i.e. known at compile time).

The contiguous memory area reserved for the array is large enough to hold all of its
elements: the array size is n * sizeof(arr_type) (see Figure III1). Built from other types, an
array type is a derived type. Containing several objects (of same type), it is also an aggregate
type. The size of an array does not change over time: it is determined at compile time and
cannot be changed afterwards.

Below, the array age is declared with five elements of type int (see Figure III1):
$ cat array_decl1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[5];

return EXIT_SUCCESS;
}

Our array age can store five values of type int. All elements are independent from each
other: they can be directly accessed or modified as any variable. Before talking about how
we can have access to elements, let us explain how an array can be initialized.

Figure III1 Memory layout of the array age[5]


Before C99, the length of an array had to be a positive integer constant expression.

III.2.1.2 Initialization
You have two methods to assign values in an array: at the time of declaration
(initialization)[22] or after the declaration of the array. When you declare an array, you can
also initialize it by giving values enclosed between braces:
arr_type arr_name[n]={val1,val2,...,valp};

Where:
o arr_type is a user-defined type or a C type.
o arr_name is the name of the array.
o n is an integer number indicating the number of elements the array stores (length).
o val1,...,valp are p values of type arr_type.
o n ≥ p. If n = p, all elements are initialized. Otherwise, the remaining elements, having a
subscript m such that m ≥ p, are set to 0 by default.

The first element denoted by arr_name[0] takes the value of val1, the second one denoted by
arr_name[1] takes the value of val2, and so on: the element denoted by arr_name[p-1] takes the
value of valp. Take note that once an array has been declared, you can no longer set its
values in this way.

Figure III2 Representation of the array age after initialization


The following example declares and initializes all items of the array age at the same time
(depicted in Figure III2):
$ cat array_init1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[5] = {54,17,59,44,64};


return EXIT_SUCCESS;
}


The length of the array n can be omitted if n=p: the length of the array is then computed by
the compiler by counting the number of values between the braces. The following
statement is equivalent to the previous one if n=p:
arr_type arr_name[]={val1,val2,...,valn};

The previous example is equivalent to the following code:


$ cat array_init2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[] = {54,17,59,44,64};

return EXIT_SUCCESS;
}

If you do not initialize your array at declaration time, you can no longer do it in a single
statement; you must then use the second method, which consists in assigning values directly
to the elements of the array. An item in an array can be accessed by its index (subscript),
which is an integer number: array[i] references the item number i+1. The first item of an array is
placed at index 0, the second one at index 1, and so on. The last index (element number n)
is n-1 where n is the length of the array.

In our example array_init2.c, the array age is composed of five elements: the first item is
denoted by age[0], the second one by age[1] and the last one (fifth) by age[4] (see Figure
III2). Each item of the array age is a number of type int. The following example assigns
a value to each element of the array age:
$ cat array_init3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[5];

age[0] = 54;
age[1] = 17;
age[2] = 59;

age[3] = 44;
age[4] = 64;

return EXIT_SUCCESS;
}

As of C99, you can initialize only some specific elements in an array at declaration time as
shown below:
$ cat array_init4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[100] = {54,17,59,44,64,[50]=22,[90]=47};

return EXIT_SUCCESS;
}

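In the example above, we set the elements from index 0 through index 4, along with the
elements of index 50 and index 90; all the other elements are set to 0. As far as those seven
elements are concerned, it is equivalent to the following code, which however leaves the
other elements uninitialized: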
$ cat array_init5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[100];
age[0] = 54;
age[1] = 17;
age[2] = 59;
age[3] = 44;
age[4] = 64;
age[50] = 22;
age[90] = 47;

return EXIT_SUCCESS;
}


III.2.1.3 Accessing elements in an array
All of the elements of an array are of the same type and therefore of the same size. The only
way to access an element in an array is through its subscript: if arr is the name
of an array, arr[i] is an element of the array, where i is the subscript (index) that allows you
to reference the element number i+1. Why i+1 and not i? Because, in C, the first element is
placed at index 0, which implies that 0 ≤ i ≤ n-1 (where n is the number of items of the
array).

An element of an array may be modified (it can be assigned another value as shown in
example array_init5.c) or read (the value it holds is retrieved). In the following example,
we assign the variable v the value held in the second element of the array age, and then we
display both the contents of the variable v and the second element of the array age.
$ cat array_access1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[5];
int v;

age[0] = 54;
age[1] = 17;
age[2] = 59;
age[3] = 44;
age[4] = 64;

v = age[1];
printf("v=%d and age[1]=%d\n", v, age[1]);

return EXIT_SUCCESS;
}
$ gcc -o array_access1 -std=c99 -pedantic array_access1.c
$ ./array_access1
v=17 and age[1]=17


Keep in mind that an array declared as type arr[n] contains n elements: the first one is arr[0] and the last
one is arr[n-1]. A common mistake made by beginners is to treat arr[n] as the last item, which causes bugs.

What happens if we use elements in an array that were not initialized? Consider the
following example:
$ cat array_access2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[100] = {54,17,59,44,64,[50]=22,[90]=47};

printf("age[4]=%d\n", age[4]);
printf("age[5]=%d\n", age[5]);
printf("age[54]=%d\n", age[54]);
printf("age[90]=%d\n", age[90]);

return EXIT_SUCCESS;
}
$ gcc -o array_access2 -std=c99 -pedantic array_access2.c
$ ./array_access2
age[4]=64
age[5]=0
age[54]=0
age[90]=47

Uninitialized elements in an initialized array take the value of 0. However, if the array had
not been initialized, things would have been different. Compare with the following
example:
$ cat array_access3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[100];

printf("age[4]=%d\n", age[4]);
printf("age[5]=%d\n", age[5]);
printf("age[54]=%d\n", age[54]);
printf("age[90]=%d\n", age[90]);

return EXIT_SUCCESS;
}
$ gcc -o array_access3 -std=c99 -pedantic array_access3.c
$ ./array_access3

age[4]=2
age[5]=-25616384
age[54]=134546946
age[90]=-16782720

Elements of uninitialized arrays have indeterminate values. So, do not forget to initialize
your arrays or to set values to their elements before using them.

Ensure the elements of your arrays have been initialized. You can initialize an array at the time of
declaration or later by setting its elements one by one. Whatever the method you apply, never use an item with an
indeterminate value.


III.2.1.4 Array size
The size of an array is its length multiplied by the size of an item. The sizeof operator
returns the size of an array in bytes as shown below:
$ cat array_size1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int array1[5];
float array2[21];

printf("size of array1=%zu Bytes\n", sizeof array1);
printf("size of array2=%zu Bytes\n", sizeof array2);

return EXIT_SUCCESS;
}
$ gcc -o array_size1 -std=c99 -pedantic array_size1.c
$ ./array_size1
size of array1=20 Bytes
size of array2=84 Bytes

It is easy to get the number of elements an array holds: just divide the size of the array in
bytes by the size of an element, also expressed in bytes:
$ cat array_size2.c
#include <stdio.h>
#include <stdlib.h>


int main(void) {
int array1[5];
float array2[21];

printf("Nb of elements in array1=%zu\n", sizeof array1 / sizeof array1[0]);
printf("Nb of elements in array2=%zu\n", sizeof array2 / sizeof array2[0]);

return EXIT_SUCCESS;
}
$ gcc -o array_size2 -std=c99 -pedantic array_size2.c
$ ./array_size2
Nb of elements in array1=5
Nb of elements in array2=21

Here, we chose to use the first element of each array but nothing prevents you from using
any element in the array as shown below:
$ cat array_size3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int array1[5];
float array2[21];

printf("Nb of elements in array1=%zu\n", sizeof array1 / sizeof array1[1]);
printf("Nb of elements in array2=%zu\n", sizeof array2 / sizeof array2[8]);

return EXIT_SUCCESS;
}
$ gcc -o array_size3 -std=c99 -pedantic array_size3.c
$ ./array_size3
Nb of elements in array1=5
Nb of elements in array2=21

As explained in the previous chapter, the sizeof operator returns the size of a type or a
variable. Now, you also know that it can get the size of an array or of an element of an array.
The size of an element in an array is the size of the type of the element. Thus, though the
previous example is better programming style, it could also be
written like this:
$ cat array_size4.c
#include <stdio.h>

#include <stdlib.h>

int main(void) {
int array1[5];
float array2[21];

printf("Nb of elements in array1=%zu\n", sizeof array1 / sizeof(int));
printf("Nb of elements in array2=%zu\n", sizeof array2 / sizeof(float));

return EXIT_SUCCESS;
}
$ gcc -o array_size4 -std=c99 -pedantic array_size4.c
$ ./array_size4
Nb of elements in array1=5
Nb of elements in array2=21

The operand of the sizeof operator can be a type name or an identifier (such as a variable, a
pointer, an array). If the operand is an identifier, you can omit the parentheses, but if it
is a type name, you must put parentheses around it to tell the compiler that the
operand is a type.

The sizeof operator returns a number of bytes, and a byte is not necessarily 8 bits. In C, a byte is the size
of a char (sizeof(char) is always 1), the smallest unit of memory the program can address: the macro CHAR_BIT,
defined in the limits.h header file, gives the number of bits in a byte.

As we will see later, the operand of the sizeof operator can be an expression. The size in bytes of
the expression is the size of the type of the resulting value. The expression sizeof(1/3) returns 4 while sizeof(1.0/3)
returns 8 on our computer: the first expression has type int while the second one has type double.


Keep in mind that an array's subscript must not be greater than the length of the array
minus one (i ≤ n-1 where i is the index and n the length of the array). The following example
generates no error at compilation time but will cause bugs:

$ cat array_size5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[] = {200,300,400,500,600};
int i = 1;
int v;

arr[5] = 10;
arr[6] = 10;
v = arr[5];

printf("v=%d\n", v);
printf("i=%d\n", i);

return EXIT_SUCCESS;
}
$ gcc -o array_size5 -std=c99 -pedantic array_size5.c
$ ./array_size5
v=10
i=10

The result is unpredictable: writing outside an array has undefined behavior. In our example,
we accessed by mistake the memory location of the variable i and modified it involuntarily!
As the example shows, C lets you perform illegal memory accesses. The C language is
permissive because it gives you full control of your program: it does not check the indexes
you use. It is interesting to note that you can use negative integers as subscripts without
any complaint from the compiler:
$ cat array_size6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[] = {200,300,400,500,600};
int v;

arr[-1] = 10;

v = arr[-1];
printf("v=%d\n", v);

return EXIT_SUCCESS;

}
$ gcc -o array_size6 -std=c99 -pedantic array_size6.c
$ ./array_size6
v=10

Of course, this program is not correct. Why are negative integers allowed? This will be
explained when we talk about pointers.

If n is the length of an array (n a positive integer), subscripts to access elements are in the range [0,n-1].







III.2.1.5 Showing all elements of an array
The for loop, described in Chapter V, allows you to display all the elements of an array.
$ cat array_disp1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int age[] = {54,17,59,44,64};
int i;
int age_size = sizeof age / sizeof age[0];
printf("Display %d elements of array age\n", age_size);
for (i=0; i < age_size; i++) {
printf("age[%d]=%d\n", i, age[i]);
}

return EXIT_SUCCESS;
}
$ gcc -o array_disp1 -std=c99 -pedantic array_disp1.c
$ ./array_disp1
Display 5 elements of array age
age[0]=54

age[1]=17
age[2]=59
age[3]=44
age[4]=64

The for loop is composed of three parts separated by semicolons within parentheses, and a
set of statements list_statements enclosed between braces ({}) known as a block:
for (part1;part2;part3) {
list_statements
}

When the for loop statement is executed:
o Firstly, the expression part1 is processed. It is the initialization step of the loop. Here, in
our example array_disp1.c, the variable i is assigned the value of 0. It is executed only
once.
o Secondly, the expression part2 is evaluated. If it is true, the block is executed. Otherwise,
the loop ends.
o Thirdly, once the block has been executed, the expression part3 is processed. In our
example, the expression i++ is shorthand for i=i+1. That is, the variable i is incremented.
o Then, the expression part2 is evaluated again. If it is true, the block is executed.
Otherwise, the loop ends.
o The expression part3 is processed, and so on.
o part2 and part3 are executed at each iteration until the loop ends.

In our example, as long as the condition i < age_size is true, the for loop executes. Let us view
the cycles of the for loop of our example:
o age_size is evaluated to 5.
o Initialization of the for loop: i is set to 0.
o Cycle 1: i holds the value of 0. The condition i < age_size is true, so the block is run: the
text age[0]=54 is printed. The expression i++ then increments i, which now holds 1.
o Cycle 2: i holds the value of 1. The condition i < age_size is true, so the block is run: the
text age[1]=17 is printed. The expression i++ then increments i, which now holds 2.
o And so on.
o Cycle 5: i holds the value of 4. The condition i < age_size is true, so the block is run: the
text age[4]=64 is printed. The expression i++ then increments i, which now holds 5.
o Cycle 6: i holds the value of 5. The condition i < age_size is false: the loop ends.


III.2.1.6 Boundaries
The C language lets you go beyond the memory allocated for an array without
complaining. There is no bounds checking at all. Accordingly, check that your subscripts are
valid.

III.2.1.7 Memory address
The memory address of an object can be obtained with the operator &: &v stands for the
address of an object called v. For example, if age is a variable, &age represents its memory
address; if name_list is a one-dimensional array, &name_list[0] represents the memory address
of its first element (whose subscript is 0), &name_list[1] the address of its second element,
and so on.

What would the address of an array be? The address of an array is the address of its very
first element. Therefore, if name_list is a one-dimensional array, &name_list[0] is also the
address of the array. To be consistent, in C, &name_list is the address of the array as well. This
is only a taste of what we are going to explain when we talk about pointers and
addresses.

III.2.2 Multidimensional arrays


A C multidimensional array is an array of arrays. Let us begin with a two-dimensional
array. A two-dimensional array is declared like this:
arr_type arr_name[n][p];

Where:
o arr_type is a type name.
o arr_name is the name of the array.
o n is a positive integer number indicating the number of p-length one-dimensional arrays of
type arr_type it stores. The number n is the first dimension.
o p is a positive integer number indicating the number of elements of type arr_type stored in
each array arr_name[i] (where 0 ≤ i ≤ n-1). The number p is the second dimension.
o An element of the array is represented by arr_name[i][j], where i ranges from 0 to n-1, and j
ranges from 0 to p-1.


The two-dimensional array arr_name can be represented as an n x p matrix, composed of n
rows and p columns, but in fact, a multidimensional array is not laid out like this in
memory. A row arr_name[i] represents a one-dimensional array of p elements and
arr_name[i][j] represents an element of the one-dimensional array arr_name[i].

What we say about one-dimensional arrays also applies to multidimensional arrays. An
element of a two-dimensional array arr_name[i][j] can be manipulated as a variable: you can
get its value or alter it. As you can easily guess, the memory address of an element
arr_name[i][j] is &arr_name[i][j]. The memory address of an array arr_name[i] is given by
&arr_name[i] or &arr_name[i][0] [23].


The following example creates a two-dimensional array called arr.
$ cat array_multidim1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char arr[2][3];

printf("ARRAY arr[0] (row 0):\n");
printf("address of arr[0][0]=%p and address of arr[0]=%p\n", (void *)&arr[0][0], (void *)&arr[0]);
printf("address of arr[0][1]=%p\n", (void *)&arr[0][1]);
printf("address of arr[0][2]=%p\n", (void *)&arr[0][2]);

printf("\nARRAY arr[1] (row 1):\n");
printf("address of arr[1][0]=%p and address of arr[1]=%p\n", (void *)&arr[1][0], (void *)&arr[1]);
printf("address of arr[1][1]=%p\n", (void *)&arr[1][1]);
printf("address of arr[1][2]=%p\n", (void *)&arr[1][2]);

printf("\nsizeof arr[0][0]=%zu and sizeof arr[0]=%zu\n", sizeof arr[0][0], sizeof arr[0]);
printf("sizeof arr[1][0]=%zu and sizeof arr[1]=%zu\n", sizeof arr[1][0], sizeof arr[1]);
return EXIT_SUCCESS;
}
$ gcc -o array_multidim1 -std=c99 -pedantic array_multidim1.c
$ ./array_multidim1
ARRAY arr[0] (row 0):
address of arr[0][0]=feffea8a and address of arr[0]=feffea8a
address of arr[0][1]=feffea8b
address of arr[0][2]=feffea8c

ARRAY arr[1] (row 1):
address of arr[1][0]=feffea8d and address of arr[1]=feffea8d
address of arr[1][1]=feffea8e
address of arr[1][2]=feffea8f

sizeof arr[0][0]=1 and sizeof arr[0]=3
sizeof arr[1][0]=1 and sizeof arr[1]=3

In our example array_multidim1.c, the array arr, declared as char arr[2][3], is a two-dimensional
array composed of two arrays of three char. Put another way, the array arr holds two
arrays arr[0] and arr[1], each containing three elements of type char (see Figure III3 and
Figure III4). A two-dimensional array can be viewed as a table (2x3 matrix) composed of
rows and columns as depicted in Figure III3, or as a linear table as sketched in Figure III4,
which is the way a multidimensional array is actually laid out in memory.

We can see, as shown by our previous program and represented by Figure III3 and
Figure III4, that the addresses of arr[i][0] and arr[i] are identical (i taking the value 0 or 1
in our example). However, do not confuse the objects arr[i][0] and arr[i]. The object arr[i] is a
one-dimensional array, whose size is 3 bytes, holding three objects of type char, while the
object arr[i][0] is an object of type char whose size is one byte, as highlighted by the program
array_multidim1.c.

Figure III3 Two-dimension array arr[2][3] viewed as a table


A better way to view a multidimensional array is a linear representation (real layout in
memory) as depicted in Figure III4.

Figure III4 Memory layout of a two-dimension array arr[2][3]


You can initialize a two-dimensional array at declaration time:


$ cat array_multidim2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3] = {
{ 1, 2, 3 }, /* first array: array arr[0] */
{ 11, 12, 13 } /* second array: array arr[1] */
};

return EXIT_SUCCESS;
}

Which is equivalent to (but prone to errors):


$ cat array_multidim3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3] = { 1, 2, 3 , /* first array: array arr[0] */
11, 12, 13 /* second array: array arr[1] */
};
return EXIT_SUCCESS;
}

Without comments, we have this:


$ cat array_multidim4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3] = { 1, 2, 3, 11, 12, 13 };
return EXIT_SUCCESS;
}

Multidimensional arrays work in the same way as one-dimensional arrays. Elements in a
multidimensional array are accessed through their subscripts. In a two-dimensional array,
an element is determined by two indexes as shown below:
$ cat array_multidim5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3] = {
{ 1, 2, 3 },
{ 11, 12, 13 }
};
printf("arr[0][0]=%d\n", arr[0][0]);
printf("arr[1][2]=%d\n", arr[1][2]);

return EXIT_SUCCESS;
}

$ gcc -o array_multidim5 -pedantic array_multidim5.c


$ ./array_multidim5
arr[0][0]=1
arr[1][2]=13

An array can also be given its values after the declaration:


$ cat array_multidim6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3];

/* init first array */
arr[0][0]=1;
arr[0][1]=2;
arr[0][2]=3;

/* init second array */
arr[1][0]=11;
arr[1][1]=12;
arr[1][2]=13;


printf("arr[0][0]=%d\n", arr[0][0]);
printf("arr[1][2]=%d\n", arr[1][2]);

return EXIT_SUCCESS;
}
$ gcc -o array_multidim6 -pedantic array_multidim6.c
$ ./array_multidim6
arr[0][0]=1
arr[1][2]=13

As we saw for one-dimensional arrays, an element of a multidimensional array that has
not been initialized has an indeterminate value (for an array declared inside a function, as in
our examples). Therefore, do not forget to set the elements of your multidimensional arrays
before using them.

In the following example, the array arr is initialized but its initializer lists fewer values than
the array has elements: the elements left out take the default value of 0:
$ cat array_multidim7.c

#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3] = {
{ 1, 2 },
{ 11, 12, 13 }
};
printf("arr[0][2]=%d\n", arr[0][2]);
printf("arr[1][0]=%d\n", arr[1][0]);

return EXIT_SUCCESS;
}
$ gcc -o array_multidim7 -std=c99 -pedantic array_multidim7.c
$ ./array_multidim7
arr[0][2]=0
arr[1][0]=11

In the example above, the array arr[0] was initialized with only two values: the last element
arr[0][2] had no explicit initializer, so it took the default value of 0. Compare with the
following example:
$ cat array_multidim8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[2][3];
printf("arr[0][2]=%d\n", arr[0][2]);
printf("arr[1][0]=%d\n", arr[1][0]);

return EXIT_SUCCESS;
}
$ gcc -o array_multidim8 -std=c99 -pedantic array_multidim8.c
$ ./array_multidim8
arr[0][2]=134548698
arr[1][0]=134614376

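The elements of the uninitialized array arr have indeterminate values: the numbers displayed
are whatever happened to be in memory, and they may differ from one run or one system to
another.

The last two examples show that you have to initialize your arrays, or assign values to their
elements, before using them.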

At declaration, the first dimension can be omitted if the array is initialized while the
second dimension cannot be omitted even if you fully initialize the array. Here is an
example omitting the first dimension:
$ cat array_multidim9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int arr[][3] = {
{ 1, 2 },
{ 11, 12, 13 }
};
printf("arr[0][2]=%d\n", arr[0][2]);
printf("arr[1][0]=%d\n", arr[1][0]);

return EXIT_SUCCESS;
}
$ gcc -o array_multidim9 -std=c99 -pedantic array_multidim9.c
$ ./array_multidim9
arr[0][2]=0
arr[1][0]=11

Figure III5 Three-Dimensional array arr[2][2][3] in a matrix representation


Now, let us talk about three-dimensional arrays. You will find nothing new: they work
the same way as two-dimensional arrays. A three-dimensional array arr declared as type
arr[n][p][q] is an array of n two-dimensional arrays. Naturally, we tend to view a
three-dimensional array as an n x p x q matrix (see Figure III5), though it is not the best way
to comprehend it. Figure III5 shows a 2x2x3 array viewed as a 3-D matrix.


Figure III6 Memory layout of the three-Dimensional array arr[2][2][3]




A more appropriate way to view a multidimensional array in C is the flat representation,
which is also its memory layout (see Figure III6). A three-dimensional array arr declared as
type arr[n][p][q]
where n ≥ 1, p ≥ 1, and q ≥ 1
could be viewed like this (Figure III6):
o arr is an array of n two-dimensional arrays.
o arr[i] is a pxq two-dimensional array, where 0 ≤ i ≤ n-1.
o arr[i][j] is a one-dimensional array composed of q elements, where 0 ≤ i ≤ n-1 and 0 ≤ j ≤ p-1.
o arr[i][j][k] is an element, where 0 ≤ i ≤ n-1, 0 ≤ j ≤ p-1, and 0 ≤ k ≤ q-1.

The following example illustrates what we said above and what is depicted in Figure III6:
$ cat array_multidim10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char arr[2][2][3];
int i, j, k;

printf("== ADDRESSES ==\n");
printf("ARRAY arr:\n");
printf("&arr=%p\n", (void *)&arr);

printf("\nARRAY arr[0]:\n");
printf("&arr[0]=%p\n&arr[0][0]=%p\n&arr[0][0][0]=%p\n", (void *)&arr[0], (void *)&arr[0][0], (void *)&arr[0][0][0]);

printf("\nARRAY arr[1]:\n");
printf("&arr[1]=%p\n&arr[1][0]=%p\n&arr[1][0][0]=%p\n", (void *)&arr[1], (void *)&arr[1][0], (void *)&arr[1][0][0]);

/* sizeof yields a value of type size_t, displayed with %zu in C99 */
printf("\n\n== SIZES ==\n");
printf("sizeof arr=%zu\n", sizeof arr);
printf("sizeof arr[0]=%zu\n", sizeof arr[0]);
printf("sizeof arr[0][0]=%zu\n", sizeof arr[0][0]);
printf("sizeof arr[0][0][0]=%zu\n", sizeof arr[0][0][0]);

printf("\nsizeof arr[1]=%zu\n", sizeof arr[1]);
printf("sizeof arr[1][0]=%zu\n", sizeof arr[1][0]);
printf("sizeof arr[1][0][0]=%zu\n", sizeof arr[1][0][0]);

return EXIT_SUCCESS;

}
$ gcc -o array_multidim10 -std=c99 -pedantic array_multidim10.c
$ ./array_multidim10
== ADDRESSES ==
ARRAY arr:
&arr=feffea84

ARRAY arr[0]:
&arr[0]=feffea84
&arr[0][0]=feffea84
&arr[0][0][0]=feffea84

ARRAY arr[1]:
&arr[1]=feffea8a
&arr[1][0]=feffea8a
&arr[1][0][0]=feffea8a


== SIZES ==
sizeof arr=12
sizeof arr[0]=6
sizeof arr[0][0]=3
sizeof arr[0][0][0]=1

sizeof arr[1]=6
sizeof arr[1][0]=3
sizeof arr[1][0][0]=1

What we said about two-dimensional arrays holds true for multi-dimensional arrays. Here
is another example with a three-dimensional array:
$ cat array_multidim11.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
/* arr is a three-dimensional array holding 2 two-dimensional arrays */
char arr[2][3][2] = { /* 2 two-dimensional arrays */
{ /* first two-dimensional array (3x2): arr[0] */
{ 'a', 'b' }, /* arr[0][0] first one-dimensional array: 2 elements */
{ 'c', 'd' }, /* arr[0][1] second one-dimensional array: 2 elements */
{ 'e', 'f' } /* arr[0][2] third one-dimensional array: 2 elements */
},

{ /* second two-dimensional array (3x2): arr[1] */
{ 'A', 'B' }, /* arr[1][0] first one-dimensional array: 2 elements */
{ 'C', 'D' }, /* arr[1][1] second one-dimensional array: 2 elements */
{ 'E', 'F' } /* arr[1][2] third one-dimensional array: 2 elements */
}
};
printf("Displaying three-dimensional array 2x3x2 arr:\n");
printf("First two-dimensional array arr[0]:\n");
printf("First one-dimensional array arr[0][0]:\n");
printf("arr[0][0][0]=%c arr[0][0][1]=%c\n\n", arr[0][0][0], arr[0][0][1]);
printf("Second one-dimensional array arr[0][1]:\n");
printf("arr[0][1][0]=%c arr[0][1][1]=%c\n\n", arr[0][1][0], arr[0][1][1]);
printf("Third one-dimensional array arr[0][2]:\n");
printf("arr[0][2][0]=%c arr[0][2][1]=%c\n\n", arr[0][2][0], arr[0][2][1]);

printf("\nSecond two-dimensional array arr[1]:\n");
printf("First one-dimensional array arr[1][0]:\n");
printf("arr[1][0][0]=%c arr[1][0][1]=%c\n\n", arr[1][0][0], arr[1][0][1]);
printf("Second one-dimensional array arr[1][1]:\n");
printf("arr[1][1][0]=%c arr[1][1][1]=%c\n\n", arr[1][1][0], arr[1][1][1]);
printf("Third one-dimensional array arr[1][2]:\n");
printf("arr[1][2][0]=%c arr[1][2][1]=%c\n", arr[1][2][0], arr[1][2][1]);

return EXIT_SUCCESS;
}
$ gcc -o array_multidim11 -std=c99 -pedantic array_multidim11.c
$ ./array_multidim11
Displaying three-dimensional array 2x3x2 arr:
First two-dimensional array arr[0]:
First one-dimensional array arr[0][0]:
arr[0][0][0]=a arr[0][0][1]=b

Second one-dimensional array arr[0][1]:
arr[0][1][0]=c arr[0][1][1]=d

Third one-dimensional array arr[0][2]:
arr[0][2][0]=e arr[0][2][1]=f


Second two-dimensional array arr[1]:
First one-dimensional array arr[1][0]:
arr[1][0][0]=A arr[1][0][1]=B

Second one-dimensional array arr[1][1]:
arr[1][1][0]=C arr[1][1][1]=D

Third one-dimensional array arr[1][2]:
arr[1][2][0]=E arr[1][2][1]=F

More generally, an M-dimensional array declared as type arr[n1][n2]...[nM] is an array
containing n1 arrays of dimension M-1. That is, each arr[i], where 0 ≤ i ≤ n1-1, is an
(M-1)-dimensional array of size n2 x ... x nM.

III.3 Pointers
III.3.1 Definition
A pointer is a memory location holding the memory address of an object (an object is a
memory area holding a value), hence the name pointer: a pointer is a variable that points
to an object (Figure III7).

Figure III7 Representation of a pointer


Introduced in this way, with no practical examples, you may wonder what kind of help we
could expect from pointers. In C, pointers are so handy that you could not work without
them. They are extensively used because they allow creating and manipulating high-level
objects (this will be described in the next chapters, mainly in Chapter VI in which we
explain how to create and work with your own data types). We will also use them to pass
data to functions, or to work directly on the data instead of a copy (detailed in Chapter VII
and Chapter VIII). For now, we are just trying to tame a concept that is so important in C
programming. A pointer is declared like this:
ptr_type *ptr_name

Where:
o ptr_name is a name (called identifier) identifying the pointer. It is made of letters,
underscores and digits starting with a letter or an underscore.
o ptr_type is the type of the object the pointer points to.

o The asterisk * declares a pointer, meaning the name appearing after is a pointer.

The following example declares pointers:
$ cat pointer1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float *fp; /* pointer to an object of type float */
int *ip; /* pointer to an object of type int */
unsigned int *uip; /* pointer to an object of type unsigned int */
char *s; /* pointer to an object of type char */

return EXIT_SUCCESS;
}

III.3.2 Memory addresses


Since a pointer is a variable holding the address of an object, how can we get the
address of an object in order to initialize a pointer? This is done with the address-of
operator &, as shown below:
$ cat pointer2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int v = 10;
float f = 1.23;

printf("v holds value %d and has address %p\n", v, (void *)&v);
printf("f holds value %f and has address %p\n", f, (void *)&f);

return EXIT_SUCCESS;
}
$ gcc -o pointer2 -std=c99 -pedantic pointer2.c
$ ./pointer2
v holds value 10 and has address feffea8c
f holds value 1.230000 and has address feffea88

The memory address of the v variable is denoted by &v and the address of the f variable is
denoted by &f. We used the specifier %p to show the addresses held in pointers[24]. More
generally, to get the address of an object named obj_name, precede it by an ampersand:
&obj_name.

III.3.3 Null pointers


In C, a special pointer constant, called a null pointer constant, indicates that a pointer does
not point to an object: it points to nothing that can store a value. A null pointer constant is
an integer constant expression (see Chapter IV, Section IV.14) that evaluates to 0, or such an
expression cast to void *: for example, 0, 2-2, 0*8 and (void *)0 are null pointer constants.
The implementation chooses whether the macro NULL, which represents the null pointer
constant, expands to 0 or (void *)0. NULL is defined in several standard header files,
including stdlib.h.

A null pointer constant cast to a given pointer type is known as a null pointer. For example,
if you declare the pointer p as float *p = NULL, p will be set to a null pointer (i.e. (float *)0)
that has type float *. This means there is a null pointer for each pointer type: a null pointer
of type char *, a null pointer of type float *, and so on.

Whatever the representation of null pointers, the following rules are always true:
o A null pointer compares unequal to a pointer pointing to an object or a function. This is
an important rule. It means a null pointer lets us mark a pointer as one that must not be
used to get or set values. This avoids having uninitialized pointers (invalid pointers) that
can hold any address, possibly one that represents no object: uninitialized pointers may
point anywhere! A null pointer assigned to a pointer tells the program: do not attempt to
access an object through this pointer, it does not point to an object.
o A null pointer, whatever its type, can be converted to a null pointer of another type. Two
null pointers compare equal even if their types are different. For example, if p and q are
declared as int *p=NULL and float (*q)[10] = NULL, the expression p == (int *)q is true. This does
not mean all null pointers hold the same value: as their types are different, their internal
representations may differ. This should not worry you since the compiler knows when it
deals with null pointers and performs the appropriate conversions.

III.3.4 Initializing a pointer


Now that you know that a pointer stores a memory address, you might think you could have
access to any address of the computer's memory. This is not true[25]:
o Your program does not have access to the whole memory of your computer. The UNIX
system and most modern operating systems use the concept of virtual memory, which
gives your program the illusion that it uses the entire main memory.
o When run, your program becomes a process that has a specific address space split
into several areas. Some areas are read-only: if you try to modify them, your
program will crash.

This means you should not set a pointer to any address. That is, you should avoid
initializing a pointer with any integer literal as in the following example:
$ cat pointer3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = 10;

printf("p holds address %p\n", (void *)p);

return EXIT_SUCCESS;
}
$ gcc -o pointer3 -std=c99 -pedantic pointer3.c
pointer3.c: In function 'main':
pointer3.c:5:12: warning: initialization makes pointer from integer without a cast
$ ./pointer3
p holds address a

You may think it worked. Yes, but it did nothing useful: we just set the value of the pointer p
to the address 10 and printed the value held in p. Notice that the compiler complained: in
our code, the variable p is a pointer to an int while the integer literal 10 is a numeric value
that is not a pointer. The compiler performed an implicit conversion and generated a warning
asking you to check this doubtful assignment. To avoid such a warning, you can be more
specific and tell the compiler that you do know what you are doing:
$ cat pointer4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = (int *)10;

printf("p holds address %p\n", (void *)p);

return EXIT_SUCCESS;

}
$ gcc -o pointer4 -std=c99 -pedantic pointer4.c
$ ./pointer4
p holds address a

The code pointer4.c generated no warnings at compilation time. What did we do? We just
explicitly cast the integer literal 10 to the expected type: (int *)10 tells the compiler that the
integer literal 10 is not a mere integer but a pointer to int. In other words, the literal 10 is an
address referencing a memory location holding an int. Thus, the type of (int *)10 is the same
as that of the pointer p. Always be cautious when you resort to explicit casts: they bypass
the warnings of the compiler but can be a cause of bugs. Our program generated no warnings
but still suffers from a big problem: the address 10 is an arbitrary value that the operating
system did not allocate to our program, so p is an invalid pointer. What happens if we try to
access it? Run this:
$ cat pointer5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = (int *)10;

printf("p holds address %p\n", (void *)p);
printf("Value referenced by pointer p %d\n", *p);

return EXIT_SUCCESS;
}
$ gcc -o pointer5 -std=c99 -pedantic pointer5.c
$ ./pointer5
p holds address a
Segmentation Fault (core dumped)

Invalid pointers do not point to valid objects. If you try to access an invalid address, the
behavior of your program is undefined: it may corrupt memory or crash. Here, the second
printf() call crashed our program because we tried to access an illegal address (Segmentation
Fault error).

The variable p is a variable holding the address of an object while *p is the object itself: *p
represents the contents of the memory location pointed to by the pointer p. The operator *
means the contents of the memory block identified by the address held in a pointer.


Figure III8 Relationship between a pointer and the object it references


So, remember that you do not have to manage the memory of the computer, just use the
memory that the

The first way of initializing a pointer is to work with addresses of existing objects by
using the address-of operator &, as in the following example in which we assign the
address of the variable v to the pointer p (depicted in Figure III8):
$ cat pointer6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int v = 21;
int *p = &v;


printf("variable v holds value %d and has address %p\n", v, (void *)&v);
printf("pointer p holds value %p and points to value %d\n", (void *)p, *p);

return EXIT_SUCCESS;
}
$ gcc -o pointer6 -std=c99 -pedantic pointer6.c
$ ./pointer6
variable v holds value 21 and has address feffea88
pointer p holds value feffea88 and points to value 21

If pointers were used only to store addresses of existing objects (allocated by the compiler
at compile time), they would never have been invented! Obviously, they can do more for
programmers... Suppose you wrote a C program that read a file holding information on
customers and stored it into arrays as we studied previously. Suppose you had one hundred
customers: naturally, you created arrays with a size larger than one hundred; let's say two
hundred. At the time you wrote your program, you imagined that your arrays were big
enough... What happens if the number of customers grows to two hundred and one? Your
program will fail. Therefore, you have to allocate memory dynamically.

Using addresses of existing objects, as described earlier, may be useful but does not enable
you to write programs working with dynamic data: existing objects are known at compilation
time. The problem is that your program may need many more objects depending on events.
You could use arrays, but arrays cannot be resized once created: once your array of two
hundred elements has been created, you cannot insert a 201st element. Fortunately,
and this is what makes pointers so useful, there is another way to initialize a pointer: using
the malloc() function, which is part of the C standard library and declared in the standard
header file stdlib.h.

The malloc() function asks the operating system for a piece of available memory and
returns a pointer to the allocated memory area. This method allows you to obtain memory
dynamically, according to your needs. Let us start smoothly with malloc():
$ cat pointer7.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( sizeof(int) );

*p = 10;
printf("pointer p holds value %p and points to value %d\n", (void *)p, *p);

*p = 19;
printf("pointer p holds value %p and points to value %d\n", (void *)p, *p);

return EXIT_SUCCESS;
}
$ gcc -o pointer7 -std=c99 -pedantic pointer7.c
$ ./pointer7
pointer p holds value 8061010 and points to value 10
pointer p holds value 8061010 and points to value 19

In this example, the call malloc(sizeof(int)) allocates a piece of memory of the size of an int
and returns its address. That is, the operating system will allocate a memory area that can
store an object of type int. Once the pointer references a valid address, you can work with it
safely. In our example, the allocated memory was located at address 8061010. Take note that
the address may change at each execution of the program: it is not fixed since memory is
dynamically allocated.

The statement *p = 10 stores the value of 10 in the memory location pointed to by the
pointer p. Likewise, the statement *p = 19 stores the value of 19 in the memory location
pointed to by the pointer p.

So far, we have used the symbol * to declare a pointer and to access the value a pointer
points to. When used with a pointer, it is a unary operator. The symbol * also denotes the
multiplication operator: it is then an operator requiring two operands (binary operator). So,
do not confuse them:
o If p and q are variables holding numbers, the statement x=q*p is a multiplication operation
(two operands); it has nothing to do with pointers. The operands p and q have numeric
values.
o If p has been declared as a pointer, the statement x=*p stores in x the value pointed to by
the pointer p: it is not a multiplication operation. The operator * applies to the operand that
follows it. In this case, the operand must be a pointer.

Contrast the following example:
$ cat pointer8.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int p = 5;
int x = *p;

printf("x=%d\n", x);

return EXIT_SUCCESS;
}
$ gcc -o pointer8 -std=c99 -pedantic pointer8.c
pointer8.c: In function 'main':
pointer8.c:6:11: error: invalid type argument of unary '*' (have 'int')

With:
$ cat pointer9.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 5;
int *p = &v;
int x = *p;

printf("x=%d\n", x);

return EXIT_SUCCESS;
}
$ gcc -o pointer9 -std=c99 -pedantic pointer9.c
$ ./pointer9
x=5

The program pointer8.c failed because the compiler expected a pointer while we gave it an
int. The statement int x =*p is illegal.

Let us take one step further. Consider now the following example:
$ cat pointer10.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int n = 5;
int *p = malloc( n * sizeof(int) );

return EXIT_SUCCESS;
}

What does it mean? The call malloc(n * sizeof(int)) dynamically allocates a contiguous piece
of memory that can store n elements of type int. Since n holds the value 5, the pointer p
points to a memory area that can hold five numbers of type int. It becomes very interesting:
such a pointer looks like an array...

You may think that if we had declared p as an array, int p[5], we would have gotten the
same result. Both give room for five objects of type int, but there are differences. In program
pointer10.c, the memory area is dynamically allocated, which means the allocation is done
while the program is running, not at compile time. The second big difference is that our
memory area can be resized while the size of an array cannot change (we will explain it
soon). The third difference is that we can free the allocated memory when we no longer need
it. Throughout the book, we will find out other differences between arrays and pointers.

In our previous example, we allocated a memory area composed of five elements of type
int: malloc() returned a pointer to it. The question is: if a pointer points to a memory area
that can store several elements, how can we access each element? The answer is not so
obvious because the pointer holds only one address, not the location of all the elements.
Let us give a clue: the pointer holds the location of the memory area, which is also the
address of the first element. This implies that if the pointer p contains the address of the
first element (let us call it addr), and as the allocated memory area is contiguous, the second
element is at address addr+sizeof(int), the third at addr+2*sizeof(int), and so on, as depicted in
Figure III9. At this stage, you may think that since a pointer is a variable holding the address
of the first element (we called it addr), the first element should logically also be at address p,
the second one at address p+sizeof(int), and so on. This seems obvious since p holds the
value addr but, in C, things are different because pointer arithmetic comes into play...




Figure III9 Memory allocation with malloc()



The reasoning is mathematically valid but is not true in C! Why? Because the compiler
does not process a pointer as a mere numeric value even though it holds an integer number
representing an address. For the compiler, a pointer is also bound to the type of the object
it points to: a pointer is not an integer type; it is more than a variable holding an address.
In C, a pointer has two attributes: an address and the type it points to. Thus, if the compiler
encounters a pointer in an addition or a subtraction operation such as p+1, it translates it to
addr+sizeof(obj_type). This is known as pointer arithmetic. More generally, if p is a pointer
(holding addr) to an object obj of type obj_type, the operation p±i is converted to
addr ± i*sizeof(obj_type) by the compiler. It is interesting to note that if p is a pointer and i an
integer value, the addition p+i works in pointer context (pointer arithmetic) and therefore
also returns a pointer: keep it in mind.

Why do such a conversion? Previously, we came to the conclusion that if p, holding the
value addr, is the address of the allocated contiguous memory area, which is also the address
of the first element, then addr+sizeof(obj_type) is the address of the second element, and so
on, and addr+(i-1)*sizeof(obj_type) is the address of the ith element (counting from 1). Since
the compiler converts pointers when they are encountered in addition and subtraction
operations, this means the first element is at address p, the second one at address p+1, the
third at p+2, and so on, and the ith element at p+i-1. This is good news because you do not
have to work with raw addresses. Working with raw addresses should be avoided because
the size of an address held in a pointer depends on the computer and is therefore not
portable. The following example sets and displays the first and second items of the memory
area pointed to by p:
$ cat pointer11.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int n = 5;
int *p = malloc( n * sizeof(int) ); /* allocates memory for 5 items of type int */

*p = 1;
*(p+1) = 2;

printf("first element=%d \n", *p);
printf("second_element=%d\n", *(p+1));


return EXIT_SUCCESS;
}
$ gcc -o pointer11 -std=c99 -pedantic pointer11.c
$ ./pointer11
first element=1
second_element=2

The C language allows you to use array subscripts with pointers. The following example is
equivalent to the previous one:
$ cat pointer12.c
#include <stdlib.h>
#include <stdio.h>


int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

p[0] = 1;
p[1] = 2;

printf("first element=%d \n", p[0]);
printf("second_element=%d\n", p[1]);


return EXIT_SUCCESS;
}
$ gcc -o pointer12 -std=c99 -pedantic pointer12.c
$ ./pointer12
first element=1
second_element=2

In summary, if p is a pointer to a memory area composed of several items:


o p is a pointer to the memory area
o p is also a pointer to the first object of the memory area
o p[0] holds the value of the first item of the memory area: p[0] is synonym for *p
o p+i is a pointer to the ith item of the memory area (counting from 0)
o p[i] and *(p+i) hold the value of the ith item of the memory area (counting from 0).
o The compiler converts p[i] to *(p+i).

Remember that even if pointers and arrays use the same notation, they are two different
types: a pointer is not an array. This will be detailed in the subsequent sections.

Figure III10 Representation of a pointer to int


We also draw your attention to the fact that pointers cannot be used in arbitrary numeric
operations: you cannot use pointers in multiplications and divisions. You can add an integer
to a pointer or subtract an integer from it, which yields a pointer, and you can subtract two
pointers of the same type to get the number of elements between them. The following
example shows that the addition operation also returns a pointer of the same type. The
example pointer13.c is equivalent to pointer12.c (see Figure III10):
$ cat pointer13.c
#include <stdlib.h>

#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

int *p_first_element = p;
int *p_second_element = p + 1;

*p_first_element = 1;
*p_second_element = 2;

printf("first element=%d \n", p[0]);
printf("second_element=%d\n", p[1]);

return EXIT_SUCCESS;
}
$ gcc -o pointer13 -std=c99 -pedantic pointer13.c
$ ./pointer13
first element=1
second_element=2

Explanation:
o The statement int *p=malloc(5*sizeof(int)) allocates a contiguous memory area that can store
five numbers of type int. The pointer p stores the address of the first element.
o The statement int *p_first_element=p declares p_first_element as a pointer to an int and
initializes it to the value held in the pointer p. It points to the first element of the memory
area.
o The statement int *p_second_element=p+1 declares p_second_element as a pointer to an int and
initializes it to the value of the expression p+1. It points to the second element.
o The statement *p_first_element=1 assigns the value 1 to the element pointed to by the
pointer p_first_element.
o The statement *p_second_element=2 assigns the value 2 to the element pointed to by the
pointer p_second_element.
o The printf("first element=%d \n", p[0]) statement displays the value of the first element.
o The printf("second_element=%d\n", p[1]) statement displays the value of the second element.

This simple example shows a very important subtlety that could drive you crazy if you
do not understand it at the beginning of your learning. You have noticed that the pointer
p_first_element points to the same object as the pointer p, and the pointer p_second_element
points to the same object as the pointer p+1. This means that they give access to the same
object. However, the pointer p_first_element is not the pointer p and the pointer
p_second_element is not the pointer p+1. In each case, they are two different pointers pointing
to the same object. To understand the subtlety clearly, consider the following example:
$ cat pointer14.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

int *q = p;

*p = 1;

printf("p holds %p and points to %d but p is at address %p\n", (void *)p, p[0], (void *)&p);
printf("q holds %p and points to %d but q is at address %p\n", (void *)q, q[0], (void *)&q);

return EXIT_SUCCESS;
}
$ gcc -o pointer14 -std=c99 -pedantic pointer14.c
$ ./pointer14
p holds 8061068 and points to 1 but p is at address feffea8c
q holds 8061068 and points to 1 but q is at address feffea88

The above example shows that both pointers p and q point to the same memory area,
located at memory address 8061068. This implies that you can access the memory area
equally through the pointer p or q (Figure III11). The example also shows that the pointer p
is different from the pointer q: they have two different addresses, meaning they are two
different objects (p and q are two distinct variables). This means that we could assign
another value to the pointer q without altering the pointer p, as in the following example:
$ cat pointer15.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */
int *r = malloc( 2 * sizeof(int) ); /* allocates memory for 2 items of type int */

int *q = p;
*p = 1;

printf("p holds %p and points to %d but p is at address %p\n", (void *)p, p[0], (void *)&p);
printf("q holds %p and points to %d but q is at address %p\n", (void *)q, q[0], (void *)&q);


q = r;
r[0]=10;

printf("\np holds %p and points to %d but p is at address %p\n", (void *)p, p[0], (void *)&p);
printf("r holds %p and points to %d but r is at address %p\n", (void *)r, r[0], (void *)&r);
printf("q holds %p and points to %d but q is at address %p\n", (void *)q, q[0], (void *)&q);

return EXIT_SUCCESS;
}
$ gcc -o pointer15 -std=c99 -pedantic pointer15.c
$ ./pointer15
p holds 8061160 and points to 1 but p is at address feffea6c
q holds 8061160 and points to 1 but q is at address feffea64

p holds 8061160 and points to 1 but p is at address feffea6c
r holds 8061968 and points to 10 but r is at address feffea68
q holds 8061968 and points to 10 but q is at address feffea64

As we explained several times, your objects should always be set to valid values before
you use them. An uninitialized pointer is an invalid pointer that may have any value. What
default value could we give to a pointer that we want to initialize with a valid address later
in our program? A corollary of the question is: how could we know whether a pointer has
been properly initialized? That is, how could we know that we can use a pointer safely?
Every time you declare a pointer, initialize it with the address of an existing object, with
the return value of a memory allocation function such as malloc(), or just set it to the default
value NULL. The macro NULL, representing a null pointer constant, is defined in the standard
header file stdlib.h. A null pointer indicates there is no object pointed to. Accordingly, before
accessing an object pointed to by a pointer, just check if it holds the NULL value: if yes, do
not attempt to dereference it with the operator *. The following example initializes the
pointer q to NULL:
$ cat pointer16.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *q = NULL;

return (EXIT_SUCCESS);
}

We said previously that the malloc() function returns a pointer to the allocated memory
block, but this is not always true. It may happen that malloc() cannot allocate memory: in this
case, it returns a null pointer. That is why you have to check the return value of the
function. If the returned pointer compares equal to NULL, you cannot work with
it. From now on, we will check the pointer returned by the malloc() function as shown below:
if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

In your programs, after calling malloc(), check whether the returned pointer is valid. If the pointer
compares equal to NULL, the program can print a warning message and end with the
exit code EXIT_FAILURE.

Figure III11 Pointers p and q referencing the same object

If you attempt to dereference a pointer holding the value NULL, the behavior is undefined: on most systems, your program will crash.

III.3.5 Accessing an object through a pointer


We have already talked about how to access pointers. In this section, we review what we
explained earlier and add some details. A pointer is a variable holding the
address (sometimes called a reference) of an object. You can access the pointer itself by
using its name as you would do with any variable. Thus, in the statement p = &v, the
pointer p is considered a container (on the left side of =) in which a value is placed, while in the
statement q = p, the pointer p (on the right side of =) represents the value it holds (an
address).

However, here is the thing: a pointer has a double meaning. It is more than a simple
address. It references an object. To have access to the object the pointer p references, just
place the dereferencing operator * before the pointer: *p is the object the pointer p
references [26]. Conversely, if obj is an object, to get its address, just place the reference
operator & before the object name. Thus, &obj is a pointer to obj (see Figure III8). For
example, if v is a variable of type int, &v is a pointer to int. Conversely, if r is a pointer to a
float, *r is a float.

We have also seen that a pointer can reference a memory area composed of several
items. In such a case, the pointer p references the very first item, p+1 the second one, and
so on. This means that *p is the first item, *(p+1) the second item, and so on. There is another
method to access the objects a pointer references that is also extensively used: array notation.
Though a pointer is not an array, you can resort to array subscripts to have access to objects
in the memory area pointed to by a pointer: p[0] is a synonym for *p and p[1] is a synonym for *(p+1),
which implies &p[0] is a synonym for p and &p[1] is a synonym for p+1, as shown below:
$ cat pointer17.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
long *p = malloc( 2*sizeof(long) ); /*allocates memory for 2 items of type long*/
if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);

}

p[0] = 1;
p[1] = 2;
printf("size of a long=%zu\n", sizeof(long));
printf("p[0]=%ld *p=%ld , p=%p &p[0]=%p\n", p[0], *p, p, &p[0]);
printf("p[1]=%ld *(p+1)=%ld , p+1=%p &p[1]=%p\n", p[1], *(p+1), p+1, &p[1]);

return EXIT_SUCCESS;
}
$ gcc -o pointer17 -std=c99 -pedantic pointer17.c
$ ./pointer17
size of a long=4
p[0]=1 *p=1 , p=8061090 &p[0]=8061090
p[1]=2 *(p+1)=2 , p+1=8061094 &p[1]=8061094

In the example above, we can notice that on our computer the type long fits in 4 bytes: the
address stored in p is 8061090, and the pointer p+1 holds the address 8061094. The rationale, if
you remember what we said in the previous section, is that the compiler computes p+1 as the
address held by p plus sizeof(long). Take note that the array subscript operator [] takes precedence over
the address-of operator &: &p[i] means &(p[i]), that is, the address of the object p[i]. Thus, &p[i] is
equivalent to p+i.

You may remember that in C, you can use negative subscripts to access items. The
rationale is that the compiler translates the array notation to a pointer notation: p[-1] is
converted to *(p-1) as shown below:
$ cat pointer18.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

int *p_second_item = p + 1;
int *p_first_item = p_second_item - 1;

p[0] = 12;

p[1] = 98;


printf("p[0]=%d address=%p\n", p[0], &p[0]);
printf("p_second_item[-1]=%d address=%p\n", p_second_item[-1], &p_second_item[-1]);
printf("p_first_item[0]=%d address=%p\n", p_first_item[0], &p_first_item[0]);

return EXIT_SUCCESS;
}
$ gcc -o pointer18 -std=c99 -pedantic pointer18.c
$ ./pointer18
p[0]=12 address=8061088
p_second_item[-1]=12 address=8061088
p_first_item[0]=12 address=8061088

In the example above, we could access any element starting from the pointer p_second_item,
even the first one. The first element can be denoted by p_first_item[0], p[0], or p_second_item[-1].

Do not use illegal subscripts. If you have created a memory area holding n objects, pointed to by the
pointer p, do not try to access the element p[n]: the index is out of range. It must be in the range [0,n-1].

III.3.6 Freeing a pointer


The malloc() function dynamically allocates memory to your program and returns a pointer.
If the returned pointer compares equal to NULL, it means the function failed to get free
memory. In this case, of course, the pointer is not usable. However, if the memory
allocation succeeds, you get a valid pointer to a memory area. If your
program consumes a lot of memory and never releases it, there may be a memory shortage:
your program may crash and could disrupt other running processes requesting memory.
You should always think about freeing memory each time you allocate it: it is good
practice to determine when allocated memory can be freed. The function free() releases
the memory area pointed to by the given pointer as shown in the following example:
$ cat pointer19.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */


if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[0] = 12;

printf("p[0]=%d address=%p\n", p[0], &p[0]);
free(p);
p = NULL;

return (EXIT_SUCCESS);
}

In our example above, we freed the allocated memory pointed to by the pointer p. After
you release a pointer, it is best practice to set it to the NULL value, indicating the pointer is
no longer valid. Take note that if you pass a null pointer to the free() function, it does
nothing.

Do not pass a pointer that was not returned by the malloc() function [27] to the free() function.
The following program is not correct:
$ cat pointer20.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

int *p_second_item = p + 1;

p[0] = 12;
printf("p[0]=%d address=%p\n", p[0], &p[0]);

free(p_second_item);

return EXIT_SUCCESS;
}

The above example frees the memory area pointed to by the pointer p_second_item that is not
the beginning of the allocated memory.

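The following example is also incorrect: it passes to free() the address of an ordinary variable, which was not allocated by malloc():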
$ cat pointer21.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 10;
int *p = &v;

free(p);

return EXIT_SUCCESS;
}

Here is the third thing to avoid: do not reuse a pointer released by the free() function. A
pointer relinquished by free() becomes an invalid pointer. The following example seems to
work but it actually upsets the memory of your program: it would crash if it were more
complex and had to run for a long time.
$ cat pointer22.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[0] = 12;

printf("p[0]=%d address=%p\n", p[0], &p[0]);
free(p);

p[0] = 13; /* wrong: p has been freed */
printf("p[0]=%d address=%p\n", p[0], &p[0]);

return (EXIT_SUCCESS);
}
$ gcc -o pointer22 -std=c99 -pedantic pointer22.c
$ ./pointer22
p[0]=12 address=8061038
p[0]=13 address=8061038

To avoid reusing pointers that have been freed, always set them to NULL as in example
pointer19.c.

Keep in mind that setting a pointer to another value does not free the allocated memory:
$ cat pointer23.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int *p = malloc( 5 * sizeof(int) ); /* allocates memory for 5 items of type int */

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[0] = 12;

printf("p[0]=%d address=%p\n", p[0], &p[0]);
p = NULL;

return (EXIT_SUCCESS);
}

The example pointer23.c does not free the allocated memory: it just loses the reference to the
allocated memory (causing a memory leak). If you do that, the memory will remain
allocated until the program terminates.

If possible, write the statement that releases allocated memory at the same time you write
code that allocates it. Thus, you will not forget to free unused memory. Memory blocks
remain allocated until you free them with the free() function or at the termination of the
program. When your program terminates, all the resources (including allocated memory
blocks) that it uses will be relinquished.

III.3.7 void * pointer


III.3.7.1 Definition
The void * pointer type is a special type used to represent any pointer to an object. Why introduce
such a type in C? Sometimes the type of the object that a pointer points to is not known.
For example, if you have a look at the declaration of the malloc() function, you will see
something like this:
void *malloc(size_t s);

We can see two special types that we have not talked about so far. The type size_t is defined
in several standard header files, including stdlib.h. It is an unsigned integer type used to express
the size of an object (in bytes). The sizeof operator yields a value of type size_t. The argument s of the
malloc() function denotes the number of bytes of the memory area to be allocated. As a matter
of fact, size_t is not a new basic type but an alias: we will explain how to create aliases of
existing types later. On 64-bit computers, size_t is often an alias for unsigned long. The size s
is typically computed from the size of a type or that of an object.

The type void * is very interesting. It is a pointer to an object of unknown type. The
malloc() function reserves a memory space having the requested size s. It does not need to
know what you will put in it: if you request four bytes, it will allocate four bytes, and you will
be able to put an integer, a floating-point number, four characters... it is up to you. Of
course, the void * pointer must be converted to a known pointer type before you can work with
the memory. For example, the statement int *p = malloc(sizeof(int)) allocates memory for an object of
type int: the pointer returned by malloc() does not remain a void *, it is implicitly converted to
type int *.

Remember the malloc() function does not always return a valid pointer. If the function
cannot allocate memory, a null pointer is returned.

Please, take note that in some examples (pointer7.c, pointer11.c, pointer12.c, pointer13.c, pointer14.c,
and pointer15.c), we assumed the malloc() function returned a valid pointer (that is not a null
pointer) without checking the returned value. We prefer explaining smoothly new concepts
with very simple examples without complicating them with too many details when
introducing them. As far as you are concerned, in your code, you have to check the pointer
returned by malloc().

III.3.7.2 Usage
The void * pointer is subject to some constraints. Since its type is unknown, you cannot use
it to access objects unless you cast it. For example, you cannot access an object it points to
by dereferencing it with * or using the subscript operator []. The following example will
not compile:
$ cat void_ptr1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 10;
void *p = &v;

printf("%d\n", *p);

return EXIT_SUCCESS;
}
$ gcc -o void_ptr1 -std=c99 -pedantic void_ptr1.c
void_ptr1.c: In function main:
void_ptr1.c:8:18: warning: dereferencing void * pointer
void_ptr1.c:8:3: error: invalid use of void expression

The following example will not compile either:


$ cat void_ptr2.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 10;
void *p = &v;

printf("%d\n", p[0]);

return EXIT_SUCCESS;
}
$ gcc -o void_ptr2 -std=c99 -pedantic void_ptr2.c
void_ptr2.c: In function main:
void_ptr2.c:8:19: warning: pointer of type void * used in arithmetic
void_ptr2.c:8:19: warning: dereferencing void * pointer
void_ptr2.c:8:3: error: invalid use of void expression

While the following example will work:


$ cat void_ptr3.c
#include <stdlib.h>

#include <stdio.h>

int main(void) {
int v = 10;
void *p = &v;

printf("%d\n", *(int *)p);
printf("%d\n", ((int *)p)[0]);

return EXIT_SUCCESS;
}
$ gcc -o void_ptr3 -std=c99 -pedantic void_ptr3.c
$ ./void_ptr3
10
10

Any pointer to an object can be converted to void * and back to its original type without
losing information. In the following example, the pointer p that is of type float * is converted to void *
and then back to float *:
$ cat void_ptr4.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
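float *p = malloc( 2*sizeof(float) );
void *q;
float *r;

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[0] = 10.1f; p[1] = 9.7f;

q = p; /* float * converted to void * */
r = q; /* void * converted back to float * */

printf("%f %f\n", r[0], r[1]);

free(p);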

return EXIT_SUCCESS;
}
$ gcc -o void_ptr4 -std=c99 -pedantic void_ptr4.c
$ ./void_ptr4
10.100000 9.700000

III.3.8 Sizeof operator and pointers

The sizeof operator returns the size of an object or a type. If you pass a type, do not forget
to enclose it between parentheses. For example:
$ cat size1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
long long i;

printf("sizeof(long long)=%zu, sizeof(i)=%zu\n", sizeof(long long), sizeof i);

return (EXIT_SUCCESS);
}
$ gcc -o size1 -std=c99 -pedantic size1.c
$ ./size1
sizeof(long long)=8, sizeof(i)=8

It is interesting to note it also holds true for pointers:


$ cat size2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
double *p = NULL;

printf("size of double=%zu, size of object=%zu\n", sizeof(double), sizeof *p);

return (EXIT_SUCCESS);
}
$ gcc -o size2 -std=c99 -pedantic size2.c
$ ./size2
size of double=8, size of object=8

Very interesting... At compile time, the sizeof operator evaluates to an integer constant that
represents the size of the operand. This means sizeof *p represents the size of the object
pointed to by p even though the pointer points to nothing meaningful: here, the operand is not
evaluated. Accordingly, the
statement int *p = malloc(10*sizeof(int)) can also be written int *p = malloc(10*sizeof *p). The
compiler replaces *p by the type of the object the pointer p points to. Why is it
interesting? If you change the type referenced by a pointer, you do not need to change it in
malloc() calls: you have to do it only once, at the declaration of the pointer. This
saves time and avoids many errors.

This also works with pointers to pointer as in the following example:


$ cat size3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
double **p = malloc( 2 * sizeof *p );

p[0] = malloc( 3 * sizeof **p);
p[1] = malloc( 3 * sizeof **p);

return (EXIT_SUCCESS);
}

In this example, p is a pointer to memory area holding two pointers to type double (p is a
pointer to type double *, p is a pointer to pointer to double), and then *p is a pointer to type
double. This implies, p = malloc( 2*sizeof(double *) ) can be replaced by p = malloc(2 * sizeof *p). In
the same way, p[0] = malloc(3 * sizeof **p) is equivalent to p[0] = malloc( 3 * sizeof(double) ).

III.3.9 Const and pointers


In Chapter II, we introduced the const qualifier that makes a variable read-only. A
const variable must not be modified by indirect means either: otherwise, the behavior is
undefined. The following example modifies the value of a const variable through a pointer
(it does not conform to the C standard):
$ cat pointer_const1a.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const int v = 10;
int *p = (int *)&v;

printf("v=%d\n", v);
*p = 20;
printf("v=%d\n", v);

return EXIT_SUCCESS;
}
$ gcc -o pointer_const1a -std=c99 -pedantic pointer_const1a.c
$ ./pointer_const1a
v=10
v=20

&v is a pointer to const int. Therefore, the statement int *p = (int *)&v makes an explicit cast to
int *. We can see that, though the variable v was qualified as const, it could be altered through the
pointer p. The program shows that the const qualifier may not protect against writes. The
program pointer_const1a.c worked on our computer but you should never do something like
this: the behavior is classified as undefined by the C standard, which means its result is
unpredictable and therefore not portable. Our program was compiled with no error message
because we used an explicit cast. If you remove the explicit cast and write int *p = &v
(implicit conversion), you will get a warning message:
$ cat pointer_const1b.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const int v = 10;
int *p = &v;

printf("v=%d\n", v);
*p = 20;
printf("v=%d\n", v);

return EXIT_SUCCESS;
}
$ gcc -o pointer_const1b -std=c99 -pedantic pointer_const1b.c
pointer_const1b.c: In function main:
pointer_const1b.c:6:12: warning: initialization discards qualifiers from pointer target type


The const qualifier can also be used with a pointer, either to make the referenced object read-only
or to make the pointer itself read-only. To make a pointer read-only, just place the
qualifier const after the asterisk *. For example, the declaration int *const p makes the pointer
p read-only while const int *p or int const *p means p is a pointer to const int.

The following example makes the pointer p read-only. That is, the pointer p cannot be
modified:
$ cat pointer_const2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int * const p = malloc(10 * sizeof(int) );

int v = 10;

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p=&v;
printf("%d\n", *p);

free(p);
return EXIT_SUCCESS;
}
$ gcc -o pointer_const2 -std=c99 -pedantic pointer_const2.c
pointer_const2.c: In function main:
pointer_const2.c:13:3: error: assignment of read-only variable p

The compilation failed because we attempted to modify the pointer p that was declared as
a constant pointer.

The following example makes the object pointed to by the pointer q read-only (q points to
elements of type const int):
$ cat pointer_const3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

int *p = malloc(2*sizeof(int) );
const int *q = p;/* q points to const int */

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[1] = 20;
printf("q[1]=%d\n", q[1]);

p[1] = 40;
printf("q[1]=%d\n", q[1]);


free(p);
return EXIT_SUCCESS;
}
$ gcc -o pointer_const3 -std=c99 -pedantic pointer_const3.c
$ ./pointer_const3
q[1]=20
q[1]=40

It works fine as long as we make modifications through the pointer p, but if we try to make
modifications through the pointer q, we get an error:
$ cat pointer_const4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

int *p = malloc(2*sizeof(int) );
const int *q = p;

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return EXIT_FAILURE;
}

q[1] = 20;
printf("q[1]=%d\n", q[1]);

free(p);

return EXIT_SUCCESS;
}
$ gcc -o pointer_const4 -std=c99 -pedantic pointer_const4.c
pointer_const4.c: In function main:
pointer_const4.c:14:3: error: assignment of read-only location *(q + 4u)

The example shows that the same object can be modified through the pointer p while it
cannot through the pointer q.

Generally, the const qualifier is used in function declarations to tell the programmer the
function will not modify the object pointed to by the pointer you pass to it. For example,
the declaration int myfunc(char *s2, const char *s1) indicates the string pointed to by s1 will not
be modified by the function myfunc().

III.3.10 Arrays and pointers


You have guessed that, in C, pointers and arrays are closely connected. The rationale is that the
compiler translates arrays to pointers except in the following cases:
o The array is an operand of the sizeof operator. If the array arr contains n elements of type
obj_type, sizeof arr evaluates to n * sizeof(obj_type). In contrast, if p is a pointer, sizeof p evaluates
to the size of the pointer, whatever the type it points to.
o The identifier appears on the left side of the assignment operator (=): p = something. This
is not allowed for arrays while permitted for pointers.

Thus, the identifier of an array appearing in expressions is converted to a pointer to the
first element:
int arr[10];
int *p;
p = arr; /* arr converted to &arr[0] */
p = arr + 1; /* arr converted to &arr[0] and p points to the second element */

Which is equivalent to:


int arr[10];
int *p;
p = &arr[0];
p = &arr[0] + 1;

An array is also converted to a pointer if it is an argument of a function. In the following
example, the array is translated to a pointer to its first element:
char arr[10];
strcpy(arr, "copy this");

The example above is then equivalent to:
char arr[10];
strcpy(&arr[0], "copy this");

and equivalent to:
char arr[10];
char *p = arr;
strcpy(p, "copy this");

As already mentioned, an element denoted by s[i] is translated to *(s+i) whether s is a
pointer or an array.

III.4 Strings
III.4.1 Definition
Now, let us talk about an important concept related to arrays and pointers: strings. A string is
a sequence of characters terminated by the null character. What is a null character? In
computing, a character is in fact represented by a code fitting in one or more bytes. The
null character has the character code 0, denoted by the character literal '\0': all its bits are set
to 0. It is important to note that in C, the length of a string is the number of characters
preceding the null character. For example, the string "hello" has a length of five characters.

A string literal is a sequence of characters enclosed within double quotes ("),
such as "C Programming".

III.4.2 Strings and arrays


We have already talked about strings in Chapter II. We said a string could be declared as
char *. This is true but it can also be declared as an array of characters. The type string is
not a basic type but a sequence of char. Let us start with a string as an array of char. When
you work with strings, always remember that they end with the string terminator,
called the null character, denoted by '\0'. You have two methods to initialize an array of char
with character literals: by enclosing character literals between braces or by using a string literal.
The following example initializes the array msg with the string "hello".
$ cat string1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[6] = {'h', 'e', 'l', 'l', 'o', '\0' };

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}
$ gcc -o string1 -std=c99 -pedantic string1.c
$ ./string1
msg=hello

In the example string1.c, we declared an array of six elements of type char. The array msg is
large enough to hold the string "hello". The following example is not correct because the
array msg is too small:

$ cat string2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[5] = {'h', 'e', 'l', 'l', 'o', '\0' };

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}
$ gcc -o string2 -std=c99 -pedantic string2.c
string2.c: In function main:
string2.c:5:4: warning: excess elements in array initializer
string2.c:5:4: warning: (near initialization for msg)

The compiler generated the executable but with warnings: the array is too small. The last
character ('\0') is ignored. The code above is the same as the following one:
$ cat string3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[5] = {'h', 'e', 'l', 'l', 'o'};

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}

The example string3.c is not correct. There is no warning but the code contains a bug: we
used the msg array as a string while it is not terminated by the null character. If you run it,
you may see strange characters on your screen because the printf() function displays the
characters of the array, and whatever follows it in memory, until it meets a null character.

Instead of specifying the size of our array, we could let the compiler compute it for us:
$ cat string4.c
1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4 int main(void) {
5 char msg[] = {'h', 'e', 'l', 'l', 'o', '\0' };
6 size_t msg_nb_elt = sizeof msg;
7 size_t string_len = strlen(msg);
8
9 printf("Array msg holds %s\n", msg);
10 printf("Size of array msg=%zu\n", msg_nb_elt);
11 printf("Length of string %s=%zu\n", msg, string_len);
12
13 return EXIT_SUCCESS;
14 }
$ gcc -o string4 -std=c99 -pedantic string4.c
$ ./string4
Array msg holds hello
Size of array msg=6
Length of string hello=5

Explanation:
o Line 1: we include the header file stdio.h that declares the function printf().
o Line 2: we include the header file string.h that declares the function strlen().
o Line 5: we define msg as an array of char holding six character literals. Its size is
evaluated by the compiler since it is fully initialized.
o Line 6: we get the number of characters in the msg array. You have noticed we did not
write msg_nb_elt = sizeof msg/sizeof(char) but msg_nb_elt = sizeof msg because sizeof(char) is always
1. Thus, the size of an array of char (in bytes) is the number of characters it contains: the
size is 6.
o Line 7: the strlen() function counts the number of characters (preceding the null character)
of the given array. It returns 5.

Figure III12 Initialization of an array with a string literal


The C language also lets you initialize an array with a string literal:
$ cat string5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[6] = "hello";

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}
$ gcc -o string5 -std=c99 -pedantic string5.c
$ ./string5
msg=hello

This method is more convenient but, as explained earlier, your array must be large enough
to contain all the characters of the string including the null character. The following
example is not correct because the null character cannot be placed in the array (too small):
$ cat string6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[5] = "hello";

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}

You can let the compiler compute the size of the array itself:
$ cat string7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[] = "hello";

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}
$ gcc -o string7 -std=c99 -pedantic string7.c
$ ./string7
msg=hello

The statements char msg[] = "hello" and char msg[] = {'h', 'e', 'l', 'l', 'o', '\0' } are equivalent: they
copy the character literals into the array (see Figure III12).

The example string7.c is also equivalent to the following:

$ cat string8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[6];
msg[0] = 'h';
msg[1] = 'e';
msg[2] = 'l';
msg[3] = 'l';
msg[4] = 'o';
msg[5] = '\0';

printf("msg=%s\n", msg);
return EXIT_SUCCESS;
}
$ gcc -o string8 -std=c99 -pedantic string8.c
$ ./string8
msg=hello

In this example, we copied the character literals to the array ourselves.


III.4.3 Strings and pointers


If a string is a sequence of characters terminated by the null character, it can be also
viewed as a pointer to char. We just need to allocate enough memory to store the characters
as shown below:
$ cat string9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *msg = malloc(6*sizeof(char));

if ( msg == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

msg[0] = 'h';
msg[1] = 'e';
msg[2] = 'l';
msg[3] = 'l';
msg[4] = 'o';
msg[5] = '\0';

printf("msg=%s\n", msg);

free(msg);

return EXIT_SUCCESS;
}
$ gcc -o string9 -std=c99 -pedantic string9.c
$ ./string9
msg=hello

Since sizeof(char) is always 1, the code string9.c could have been written as follows:
$ cat string10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *msg = malloc(6);

if ( msg == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

msg[0] = 'h';
msg[1] = 'e';
msg[2] = 'l';
msg[3] = 'l';
msg[4] = 'o';
msg[5] = '\0';

printf("msg=%s\n", msg);

free(msg);

return EXIT_SUCCESS;
}
$ gcc -o string10 -std=c99 -pedantic string10.c
$ ./string10
msg=hello

You have now understood what a pointer is and how to work with them. Do you think the
following example is equivalent to the examples string9.c and string10.c?
$ cat string11.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *msg = "hello";
printf("msg=%s\n", msg);

return EXIT_SUCCESS;
}
$ gcc -o string11 -std=c99 -pedantic string11.c
$ ./string11
msg=hello

Figure III13 Initialization of a pointer with a string literal


We got the same output and yet they are completely different! Why? A pointer is a
reference to an object. It is a variable holding an address pointing to an object. Remember
that a pointer can be initialized with the address of an existing object or with malloc(). In the
example above, we initialized the pointer with a string literal. The compiler stores the
characters of the literal, followed by the null character, somewhere in memory and
assigns the address of the first character to the pointer (see Figure III13).

Since the pointer msg was not initialized with malloc(), it must not be freed. Since it has
been initialized with a string literal, the object it references must not be modified
either. In other words, you have to avoid doing something like this:
$ cat string12.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *msg = "hello";

msg[0] = 'H';
printf("msg=%s\n", msg);

return EXIT_SUCCESS;
}
$ gcc -o string12 -std=c99 -pedantic string12.c
$ ./string12
Segmentation Fault (core dumped)

On our computer, the program crashed. The C standard classifies the modification of a string
literal as undefined behavior, so the result depends on the implementation. In C,
you must not attempt to modify a string literal even if pointers let you think you can do it.
Certainly, the C language saves you time by letting you initialize a pointer with a string literal but it
is assumed you understand what you can and cannot do with it.

III.4.4 Manipulating strings


III.4.4.1 Introduction
The C language itself does not provide facilities to work with strings: this task is
performed by libraries. A library can be viewed as a set of objects and functions
performing specific actions provided externally. When you install a compiler on your
system, a number of libraries come bundled with it. However, only the C standard library
is actually required. Programmers often create their own libraries. As far as we are
concerned, for now, we will just use the C standard library. Later, we will learn how to
build libraries and how to use external libraries.

The C standard library is actually made of several modules (we will talk about them later
in the book): there is a module for manipulating strings, another one for managing
errors... For each module, there is a header file declaring the functions and objects that are
implemented by the module. In this section, we will work with some functions declared in
the header file string.h.

III.4.4.2 strcpy()

The C standard function strcpy(), declared in the standard header file string.h, copies the
string pointed to by src into the memory block pointed to by the pointer dest, and returns
dest:
char *strcpy(char *dest, const char *src);

The prototype of the function above is easy to understand: the src pointer points to const
char, which tells the programmer that the string pointed to by the pointer src will not
be altered by the function. You can safely pass pointers or arrays [28] to the function. The
following example copies the characters in the array s1 into the array s2:
$ cat strcpy1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello";
char s2[8];

strcpy(s2, s1);
printf("s1 holds %s and s2 holds %s\n", s1, s2);
printf("size of s1=%zu, size of s2=%zu\n", sizeof s1, sizeof s2);
printf("Length of string held s1=%zu, length of string held s2=%zu\n", strlen(s1), strlen(s2));

return EXIT_SUCCESS;
}
$ gcc -o strcpy1 -std=c99 -pedantic strcpy1.c
$ ./strcpy1
s1 holds hello and s2 holds hello
size of s1=100, size of s2=8
Length of string held s1=5, length of string held s2=5

The example declared two arrays of char. Both were large enough to hold the string
hello. At least a size of six bytes was required (do not forget the null character). As you
can see, the strcpy() function copied the contents of the array s1 into the array s2. Of course,
you could also work with pointers in place of arrays as shown below:
$ cat strcpy2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello";
char *s2 = malloc(8);

if ( s2 == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

strcpy(s2, s1);
printf("s1 holds %s and s2 holds %s\n", s1, s2);
printf("size of s1=%zu, size of s2=%zu\n", sizeof s1, sizeof s2);
printf("Length of string held s1=%zu, length of string held s2=%zu\n", strlen(s1), strlen(s2));

free(s2);
return EXIT_SUCCESS;
}
$ gcc -o strcpy2 -std=c99 -pedantic strcpy2.c
$ ./strcpy2
s1 holds hello and s2 holds hello
size of s1=100, size of s2=4
Length of string held s1=5, length of string held s2=5

We got the same output with the exception of size of s2. As we fully explained in the
previous sections, the size of s2 is the size of a pointer.

What happens if the target array is not large enough?
$ cat strcpy3.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello";
char s2[2];

strcpy(s2, s1);
printf("s1 holds %s and s2 holds %s\n", s1, s2);

return EXIT_SUCCESS;
}
$ gcc -o strcpy3 -std=c99 -pedantic strcpy3.c
$ ./strcpy3
s1 holds llo and s2 holds hello

The example strcpy3.c shows that strcpy() does not care whether the target array is large
enough to hold the string: it performs the copy anyway. The function does no boundary
check. The rationale is that it receives only a pointer (an array argument is converted to
a pointer), so it cannot know the size of the memory area pointed to. This means that if
you pass an array (or a pointer) that is not large enough, strcpy() writes into memory that
it should not access. Writing outside the bounds of an object is undefined behavior. In our
example, you can notice that the array s1 was corrupted by the strcpy() function: it held
the string llo.

Before passing an array to the strcpy() function, check the target array is large enough for the copy.


The strcpy() function is supposed to deal with strings, so do not provide a source array
that contains something else: the source array has to contain a null character.
Otherwise, the strcpy() function will read and copy all the characters it finds until it meets a
null character. The following example contains an error causing undefined behavior:
$ cat strcpy4.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100];
char s2[8];

strcpy(s1, "hello");
s1[5] = '!';

strcpy(s2, s1);
printf("s1 holds %s and s2 holds %s\n", s1, s2);

return EXIT_SUCCESS;
}

Have you guessed where the error is located? Yes, the statement s1[5] = '!' replaces the null
character with the exclamation mark, so s1 no longer holds a string. The program compiles
with no error, yet it contains a bug.

Here is another error that you must avoid: giving two overlapping pointers:
$ cat strcpy5.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello";

strcpy(s1+1, s1);

printf("s1 holds %s\n", s1);

return EXIT_SUCCESS;
}
$ gcc -o strcpy5 -std=c99 -pedantic strcpy5.c
$ ./strcpy5
s1 holds hhelll

The target and source areas must not overlap. That is why C99 introduced a new
qualifier known as restrict. As of C99, the prototype of strcpy() has been updated:
char *strcpy(char *restrict dest, const char *restrict src);

The function prototype is valid only as of the C99 standard. Compilers that do not
implement the C99 standard cannot use it and use the previous function prototype.

What does the keyword restrict mean? The C99 standard introduced it to qualify a pointer
only. It means that the passed pointer is the only pointer that has access to the memory
area it points to: there is no other pointer that will attempt to access it. A declaration with
the restrict qualifier warns programmers: if the requirement is not met, the function may not
work properly. The compiler does not check if the requirement is met, it is the
responsibility of the programmer to ensure it before using the function.

For efficiency reasons, some functions require that the passed pointers have exclusive
access to the memory blocks they point to. Of course, it is possible to implement a
function that does the same job as strcpy() without such a requirement. However, such a
function would be less efficient. We will explain how to implement it in Chapter VII.

III.4.4.3 strncpy()
Another interesting function that copies strings is strncpy(). It does the same job as strcpy()
except it copies at most n characters.
Until C95:
char *strncpy(char *dest, const char *src, size_t n);

As of C99:
char *strncpy(char *restrict dest, const char *restrict src, size_t n);

If the source string pointed to by src has a length less than n, it copies the whole string
including the null character to the memory block pointed to by dest. Characters following
the null character are not copied. Moreover, extra null characters are appended to the
target string until the total number of characters written reaches the value n. If the source
string has a length greater than or equal to n, only its first n characters are copied and
the memory area pointed to by dest is not terminated by the null character.

The following example copies the string hello world entirely because its null character is
met before 19 characters have been written.
$ cat strcpy6.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello world";
char s2[20];
size_t n = 19; /* number of characters to copy */

strncpy(s2, s1, n);
printf("s1 holds %s and s2 holds %s\n", s1, s2);

return EXIT_SUCCESS;
}
$ gcc -o strcpy6 -std=c99 -pedantic strcpy6.c
$ ./strcpy6

s1 holds hello world and s2 holds hello world

The following example copies a part of the string hello world: five characters. It seems to
be correct, yet it contains an error. Find it:
$ cat strcpy7.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello world";
char s2[20];
size_t n = 5; /* number of characters to copy */

strncpy(s2, s1, n);
printf("s1 holds %s and s2 holds %s\n", s1, s2);

return EXIT_SUCCESS;
}

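Its behavior is undefined because the array s2 does not contain a null character: strncpy()
copied only five characters and s2 was not initialized. We have to write the null character
ourselves. So, the previous example should be rewritten like this: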
$ cat strcpy8.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "hello world";
char s2[20];
size_t n = 5; /* number of characters to copy */

strncpy(s2, s1, n);
s2[n] = '\0';
printf("s1 holds %s and s2 holds %s\n", s1, s2);

return EXIT_SUCCESS;
}
$ gcc -o strcpy8 -std=c99 -pedantic strcpy8.c
$ ./strcpy8

s1 holds hello world and s2 holds hello

What we said about strcpy() holds true for strncpy():


o Ensure your character strings are terminated with the null character
o Do not use overlapping pointers
o The target array must be large enough to store the characters that will be copied
III.4.4.4 strcat() and strncat()
The functions strcat() and strncat() concatenate two strings. For example, let us assume we
have an array storing the string some and another one storing the string thing: we can
concatenate them to get the string something. Let us start with strcat():
Until C95:
char *strcat(char *dest, const char *src);

As of C99:
char *strcat(char *restrict dest, const char *restrict src);

It copies the string (including the null character) pointed to by src to the end of the string
pointed to by dest, overwriting the null character of the string pointed to by dest. The
resulting concatenated string (terminated with the null character) will be stored in the
memory block pointed to by dest. The contents of src are left untouched. Of course, the
memory block pointed to by dest must be large enough to hold the concatenated string.

The following example appends the string held in the array s2 to the string held in the
array s1:
$ cat strcat1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "some";
char s2[20] = "thing good";

strcat(s1, s2);
printf("s1: %s and s2: %s\n", s1, s2 );

return EXIT_SUCCESS;
}
$ gcc -o strcat1 -std=c99 -pedantic strcat1.c

$ ./strcat1
s1: something good and s2: thing good


The strncat() function has a prototype that looks like this:
char *strncat(char *dest, const char *src, size_t n);

The function strncat() also concatenates two strings. It copies at most n characters of the
string pointed to by src to the end of the string pointed to by dest, overwriting the null
character of the string pointed to by dest. If n is greater than the length of the string
pointed to by src, all the characters of the string are copied. The resulting concatenated
string is always terminated with the null character (unlike strncpy()), and stored in the
memory block pointed to by dest. The contents of src are left untouched.

The following example appends the first five characters of the string held in the array s2
to the string held in the array s1:
$ cat strcat2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[100] = "some";
char s2[20] = "thing good";

strncat(s1, s2, 5);

printf("s1: %s and s2: %s\n", s1, s2 );

return EXIT_SUCCESS;
}
$ gcc -o strcat2 -std=c99 -pedantic strcat2.c
$ ./strcat2
s1: something and s2: thing good


What we said about strcpy() and strncpy() holds true for strcat() and strncat(). To avoid
undefined behavior in your programs:
o Ensure the character strings pointed to by src and dest are terminated with the null
character
o Do not use pointers that overlap
o The target array must be large enough to store the characters that will be copied

As of C99, strcat() and strncat() have the following prototypes:
char *strcat(char *restrict dest, const char *restrict src);
char *strncat(char *restrict dest, const char *restrict src, size_t n);

The restrict qualifier does not change the behavior of the functions.

III.4.4.5 strcmp() and strncmp()
In the C language, the operator that compares two objects and tells if they are equal is
denoted by two equals signs ==. Do not confuse it with the assignment operator that is
represented by one equals sign =. The expression x == y returns 1 (true) if x equals y, and 0
(false) otherwise. This will be detailed in the next chapter, we give, here, a little overview
so that you could understand why the function strcmp() should be invoked to compare
strings. The following example compares two variables x and y:
$ cat strcmp1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

int x ;
int y ;
int z ;

x = 10 ; y = 20 ; z = x == y ;
printf("x=%d, y=%d. z=%d\n", x, y, z ); /* x and y are not equal => Returns 0 */

x = 10 ; y = 10 ; z = x == y ;
printf("x=%d, y=%d. z=%d\n", x, y, z ); /* x and y are equal => Returns 1 */

return EXIT_SUCCESS;
}
$ gcc -o strcmp1 -std=c99 -pedantic strcmp1.c
$ ./strcmp1
x=10, y=20. z=0
x=10, y=10. z=1

The expression z = x == y seems to be quite strange but it is valid. The == operator takes
precedence over the assignment operator =: it is evaluated first. In the example above, if x
holds the value 10 and y holds the value 20, the expression x == y evaluates to the value of
0 that is then assigned to the variable z. Let us now compare two strings:
$ cat strcmp2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[] = "hello";
char s2[] = "hello";
int z ;

z = s1 == s2 ;
printf("s1=%s, s2=%s. z=%d\n", s1, s2, z );

return EXIT_SUCCESS;
}
$ gcc -o strcmp2 -std=c99 -pedantic strcmp2.c
$ ./strcmp2
s1=hello, s2=hello. z=0

The arrays s1 and s2 contain the same string, yet they are evaluated to be different. If you
remember what we said earlier, an array name appearing without the array symbol [] is
converted to the address of its first element (i.e. a pointer to its first element). This implies
the expression s1 == s2 compares two addresses, which are, of course, different. We would
have the same problem with pointers:
$ cat strcmp3.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char *s1 = malloc(6) ;
char s2[] = "hello";
int z ;

if ( s1 == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

strcpy(s1, s2);
z = s1 == s2 ;
printf("s1=%s, s2=%s. z=%d\n", s1, s2, z );

free(s1);

return EXIT_SUCCESS;
}
$ gcc -o strcmp3 -std=c99 -pedantic strcmp3.c
$ ./strcmp3
s1=hello, s2=hello. z=0

The functions strcmp() and strncmp() compare the strings pointed to by the pointers s1 and s2
and return 0 if they hold the same characters. Here is the prototype of strcmp():
int strcmp(const char *s1, const char *s2);

It is very important to remember that strcmp() returns the value of 0 if the strings pointed to
by the passed pointers contain the same characters. Consider the function strcmp() as a
comparison function; it should not be viewed as an equal-to operator for strings. The
function reads the first character of s1 (let c1s1 be this character) and the first character of s2
(let c1s2 be this character): if c1s1 is greater than c1s2, it returns a positive integer; if c1s1 is
less than c1s2, it returns a negative integer. Otherwise, it continues the comparison of the
strings according to the same process (if the second character c2s1 is greater than c2s2, it
returns a positive integer, and so on). If the strings contain the same characters, the value of 0 is
returned. Now, we can correct our example strcmp2.c as follows:
$ cat strcmp4.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[] = "hello";
char s2[] = "hello";
int z ;

z = strcmp(s1, s2);
printf("s1=%s, s2=%s. z=%d\n", s1, s2, z );

return EXIT_SUCCESS;
}
$ gcc -o strcmp4 -std=c99 -pedantic strcmp4.c
$ ./strcmp4
s1=hello, s2=hello. z=0

In the following example, the strcmp() function returns a negative integer because the
character H is less than the character h.
$ cat strcmp5.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[] = "Hello";
char s2[] = "hello";
int z ;

z = strcmp(s1, s2);
printf("h=%d, H=%d\n", 'h', 'H' );
printf("s1=%s, s2=%s. z=%d\n", s1, s2, z );

return EXIT_SUCCESS;
}
$ gcc -o strcmp5 -std=c99 -pedantic strcmp5.c
$ ./strcmp5
h=104, H=72
s1=Hello, s2=hello. z=-32

Generally, the function is used to determine whether two strings are equal.

The strncmp() function does the same job as strcmp() except it compares at most n characters:
int strncmp(const char *s1, const char *s2, size_t n);

For example:
$ cat strcmp6.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {

char s1[] = "hello!";
char s2[] = "hello";
int z1,z2 ;

z1 = strcmp(s1, s2);
z2 = strncmp(s1, s2, 5);

printf("s1=%s, s2=%s. z1=%d and z2=%d\n", s1, s2, z1, z2 );

return EXIT_SUCCESS;
}
$ gcc -o strcmp6 -std=c99 -pedantic strcmp6.c
$ ./strcmp6
s1=hello!, s2=hello. z1=33 and z2=0

In our example strcmp6.c, the strcmp() function compares all the characters up to the null
character while strncmp() compares only the first five characters.

III.4.4.6 atoi()
The atoi() function converts a string s to the integer number it contains:
int atoi(const char *s);

For example:
$ cat atoi1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
printf("atoi(\"10\")=%d\n", atoi("10") );
printf("atoi(\"V10\")=%d\n", atoi("V10") );
printf("atoi(\"10.7\")=%d\n", atoi("10.7") );
return EXIT_SUCCESS;
}
$ gcc -o atoi1 -std=c99 -pedantic atoi1.c
$ ./atoi1
atoi("10")=10
atoi("V10")=0
atoi("10.7")=10

In the example, we used the escape character \ preceding the double quotation marks to
prevent the compiler from interpreting them as the end of the string literal, which allowed
us to print them. We can notice two things:
o If the argument of the atoi() function does not start with a number (as in "V10"), it returns 0
o If the argument of the atoi() function contains a floating-point value with a fractional part,
only the integral part is returned: the conversion stops at the first character that cannot be part of an integer.

III.4.4.7 atof()
The atof() function converts a string s to the floating-point number it contains:
double atof(const char *s);

For example:
$ cat atof1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
printf("atof(\"10\")=%f\n", atof("10") );
printf("atof(\"V10\")=%f\n", atof("V10") );
printf("atof(\"10.7\")=%f\n", atof("10.7") );
return EXIT_SUCCESS;
}
$ gcc -o atof1 -std=c99 -pedantic atof1.c
$ ./atof1
atof("10")=10.000000
atof("V10")=0.000000
atof("10.7")=10.700000

The example shows that if the argument of the atof() function does not start with a
number, it returns 0.

III.5 Arrays are not pointers


One question arises: is a string an array or a pointer? Both can often be used
interchangeably, but they are different objects. A pointer is an object holding the address
of an object while an array is an object holding other objects (see Figure III-14).

Figure III-14 Representation of an array and a pointer

Figure III-14 represents an array and a pointer. An array is an object holding objects; its
size is the sum of the sizes of its elements. A pointer just points to the beginning of the
memory area it references. That is, from the pointer's perspective, the number of elements
contained in the referenced memory area cannot be known, unlike with an array. In other
words, an array can be viewed as a set of objects grouped into the same box holding a
name. From the perspective of a pointer, a memory area allocated by malloc() is a set of
independent contiguous objects, of which only the first element is referenced and actually
known by the pointer.


The following example shows that the array a_msg and the pointer p_msg can be used in the
same way:
$ cat array_vs_pointer1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
char a_msg[3];
char *p_msg = malloc(3);

if ( p_msg == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p_msg[0] = a_msg[0] = 'O';
p_msg[1] = a_msg[1] = 'K';
p_msg[2] = a_msg[2] = '\0';

size_t a_string_len = strlen(a_msg);
size_t p_string_len = strlen(p_msg);

printf("Array a_msg holds %s and pointer p_msg holds %s\n", a_msg, p_msg);
printf("Length of string in a_msg %s=%zu\n", a_msg, a_string_len);
printf("Length of string in p_msg %s=%zu\n", p_msg, p_string_len);

free(p_msg);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer1 -std=c99 -pedantic array_vs_pointer1.c
$ ./array_vs_pointer1
Array a_msg holds OK and pointer p_msg holds OK
Length of string in a_msg OK=2
Length of string in p_msg OK=2

We can see the only difference between the array a_msg and the pointer p_msg is their
declaration: a_msg was declared as an array of three elements of type char and p_msg was
declared as a pointer to char pointing to a memory area (allocated by malloc()) that can hold
three elements. Therefore, you can store your strings in arrays or in memory referenced by
pointers. If you work with pointers, do not forget to allocate memory and then free it.



However, their behavior is completely different if you use a string literal to initialize them.
Initializing an array with a string literal copies the characters composing the string
literal into the array. Initializing a pointer with a string literal just copies the address of
the string into the pointer. Why such a different behavior? Because when you declare an
array, memory space is reserved for its elements: int a[5] allocates a chunk of memory that can hold
five elements of type int. When you declare a pointer, memory space is reserved only for
storing an address, not for the object itself: for example, the statement int *p allocates a
piece of memory called p that can hold an address only. This point is very important to
understand. When you write something like this:
int v = 10;
int *p = &v;


A piece of memory is reserved to store the address of the object v into the pointer p; the
object v has been created before by the statement int v = 10. When you write char *p_msg =
malloc(3), a memory block, whose size is three bytes, is allocated and its address is stored in
p_msg. That is, the statement allocates two pieces of memory: one for holding the address
of the object and one holding the object itself (of three bytes).

Now you can guess an array is not a pointer. An array is a named memory area. A pointer
is a reference to a memory area that may or may not exist; if it does not exist, the pointer
points to nothing that can be used. Let us examine through examples the differences between
an array and a pointer.
o Difference one: an array cannot be altered
$ cat array_vs_pointer2.c
1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4
5 int main(void) {
6 char a_msg[] = "hello";
7 char *p_msg = "hello";
8
9 printf("a_msg=%s and p_msg=%s\n", a_msg, p_msg);
10
11 p_msg = "OK";
12 a_msg = "OK";
13 printf("a_msg=%s and p_msg=%s\n", a_msg, p_msg);
14 return EXIT_SUCCESS;
15 }
$ gcc -o array_vs_pointer2 -std=c99 -pedantic array_vs_pointer2.c
array_vs_pointer2.c: In function 'main':
array_vs_pointer2.c:12:10: error: incompatible types when assigning to type 'char[6]' from type 'char *'

Explanation:
Line 6-7: we initialize both the array and the pointer to the string literal hello.
Line 9: we display the contents of the array and the string pointed to by the pointer
Line 11: we set the pointer to a new string
Line 12: we try to set the array to a new string

This code failed at compilation time at line 12! The reason is we cannot assign to an array,
only modify its contents. An array is not a reference to a memory block, it is a named
memory block. Line 11 compiled successfully: a pointer can be modified.
An array is not a pointer.

o Difference two: pointers and arrays are different sizes:
$ cat array_vs_pointer3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a_msg[100];
char *p_msg = malloc(100);

if ( p_msg == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return EXIT_FAILURE;
}

printf("sizeof a_msg=%zu and sizeof p_msg=%zu\n", sizeof a_msg, sizeof p_msg);

free(p_msg);
return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer3 -std=c99 -pedantic array_vs_pointer3.c
$ ./array_vs_pointer3
sizeof a_msg=100 and sizeof p_msg=4

In our example, our array is 100 bytes (100 elements of type char) and our pointer is 4
bytes (the size of a pointer depends on the system). The size of the array comprises all the
elements of the array.

Now, let us list their similarities:
o Case one: both can use the operator [] to access elements
$ cat array_vs_pointer4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *p="hello";
char a[]="hello";

printf("Second char in array=%c\n", a[1]);
printf("Second char in string pointed to by pointer=%c\n", p[1]);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer4 -std=c99 -pedantic array_vs_pointer4.c
$ ./array_vs_pointer4
Second char in array=e
Second char in string pointed to by pointer=e

The compiler converts the array notation X[i] to the pointer notation *(X+i).

o Case two: both can use the dereference operator * to access elements
$ cat array_vs_pointer5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *p="hello";
char a[]="hello";

printf("Fifth char in array=%c\n", *(a+4));
printf("Fifth char in string pointed to by pointer=%c\n", *(p+4));

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer5 -std=c99 -pedantic array_vs_pointer5.c
$ ./array_vs_pointer5
Fifth char in array=o
Fifth char in string pointed to by pointer=o


o Case three: the address of the first element is also the address of the memory area
holding the elements
$ cat array_vs_pointer6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *p="hello";
char a[]="hello";

printf("ARRAY: addr a=%p, &a=%p, addr first element=%p\n", (void *)a, (void *)&a, (void *)&a[0]);
printf("POINTER: addr p=%p, addr first element=%p\n", (void *)p, (void *)&p[0]);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer6 -std=c99 -pedantic array_vs_pointer6.c
$ ./array_vs_pointer6
ARRAY: addr a=feffea66, &a=feffea66, addr first element=feffea66
POINTER: addr p=8050d8c, addr first element=8050d8c


The C compiler converts the array name to the address of its first element in expressions,
and this address has the same value as the address of the array itself. The following
example shows it clearly:
$ cat array_vs_pointer7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a[]="hello";

printf("a=%p, and &a=%p\n", (void *)a, (void *)&a);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer7 -std=c99 -pedantic array_vs_pointer7.c
$ ./array_vs_pointer7
a=feffea6a, and &a=feffea6a

A pointer can simulate an array, but the reverse is not true. You can assign the address of
an array to a pointer and work with it as you would do with the array itself. Thus, the
pointer can modify the contents of the array as shown below:
$ cat array_vs_pointer8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char msg[]="hello";
char *p = msg;

p[0] = 'W';
p[1] = 'O';
p[2] = 'R';
p[3] = 'L';
p[4] = 'D';

printf("msg=%s\n", msg);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer8 -std=c99 -pedantic array_vs_pointer8.c
$ ./array_vs_pointer8
msg=WORLD

The statement char *p = msg assigns the address of the array msg to the pointer p. Of course,
the assignment is allowed because the array msg contains elements of type char. However,
be aware that the statement p = msg does not mean that the pointer p and the array msg are
the same: p contains a reference to the array msg but is not an array. If you use the array
msg, you access directly the memory block that holds the characters but if you use the
pointer, you do not access it directly: the computer first accesses the address in the pointer
and then the referenced memory block holding the characters. That means, conceptually,
accessing data through a pointer involves one more step than through an array, although
compilers often generate equivalent code for both. Often, programmers use the pointer p
as if it were an array and conversely. That is fine if you keep in mind the differences. Here
is another example:
$ cat array_vs_pointer9.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char msg[] = "hello"; /* contains 6 characters including \0 */
char *p = "hello"; /* points to 6 characters including \0 */

size_t len_msg = strlen( msg );
size_t len_p = strlen( p );

printf("Array msg. Nb of char preceding the null character=%zu\n", len_msg);
printf("Pointer p. Nb of char preceding the null character=%zu\n", len_p);

printf("Array msg. sizeof msg=%zu\n", sizeof msg);
printf("Pointer. sizeof p=%zu\n", sizeof p);

return EXIT_SUCCESS;
}
$ gcc -o array_vs_pointer9 -std=c99 -pedantic array_vs_pointer9.c
$ ./array_vs_pointer9
Array msg. Nb of char preceding the null character=5
Pointer p. Nb of char preceding the null character=5
Array msg. sizeof msg=6
Pointer. sizeof p=4

We can notice that since sizeof(char) always returns 1, sizeof msg returns the number of
characters in the array, including the null character. So, from now on, never consider that
an array is a pointer, though they behave similarly in some cases.

III.6 malloc(), realloc() and calloc()


As previously said, the malloc() function does not initialize the allocated memory block as
shown below:
$ cat malloc1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
int nb_elt = 3;
int *p = malloc( nb_elt * sizeof(int) );

if ( p == NULL ) { /* memory allocation failed */
printf("malloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

printf("p[0]=%d, p[1]=%d, p[2]=%d\n", p[0], p[1], p[2]);

free(p);
return EXIT_SUCCESS;
}
$ gcc -o malloc1 -std=c99 -pedantic malloc1.c
$ ./malloc1
p[0]=134615120, p[1]=0, p[2]=0

The objects in the memory space pointed to by p had indeterminate values: on your computer,
you may get different values from those of our example. Instead of setting each element to the
value of 0 yourself, you can invoke the calloc() function, which performs the same job as malloc()
and also initializes the allocated memory with zeros, as in the following
example:
$ cat calloc1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
int nb_elt = 3;
int *p = calloc( nb_elt, sizeof(int) );

if ( p == NULL ) { /* memory allocation failed */
printf("calloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

printf("p[0]=%d, p[1]=%d, p[2]=%d\n", p[0], p[1], p[2]);

free(p);
return EXIT_SUCCESS;
}
$ gcc -o calloc1 -std=c99 -pedantic calloc1.c
$ ./calloc1
p[0]=0, p[1]=0, p[2]=0

The prototype of the function calloc() is given below:
void *calloc(size_t nb_elt, size_t obj_size);

Where nb_elt is the number of items whose size is obj_size. The calloc() function allocates a
memory space having the size nb_elt*obj_size, sets all its bits to 0, and returns
a pointer to the allocated memory area. If the function cannot allocate memory, a null
pointer is returned.

Assume we allocated ten bytes for our pointer p with malloc() or calloc() and then wished
to grow the memory block so that it could store more objects. How could we do it? The malloc()
function alone cannot help us: if we call it again, it just allocates a new, bigger
piece of memory, and if we assign it to p we lose our data. So, we could call the malloc() function to
allocate a bigger memory space, then copy our data into it, and free the original memory
space. This works but it is time consuming: the best solution is to invoke realloc().
The realloc() function allocates a bigger memory area and copies data if required: if it can
just enlarge the existing memory area, it keeps the original address, but if it cannot, it
allocates a new area, copies the objects from the old memory space into the new one, and
releases the old memory space. The function returns a pointer to the new memory area.

Generally, the realloc() function is used to reallocate more space in order to store additional
objects but it can also be used to release memory by requesting a smaller memory space.
Even in this case, it works in the same way: it returns a pointer to a memory block having
the requested size, and frees the old memory space.

If realloc() cannot allocate a memory space having the requested size, it returns a null
pointer, leaving the original pointer untouched. The prototype of the function looks like
this:
void *realloc(void *p_orig, size_t s);

If the pointer p_orig is a null pointer, the function is equivalent to malloc(). That is, if s is a
size in bytes, realloc(NULL, s) and malloc(s) have the same behavior. If the function cannot
allocate memory, it returns a null pointer, leaving the memory area pointed to by p_orig
unchanged. Otherwise, it allocates a memory space having the size s, copies data pointed
to by p_orig into it if needed, releases the memory space pointed to by the pointer p_orig, and
returns a pointer to the new memory block. Of course, the passed pointer p_orig must have
been previously allocated by malloc(), calloc() or realloc().

The following example is not correct (find out the reason). It is supposed to grow the
memory block pointed to by p by adding room for ten elements of type int:
$ cat realloc1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
int nb_elt = 2;
int nb_elt_new = 12;

int *p = calloc( nb_elt, sizeof(int) );

if ( p == NULL ) { /* memory allocation failed */
printf("calloc() cannot allocate memory\n");
return (EXIT_FAILURE);
}

p[0] = 10;
p[1] = 20;

printf("p[0]=%d, p[1]=%d\n", p[0], p[1]);

p = realloc( p, nb_elt_new * sizeof(int) );
p[2] = 30;
p[3] = 40;

printf("\nAfter realloc():\n");
printf("p[0]=%d, p[1]=%d\n",p[0], p[1]);
printf("p[2]=%d, p[3]=%d \n",p[2], p[3]);

free(p);
return EXIT_SUCCESS;
}
$ gcc -o realloc1 -std=c99 -pedantic realloc1.c
$ ./realloc1
p[0]=10, p[1]=20

After realloc():
p[0]=10, p[1]=20
p[2]=30, p[3]=40

The example realloc1.c shows how to call the realloc() function but contains a programming
error. The example works as long as the realloc() function can allocate memory: what
happens if realloc() cannot allocate a bigger memory block? In this case, the realloc()
function returns a null pointer, which is assigned to the pointer p, and does not release the
initial memory block. This means the initial memory block still exists but is no longer
accessible (a memory leak) since the pointer p now holds a null pointer. The subsequent
assignments p[2] = 30 and p[3] = 40 would then dereference a null pointer.

Here is a better version of the previous example:
$ cat realloc2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    int nb_elt = 2;
    int nb_elt_new = 12;
    int *p = calloc( nb_elt, sizeof(int) ); /* initial allocation */
    int *new_p;

    if ( p == NULL ) { /* memory allocation failed */
        printf("calloc() cannot allocate memory\n");
        return (EXIT_FAILURE);
    }

    p[0] = 10;
    p[1] = 20;

    /* %p expects a pointer to void, hence the cast */
    printf("Original address=%p\n", (void *)p);
    printf("p[0]=%d, p[1]=%d\n", p[0], p[1]);

    /* grow the original allocated memory block pointed to by p */
    new_p = realloc( p, nb_elt_new * sizeof(int) );

    if ( new_p == NULL ) {
        /* memory allocation failed
           We cannot grow our dynamic array
        */
        printf("realloc() cannot allocate memory\n");
        printf("However the pointer p is still valid and contains:\n");
        printf("p[0]=%d, p[1]=%d\n", p[0], p[1]);

        free(p);
        return (EXIT_FAILURE);
    } else {
        /* Memory successfully allocated. The dynamic array has been grown.
           The new memory area is pointed to by new_p.
           The old value of p is no longer valid.
        */

        /* since new_p is valid, we can make the assignment.
           Pointer new_p becomes useless */
        p = new_p;
    }

    p[2] = 30;
    p[3] = 40;

    printf("\nAfter realloc():\n");
    printf("new address=%p\n", (void *)p);
    printf("p[0]=%d, p[1]=%d\n", p[0], p[1]);
    printf("p[2]=%d, p[3]=%d \n", p[2], p[3]);

    free(p);

    return (EXIT_SUCCESS);
}
$ gcc -o realloc2 -std=c99 -pedantic realloc2.c
$ ./realloc2
Original address=8061268
p[0]=10, p[1]=20

After realloc():
new address=8061C68
p[0]=10, p[1]=20
p[2]=30, p[3]=40

In this code, even if the realloc() function returns a null pointer (statement if ( new_p == NULL
)), we do not lose the reference to the original memory block pointed to by p. Conversely,
if realloc() returns a valid pointer (else statement), we assign it to p so that both new_p and p
point to the new block. This ensures that p always holds a valid pointer that can be used.

The following example shrinks the original allocated memory area:
$ cat realloc3.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    int nb_elt = 12;
    int nb_elt_new = 2;
    int *p = calloc( nb_elt, sizeof(int) ); /* initial allocation */
    int *new_p;

    if ( p == NULL ) { /* memory allocation failed */
        printf("calloc() cannot allocate memory\n");
        return (EXIT_FAILURE);
    }

    p[0] = 10;
    p[1] = 20;
    p[2] = 30;
    p[3] = 40;

    printf("Original address=%p\n", (void *)p);
    printf("p[0]=%d, p[1]=%d p[2]=%d\n", p[0], p[1], p[2]);

    new_p = realloc( p, nb_elt_new * sizeof(int) ); /* shrink to 2 elements */

    if ( new_p == NULL ) { /* memory allocation failed
                              We cannot shrink our dynamic array
                           */
        printf("realloc() cannot allocate memory\n");
        printf("However the pointer p is still valid and contains:\n");
        printf("p[0]=%d, p[1]=%d p[2]=%d\n", p[0], p[1], p[2]);
        free(p);

        return (EXIT_FAILURE);
    } else { /* Memory successfully allocated */
        /*
           Memory area has been shrunk.
           It can now hold only nb_elt_new elements
        */

        /* since new_p is valid, the old value of p is no longer valid.
           After assignment, p points to the newly allocated memory area */
        p = new_p;
    }

    printf("\nAfter realloc()\n");
    printf("New address=%p\n", (void *)p);
    printf("p[0]=%d, p[1]=%d\n", p[0], p[1]);

    free(p);
    return (EXIT_SUCCESS);
}
$ gcc -o realloc3 -std=c99 -pedantic realloc3.c

$ ./realloc3
Original address=8061268
p[0]=10, p[1]=20 p[2]=30

After realloc()
New address=8061338
p[0]=10, p[1]=20

In the example above, the two addresses show that the realloc() function did not keep the
original memory block: it allocated a new one, copied the piece of memory of size
nb_elt_new * sizeof(int) into it, and freed the old memory block. This implies that the old
value of the pointer p became invalid after the invocation of realloc().

III.7 Emulating multidimensional arrays with pointers


We talked earlier about arrays of arrays but we did not explain how to emulate them with
pointers:
o A simple array holding elements of type obj_type is declared as obj_type arr[n]. A
one-dimensional dynamic-length array can be implemented by a pointer declared as obj_type *p.
o A two-dimensional array holding elements of type obj_type is declared as obj_type arr[n][p]. A
two-dimensional dynamic-length array can be implemented by a pointer declared as
obj_type **p.
o A three-dimensional array holding elements of type obj_type is declared as obj_type arr[n][p]
[q]. A three-dimensional dynamic-length array can be implemented by a pointer declared
as obj_type ***p.
o And so on.

Figure III15 Pointer to pointer to int: int **p


The following example shows how to work with a pointer to pointer emulating a dynamic
two-dimensional array (see Figure III15):
$ cat pointer2pointers1.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    /*
     - p is a pointer to pointer to int: p references an object of type int *
     - *p is a pointer to int: it has type int *
     - **p has type int
    */
    int **p = calloc( 2, sizeof *p );

    /* p[i] is a pointer to 3 elements of type int */
    p[0] = calloc( 3, sizeof **p );
    p[1] = calloc( 3, sizeof **p );

    p[0][0] = 1; p[0][1] = 2; p[0][2] = 3;
    p[1][0] = 11; p[1][1] = 12; p[1][2] = 13;

    /* %p expects a pointer to void, hence the casts */
    printf("p=%p p[0]=%p p[1]=%p\n", (void *)p, (void *)p[0], (void *)p[1]);

    free(p[0]); free(p[1]);
    free(p);
    return (EXIT_SUCCESS);
}
$ gcc -o pointer2pointers1 -std=c99 -pedantic pointer2pointers1.c
$ ./pointer2pointers1
p=8061088 p[0]=8061490 p[1]=80614a8

You can do the same with an array:


$ cat pointer2pointers2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    int p[2][3];

    p[0][0] = 1; p[0][1] = 2; p[0][2] = 3;
    p[1][0] = 11; p[1][1] = 12; p[1][2] = 13;

    printf("p=%p p[0]=%p p[1]=%p\n", (void *)p, (void *)p[0], (void *)p[1]);
    return (EXIT_SUCCESS);
}

Here are some interesting comments on the example pointer2pointers1.c. The first one is
about the invocation of calloc() (or malloc()):
o The statement int **p = calloc(2, sizeof(int *)) can also be written int **p = calloc(2, sizeof *p)30.
The compiler automatically translates sizeof *p to sizeof (int *).

Do not be confused by the notations: the statement means we allocate memory that will
be able to hold two pointers to int. Once allotted, the pointer p will point to the first object
of the memory area (a pointer to int). That is, p is a pointer to type int *: p[0] denotes the
first element and p[1] the second element. Both p[0] and p[1] point to type int. Since p[0]
and p[1] are also pointers, we have to allocate memory for them as well.
o The statement calloc(3, sizeof(int)) can also be written calloc(3, sizeof **p) [29]. The
compiler will automatically convert sizeof **p to sizeof(int).

Remember that if p_obj is a pointer to a memory area holding nb objects of type obj_type,
declared as obj_type *p_obj, you allocate memory for it as follows:
o malloc( nb * sizeof(obj_type) ) or calloc( nb, sizeof(obj_type) )
o malloc( nb * sizeof *p_obj ) or calloc( nb, sizeof *p_obj)

Remember that the operand of the sizeof operator is either a type or an object (more generally
an expression). In pointer2pointers1.c, p points to the object *p of type int *, and *p points to
the object **p of type int.

The second note is that you must not forget to allocate memory for both the first indirection
p and the second indirection *p. The first indirection p references a memory location that
stores two pointers, each of which (second indirection) has to be initialized as well with
malloc() or calloc().

You can use a pointer to pointer to store a list of dynamic strings as below (Figure III16):
$ cat pointer2pointers3.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    int nb = 3;
    /* str holds 3 strings */
    char **str = calloc( nb, sizeof *str );

    str[0] = calloc( 10, sizeof **str );
    str[1] = calloc( 10, sizeof **str );
    str[2] = calloc( 10, sizeof **str );

    strcpy(str[0], "string 1");
    strcpy(str[1], "string 2");
    strcpy(str[2], "string 3");

    printf("str[0]=%s, str[1]=%s and str[2]=%s\n", str[0], str[1], str[2] );

    free(str[0]); free(str[1]); free(str[2]);
    free(str);
    return (EXIT_SUCCESS);
}
$ gcc -o pointer2pointers3 -std=c99 -pedantic pointer2pointers3.c
$ ./pointer2pointers3
str[0]=string 1, str[1]=string 2 and str[2]=string 3

Figure III16 Pointer to pointer to strings

As explained earlier, the compiler converts p[i] to *(p+i) whether p is an array or a pointer. OK, it is
easy to catch but how do you think p[i][j] and p[i][j][k] are translated by the compiler? According to the same rule:
p[i][j] is converted to *( *(p+i) + j ). If we write q = p[i] = *(p+i), then p[i][j] = q[j] = *(q+j) = *(*(p+i)+j). Likewise,
p[i][j][k] is converted to *( *( *(p+i) + j ) + k).

III.8 Array of pointers, pointer to array and pointer to pointer

Figure III17 Representation of char arr[2][3]


We have learned that, in C, a multidimensional array is in fact an array of arrays. For example,
the array arr[3][10] is an array of 3 arrays of 10 characters. The main constraint on arrays is
that we cannot resize them, which leads programmers to resort to pointers. Suppose we need to
store strings composed of 64 characters at most (terminating null character included). If the
maximum number of strings is known, say 100, we could use the array arr[100][64] (see
Figure III17). Thus, each array arr[i] holds a string having not more than 64 characters.

Suppose now we have to deal with bigger strings whose length is unknown. In this case,
we have to use pointers. The object we need to store our strings can be viewed as a 100 x n
table: 100 rows and n columns. We can express it as an array of variable-length strings or
symbolically (this is our own notation for easing the understanding) by arr[100][?]. We
could read it as an array of 100 pointers (see Figure III20). In C, we would declare it as
char *arr[100].

Suppose now the string size is not more than 64 characters and the maximum number of
strings to store is unknown. Here again, we have to use pointers. The object we need to store
our strings can be viewed as an n x 64 table: n rows and 64 columns. Using our educational
notation, we can express it symbolically as arr[?][64] where ? means dynamic-length in our
own notation. We can read it as arr is a pointer to array[64] or a pointer to array of 64 char
(see Figure III19). In C, we would declare it as char (*arr)[64]. Why use parentheses
around the pointer? Because arrays have precedence over pointers ([] has precedence over
*). If you remove the parentheses, *arr[64] means array of 64 pointers.

The last possibility is that the length of the strings and the maximum number of strings to
store are both unknown: the pointer **arr can be used for such a case (see Figure III18).

Figure III18 Representation of char **arr

Figure III19 Representation of char (*arr)[3]

Figure III20 Representation of char *arr[2]


In summary, a 3x10 array can be represented by arr[3][10], *arr[3], (*arr)[10] or **arr.
Similarly, a 2x3x4 array can be represented by arr[2][3][4], (*arr)[3][4], (*arr[2])[4], *arr[2][3],
(**arr)[4], *(*arr)[3], **arr[2] or ***arr. You have noticed that combining arrays with pointers
makes things trickier. Further explanations are required to understand how to read
declarations involving arrays and pointers.

First, we have to talk about precedence of arrays and pointers in declarations. An array has
precedence over a pointer. To increase the precedence of the pointer operator, you have to
enclose it between parentheses. For example, *arr[2] is an array of two pointers. In contrast,
(*arr)[2] means arr is a pointer to an array of 2 objects. Another example: (*arr[2])[4] is an
array of 2 pointers to an array of 4 items.

The array symbol [] is always on the right side of the name and the pointer symbol * is always
on the left side. Therefore, the successive symbols [] are read from left to right (the first [] to
read is the leftmost) and the successive symbols * are read from right to left (the first * to read
is the rightmost)! Here is an informal method for deciphering declarations involving
pointers and arrays:
a. Locate the object name. Read "name is".
b. Read the next enclosing parentheses (starting with the innermost up to the outermost
parentheses) and apply steps c and d. If there is no parenthesis, go to the next step (step
c).
c. Read the next [] on the right side. Read "array of".
d. Then read the next * on the left side. Read "pointer to".
e. Go to step b until you finish reading the declaration.
f. You finish the process by reading the leftmost type.

Let us apply the method to some declarations listed in Table III1.

Table III1 Declarations mixing arrays and pointers


Conversely, how to declare a pointer to array of 3 pointers to char? We apply the reverse
method taking care to enclose pointers between parentheses. Here is an example. A pointer
to an array of 3 pointers to char
o A pointer to: (*arr)
o array of 3: (*arr)[3]
o pointers to: *(*arr)[3]
o char: char *(*arr)[3]

Another example: arr is an array of 2 arrays of 3 pointers to char. Here are the steps
dissected:
o arr is an array of 2: arr[2]
o arrays of 3: arr[2][3]
o pointers to: *arr[2][3]
o char: char *arr[2][3]

The last example, arr is an array of 2 pointers to an array of 4 char:
o arr is an array of 2: arr[2]
o pointers to : (*arr[2])
o an array of 4: (*arr[2])[4]
o char: char (*arr[2])[4]

Now that we know how to read declarations involving arrays and pointers, we can easily
find out how to declare dynamic multidimensional arrays by using pointers. Let us
consider a program that stores items in the array arr[2][3][4]. If the maximum number of
items to be stored in it is known and unchanged over time, we can choose an array. Now,
imagine that the first dimension varies over time because our needs have changed. The
best way to proceed is to use a pointer representing the first dimension. To ease our
discussion, let us adopt the following notation: we write ? for a varying dimension that
will be denoted by a pointer. In our example, according to our convention, arr[?][3][4] is an
array whose first dimension may be resized over time. Such an object is a varying-length
sequence of arrays of 3 arrays of 4 items. The variable dimension can be implemented
as a pointer. Therefore, our variable array arr can be represented by a pointer to array of 3
arrays of 4:
o arr is a pointer to: (*arr)
o array of 3: (*arr)[3]
o array of 4: (*arr)[3][4]

Table III2 shows the different ways to implement the array arr[2][3][4] depending on the
dimension you wish to be dynamic (changeable at run time).

Table III2 Examples of implementation of a dynamic three-dimensional array


In the following example, we declare the object p as int (*p)[3] (pointer to array of 3 ints)
and we allocate a memory area that can hold two arrays of 3 ints (see Figure III21):
$ cat pointer2array1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int (*p)[3]; /* pointer to array[3] */

    p = malloc( 2 * sizeof *p ); /* allocate memory for 2 arrays of 3 ints */

    p[0][0] = 0; p[0][1] = 1; p[0][2] = 2;    /* first array in p[0]: 3 items */
    p[1][0] = 10; p[1][1] = 11; p[1][2] = 12; /* second array in p[1]: 3 items */
    printf("int (*p)[3]:\n");

    /* sizeof yields a size_t, printed with %zu */
    printf("sizeof p=%zu (pointer)\n", sizeof p);
    printf(" sizeof p[0]=%zu (=sizeof(int)*%d)\n", sizeof p[0], 3);
    printf(" sizeof p[0][0]=%zu (=sizeof(int))\n", sizeof p[0][0]);

    /* *(*p + 1) is p[0][1]: the +1 must be applied before the dereference */
    printf("\nFirst array: first item=%d second item=%d\n", *(*p), *(*p + 1));
    printf("First array: first item=%d second item=%d\n", p[0][0], p[0][1]);

    printf("\nSecond array: first item=%d second item=%d\n", *(*(p+1)), *(*(p+1) + 1));
    printf("Second array: first item=%d second item=%d\n", p[1][0], p[1][1]);

    free(p);
    return EXIT_SUCCESS;
}
$ gcc -o pointer2array1 -std=c99 -pedantic pointer2array1.c
$ ./pointer2array1
int (*p)[3]:
sizeof p=4 (pointer)
sizeof p[0]=12 (=sizeof(int)*3)
sizeof p[0][0]=4 (=sizeof(int))

First array: first item=0 second item=1
First array: first item=0 second item=1

Second array: first item=10 second item=11
Second array: first item=10 second item=11

Figure III21 Pointer to array and pointer to int


Have a look at Figure III21. The pointer p1 points to an int. It is initialized by an array of
ints. However, p1 is not a pointer to an array. Why? Because p1 = s is equivalent to p1 = &s[0].
That is, p1 does not point to an array but to s[0] that is an object of type int (the first element
of the array s).

In the following example, we declare an array of three pointers:

$ cat pointer2array2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p[3]; /* array of 3 pointers to int */
    int i;

    i = 0; /* p[0] is the first pointer */
    p[i] = malloc( 2 * sizeof (*p[0]) ); /* can hold 2 ints */
    p[i][0] = i*10; p[i][1] = i*10+1;

    i = 1; /* second pointer */
    p[i] = malloc( 2 * sizeof (*p[0]) ); /* can hold 2 ints */
    p[i][0] = i*10; p[i][1] = i*10+1;

    i = 2; /* third pointer */
    p[i] = malloc( 2 * sizeof (*p[0]) ); /* can hold 2 ints */
    p[i][0] = i*10; p[i][1] = i*10+1;

    printf("int *p[3]: p contains 3 pointers:\n");
    i = 0;
    printf("pointer %d: first item=%d second item=%d\n", i, p[i][0], p[i][1]);

    i = 1;
    printf("pointer %d: first item=%d second item=%d\n", i, p[i][0], p[i][1]);

    i = 2;
    printf("pointer %d: first item=%d second item=%d\n", i, p[i][0], p[i][1]);

    free(p[0]); free(p[1]); free(p[2]);

    return EXIT_SUCCESS;
}
$ gcc -o pointer2array2 -std=c99 -pedantic pointer2array2.c
$ ./pointer2array2
int *p[3]: p contains 3 pointers:
pointer 0: first item=0 second item=1
pointer 1: first item=10 second item=11
pointer 2: first item=20 second item=21

In order to keep the examples pointer2array1.c and pointer2array2.c easy to follow, we did not
test the pointer returned by malloc(). The program can be simplified with the for loop studied
in Chapter V:
$ cat pointer2array2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p[3]; /* array of 3 pointers to int */
    int i;

    for (i=0; i < 3; i++) {
        p[i] = malloc( 2 * sizeof (*p[0]) ); /* can hold 2 ints */
        p[i][0] = i*10; p[i][1] = i*10+1;
    }

    printf("int *p[3]: p contains 3 pointers:\n");
    for (i=0; i < 3; i++)
        printf("pointer %d: first item=%d second item=%d\n", i, p[i][0], p[i][1]);

    for (i=0; i < 3; i++)
        free(p[i]);

    return EXIT_SUCCESS;
}


We learned that if s1 is an array, in the expression p = s1, the array is converted to a pointer to
its first element. How is the array s2 declared as int s2[10][5] converted? The C language is
coherent: such an array is also converted to a pointer to its first element, that is, &s2[0].

Now, consider the statement p = s2. Can you guess the declaration of the pointer p? The
element s2[0] (the first element) being an array of 5 int, &s2[0] is a pointer to an array of 5 int.
Consequently, our pointer would be declared as int (*p)[5].

III.9 Variable-length arrays and variably modified types


So far, we have learned that the size of an array must be known at compile time. To be
able to work with an array whose size is unknown at compile time, we have to use a
pointer. In the following example, we store the strings passed to the program in a memory
area, allocated by malloc(), pointed to by the pointer list_string:
$ cat vla1.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRING_LEN 255

int main(int argc, char **argv) {
    /* pointer to array of MAX_STRING_LEN characters */
    char (*list_string)[MAX_STRING_LEN];
    size_t i;
    size_t list_string_len;

    if (argc < 2) {
        printf("USAGE: %s string1 string2\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* number of strings */
    list_string_len = argc - 1;

    list_string = malloc(list_string_len * sizeof *list_string);
    if (list_string == NULL)
        return EXIT_FAILURE;

    /* copy strings (each argument is assumed to be shorter than MAX_STRING_LEN) */
    for (i=0; i < list_string_len; i++)
        /* argv[0]: program name. argv[1]: first string */
        strcpy(list_string[i], argv[i+1]);

    /* display strings, numbered from 1 */
    for (i=0; i < list_string_len; i++)
        printf("String %zu: %s\n", i + 1, list_string[i]);

    free(list_string);

    return EXIT_SUCCESS;
}
$ gcc -o vla1 -std=c99 -pedantic vla1.c
$ ./vla1 hello "how are you?"
String 1: hello
String 2: how are you?

The C99 standard introduced a new type of array called variable-length array or VLA for
short. It differs from the fixed-size arrays we studied in that its length is known at run time
only. The length of a VLA does not have to be a constant expression (see Chapter IV
Section IV.14) but an expression that evaluates to a positive integer (known at run time). A
VLA works as a fixed-size array and is declared in the same way. The previous example
can be written using a VLA:
$ cat vla2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRING_LEN 255

int main(int argc, char **argv) {
    if (argc < 2) {
        printf("USAGE: %s string1 string2\n", argv[0]);
        return EXIT_FAILURE;
    }

    size_t list_string_len = argc - 1;
    char list_string[list_string_len][MAX_STRING_LEN];
    size_t i;

    /* copy strings */
    for (i=0; i < list_string_len; i++)
        /* argv[0]: program name. argv[1]: first string */
        strcpy(list_string[i], argv[i+1]);

    /* display strings */
    for (i=0; i < list_string_len; i++)
        printf("String %zu: %s\n", i, list_string[i]);

    return EXIT_SUCCESS;
}
$ ./vla2 hello "how are you?"
String 0: hello
String 1: how are you?

However, the size of a VLA does not vary over time. Once the value of its length is
known, the VLA keeps the same size during its lifetime: unlike a memory block referenced
by a pointer, it cannot be resized.

In the following example, we declare a VLA whose size is an expression (composed of a
variable) evaluating to a positive integer:

$ cat vla3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int array_size = 5;
int age[ array_size ];

return EXIT_SUCCESS;
}

The size of a VLA may be known only at run time, as in the following example:
$ cat vla4.c
#include <stdio.h>
#include <stdlib.h>

int main(int c, char **argv) {
    if (c < 2)               /* the array size must be given on the command line */
        return EXIT_FAILURE;

    int array_size = atoi(argv[1]);

    if (array_size <= 0)     /* the length of a VLA must be positive */
        return EXIT_FAILURE;

    int age[ array_size ];

    printf("Array size is %d\n", array_size);
    return EXIT_SUCCESS;
}
$ gcc -o vla4 -std=c99 -pedantic vla4.c
$ ./vla4 10
Array size is 10

Such an array is called a variable-length array. We will not fully describe this example now.
Briefly:
o The atoi() function converts a string containing digits into a number. For example, if
the string is "123", atoi() turns it into the number 123.
o The parameter c of the main() function holds the number of arguments in the command
line when you have launched the program. Here, c holds 2 because the command line is
composed of the name of the program and the argument 10.
o The second parameter argv of the main() function holds the name of the program, and its
arguments. Here, the program name vla4 is stored in argv[0] and the argument 10 is held
in argv[1].
o The statement int array_size = atoi(argv[1]) stores the value you have passed to the program
into the variable array_size that will be then used as the size of the array age.

We have not talked about the initialization of a VLA: since the size of a VLA is not known
at compile time, you cannot initialize it in its declaration as you would a fixed-size array.



A type deriving from (i.e. constructed from) a VLA is known as a variably modified type
(VM type). For example, the pointer p below points to a VLA of n elements and therefore
has a VM type:
int n = 10;
long long (*p)[n];

VLAs and objects having VM types are subject to some constraints described in Chapter
VII Section VII.17.

III.10 Creating types from array and pointer types


Array and pointer types are constructed from other types: they are known as derived types.
Now, we suggest creating new types derived from arrays and pointers. The typedef keyword
allows building new type names from existing types. The typedef keyword is used as if you
declare an object. Let us find out how it works through examples:
o Define the myInteger type as the long type:
typedef long myInteger;


o Create the string10 type as an array of 10 chars:
typedef char string10[10];


For example:
$ cat typedef_ptr_array1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    typedef char string10[10];
    string10 arr;
    printf("Array size is %zu\n", sizeof arr); /* sizeof yields a size_t */
    return EXIT_SUCCESS;
}
$ gcc -o typedef_ptr_array1 -std=c99 -pedantic typedef_ptr_array1.c
$ ./typedef_ptr_array1
Array size is 10


o Create the ptr_double type as a pointer to double:

typedef double *ptr_double;


$ cat typedef_ptr_array2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    double f = 10.2;
    typedef double *ptr_double;

    ptr_double ptr_dbl = &f;
    printf("%f\n", *ptr_dbl);
    return EXIT_SUCCESS;
}
$ gcc -o typedef_ptr_array2 -std=c99 -pedantic typedef_ptr_array2.c
$ ./typedef_ptr_array2
10.200000


o Create array3D_10x20x30 type as an array of 10 arrays of 20 arrays of 30 chars:
typedef char array3D_10x20x30[10][20][30];
$ cat typedef_ptr_array3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    typedef char array3D_10x20x30[10][20][30];
    array3D_10x20x30 arr;

    printf("%zu\n", sizeof arr); /* sizeof yields a size_t */
    return EXIT_SUCCESS;
}
$ gcc -o typedef_ptr_array3 -std=c99 -pedantic typedef_ptr_array3.c
$ ./typedef_ptr_array3
6000


o Create the ptr_arr type as a pointer to array of 3 float and the type arr3 as an array of 3
float:
typedef float (*ptr_arr)[3];
typedef float arr3[3];

$ cat typedef_ptr_array4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    typedef float (*ptr_arr)[3];
    typedef float arr3[3];

    arr3 s[2] = { {1.1f, 1.2f, 1.3f}, {2.1f, 2.2f, 2.3f} };
    ptr_arr p_arr = s;

    printf("%f %f\n", p_arr[0][0], p_arr[1][2]);
    return EXIT_SUCCESS;
}
$ gcc -o typedef_ptr_array4 -std=c99 -pedantic typedef_ptr_array4.c
$ ./typedef_ptr_array4
1.100000 2.300000

III.11 Qualified pointer types


The C standards, until C95, specified two type qualifiers: const and volatile. C99 added a
new one known as restrict. An object declared without a type qualifier has an unqualified
type. If declared with a type qualifier, its type is qualified. For example, float is an
unqualified type while const float is a qualified type (const-qualified type). Qualifiers change
neither the representation of the type nor its alignment.

There can be several qualifiers, in any order, in a declaration. The types const volatile int,
volatile const int, const int volatile represent the same type. Keep in mind, a qualified type is
different from the corresponding unqualified type: they represent different types even
though they have the same representation and alignment.

The qualifier applies to a type. It can be placed after or before the type it qualifies but
to qualify a pointer itself, it must be placed after the asterisk *. For example, the pointer
type char * const is qualified: a pointer of that type is made read-only. Compare the
following declarations:
o char * const p declares p as a read-only pointer. The pointer p has a const-qualified type.
o char const * p declares p as a pointer to an object of type const char. The pointer p has an
unqualified type while the object it points to has a const-qualified type.
o const char * p is identical to the previous declaration.


In summary, a pointer type does not inherit the qualifiers of the types from which it is
built. That is, the pointer type char const * derives from the qualified type char const but is not
qualified itself.

III.12 Compatible types


In Chapter II Section II.10, we said two types are compatible if they are the same. Two
qualified types are compatible if they are qualified versions of compatible types and have
the same qualifiers, whatever their order. Thus, const float and float are not compatible while
const volatile int and volatile const int are compatible.

Two array types are compatible if they have the same size and their elements have compatible
types. Two pointer types are compatible if they have the same type qualifiers and they
point to compatible types. The following pointer types are compatible:
o short int * and short *
o unsigned * and unsigned int *
o int *const and signed int *const
o const long *const and signed long const *const

The following pointer types are not compatible:
o short int * and const short int *
o unsigned * and unsigned *const

III.13 Data alignment


We learned that depending on the data type, the amount of storage allocated is a byte or a
group of bytes. For example, an object of type int may be stored in 4 bytes. The group of
bytes is located at a certain address in memory. The issue is that most computers (even
computers with byte-addressable memory [30]) require each data type to be
placed at certain addresses: this is known as data alignment. That is, not all addresses can
be used to place any piece of data. The constraints vary from processor to processor. The
allowed addresses are multiples of some specific sizes. In older computers, data had to be
placed at addresses that were a multiple of a word size (varying with the processor
architecture). On modern computers, pieces of data have to be put at addresses that are
multiples of their type size (known as natural alignment). For example, if a short is 16-bit
wide, an integer of that type will be placed at an address multiple of 16 bits (2 bytes): it is
aligned on 16-bit boundaries. If an int has a size of 32 bits, an integer of that type will be
placed at an address multiple of 32 bits (4 bytes): it is aligned on 32-bit boundaries.
Fortunately, you generally do not have to worry about data alignment since the compiler
will do the job. On modern computers (whose memory is byte-addressable), an object fitting
in a byte can be put at any address.

However, when dealing with object pointers [31] (pointers to objects or, to put it another
way, pointers to data) and performing conversions between pointers (described in Chapter III
Section III.14), you have to care about data alignment constraints. In C, you can convert,
through an explicit cast, any data pointer to any other data pointer type, which can lead
to misalignment. Not all processors can handle misalignment. To highlight the problem,
let us consider two kinds of processors: SPARC and Intel. The following example
works on an Intel-based computer:
$ cat pointer_align1.c
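#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char s[5] = { 0,0,0,0,0 };
    int *p = (int *)&s[0];

    printf("sizeof int=%zu\n", sizeof(int));
    /* addresses displayed as decimal integers: we assume here that
       an unsigned long is wide enough to hold an address */
    printf("p=%lu s=%lu\n", (unsigned long)p, (unsigned long)s);
    printf("*p=%d\n", *p);
    return EXIT_SUCCESS;
}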
$ gcc -o pointer_align1 -std=c99 -pedantic pointer_align1.c;
$ ./pointer_align1
sizeof int=4
p=2147482768 s=2147482768
*p=0

Both Intel and SPARC processors require a 32-bit int to be aligned on 32-bit
boundaries but SPARC processors cannot handle data misalignment while Intel
processors can. If the program pointer_align1.c is executed on SPARC systems, it may
crash or work depending on the address of s[0]. To show it clearly, consider the following
example:
$ cat pointer_align2.c
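#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char s[5] = { 0,0,0,0,0 };

    int *p = (int *)&s[0];
    int *q = (int *)&s[1]; /* one byte further: cannot be aligned if p is */

    /* addresses displayed as decimal integers: we assume here that
       an unsigned long is wide enough to hold an address */
    printf("p=%lu q=%lu s=%lu\n", (unsigned long)p, (unsigned long)q, (unsigned long)s);
    printf("*p=%d\n", *p);
    printf("*q=%d\n", *q);

    return EXIT_SUCCESS;
}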

On an Intel platform, it works fine though the object pointed to by pointer p may not be
strictly aligned on a 32-bit boundary:
p=4278184563 q=4278184564 s=4278184563
*p=0
*q=0

On a SPARC computer, it crashes:


p=2147482768 q=2147482769 s=2147482768
*p=0
Bus Error (core dumped)

In the above example, the object pointed to by the pointer q (whose address = 2147482769
= 536870692*4 + 1 is not a multiple of 4 bytes) was misaligned, causing the program to halt
abnormally. As long as we do not access a misaligned object, there is no problem but if we
attempt to access it, on SPARC processors, the program crashes with a Bus Error. In our
example, the object (of type 32-bit int) pointed to by the pointer p was safely accessed because
it was aligned on its natural boundary (2147482768 = 536870692*4) while the object pointed
to by q was misaligned.

There are two kinds of alignments with pointers: the alignment of the pointer itself and the
alignment of the object it points to. On most modern computers, all object pointers are
represented as integers and have the same size, so converting an object pointer to any
other data pointer type raises no issue regarding the pointer itself. However, the C standard
has no such requirement, so there might be computers that have object pointer types of
different sizes. That is, if you convert an object pointer of type P1 to type P2, and the two
pointer types have different sizes, the conversion of the pointer itself might lead to an issue
on some computers imposing data alignment constraints. In our example,
pointer_align2.c, the alignment restrictions concerned only the objects pointed to by pointers,
since all data pointers have the same representation on SPARC processors.

There is no misalignment if you assign a variable org of type T1 to a variable tgt of type T2, because
the value of the variable org is converted and then copied into the variable tgt: int tgt = org. The variables tgt and org are
automatically aligned when they are created: their addresses do not change until they are destroyed.


In the C standard, a pointer to void has the same alignment and representation as a pointer to a
character type. Pointers to qualified and unqualified compatible types have the same
representation and alignment.

III.14 Conversions
As explained in Chapter II Section II.11, in C, there are two kinds of conversions, also
known as casts: implicit conversions and explicit conversions. A conversion occurs when
the type of a value (resulting from an expression) is changed to another type. Implicit
conversions may be performed by some operators such as arithmetic operators (+, -, *, /)
and the assignment operator =, while explicit conversions are under control of the
programmer.

The implicit cast is a conversion that the compiler is allowed to do silently if it meets the
implicit conversion rules of the concerned operator. There are specific rules for implicit
and explicit conversions. When a conversion is required by an operator but the compiler
cannot perform it silently (as an implicit conversion), the compiler may print a warning
message and force the conversion according to the explicit conversion rules.

III.14.1 Pointer conversions


For pointers, two kinds of conversions (casts) may occur: implicit conversions performed
by the assignment operation and explicit conversions through the cast operator. The C
standard specifies specific rules for both of them.

If obj is an object, the explicit cast (tgt_type)obj converts obj to type tgt_type. The assignment
operation is composed of one operator = and two operands: one operand before the equals
sign and the other after:
lvalue=rvalue

Since expressions are described later, we can consider the left operand lvalue is a pointer
and the right operand rvalue is a value we want to assign to the pointer.

III.14.1.1 Conversion between pointers and integers


A pointer may be explicitly converted to an integer type but the result depends on the
implementation. A pointer may be the same size as an integer type and have the same
representation but this is not a requirement. A pointer may not be representable by an
integer type. On many computers, a pointer has the same representation as an integer type
and can then be converted to an integer type and back, keeping the original value. On our
computer, a pointer can be converted to type unsigned int as shown below:
$ cat pointer2int1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
double v = 10.2;
double *p =&v;
unsigned int u = (unsigned int)p;

printf("sizeof p=%zu sizeof unsigned int=%zu\n", sizeof p, sizeof u );
printf("p=%u u=%u\n", p, u );

return EXIT_SUCCESS;
}
$ gcc -o pointer2int1 -std=c99 -pedantic pointer2int1.c
$ ./pointer2int1
sizeof p=4 sizeof unsigned int=4
p=4278184560 u=4278184560

In some implementations allowing conversion between pointers and integers, two special
types may be defined (in stdint.h): intptr_t and uintptr_t. They are large enough to store a
pointer. If you use them, keep in mind that your program will not work on systems that do not
define them. On our computer, they are defined. Our previous example can be rewritten as:
$ cat pointer2int2.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void) {
double v = 10.2;
double *p =&v;
uintptr_t u = (uintptr_t)p;

printf("sizeof p=%zu sizeof uintptr_t=%zu\n", sizeof p, sizeof u );
printf("p=%u u=%u\n", p, u );


return EXIT_SUCCESS;
}
$ gcc -o pointer2int2 -std=c99 -pedantic pointer2int2.c
$ ./pointer2int2
sizeof p=4 sizeof uintptr_t=4
p=4278184560 u=4278184560


Conversely, if the implementation allows it, you can explicitly convert an integer to a
pointer type. Every implementation, however, permits the conversion of 0 to a pointer type.
An integer constant expression evaluating to 0, or such an expression cast to void *, is
called a null pointer constant; the macro NULL expands to one. When you convert a null
pointer constant to a pointer type, you obtain a null pointer: (char *)0, (int *)0, (double *)0
are examples of null pointers. Even if the representations of two null pointers differ, they
always compare equal: for instance, a null pointer to char compares equal to a null pointer
to float.

Apart from null pointer constants, there is no implicit conversion between pointers and integers.


III.14.1.2 Conversion between pointers and void *
Let us start with the implicit conversions performed by the simple assignment operation.
Say the left operand of the assignment operator p_left is an object pointer to type LT and the
right operand p_right is an object pointer to type RT. In an assignment operation LT *p_left =
RT *p_right, an automatic conversion occurs if the following conditions are met:
o the type RT or LT is a qualified or unqualified version of the type void
o the type that is pointed to by the left pointer p_left contains at least the qualifiers of the
type pointed to by the right pointer p_right.

Otherwise, the compiler generates a warning message unless an explicit cast is used. In the
following example, the second initialization produces a warning message:
$ cat pointer_conv_void1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const void *m;
const int *p = m; /* OK */
int *q = m; /* Line 7: missing const, generate warning. Be cautious */


return EXIT_SUCCESS;
}
$ gcc -o pointer_conv_void1 -std=c99 -pedantic pointer_conv_void1.c
pointer_conv_void1.c: In function main:
pointer_conv_void1.c:7:13: warning: initialization discards qualifiers from pointer target type

The compiler gcc complains but forces the cast. If we use the explicit cast, the warning
disappears:
$ cat pointer_conv_void2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const void *m;
const int *p = m; /* OK */
int *q = (int *)m; /* No warning.
Be cautious: do not attempt to alter
the object pointed to by q
*/

return EXIT_SUCCESS;
}
$ gcc -o pointer_conv_void2 -std=c99 -pedantic pointer_conv_void2.c

An explicit cast allows converting a pointer to a qualified or unqualified version of the
type void to any pointer type and conversely.

In the following example, the pointer to void is on left side of the assignment operator:
$ cat pointer_conv_void3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const int *m;
const void *p = m; /* OK */
void *q = m; /* Line 7: generate warning, missing const */

return EXIT_SUCCESS;
}
$ gcc -o pointer_conv_void3 -std=c99 -pedantic pointer_conv_void3.c
pointer_conv_void3.c: In function main:
pointer_conv_void3.c:7:14: warning: initialization discards qualifiers from pointer target type

We also got a warning: the implicit conversion could not be done. The compiler generated
a warning but forced the cast. An explicit cast removes the warning:
$ cat pointer_conv_void4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const int *m;
const void *p = m; /* OK */
void *q = (void *)m; /* OK. Be cautious */

return EXIT_SUCCESS;
}


If the right pointer points to an unqualified type, the implicit conversion occurs whether the
left pointer points to a qualified or unqualified type as shown below:
$ cat pointer_conv_void5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *m1;
const void *p1 = m1; /* OK */
void *q1 = m1; /* OK */

void *m2;
const int *p2 = m2; /* OK */
int *q2 = m2; /* OK */

return EXIT_SUCCESS;
}



III.14.1.3 Conversion between pointers
Let us call LTver a qualified or unqualified version of the type LT and RTver a qualified or
unqualified version of the type RT (for example, the type const int is a qualified version of
the type int). In the assignment operation LTver *p_left = RTver *p_right, an implicit conversion
occurs if the following conditions are met:
o The types LT and RT are compatible. This means that the unqualified versions of the
types of the pointed-to objects are compatible.
o The type LTver has at least the qualifiers of the type RTver. This means the type of the left
pointed-to object has at least the qualifiers of the type of the right pointed-to object.

Otherwise, the compiler produces a warning message unless an explicit cast is used. The
rule simply requires that both pointers refer to objects interpreted in the same way (same
alignment, same representation) and that the constraints enforced by qualifiers are respected.

For example:
$ cat pointer_conv_assign3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
signed int m = 17;
const signed int c = 19;
float f = 10;

const int *p2c;
int *p2m;
const int **pp2c;
int **pp2m;

p2c = &m; /* OK */
p2c = &c; /* OK */

p2m = &m; /* OK */
p2m = &c; /* Line 18. KO: const missing in left type */

p2m = &f; /* Line 20. KO: int and float not compatible */

pp2m = pp2c; /* Line 22. KO: const int * and int * not compatible */

return EXIT_SUCCESS;
}
$ gcc -o pointer_conv_assign3 -std=c99 -pedantic pointer_conv_assign3.c
pointer_conv_assign3.c: In function main:
pointer_conv_assign3.c:18:8: warning: assignment discards qualifiers from pointer target type
pointer_conv_assign3.c:20:8: warning: assignment from incompatible pointer type
pointer_conv_assign3.c:22:9: warning: assignment from incompatible pointer type

The example is quite simple and it is easy to understand why the warnings are generated,
except for the statement in line 22: pp2m = pp2c. Symbolically, we can write it like this: int **
= const int **. If int * is called LTver and const int * is called RTver, then LTver * = RTver *. Written
like this, we can deduce their unqualified versions: LT is int * and RT is const int *, which
are clearly not compatible, hence the output. You might ask why RT is const int *
and not int *. Take note that RT is a pointer to an object of type const int: the qualifier const
applies to the object pointed to by the pointer and does not qualify the pointer itself. If RTver
were int *const, we could have said its unqualified version was int *.

Now, if we apply explicit casts to the previous example, we get no warnings:
$ cat pointer_conv_assign4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
signed int m = 17;
const signed int c = 19;
float f = 10;

const int *p2c;
int *p2m;
const int **pp2c;
int **pp2m;
p2c = &m; /* OK */
p2c = &c; /* OK */

p2m = &m; /* OK */
p2m = (int *)&c; /* no warning but be cautious */

p2m = (int *)&f; /* no warning but bad idea */

pp2m = (int **)pp2c; /* no warning but be cautious */

return EXIT_SUCCESS;
}

The explicit cast rules allow converting a pointer to any pointer type. Explicit casts seem
to be the cure for warnings yielded by the compiler. Do not assume that the goal of the
compiler is to annoy you: it gives valuable information. Always check your explicit casts
carefully. Explicit casts get rid of the warnings but this does not mean there will be no
unexpected consequences. As an example, let us consider a read-only variable modified
using a pointer:
$ cat pointer_conv_assign4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
const int v =12;
int *p = (int *)&v;
*p = 20;
printf("v=%d\n", v);
return EXIT_SUCCESS;
}

This code fragment seems to be correct and may work on many computers. Yet it is not
compliant. The statement *p = 20 has undefined behavior. Modifying an object of const-qualified
type through a pointer is not portable and should be avoided (see Chapter III).
The same rule applies to the volatile qualifier.

There are always good reasons why a conversion is not done automatically; you have to
watch out for the warning messages of the compiler. The C standard lets you use explicit
casts that are less restrictive but this does not mean you can do anything. Using an explicit
cast supposes you know the consequences of what you are doing. An explicit cast lets you
convert a pointer type to any other pointer type as in the following example:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float *q;
long long *p = (long long *)q;

return EXIT_SUCCESS;
}

This kind of conversion is not portable and may even crash your program on some
systems, as described in section III.13, if you attempt to access the object pointed to by p,
because the types float and long long may not have the same alignment.

More generally, an explicit cast (TTG *)p_obj converting a pointer p_obj of type TORG * to type
TTG * may lead to misalignment. If the alignment constraints for the type TTG are stricter than
for the type TORG, there may be data misalignment causing undefined behavior. That is,
if the type TORG is aligned on mod_org boundaries and the type TTG is aligned on mod_tgt
boundaries, there may be misalignment if mod_tgt > mod_org. Conversely, if mod_tgt ≤ mod_org,
and mod_org is a multiple of mod_tgt, data will be correctly aligned and the cast is safe.

Converting any pointer type to void * or a pointer to character type and back is always safe.
The rationale is the character types (fitting in a byte) have the least strict alignment
constraints (no constraint on computers having byte-addressable memory) and the pointer
void * has the same representation and alignment as a pointer to a character type.

III.14.2 Pointer and arithmetic conversion rules


We summarize in the following two sections what we learned so far about conversions.

III.14.2.1 Explicit cast
Table III3 lists allowed explicit conversions applied on arithmetic and pointer types.

Table III3 Explicit conversions on pointer and arithmetic types

III.14.3 Assignment conversions


Table III4 lists allowed assignment conversions applied on arithmetic and pointer types.

Table III4 Assignment conversions on pointer and arithmetic types


A conversion not listed in Table III4 requires an explicit cast.

III.15 Exercises
Exercise 1. What are the differences between the types char s[10][64] and char *s[64]?

Exercise 2. Let s be an array of char (i.e. declared as char s[]). Explain why the expression
sizeof s yields the same output as strlen(s) + 1 if s contains a string.


Exercise 3. Let s be a pointer to char (i.e. declared as char *s). Explain why the expression
sizeof s does not yield the same value as strlen(s) + 1 if s contains a string.

Exercise 4. Let s be an array. Is the expression s++ valid? Explain why.

Exercise 5. The following program is wrong. Correct it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char msg[]="Hello";
char *p;

strcpy(p, msg);
return EXIT_SUCCESS;
}

Exercise 6. The following program contains an error. Correct it.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char msg[]="Hello";
int len = strlen(msg);
char *p = malloc(len);

strcpy(p, msg);
return EXIT_SUCCESS;
}


Exercise 7. In the following example, is p a pointer to an array?
int *p;
int s[10];

p=s;


Exercise 7. In the following example, p is a pointer to an array of 2 int. Why are the following
assignments not valid?
int (*p)[2];
int s1[2];
int s2[2];

p[0]=s1;
p[1]=s2;


Exercise 8. List the different ways to declare an object p emulating a 5x7 table.

Exercise 9. Explain why the following program is not correct:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
long a[2][2];
long **p;

p = a;
a[0][0] = 0;
a[0][1] = 1;
a[1][0] = 10;
a[1][1] = 11;

printf("%ld\n", p[1][0]);
return (EXIT_SUCCESS);
}


Exercise 10. How would you declare a dynamic array that can hold objects of different types?

CHAPTER IV OPERATORS

IV.1 Introduction
An operator is a symbol invoked with one or more arguments, known as operands, that
performs a specific calculation and returns a numeric value. A C operator can take one
operand (unary operator), two operands (binary operator) or three operands (ternary
operator). The number of operands is called the arity.

An operator does not work with just any operands: operands are expected to have specific types.
In this chapter, we will describe five types of operators:
o Arithmetic operators
o Relational operators
o Logical operators
o Bitwise operators
o Assignment operators

Operators can be combined to form expressions. An expression can be as simple as a
literal such as the integer literal 10, the string literal "hello", the variable msg, an assignment,
an operation or a combination of all of those. An expression is a combination of operations,
variables, literals, and function calls. Here are some examples of expressions:
o msg
o 12
o msg="hello"
o x=12
o 12+x*8/1.1
o i=atoi(argv[1])
o v=6.2*x

IV.2 Arithmetic operators


Operation    Meaning
+E1          Unary plus
-E1          Unary minus
E1 + E2      Addition operator
E1 - E2      Subtraction operator
E1 * E2      Multiplicative operator
E1 / E2      Division operator
E1 % E2      Modulo operator
Table IV1 Arithmetic operators


Arithmetic operators take operands of arithmetic types. An arithmetic type [32] is an
integer type (char, unsigned char, short, unsigned short, int, unsigned int, long), a real floating
type (float, double, long double) or a complex type (float _Complex, double _Complex, long double
_Complex).

The operands of the operators are expressions that evaluate to a numeric value. The
expressions E1 and E2 can be:
o A numeric literal such as 1 (integer literal), or 2.8 (floating literal)
o A variable of arithmetic type. For example x, where x is a numeric variable (integer, float,
double)
o An operation such as 8*x
o A combination of numeric literals, variables and operations such as 1*v+y-9.

IV.2.1 Unary plus


The unary plus denotes the positive sign of a number. It can be omitted since it has no effect on
the value to which it is applied. For example:
$ cat unary_plus.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

int j = +10;
int i = 10;

printf("i=%d and j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o unary_plus -std=c99 -pedantic unary_plus.c
$ ./unary_plus
i=10 and j=10

The general syntax of the unary plus is given below:


+E

The operand E can be a numeric literal, a variable or more generally an expression. For
example, 1+v*y is an expression composed of two operations: addition and multiplication.

Since the unary plus does nothing, it is generally omitted. It has been specified for the
consistency of the C language: since the unary minus exists (and does something), the
unary plus has been specified.

IV.2.2 Unary minus


The unary minus denotes the negative sign of a number: it negates its operand. For
example:
$ cat unary_minus1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i = -10;
int j = -i;

printf("i=%d j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o unary_minus1 -std=c99 -pedantic unary_minus1.c
$ ./unary_minus1
i=-10 j=10

The general syntax of the unary minus is given below:


-E

The operand E is an expression. The following example negates the expression 2*i
(multiplication):

$ cat unary_minus2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i = 10;
int j = -(2*i);

printf("i=%d j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o unary_minus2 -std=c99 -pedantic unary_minus2.c
$ ./unary_minus2
i=10 j=-20

IV.2.3 Addition
IV.2.3.1 Numeric operands
The addition operator denoted by the plus sign + (binary +) takes two arithmetic operands
and returns a numeric value resulting of the addition of its operands. The operands can be
integer or floating numbers. The following example adds integer values:
$ cat addition1.c
1 #include <stdio.h>
2 #include <stdlib.h>
3 int main(void) {
4 int i;
5 int j;
6
7 i = 2 + 2;
8 j = 1 + i;
9
10 printf("i=%d and j=%d\n", i, j);
11 return EXIT_SUCCESS;
12 }
$ gcc -o addition1 -std=c99 -pedantic addition1.c
$ ./addition1
i=4 and j=5

Explanation:

o Line 4: declaration of the i variable as type int.
o Line 5: declaration of the j variable as type int.
o Line 7: first, the addition 2+2 evaluates to the value 4, which is then assigned to the
variable i.
o Line 8: the variable i holds the value 4. The resulting value of the addition 1+i (i.e. 5) is
stored in the variable j.

Since operations can be used at declaration time (initialization), the previous example can
also be written as follows:
$ cat addition2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i = 2 + 2;
int j = 1 + i;

printf("i=%d and j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o addition2 -std=c99 -pedantic addition2.c
$ ./addition2
i=4 and j=5

The operands of the addition operator can be any numeric value (i.e. integer or floating
type). In the following example, there is one operand of type float and one operand of type
int:
$ cat addition3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float i = 2.1 + 2;
float j = 1 + i;

printf("i=%f and j=%f\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o addition3 -std=c99 -pedantic addition3.c
$ ./addition3

i=4.100000 and j=5.100000

Both operands can be of floating types:


$ cat addition4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
double i = 2.1;
float j = 1.20 + i;

printf("i=%f and j=%f\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o addition4 -std=c99 -pedantic addition4.c
$ ./addition4
i=2.100000 and j=3.300000


IV.2.3.2 Pointer operands
That the addition operator takes two numeric operands is not very surprising, but what
is unusual is that it also works with pointers in a particular way. It allows a single operand to
be of pointer type, while the second one is an integer operand. An addition involving a
pointer looks like this:
p + E

Where:
o p is a pointer
o E is an expression evaluating to an integer number n

If E is an expression evaluating to an integer number n and p is a pointer to an object obj of
type obj_type storing the address addr, the expression p + E evaluates to a pointer holding the
address addr + n * sizeof(obj_type). Remember that the expression p + E has a pointer type.

Let us consider a simple example. Let us assume that:
o The pointer p was declared as int *p
o On our computer the type int is represented by four bytes (i.e. sizeof(int) would return 4)
o The address in the pointer p is 8061028 (in hexadecimal).

In such a case, the expression p + 1 would return a pointer of the same type holding the
address 8061028 + 1*4 = 806102C (in hexadecimal). The following example displays such addresses:
$ cat addition5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = malloc(3 * sizeof *p);
p[0] = 1;
p[1] = 2;
p[2] = 3;

printf("address in p=%p, holds %d\n", (void *)p, *p);
printf("address in p+1=%p, holds %d\n", (void *)(p+1), *(p+1));
printf("address in p+2=%p, holds %d\n", (void *)(p+2), *(p+2));
return 0;
}
$ gcc -o addition5 -std=c99 -pedantic addition5.c
$ ./addition5
address in p=8061078, holds 1
address in p+1=806107c, holds 2
address in p+2=8061080, holds 3

It is worth noting that the operation p+n does not return a numeric value but a pointer of the
same type as p as shown below:
$ cat addition6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = malloc(3 * sizeof *p);
int q;

p[0] = 1;
p[1] = 2;
p[2] = 3;


q = p + 1; printf("address in q=%p, holds %d\n", q, *q);
q = p + 2; printf("address in q=%p, holds %d\n", q, *q);
return EXIT_SUCCESS;

}
$ gcc -o addition6 -std=c99 -pedantic addition6.c
addition6.c: In function main:
addition6.c:13:6: warning: assignment makes integer from pointer without a cast
addition6.c:14:6: warning: assignment makes integer from pointer without a cast
addition6.c:14:56: error: invalid type argument of unary * (have int)

The compilation failed because q must be a pointer as in the following example:


$ cat addition7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = malloc(3 * sizeof *p);
int *q;

p[0] = 1;
p[1] = 2;
p[2] = 3;

q = p; printf("address in p=%p, address in q=%p holds %d\n", (void *)p, (void *)q, *q);
q = p + 1; printf("address in p=%p, address in q=p+1=%p holds %d\n", (void *)p, (void *)q, *q);
q = p + 2; printf("address in p=%p, address in q=p+2=%p holds %d\n", (void *)p, (void *)q, *q);
return EXIT_SUCCESS;
}
$ gcc -o addition7 -std=c99 -pedantic addition7.c
$ ./addition7
address in p=80610d8, address in q=80610d8 holds 1
address in p=80610d8, address in q=p+1=80610dc holds 2
address in p=80610d8, address in q=p+2=80610e0 holds 3

IV.2.4 Subtraction
IV.2.4.1 Arithmetic operands
The subtraction operator, denoted by the symbol - (binary minus), works the same way as
the addition operator. It subtracts two numeric expressions and returns the resulting
numeric value. The following example subtracts integer values:
$ cat subtract1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i;
int j;

i = 2 - 3;
j = 4 + i;

printf("i=%d and j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o subtract1 -std=c99 -pedantic subtract1.c
$ ./subtract1
i=-1 and j=3

Since operations can be used at declaration time, the previous example can also be written
as follows:
$ cat subtract2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i = 2 - 3;
int j = 4 + i;

printf("i=%d and j=%d\n", i, j);
return EXIT_SUCCESS;
}
$ gcc -o subtract2 -std=c99 -pedantic subtract2.c
$ ./subtract2
i=-1 and j=3

The subtraction operator works with arithmetic values. In the following example, there is
one operand of type float and one of type int:
$ cat subtract3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float i = 2.1 - 2;
float j = 1 - i;

printf("i=%f and j=%f\n", i, j);

return EXIT_SUCCESS;
}
$ gcc -o subtract3 -std=c99 -pedantic subtract3.c
$ ./subtract3
i=0.100000 and j=0.900000

IV.2.4.2 Pointer operands


The subtraction operator also works with pointers, in the same way as the addition operator.
In the form described here, the left operand is a pointer while the right one is an integer operand:
p - E

Where:
o p is a pointer
o E is an expression evaluating to an integer number n.

If E is an expression evaluating to an integer number n and p is a pointer (holding the address
addr), the expression p - E returns a pointer holding the address addr - n * sizeof *p.

For example:
$ cat subtract4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = malloc(3 * sizeof *p);
int *q;

p[0] = 1;
p[1] = 2;
p[2] = 3;

q = &p[2];

printf("address in q=%p, holds %d\n", (void *)q, *q);
printf("address in q-1=%p, holds %d\n", (void *)(q-1), *(q-1));
printf("address in q-2=%p, holds %d\n", (void *)(q-2), *(q-2));
return 0;
}
$ gcc -o subtract4 -std=c99 -pedantic subtract4.c
$ ./subtract4
address in q=8061090, holds 3
address in q-1=806108c, holds 2
address in q-2=8061088, holds 1

The operation returns a pointer as shown below:


$ cat subtract5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int *p = malloc(3 * sizeof *p);
int *last_element, *q;

p[0] = 1;
p[1] = 2;
p[2] = 3;

last_element = &p[2];

q=last_element; printf("*q=%d\n", *q);
q=last_element-1; printf("*q=%d\n", *q);
q=last_element-2; printf("*q=%d\n", *q);
return 0;
}
$ gcc -o subtract5 -std=c99 -pedantic subtract5.c
$ ./subtract5
*q=3
*q=2
*q=1

IV.2.5 Multiplication
The multiplication operator denoted by the symbol * multiplies two arithmetic operands
and returns the resulting numeric value. The following example multiplies two integer
literals and stores the returning value in the variable v:
$ cat mult1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int v = 2*8;


printf("v=%d\n", v);
return EXIT_SUCCESS;
}
$ gcc -o mult1 -std=c99 -pedantic mult1.c
$ ./mult1
v=16

The following example multiplies two arithmetic literals and stores the resulting value into
the variable v:
$ cat mult2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 2 * 7.23;

printf("v=%f\n", v);
return EXIT_SUCCESS;
}
$ gcc -o mult2 -std=c99 -pedantic mult2.c
$ ./mult2
v=14.460000

The following example multiplies an arithmetic literal by a variable and stores the
resulting value in the variable w:
$ cat mult3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 7.23;
float w = 2.1 * v;

printf("w=%f\n", w);
return EXIT_SUCCESS;
}
$ gcc -o mult3 -std=c99 -pedantic mult3.c
$ ./mult3
w=15.183000

IV.2.6 Division
The division operator denoted by the symbol / divides two arithmetic operands and returns
the resulting numeric value. The division operation works as you learned it in
mathematics. However, we have to warn you this operation produces a result that may
appear surprising if both operands are of integer type. We will explain in detail why when
we talk about the rule called usual arithmetic conversions. If the operands in an operation
(including division), expecting arithmetic types, are of integer types, the resulting value is
also of integer type as shown below:
$ cat div_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 1;
int y = 3;
float z = x/y;

printf("x/y=%f/%f=%f\n", (double)x, (double)y, z);
return EXIT_SUCCESS;
}

Explanation:
o int x = 1 declares the x variable as int type and sets it to 1.
o int y = 3 declares the y variable as int type and sets it to 3.
o float z = x/y declares the z variable as float and assigns it the output of the division x/y (i.e.
1/3).
o The statement printf("x/y=%f/%f=%f\n", (double)x, (double)y, z) displays the operands,
converted to double as required by the %f specifier, and the result of the operation x/y held in
the variable z.

Intuitively, we would expect to obtain something like 0.333333. Let us run it:
$ gcc -o div_op1 -std=c99 -pedantic div_op1.c
$ ./div_op1
x/y=1.000000/3.000000=0.000000

We got the value of 0! Is it a bug? No. The reason is that none of the operands of the
expression 1/3 was of a floating type: both were of type int. Everything happened as if we had written this:
$ cat div_op2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float z = 1/3;

printf("1/3=%f\n", z);
return EXIT_SUCCESS;
}
$ gcc -o div_op2 -std=c99 -pedantic div_op2.c
$ ./div_op2
1/3=0.000000

The operation 1/3 divides the integral number 1 by the integral number 3: the type of the
expression 1/3 is then also an integer type (both operands are of type int). If we
used 1.0 (floating type) instead of 1 (int type), we would get this:
$ cat div_op3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float z = 1.0/3;

printf("1/3=%f\n", z);
return EXIT_SUCCESS;
}
$ gcc -o div_op3 -std=c99 -pedantic div_op3.c
$ ./div_op3
1/3=0.333333

The same results would have been produced if we used the operand 3.0 instead of 3. What
happened?
The type of the operation 1.0/3 is now a floating type because the literal 1.0 has a floating
type (double, since it has no suffix). Symbolically, we could write this: type of expression 1.0/3 = double/int = double.

You have two methods to tell the compiler you want to work with floating types: either by
using floating literals or explicitly casting (explicit conversion) at least one of the two
literals to a floating type. The following example forces the division to return a floating
number by specifying literals as floating type:
$ cat div_op4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

float v = 3.0/2;
float w = 3/2.0;
float x = 3.0/2.0;

printf("v=%f, w=%f, x=%f\n", v, w, x);
return EXIT_SUCCESS;
}
$ gcc -o div_op4 -std=c99 -pedantic div_op4.c
$ ./div_op4
v=1.500000, w=1.500000, x=1.500000

It worked as expected just by adding the fractional part 0! Although in mathematics 3.0 is the
same as 3, in C there is a big difference: 3.0 has a real floating type while 3 is of integer type.

In the second method (explicit conversion), we force the division to return a floating
number by casting literals to type float:
$ cat div_op5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = (float)3/2;
float w = 3/(float)2;
float x = (float)3/(float)2;

printf("v=%f, w=%f, x=%f\n", v, w, x);
return EXIT_SUCCESS;
}
$ gcc -o div_op5 -std=c99 -pedantic div_op5.c
$ ./div_op5
v=1.500000, w=1.500000, x=1.500000

In the following example, we divide two variables of type float:


$ cat div_op6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 3;
float w = 2;
float x = v / w;


printf("x=%f\n", x);
return EXIT_SUCCESS;
}
$ gcc -o div_op6 -std=c99 -pedantic div_op6.c
$ ./div_op6
x=1.500000

You may think the example div_op2.c is the same as div_op6.c, yet they are different. In example
div_op2.c, we divided an integer number by another integer number. In example div_op6.c, we
divided a floating number by another floating number. We assigned the integer literal 3 to
the floating variable v: the statement float v = 3 means the integer literal 3 is converted to the
target type float. The same conversion is done for the statement float w = 2. That is, both v and w
have a floating type, so the division v/w yields a floating value. We would get the same result
with the following code, in which only one operand has a floating type:
$ cat div_op7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 3;
int w = 2;
float x = v / w;

printf("x=%f\n", x);
return EXIT_SUCCESS;
}
$ gcc -o div_op7 -std=c99 -pedantic div_op7.c
$ ./div_op7
x=1.500000

Now, can you guess why the following example displays an incorrect value?
$ cat div_op8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
printf("1/3=%f\n", 1/3);
return EXIT_SUCCESS;
}
$ gcc -o div_op8 -std=c99 -pedantic div_op8.c
$ ./div_op8

1/3=-547185123929

The answer was given previously: the operation 1/3 yields a value of type int, whereas
the printf() specifier %f expects an argument of type double. Passing an argument whose
type does not match the conversion specifier is undefined behavior, which is why a
meaningless value is displayed. A correct version would be:
$ cat div_op9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
printf("1/3=%d\n", 1/3);
return EXIT_SUCCESS;
}
$ gcc -o div_op9 -std=c99 -pedantic div_op9.c
$ ./div_op9
1/3=0

In summary, remember that a division yields a value of integer type if both of its operands
have integer types.
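
The rule can be condensed into two small helper functions. This is only a minimal sketch: the names int_quotient and real_quotient are ours (they do not appear in the previous examples), and we assume the divisor is not zero.

```c
/* Both operands are of type int: the division is an integer
   division and the fractional part is discarded. */
int int_quotient(int a, int b) {
    return a / b;
}

/* Casting one operand to double causes the other operand to be
   converted to double as well: the division is a floating division. */
double real_quotient(int a, int b) {
    return (double)a / b;
}
```

Calling int_quotient(3, 2) yields 1, whereas real_quotient(3, 2) yields 1.5.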

IV.2.7 Modulo operator


The modulo operator (also known as the modulus operator or remainder operator), denoted by
the symbol %, takes two integer operands and returns an integer value that is the remainder
of the integer division. The division of an integer i by a nonzero integer j can be
expressed mathematically like this: i = j*n + r, where n is the integer quotient i/j. The
remainder r is returned by the modulo operator %. For example:
o 3 = 2*1+1. The integral part n=1 and the remainder r=1.
o 7 = 3*2+1. The integral part n=2 and the remainder r=1.

Here is a program coding this:
$ cat modulo_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i = 3;
int j = 2;
int n = i / j;
int r = i % j;

printf("%d=%d*%d+%d\n", i, j, n, r );
return EXIT_SUCCESS;
}
$ gcc -o modulo_op1 -std=c99 -pedantic modulo_op1.c
$ ./modulo_op1
3=2*1+1

The modulus operator may seem to be of little interest. Can you imagine a simple method to
determine if a number is odd or even? With the modulus operator, it is very easy: an even
number p can be expressed as p=2*n where n is an integer number, which means if p%2
evaluates to 0, the number is even. Conversely, an odd number p can be expressed as
p=2*n+1, which means if p%2 does not evaluate to 0, the number is odd (p%2 evaluates to 1
for a positive odd number and to -1 for a negative one). More generally, an integer
number p is a multiple of a nonzero integer number q if p%q evaluates to 0. The example below
reads the argument you have typed, converts it into a number and tells if it is even or odd:
$ cat modulo_op2.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int argc, char **argv) {
5 int n;
6
7 if (argc == 1) {
8 printf("Please provide an argument\n");
9 printf("USAGE: %s n\n",argv[0]);
10 return (EXIT_FAILURE);
11 }
12
13 n=atoi(argv[1]);
14
15 if ( n%2 == 0 ) {
16 printf( "%d is even\n", n );
17 } else {
18 printf( "%d is odd\n", n );
19 }
20 return (EXIT_SUCCESS);
21 }
$ gcc -o modulo_op2 -std=c99 -pedantic modulo_op2.c
$ ./modulo_op2 10
10 is even

Explanation:
o Line 1: the header file stdio.h is included because we use the printf() function.

o Line 2: the header file stdlib.h is included because we use the function atoi() and the values
EXIT_SUCCESS and EXIT_FAILURE.
o Line 4: the function main() is declared with two parameters, argc and argv. The integer
number argc holds the number of arguments including the program name, and argv stores
the arguments themselves. If you run the program with no argument, argc holds the value
1 (there is only the program name). If you pass one argument, argc holds the value 2
(the program name and the argument you pass). The parameter argv is a pointer to pointers to
char: it points to an array of strings. The element argv[0] holds the name of the program, argv[1]
holds the first argument, and so on.
o Line 5: the variable n is declared as type int. It will hold the value that the user passes to
the program.
o Line 7-Line 11: we test if an argument has been passed to the program. If the user has not
given an argument, argc holds the value of 1. In this case, we print a short help message
explaining how to run the program: argv[0] contains the name of the program.
o Line 13: we convert the passed argument (stored as a string in argv[1]) into a number.
o Line 15-16: we test if the number n is even: n%2 evaluates to 0.
o Line 17-18: this code is executed if n%2 does not evaluate to 0.
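
The parity test can also be wrapped in small reusable functions. The sketch below uses helper names of our own (is_even, is_odd, is_multiple_of) and assumes q is not zero. Note that is_odd() compares the remainder against 0 rather than 1 because, in C99, the remainder takes the sign of the left operand: -3 % 2 evaluates to -1.

```c
/* Returns 1 if n is even, 0 otherwise. */
int is_even(int n) {
    return n % 2 == 0;
}

/* Returns 1 if n is odd, 0 otherwise.
   Testing "!= 0" also works for negative values of n. */
int is_odd(int n) {
    return n % 2 != 0;
}

/* Returns 1 if p is a multiple of q (q must not be zero). */
int is_multiple_of(int p, int q) {
    return p % q == 0;
}
```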

IV.3 Relational operators


A relational operator takes two operands of real types[33], compares them and evaluates to
an integer of type int. The operation evaluates to 1 if the comparison is true or 0 if false. In
C, 0 means false, while any other value means true (whether it is negative or positive).

Table IV2 Relational Operators


Both operands can also be pointers to qualified or unqualified versions of compatible
object types.

Here are some examples. Below, we compare integer literals:
$ cat relop1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int r1 = 3 > 2;
int r2 = 2 > 3;
int r5 = 2 >= 2;
int r6 = 6 != 2;

printf("3>2 evaluates to %d\n", r1 );
printf("2>3 evaluates to %d\n", r2 );
printf("2>=2 evaluates to %d\n", r5 );
printf("6!=2 evaluates to %d\n", r6 );

return EXIT_SUCCESS;

}
$ gcc -o relop1 -std=c99 -pedantic relop1.c
$ ./relop1
3>2 evaluates to 1
2>3 evaluates to 0
2>=2 evaluates to 1
6!=2 evaluates to 1

We can notice that the relational operations are evaluated first; the resulting numeric
value is then assigned to the variable: relational operators have higher precedence than
the assignment operator (=).

The following example compares numeric values of different types:
$ cat relop2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
printf("3.2 > 2.9 evaluates to %d\n", 3.2 > 2.9 );
printf("2.1 > 2 evaluates to %d\n", 2.1 > 2 );
printf("8.7 <= 8 evaluates to %d\n", 8.7 <= 8 );

return EXIT_SUCCESS;
}
$ gcc -o relop2 -std=c99 -pedantic relop2.c
$ ./relop2
3.2 > 2.9 evaluates to 1
2.1 > 2 evaluates to 1
8.7 <= 8 evaluates to 0

Of course, you can compare variables:


$ cat relop3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int j = 2*7;
float r = 12.1;
float t = 14.0;

printf("%d > %d evaluates to %d\n", j, 5, j > 5 );

printf("%f <= %f evaluates to %d\n", r, t, r <= t );



return EXIT_SUCCESS;
}
$ gcc -o relop3 -std=c99 -pedantic relop3.c
$ ./relop3
14 > 5 evaluates to 1
12.100000 <= 14.000000 evaluates to 1

More generally, a relational operator takes two operands that are expressions, as shown
below:
$ cat relop4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float r = 12.1;
float t = 14.0;

printf("2*3+10 > 2+7/3 returns %d\n", 2*3+10 > 2+7/3 );
printf("%f*1.2-2 <= %f*3+1 returns %d\n", r, t, r*1.2-2 <= t*3+1 );

return EXIT_SUCCESS;
}
$ gcc -o relop4 -std=c99 -pedantic relop4.c
$ ./relop4
2*3+10 > 2+7/3 returns 1
12.100000*1.2-2 <= 14.000000*3+1 returns 1

Before the comparison occurs, the expressions are evaluated to a numeric value. For
example, in the operation 2*3+10 > 2+7/3, first, the expression 2*3+10 evaluates to 16 and 2+7/3
evaluates to 4. Then, the comparison 16 > 4 is performed.

Relational operators are generally used in control flow constructs (for loop, while loop, if
statement). The following example prints the integers from 0 to 5:
$ cat relop5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int max = 5;

int i = 0;

while ( i <= max ) {
printf("i=%d\n", i);
i = i + 1;
}

return EXIT_SUCCESS;
}
$ gcc -o relop5 -std=c99 -pedantic relop5.c
$ ./relop5
i=0
i=1
i=2
i=3
i=4
i=5


Take note that an expression such as x < y < z means:
o Evaluate x < y, which yields 1 if the comparison is true or 0 otherwise. Let res be this value.
o Then, evaluate the expression res < z (res is 0 or 1).

When several relational operators (having the same precedence) are present, the compiler
uses left-to-right associativity. Accordingly, x < y < z is equivalent to (x < y) < z. The mathematical
expression x < y < z must be written x < y && y < z in the C language. Associativity will be
covered later in the chapter.
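
To make the difference concrete, the following sketch places the two forms side by side. The function names chained_less and strictly_ordered are ours, chosen only for illustration.

```c
/* Evaluates the way C parses x < y < z: (x < y) yields 0 or 1,
   and that value is then compared with z. */
int chained_less(int x, int y, int z) {
    return (x < y) < z;
}

/* The mathematical meaning: x is less than y and y is less than z. */
int strictly_ordered(int x, int y, int z) {
    return x < y && y < z;
}
```

With x=5, y=3 and z=2, chained_less() returns 1 (5 < 3 yields 0, and 0 < 2 is true) although the three values are not in increasing order, while strictly_ordered() returns 0.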

IV.4 Equality operators


Equality operators are often considered relational operators but in C, there is a subtle
distinction. They take two operands of arithmetic types[34] and compare them (relational
operators accept only real types: they do not compare complex types). Equality operations
evaluate to a value of type int: 1 if the comparison is true or 0 if false. In C, 0 means false,
while any other value means true (whether it is negative or positive). Two complex
numbers are equal if their real parts are equal and their imaginary parts are equal.

Table IV3 Equality Operators


Like relational operators, both operands can also be pointers to qualified or unqualified
versions of compatible object types.

Relational operators have precedence over equality operators. For example, the statement z
== x < y first compares x and y then the resulting value of x < y is compared to z. Here is an
example:
$ cat equop1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 5;
int y = 6;
int z = 1;

printf("%d == %d < %d returns %d\n", z, x, y, z == x < y );

return EXIT_SUCCESS;
}
$ gcc -o equop1 -std=c99 -pedantic equop1.c
$ ./equop1
1 == 5 < 6 returns 1

With equality operators, one operand can be a pointer to an object and the other operand
can be a pointer to a qualified or unqualified version of void. This is not permitted with
relational operators.

With equality operators, one operand can be a pointer and the other operand can be a null
pointer constant. This is not permitted with relational operators.
$ cat equop2.c
#include <stdio.h>
#include <stdlib.h>


int main(void) {
int *p = NULL;

printf("p == NULL: %d\n", p == NULL );

return EXIT_SUCCESS;
}
$ gcc -o equop2 -std=c99 -pedantic equop2.c
$ ./equop2
p == NULL: 1

The following example checks if the passed argument has a fractional part. The test is
done by the if statement that compares the number given as argument of the program with
its integer part: if they are equal, it means the number has no fractional part:
$ cat equop3.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
double f;
long i;

if (argc == 1) {
printf("Please provide a number\n");
printf("USAGE: %s number\n",argv[0]);
return (EXIT_FAILURE);
}

f = atof(argv[1]); /* converts the string to a floating-point number (double) */
i = atoi(argv[1]); /* converts the string to an integer number.
If argv[1] holding the first argument has
a fractional part, it is discarded. Only the
integral part is kept.
*/

if ( i == f ) {
printf( "%s is an integer number\n", argv[1] );
} else {
printf( "%s has a fractional part\n", argv[1] );
}
return (EXIT_SUCCESS);

}
$ gcc -o equop3 -std=c99 -pedantic equop3.c
$ ./equop3 9.9
9.9 has a fractional part
$ ./equop3 10
10 is an integer number

In case pointers or arrays are part of operands, you have to watch out for what you really
mean: are you talking about the address held in the pointer or the value it points to? The
program below compares two pointers:
$ cat equop4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char *str1 = malloc(20 * sizeof *str1);
char *str2 = malloc(20 * sizeof *str2);

strcpy(str1, "hello");
strcpy(str2, "hello");

printf("str1 holds %s, str2 holds %s \n", str1, str2 );
/* %p expects a pointer to void: the pointers are cast accordingly */
printf("%p == %p returns %d\n", (void *)str1, (void *)str2, str1 == str2 );

return EXIT_SUCCESS;
}
$ gcc -o equop4 -std=c99 -pedantic equop4.c
$ ./equop4
str1 holds hello, str2 holds hello
80610a0 == 80610c0 returns 0

Both pointers str1 and str2 point to memory blocks containing the same character string,
but the addresses they hold are different, which implies the expression str1 == str2 evaluates to
0 (false). The equality operation str1 == str2 does not compare the referenced objects but
the pointers themselves. The functions strcmp() and strncmp() are commonly used to compare
strings, as in the following example:
$ cat equop5.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char *str1 = malloc(20 * sizeof *str1);
char *str2 = malloc(20 * sizeof *str2);
int cmp;

strcpy(str1, "hello");
strcpy(str2, "hello");

cmp = strcmp(str1, str2);

printf("strcmp(\"%s\", \"%s\") returns %d: ", str1, str2, cmp );

if ( cmp == 0 ) {
printf("same characters\n");
} else {
printf("different characters\n");
}

return EXIT_SUCCESS;
}
$ gcc -o equop5 -std=c99 -pedantic equop5.c
$ ./equop5
strcmp("hello", "hello") returns 0: same characters

Here, be aware that the strcmp() function returns 0 if the strings hold the same characters. Its
return value should not be confused with the result of an equality operator, which evaluates
to 1 when its operands are equal.
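
The two kinds of comparison can be summarized by the following sketch. The helper names same_address and same_content are hypothetical; both functions assume their arguments point to valid null-terminated strings.

```c
#include <string.h>

/* Compares the addresses held by the two pointers. */
int same_address(const char *a, const char *b) {
    return a == b;
}

/* Compares the characters the two pointers refer to:
   strcmp() returns 0 when the strings are identical. */
int same_content(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}
```

For two distinct arrays that both hold the string hello, same_address() returns 0 while same_content() returns 1.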

IV.5 Logical operators


IV.5.1 Definition
A logical operator takes one or two integer operands and evaluates to an integer value: 0
(for false) or 1 (for true). In Table IV4, the operands A and B are expressions that
evaluate to an integer value. In C, remember that an integer value different from zero
(negative or positive) is considered true. Only the value of zero is considered false.

Table IV4 Logical operators

IV.5.2 Logical NOT


The ! operator is a unary operator that inverts the logical value of its operand: if the
expression A is true then !A is false and if A is false then !A is true. That is, !A returns 1 if
the expression A evaluates to 0 and returns 0 otherwise as shown below:
$ cat logop1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i;

i = 5; printf("!%d=%d\n", i, !i);
i = 0; printf("!%d=%d\n", i, !i);
i = -10; printf("!%d=%d\n", i, !i);

return EXIT_SUCCESS;
}
$ gcc -o logop1 -std=c99 -pedantic logop1.c
$ ./logop1
!5=0
!0=1
!-10=0

In example equop5.c, we used the condition cmp == 0 to test the value returned by strcmp().
Since !A returns 1 if A evaluates to 0, cmp == 0 is accordingly the same as !cmp:
$ cat logop2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char *str1 = malloc(20 * sizeof *str1);
char *str2 = malloc(20 * sizeof *str2);
int cmp;

strcpy(str1, "hello");
strcpy(str2, "hello");

cmp = strcmp(str1, str2);

printf("strcmp(\"%s\", \"%s\") returns %d: ", str1, str2, cmp );

if ( !cmp ) {
printf("same characters\n");
} else {
printf("different characters\n");
}

return EXIT_SUCCESS;
}
$ gcc -o logop2 -std=c99 -pedantic logop2.c
$ ./logop2
strcmp("hello", "hello") returns 0: same characters

IV.5.3 Logical AND


The logical operator && is known as a logical AND. It takes two operands and evaluates to
an integer of type int; it evaluates to 0 (false) or 1 (true). The logical expression A && B
returns 1 only if both the operands are true (value different from 0). Otherwise, it returns 0
(Table IV5).

Table IV5 Logical AND


The operands A and B are expressions whose resulting values have arithmetic types or
pointer types[35].

Here is an example:
$ cat logop3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main(void) {
int i, j;

i = 5; j = 1; printf("%d && %d = %d\n", i, j, i && j);
i = 0; j = 1; printf("%d && %d = %d\n", i, j, i && j);
i = 0; j = 0; printf("%d && %d = %d\n", i, j, i && j);
i = -3; j = 0; printf("%d && %d = %d\n", i, j, i && j);
i = -3; j = 1; printf("%d && %d = %d\n", i, j, i && j);

return EXIT_SUCCESS;
}
$ gcc -o logop3 -std=c99 -pedantic logop3.c
$ ./logop3
5 && 1 = 1
0 && 1 = 0
0 && 0 = 0
-3 && 0 = 0
-3 && 1 = 1

Obviously, you will rarely use it this way: you will most often use it in control flow
constructs. The following example displays the integer numbers in the interval [2,7]:
$ cat logop4.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int min = 2;
6 int max = 7;
7 int i = min;
8
9 while ( min <= i && i <= max ) {
10 printf("i=%d\n", i);
11 i = i + 1;
12 }
13
14 return EXIT_SUCCESS;
15 }
$ gcc -o logop4 -std=c99 -pedantic logop4.c
$ ./logop4
i=2
i=3
i=4
i=5
i=6
i=7

Explanation:
o Line 5: the integer variable min is initialized to the value 2.
o Line 6: the integer variable max is initialized to the value 7.
o Line 7: The i variable is initialized to the value held in the min variable. It will be used in
the while loop as a counter that will be incremented at each iteration (line 11).
o Line 9: The while loop tests if the variable i has a value greater than or equal to the
variable min and less than or equal to the variable max. If the logical expression
evaluates to true, the while block is executed. The block of the while loop consists of two
statements at lines 10 and 11. The while loop stops when the i variable becomes greater
than the max variable (the logical expression evaluates to false).
o Line 10: the value of the i variable is printed.
o Line 11: the i variable is incremented.

IV.5.4 Logical OR
The logical operator || is known as a logical OR. It takes two operands and evaluates to an
integer value of type int: 0 (false) or 1 (true). The logical expression A || B returns 1 if at
least one of the operands is true. Otherwise, it returns 0. To put it another way, it returns 0
if both the operands are false and 1 otherwise (see Table IV6).

Table IV6 Logical OR


The operands A and B are expressions whose resulting values have scalar types[36].
Here is an example:
$ cat logop5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i, j;

i = 5; j = 1; printf("%d || %d = %d\n", i, j, i || j);
i = 0; j = 1; printf("%d || %d = %d\n", i, j, i || j);
i = 0; j = 0; printf("%d || %d = %d\n", i, j, i || j);
i = -3; j = 0; printf("%d || %d = %d\n", i, j, i || j);
i = -3; j = 1; printf("%d || %d = %d\n", i, j, i || j);

return EXIT_SUCCESS;
}
$ gcc -o logop5 -std=c99 -pedantic logop5.c
$ ./logop5
5 || 1 = 1
0 || 1 = 1
0 || 0 = 0
-3 || 0 = 1
-3 || 1 = 1

The following example tests whether two arrays store different character strings:
$ cat logop6.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char s1[] = "hello";
char s2[] = "world";

if (strcmp(s1,s2) > 0 || strcmp(s1,s2) < 0) {
printf("s1 and s2 store different strings\n");
} else {
printf("s1 and s2 store the same string\n");
}
return EXIT_SUCCESS;
}
$ gcc -o logop6 -std=c99 -pedantic logop6.c
$ ./logop6
s1 and s2 store different strings
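
The logical AND and the logical OR often express two sides of the same test. As a small sketch (the function names in_range and out_of_range are ours), the interval test used in logop4.c can be written with &&, and its opposite with ||:

```c
/* Returns 1 if min <= i <= max, 0 otherwise. */
int in_range(int i, int min, int max) {
    return min <= i && i <= max;
}

/* Returns 1 if i lies outside the interval [min,max], 0 otherwise. */
int out_of_range(int i, int min, int max) {
    return i < min || i > max;
}
```

For any values, exactly one of the two functions returns 1: for example, with the interval [2,7], the value 5 is in range and the value 8 is out of range.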

IV.6 Bitwise operators


The bitwise operators take one or two operands of integer type. They work on each bit of
the given operands. In Table IV7, the operands A, B and N are expressions evaluating to an
integer value.

Table IV7 Bitwise operators


In this section, we will use the notations of the second chapter allowing us to make the
distinction between a number in base 10 (decimal base) and in base 2 (binary base):
o N₁₀ or N represents a number in base 10. For example, 5₁₀ or 5 denotes the number 5 in
base 10.
o N₂ represents a number in base 2. For example, 101₂ denotes the number 5₁₀.


Here, we just briefly review what we explained in Chapter II when we talked
about types. In your program, you will normally work with numbers using the usual
decimal representation (in base 10). However, if you work with bitwise operations, you
have to represent numbers in base two, which eases computations. Internally, a number fits
in a fixed number of bits depending on the type used. On our computer, a number of type
char fits in eight bits and a number of type int fits in thirty-two bits (four bytes). In the next
sections, for the sake of simplicity, we will work with eight bits. For example, a variable
of type char, holding the value 5, has the binary representation 00000101. If it were
declared as an int, it would have the binary representation
00000000000000000000000000000101.

The least significant bit (the right most bit according to our convention) is at position 0. If
a number fits in n bits, the most significant bit (the left-most bit according to our
representation) is at position n-1. Working with eight bits, the most significant bit is at
position seven.

On a computer, there are several ways to represent a negative integer number: the C
language does not impose a specific internal representation of signed numbers. For this reason,
the result of a bitwise operation on a negative number can depend on the implementation
or even be undefined. In the following sections, we will work with non-negative integer numbers.

IV.6.1 Bitwise complement


~A

Where A is an expression evaluating to an integer value. The unary operator ~ is the
bitwise complement. It inverts each bit of the operand (Figure IV1). Here are some
examples:
o ~0₂ = 1₂
o ~11₂ = 00₂
o ~100₂ = 011₂

Let us consider an unsigned char represented by eight bits, which corresponds to the range
[0-255]. The decimal value 5₁₀, which fits in eight bits, can be represented by the octet
00000101₂. Thus, ~5₁₀ = ~00000101₂ = 11111010₂ = 250₁₀ as shown below:
$ cat bitwise_not1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main(void) {
unsigned char i = 5; /* 00000101 */
unsigned char j = ~i; /* 11111010 = 250*/

printf("i=%u and j=~%u=%u\n", i, i, j);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_not1 -std=c99 -pedantic bitwise_not1.c
$ ./bitwise_not1
i=5 and j=~5=250

Now, if we consider the number 5 as an unsigned int, it is represented by four bytes on
our computer: 5₁₀ = 00000000000000000000000000000101₂. Thus,
~5 = ~00000000000000000000000000000101₂ = 11111111111111111111111111111010₂ = 4294967290₁₀
as shown below:
$ cat bitwise_not2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int i = 5;
unsigned int j = ~i;

printf("i=%u and j=~%u=%u\n", i, i, j);
return EXIT_SUCCESS;
}
$ gcc -o bitwise_not2 -std=c99 -pedantic bitwise_not2.c
$ ./bitwise_not2
i=5 and j=~5=4294967290

Figure IV1 Bitwise NOT
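
One detail is worth keeping in mind: an operand of type unsigned char is first promoted to int, so the operator ~ works on the width of an int, not on eight bits. In bitwise_not1.c, the value is brought back to eight bits when it is assigned to the unsigned char variable j. The sketch below makes this conversion explicit; the function name complement8 is ours and we assume a char of eight bits, as on our computer.

```c
/* Complements the bits of an unsigned char.
   v is promoted to int before ~ is applied; the conversion back
   to unsigned char keeps only the low-order bits. */
unsigned char complement8(unsigned char v) {
    return (unsigned char)~v;
}
```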

IV.6.2 Left shift operator


B << N

Where B and N are two expressions evaluating to integer values that we will call b and n
respectively.

Figure IV2 Bitwise left shift


The left shift operator denoted by the symbol << takes two integer operands. The left shift
operation b << n shifts the bits of the integer number b by n bits towards the most
significant bit (Figure IV2). As an example, let us consider the number 5 represented by
eight bits (character type):
o 5₁₀ << 1₁₀ = 00000101₂ << 1₁₀ = 00001010₂ = 10₁₀
o 5₁₀ << 2₁₀ = 00000101₂ << 2₁₀ = 00010100₂ = 20₁₀
o 5₁₀ << 3₁₀ = 00000101₂ << 3₁₀ = 00101000₂ = 40₁₀
o 5₁₀ << 4₁₀ = 00000101₂ << 4₁₀ = 01010000₂ = 80₁₀

The left shift operation b << n is equivalent to b * 2ⁿ (where b and n are non-negative
integer values), provided the result fits in the type of the result. For example:
o 5 << 1 is equivalent to 5*2¹=10.
o 5 << 2 is equivalent to 5*2²=20.
o 5 << 3 is equivalent to 5*2³=40.
o 5 << 4 is equivalent to 5*2⁴=80.

Here is an example:
$ cat bitwise_left_shift1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned char b = 5;
int n;

n = 1; printf("%u << %u = %u\n", b, n, b << n);
n = 2; printf("%u << %u = %u\n", b, n, b << n);
n = 3; printf("%u << %u = %u\n", b, n, b << n);
n = 4; printf("%u << %u = %u\n", b, n, b << n);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_left_shift1 -std=c99 -pedantic bitwise_left_shift1.c
$ ./bitwise_left_shift1
5 << 1 = 10
5 << 2 = 20
5 << 3 = 40
5 << 4 = 80

It is important to note some constraints. If the right operand n of the operation b << n is
negative or too big, the behavior is undefined. What does too big mean? If b is an integer
number fitting in p bits (width of the integer after promotion), the number n must be less
than p to avoid undefined behavior. In the following example, the compiler reminds us of
this constraint (on our computer sizeof(int) = 4 bytes = 32 bits):

$ cat bitwise_left_shift2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int b = 5;

printf("%d\n", b << 32);
return EXIT_SUCCESS;
}
$ gcc -o bitwise_left_shift2 -std=c99 -pedantic bitwise_left_shift2.c
bitwise_left_shift2.c: In function 'main':
bitwise_left_shift2.c:7:4: warning: left shift count >= width of type [enabled by default]
printf("%d\n", b << 32);
^

In C, you should avoid undefined behaviors whenever possible. According to the C standard, a
behavior is said to be undefined when the standard imposes no requirements on it: anything
might occur. The implementation may handle it in its own way, ignore it (with unpredictable
results) or generate an error.

Take note the width of a number is less than or equal to its size as returned by the sizeof operator.
The width of a number is the number of bits used to represent it excluding the padding bits (see Chapter III section
III.6.1).
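
A program can guard against an invalid shift count before shifting. The following is a minimal sketch, not a definitive implementation: the function name checked_shift_left is ours, and it uses sizeof(unsigned int) * CHAR_BIT as the limit, which is only an upper bound of the width (it equals the width when the type has no padding bits). Since n is unsigned, it cannot be negative.

```c
#include <limits.h>

/* Stores b << n in *result and returns 1 when the shift count is
   acceptable; returns 0 and leaves *result untouched otherwise. */
int checked_shift_left(unsigned int b, unsigned int n, unsigned int *result) {
    if (n >= sizeof(unsigned int) * CHAR_BIT) {
        return 0;   /* shift count too big: shifting would be undefined */
    }
    *result = b << n;
    return 1;
}
```

For example, checked_shift_left(5, 3, &r) returns 1 and stores 40 in r, while a shift count of 1000 is rejected.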

IV.6.3 Right shift bitwise operator


B >> N

Where B and N are two expressions evaluating to integer values that we will call b and n
respectively.

Figure IV3 Bitwise right shift


The right shift operator is represented by the symbol >>. It takes two integer operands.
The expression b >> n shifts the bits of the integer number b by n bits towards the least
significant bit (Figure IV3). As an example, let us consider the number 160₁₀
(10100000₂) represented by eight bits (character type):
o 160₁₀ >> 1₁₀ = 10100000₂ >> 1₁₀ = 01010000₂ = 80₁₀
o 160₁₀ >> 2₁₀ = 10100000₂ >> 2₁₀ = 00101000₂ = 40₁₀
o 160₁₀ >> 3₁₀ = 10100000₂ >> 3₁₀ = 00010100₂ = 20₁₀
o 160₁₀ >> 4₁₀ = 10100000₂ >> 4₁₀ = 00001010₂ = 10₁₀



The bitwise operation b >> n is equivalent to the integer division b / 2ⁿ (where b and n are
non-negative integer values). For example:
o 160 >> 1 is equivalent to 160/2¹=80.
o 160 >> 2 is equivalent to 160/2²=40.
o 160 >> 3 is equivalent to 160/2³=20.
o 160 >> 4 is equivalent to 160/2⁴=10.

Here is an example showing what we have said so far:
$ cat bitwise_right_shift1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned char b = 160;
int n;

n = 1; printf("%u >> %u = %u\n", b, n, b >> n);
n = 2; printf("%u >> %u = %u\n", b, n, b >> n);
n = 3; printf("%u >> %u = %u\n", b, n, b >> n);
n = 4; printf("%u >> %u = %u\n", b, n, b >> n);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_right_shift1 -std=c99 -pedantic bitwise_right_shift1.c
$ ./bitwise_right_shift1
160 >> 1 = 80
160 >> 2 = 40
160 >> 3 = 20
160 >> 4 = 10

Of course, if we continue shifting the number, we will get 0:


$ cat bitwise_right_shift2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

unsigned char b = 160;


int n;

n = 6; printf("%u >> %u = %u\n", b, n, b >> n);
n = 7; printf("%u >> %u = %u\n", b, n, b >> n);
n = 8; printf("%u >> %u = %u\n", b, n, b >> n);
n = 9; printf("%u >> %u = %u\n", b, n, b >> n);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_right_shift2 -std=c99 -pedantic bitwise_right_shift2.c
$ ./bitwise_right_shift2
160 >> 6 = 2
160 >> 7 = 1
160 >> 8 = 0
160 >> 9 = 0

If the right operand n of the operation b >> n is negative, or greater than or equal to the
width of the (promoted) left operand, the behavior is undefined: the implementation may
choose to generate an error, ignore it leading to an unpredictable value or specify a specific
behavior. If the left operand b has a signed type and a negative value, the resulting value
depends on the implementation.
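
The equivalence between a right shift and a division by a power of two can be sketched as follows for unsigned values. The function name divide_by_power_of_two is ours; the caller is assumed to pass a shift count smaller than the width of unsigned int.

```c
/* Divides b by 2 raised to the power n using a right shift.
   n must be smaller than the width of unsigned int. */
unsigned int divide_by_power_of_two(unsigned int b, unsigned int n) {
    return b >> n;
}
```

As with the integer division, the bits shifted out are lost: divide_by_power_of_two(5, 1) yields 2, just like 5/2.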

IV.6.4 Bitwise AND


A & B

Where A and B are expressions evaluating to an integer value. The bitwise AND denoted
by the ampersand symbol & is similar to the logical AND. It takes two integer numbers
and applies the bitwise AND at bit-level according to the truth Table IV8.

Table IV8 Bitwise AND


Let us consider the decimal numbers 160 and 116. The bitwise AND operation 160 & 116
yields 32. You cannot easily guess the result if you work with the decimal representation
because the bitwise operation works at bit level. To understand how the operation
works, you have to use the binary representation of the numbers. Let the numbers 160 and
116 be two integers of type unsigned char (fitting in eight bits). Since in our convention the
most significant bit is on the left side, their binary representations are respectively
10100000₂ and 01110100₂. In this case, the bitwise AND operation 160₁₀ &
116₁₀ = 10100000₂ & 01110100₂ produces 00100000₂, which represents the decimal
number 32 as depicted in Figure IV4.

Figure IV4 Bitwise AND


More generally, let A be an integer number represented by the binary number aₙ₋₁aₙ₋₂…a₁a₀
and B an integer number represented by the binary number bₙ₋₁bₙ₋₂…b₁b₀. Both
numbers fit in n bits. The operation A&B yields the binary number cₙ₋₁cₙ₋₂…c₁c₀, where
cₙ₋₁ = aₙ₋₁&bₙ₋₁, cₙ₋₂ = aₙ₋₂&bₙ₋₂, …, c₀ = a₀&b₀ according to the truth Table IV8.

The following code gives some examples of bitwise AND operations:
$ cat bitwise_AND.c
#include <stdio.h>

#include <stdlib.h>

int main(void) {
unsigned char a;
unsigned char b;

a = 160; b=116 ; printf("%u & %u = %u\n", a, b, a & b);
a = 0; b=1 ; printf("%u & %u = %u\n", a, b, a & b);
a = 1; b=1 ; printf("%u & %u = %u\n", a, b, a & b);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_AND -std=c99 -pedantic bitwise_AND.c
$ ./bitwise_AND
160 & 116 = 32
0 & 1 = 0
1 & 1 = 1
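
A common use of the bitwise AND is to test a single bit with a mask. The sketch below is an illustration under our own naming (bit_is_set is not part of the examples above); position 0 is the least significant bit and pos is assumed to be smaller than the width of unsigned int.

```c
/* Returns 1 if the bit at position pos is set in value, 0 otherwise.
   The mask 1u << pos has a single bit set, at position pos. */
int bit_is_set(unsigned int value, unsigned int pos) {
    return (value & (1u << pos)) != 0;
}
```

Since 160 is 10100000 in base 2, bit_is_set(160, 5) and bit_is_set(160, 7) return 1, while bit_is_set(160, 6) returns 0.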

IV.6.5 Bitwise inclusive OR


A | B

Where A and B are expressions evaluating to an integer value.


Figure IV5 Bitwise OR


The bitwise OR denoted by the symbol | takes two integer numbers and operates on bits of
each operand according to Table IV9. If A and B are two integer numbers fitting in n bits,
represented respectively by the binary numbers aₙ₋₁aₙ₋₂…a₁a₀ and bₙ₋₁bₙ₋₂…b₁b₀, the
operation A|B yields the binary number cₙ₋₁cₙ₋₂…c₁c₀, where cₙ₋₁ = aₙ₋₁|bₙ₋₁, cₙ₋₂ = aₙ₋₂|bₙ₋₂,
…, c₀ = a₀|b₀ according to the truth Table IV9.

Table IV9 Bitwise OR


For example, the OR operation 160 | 116 produces the value 244 as depicted in Figure
IV5. The following code gives some examples of bitwise OR operations:
$ cat bitwise_OR.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned char a;
unsigned char b;


a = 160; b=116 ; printf("%u | %u = %u\n", a, b, a | b);
a = 0; b=1 ; printf("%u | %u = %u\n", a, b, a | b);
a = 1; b=1 ; printf("%u | %u = %u\n", a, b, a | b);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_OR -std=c99 -pedantic bitwise_OR.c
$ ./bitwise_OR
160 | 116 = 244
0 | 1 = 1
1 | 1 = 1
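
The bitwise OR is typically used to set a bit without changing the others. Here is a minimal sketch; the function name set_bit is ours and pos is assumed to be smaller than the width of unsigned int.

```c
/* Returns value with the bit at position pos set to 1.
   Bits that are already set remain set. */
unsigned int set_bit(unsigned int value, unsigned int pos) {
    return value | (1u << pos);
}
```

For example, set_bit(160, 2) yields 164 (10100100 in base 2), and set_bit(160, 5) yields 160 because bit 5 is already set.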

IV.6.6 Bitwise exclusive OR (XOR)


A ^ B

Where A and B are expressions evaluating to an integer value. The bitwise operator XOR
denoted by the symbol ^ takes two integer numbers and operates on bits of operands
according to Table IV10. If A and B are two integer numbers fitting in n bits, represented
respectively by the binary numbers aₙ₋₁aₙ₋₂…a₁a₀ and bₙ₋₁bₙ₋₂…b₁b₀, the operation A^B yields
the binary number cₙ₋₁cₙ₋₂…c₁c₀, where cₙ₋₁ = aₙ₋₁^bₙ₋₁, cₙ₋₂ = aₙ₋₂^bₙ₋₂, …, c₀ = a₀^b₀
according to the truth Table IV10.

Table IV10 Bitwise XOR


Figure IV6 depicts the operation 160 ^ 116 that produces the value 212.


Figure IV6 Bitwise XOR




The following code gives some examples of bitwise XOR operations:
$ cat bitwise_XOR.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

unsigned char a;
unsigned char b;

a = 160; b=116 ; printf("%u ^ %u = %u\n", a, b, a ^ b);
a = 0; b=1 ; printf("%u ^ %u = %u\n", a, b, a ^ b);
a = 1; b=1 ; printf("%u ^ %u = %u\n", a, b, a ^ b);

return EXIT_SUCCESS;
}
$ gcc -o bitwise_XOR -std=c99 -pedantic bitwise_XOR.c
$ ./bitwise_XOR
160 ^ 116 = 212
0 ^ 1 = 1
1 ^ 1 = 0
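
Because a bit XORed with 1 is inverted and a bit XORed with 0 is left unchanged, the bitwise XOR is handy to toggle a bit. The sketch below uses a function name of our own (toggle_bit) and assumes pos is smaller than the width of unsigned int.

```c
/* Returns value with the bit at position pos inverted.
   Applying the function twice gives back the original value. */
unsigned int toggle_bit(unsigned int value, unsigned int pos) {
    return value ^ (1u << pos);
}
```

For example, toggle_bit(160, 5) yields 128, and toggle_bit(128, 5) yields 160 again.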

IV.7 Address and dereferencing operators


The operators * and & allow programmers to deal with pointers and arrays. If p is a
pointer, p is a variable holding the memory address of a storage area. This implies that,
through p, you have direct access to the memory address of the object pointed to, but not
to the object itself. The indirect access (to the object itself) is done through the unary
operator *: *p represents the object itself, accessed through the pointer p. The address of
the object is first read, then the object is accessed. Dereferencing the pointer p means
accessing the object *p.

You may have noticed the symbol * is used in three different ways that might lead to
confusion:
o It is used as a multiplication operator (binary operator) taking two operands. This
operator has nothing to do with pointers.
o It is used to declare a pointer such as int *p. The symbol * indicates the name following it
is the identifier of the pointer. This has nothing to do with dereferencing.
o It is used to dereference a pointer such as in the statement obj = *p. The unary operator *
is used to access the object the pointer points to.

The second operator related to pointers is the address-of operator denoted by a single
ampersand &. Here again, we can see the C language uses the same symbol for different
meanings: it denotes both the bitwise AND (binary operator) that takes two integer
operands and the address-of operator that takes a single operand. When used as a unary
operator, it evaluates to the address of its operand. That is, it converts an object to a
pointer to this object: if obj is an object of type obj_type, &obj evaluates to a pointer of
type obj_type *. Of course, *(&obj) designates the same object as obj.


Here is an example:
$ cat pointer_op.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
long u = 100L;
long *p = &u;
long v = *p;

printf("address p=%p, address &u=%p, v=%ld\n", (void *)p, (void *)&u, v);
return EXIT_SUCCESS;
}
$ gcc -o pointer_op -std=c99 -pedantic pointer_op.c
$ ./pointer_op
address p=feffeaa4, address &u=feffeaa4, v=100

IV.8 Increment and decrement operators


IV.8.1 Prefix increment operator
The prefix increment operator denoted by ++ is a unary operator placed before an
operand[37] of real or pointer type[38]. It has the following form:
++var

If var is a variable, it increments it and evaluates to the resulting value. For example, if v=5,
the expression ++v evaluates to 6 and v is set to this value as shown below:
$ cat prefix_inc1.c
#include <stdlib.h>
#include <stdio.h>


int main(void) {
int v = 5;
int w = ++v;

printf("v=%d and w=%d\n", v, w);

return EXIT_SUCCESS;
}
$ gcc -o prefix_inc1 -std=c99 -pedantic prefix_inc1.c
$ ./prefix_inc1
v=6 and w=6

The operand can be a real floating number:


$ cat prefix_inc2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 5.2;
float w = ++v;

printf("v=%f and w=%f\n", v, w);
return EXIT_SUCCESS;
}
$ gcc -o prefix_inc2 -std=c99 -pedantic prefix_inc2.c
$ ./prefix_inc2
v=6.200000 and w=6.200000

If the operand is a pointer, the meaning is almost the same. The unary operator ++
evaluates to the pointer to the next object and stores that address into the pointer.
Another way to put it: if p is a pointer, the expression ++p is identical to p=p+1. If p
holds the address addr, it sets the pointer p to the address located sizeof *p bytes after
addr and evaluates to that new pointer as depicted below:
$ cat prefix_inc3.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int n = 3;
6 int *var = malloc(n * sizeof *var) ;
7 int *p;
8
9 var[0] = 10;
10 var[1] = 11;
11 var[2] = 17;
12
13 printf("sizeof int=%zu\n", sizeof *var);
14 p=var; printf("p=%p and var=%p. *p=%d and *v=%d\n", (void *)p, (void *)var, *p, *var);
15 p=++var; printf("p=%p and var=%p. *p=%d and *v=%d\n", (void *)p, (void *)var, *p, *var);
16 p=++var; printf("p=%p and var=%p. *p=%d and *v=%d\n", (void *)p, (void *)var, *p, *var);
17
18 return EXIT_SUCCESS;
19}
$ gcc -o prefix_inc3 -std=c99 -pedantic prefix_inc3.c
$ ./prefix_inc3
sizeof int=4
p=80610d0 and var=80610d0. *p=10 and *v=10
p=80610d4 and var=80610d4. *p=11 and *v=11
p=80610d8 and var=80610d8. *p=17 and *v=17

Explanation:
o Line 5: the variable n is the number of elements in the memory area we allocate in the
next line.
o Line 6: we declare var as a pointer to int and we initialize it with the address of the
memory space allocated by the malloc() function. The allocated memory area can store n
(set to 3) values of type int.
o Line 7: we declare p as a pointer to int. It will be used to get the value returned by the
expression ++var.
o Line 9-11: we initialize the elements in the memory area allocated by malloc().
o Line 13: the size of the objects (int) pointed to by the pointer var is displayed: in our
computer, a value of type int fits in 4 bytes (32 bits).
o Line 14: the pointer p is assigned the value held in the pointer var. We display the
addresses held in both the pointers through the printf() specifier %p along with the values
they point to. In our computer, the pointer var stored the address 80610d0.
o Line 15: the prefix expression ++var increments the pointer var by the size of the type it
points to (int) and returns the newly computed address: it is the same as var = var + 1. In our
computer, the operation produced the value 80610d0+4=80610d4 that is also assigned to the
pointers p and var. The printf() function displays the addresses and the values the pointers
var and p point to.

IV.8.2 Prefix decrement operator


The prefix decrement operator denoted by -- is a unary operator placed before an
operand[39] of real or pointer type. It has the following form:
--var

It decrements the value of the operand and evaluates to the resulting value. For example, if
v=5, the expression --v evaluates to 4 and v is set to this value as shown below:

$ cat prefix_dec1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int v = 5;
int w = --v;

printf("v=%d and w=%d\n", v, w);
return EXIT_SUCCESS;
}
$ gcc -o prefix_dec1 -std=c99 -pedantic prefix_dec1.c
$ ./prefix_dec1
v=4 and w=4

The operand can be a real floating number:


$ cat prefix_dec2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float v = 5.2;
float w = --v;

printf("v=%f and w=%f\n", v, w);
return EXIT_SUCCESS;
}
$ gcc -o prefix_dec2 -std=c99 -pedantic prefix_dec2.c
$ ./prefix_dec2
v=4.200000 and w=4.200000

If the operand is a pointer, the prefix decrement operation alters it to the address of the
previous object and evaluates to a pointer holding that address: the expression --var is the
same as the expression var=var-1. It moves the pointer var back by sizeof *var bytes and
returns a pointer holding that address as depicted below:
$ cat prefix_dec3.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int n = 3;
6 int *var = malloc(n * sizeof *var) ;

7 int *p_elt, *p;


8
9 var[0] = 10;
10 var[1] = 11;
11 var[2] = 17;
12 p_elt = &var[2];
13
14 printf("sizeof int=%zu\n", sizeof *var);
15 p=p_elt; printf("p=%p and p_elt=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);
16 p=--p_elt; printf("p=%p and p_elt=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);
17 p=--p_elt; printf("p=%p and p_elt=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);
18
19 return EXIT_SUCCESS;
20 }
$ gcc -o prefix_dec3 -std=c99 -pedantic prefix_dec3.c
$ ./prefix_dec3
sizeof int=4
p=80610d0 and p_elt=80610d0. *p=17 and *p_elt=17
p=80610cc and p_elt=80610cc. *p=11 and *p_elt=11
p=80610c8 and p_elt=80610c8. *p=10 and *p_elt=10

Explanation:
o Line 5: the variable n is the number of elements in the memory area we allocate in the
next line.
o Line 6: we declare var as a pointer to type int and we initialize it with the address of the
memory space allocated by the malloc() function. The allocated memory area can store n
(set to 3) values of type int.
o Line 7: we declare p and p_elt as pointers to int.
o Line 9-11: we initialize the elements in the memory area allocated by malloc().
o Line 12: the pointer p_elt is initialized to the address of the last element var[2].
o Line 14: the size of the object (of type int) pointed to by the pointer var is displayed: in
our computer, a value of type int fits in 4 bytes (32 bits).
o Line 15: the pointer p is assigned the value stored in p_elt. We display the addresses held
in both the pointers p and p_elt. In our computer, the pointer p_elt stored the value 80610d0.
o Line 16: the prefix expression --p_elt decrements the pointer p_elt by the size of the type
it points to (int) and evaluates to the resulting pointer: it is equivalent to the expression
p_elt = p_elt - 1. In our computer, the operation produced the address 80610d0-4=80610cc
that is then also assigned to the pointer p. The printf() function displays the addresses
and the values the pointers p_elt and p point to.


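Obviously, do not use invalid pointers. A pointer may only move within the allocated area
(or to the position just past its last element): decrementing it below the first element
yields an invalid pointer and the behavior of the program is then undefined. The following
example contains such an error: the last pointers are invalid and the value displayed for
them is meaningless: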
$ cat prefix_dec4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int nb_element = 2;
int *var = malloc(nb_element * sizeof *var) ;
int *p_elt, *p;

var[0] = 10;
var[1] = 11;
p_elt = &var[1];

printf("sizeof int=%zu\n", sizeof *var);
p=p_elt; printf("p=%p and var=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);
p=--p_elt; printf("p=%p and var=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);

/* the following pointers p and p_elt are invalid */
p=--p_elt; printf("p=%p and var=%p. *p=%d and *p_elt=%d\n", (void *)p, (void *)p_elt, *p, *p_elt);

return EXIT_SUCCESS;
}
$ gcc -o prefix_dec4 -std=c99 -pedantic prefix_dec4.c
$ ./prefix_dec4
sizeof int=4
p=80610cc and var=80610cc. *p=11 and *p_elt=11
p=80610c8 and var=80610c8. *p=10 and *p_elt=10
p=80610c4 and var=80610c4. *p=0 and *p_elt=0

IV.8.3 Postfix increment operator


The postfix increment operator is a unary operator taking one operand[40] having real or
pointer type. It follows its operand as shown below:

var++

The expression var++ evaluates to the value stored in the operand var and then increments
the value of var. For instance, if v=5, the expression v++ evaluates to the value 5 and then
alters the variable v to 6 as shown below:


$ cat postfix_inc1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 5;
int w = v++;

printf("v=%d and w=%d\n", v, w);
return EXIT_SUCCESS;
}
$ gcc -o postfix_inc1 -std=c99 -pedantic postfix_inc1.c
$ ./postfix_inc1
v=6 and w=5

If the operand is a pointer, the operation evaluates to the value of its operand and then
changes it to the address of the next object. That is, if var is a pointer, the expression var++
evaluates to the pointer var and then sets the value of the pointer var to var + sizeof *var as
shown below:
$ cat postfix_inc2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int nb_element = 3;
int *var = malloc(nb_element * sizeof *var) ;
int *p;
var[0] = 10;
var[1] = 11;
var[2] = 17;


printf("sizeof int=%zu\n", sizeof *var);
printf("var[0]=%d at address %p\n", var[0], (void *)&var[0]);
printf("var[1]=%d at address %p\n", var[1], (void *)&var[1]);
printf("var[2]=%d at address %p\n", var[2], (void *)&var[2]);

printf("\nBefore postfix expression. var=%p. *v=%d\n", (void *)var, *var);
p=var++; printf("After p=var++. p=%p and var=%p. *p=%d and *v=%d\n", (void *)p, (void *)var, *p, *var);
p=var++; printf("After p=var++. p=%p and var=%p. *p=%d and *v=%d\n", (void *)p, (void *)var, *p, *var);
/* var now points just past the last element: it must not be dereferenced */
p=var++; printf("After p=var++. p=%p and var=%p. *p=%d\n", (void *)p, (void *)var, *p);



return EXIT_SUCCESS;
}
$ gcc -o postfix_inc2 -std=c99 -pedantic postfix_inc2.c
$ ./postfix_inc2
sizeof int=4
var[0]=10 at address 8061200
var[1]=11 at address 8061204
var[2]=17 at address 8061208

Before postfix expression. var=8061200. *v=10
After p=var++. p=8061200 and var=8061204. *p=10 and *v=11
After p=var++. p=8061204 and var=8061208. *p=11 and *v=17
After p=var++. p=8061208 and var=806120c. *p=17

IV.8.4 Postfix decrement operator


The postfix decrement operator works in the same way as the postfix increment operator
but instead of incrementing the value of its operand, it decrements it. It has the following
form:
var--

The expression var-- evaluates to the value of var and then decrements the value of var. For
instance, if v=5 then the expression v-- evaluates to 5 and v contains 4 as shown below:
$ cat postfix_dec1.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
int v = 5;
int w = v--;

printf("v=%d and w=%d\n", v, w);
return EXIT_SUCCESS;
}
$ gcc -o postfix_dec1 -std=c99 -pedantic postfix_dec1.c
$ ./postfix_dec1
v=4 and w=5

If the operand is a pointer, the operation evaluates to the pointer and then changes it to the
address of the previous object. That is, if var is a pointer, the expression var-- evaluates to
the pointer var and then moves it back by sizeof *var bytes as shown below:
$ cat postfix_dec2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int nb_element = 3;
int *var = malloc(nb_element * sizeof *var) ;
int *p, *p_elt;
var[0] = 10;
var[1] = 11;
var[2] = 17;
p_elt = &var[2];

printf("sizeof referenced objects=%zu Bytes\n", sizeof *var);
printf("var[0]=%d at address %p\n", var[0], (void *)&var[0]);
printf("var[1]=%d at address %p\n", var[1], (void *)&var[1]);
printf("var[2]=%d at address %p\n", var[2], (void *)&var[2]);

printf("\nBefore postfix expression. Last element p_elt=%p. *p_elt=%d\n", (void *)p_elt, *p_elt);
p=p_elt--; printf("After p=p_elt--. p=%p and p_elt=%p. *p=%d and * p_elt=%d\n", (void *)p, (void *)p_elt, *p, * p_elt);
p=p_elt--; printf("After p=p_elt--. p=%p and p_elt=%p. *p=%d and * p_elt=%d\n", (void *)p, (void *)p_elt, *p, * p_elt);

return EXIT_SUCCESS;
}
$ gcc -o postfix_dec2 -std=c99 -pedantic postfix_dec2.c
$ ./postfix_dec2
sizeof referenced objects=4 Bytes
var[0]=10 at address 80611d8
var[1]=11 at address 80611dc
var[2]=17 at address 80611e0

Before postfix expression. Last element p_elt=80611e0. *p_elt=17
After p=p_elt--. p=80611e0 and p_elt=80611dc. *p=17 and * p_elt=11
After p=p_elt--. p=80611dc and p_elt=80611d8. *p=11 and * p_elt=10

IV.8.5 Subscript operator


When we talked about arrays and pointers, we said there were two methods to access an
object stored in an array or in a memory area pointed to by a pointer: by using the
operator [] or *. The operator denoted by [], known as the subscript operator, takes two
operands: the operand preceding the left square bracket is the name of a pointer or an
array, and the operand between the square brackets is an expression that evaluates to an
integer number. It evaluates to an element of an array. The general form is given below:
arr[E]

Where:
o arr is the name of an array or a pointer
o E is an expression that evaluates to an integer value. If the expression E evaluates to the
integer number n, arr[n] denotes the object located at index n of the array arr. Since
indexes start at 0, this is the (n+1)th element of the array.

If the expression E evaluates to an integer n, the expression arr[n] is equivalent to *(arr + n).

Here is an example:
$ cat subscript1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int nb_element = 3;
int *iList = malloc(nb_element * sizeof *iList) ;

iList[0] = 10;
iList[1] = 11;
iList[2] = 17;

printf("iList[0]=%d\n", iList[0]);
printf("iList[1]=%d\n", iList[1]);
printf("iList[2]=%d\n", iList[2]);

return EXIT_SUCCESS;
}
$ gcc -o subscript1 -std=c99 -pedantic subscript1.c
$ ./subscript1
iList[0]=10
iList[1]=11
iList[2]=17

We can use the postfix increment operator to produce a program that is equivalent:

$ cat subscript2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int nb_element = 3;
int *iList = malloc(nb_element * sizeof *iList) ;
int i = 0;

iList[i] = 10; i++;
iList[i] = 11; i++;
iList[i] = 17;

i=0;
printf("iList[0]=%d\n", iList[i]); i++;
printf("iList[1]=%d\n", iList[i]); i++;
printf("iList[2]=%d\n", iList[i]);

return EXIT_SUCCESS;
}
$ gcc -o subscript2 -std=c99 -pedantic subscript2.c
$ ./subscript2
iList[0]=10
iList[1]=11
iList[2]=17

IV.8.6 sizeof
sizeof E
sizeof(obj_type)

Where:
o E is an expression. Parentheses around the expression can be omitted but if E contains
several operators, you may have to resort to parentheses to prevent the sizeof operator
from taking precedence over the operators of the expression.
o obj_type is a type name.

The sizeof operator takes a single operand and returns its size in bytes. The type of the
value returned by the sizeof operator is size_t, an unsigned integer type defined by the
implementation. With printf(), a value of type size_t is displayed with the specifier %zu.

The operand can be a type or an expression. If the operand is a type, it must be surrounded
by parentheses. If the operand is an expression, sizeof returns the size of the type of the
expression.

Take note you may have to use parentheses around the expression if it is composed of
operators: as a unary operator, sizeof has precedence over binary operators such as +.

Here is an example:
$ cat sizeof_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x =10;
double f = 1.2;

printf ("sizeof(int)=%zu\n", sizeof(int));
printf ("sizeof(double)=%zu\n", sizeof(double));
printf ("sizeof x=%zu\n", sizeof x);
printf ("sizeof f=%zu\n", sizeof f);
printf ("sizeof(x + 1)=%zu\n", sizeof(x + 1) );
printf ("sizeof(f + 1)=%zu\n", sizeof(f + 1) );

return EXIT_SUCCESS;
}
$ gcc -o sizeof_op1 -std=c99 -pedantic sizeof_op1.c
$ ./sizeof_op1
sizeof(int)=4
sizeof(double)=8
sizeof x=4
sizeof f=8
sizeof(x + 1)=4
sizeof(f + 1)=8

In the example above, we surrounded the expressions x+1 and f+1 with parentheses to
prevent the sizeof operator from taking precedence over the addition: the expression
sizeof x + 1 would compute the size of the variable x, and then add 1 to it as shown
below:
$ cat sizeof_op2.c
#include <stdio.h>
#include <stdlib.h>


int main(void) {
int x =10;

printf ("sizeof(x + 1)=%zu\n", sizeof(x + 1) );
printf ("sizeof x + 1=%zu\n", sizeof x + 1 );

return EXIT_SUCCESS;
}
$ gcc -o sizeof_op2 -std=c99 -pedantic sizeof_op2.c
$ ./sizeof_op2
sizeof(x + 1)=4
sizeof x + 1=5

It is interesting to note the operand of sizeof is evaluated only if its type is a VLA
(variable-length array) type. Otherwise, the operand is not evaluated and the value of the
sizeof expression is an integer constant[41]. Try this:
$ cat sizeof_op3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 10;
int y = sizeof(++x);

printf ("x=%d\ny=%d\n", x, y );

return EXIT_SUCCESS;
}
$ gcc -o sizeof_op3 -std=c99 -pedantic sizeof_op3.c
$ ./sizeof_op3
x=10
y=4

As shown above, the expression ++x is not evaluated within the sizeof operator.

IV.9 lvalue
We talked about lvalues in Chapter II Section II.9. Here, we refine our definition. Usually,
in programming, the word lvalue refers to a modifiable variable that can appear on the left
side of the assignment operator =. An rvalue is any expression that appears on the right
side of the assignment operator: lvalue=rvalue. This implies an lvalue can be altered. In C,
such a definition is insufficient: an expression can be an lvalue and an lvalue may not be
alterable!

An lvalue is an expression that refers to an object. That is, it refers to a storage region
identified by an address that can hold a piece of data. Practically, if you can get the
address of the resulting value of an expression that represents an object, it is an lvalue. For
example:
o A variable is an lvalue
o A pointer is an lvalue
o If p is a pointer, *p is an lvalue
o An array is an lvalue
o If p is a pointer, the expression *(p+1) is an lvalue since *(p+1) refers to an object.

The following items are not lvalues:
o The constant 12 is not an lvalue
o If v is a variable, the expression v+1 is not an lvalue: v+1 does not refer to an object but to
a value of an expression. If you try to do something like this &(v+1), you will get an error.
o If f is a function, f is not an lvalue: it does not refer to an object but a piece of code.
o If v is an lvalue, &v is not an lvalue but the value of an expression that is the address of
the lvalue.
o If v is an lvalue, sizeof v is not an lvalue but the value of an expression that is the size of
the lvalue.

The following example fails to compile:
$ cat lvalue1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int v;

v+1=10; /* fails: not lvalue */
12 = 1; /* fails: not a lvalue */
&v=10; /* fails: not a lvalue */

return EXIT_SUCCESS;
}

$ gcc -o lvalue1 -std=c99 -pedantic lvalue1.c
lvalue1.c: In function 'main':
lvalue1.c:7:3: error: lvalue required as left operand of assignment
lvalue1.c:8:3: error: lvalue required as left operand of assignment
lvalue1.c:9:3: error: lvalue required as left operand of assignment

In C, some lvalues are not alterable:


o Arrays cannot be altered
o Constant variables and pointers (declared with the type qualifier const)
o Structures and unions having members declared with the type qualifier const are not
modifiable (see Chapter VI)
o lvalues that have incomplete type other than void (see Chapter VIII Section VIII.6.3.2)

The following example attempts to modify lvalues that are not modifiable:
$ cat lvalue2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int const v; /* constant variable: read-only lvalue */

/* structure my_int containing a read-only member called i */
struct my_int {
int const i;
} str;

v=10; /* fails: not modifiable lvalue */
str.i = 10 ; /* fails: not modifiable lvalue */

return EXIT_SUCCESS;
}
$ gcc -o lvalue2 -std=c99 -pedantic lvalue2.c
lvalue2.c: In function 'main':
lvalue2.c:12:3: error: assignment of read-only variable 'v'
lvalue2.c:13:3: error: assignment of read-only member 'i'


There is an important rule that you have to keep in mind in order to understand the
underlying logic of conversions: qualifiers are discarded from the type of the value of an
lvalue. An lvalue has a type and evaluates to a value. If the lvalue has a qualified type, its
value has the unqualified version of that type. Otherwise, if the lvalue does not have a
qualified type, both the lvalue and its value have the same type. For example:
int x = 10;
int y = x ; // x is an lvalue, its value 10 has the same type int

const int v = 10;
int w = v ; /* v is an lvalue, it has the const-qualified type const int,
but its value is of type int
*/

int *const p = &x;
int *q = p ; /* p is an lvalue, it has the const-qualified type int *const,
but its value is of type int *
*/

IV.10 Assignment operators


The C language specifies several ways to assign a value resulting from the evaluation of
expressions to a variable. We first start with the simple assignment that we have already
studied.

IV.10.1 Simple assignment


Assigning a value of an expression to an lvalue takes the following form:
var=expr

Where:
o var is an lvalue such as the name of a variable, an element of an array or a pointer.
Anything that stores a value and can be modified can be put on the left side of the
assignment operator.
o expr is an expression

The simple assignment is composed of three elements: the operator =, an lvalue located on
the left hand of the operator and an rvalue on the right hand of the operator.

Keep in mind, the simple assignment operation performs two tasks:
o It evaluates the rvalue and assigns its value to the lvalue.
o It evaluates to the value of the rvalue. This means that the assignment expression
evaluates to the value of expr.


As a consequence, since c=1 also evaluates to the value 1, we could write something like
a=b=c=1. The assignments are performed from right to left, as shown below:
$ cat assign_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int a,b,c,d;

a=b=c=d=10;

printf ("a=%d, b=%d, c=%d, d=%d\n", a, b, c, d);
return EXIT_SUCCESS;
}
$ gcc -o assign_op1 -std=c99 -pedantic assign_op1.c
$ ./assign_op1
a=10, b=10, c=10, d=10

The rvalue can be an expression much more sophisticated than a simple variable or literal:
it can be composed of several operations.
$ cat assign_op2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float f;
float v = 1.9;

f=10*2.7/v-2;

printf ("f=%f\n", f);
return EXIT_SUCCESS;
}
$ gcc -o assign_op2 -std=c99 -pedantic assign_op2.c
$ ./assign_op2
f=12.210526

While assigning a value to an lvalue, an implicit conversion may occur. The assignment
operation evaluates the rvalue, converts its value (if it can) to the type of the lvalue, then
assigns the converted value to the lvalue and returns it. In the following example, the value
of the expression v+1.2 is converted to type int that is the type of the variable i:

$ cat assign_op3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float f;
float v = 1.3;
int i;

i = f = v + 1.2; printf("f=%f and i=%d\n", f, i);
f = i = v + 1.2; printf("f=%f and i=%d\n", f, i);

return EXIT_SUCCESS;
}
$ gcc -o assign_op3 -std=c99 -pedantic assign_op3.c
$ ./assign_op3
f=2.500000 and i=2
f=2.000000 and i=2

Can you see the difference between the two simple assignment operations?
o Let us consider the first expression i = f = v + 1.2. First, the expression v + 1.2 evaluated to
the floating number 2.5. In the second step, that value was assigned to the variable f
having the type float, which keeps the fractional part. The simple assignment itself
evaluates to the value 2.5. Then, that value was implicitly converted to type int to yield
the integer number 2 that was finally assigned to the variable i of type int.
o The same process occurred for the second expression f = i = v + 1.2. First, the expression v +
1.2 evaluated to the floating number 2.5. In the second step, that value was implicitly
converted to type int to yield the integer number 2 before being assigned to the variable i
having the type int. That assignment returned the integer number 2 that was finally
assigned to the variable f.

In the following program, we assign a variable and we test the value of another variable in
the same relational expression:
$ cat assign_op4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int const val = 4;
int x;
int y = 8;


(x=val) < y ? printf("y=%d and x = %d. y > x\n", y, x)
: printf("y=%d and x = %d. y < x\n", y, x) ;

return EXIT_SUCCESS;
}
$ gcc -o assign_op4 -std=c99 -pedantic assign_op4.c
$ ./assign_op4
y=8 and x = 4. y > x

The simple assignment operator can work with other types than arithmetic values such as
pointers, strings, or user-defined types we will describe later. In the following example,
an array is initialized with a string literal in its declaration:
$ cat assign_op5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a[20] = "Wonderful";

printf("a=%s\n", a);

return EXIT_SUCCESS;
}
$ gcc -o assign_op5 -std=c99 -pedantic assign_op5.c
$ ./assign_op5
a=Wonderful

As we explained in detail, you can assign a string literal to an array only at the time of
declaration. The following example is not equivalent to the previous one. It is erroneous
and cannot be compiled:
$ cat assign_op6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a[20];

a = "Wonderful";
printf("a=%s\n", a);

return EXIT_SUCCESS;

}
$ gcc -o assign_op6 -std=c99 -pedantic assign_op6.c
assign_op6.c: In function 'main':
assign_op6.c:7:5: error: incompatible types when assigning to type 'char[20]' from type 'char *'
a = "Wonderful";
^

After the declaration of an array, you can no longer assign it a value: you can only assign
its elements individually or invoke a copy function such as strcpy() to copy data into it.

Pointers in assignment operations work as variables. The following assignment involves a
pointer:
$ cat assign_op7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *p = "Wonderful";

printf("p=%s\n", p);

return EXIT_SUCCESS;
}

In the example, the pointer p pointed to the string literal Wonderful. That is, the address of
the string literal was assigned to the pointer p. This should not be confused with the
previous example in which the string literal Wonderful was copied into the array a.

You may be tempted to write cryptic programs as you master the C language. Remember,
it is always better to have a program that is easy to read. The C language allows you to
perform several tasks in a very condensed way, and this could become a problem when you
have to debug your programs if you abuse this facility.

IV.10.2 Compound assignments


The C language specifies several compound assignments that are just handy shortcuts.
They take the following form:
var op= expr

Where:
o op is one of the following binary operators: +, -, /, %, *, ^, |, &, << and >>.
o expr is an expression.
o var is an lvalue that can be a variable, an element of an array or a pointer

The syntax is equivalent to var = var op (expr), except that var is evaluated only once.

For example, x += 1 is the same as x = x + 1 that means incrementing the value of the variable
x and placing the result in it, which is also the value of the expression. In the examples
given in Table IV11, the x variable holds the value of 2 before the assignments.

Table IV11 Compound assignments


Here is an example:
$ cat compound_assign_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x;
x = 2; x += 5; printf("x = 2; x += 5; x=%d\n", x);
x = 2; x *= 2; printf("x = 2; x *= 2; x=%d\n", x);
x = 2; x %= 2; printf("x = 2; x %%= 2; x=%d\n", x);

return EXIT_SUCCESS;
}
$ gcc -o compound_assign_op1 -std=c99 -pedantic compound_assign_op1.c
$ ./compound_assign_op1
x = 2; x += 5; x=7
x = 2; x *= 2; x=4
x = 2; x %= 2; x=0

IV.11 Ternary conditional operator


The ternary conditional operation takes three operands and returns the value of its second
or third operand. It has the following syntax:
condition ? expr:alternate_expr

Where:
o The first operand condition is an expression that evaluates to true (nonzero value) or false
(zero). However, be aware that the expression cannot contain assignment operators
unless they lie in parentheses (see section IV.13).
o expr is an expression.
o alternate_expr is an expression but, unlike the second operand, not any expression. It
cannot contain assignment operators unless they are between parentheses because the
ternary operator has precedence over assignment operators as we will find out in section
IV.13.
o The value of the ternary expression is either the value of expr or alternate_expr depending
on the expression condition.
o Blanks around ? and : are permitted.
o Newlines after ? and after : are permitted.

Thus, if the expression condition is true (any nonzero value), the expression expr is evaluated
and the ternary expression takes this value. Otherwise, the expression alternate_expr is
evaluated and its value is taken.

Here is a very basic example:
$ cat ternary_cond_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *s;

int x;

x=0; s = x ? "TRUE" : "FALSE" ; printf ("if x=%d, s=%s\n", x, s);
x=10; s = x ? "TRUE" : "FALSE" ; printf ("if x=%d, s=%s\n", x, s);
x=-1; s = x ? "TRUE" : "FALSE" ; printf ("if x=%d, s=%s\n", x, s);
}
$ gcc -o ternary_cond_op1 -std=c99 -pedantic ternary_cond_op1.c
$ ./ternary_cond_op1
if x=0, s=FALSE
if x=10, s=TRUE
if x=-1, s=TRUE

In the example above, we notice the ternary conditional operator has precedence over the
simple assignment operator. That is, it is evaluated before the assignment occurs. In our
example, the ternary conditional operator evaluates to a string but it can yield a value of
any type depending on its operands. In the following example, the second operand is an int
constant and the third a floating constant. Since both are arithmetic, the result is
converted to their common type (here double) whichever operand is selected:
$ cat ternary_cond_op2.c
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int main(int argc, char **argv) {
6 char *program_name = argv[0];
7 char *type_pi;
8 float pi;
9
10 if (argc < 2) {
11 printf("USAGE: %s {int|float}\n", program_name );
12 printf("argument can be int or float\n");
13 return EXIT_FAILURE;
14 }
15
16 type_pi = argv[1];
17 if ( strcmp(type_pi, "int") && strcmp(type_pi, "float") ) {
18 printf("USAGE: %s {int|float}\n", program_name );
19 printf("Unknown argument %s. Argument must be int or float\n", type_pi);
20 return EXIT_FAILURE;
21 }
22
23 pi = !strcmp(type_pi, "int") ? 3 : 3.14159;
24 printf ("pi=%f\n", pi);
25

26 return EXIT_SUCCESS;
27 }
$ gcc -o ternary_cond_op2 -std=c99 -pedantic ternary_cond_op2.c
$ ./ternary_cond_op2 int
pi=3.000000
$ ./ternary_cond_op2 float
pi=3.141590

Explanation:
o Line 5: the main() function is defined with two arguments. The first one argc is meant for
storing the number of arguments of the program including the program name. The
second argument argv is an array of strings that will store the arguments: argv[0] holds the
program name, argv[1] the first argument
o Lines 10-14: since the program expects one argument, we check the user has actually
provided one. Otherwise, a little help is displayed explaining how to use the program.
o Line 16: We store the first argument argv[1] in the variable type_pi.
o Lines 17-21: The logical expression strcmp(type_pi, "int") && strcmp(type_pi, "float")
evaluates to 1 (true) if the variable type_pi holds a string different from both int and
float, since strcmp() returns 0 only when the two strings are equal. In this case, we display
a message indicating the expected argument has to be the string float or int.
o Line 23: the ternary operation yields 3 if the passed argument is int. Otherwise, it yields
3.14159. The resulting value is assigned to the pi variable.
o Line 24: we display the value of the variable pi.

Keep in mind that the first and the third operand are particular expressions. Assignment
operations are part of them only if they are enclosed between parentheses. Let us consider
the following example:
$ cat ternary_cond_op3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x, y=10;
float f;

f = x = y ? 3.14159 : 3 ;
printf ("x=%d,y=%d and f=%f\n", x, y, f);

return EXIT_SUCCESS;
}
$ gcc -o ternary_cond_op3 -std=c99 -pedantic ternary_cond_op3.c

$ ./ternary_cond_op3
x=3,y=10 and f=3.000000

In our example above, the first operand is not x = y as you may think but y. The expression f
= x = y ? 3.14159 : 3 is equivalent to f = x = (y ? 3.14159 : 3). Since y is different from zero, the
ternary operation evaluates to 3.14159 and since x has an integer type, an implicit cast is
performed. Thus, the value 3 is stored in x and then in f.

Compare with the following code:
$ cat ternary_cond_op4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x, y=10;
float f;

f = (x = y) ? 3.14159 : 3 ;
printf ("x=%d,y=%d and f=%f\n", x, y, f);

return EXIT_SUCCESS;
}
$ gcc -o ternary_cond_op4 -std=c99 -pedantic ternary_cond_op4.c
$ ./ternary_cond_op4
x=10,y=10 and f=3.141590

In example ternary_cond_op4.c, the first operand of the ternary operator is (x = y). The first
operand is evaluated, the variable x is assigned the value of the variable y and the
expression evaluates to the value taken from y. Since the expression evaluates to 10, a
value different from zero, the ternary operation evaluates to the value of the second
expression 3.14159 that is finally assigned to the variable f.

You can use assignment operations in the second operand without resorting to parentheses:
$ cat ternary_cond_op5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x, y=10;
float f;

f = y ? x = 3 : 3.14159;
printf ("x=%d,y=%d and f=%f\n", x, y, f);

return EXIT_SUCCESS;
}
$ gcc -o ternary_cond_op5 -std=c99 -pedantic ternary_cond_op5.c
$ ./ternary_cond_op5
x=3,y=10 and f=3.000000

IV.12 Comma operator


expr1, expr2, ..., exprN

Where:
o expr1, expr2, ..., exprN are expressions.

The expressions expr1, expr2, ..., exprN are evaluated sequentially, from left to right. The value of the
comma expression is the value of the last expression exprN. The comma operator has the
lowest precedence (see next section).

The comma operator has nothing to do with the comma separator used in declarations. In
the following example, we declare three variables as int using a comma that is not a
comma operator.
$ cat comma_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x, y=10, z=9;

return EXIT_SUCCESS;
}

In the following example, we use the comma operator between two expressions executed
sequentially:
$ cat comma_op2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {

int i, x, y;

i = ( x=1+2, y=2*10 ); /* comma operator */

printf("x=%d, y=%d, i=%d\n", x, y, i);
return EXIT_SUCCESS;
}
$ gcc -o comma_op2 -std=c99 -pedantic comma_op2.c
$ ./comma_op2
x=3, y=20, i=20

We used the parentheses because the assignment operator has precedence over the comma
operator.

The comma operator is not often used. It is sometimes used in the for loop described in the
next chapter.

IV.13 Operator precedence


The C language allows you to build expressions involving several operators. The problem
is in which order will the computer perform the calculations? For example, without any
specific rule, the expression 2*6+2 may be evaluated in two ways:
o If the addition is performed first, the expression evaluates to 16: 2*6+2=2*8=16.
o If the multiplication is carried out first, the expression evaluates to 14: 2*6+2=12+2=14.

Accordingly, in the same way as we do it in mathematics, we define precedence for
operators. In C, we have precedence rules indicating the evaluation order of operations.
For example, in C, as in mathematics, the multiplication operator has precedence over
addition, so, 2*6+2 evaluates to 14. Table IV-12 lists the operators from the highest to
lowest precedence.

Table IV-12 Operator precedence in decreasing order

In Table IV-12, E1, E2 and E are expressions and var is an lvalue (variable, element of an
array). Notice that we introduced two new operators that we will discuss in Chapter
VI: the member-access operators . and ->. They allow accessing members of unions and
structures.

The following example shows the increment operators take precedence over the
multiplication operator:
$ cat precedence_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int a = 1 ;
int b = 2 * a++;

int c = 1;
int d = 2 * ++c;

printf("a=%d and b = %d\n", a, b);
printf("c=%d and d = %d\n", c, d);


return EXIT_SUCCESS;
}
$ ./precedence_op1
a=2 and b = 2
c=2 and d = 4

The parentheses allow you to modify the operator precedence. For example, 2 * 6 + 2
evaluates to 14. With parentheses, you can change the precedence by evaluating the
addition first. Thus, 2 * (6+2) evaluates to 16.

If you are in doubt about evaluation order in expressions, use parentheses. Also resort to parentheses to make
expressions easier to read.


How do expressions evaluate if operators have the same precedence? For certain operators
such as addition, this is not a problem: the expression evaluates to the same value whatever the
evaluation order may be (for example, 1+2+9). However, the evaluation order is relevant for
other operations such as division: for example 12/3/2/2. To resolve the issue,
associativity specifies the evaluation order: from left to right (left associativity) or
from right to left (right associativity). For instance, since the division
operator is left-associative, the expression 12/3/2/2 is equivalent to ((12/3)/2)/2, which evaluates to 1
(grouping from right to left, as in 12/(3/(2/2)), would give 4 instead). Let us
consider another example:
$ cat precedence_op2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int a = 1;
int b;
int d = 2 * (b=a);

printf("a = %d, b = %d and d = %d\n", a, b, d);

return EXIT_SUCCESS;
}
$ gcc -o precedence_op2 -std=c99 -pedantic precedence_op2.c
$ ./precedence_op2
a = 1, b = 1 and d = 2

The expression d = 2 * (b=a) is evaluated in several steps:


o Parentheses take precedence over the multiplication: the expression b=a is evaluated
first. The variable b is assigned the value of the variable a. Then, the expression evaluates
to the value of the variable a that is 1. Thus, b holds the value 1 and the expression b=a
evaluates to 1.
o The multiplication operation d = 2 * (b=a) evaluates to 2 * 1 = 2. Therefore, d holds the value
2.

You could wonder why we have used the parentheses. Try the same example without
parentheses:
$ cat precedence_op3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int a = 1;
int b;
int d = 2 * b=a;

printf("a = %d, b = %d and d = %d\n", a, b, d);

return EXIT_SUCCESS;
}
$ gcc -o precedence_op3 -std=c99 -pedantic precedence_op3.c
precedence_op3.c: In function 'main':
precedence_op3.c:7:4: error: lvalue required as left operand of assignment

The compilation failed. Can you see why? The compiler gave an explanation. If you have
a look at Table IV-12, you can notice the assignment operators have a lower precedence
than the multiplication operator and are right-associative, which means the expression d = 2 * b=a is equivalent to d = ( (2 *
b) = a ). The problem is the expression 2*b is not an lvalue. Consequently, the expression
(2*b)=a is invalid.

The error in the example above appears now more obvious. The following example shows
the same symptom, yet it is not glaringly obvious:

$ cat precedence_op4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 6;
int y = 7;
int res;

res = x > y ? x : x = y;

printf("x=%d y=%d res=%d\n", x, y, res);
return EXIT_SUCCESS;
}
$ gcc -o precedence_op4 -std=c99 -pedantic precedence_op4.c
precedence_op4.c: In function 'main':
precedence_op4.c:9:4: error: lvalue required as left operand of assignment

In the example above, the expression res = x > y ? x : x = y seems to be the same as:

if ( x > y) {
res = x;
} else {
res = x = y;
}


However, this is not the case. Why? Because the third operand of the ternary operator is
not x = y but x! Remember that the = operator is an assignment operator and its precedence
is lower than that of the ternary operator. This means that x > y ? x : x = y is equivalent to (x >
y ? x : x) = y. As you may have guessed, the ternary operation cannot be an lvalue and therefore
generates a compilation error.

Why is the expression res = x > y ? x : x = y not equivalent to ( res = (x > y ? x : x) ) = y but to res = (
(x > y ? x : x) = y )? Because the simple assignment operator is right-associative.

Now, we can write a correct version of the example precedence_op4.c:
$ cat precedence_op5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 6;
int y = 7;
int res;

res = x > y ? x : (x = y);

printf("x=%d y=%d res=%d\n", x, y, res);
return EXIT_SUCCESS;
}
$ gcc -o precedence_op5 -std=c99 -pedantic precedence_op5.c
$ ./precedence_op5
x=7 y=7 res=7

OK, you have gotten it but why does the following code work without parentheses?
$ cat precedence_op6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int x = 6;
int y = 7;
int res;

res = x < y ? x = y : x;

printf("x=%d y=%d res=%d\n", x, y, res);
return EXIT_SUCCESS;
}
$ gcc -o precedence_op6 -std=c99 -pedantic precedence_op6.c
$ ./precedence_op6
x=7 y=7 res=7

A clue? If you remember what we said about the ternary condition operator, the first and
third operands cannot be arbitrary expressions: unlike the second operand, they cannot contain
assignment operators unless they are between parentheses. The second operand can contain
assignment operators without using parentheses.

IV.14 Type conversion


We end the chapter with a very important point: the conversion of types. The subject may
appear tricky for beginners not because it is difficult but mainly because several kinds
of type conversions may be involved. Let us start with the integer conversion ranks and
integer promotions.

IV.14.1 Integer conversion rank


The C language has several integer types: char, signed char, unsigned char, short, unsigned short,
int, unsigned int, long, unsigned long, long long, unsigned long long. In some specific conditions,
described in the next section, the compiler automatically converts an integer type to
another integer type of higher rank according to the conversion rank order depicted in
Figure IV-7.

Figure IV-7 Integer conversion rank

In Figure IV-7, we can see the type _Bool has the lowest conversion rank and the types char,
signed char and unsigned char have the same conversion rank. If an implementation introduces
new types (extended types), they also have a conversion rank, described in its
documentation.

IV.14.2 Integer promotions


In expressions[42][43] expecting operands of arithmetic types, integer types of lower rank
than that of type int are converted to int if int can represent all their values, or to unsigned int
otherwise: this is known as integer promotions. In the following example, the operands a
and b of type char are first promoted to type int before carrying out the addition:
$ cat integer_promotion1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a = 120;
char b = 120;
int c;

c = a + b;
printf("a=%d, b=%d, c=a+b=%d+%d=%d\n", a, b, a, b, c);
return EXIT_SUCCESS;
}
$ gcc -o integer_promotion1 -std=c99 -pedantic integer_promotion1.c
$ ./integer_promotion1
a=120, b=120, c=a+b=120+120=240

On our computer, the type char is represented by one byte while int is represented by four
bytes. The following example shows the addition promotes its operands to int and then
evaluates to an int:
$ cat integer_promotion2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char a = 120;
char b = 120;

printf("sizeof a=%zu, sizeof b=%zu, sizeof(a+b)=%zu \n", sizeof a, sizeof b , sizeof(a+b));
return EXIT_SUCCESS;
}
$ gcc -o integer_promotion2 -std=c99 -pedantic integer_promotion2.c
$ ./integer_promotion2
sizeof a=1, sizeof b=1, sizeof(a+b)=4

Of course, the integer promotions are silently performed and you do not have to worry
about them. They are only the very first step of a process known as integer conversions. However,
you must watch out for the integer conversions described in section IV.14.4 because they
may lead to unexpected behaviors when you mix unsigned and signed operands in your
expressions.

IV.14.3 Conversions and unary operators


Only the integer promotions apply to unary operators since they have a single operand:
unary plus +, unary minus -, and unary bitwise not ~ (bitwise complement). If the operand
has a type with lower rank than that of int, the integer promotions promote the operand to
int or unsigned int as appropriate, which is also the type of the result.

Though the bitwise shift operator is binary, only the integer promotions apply to its
operands. The resulting value has the type of the left operand after the integer promotions.

In the following example, the unary operator promotes the integer types unsigned short and
unsigned char to int before carrying out the operation. In all cases, the type of the expression
is the type of the operand after the integer promotions.
$ cat unary_promot1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned short h = 1;
unsigned int i = 1;
unsigned char j = 1;

long long x;

x = -h; printf("x=%lld\n", x); //h promoted to int, type of -h is int
x = -i; printf("x=%lld\n", x); //No conversion. type of -i is unsigned int
x = -j; printf("x=%lld\n", x); //j promoted to int, type of -j is int

return EXIT_SUCCESS;
}
$ gcc -o unary_promot1 -std=c99 -pedantic unary_promot1.c
$ ./unary_promot1
x=-1
x=4294967295
x=-1

IV.14.4 Conversions and binary operators


Integer conversions, more generally usual arithmetic conversions, occur within
expressions composed of binary operators. Consider the following example:
$ cat integer_conversion1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 100;
signed int b = -1;

if (b < a) {
printf("%d < %u\n", b, a);
} else {
printf("%d > %u\n", b, a);
}
return EXIT_SUCCESS;
}

Could you guess the output? Here it is:
$ gcc -o integer_conversion1 -std=c99 -pedantic integer_conversion1.c
$ ./integer_conversion1
-1 > 100

Incredible, isn't it? The cause lies in the integer conversions automatically
performed by the compiler.

As explained earlier, the integer promotions convert an integer number smaller than int to
int or unsigned int. After the integer promotions, integer conversions may take place: this
happens within expressions mixing integer numbers of different types. After the integer
promotions, the following rules are applied:
o Rule 1: If the operands have the same type, no conversion is done and the resulting
value has this type.
o Rule 2: Otherwise, if the operands are all signed or all unsigned, the operand having a
type with lower conversion rank is converted to the type of the operand having greater
conversion rank that is also the type of the resulting value.
o Otherwise, if the types unsigned and signed integer are mixed:
  Rule 3: If the unsigned integer operand has a type with conversion rank greater than or
  equal to that of the signed integer operand, the signed integer operand is converted
  to the type of the unsigned integer operand, which is also the type of the resulting value
  of the operation.
  Rule 4: Otherwise, if the signed integer operand has a type with greater conversion
  rank than that of the unsigned integer operand, and can represent all the values of
  the type of the unsigned integer operand, the unsigned integer operand is converted
  to the type of the signed integer operand, which is also the type of the resulting value of
  the operation.
  Rule 5: Otherwise (if the signed integer operand has a type with greater
  conversion rank than that of the unsigned integer operand, but cannot represent all
  the values of the type of the unsigned integer operand), both operands are converted
  to the unsigned integer type corresponding to the type of the signed integer operand.

The integer conversion rule given above is part of a more general rule known as usual
arithmetic conversions (described in the next section). As the integer conversions are
rather tricky, we have split it to ease the understanding. Once understood, the general rule
for converting arithmetic operands will appear clearer. Let us give some examples
depicting the integer conversions:
o Rule 1: If the operands have the same type after the integer promotions, no conversion is
done and the resulting value has this type. In the following example, neither the integer promotions nor the
integer conversions occur since both operands have the same type, unsigned int, which has the same
rank as int.
$ cat integer_conversion2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 100;
unsigned int b = 1;

if (b < a) {
printf("%u < %u\n", b, a);
} else {
printf("%u > %u\n", b, a);
}
return EXIT_SUCCESS;
}


o Rule 2: If the operands are all signed or unsigned, the operand having a type with lower
conversion rank is converted to the type of the operand having greater conversion rank
that is also the type of the resulting value.

$ cat integer_conversion3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 100;
unsigned long long b = 1;

printf("sizeof a=%zu sizeof b=%zu sizeof(a+b)=%zu\n", sizeof a, sizeof b, sizeof(a+b));
printf("%u + %llu = %llu\n", a, b, a+b);
return EXIT_SUCCESS;
}
$ gcc -o integer_conversion3 -std=c99 -pedantic integer_conversion3.c
$ ./integer_conversion3
sizeof a=4 sizeof b=8 sizeof(a+b)=8
100 + 1 = 101

The operand a of the expression a + b is converted to unsigned long long that is also the type
of the returned value.

o If unsigned and signed integer types are mixed:
  Rule 3: if the unsigned integer operand has a type with conversion rank greater than or
  equal to that of the signed integer operand, the signed integer operand is converted
  to the type of the unsigned integer operand, which is also the type of the resulting value
  of the operation.

In the following example, the operand b (operation a > b) is converted to unsigned int:
$ cat integer_conversion4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 5;
int b = -3;
unsigned int c = (unsigned int)b;

if ( a > b ) { /* a and b have type unsigned int */
printf("%u > %d\n", a, b);
} else {
printf("%u < %d\n", a, b);
}

printf("operand b=%d takes the value %u when converted to unsigned int\n", b, c);

return EXIT_SUCCESS;
}
$ gcc -o integer_conversion4 -std=c99 -pedantic integer_conversion4.c
$ ./integer_conversion4
5 < -3
operand b=-3 takes the value 4294967293 when converted to unsigned int

The operand b is negative: when converted to unsigned int, it takes the value 2^32 -
3 = 4294967293 on our computer[44], which explains why the variable a seems to be less
than the variable b. In fact, the evaluated expression is 5 > 4294967293, which is false.

Of course, if the value of b was positive, all would be fine as shown below:
$ cat integer_conversion5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 5;
int b = 3;
unsigned int c = (unsigned int)b;

if ( a > b ) { /* a and b have type unsigned int */
printf("%u > %d\n", a, b);
} else {
printf("%u < %d\n", a, b);
}

printf("operand b=%d takes the value %u when converted to unsigned int\n", b, c);

return EXIT_SUCCESS;
}
$ gcc -o integer_conversion5 -std=c99 -pedantic integer_conversion5.c
$ ./integer_conversion5
5 > 3
operand b=3 takes the value 3 when converted to unsigned int

A positive number of a signed integer type can be represented as an unsigned integer
type with no change but a negative number in a signed integer type is changed to a
positive integer number after converting it to an unsigned integer type.

Here is another example showing another unexpected behavior when mixing signed
and unsigned integer types in a C expression:
$ cat integer_conversion6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 1;
int b = -2;
unsigned int c = (unsigned int)b;

long long int d = a + b; /* b converted to unsigned int */
long long int e = a + c; /* a and c have same type unsigned int */

printf("d=a+b=%u+%d=%lld\n", a, b, d);
printf("e=a+c=%u+%u=%lld\n", a, c, e);

return EXIT_SUCCESS;
}
$ gcc -o integer_conversion6 -std=c99 -pedantic integer_conversion6.c
$ ./integer_conversion6
d=a+b=1+-2=4294967295
e=a+c=1+4294967294=4294967295

In the expression d = a + b, the compiler performs two different conversions:
o The integer conversions convert the operand b to unsigned int (the value of b
becomes 4294967294 on our computer), then the expression a + b is evaluated to 1 +
4294967294 = 4294967295, which is of type unsigned int.
o The resulting value (of type unsigned int) is implicitly converted to the type of the
lvalue d (long long int) that will store it (implicit conversion).



  Rule 4: If the signed integer operand has a type with greater conversion rank than
  that of the unsigned integer operand, and can represent all the values of the type of
  the unsigned integer operand, the unsigned integer operand is converted to the type
  of the signed integer operand, which is also the type of the resulting value of the
  operation.

Unlike example integer_conversion4.c, the following example yields the expected result:
$ cat integer_conversion7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 5;
long long int b = -1;

if ( a > b ) { /* a and b have type long long int */
printf("%u > %lld\n", a, b);
} else {
printf("%u < %lld\n", a, b);
}

return EXIT_SUCCESS;
}
$ gcc -o integer_conversion7 -std=c99 -pedantic integer_conversion7.c
$ ./integer_conversion7
5 > -1

It works as expected because the unsigned integer variable a is converted to type long
long int. The conversion rank of long long int is greater than that of unsigned int. Moreover,
in our computer, it is represented by eight bytes, which is enough to store the values
of the type unsigned int (fitting in four bytes in our computer). As a consequence, the
value of the variable b (negative number) remains unchanged while the operation a > b
is evaluated.

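  Rule 5: Otherwise (if the signed integer operand has a type with greater
  conversion rank than that of the unsigned integer operand, but cannot represent all
  the values of the type of the unsigned integer operand), both operands are converted
  to the unsigned integer type corresponding to the type of the signed integer operand.

In the following example, we meet the same problem as in example
integer_conversion4.c. The output shown is obtained when, as on our computer, long int has the same size as
unsigned int (four bytes) and therefore cannot represent all its values. Where long int is
eight bytes, Rule 4 applies instead and the program prints 5 > -3.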
$ cat integer_conversion8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 5;
long int b = -3;

if ( a > b ) {
printf("%u > %ld\n", a, b);
} else {
printf("%u < %ld\n", a, b);
}

return EXIT_SUCCESS;
}
$ gcc -o integer_conversion8 -std=c99 -pedantic integer_conversion8.c
$ ./integer_conversion8
5 < -3

Take note, only the integer promotions apply to operands of the bitwise shift operators. The type of
the result is the type of the left operand after the integer promotions.


In summary, we can conclude that we may get unexpected behaviors when we mix signed
and unsigned types and when signed operands have negative values. This means that you
should avoid mixing signed and unsigned values unless you actually know what you are
doing.

IV.14.5 Usual arithmetic conversions


Now that you have understood the integer conversions, the general arithmetic conversion rule,
known as usual arithmetic conversions, will be very easy to grasp. In C, an expression may
involve several arithmetic operands of different types. For example, an addition operation
can have one operand of type int and another one of type float as in the following example:
$ cat arithmetic_conv1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int a = 120;
float b = 12.23;

printf("a+b=%d+%f=%f\n", a, b , a+b);
return EXIT_SUCCESS;
}

In such a case, we could wonder what could be the type of the value resulting from the
addition involving an integer value and a floating value. The C standard gives specific
rules known as usual arithmetic conversions. The process consists in converting all the
arithmetic operands to a common type. This common type is also the type of the
value of the expression, with the exception of the relational and equality operations[45]
(operators <, <=, >, >=, == and !=), which evaluate to type int.

The usual arithmetic conversions affect arithmetic operations, relational and equality operations,
bitwise operations (&, | and ^) and the second and third operands of the ternary operation. When such operations
involve operands having different arithmetic types, the following rules apply:
o If an operand has type long double, the common type is long double.
o Otherwise, if an operand has type double, the common type is double.
o Otherwise, if an operand has type float, the common type is float.
o Otherwise (operands have integer types), the integer promotions take place followed
by the integer conversions.

In the following example, the operand a is converted to type double:
$ cat usual_conv1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
unsigned int a = 5;
double b = -3;

if ( a > b ) { /* a and b have type double */
printf("%u > %f\n", a, b);
} else {
printf("%u < %f\n", a, b);
}

return EXIT_SUCCESS;
}

Both the operands a and b have the common type double before evaluating the expression a
> b.


Now, let us check that you have understood the usual arithmetic conversions. Assume we
had declared two variables a and b as integer types: a as short and b as char. Could you
guess the type of the resulting value of the following operations?
o Type of a + b?
The resulting value has type int as shown below:
$ cat usual_conv2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
short a = 120;
char b = 120;

printf("%d + %d = %d\n", a, b, a + b);
printf("sizeof(int)=%zu, sizeof(char)=%zu, sizeof(short)=%zu, sizeof(a+b)=%zu\n", sizeof(int), sizeof(char),
sizeof(short), sizeof(a+b));

return EXIT_SUCCESS;
}
$ gcc -o usual_conv2 -std=c99 -pedantic usual_conv2.c
$ ./usual_conv2
120 + 120 = 240
sizeof(int)=4, sizeof(char)=1, sizeof(short)=2, sizeof(a+b)=4

o Type of a * b?
The resulting value has type int as shown below:
$ cat usual_conv3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
short a = 120;
char b = 120;

printf("%d * %d = %d\n", a, b, a * b);
printf("sizeof(int)=%zu, sizeof(char)=%zu, sizeof(short)=%zu, sizeof(a*b)=%zu\n", sizeof(int), sizeof(char),
sizeof(short), sizeof(a*b));

return EXIT_SUCCESS;
}

$ gcc -o usual_conv3 -std=c99 -pedantic usual_conv3.c
$ ./usual_conv3
120 * 120 = 14400
sizeof(int)=4, sizeof(char)=1, sizeof(short)=2, sizeof(a*b)=4

o What is the type of a / b?
The resulting value has type int as shown below:
$ cat usual_conv4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
short a = 30;
char b = 20;

printf("%d / %d = %d\n", a, b, a / b);
printf("sizeof(int)=%zu, sizeof(char)=%zu, sizeof(short)=%zu, sizeof(a/b)=%zu\n", sizeof(int), sizeof(char),
sizeof(short), sizeof(a/b));

return EXIT_SUCCESS;
}
$ gcc -o usual_conv4 -std=c99 -pedantic usual_conv4.c
$ ./usual_conv4
30 / 20 = 1
sizeof(int)=4, sizeof(char)=1, sizeof(short)=2, sizeof(a/b)=4


In all of the three previous examples, the integer promotions convert the operands a and b
to int, which is also the type of the resulting value of the operations.

Same question if the variable a is declared as float and the variable b declared as char:
o Type of a + b? After the integer promotions, b takes the type int. After the usual
arithmetic conversions, both the operands a and b and the resulting value of the
operation have type float.
o What is the type of a * b? Same as above.
o What is the type of a / b? Same as above.

IV.15 Constant expressions


A constant expression is an expression that evaluates to a constant value known before the
startup of the program. It can be a constant or an operation composed of constant operands
and operators. Since its value is evaluated at compile time, it is subject to some
constraints. Not all operators can be used: function calls and the operators
increment (++), decrement (--), assignment (=), and comma (,) are not allowed, except when they are part of
an expression that is not evaluated[46]. That is, a constant expression is a constant (literal
or enumeration constant) or an operation composed of constants and allowed operators.
Here are some constant expressions:
o 10
o 1+28
o 2*9
o 2/7+1-7
o 2.9*7
o "Hello"
o 'H'
o sizeof(char)
o sizeof(v) where v is a variable
o &v where v is a variable

A constant expression can evaluate to two kinds of constants: arithmetic constants and
address constants.

IV.15.1 Arithmetic constant expression


An arithmetic constant expression may evaluate to:
o An integer constant such as 2
o A floating constant such as 1.207

An arithmetic constant expression can be an integer constant, a floating constant, a
character literal (e.g. 'H'), an enumeration constant (described in Chapter VI), sizeof
expressions, or an operation composed of those constants as operands. Here is a piece of
code with arithmetic constant expressions:
#include <stdio.h>
#include <stdlib.h>

enum bool_val { FALSE, TRUE }; // enumeration
int b = TRUE;
int c = 'H';

int i1 = 10;
int i2 = 10*2;
int i3 = 5 * sizeof(long);
int i4 = sizeof(i1);
float f = 3.14;

int main(void) {
printf("%d %d %d %c %d %d %f\n", i1, i2, b, c, i3, i4, f);
return EXIT_SUCCESS;
}

The sizeof operator evaluates to an integer constant unless the operand is a VLA (variable-length array). For example,
sizeof(char) is replaced by an integer constant at compile time, before the main() function starts, while sizeof(arr) is evaluated at run time if arr is
a VLA.

IV.15.2 Address constant


An address constant is a null pointer, a pointer to a static object[47], or a pointer to a
function. Here are four examples:
#include <stdio.h>
#include <stdlib.h>

char *p1 = "Literal string";
int *p2 = NULL;
float *p3 = (float *)0;
int v = 10;
int *p4 = &v;

int main(void) {
printf("%p %p %p %p\n", (void *)p1, (void *)p2, (void *)p3, (void *)p4);
return EXIT_SUCCESS;
}

IV.16 Exercises
Exercise 1. If x=5, y=6 and z=7, what is the value of the expression y < z = x ?

Exercise 2. If x=7, y=6 and z=7, what is the value of the expression y < z == x ?


Exercise 3. If x=6, y=6 and z=5, what is the value of the expression x <= y < z ?
Exercise 4. If x=10, n=4, what is the value of the expression x << n ?

Exercise 5. If x=10, what are the values of the expression sizeof ++x and x?

Exercise 6. Let x be a variable, why is the expression &(x+1) considered erroneous by
the compiler?

Exercise 7. Let x be a variable holding the value 1, how would the compiler evaluate the
expression x++++?

Exercise 8. Consider the following variables:
int j = 4;
float f = 10.8;
float g = 0.4;
int k;
float h;


What would be the values of k set below?
k = 2 *f;
k = 2 *g;
k = (float) 2 * g;


What would be the value of h set below?
h = 2 *g;
h = 2 * (int)g;
h = 2 / g;


Exercise 9. Consider the following snippets of code and guess the output of the printf()
calls:
int x1 = 2;
int y1 = x1++;
printf("x=%d, y=%d\n",x1, y1);

int x2 = 2;
int y2 = ++x2;
printf("x=%d, y=%d\n",x2, y2);

int x3 = 2;
int y3 = x3++ ;
printf("x=%d, y=%d\n",x3, y3);

int x4 = 2;
int y4 = ++x4;
printf("x=%d, y=%d\n",x4, y4);

Exercise 10. Let x and y be variables of type short int. What would be the type of the expression x
* y?

Exercise 11. What would be the output of the following code snippets?
unsigned short x = 2;
short y = -1;
if ( x > y ) {
printf("x > y\n");
} else {
printf("x < y\n");
}

Exercise 12. What would be the output of the following code snippets?
unsigned long x = 2;
signed char y = -1;
if ( x > y ) {
printf("x > y\n");
} else {
printf("x < y\n");
}


Exercise 13. What would be the output of the following code snippets?
unsigned long x = 2;
float y = -1;
if ( x > y ) {
printf("x > y\n");
} else {
printf("x < y\n");
}

CHAPTER V CONTROL FLOW


V.1 Introduction
Control flow statements break the normal flow of execution, in which statements are
executed in the order they appear. They either execute a set of statements only if some
condition is met (if, while, for, switch) or branch unconditionally to another point in
the program (break, continue, return). They allow you to write programs that perform the
right actions depending on some conditions.

V.2 Statements
A statement is an instruction telling the computer what to do. A set of statements can be
grouped between braces ({ and }) to form a logical unit known as a block or a compound
statement:
{
    statement1;
    statement2;
    ...
    statementN;
}

Where:
o statement1, ..., statementN are statements.
o Blanks (newlines, spaces and tabs) can be added before or after the braces ({ and }).
o Blanks (newlines, spaces and tabs) can be added before or after any statement.

V.3 if statement
The if statement executes a set of statements depending on a given condition. In its
simplest form, it is composed of two parts:
if (condition) block

Where:
o condition is an expression. It is the selection condition.

o block is a set of statements between braces. However, if there is only one statement,
braces can be omitted.

If the expression condition evaluates to a value different from zero (meaning true), the set of
statements block is executed. Here are some examples.

o Example 1: In C, the value of 0 is treated as false. Any other value is considered true as
shown below:
$ cat if_statement1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    if (-1) printf("-1 IS TRUE\n");
    if (10) printf("10 IS TRUE\n");
    if (0) printf("0 IS TRUE\n");
    if (0.9) printf("0.9 IS TRUE\n");

    return EXIT_SUCCESS;
}
$ gcc -o if_statement1 -std=c99 -pedantic if_statement1.c
$ ./if_statement1
-1 IS TRUE
10 IS TRUE
0.9 IS TRUE

o Example 2: The selection condition can be a variable.


$ cat if_statement2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 10;

    if (v) printf("v=%d IS TRUE\n", v);

    return EXIT_SUCCESS;
}
$ gcc -o if_statement2 -std=c99 -pedantic if_statement2.c
$ ./if_statement2
v=10 IS TRUE

o Example 3: The selection condition can be an arithmetic operation.

$ cat if_statement3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 10;
    int w = -5;

    if (v + w) printf("v+w=%d IS TRUE\n", v+w);

    return EXIT_SUCCESS;
}
$ gcc -o if_statement3 -std=c99 -pedantic if_statement3.c
$ ./if_statement3
v+w=5 IS TRUE

o Example 4: The selection condition can be a relational operation.


$ cat if_statement4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 10;
    int w = -5;

    if ( v > w ) printf("%d > %d IS TRUE\n", v, w);

    return EXIT_SUCCESS;
}
$ gcc -o if_statement4 -std=c99 -pedantic if_statement4.c
$ ./if_statement4
10 > -5 IS TRUE

o Example 5: The selection condition can be a logical operation.


$ cat if_statement5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 10;
    int w = -5;

    if ( v > 0 && v > w ) printf("%d > 0 && %d > %d IS TRUE\n", v, v, w);

    return EXIT_SUCCESS;
}
$ gcc -o if_statement5 -std=c99 -pedantic if_statement5.c
$ ./if_statement5
10 > 0 && 10 > -5 IS TRUE

o Example 6: The selection condition can be an assignment.


$ cat if_statement6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 5;
    int w = -5;

    if ( v = w ) printf("v holds now value %d\n", v);

    return EXIT_SUCCESS;
}
$ gcc -o if_statement6 -std=c99 -pedantic if_statement6.c
$ ./if_statement6
v holds now value -5

In the example above, the expression v = w assigns the value of the variable w (i.e. -5) to
the variable v and then evaluates that value. Thus, if w holds a value different from zero,
the condition v = w is considered true.

The example if_statement6.c must not be confused with the following one, which compares the
value of v with the value of w:
$ cat if_statement7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int v = 5;
    int w = -5;

    if ( v == w ) printf("v holds value %d\n", v);

    return EXIT_SUCCESS;
}


The block of the if statement may contain several statements. In this case, the statements
must be enclosed between braces:
$ cat if_statement8.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char s1[40] = "IF statement";
    char s2[80] = "IF statement";

    if ( !strcmp(s1, s2) ) {
        printf("The arrays s1 and s2 hold the same string\n");
        printf("s1=%s\n", s1);
    }

    return EXIT_SUCCESS;
}
$ gcc -o if_statement8 -std=c99 -pedantic if_statement8.c
$ ./if_statement8
The arrays s1 and s2 hold the same string
s1=IF statement


The second form of the if statement allows executing an alternative block if the selection
condition is false:
if (condition) block
else alternative_block

If the selection expression condition evaluates to a value different from zero, the set of
statements block is executed. Otherwise, the set of statements of alternative_block is executed.
If block and alternative_block are composed of several statements, braces ({}) must enclose the
statements. If there is only one statement, the braces can be omitted. Here is an example:
$ cat if_statement9.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char s1[40] = "IF statement";
    char s2[80] = "WHILE statement";

    if ( !strcmp(s1, s2) ) {
        printf("The arrays s1 and s2 hold the same string\n");
        printf("s1=%s\n", s1);
    } else {
        printf("The arrays s1 and s2 hold different strings\n");
        printf("s1=%s\n", s1);
        printf("s2=%s\n", s2);
    }

    return EXIT_SUCCESS;
}
$ gcc -o if_statement9 -std=c99 -pedantic if_statement9.c
$ ./if_statement9
The arrays s1 and s2 hold different strings
s1=IF statement
s2=WHILE statement


The third form of the if statement allows using several selection conditions:
if (condition1) block1
else if (condition2) block2
...
else if (conditionN) blockN
else alternative_block

If condition1 evaluates to a value different from zero, block1 is executed. Otherwise, if
condition2 evaluates to a value different from zero, block2 is executed, and so on: if
conditionN evaluates to a value different from zero, blockN is executed. If none of the
conditions is true, alternative_block is executed. If a block is composed of several
statements, braces ({}) must enclose the statements. If there is only one statement, the
braces can be omitted. The following program implements a basic calculator that computes
the result of the operations +, -, * and /. The executable expects three arguments of the
form n1 op n2, where n1 and n2 are numbers and op is an arithmetic operator (+, -, * or /);
it outputs the result of the operation. If the user passes unexpected arguments, a help
message is displayed.
$ cat if_statement10.c
1  #include <stdio.h>
2  #include <stdlib.h>
3  #include <string.h>
4
5  int main(int argc, char **argv) {
6      float n1, n2;
7      char op;
8
9      if ( argc != 4 ) {
10         printf("USAGE: %s number op number\n", argv[0]);
11         printf("Where op is +, -, *, /\n\n");
12
13         return EXIT_FAILURE;
14     }
15
16     n1 = atof(argv[1]);
17     op = *argv[2]; /* first character of string argv[2] */
18     n2 = atof(argv[3]);
19
20     if ( op == '+' )
21         printf("%f + %f = %f\n", n1, n2, n1 + n2);
22     else if ( op == '-' )
23         printf("%f - %f = %f\n", n1, n2, n1 - n2);
24     else if ( op == '*' )
25         printf("%f * %f = %f\n", n1, n2, n1 * n2);
26     else if ( op == '/' )
27         printf("%f / %f = %f\n", n1, n2, n1 / n2);
28     else {
29         printf("Unknown operator %c\n", op);
30         printf("USAGE: %s number op number\n", argv[0]);
31         printf("Where op is +, -, *, /\n\n");
32
33         return EXIT_FAILURE;
34     }
35
36     return EXIT_SUCCESS;
37 }
$ gcc -o if_statement10 -std=c99 -pedantic if_statement10.c
$ ./if_statement10
USAGE: ./if_statement10 number op number
Where op is +, -, *, /

$ ./if_statement10 10 / 7
10.000000 / 7.000000 = 1.428571

$ ./if_statement10 10 + 7
10.000000 + 7.000000 = 17.000000
$ ./if_statement10 5 % 10
Unknown operator %
USAGE: ./if_statement10 number op number
Where op is +, -, *, /

Explanation:
o Line 6: the variables n1 and n2 are declared as float. They will store the operands.
o Line 7: the variable op, declared as char, will hold the character representing the
operator: +, -, * or /.
o Lines 9-14: the relational expression argc != 4 tests if the number of arguments (argc) is
different from 4 (4 arguments are expected, including the program name). If it is true, a
help message is displayed explaining how to run the program. Remember that argv[0] holds
the program name.
o Line 16: argv[1] is a string. It is the first operand of the operation. It is converted to a
number through the C standard function atof() and then assigned to the variable n1 of
type float.
o Line 17: argv[2] is a string. Since an operator is represented by a character, only the
very first character of the string is taken and assigned to the variable op.
o Line 18: argv[3] is a string. It is the second operand of the operation. It is converted to a
number through the C standard function atof() and then assigned to the variable n2 of
type float.
o Lines 20-34: the if statement checks the value of the variable op. If an expected
operator is found (+, -, *, or /), the corresponding operation is executed. If the
variable op does not hold an expected operator, a help message is displayed (lines 28-34).

V.3.1 Switch statement


The switch statement is similar to the if statement. It also executes a set of statements
depending on the resulting value of the selection expression. It takes the following general
form:
switch (expr) {
    case const1:
        statement1_1;
        statement1_2;
        ...
        statement1_P1;
    case const2:
        statement2_1;
        statement2_2;
        ...
        statement2_P2;
    ...
    case constN:
        statementN_1;
        statementN_2;
        ...
        statementN_PN;
    default:
        statementAlt_1;
        statementAlt_2;
        ...
        statementAlt_Palt;
}

Where:
o expr is an expression that evaluates to integer type.
o const1, const2, ..., constN are integer constant expressions (see Chapter IV Section IV.15).
o statementX_Y are statements.
o The default case is optional.

The expression expr evaluates to a value of integer type that we will call val:
o If val equals const1, the statements statement1_1, ..., statement1_P1 are executed. If a
break statement is encountered, the processing of the switch statement stops. Otherwise,
all the following statements statement2_1, ..., statement2_P2, ..., statementN_1, ...,
statementN_PN, statementAlt_1, ..., statementAlt_Palt are also executed.
o Otherwise, if val equals const2, the statements statement2_1, ..., statement2_P2 are
executed. If a break statement is encountered, the processing of the switch statement
stops. Otherwise, all the following statements statement3_1, ..., statement3_P3, ...,
statementN_1, ..., statementN_PN, statementAlt_1, ..., statementAlt_Palt are also executed.
o ...
o Otherwise, if val equals constN, the statements statementN_1, ..., statementN_PN are
executed. If a break statement is encountered, the processing of the switch statement
stops. Otherwise, the statements statementAlt_1, ..., statementAlt_Palt are also executed.
o Otherwise, the statements statementAlt_1, ..., statementAlt_Palt are executed (if there is
no default case, nothing is executed).

To put it more concisely, if the integer value of the selection expression corresponds to the
value of a case, all the statements following it are executed until the end of the switch
statement or until the first break statement is met. When the break statement is met, the
switch statement terminates.

In the following example, we have intentionally forgotten the break statement. See what it
yields:
$ cat switch1.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int n;

    if ( argc != 2 ) {
        printf("USAGE: %s number\n", argv[0]);
        return EXIT_FAILURE;
    }

    n = atoi( argv[1] );

    switch ( n % 2 ) {
        case 0:
            printf("Number %d is even\n", n);
        case 1:
            printf("Number %d is odd\n", n);
    }
    return EXIT_SUCCESS;
}
$ gcc -o switch1 -std=c99 -pedantic switch1.c
$ ./switch1 10
Number 10 is even
Number 10 is odd
$ ./switch1 11
Number 11 is odd

The selection expression n % 2 evaluates to 0 (if the passed argument is even) or 1 (if the
passed argument is odd and positive; for a negative odd argument, n % 2 evaluates to -1 in
C99, which matches neither case). Now, if we insert the break statement, only the statements
of case 0 are executed if n is even:
$ cat switch2.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int n;

    if ( argc != 2 ) {
        printf("USAGE: %s number\n", argv[0]);
        return EXIT_FAILURE;
    }

    n = atoi( argv[1] );

    switch ( n % 2 ) {
        case 0:
            printf("Number %d is even\n", n);
            break;
        case 1:
            printf("Number %d is odd\n", n);
    }
    return EXIT_SUCCESS;
}
$ gcc -o switch2 -std=c99 -pedantic switch2.c
$ ./switch2 10
Number 10 is even
$ ./switch2 11
Number 11 is odd

The following example is equivalent to example if_statement10.c:


$ cat switch3.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    float n1, n2;
    char op;

    if ( argc != 4 ) {
        printf("USAGE: %s number op number\n", argv[0]);
        printf("Where op is +, -, *, /\n\n");

        return EXIT_FAILURE;
    }

    n1 = atof(argv[1]);
    op = *argv[2]; /* first character of string argv[2] */
    n2 = atof(argv[3]);

    switch ( op ) {
        case '+':
            printf("%f + %f = %f\n", n1, n2, n1 + n2);
            break;
        case '-':
            printf("%f - %f = %f\n", n1, n2, n1 - n2);
            break;
        case '*':
            printf("%f * %f = %f\n", n1, n2, n1 * n2);
            break;
        case '/':
            printf("%f / %f = %f\n", n1, n2, n1 / n2);
            break;
        default:
            printf("Unknown operator %c\n", op);
            printf("USAGE: %s number op number\n", argv[0]);
            printf("Where op is +, -, *, /\n\n");
            return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

Remember that the selection expression must evaluate to an integer type. The following
example is not correct and cannot be compiled:
$ cat switch4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *operation="addition";

    switch ( operation ) {
        case "+":
            printf("Addition\n");
            break;
        case "-":
            printf("Subtraction\n");
            break;
        case "*":
            printf("Multiplication\n");
            break;
        case "/":
            printf("Division\n");
            break;
        default:
            printf("Unknown operator %c\n", op);
            return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}
$ gcc -o switch4 -std=c99 -pedantic switch4.c
switch4.c: In function 'main':
switch4.c:7:13: error: switch quantity not an integer
switch4.c:8:9: error: case label does not reduce to an integer constant
switch4.c:11:9: error: case label does not reduce to an integer constant
switch4.c:14:9: error: case label does not reduce to an integer constant
switch4.c:17:9: error: case label does not reduce to an integer constant
switch4.c:21:44: error: 'op' undeclared (first use in this function)
switch4.c:21:44: note: each undeclared identifier is reported only once for each function it appears in

Do not confuse the character literal '+', which has integer type, with the string "+".

The value of a case must be an integer literal or an expression evaluating to an integer
constant; the value of a variable is not allowed. The following example yields an error:
$ cat switch5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int c = 10;
    int x = 10;

    switch (c) {
        case x: printf("case %d\n", x);
    }
    return EXIT_SUCCESS;
}
$ gcc -o switch5 -std=c99 -pedantic switch5.c
switch5.c: In function 'main':
switch5.c:9:7: error: case label does not reduce to an integer constant

V.3.2 While loop


The while statement executes a set of statements several times depending on a condition.
while (expr) block

Where:
o expr is an expression.
o block is a set of statements also known as the while block or while body. Statements are
enclosed between braces ({}). Braces can be omitted if there is a single statement.

The expression expr is evaluated before each iteration: as long as it evaluates to a
non-zero value (true), the while body is executed. The loop ends when expr evaluates to zero
(false). If expr is false the first time it is evaluated, the while body is not executed at
all.

The following example displays the first ten digits:
$ cat while_loop1.c
1  #include <stdio.h>
2  #include <stdlib.h>
3
4  int main(void) {
5      int i = 0;
6      int max = 10;
7
8      while ( i < max ) {
9          printf("i=%d ", i);
10         i++;
11     }
12     printf("\n");
13
14     return EXIT_SUCCESS;
15 }
$ gcc -o while_loop1 -std=c99 -pedantic while_loop1.c
$ ./while_loop1
i=0 i=1 i=2 i=3 i=4 i=5 i=6 i=7 i=8 i=9

Explanation:
o Lines 8-11: before entering the while loop, the variable i holds the value 0.
   - At the first iteration, i holds the value 0, and the relational expression i < max
     (i.e. 0 < 10) is true, which causes the while body to be executed: the value of i is
     displayed (0), then i is incremented. At the end of the iteration, i holds the value 1.
   - At the second iteration, i holds the value 1 and the relational expression i < max
     (i.e. 1 < 10) is still true. The while body is executed: the value of i is displayed
     (1), then i is incremented. At the end of the iteration, i holds the value 2.
   - And so on...
   - At the 10th iteration, i holds the value 9, and the relational expression i < max
     (i.e. 9 < 10) remains true. The while body is executed: the value of i is displayed
     (9), then i is incremented. At the end of the iteration, i holds the value 10.
   - When the condition is tested for the 11th time, i holds the value 10 and the relational
     expression i < max (i.e. 10 < 10) becomes false. The while statement ends.

In the following example, we display the strings held in the array s:
$ cat while_loop2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s[] = { "ONE", "TWO", "THREE", "FOUR" };
    int i = 0;
    int nb_elt = sizeof s / sizeof(char *); /* number of elements in array s */

    while ( i < nb_elt ) {
        printf("s[%d]=%s\n", i, s[i] );
        i++;
    }

    return EXIT_SUCCESS;
}
$ gcc -o while_loop2 -std=c99 -pedantic while_loop2.c
$ ./while_loop2
s[0]=ONE
s[1]=TWO
s[2]=THREE
s[3]=FOUR

In the following example, we also display the strings held in the array s:
$ cat while_loop3.c
1  #include <stdio.h>
2  #include <stdlib.h>
3
4  int main(void) {
5      char *s[] = { "ONE", "TWO", "THREE", "FOUR", NULL };
6      char **p;
7
8      p = s;
9      while ( *p != NULL ) {
10         printf("%s\n", *p );
11         p++;
12     }
13
14     return EXIT_SUCCESS;
15 }
$ gcc -o while_loop3 -std=c99 -pedantic while_loop3.c
$ ./while_loop3
ONE
TWO
THREE
FOUR

Explanation:
o Line 5: the object s is an array of strings. It is composed of five elements but the last
element, NULL, is used only for indicating the end of the list.
o Line 6: p is declared as pointer to pointer to char.
o Line 8: before entering the while loop, the pointer p is initialized to s. The pointer p
points to the very first object of the array s (the string "ONE").
o Lines 9-12: as long as the pointer p does not point to a null pointer (i.e. *p != NULL), the
while body is executed. First, the string to which the pointer p points is displayed, then
the pointer p is incremented so that it points to the next object.
   - At the beginning, p points to the string "ONE". Since the expression *p != NULL is
     true, the statements of the body are executed. The string "ONE" is displayed and p is
     incremented. The pointer p now points to the string "TWO".
   - At the second iteration, p points to the string "TWO". Since the expression *p != NULL
     is true, the statements of the body are executed. The string "TWO" is displayed and p
     is incremented. The pointer p now points to the string "THREE".
   - And so on...
   - At the fourth iteration, p points to the string "FOUR". Since the expression *p != NULL
     is true, the statements of the body are executed. The string "FOUR" is displayed and p
     is incremented. The pointer p now points to the last element of the array, the null
     pointer NULL.
   - When the condition is tested for the fifth time, p points to a null pointer (NULL).
     Since the expression *p != NULL becomes false, the while statement terminates.


Since the macro NULL is a null pointer constant (typically defined as 0 or (void *)0), the
expression *p != NULL is the same as *p != 0 which, used as a condition, is equivalent to
the expression *p. The example while_loop3.c can therefore be rewritten as follows:
$ cat while_loop4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s[] = { "ONE", "TWO", "THREE", "FOUR", NULL };
    char **p = s;

    while ( *p ) {
        printf("%s\n", *p );
        p++;
    }

    return EXIT_SUCCESS;
}
$ gcc -o while_loop4 -std=c99 -pedantic while_loop4.c
$ ./while_loop4
ONE
TWO
THREE
FOUR

Here is another example related to pointers. In the following example, we copy the string
of the array s into a memory area, allocated by malloc(), pointed to by the pointer copy_s.
$ cat while_loop5.c
1  #include <stdio.h>
2  #include <stdlib.h>
3  #include <string.h>
4
5  int main(void) {
6      char s[] = "Hello world";
7      int len = strlen( s );
8      char *copy_s = malloc( len + 1 );
9      char *p1;
10     char *p2;
11
12     if ( ! copy_s ) { /* check if the pointer copy_s is valid */
13         printf("Fatal Error. Cannot allocate memory\n");
14         return EXIT_FAILURE;
15     }
16
17     p1 = s; p2 = copy_s;
18     while ( *p1 != '\0' ) {
19         *p2 = *p1;
20         p2++;
21         p1++;
22     }
23
24     *p2 = '\0';
25     printf("copy_s=%s\n", copy_s);
26
27     return EXIT_SUCCESS;
28 }
$ gcc -o while_loop5 -std=c99 -pedantic while_loop5.c
$ ./while_loop5
copy_s=Hello world

Explanation:
o Line 6: the array s is initialized to the string "Hello world".
o Line 7: the len variable is initialized to the number of characters in the array s (not
counting the terminating null character).
o Line 8: a memory block is allocated by the malloc() function. The requested size is the
number of characters in the array s plus one to include the terminating null character
'\0'.
o Lines 12-15: we display an error message and terminate the program if the pointer
copy_s is not valid (a null pointer, meaning the allocation failed).
o Line 17: the pointer p1 is initialized to s (source data) and p2 to copy_s.
o Lines 18-22: as long as the current character is different from the null character, the
while body is executed.
   - Line 19: the character pointed to by p1 is copied to the piece of memory pointed to
     by p2.
   - Line 20: move the pointer p2 to the next memory location that can hold a character.
   - Line 21: move the pointer p1 to the next character.
   The while loop ends when the current character pointed to by p1 is the null character.
o Line 24: since the null character has not been copied, it is stored at the location
pointed to by p2 in order to terminate the copied string.
o Line 25: the string pointed to by copy_s is displayed.



The following example performs the same task as the previous one:
$ cat while_loop6.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *s = "Hello world";
    int len = strlen( s ); /* number of characters in the string s */
    char *copy_s = malloc( len + 1 );
    char *p1;
    char *p2;

    /* check the pointer copy_s is valid */
    if ( ! copy_s ) {
        printf("Cannot allocate memory for copy_s\n");
        return EXIT_FAILURE;
    }

    /* copy string from s to memory pointed to by copy_s */
    p1 = s; p2 = copy_s;
    while ( (*p2++ = *p1++) != '\0' )
        ; /* while body is empty */

    printf("copy_s=%s\n", copy_s);

    return EXIT_SUCCESS;
}
$ gcc -o while_loop6 -std=c99 -pedantic while_loop6.c
$ ./while_loop6
copy_s=Hello world

The expression *p2++ = *p1++ carries out the following tasks:
o The piece of memory pointed to by p2 (represented by *p2) receives the object (the
current character) pointed to by the pointer p1 (represented by *p1).
o The pointer p2 is incremented by the postfix operator: p2++.
o The pointer p1 is also incremented by the postfix operator: p1++.
o The assignment evaluates to the value that has just been stored: the character that was
copied.

Then, as long as the assignment evaluates to a value different from the null character, the
while body is executed (here, the body is empty). At the last iteration:
o p1 points to the null character '\0'. It is assigned to the piece of memory pointed to
by p2.
o The assignment *p2++ = *p1++ evaluates to the null character '\0'.
o The expression (*p2++ = *p1++) != '\0' becomes false, which terminates the while loop.

The while loop also allows you to execute a set of statements indefinitely (infinite loop):
while (1) {
    statement1;
    statement2;
    ...
    statementN;
}

The following program executes until you press the letter c while holding the CTRL key
(<CTRL-C>).
$ cat while_loop7.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const int num_len = 32;
    char s[num_len];
    int n;
    float f;

    while (1) {
        printf("\nPlease type an integer number: ");
        fgets(s, num_len, stdin); /* read characters typed */
        n = atoi( s ); /* convert s to integer */
        f = atof( s ); /* convert s to float */

        if (f != n) {
            printf("The given number is not integer\n");
            return EXIT_FAILURE;
        }

        switch ( n % 2 ) {
            case 0:
                printf("%d is even\n", n);
                break;
            case 1:
                printf("%d is odd\n", n);
        }
    }
}
$ gcc -o while_loop7 -std=c99 -pedantic while_loop7.c
$ ./while_loop7

Please type an integer number: 10
10 is even

Please type an integer number: 17
17 is odd

Please type an integer number:

It prints the message "Please type an integer number: " and waits for you to type a number
terminated by the <RETURN> key. Then, it tells you if the number is odd or even.

In the program, there is a new function that we have not talked about so far: fgets(). We
will say more about it when we talk about the most frequently used C standard functions. For
now, we use it to retrieve the characters typed by the user. That is, the call fgets(s,
num_len, stdin) retrieves the characters typed, stores them in the array s and terminates
them with the null character '\0'. The function reads what is typed until at most num_len-1
characters have been read or the newline character (yielded by the <RETURN> key) has been
read; the newline, if read, is stored in the array too. The second argument num_len tells
the function to read at most num_len-1 characters because our array s can hold only num_len
characters, the last one being reserved for the null character '\0'. The third argument
stdin represents the standard input, which is associated with the keyboard: it tells the
function to read what is typed.

V.3.3 Do...While loop


The do/while loop works in the same way as the while loop except that it executes the loop
body at least once: the condition is tested only after running the loop body. Its general
syntax is given below (do not forget the semicolon at the end of the statement):
do block while (expr);

Where:
o block is a set of statements
o expr is an expression

The do body (loop body) is executed first. Then, the condition expr is tested: if it is true,
the body is executed again, and so on until expr becomes false. The following example
displays the first ten digits:
$ cat do_while1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int max = 10;
    int i = 0;

    do {
        printf("%d ", i);
        i++;
    } while ( i < max );

    printf("\n");
    return EXIT_SUCCESS;
}
$ gcc -o do_while1 -std=c99 -pedantic do_while1.c
$ ./do_while1
0 1 2 3 4 5 6 7 8 9

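The loop body is executed at least once. In the following example, the condition
i < max && i > 0 would be false for the initial value of i (0), yet the loop body is
executed because the condition is tested only after the body has run, when i already holds
the value 1: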
$ cat do_while2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int max = 10;
    int i = 0;

    do {
        printf("%d ", i);
        i++;
    } while ( i < max && i > 0);

    printf("\n");
    return EXIT_SUCCESS;
}
$ gcc -o do_while2 -std=c99 -pedantic do_while2.c
$ ./do_while2
0 1 2 3 4 5 6 7 8 9

V.3.4 For loop


The for loop does the same thing as the while loop. It is only a more concise form that
groups the initialization, the condition and the update in one place. The for statement
executes a set of statements several times depending on a condition.
for (expr1;expr2;expr3) block

Where:
o expr1, expr2, and expr3 are expressions.
o block is a set of statements also known as the loop body or for body. Statements are
enclosed between braces ({}). Braces can be omitted if there is a single statement.

The expression expr1 is evaluated first and only once (initialization). Then, the expression
expr2 is evaluated: if it is true, the for body (block) is executed, followed by the
expression expr3. Next, the same process repeats: the expression expr2 is evaluated, if it
is true the for body is executed, followed by the evaluation of the expression expr3, and so
on. The for loop continues until the expression expr2 becomes false.

The following example displays the first ten digits:
$ cat for_loop1.c
1  #include <stdio.h>
2  #include <stdlib.h>
3
4  int main(void) {
5      int max = 10;
6      int i;
7
8      for (i=0; i < max; i++)
9          printf("%d ", i);
10
11     printf("\n");
12     return EXIT_SUCCESS;
13 }
$ gcc -o for_loop1 -std=c99 -pedantic for_loop1.c
$ ./for_loop1
0 1 2 3 4 5 6 7 8 9

Explanation:
o Lines 8-9:
   - The variable i is initialized to the value 0. This is the initialization step.
   - First iteration. Since i holds the value 0, the expression i < max is true and then the
     loop body (line 9) is executed. The value of i is printed (0). The expression i++ is
     executed: i now holds the value 1.
   - Second iteration. Since i holds the value 1, the expression i < max is true and then
     the loop body (line 9) is executed. The value of i is printed (1). The expression i++
     is executed: i now holds the value 2.
   - ...
   - Tenth iteration. Since i holds the value 9, the expression i < max is true and then the
     loop body (line 9) is executed. The value of i is printed (9). The expression i++ is
     executed: i now holds the value 10.
   - Eleventh test. Since i holds the value 10, the expression i < max becomes false
     and the for loop ends without executing the for body.
o Line 11: a newline is displayed.

The following example is equivalent to the program while_loop2.c previously given. It
displays the strings of the array s:
$ cat for_loop2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s[] = { "ONE", "TWO", "THREE", "FOUR" };
    int i;
    int nb_elt = sizeof s / sizeof(char *); /* number of elements in array s */

    for ( i = 0; i < nb_elt; i++ )
        printf("s[%d]=%s\n", i, s[i] );

    return EXIT_SUCCESS;
}
$ gcc -o for_loop2 -std=c99 -pedantic for_loop2.c
$ ./for_loop2
s[0]=ONE
s[1]=TWO
s[2]=THREE
s[3]=FOUR

The following example is equivalent to while_loop4.c. It displays the strings of the array s by
using pointers.
$ cat for_loop3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s[] = { "ONE", "TWO", "THREE", "FOUR", NULL };
    char **p;

    for ( p = s; *p; p++ )
        printf("%s\n", *p );

    return EXIT_SUCCESS;
}
$ gcc -o for_loop3 -std=c99 -pedantic for_loop3.c
$ ./for_loop3
ONE
TWO
THREE
FOUR

The following example is equivalent to while_loop5.c. It copies a string to a memory block
allocated by malloc() and pointed to by the pointer copy_s:
$ cat for_loop4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *s = "Hello world";
    int len = strlen( s ); /* number of characters in the string s */
    char *copy_s = malloc( len + 1 );
    char *p1;
    char *p2;

    /* check the pointer copy_s is valid */
    if ( copy_s == NULL ) {
        printf("Cannot allocate memory for copy_s\n");
        return EXIT_FAILURE;
    }

    /* copy string from s to memory pointed to by copy_s */
    for ( p1 = s, p2 = copy_s; *p1 != '\0'; p1++, p2++ )
        *p2 = *p1;

    *p2 = '\0'; /* a character string is terminated by a null character */
    printf("copy_s=%s\n", copy_s);

    return EXIT_SUCCESS;
}
$ gcc -o for_loop4 -std=c99 -pedantic for_loop4.c
$ ./for_loop4
copy_s=Hello world


The for loop can also be used to write an infinite loop, which executes a set of statements
indefinitely:
for (;;) {
    statement1;
    statement2;
    ...
    statementN;
}

The following example is equivalent to while_loop7.c. The user types an integer number and
the program tells if it is even or odd. The program executes until you hit <CTRL-c>.
$ cat for_loop5.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const int num_len = 32;
    char s[num_len]; /* array to store characters typed */
    int n;
    float f;

    for (;;) {
        printf("\nPlease type an integer number: ");
        fgets(s, num_len, stdin); /* retrieve characters typed */
        n = atoi( s ); /* convert to integer */
        f = atof( s ); /* convert to float */

        if (f != n) { /* the given number has a fractional part */
            printf("The given number is not integer\n");
            return EXIT_FAILURE;
        }

        switch ( n % 2 ) {
            case 0:
                printf("%d is even\n", n);
                break;
            case 1:
                printf("%d is odd\n", n);
        }
    }
}
$ gcc -o for_loop5 -std=c99 -pedantic for_loop5.c
$ ./for_loop5

Please type an integer number: 10
10 is even

Please type an integer number: 11
11 is odd

Please type an integer number: anything
0 is even

Please type an integer number: <CTRL-c>
$

Remember that if the given string does not start with a number, the functions atoi() and
atof() return 0.

C99 introduces a very useful feature: it allows declaring a variable in the initialization
clause of the for loop:

$ cat for_loop6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    for (int i=0; i < 5; i++)
        printf("i=%d\n", i);

    return EXIT_SUCCESS;
}
$ gcc -o for_loop6 -std=c99 -pedantic -Wall for_loop6.c
$ ./for_loop6
i=0
i=1
i=2
i=3
i=4

Take note that a variable declared in this way can be used only within the for loop (its
three expressions and its body). Once the loop ends, the variable no longer exists and
cannot be used anymore.

V.4 continue
The continue statement jumps to the next iteration of a loop statement (see Figure V1). It
can be used only in a loop body (for, while or do/while statement). The following program
displays the first ten digits with the exception of the digit 3:
$ cat continue1.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int max = 10;
6 int i;
7
8 for (i=0; i < 10; i++) {
9 if ( i == 3 ) continue;
10 printf("%d ", i);
11 }
12
13 printf("\n");
14
15 return EXIT_SUCCESS;
16 }
$ gcc -o continue1 -std=c99 -pedantic continue1.c
$ ./continue1
0 1 2 4 5 6 7 8 9

Explanation:
o Lines 8-11:
Initialization: the variable i is set to 0 before entering the loop.
First iteration. i=0 and i < 10 is true. The loop body is executed. The value of i is
printed. The variable i is incremented by the expression i++: i now holds the value 1.
Second iteration. i=1 and i < 10 is true. The loop body is executed.
...
Fourth iteration. i=3 and i < 10 is true. The loop body is executed. As the expression i
== 3 is true, the continue statement is executed: it stops the current iteration without
executing the remaining statements of the for body. Before starting a new iteration, the
variable i is first incremented by the expression i++: i now holds the value 4.
And so on.

In the following example, we display each element of the array s except the string
"THREE":
$ cat continue2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
char *s[] = { "ONE", "TWO", "THREE", "FOUR" };
int nb_elt = sizeof s / sizeof(char *);
int i;

i = 0;
while( i < nb_elt ) {
if ( ! strcmp( "THREE", s[ i ] ) ) {
i++;
continue;
}

printf("s[ %d ] = %s\n", i, s[ i ]);

i++;
}

return EXIT_SUCCESS;
}
$ gcc -o continue2 -std=c99 -pedantic continue2.c
$ ./continue2
s[ 0 ] = ONE
s[ 1 ] = TWO
s[ 3 ] = FOUR

Figure V1 continue statement


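Note that we had to increment the value of i before jumping to the next iteration with the
continue statement; otherwise i would keep the same value and the loop would never end.
With the for loop, the same example is easier to write because the expression i++ is
evaluated even after a continue: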
$ cat continue3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main(void) {
char *s[] = { "ONE", "TWO", "THREE", "FOUR" };
int nb_elt = sizeof s / sizeof(char *);
int i;

for(i = 0; i < nb_elt; i++ ) {
if ( ! strcmp( "THREE", s[ i ] ) )
continue;

printf("s[ %d ] = %s\n", i, s[ i ]);
}

return EXIT_SUCCESS;
}
$ gcc -o continue3 -std=c99 -pedantic continue3.c
$ ./continue3
s[ 0 ] = ONE
s[ 1 ] = TWO
s[ 3 ] = FOUR

Figure V2 break statement

V.5 break
The break statement terminates a loop statement or the current case of the switch statement in
which it appears (see Figure V2). In the following example, the for loop ends when i
reaches the value 3.
$ cat break1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int max = 10;
int i;

for (i=0; i < 10; i++) {
if ( i == 3 ) break;
printf("%d ", i);
}

printf("\n");

return EXIT_SUCCESS;
}
$ gcc -o break1 -std=c99 -pedantic break1.c
$ ./break1
0 1 2

The break statement is useful in infinite loops. Let us consider the example for_loop5.c we
gave earlier and let us modify it so that the program exits cleanly when the user types the
word quit.
$ cat break2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
const int num_len = 32;
char s[num_len];
int n;
float f;

for (;;) {
printf("\nPlease type an integer number: ");
fgets(s, num_len, stdin); /* retrieve characters typed */

/* leave the for loop if word quit is typed */
if ( !strncmp (s, "quit", 4 ) )
break;

n = atoi( s ); /* convert to integer */
f = atof( s ); /* convert to float */

if (f != n) { /* if f != n, f is float */
printf("The given number is not integer\n");
return EXIT_FAILURE;
}

switch ( n % 2 ) {
case 0:
printf("%d is even\n", n);
break;
case 1:
printf("%d is odd\n", n);
} /* End of switch */
} /* End of for loop */

printf("\nExiting\n");
return EXIT_SUCCESS;
}
$ gcc -o break2 -std=c99 -pedantic break2.c
$ ./break2

Please type an integer number: 11
11 is odd

Please type an integer number: quit

Exiting

V.6 goto
The goto statement jumps to another point of the program specified by a label (see Figure
V3). Here is an example:
$ cat goto1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int max = 10;
int i;

for (i=0; i < 10; i++) {
if ( i == 3 ) goto END;
printf("%d ", i);
}

END:
printf("\n");

return EXIT_SUCCESS;
}
$ gcc -o goto1 -std=c99 -pedantic goto1.c
$ ./goto1
0 1 2

If the variable i holds the value 3, the goto statement jumps to the label END, which leaves
the for loop.

Figure V3 goto statement

A label does nothing by itself: it only marks a place in the program and is used by the goto
statement only. In the following example, we declare two labels that no goto statement
refers to; the program behaves as if they were not there:
$ cat goto2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int max = 10;
int i;

LOOP_FOR: for (i=0; i < 10; i++) {
printf("%d ", i);
}

END:
printf("\n");

return EXIT_SUCCESS;
}
$ gcc -o goto2 -std=c99 -pedantic goto2.c
$ ./goto2
0 1 2 3 4 5 6 7 8 9

Programmers often avoid using the goto statement because it makes the source code harder
to understand and to debug. So, avoid it whenever you can.

V.7 Nested loops


A nested loop is a loop inside another loop. Here is an example:
$ cat nested_loop1.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int i, j, k;
6
7 for (i = 1; i < 4; i++ ) {
8 printf("-> %d:\n", i);
9
10 for (j = 'A' ; j < 'C'; j++ ) {
11 printf("%c:\n", j);
12
13 for (k = 'a'; k < 'c'; k++ ) {
14 printf("%c\n", k);
15 }
16
17 }
18
19 }
20 return EXIT_SUCCESS;
21 }
$ gcc -o nested_loop1 -std=c99 -pedantic nested_loop1.c
$ ./nested_loop1
-> 1:
A:
a
b
B:
a
b
-> 2:
A:
a
b
B:
a
b
-> 3:
A:
a
b
B:
a
b

Explanation:
o Lines 7-19: Digits from 1 through 3 are displayed. The first for loop contains two
other loops (lines 10 and 13).
o Lines 10-17: characters from A to B are displayed. The second for loop contains
another loop (line 13).
o Lines 13-15: characters from a to b are displayed. This is the last loop.

Nested loops can be used to display multidimensional arrays as shown below:
$ cat nested_loop2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i, j, k;
/* arr is a three-dimensional array */
char arr[][3][2] = {
{ /* first two-dimensional array */
{ 'a', 'b' }, /* first one-dimensional array: 2 elements */
{ 'c', 'd' }, /* second one-dimensional array: 2 elements */
{ 'e', 'f' } /* third one-dimensional array: 2 elements */
},

{ /* second two-dimensional array */
{ 'A', 'B' }, /* first one-dimensional array: 2 elements */
{ 'C', 'D' }, /* second one-dimensional array: 2 elements */
{ 'E', 'F' } /* third one-dimensional array: 2 elements */
}
};

/* display three-dimensional array */
for ( i=0; i < 2; i++ ) {
for ( j=0; j < 3; j++ ) {
for ( k=0; k < 2; k++ )
printf("arr[%d][%d][%d]=%c\n", i, j, k, arr[i][j][k]);

printf(\n);
}

printf(\n);
}

return EXIT_SUCCESS;
}
$ gcc -o nested_loop2 -std=c99 -pedantic nested_loop2.c
$ ./nested_loop2
arr[0][0][0]=a
arr[0][0][1]=b

arr[0][1][0]=c
arr[0][1][1]=d

arr[0][2][0]=e
arr[0][2][1]=f


arr[1][0][0]=A
arr[1][0][1]=B

arr[1][1][0]=C
arr[1][1][1]=D

arr[1][2][0]=E
arr[1][2][1]=F


The break statement leaves the innermost loop body (see Figure V2). That is, it exits the
first loop in which it is directly contained:
$ cat nested_loop3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i, j, k;
for (i = 1; i < 4; i++ ) {
printf("-> i=%d:\n", i);

for (j = 1 ; j < 4; j++ ) {
printf("j=%d:\n", j);

for (k = 1; k < 5; k++ ) {
if ( k == 3 ) {
printf("k=%d. BREAK\n", k);
break;
}

printf("k=%d\n", k);
}

}

}
return EXIT_SUCCESS;
}
$ gcc -o nested_loop3 -std=c99 -pedantic nested_loop3.c
$ ./nested_loop3
-> i=1:
j=1:
k=1
k=2
k=3. BREAK
j=2:
k=1
k=2
k=3. BREAK
j=3:
k=1
k=2
k=3. BREAK
-> i=2:
j=1:
k=1
k=2
k=3. BREAK
j=2:
k=1
k=2
k=3. BREAK
j=3:
k=1
k=2
k=3. BREAK
-> i=3:
j=1:
k=1
k=2
k=3. BREAK
j=2:
k=1
k=2
k=3. BREAK
j=3:
k=1
k=2
k=3. BREAK

Compare with the following one:


$ cat nested_loop4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i, j, k;
for (i = 1; i < 4; i++ ) {
printf("-> i=%d:\n", i);

for (j = 1 ; j < 4; j++ ) {
if ( j == 2 ) {
printf("j=%d: BREAK.\n", j);
break;
}

printf("j=%d:\n", j);

for (k = 1; k < 5; k++ ) {
printf("k=%d\n", k);
}

}

}
return EXIT_SUCCESS;
}
$ gcc -o nested_loop4 -std=c99 -pedantic nested_loop4.c
$ ./nested_loop4
-> i=1:
j=1:
k=1
k=2
k=3
k=4
j=2: BREAK.
-> i=2:
j=1:
k=1
k=2
k=3
k=4
j=2: BREAK.
-> i=3:
j=1:
k=1
k=2
k=3
k=4
j=2: BREAK.

The continue statement does not stop the current loop but jumps to the next iteration of the
innermost loop body (see Figure V1). That is, it branches to next iteration of the
innermost loop in which it is contained:
$ cat nested_loop5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int i, j, k;
for (i = 1; i < 4; i++ ) {
printf("-> i=%d:\n", i);

for (j = 1 ; j < 4; j++ ) {
printf("j=%d:\n", j);

for (k = 1; k < 4; k++ ) {
if ( k == 2 )
continue;

printf("k=%d\n", k);
}

}

}
return EXIT_SUCCESS;
}
$ gcc -o nested_loop5 -std=c99 -pedantic nested_loop5.c
$ ./nested_loop5
-> i=1:
j=1:
k=1
k=3
j=2:
k=1
k=3
j=3:
k=1
k=3
-> i=2:
j=1:
k=1
k=3
j=2:
k=1
k=3
j=3:
k=1
k=3
-> i=3:
j=1:
k=1
k=3
j=2:
k=1
k=3
j=3:
k=1
k=3
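
Since break leaves only the innermost loop, leaving several nested loops at once needs something more. One common approach, sketched below, is a flag that the outer loop tests in its condition. The function name find_product and the search it performs are hypothetical; they only serve to illustrate the technique.

```c
/* Hypothetical helper (illustration only): looks for i and j in [1, 9]
 * such that i * j == target. Returns 1 and stores the pair on success,
 * 0 if no such pair exists. */
int find_product(int target, int *pi, int *pj) {
    int found = 0;   /* flag tested by the outer loop */
    int i, j;

    for (i = 1; i < 10 && !found; i++) {
        for (j = 1; j < 10; j++) {
            if (i * j == target) {
                *pi = i;
                *pj = j;
                found = 1;   /* tells the outer loop to stop as well */
                break;       /* leaves the inner loop only */
            }
        }
    }

    return found;
}
```

The values are stored before the break because the outer loop still evaluates i++ once before its condition stops it.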

V.8 Exercises
Exercise 1. Write a program that takes a list of numbers separated by spaces and displays
the mean value.

Exercise 2. Write a program that takes a character string and displays the number of
consonants and the number of vowels.

Exercise 3. Explain why the following program is not correct.
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *s[] = { "ONE", "TWO", "THREE", "FOUR" };
char **p;

for ( p = s; *p; p++ )
printf("%s\n", *p );

return EXIT_SUCCESS;
}

Exercise 4. Write a program that displays the internal representation of an integer.



Exercise 5. Write a simple program that displays whether the processor is little endian or big
endian.

CHAPTER VI USER-DEFINED TYPES


VI.1 Introduction
So far, we have only worked with types defined by the C language: arithmetic types,
pointers and arrays. Now, you are going to learn to define your own types. In simple C
programs, basic types are enough and you do not actually need to create new types, but you
will shortly find out that creating your own types greatly eases your work as your programs
get more complex. For example, you could define a type called student allowing you to
create objects composed of three attributes: name, surname and age. Once defined, you
will be able to use it as any other type.

VI.2 Enumerations
Consider the following example:
$ cat enum1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int const SUNDAY = 0;
int const MONDAY = 1;
int const TUESDAY = 2;
int const WEDNESDAY = 3;
int const THURSDAY = 4;
int const FRIDAY = 5;
int const SATURDAY = 6;

int d;

d = SUNDAY; printf("d=%d\n", d);
d = FRIDAY; printf("d=%d\n", d);

return EXIT_SUCCESS;
}
$ gcc -o enum1 -std=c99 -pedantic enum1.c
$ ./enum1
d=0
d=5

In the example above, we have defined seven integer constants that represent the days of
the week. The same program can be simplified by using an enumeration type as shown
below:
$ cat enum2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum days { SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY };

enum days d;

d = SUNDAY; printf("d=%d\n", d);
d = FRIDAY; printf("d=%d\n", d);

return EXIT_SUCCESS;
}
$ gcc -o enum2 -std=c99 -pedantic enum2.c
$ ./enum2
d=0
d=5

We defined a new type called days that is an enumerated type. An enumerated type is a list
of integer constant values, each of which is identified by a name. It is defined as follows:
enum enum_tag { id1[=val1], id2[=val2], ..., idN[=valN] };

Where:
o enum_tag is the name you give to the enumeration. It is called an enumeration tag.
o id1, id2, ..., idN are names of constants known as enumeration constants. They are
composed of letters, digits and underscores, starting with a letter or an underscore.
o val1, val2, ..., valN are integer constant expressions. They are of type int. Their values can
be negative.

The enumeration constants id1, ..., idN are initialized respectively with the values of type
int val1, ..., valN. If a value valP is not given to initialize an enumeration constant idP, idP
takes the value of the preceding enumeration constant incremented by one. If the very first
value val1 is not specified, id1 takes the value zero. The declaration of an enumeration creates a
new type.
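
The numbering rule can be checked on a small case mixing explicit and implicit values. The enumeration below and its constant names are hypothetical, chosen only to illustrate the rule; the comments give the value each constant receives.

```c
#include <stdio.h>

/* Hypothetical enumeration used only to illustrate the numbering rule. */
enum level {
    LOW = 5,      /* explicit value: 5 */
    MEDIUM,       /* no value given: preceding constant + 1, i.e. 6 */
    HIGH = -2,    /* explicit values may be negative */
    CRITICAL,     /* preceding constant + 1, i.e. -1 */
    ALIAS = LOW   /* two constants may share the same value: 5 */
};

/* Prints each enumeration constant with its value. */
void print_levels(void) {
    printf("LOW=%d MEDIUM=%d HIGH=%d CRITICAL=%d ALIAS=%d\n",
           LOW, MEDIUM, HIGH, CRITICAL, ALIAS);
}
```

Calling print_levels() from main() displays the five values on one line.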

Keep in mind that an enumeration tag is not a type specifier (type name) but the name of the
enumeration. Consequently, once an enumerated type has been defined, you can use it as
any type but you still have to specify the keyword enum before the tag when declaring a
variable. To declare a variable of enumerated type whose tag is enum_tag, use the following
syntax:
enum enum_tag var;

A variable of enumerated type is supposed to take one of the integer constants defined by
the enumeration. Setting it to an arbitrary integer value makes no sense: in that case,
you had better use an integer type instead of an enumeration type.

In our example enum2.c, we did not give initialization values to the enumeration constants,
which caused the enumeration constant SUNDAY to take the value 0, MONDAY the value 1,
and so on. In the following example, we specify the very first initialization value:
$ cat enum3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum days { SUNDAY=1, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY };

enum days d;

d = SUNDAY; printf("d=%d\n", d);
d = FRIDAY; printf("d=%d\n", d);

return EXIT_SUCCESS;
}
$ gcc -o enum3 -std=c99 -pedantic enum3.c
$ ./enum3
d=1
d=6

In the following example, we provide an explicit value to every enumeration constant:


$ cat enum4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 };

enum shape s;


s = CIRCLE; printf("s=%d\n", s);
s = TRIANGLE; printf("s=%d\n", s);

return EXIT_SUCCESS;
}
$ gcc -o enum4 -std=c99 -pedantic enum4.c
$ ./enum4
s=0
s=3

You are allowed to use an unnamed enumerated type by omitting the tag, as in the following
example:
$ cat enum5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum { EVEN = 0, ODD = 1 } remainder;

int x = 10;
remainder = x % 2;

if ( remainder == EVEN ) printf("%d is even\n", x);
else if ( remainder == ODD ) printf("%d is odd\n", x);

return EXIT_SUCCESS;
}
$ gcc -o enum5 -std=c99 -pedantic enum5.c
$ ./enum5
10 is even

As said earlier, when you declare a variable of enumerated type, you have to use the
keyword enum before the tag. There is a convenient way to bypass it: using the typedef
statement that creates an alias for the enumerated type as shown below:
$ cat enum6.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 };
typedef enum shape shape;


shape s;

s = TRIANGLE; printf("s=%d\n", s);

return EXIT_SUCCESS;
}
$ gcc -o enum6 -std=c99 -pedantic enum6.c
$ ./enum6
s=3

The typedef statement can also be used at the time of the declaration of the enumerated
type:
$ cat enum7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 } shape;

shape s;

s = TRIANGLE; printf("s=%d\n", s);

return EXIT_SUCCESS;
}


The C language lets you declare an enumeration type and variables of that type at the
same time:
enum [enum_tag] { id1[=val1], id2[=val2], ..., idN[=valN] } [var1[, var2]];

Under this form, the tag can be omitted (anonymous enumeration). The following example
creates a new enumeration and two variables with a single declaration:
$ cat enum8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 } s1,s2;

s1 = TRIANGLE; printf("s1=%d\n", s1);


return EXIT_SUCCESS;
}
$ gcc -o enum8 -std=c99 -pedantic enum8.c
$ ./enum8
s1=3

The following example creates a variable having an anonymous enumeration type:


$ cat enum9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 } e;

e = TRIANGLE; printf("e=%d\n", e);

return EXIT_SUCCESS;
}
$ gcc -o enum9 -std=c99 -pedantic enum9.c
$ ./enum9
e=3

As an enumeration type is an integer type, the arithmetic conversion rules apply (see
Chapter II Section II.11 and more specifically Chapter IV Section IV.14). You can assign
a variable of arithmetic type an enumeration constant or a variable of enumerated type as
shown below:
$ cat enum10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 };
enum shape s = RECTANGLE;

int e = TRIANGLE; printf("e=%d\n", e);
int f = s; printf("f=%d\n", f);

return EXIT_SUCCESS;
}
$ gcc -o enum10 -std=c99 -pedantic enum10.c
$ ./enum10
e=3
f=4

Since enumeration types are integer types, enumeration constants and variables of
enumerated type can be used with arrays as in the following example:
$ cat enum11.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum days { SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY };
char *name_days[] = {[SUNDAY] = "SUNDAY",
[MONDAY]="MONDAY",
[TUESDAY]="TUESDAY",
[WEDNESDAY]="WEDNESDAY",
[THURSDAY]="THURSDAY",
[FRIDAY]="FRIDAY",
[SATURDAY]="SATURDAY"
}; // subscripts are enumeration constants

int i;
enum days iD = MONDAY;
char *sD = name_days[ iD ]; // subscript is a variable of enumeration type

printf("%d->%s\n", iD, sD);

printf("\nList days:\n");
for (i=SUNDAY; i <= SATURDAY; i++)
printf("%d->%s\n", i, name_days[i]);

return EXIT_SUCCESS;
}
$ gcc -o enum11 -std=c99 -pedantic enum11.c
$ ./enum11
1->MONDAY

List days:
0->SUNDAY
1->MONDAY
2->TUESDAY
3->WEDNESDAY
4->THURSDAY
5->FRIDAY
6->SATURDAY


Obviously, in a consistent program, an object of enumerated type is supposed to be
assigned an enumeration constant or an object of the same type. Since an enumerated type
is an integer type, you could assign an arbitrary integer value to a variable of enumerated
type, but the behavior depends on the implementation: a compiler may choose to represent an
enumerated type by char, a signed integer type or an unsigned integer type. In Chapter VI
Section VI.7.2, we will say more about conversions between integers and enumerated types. To
write a portable C program, if you actually want to use an integer value, do not set a
variable of enumerated type to just any integer value: set it to a value in the range
[0, SCHAR_MAX] or between the minimum and the maximum enumeration constants. It is good
practice to set it to an enumeration constant or a variable
of the same type, as in the following code snippet.
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 };

enum shape s1=RECTANGLE, s2;
s2 = s1;
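
When an integer comes from outside the program (user input, a file), one way to follow this advice is to check it against the enumeration constants before storing it in a variable of enumerated type. The sketch below assumes the enumeration shape shown above; the function name to_shape and its return convention are hypothetical.

```c
enum shape { CIRCLE = 0, RECTANGLE = 4, TRIANGLE = 3 };

/* Hypothetical helper (illustration only): stores v in *out and returns 1
 * only if v is the value of one of the enumeration constants of enum shape;
 * returns 0 and leaves *out untouched otherwise. */
int to_shape(int v, enum shape *out) {
    switch (v) {
    case CIRCLE:
    case RECTANGLE:
    case TRIANGLE:
        *out = (enum shape)v;   /* v is known to be a valid constant */
        return 1;
    default:
        return 0;               /* not a value defined by the enumeration */
    }
}
```

For instance, to_shape(4, &s) stores RECTANGLE in s, while to_shape(2, &s) is rejected because no enumeration constant has the value 2.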

VI.3 Structures
VI.3.1 Declaration
VI.3.1.1 Complete type
A structure, also known as a record in computer science, is a data structure that comprises
a set of elements that can have the same or different types. Each element is called a member
of the structure (in computer science it is also known as a field). In C, a structure is declared
as follows:
struct struct_name {
obj_type1 mem1;
obj_type2 mem2;
...
obj_typeN memN;
};

Where:
o struct_name, called a tag [48], is the identifier of the structure, composed of letters, digits
and underscores and starting with an underscore or a letter. The new type called struct
struct_name can be used to declare variables.
o obj_type1, obj_type2, ..., obj_typeN are the types of the members mem1, mem2, ..., memN.
o mem1, mem2, ..., memN are the identifiers of the members.



The members can be of any type with the exception of variably modified types (VM types,
see Chapter III Section III.9, and Chapter VII Section VII.17). A declaration of a
structure specifying its members is called a definition: the type is said to be complete since
the compiler has enough information to compute its size.

In the following example, we define the structure student composed of three members:
first_name, last_name and age:
$ cat struct_decl1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct student {
char *first_name;
char *last_name;
int age;
};

printf("sizeof(struct student) = %zu\n", sizeof(struct student) );

return EXIT_SUCCESS;
}
$ gcc -o struct_decl1 -std=c99 -pedantic struct_decl1.c
$ ./struct_decl1
sizeof(struct student) = 12

The structure student occupies 12 bytes in our computer. This is enough to hold two
pointers (a pointer fits in four bytes in our computer) and one int (four bytes in our
computer). The size of a structure is at least the sum of the sizes of its members: the
compiler may add padding bytes between members or at the end of the structure to satisfy
alignment requirements.
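
To see why the size is only "at least" the sum of the member sizes, you can compare sizeof applied to a structure with the sum of sizeof applied to each member. The structure sample and the function sample_padding below are hypothetical and serve only as a sketch; the number of padding bytes depends on the compiler and the platform.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical structure used to observe padding. */
struct sample {
    char c;
    int  n;   /* the compiler may insert padding before n to align it */
    char d;   /* padding may also follow the last member */
};

/* Number of padding bytes the compiler added to struct sample. */
size_t sample_padding(void) {
    size_t members = sizeof(char) + sizeof(int) + sizeof(char);

    return sizeof(struct sample) - members;
}
```

On many platforms where an int occupies four bytes, the structure typically takes 12 bytes although its members need only 6, but the exact figure is implementation-defined. Print the result with the %zu conversion specifier, which is the one intended for values of type size_t.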

A structure type is a programmer-defined type you can use to declare objects as you would
do with any other type. However, the keyword struct must still be specified when declaring
an object of structure type:
struct struct_name obj;

Here is an example:
$ cat struct_decl2.c
#include <stdio.h>
#include <stdlib.h>
#define NAME_MAX_LEN 32

int main(void) {
struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

struct student st1;

return EXIT_SUCCESS;
}

In the above example, the object st1 is declared with the type struct student.

The typedef statement is often used to create an alias for a structure type.
$ cat struct_decl3.c
#include <stdio.h>
#include <stdlib.h>
#define NAME_MAX_LEN 32

int main(void) {
struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};
typedef struct student student;

student st1;

return EXIT_SUCCESS;
}

The typedef statement can be placed before the declaration of the structure.
$ cat struct_decl4.c
#include <stdio.h>
#include <stdlib.h>
#define NAME_MAX_LEN 32


int main(void) {
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

student st1;

return EXIT_SUCCESS;
}

The typedef statement can also be used at the time of the declaration of the structure.
$ cat struct_decl5.c
#include <stdio.h>
#include <stdlib.h>
#define NAME_MAX_LEN 32

int main(void) {
typedef struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
} student;

student st1;

return EXIT_SUCCESS;
}

In C, you can also declare objects with an anonymous structure type. In this case, the
structure tag is just omitted as shown below:
$ cat struct_decl6.c
#include <stdio.h>
#include <stdlib.h>
#define NAME_MAX_LEN 32

int main(void) {
struct {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
} st1, st2;

return EXIT_SUCCESS;
}



VI.3.1.2 Incomplete structure type
The C language lets you declare a structure without providing its members, in which case
the compiler creates an incomplete type that you cannot use to declare a variable
until you define it by specifying all its members. The type is incomplete because the
compiler cannot compute its size. An incomplete structure type is explicitly declared as
follows:
struct struct_name;

We will explain the use of such a declaration in Chapter VI Section VI.3.7 and Chapter
VIII Section VIII.6.3.2. An incomplete type is a known type but with an unknown size.
After declaring an incomplete structure type, later, somewhere within the program, you
have to complete it before using it as shown below:
$ cat struct_decl7.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct my_integer; // type declared: incomplete type
struct my_integer { int k; }; // type defined: it is complete

struct my_integer k; // valid

return EXIT_SUCCESS;
}

Normally, in C, if you declare a variable with an unknown type, you get an error
indicating the type does not exist as shown below:
$ cat struct_decl8.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
my_integer k;

return EXIT_SUCCESS;
}
$ gcc -o struct_decl8 -std=c99 -pedantic struct_decl8.c
struct_decl8.c: In function 'main':
struct_decl8.c:5:3: error: 'my_integer' undeclared (first use in this function)
struct_decl8.c:5:3: note: each undeclared identifier is reported only once for each function it appears in
struct_decl8.c:5:14: error: expected ';' before 'k'

The compiler complained, logically: the type my_integer was unknown to it. With
structure types, things are quite different. It is worth noting that the keyword struct followed
by a tag always creates a new structure type if no structure with that tag is visible. Compare the
previous example with the following:
$ cat struct_decl9.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct my_integer k;

return EXIT_SUCCESS;
}
$ gcc -o struct_decl9 -std=c99 -pedantic struct_decl9.c
struct_decl9.c: In function 'main':
struct_decl9.c:5:21: error: storage size of 'k' isn't known

In the example above, we got a different error. The compiler did not say that the structure type
did not exist but that it had an unknown size. What does it mean? Keep in mind that the keyword
struct followed by a tag creates a new type if no structure type with that tag is visible (the rule
has many consequences, as we will find out throughout the book). If the members are
specified, the structure type is complete, but if the members are not present, the new
structure type is incomplete: the compiler does not have enough information to compute its size
and therefore cannot allocate the appropriate storage for an object of such a type. Thus, as no
structure type with the tag my_integer was visible at the time of the declaration of the object
k, the declaration struct my_integer k created an incomplete type and declared the variable k
with that type. Everything happens as if we had previously declared the incomplete structure type.
The example struct_decl9.c is equivalent to the following one:
$ cat struct_decl10.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct my_integer; // declare incomplete structure type

struct my_integer k; // declare k with an incomplete type. Not permitted

return EXIT_SUCCESS;
}
$ gcc -o struct_decl10 -std=c99 -pedantic struct_decl10.c
struct_decl10.c: In function 'main':
struct_decl10.c:7:21: error: storage size of 'k' isn't known

In summary, if you declare an object with the keyword struct followed by a tag and no
structure type with that tag is visible, the compiler creates an incomplete structure type. If a
structure type with that tag is visible, the compiler simply declares the object with that type.
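
An incomplete structure type cannot be used to declare an object, but a pointer to it can be declared, because the size of a pointer does not depend on the size of the type it points to. The sketch below shows this on a hypothetical structure node (the names node, head, single and first_value are assumptions made for the illustration); the sections referenced above cover the real use cases.

```c
#include <stddef.h>   /* NULL */

struct node;                /* incomplete type: its size is unknown */
struct node *head = NULL;   /* allowed: only a pointer is declared */

struct node {               /* the type is completed here */
    int value;
    struct node *next;      /* a member may point to the structure itself */
};

/* Objects of type struct node can be declared now that the type is complete. */
static struct node single = { 42, NULL };

/* Makes head point to the object single and returns the value stored in it. */
int first_value(void) {
    head = &single;
    return head->value;
}
```

Declaring struct node single; before the definition of the members would be rejected with the "storage size isn't known" error shown earlier.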

VI.3.2 Initializing structures


Initializing an object means giving it a value at the time of the declaration. You can
initialize an object obj of structure type by providing values between braces as for arrays.
At declaration time, a structure can be initialized (such a declaration is called a definition)
as follows:
struct struct_name obj = {
val1,
val2,
...
valN,
};

Where struct_name is declared as follows:
struct struct_name {
obj_type1 mem1;
obj_type2 mem2;
...
obj_typeN memN;
};

The members mem1, mem2, ..., memN are respectively assigned the values val1, val2, ..., valN.
Here is an example:
$ cat struct_init1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1 = {"Christine", "Sun", 35 };
student st2 = {"David", "Moon", 44 };

return EXIT_SUCCESS;
}

The drawback of this method is that the values within braces must appear in the same order as
the members to be initialized. For example, the statement student st1 = {"Christine", "Sun", 35 }
sets the member first_name to "Christine", last_name to "Sun" and age to 35. Why is it a
drawback? If you have a structure with several members, say five members, and you wish
to initialize only the last one, you cannot do it with this method without also giving values
to the four members that precede it. Fortunately, C99
introduced a new way of initializing an object of structure type, by naming
the members to be initialized (designated initializers):
struct struct_name obj = {
.memx=valx,
.memy=valy,
...
};

Our previous example can be also written as follows:


$ cat struct_init2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1 = {.age=35, .last_name="Sun", .first_name="Christine"};
student st2 = {.first_name="David", .age=44, .last_name="Moon", };

return EXIT_SUCCESS;
}

What is then the value of the members you do not list in the initializer? When an object is
declared with an initializer, the members that are not explicitly initialized are set to 0
(a null pointer for pointers), whatever the storage duration of the object. Things differ when
the object is declared without any initializer: if it has automatic storage duration, its
members have an indeterminate value; if it has static storage duration, its members take the
value 0. We will not talk about storage duration now but in Chapter VII Section VII.7.
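
The C99 standard specifies that when an initializer gives values to only some of the members, the remaining members are initialized as if the object had static storage duration, that is, to 0 or to a null pointer. The sketch below illustrates this with the structure student used in this section; the function make_partial is hypothetical and exists only to return such an object.

```c
struct student {
    char *first_name;
    char *last_name;
    int age;
};

/* Hypothetical helper (illustration only): returns an object for which
 * only first_name is named in the initializer. */
struct student make_partial(void) {
    /* st has automatic storage duration, yet last_name and age are
     * implicitly initialized because an initializer is present */
    struct student st = { .first_name = "Christine" };

    return st;   /* st.last_name is a null pointer, st.age is 0 */
}
```

Without any initializer (struct student st;), the members of an automatic object would hold indeterminate values and must not be read before being assigned.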

After the declaration of an object of structure type, you cannot set new values as described
earlier. The following example will fail to compile:
$ cat struct_init3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1;

st1 = {.age=35, .last_name="sun", .first_name="Christine"};

return EXIT_SUCCESS;
}
$ gcc -o struct_init3 -std=c99 -pedantic struct_init3.c
struct_init3.c: In function 'main':
struct_init3.c:15:9: error: expected expression before '{' token

After the declaration, to set values to members, you have to access the members of the
structure as described in the following section.
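
As a side note, C99 also offers compound literals, which are not presented in this section: a brace-enclosed list preceded by a type name in parentheses is an expression, so it can appear on the right-hand side of an assignment. The following sketch is an alternative to setting the members one by one; the function make_student is hypothetical and only wraps the assignment.

```c
struct student {
    char *first_name;
    char *last_name;
    int age;
};

/* Hypothetical helper (illustration only): assigns a whole structure
 * after its declaration by means of a C99 compound literal. */
struct student make_student(void) {
    struct student st1;

    /* (struct student){ ... } builds an unnamed object of type struct student,
     * whose value is then copied into st1 */
    st1 = (struct student){ .age = 35, .last_name = "Sun", .first_name = "Christine" };

    return st1;
}
```

Compared with struct_init3.c, the only difference is the type name in parentheses before the opening brace.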

VI.3.3 Accessing members


We have learned how to declare a structure. Let us take one more step forward: how
can we access a member? And how can we modify it?



The member-access operator denoted by . (dot) allows you to access a member of a
structure. If struct_obj is an object of structure type, struct_obj.obj_mb1 represents the member
obj_mb1. The example below declares the object st1, initializing it, and displays the values
of the members:
$ cat struct_access1.c
#include <stdio.h>
#include <stdlib.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char first_name[NAME_MAX_LEN];
char last_name[NAME_MAX_LEN];
int age;
};

student st1 = {"Christine", "Sun", 35 };
student st2 = {"David", "Moon", 44 };

printf("First Name: %s\n", st1.first_name);
printf("Last Name: %s\n", st1.last_name);
printf("Age: %d\n\n", st1.age);

printf("First Name: %s\n", st2.first_name);
printf("Last Name: %s\n", st2.last_name);
printf("Age: %d\n", st2.age);

return EXIT_SUCCESS;
}
$ gcc -o struct_access1 -std=c99 -pedantic struct_access1.c
$ ./struct_access1
First Name: Christine
Last Name: Sun
Age: 35

First Name: David
Last Name: Moon
Age: 44

The following example is equivalent to the previous one. After declaring the object st1,
without initializing it, it assigns values to its members and displays them:
$ cat struct_access2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

student st1;

strcpy(st1.first_name, "Christine");
strcpy(st1.last_name, "Sun");
st1.age = 35;

printf("First Name: %s\n", st1.first_name);
printf("Last Name: %s\n", st1.last_name);
printf("Age: %d\n", st1.age);

return EXIT_SUCCESS;
}
$ gcc -o struct_access2 -std=c99 -pedantic struct_access2.c
$ ./struct_access2
First Name: Christine
Last Name: Sun
Age: 35

VI.3.4 Array of structures


An array can hold elements of structure type. In the following example, the array
student_list contains a set of elements having a structure type.
$ cat struct_array1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
int nb_elt = 10; /* maximum number of students in array student_list */
int i;
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

student student_list[ nb_elt ];

strcpy(student_list[0].first_name, "Christine");
strcpy(student_list[0].last_name, "Sun");
student_list[0].age = 35;

strcpy(student_list[1].first_name, "David");
strcpy(student_list[1].last_name, "Moon");
student_list[1].age = 44;

student_list[2].first_name[0] = '\0';
student_list[2].last_name[0] = '\0';
student_list[2].age = 0;

/* Display list of elements in array student_list */
for (i=0; i < nb_elt; i++ ) {
if ( ! student_list[i].age )
break;

printf("First Name: %s\n", student_list[i].first_name);
printf("Last Name: %s\n", student_list[i].last_name);
printf("Age: %d\n\n", student_list[i].age);
}

return EXIT_SUCCESS;
}
$ gcc -o struct_array1 -std=c99 -pedantic struct_array1.c
$ ./struct_array1
First Name: Christine
Last Name: Sun
Age: 35

First Name: David
Last Name: Moon
Age: 44

The example is straightforward, except perhaps for the lines student_list[2].first_name[0] =
'\0' and student_list[2].last_name[0] = '\0'. The third element of the array (subscript 2) is
used as a sentinel indicating there are no more items: the loop stops at the first element
whose member age is 0. Note that the subscript operator ([]) and the member-access operator
(.) have the same precedence and associate from left to right, so
student_list[2].first_name[0] is equivalent to ((student_list[2]).first_name)[0].

VI.3.5 Pointer to structure


Structures allow us to build high-level data structures involving pointers. The following
example declares a pointer to a structure:
$ cat struct_pointer1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

student *st1 = malloc( sizeof( student ) );

if ( st1 == NULL ) {
printf("Cannot allocate memory for pointer st1\n");
return EXIT_FAILURE;
}

strcpy( (*st1).first_name, "Christine" );
strcpy( (*st1).last_name, "Sun" );
(*st1).age = 35;

printf("First Name: %s\n", (*st1).first_name);
printf("Last Name: %s\n", (*st1).last_name);
printf("Age: %d\n", (*st1).age);

free( st1 );

return EXIT_SUCCESS;
}
$ gcc -o struct_pointer1 -std=c99 -pedantic struct_pointer1.c
$ ./struct_pointer1
First Name: Christine
Last Name: Sun
Age: 35

The pointer st1 points to a structure: we allocated a memory area large enough to store an
object of type student. Notice that, to access the members, we first had to dereference
the pointer in order to reach the object it points to. The parentheses are required
because the member-access operator (.) has higher precedence than the dereference
operator (*). The C language provides a more convenient operator for accessing members
without explicitly dereferencing the pointer: if p_obj is a pointer to a structure,
p_obj->mb1 denotes the member mb1 of the pointed-to object. Thus, (*st1).first_name can
also be written st1->first_name. As a consequence, our previous example can be rewritten
more gracefully as follows:
$ cat struct_pointer2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

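student *st1 = malloc( sizeof( student ) );

if ( st1 == NULL ) {
printf("Cannot allocate memory for pointer st1\n");
return EXIT_FAILURE;
}

strcpy( st1->first_name, "Christine" );
strcpy( st1->last_name, "Sun" );
st1->age = 35;

printf("First Name: %s\n", st1->first_name);
printf("Last Name: %s\n", st1->last_name);
printf("Age: %d\n", st1->age);

free( st1 );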

return EXIT_SUCCESS;
}
$ gcc -o struct_pointer2 -std=c99 -pedantic struct_pointer2.c
$ ./struct_pointer2
First Name: Christine
Last Name: Sun
Age: 35

In example struct_array1.c, we defined an array of structures. The drawback of arrays is
that we cannot increase their size if there is not enough space to hold new elements: the
array size is set once and for all at the time of the declaration. That is why dynamically
allocated memory, accessed through a pointer, is often preferred: the block can be resized
as needed (for example with realloc()). In the following example, we rewrite the
example struct_array1.c with pointers:
$ cat struct_pointer3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
int nb_elt = 10; /* number of students in student_list */
int i;
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
};

student *student_list = malloc (nb_elt * sizeof *student_list );

if ( !student_list) {
printf("Cannot allocate memory for pointer student_list\n");
return EXIT_FAILURE;
}

strcpy( student_list[0].first_name, "Christine" );
strcpy( student_list[0].last_name, "Sun" );
student_list[0].age = 35;

strcpy( student_list[1].first_name, "David" );
strcpy( student_list[1].last_name, "Moon" );
student_list[1].age = 44;

strcpy( student_list[2].first_name, "EOF_ARRAY" );
strcpy( student_list[2].last_name, "EOF_ARRAY" );
student_list[2].age = 0;

/* Display list of elements in array student_list */
for (i=0; i < nb_elt; i++ ) {
if ( ! strcmp( student_list[i].first_name, "EOF_ARRAY" ) )
break;
printf("First Name: %s\n", student_list[i].first_name);
printf("Last Name: %s\n", student_list[i].last_name);
printf("Age: %d\n\n", student_list[i].age);
}

return EXIT_SUCCESS;
}
$ gcc -o struct_pointer3 -std=c99 -pedantic struct_pointer3.c
$ ./struct_pointer3
First Name: Christine
Last Name: Sun
Age: 35

First Name: David
Last Name: Moon
Age: 44

VI.3.6 Nested structures


VI.3.6.1 Accessing members of nested structures

As you may have guessed, structures allow building advanced types. For example,
members of a structure can be themselves structures. Structures containing structures are
called nested structures. For example, the following structure is a nested structure:
struct my_struct1 {
struct {
int a;
int b;
} mem1;

float f;
};


The initialization of such a structure is quite natural. Since the inner structure struct { int a;
int b; } can be initialized by { 10, 20 }, the structure my_struct1 can be initialized with
{ {10, 20}, 10.8 }.

The question that naturally arises is how could we access the members of nested
structures? In the same way as simple structures. For example, if we declare the object st1
as struct my_struct1 st1
o The member a of the nested structure is accessed like this: st1.mem1.a
o The member b of the nested structure is accessed like this: st1.mem1.b
o The member f is accessed like this: st1.f

If ptr_st1 is declared as struct my_struct1 *ptr_st1:
o The member a of the nested structure is accessed through ptr_st1->mem1.a
o The member b of the nested structure is accessed through ptr_st1->mem1.b
o The member f is accessed like this: ptr_st1->f

Here is an example:
$ cat struct_nested1.c
#include <stdio.h>
#include <stdlib.h>

struct my_struct1 {
struct {
int a;
int b;

} mem1;

float f;
};

int main(int argc, char **argv) {
struct my_struct1 st1 = { {10,20}, 10.8 };
struct my_struct1 *ptr_st1 = &st1;

printf("%d %d %f\n", st1.mem1.a, st1.mem1.b, st1.f);
printf("%d %d %f\n", ptr_st1->mem1.a, ptr_st1->mem1.b, ptr_st1->f);
return EXIT_SUCCESS;
}
$ gcc -o struct_nested1 -std=c99 -pedantic struct_nested1.c
$ ./struct_nested1
10 20 10.800000
10 20 10.800000


What if a member is a pointer to another structure? In the following structure, the member
mem1 is a pointer to a structure:
struct my_struct2 {
struct {
int a;
int b;
} *mem1;

float f;
};


If we declare the object st2 as struct my_struct2 st2:
o The member a of the inner structure is accessed like this: st2.mem1->a
o The member b of the inner structure is accessed like this: st2.mem1->b

If we declare the object ptr_st2 as struct my_struct2 *ptr_st2:
o The member a of the inner structure can be accessed like this: ptr_st2->mem1->a
o The member b of the inner structure can be accessed like this: ptr_st2->mem1->b

For example:
$ cat struct_nested2.c
#include <stdio.h>
#include <stdlib.h>

struct my_struct1 {
struct {
int a;
int b;
} *mem1;

float f;
};

int main(int argc, char **argv) {
struct my_struct1 st1;
struct my_struct1 *ptr_st1 = &st1;

st1.mem1 = malloc(sizeof *(st1.mem1));
if ( st1.mem1 == NULL ) {
printf("Cannot allocate memory\n");
return EXIT_FAILURE;
}

st1.mem1->a = 10; /* same as ptr_st1->mem1->a = 10 */
st1.mem1->b = 20; /* same as ptr_st1->mem1->b = 20 */
st1.f = 10.8; /* same as ptr_st1->f = 10.8 */

printf("%d %d %f\n", st1.mem1->a, st1.mem1->b, st1.f);
printf("%d %d %f\n", ptr_st1->mem1->a, ptr_st1->mem1->b, ptr_st1->f);

free(st1.mem1); /* same as free(ptr_st1->mem1) */
return EXIT_SUCCESS;
}
$ gcc -o struct_nested2 -std=c99 -pedantic struct_nested2.c
$ ./struct_nested2
10 20 10.800000
10 20 10.800000



VI.3.6.2 Initializing nested structures
Suppose you wish to save in data structures information about students: their first name,
last name and birth date. You have many ways to implement it. A simple way to do it
could be:
struct student {

char first_name[72];
char last_name[72];
char birthdate[9]; /* such as 15122000 */
};

It also could be implemented like this:


struct student {
struct person {
char first_name[72];
char last_name[72];
} person;

struct date {
int month;
int day;
int year;
} birthdate;
};


In the latter case, our structure student is composed of two members that are themselves of
structure type: person and birthdate.

Now, how do you think such a structure could be initialized? In the same manner as we
did for simpler structures. Since we have two methods for initializing members, and due
to the nesting of the structure, you have several ways to initialize it: by giving values
without specifying the members, by giving values while specifying the members, or by
mixing both. Let us consider the first embedded structure person. We could initialize it in
two ways:
o { "Christine", "sun" }
o Or { .first_name="Christine", .last_name="sun" }

For the second embedded structure date we also have two ways:
o { 7, 4, 2002 }
o Or { .year=2002, .month=7, .day=4 }

This implies you have several ways to initialize the structure student:
o struct student st1 = {
{ "Christine", "sun" },
{ 7, 4, 2002 },
};

o struct student st1 = {
{ .first_name="Christine", .last_name="sun" },
{ 7, 4, 2002 },
};

o struct student st1 = {
{ .first_name="Christine", .last_name="sun" },
{ .year=2002, .month=7, .day=4 }
};

o struct student st1 = {
.person={ .first_name="Christine", .last_name="sun" },
.birthdate={ 7, 4, 2002 },
};

o struct student st1 = {
.person={ "Christine", "sun" },
.birthdate={ 7, 4, 2002 },
};

Here is a piece of code showing what we said:
$ cat struct_nested3.c
#include <stdio.h>
#include <stdlib.h>

#define MAX_NAME_LEN 72

int main(void) {
struct student {
struct person {
char first_name[MAX_NAME_LEN];

char last_name[MAX_NAME_LEN];
} person;

struct date {
int month;
int day;
int year;
} birthdate;
};

struct student st1 = {
{ "Christine", "sun" },
{ 7, 4, 2002 },
};

struct student st2 = {
{ .first_name="Christine", .last_name="sun" },
{ 7, 4, 2002 },
};

struct student st3 = {
{ .first_name="Christine", .last_name="sun" },
{ .year=2002, .month=7, .day=4 }
};

struct student st4 = {
.person={ .first_name="Christine", .last_name="sun" },
.birthdate={ 7, 4, 2002 },
};

struct student st5 = {
.person={ "Christine", "sun" },
.birthdate={ 7, 4, 2002 },
};

struct student list_st[] = { st1, st2, st3, st4, st5 };
int i;
int nb_elt = sizeof list_st/sizeof list_st[0];

for (i=0; i < nb_elt; i++)
printf("%s %s %d/%d/%d\n",
list_st[i].person.first_name,
list_st[i].person.last_name,
list_st[i].birthdate.month,
list_st[i].birthdate.day,
list_st[i].birthdate.year);

return EXIT_SUCCESS;
}
$ gcc -o struct_nested3 -std=c99 -pedantic struct_nested3.c
$ ./struct_nested3
Christine sun 7/4/2002
Christine sun 7/4/2002
Christine sun 7/4/2002
Christine sun 7/4/2002
Christine sun 7/4/2002

VI.3.7 Incomplete types and forward references


There are two kinds of declarations for structure types: declarations including a definition
and simple declarations. A declaration that specifies the members of a structure is a
definition: the type is complete. A simple declaration, that omits the members of a
structure, declares an incomplete structure type.

An incomplete type is a type whose size is unknown. A structure type that has been declared
but not defined is an incomplete type. There are several kinds of incomplete types
(described in Chapter VIII Section VIII.6.3.2); an incomplete structure type is only one of
them. An incomplete type can be explicitly declared as in the following example:
struct string;

An incomplete structure type is also created implicitly by the declaration of a pointer to a
structure type that has not been declared yet. Incomplete structure types can be used in
two special contexts:
o Declaring a pointer to the structure type (which declares the type if it does not exist yet)
o Creating an alias for the structure type by using typedef

The following example is valid:
$ cat struct_incomplete1.c
int main(void) {
struct string *p; // pointer to incomplete type

return 0;
}

It is equivalent to:
int main(void) {
struct string;
struct string *p; // pointer to incomplete type

return 0;
}

Standard C allows declaring a pointer to an incomplete type because it is not necessary to
know the size of the pointed-to type: the size of the pointer itself is always known, so
storage can be allocated for it when it is declared. You may object that pointers to
structures could have a size depending on the structure. Fortunately, this is not the case:
all pointers to structure types have the same representation and alignment requirements.

As long as a pointer to an incomplete type is not dereferenced, all is fine but before
dereferencing it, the structure type struct string has to be completed. Completing a
structure type means declaring it by defining its members. You can do it after the
incomplete type is declared as shown below:
$ cat struct_incomplete2.c
int main(void) {
struct string *p; // pointer to incomplete type. Forward reference
struct string {
char *s;
int len;
}; // struct string is complete

return 0;
}

A new type deriving from an incomplete type can be created with typedef:
$ cat struct_incomplete3.c

int main(void) {
typedef struct string string;
return 0;
}

The new type string cannot be used to declare objects (other than pointers to it) until it
is completed.

Allowing incomplete structure types and pointers to incomplete types is very useful.
Consider two structures that reference each other: without such a feature, you would not be
able to declare them. The following example uses this facility:


struct A {
char s[255];
struct B *p; // forward reference: points to struct B not yet defined
};

struct B {
int k;
struct A *q;
};

In the example above, the pointer p points to a type whose definition is delayed (forward
reference): at the time the member p of the structure A is declared, the structure B has not
been defined yet. In contrast, the following declaration of the structure A is not valid
because at the time of the declaration of the member str_b, the structure B has not been
defined (its size is unknown and then the member str_b cannot be allocated storage):
struct A {
char s[255];
struct B str_b; // invalid: struct B is an incomplete type
};

struct B {
int k;
struct A str_b; // valid, struct A is a complete type
};


The following example also takes advantage of this feature allowing building recursive
high-level data structures such as linked lists:
struct string {
char s[255];
int len;
};

struct node {
struct string s;
struct node *ptr_next_node;
};

In the example above, the pointer ptr_next_node points to an incomplete type: at the time the
member ptr_next_node of the structure node is declared, the size of the structure node is still
unknown since its definition is being constructed. The definition of a structure is
considered complete when the closing brace } is encountered.



Moreover, this feature allows encapsulating your data safely and efficiently as we will find
out in Chapter VIII Section VIII.11.

VI.3.8 High-level data structures


Combining pointers and structures enables us to create high-level data structures. The most
commonly used data structures are linked lists and trees.

VI.3.8.1 Linked lists
A linked list is a collection of structures called nodes. Each structure contains data and a
pointer to another structure as depicted in Figure VI-1.

Figure VI-1 Linked list

The pointer held by the last element of a linked list is a null pointer, which marks the tail
of the linked list. The head of a linked list is the very first allocated structure. Our examples
struct_array1.c and struct_pointer3.c can be rewritten by using a linked list (see Figure VI-1):
$ cat struct_hl_ds1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
int nb_elt = 10; /* number of students in student_list */
int i;
typedef struct student student;
student *p, *student_list, *q;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name[ NAME_MAX_LEN ];
int age;
student *p_next;
};

/* first structure: head */
student_list = malloc ( sizeof *student_list );
if ( !student_list) {
printf("Cannot allocate memory for pointer student_list\n");
return EXIT_FAILURE;
}

strcpy( student_list->first_name, "Christine" );
strcpy( student_list->last_name, "Sun" );
student_list->age = 35;

p = malloc ( sizeof *student_list ); /* allocate memory for next structure */
if ( !p ) {
printf("Cannot allocate memory for the second structure\n");
free( student_list );
return EXIT_FAILURE;
}
student_list->p_next = p;


/* Second structure */
strcpy( p->first_name, "David" );
strcpy( p->last_name, "Moon" );
p->age = 44;
p->p_next = NULL; /* tail of the list */


/* Display linked list student_list */
for (q = student_list; q != NULL; q = q->p_next ) {
printf("First Name: %s\n", q->first_name);
printf("Last Name: %s\n", q->last_name);
printf("Age: %d\n\n", q->age);
}

return EXIT_SUCCESS;
}
$ gcc -o struct_hl_ds1 -std=c99 -pedantic struct_hl_ds1.c
$ ./struct_hl_ds1
First Name: Christine
Last Name: Sun
Age: 35

First Name: David
Last Name: Moon
Age: 44

A linked list is attractive because memory is allocated one block at a time, only when a new
structure is required. The linked list can be grown easily: you just allocate a new
memory block, copy information into it, and set the p_next pointer of the previous structure
to the address of the newly allocated structure. You can also easily remove a structure: the
p_next pointer of the previous structure is set to the p_next pointer of the structure you
want to remove, which can then be freed.

VI.3.8.2 Trees
Programmers also resort to trees to organize their data. A tree resembles a linked list in
which each structure holds several pointers to other structures. The simplest tree is a
binary tree: each of its structures holds data and two pointers as depicted in Figure VI-2.

Figure VI-2 Tree data structure


An element of a tree is called a node. The top node of the tree is known as the root node or
root. A node is called a parent if it references one or more nodes, called its children. Nodes
that have no children are called leaves. In Figure VI-2, the node a is the root and the parent
of the children b and c. Nodes d, e, f, and g are leaves.

Here is an example of a tree data structure:

$ cat struct13.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct myTree myTree;
myTree *p_left, *root_tree, *p_right;

struct myTree {
char c;
myTree *p_left;
myTree *p_right;
};

root_tree = malloc( sizeof *root_tree );
root_tree->c = 'a';

p_left = malloc( sizeof *p_left );
p_left->c = 'b';
root_tree->p_left = p_left;
p_left->p_left = p_left->p_right = NULL;

p_right = malloc( sizeof *p_right );
p_right->c = 'c';
root_tree->p_right = p_right;
p_right->p_left = p_right->p_right = NULL;

return EXIT_SUCCESS;
}

In the example above, to keep the program easy to read, we did not check that the pointers
returned by malloc() were valid, nor did we free the allocated memory. Of course, in your
own programs, do both.

VI.3.9 Structures and operators


Few C operators can be applied to a whole structure: the simple assignment operator =, the
address operator &, the sizeof operator, and the member-access operators (. and ->). In
particular, two structures cannot be compared with == or !=. Here is an example:
$ cat struct_op1.c
#include <stdio.h>
#include <stdlib.h>

#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char first_name[ NAME_MAX_LEN ];
char last_name [NAME_MAX_LEN ];
int age;
};

student st1 = { "Christine", "Sun", 35 };
student st2 = st1;

printf("First Name: %s\n", st2.first_name);
printf("Last Name: %s\n", st2.last_name);
printf("Age: %d\n", st2.age);


return EXIT_SUCCESS;
}
$ gcc -o struct_op1 -std=c99 -pedantic struct_op1.c
$ ./struct_op1
First Name: Christine
Last Name: Sun
Age: 35

The assignment operation copies the value of each member of the structure on the right
side of the equal sign to the corresponding member of the structure on the left side. In
the example struct_op1.c, the declaration student st2 = st1 initializes st2 from st1, which
has the same effect as the assignment st2 = st1: the value of each member of st1 is copied
into the corresponding member of st2. Thus, the items of the array first_name
of the structure st1 are copied into the array first_name of structure st2. Likewise, the
elements of the array last_name in the structure st1 are copied into the array last_name in
structure st2. Finally, the value of the member age in the structure st1 is copied into the
member age in structure st2.

The example is interesting because it shows that if a member is an array, all of its items
are copied. Such a copy is called a deep copy. This holds true whatever the type of the
members, unless it is a pointer. If a member is a pointer, only the address of the
referenced object (held in the pointer) is copied: the pointed-to object itself is not copied.
Such a copy is known as a shallow copy. This implies that if you assign an object of
structure type to another object of structure type, members that are pointers point to the
same objects in both structures!

Consequently, you have to watch out for assignments of structures if some members
are pointers. Let us show it through a simple example. Can you see why the following
example is not correct?
$ cat struct_op2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1, st2;

st1.first_name = malloc( NAME_MAX_LEN );
st1.last_name = malloc( NAME_MAX_LEN );
strcpy( st1.first_name, "Christine" );
strcpy( st1.last_name, "Sun" );
st1.age = 35;

st2 = st1;
strcpy( st2.first_name, "David" );
strcpy( st2.last_name, "Moon" );
st2.age = 45;

printf("First Name: %s\n", st1.first_name);
printf("Last Name: %s\n", st1.last_name);
printf("Age: %d\n\n", st1.age);

printf("First Name: %s\n", st2.first_name);
printf("Last Name: %s\n", st2.last_name);
printf("Age: %d\n", st2.age);

return EXIT_SUCCESS;
}
$ gcc -o struct_op2 -std=c99 -pedantic struct_op2.c
$ ./struct_op2
First Name: David
Last Name: Moon
Age: 35

First Name: David
Last Name: Moon
Age: 45

The assignment st2 = st1 copies the value of each member of st1 into the corresponding
member of st2. This implies it also copies the pointers: the pointers of st2 point to the
same objects as the pointers of st1. In our example, the members first_name of the structures
st1 and st2 point to the same memory block (the same holds for the member last_name), so
writing David and Moon through st2 also changes what st1 displays. The
following example shows the pointers are copied but not the objects they reference:
$ cat struct_op3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1, st2;
st1.first_name = malloc( NAME_MAX_LEN );
st1.last_name = malloc( NAME_MAX_LEN );

st2 = st1;


printf("address first_name: st1=%p and st2=%p\n", (void *)st1.first_name, (void *)st2.first_name);
printf("address last_name: st1=%p and st2=%p\n", (void *)st1.last_name, (void *)st2.last_name);

return EXIT_SUCCESS;
}
$ gcc -o struct_op3 -std=c99 -pedantic struct_op3.c
$ ./struct_op3
address first_name: st1=8061040 and st2=8061040
address last_name: st1=8061068 and st2=8061068

In summary, each structure must have its own memory allocated for the members that are
pointers, as in the example below:
$ cat struct_op4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

int main(void) {
typedef struct student student;

struct student {
char *first_name;
char *last_name;
int age;
};

student st1, st2;
st1.first_name = malloc( NAME_MAX_LEN );
st1.last_name = malloc( NAME_MAX_LEN );
strcpy( st1.first_name, "Christine" );
strcpy( st1.last_name, "Sun" );
st1.age = 35;

st2.first_name = malloc( NAME_MAX_LEN );
st2.last_name = malloc( NAME_MAX_LEN );
strcpy( st2.first_name, "David" );
strcpy( st2.last_name, "Moon" );
st2.age = 45;

printf("First Name: %s\n", st1.first_name);
printf("Last Name: %s\n", st1.last_name);
printf("Age: %d\n\n", st1.age);

printf("First Name: %s\n", st2.first_name);
printf("Last Name: %s\n", st2.last_name);
printf("Age: %d\n", st2.age);

return EXIT_SUCCESS;
}
$ gcc -o struct_op4 -std=c99 -pedantic struct_op4.c
$ ./struct_op4
First Name: Christine
Last Name: Sun
Age: 35

First Name: David
Last Name: Moon
Age: 45

VI.3.10 Flexible array member


Normally within a structure, the size of arrays must be known at declaration time.
However, as of the C99 standard, you are allowed to use an array with no specified size
(incomplete array type) if it is the last member of a structure that has at least one other
named member: the array is known as a flexible array member. Take note that the flexible
array member is ignored when the size of the structure is computed, as shown below:
$ cat struct_flexible_am1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct myArray {
int len;
int s[];
};

printf("Sizeof(int)=%zu and sizeof(struct myArray)=%zu\n", sizeof(int), sizeof(struct myArray));
return EXIT_SUCCESS;
}
$ gcc -o struct_flexible_am1 -std=c99 -pedantic struct_flexible_am1.c

$ ./struct_flexible_am1
Sizeof(int)=4 and sizeof(struct myArray)=4

On our computer, an int is represented by 4 bytes and, as you can see, the structure
myArray also occupies 4 bytes: the last member is not counted. This does not mean we
cannot work with the member s. In order to use it, we first have to allocate memory for it.
How could we do that? By allocating the structure dynamically and accessing it through a
pointer, as shown below:
$ cat struct_flexible_am2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int array_len = 10;
int i;
struct myArray {
int len;
int s[];
};

typedef struct myArray array;

/* allocate memory */
array *int_array = malloc( sizeof(*int_array) + array_len * sizeof(int) );
if ( int_array == NULL ) {
printf("Cannot allocate memory\n");
return EXIT_FAILURE;
}

int_array->len = array_len;

/* initialize array s */
for (i = 0; i < int_array->len; i++)
int_array->s[i] = i;

/* displaying the array s */
for (i = 0; i < int_array->len; i++)
printf("int_array->s[%d]=%d\n", i, int_array->s[i] );

return EXIT_SUCCESS;
}
$ gcc -o struct_flexible_am2 -std=c99 -pedantic struct_flexible_am2.c
$ ./struct_flexible_am2

int_array->s[0]=0
int_array->s[1]=1
int_array->s[2]=2
int_array->s[3]=3
int_array->s[4]=4
int_array->s[5]=5
int_array->s[6]=6
int_array->s[7]=7
int_array->s[8]=8
int_array->s[9]=9

One question arises: if the flexible array member is ignored, as said earlier, is it also
ignored when a structure containing such a member is assigned? Yes: the assignment is only
partial, as sketched in the following example:
$ cat struct_flexible_am3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int array_len = 10;
int i;
struct myArray {
int len;
int s[];
};

typedef struct myArray array;

/* allocate memory */
array *int_array1, *int_array2;

int_array1 = malloc( sizeof(*int_array1) + array_len * sizeof(int) );
if ( int_array1 == NULL ) {
printf("Cannot allocate memory\n");
return EXIT_FAILURE;
}

int_array1->len = array_len;

/* initialize array s in array1*/
for (i = 0; i < int_array1->len; i++)
int_array1->s[i] = i;


int_array2 = malloc( sizeof(*int_array2) + array_len * sizeof(int) );
if ( int_array2 == NULL ) {
printf("Cannot allocate memory\n");
return EXIT_FAILURE;
}

/* The flexible array member is ignored by the following assignment */
*int_array2 = *int_array1;

printf("int_array2->len=%d\n", int_array2->len); /* member len has been copied */

/* but array s was not copied at all since ignored */
/* attempt to display the array s in array2 */
for (i = 0; i < int_array2->len; i++)
printf("int_array2->s[%d]=%d\n", i, int_array2->s[i] );

return EXIT_SUCCESS;
}
$ gcc -o struct_flexible_am3 -std=c99 -pedantic struct_flexible_am3.c
$ ./struct_flexible_am3
int_array2->len=10
int_array2->s[0]=0
int_array2->s[1]=0
int_array2->s[2]=0
int_array2->s[3]=0
int_array2->s[4]=0
int_array2->s[5]=0
int_array2->s[6]=0
int_array2->s[7]=0
int_array2->s[8]=0
int_array2->s[9]=0

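The elements of int_array2->s were never assigned: their values are indeterminate, and the
zeros displayed above are only what this run happened to find in the allocated block.
Therefore, to perform a full copy of a structure with a flexible array member, we have to
invoke the memcpy() function: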
$ cat struct_flexible_am4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
int array_len = 10;

int i;
struct myArray {
int len;
int s[];
};

typedef struct myArray array;

/* allocate memory */
array *int_array1, *int_array2;

int_array1 = malloc( sizeof(*int_array1) + array_len * sizeof(int) );
int_array2 = malloc( sizeof(*int_array2) + array_len * sizeof(int) );

if ( ! int_array1|| ! int_array2 ) {
printf("Cannot allocate memory\n");
return EXIT_FAILURE;
}

int_array1->len = array_len;

/* initialize array s in array1*/
for (i = 0; i < int_array1->len; i++)
int_array1->s[i] = i;

/* copy of structure int_array1 into int_array2 */
memcpy(int_array2, int_array1,
sizeof(*int_array1) + int_array1->len * sizeof(int));

printf("int_array2->len=%d\n", int_array2->len);
for (i = 0; i < int_array2->len; i++)
printf("int_array2->s[%d]=%d\n", i, int_array2->s[i] );

return EXIT_SUCCESS;
}
$ gcc -o struct_flexible_am4 -std=c99 -pedantic struct_flexible_am4.c
$ ./struct_flexible_am4
int_array2->len=10
int_array2->s[0]=0
int_array2->s[1]=1
int_array2->s[2]=2
int_array2->s[3]=3
int_array2->s[4]=4
int_array2->s[5]=5
int_array2->s[6]=6
int_array2->s[7]=7
int_array2->s[8]=8
int_array2->s[9]=9

The program worked! We used the memcpy() function, declared in <string.h>, which is similar
to strcpy(). While the function strcpy() only copies strings (terminated by '\0'), memcpy()
copies any object byte by byte. It has the following prototype:
Until C95:
void *memcpy(void *dest, const void *src, size_t n);

As of C99:
void *memcpy(void *restrict dest, const void *restrict src, size_t n);

The memcpy() function copies the memory block pointed to by src into the memory block
pointed to by dest; the two blocks must not overlap. The number of bytes to be copied is
specified in the last parameter n. In our example struct_flexible_am4.c, the last argument of
memcpy() was the size in bytes of the structure pointed to by int_array1, flexible array
member included.

In summary, if you use a structure with a flexible array member:
o Work with a pointer to it
o Do not forget to allocate memory for the flexible array member.
o Call the function memcpy() to copy structures. Do not use assignments because the
flexible array member is ignored.

VI.4 Unions
VI.4.1 Declarations
VI.4.1.1 Complete type
A union is a user-defined type denoting a value that can take one of several types. A
union is declared in the same way as a structure except that the keyword union replaces
the keyword struct. A union is declared as follows:
union union_tag {
obj_type1 obj1;
obj_type2 obj2;

obj_typeN objN;
};

Where:
o union_tag, called a tag, is the identifier of the union, composed of letters, digits and
underscores and starting with an underscore or a letter. The new type union union_tag can
then be used to declare variables.
o obj_type1, obj_type2, ..., obj_typeN are the types of the members obj1, obj2, ..., objN.

The members can be of any type with the exception of variably modified types. A
declaration of a union specifying its members is called a definition: the type is said to be
complete since the compiler has enough information to compute its size.

Unions work in the same manner as structures, and the same rules apply to them. What is
the difference? In a structure, every member is given its own piece of memory while in a
union, a single memory block is shared amongst all of the members. Let us start with a
simple example:
$ cat union_decl1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};

printf("sizeof(int)=%zu\n", sizeof(int));
printf("sizeof(double)=%zu\n", sizeof(double));
printf("sizeof(union number)=%zu\n", sizeof(union number));

return EXIT_SUCCESS;
}
$ gcc -o union_decl1 -std=c99 -pedantic union_decl1.c
$ ./union_decl1
sizeof(int)=4
sizeof(double)=8
sizeof(union number)=8

As you can see, the union is as large as its largest member (the compiler may add padding,
so it can be larger, but never smaller). This is not surprising since it must be able to
hold the value of any of its members.

You have three methods to declare an object of union type:


o Method 1: after declaring the union type.
union union_tag obj;

For example:
$ cat union_decl2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};

union number uNb;

return EXIT_SUCCESS;
}

o Method 2: at the time of the declaration of the union type.


union union_tag {
obj_type1 obj1;
obj_type2 obj2;

obj_typeN objN;
} obj;

For example:
$ cat union_decl3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
} uNb;

return EXIT_SUCCESS;
}

o Method 3: by using an unnamed union:


union {
obj_type1 obj1;
obj_type2 obj2;
...
obj_typeN objN;
} obj;

For example:
$ cat union_decl4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union {
int iVal;
double fVal;
} uNb;

return EXIT_SUCCESS;
}

To avoid repeating the keyword union when referring to a union type, programmers
generally use typedef, which creates an alias for the union type, in one of the following
ways:
typedef union union_tag {
obj_type1 obj1;
obj_type2 obj2;
...
obj_typeN objN;
} union_typename;

Or
typedef union union_tag union_typename;

Or
typedef union {
obj_type1 obj1;
obj_type2 obj2;
...
obj_typeN objN;
} union_typename;

Where:
o union_tag is the identifier (tag) of the union
o union_typename is an alias for the type union union_tag.

For example:
$ cat union_decl5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef union number number;
union number {
int iVal;
double fVal;
};

number uNb;

return EXIT_SUCCESS;
}


VI.4.1.2 Incomplete union type
What we said about structures also applies to unions. You can declare a union without
providing its members, which causes the compiler to create an incomplete type. As for
structures, you cannot use it to declare a variable until you define it by specifying all its
members. An incomplete union type is created as follows:
union union_tag;

There is another way to create an incomplete union type. As for structures, if you declare
a pointer to a union type that has not been declared yet, the compiler creates the
incomplete union type. In the following example, the declaration of the pointer p also
declares the incomplete union type with the tag number:
union number *p;

VI.4.2 Initializing unions


Unions are initialized like structures. At declaration time, a union can be initialized as
follows:
union union_tag obj = {
.memx = valx
};

The following example declares and initializes the object uNb of union type:
$ cat union_init1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};
typedef union number number;

number uNb1 = {.iVal = 1003 };
number uNb2 = {.fVal = 407.61 };

printf("uNb.iVal=%d\n", uNb1.iVal);
printf("uNb.fVal=%f\n", uNb2.fVal);

return EXIT_SUCCESS;
}
$ gcc -o union_init1 -std=c99 -pedantic union_init1.c
$ ./union_init1
uNb.iVal=1003
uNb.fVal=407.610000

Take note that only a single member should be initialized, since all members share the same
storage. Once the object is declared, you cannot use this syntax to give it a new value.
The following example will not compile:
$ cat union_init2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};

typedef union number number;



number uNb1;

uNb1 = {.iVal = 1003 };

printf("uNb.iVal=%d\n", uNb1.iVal);

return EXIT_SUCCESS;
}
$ gcc -o union_init2 -std=c99 -pedantic union_init2.c
union_init2.c: In function 'main':
union_init2.c:13:10: error: expected expression before '{' token

After the declaration, to set values, you will have to access the members as explained in
the next section.

VI.4.3 Accessing union members


Members of a union are accessed in the same way as members of a structure. The
member-access operator denoted by . (dot) allows you to access a member of a union or a
structure. If union_obj is an object of union type, union_obj.obj_mb1 represents the member
obj_mb1. Here is an example:
$ cat union_access1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};
typedef union number number;

number uNb;
uNb.iVal = 1003;
printf("uNb.iVal=%d\n", uNb.iVal);

uNb.fVal = 407.61;
printf("uNb.fVal=%f\n", uNb.fVal);

return EXIT_SUCCESS;

}
$ gcc -o union_access1 -std=c99 -pedantic union_access1.c
$ ./union_access1
uNb.iVal=1003
uNb.fVal=407.610000

Remember that a single memory block is shared amongst the members. This implies that, at
any given time, only one member is meaningful! Try this:
$ cat union_access2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;
double fVal;
};
typedef union number number;

number uNb;
uNb.fVal = 407.61;
printf("uNb.iVal=%d\n", uNb.iVal);

return EXIT_SUCCESS;
}
$ gcc -o union_access2 -std=c99 -pedantic union_access2.c
$ ./union_access2
uNb.iVal=-1889785610

We set the member fVal and then read the member iVal. As expected, we retrieved a value
with no meaning: some of the bytes of the double were simply interpreted as an int.

The following example shows that the members of a union share the same memory block. We
declare uNb as a union and we display the addresses of its members:
$ cat union_access3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union number {
int iVal;

double fVal;
};

union number uNb;
printf("&iVal=%p\n", (void *)&uNb.iVal);
printf("&fVal=%p\n", (void *)&uNb.fVal);

return EXIT_SUCCESS;
}
$ gcc -o union_access3 -std=c99 -pedantic union_access3.c
$ ./union_access3
&iVal=feffea98
&fVal=feffea98

Compare with a structure:


$ cat union_access4.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct number {
int iVal;
double fVal;
};

struct number uNb;
printf("&iVal=%p\n", (void *)&uNb.iVal);
printf("&fVal=%p\n", (void *)&uNb.fVal);

return EXIT_SUCCESS;
}
$ gcc -o union_access4 -std=c99 -pedantic union_access4.c
$ ./union_access4
&iVal=feffea94
&fVal=feffea98

The examples show that, in a union, the members share the same memory area while, in a
structure, each member has its own piece of memory.

If programmers must know which member of a union they have to access, how can they tell
which one holds the right value? By embedding the union within a structure. In the
structure, programmers can add a member of integer type (or of enumerated type) that
indicates the type of the current value.



Suppose you wanted to create a new type that would denote positive integer numbers that
can be represented either by the type unsigned int or by a string storing their binary
representation. Here is a piece of code implementing it (the size of the array is a
constant expression, so no VLA is involved):
$ cat union_access5.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum type_number { INTEGER, BINARY, VOID };
typedef enum type_number type_number;

struct number {
type_number type;
union {
unsigned int iVal;
char bVal[33]; /* up to 32 binary digits + '\0' */
} uVal;
};

typedef struct number number;

number nb;

nb.type = INTEGER;
nb.uVal.iVal = 1003;

return EXIT_SUCCESS;
}

In example union_access5.c, we embedded a union within a structure. In the structure
number, the member type, which has an enumeration type, tells which member of the union
holds the current value. If type holds the value INTEGER, we retrieve the value from the
member iVal. If it holds the value BINARY, we retrieve the value from the member bVal. If
it holds the value VOID, the union contains nothing meaningful.

The following example completes the previous example. The user passes a number along
with its type:
$ cat union_access6.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
    enum type_number { INTEGER, BINARY, VOID };
    typedef enum type_number type_number;

    struct number {
        type_number type;
        union {
            unsigned int iVal;
            char bVal[33]; /* up to 32 binary digits + '\0' */
        } uVal;
    };

    typedef struct number number;

    number nb;

    /* expect 2 arguments */
    if (argc != 3) {
        printf("USAGE: %s type number\n", argv[0]);
        printf("where\n\n");
        printf("- type is INTEGER or BINARY\n");
        printf("- number is an integer number\n");

        return EXIT_FAILURE;
    }

    if (strcmp(argv[1], "INTEGER") == 0) {
        nb.type = INTEGER;
        nb.uVal.iVal = (unsigned int)strtoul(argv[2], NULL, 10);
    } else if (strcmp(argv[1], "BINARY") == 0) {
        nb.type = BINARY;
        /* copy at most 32 characters and always terminate the string */
        strncpy(nb.uVal.bVal, argv[2], sizeof nb.uVal.bVal - 1);
        nb.uVal.bVal[sizeof nb.uVal.bVal - 1] = '\0';
    } else {
        printf("Type %s unknown\n", argv[1]);
        return EXIT_FAILURE;
    }

    switch (nb.type) {
    case INTEGER:
        printf("iVal=%u\n", nb.uVal.iVal);
        break;
    case BINARY:
        printf("bVal=%s\n", nb.uVal.bVal);
        break;
    default:
        printf("Unknown type\n");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}
$ gcc -o union_access6 -std=c99 -pedantic union_access6.c
$ ./union_access6 BINARY 1010
bVal=1010
$ ./union_access6 INTEGER 123
iVal=123

VI.4.4 Nested unions


Nested unions are initialized and accessed like nested structures. The initialization and
the access of members of embedded unions follow the same principles as described in
section VI.3.6. Here is a simple example:
$ cat union_nested1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
enum type_number { INTEGER, FLOAT };
typedef enum type_number type_number;

struct number {
type_number type;
union {
unsigned int iVal;
float fVal;
} uVal;
};

typedef struct number number;


number nb1 = { /* init structure */
INTEGER,
{ /* init embedded union */
1003
}
};

number nb2 = {
.type=INTEGER,
.uVal={ .iVal=1003 }
};

number nb3 = {
.type=FLOAT,
{ .fVal=12.8 }
};

printf("%d %u\n", nb1.type, nb1.uVal.iVal);
printf("%d %u\n", nb2.type, nb2.uVal.iVal);
printf("%d %f\n", nb3.type, nb3.uVal.fVal);

return EXIT_SUCCESS;
}
$ gcc -o union_nested1 -std=c99 -pedantic union_nested1.c
$ ./union_nested1
0 1003
0 1003
1 12.800000

VI.4.5 Arrays and unions


Arrays can hold elements of union type, but in practice, since unions are usually embedded
in structures, you will most often meet arrays of structures or pointers to structures. For
example:
$ cat union_array1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {

enum type_number { INTEGER, BINARY, VOID };


typedef enum type_number type_number;

struct number {
type_number type;
union {
unsigned int iVal;
char bVal[ 32 ];
} uVal;
};
typedef struct number number;

int i;
int nb_elt = 32; /* number of elt in array number_list */

number number_list[ nb_elt ];

number_list[0].type = INTEGER;
number_list[0].uVal.iVal = 1003;

number_list[1].type = INTEGER;
number_list[1].uVal.iVal = 407;

number_list[2].type = BINARY;
strcpy(number_list[2].uVal.bVal, "10101");

number_list[3].type = VOID;

/* Display list of elements in array number_list */
for (i=0; i < nb_elt; i++ ) {
if ( number_list[i].type == VOID ) /* End of list */
break;

switch (number_list[i].type) {
case INTEGER:
printf("iVal=%u\n", number_list[i].uVal.iVal);
break;
case BINARY:
printf("bVal=%s\n", number_list[i].uVal.bVal);
break;
default:
printf("Unknown type\n");

return EXIT_FAILURE;
} /* End of Switch */
} /* End of for */

return EXIT_SUCCESS;
}
$ gcc -o union_array1 -std=c99 -pedantic union_array1.c
$ ./union_array1
iVal=1003
iVal=407
bVal=10101

VI.4.6 Pointer to unions


Unions can be used with pointers in the same way we did with structures. The following
example defines a pointer to a union:
$ cat union_pointer1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    typedef union number number;
    union number {
        int iVal;
        double fVal;
    };

    number *p_uNb = malloc( sizeof *p_uNb );
    if ( p_uNb == NULL ) /* allocation failed */
        return EXIT_FAILURE;

    (*p_uNb).iVal = 10;

    printf("iVal=%d\n", (*p_uNb).iVal);

    free(p_uNb);
    return EXIT_SUCCESS;
}
$ gcc -o union_pointer1 -std=c99 -pedantic union_pointer1.c
$ ./union_pointer1
iVal=10

The member-access operator -> we used to access members of structures pointed to by a
pointer is also used to access members of a union pointed to by a pointer. Thus,
(*p_uNb).iVal can be written p_uNb->iVal. The previous example is then equivalent to:

$ cat union_pointer2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    typedef union number number;
    union number {
        int iVal;
        double fVal;
    };

    number *p_uNb = malloc( sizeof *p_uNb );
    if ( p_uNb == NULL ) /* allocation failed */
        return EXIT_FAILURE;

    p_uNb->iVal = 10;

    printf("iVal=%d\n", p_uNb->iVal);

    free(p_uNb);
    return EXIT_SUCCESS;
}
$ gcc -o union_pointer2 -std=c99 -pedantic union_pointer2.c
$ ./union_pointer2
iVal=10

VI.4.7 Unions and operators


You cannot apply C operators to unions and structures, with the exception of the
assignment operator, the address operator &, the sizeof operator and the member-access
operators (. and ->). Here is an example:
$ cat union_op1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef union number number;
union number {
int iVal;
double fVal;
};

number uNb1, uNb2;
uNb1.iVal = 10; // access operator

uNb2 = uNb1; // assignment operator


printf("iVal=%d\n", uNb2.iVal);

return EXIT_SUCCESS;
}
$ gcc -o union_op1 -std=c99 -pedantic union_op1.c
$ ./union_op1
iVal=10

As we explained when we described structures, if a union contains pointers, you have to
make them point to valid storage (for example, by allocating memory) before using them;
otherwise, they are invalid.

VI.4.8 Incomplete union types and forward references


All that we said about incomplete structure types and forward references in section VI.3.7
holds true for unions.

VI.4.9 Bit-fields
We only take a brief look at bit-fields since they are used mostly by experienced C
programmers in very specific circumstances. Bit-fields allow programmers to specify the
number of bits of a member in a structure or union as shown below:
$ cat bitfields1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct my_time my_time;
struct my_time {
unsigned int h: 5; /* h in range [0-23] */
unsigned int m: 6; /* m in range [0-59] */
unsigned int s: 6; /* s in range [0-59] */
};

my_time t;
/* set time 10:20:18 */
t.h = 10;
t.m = 20;
t.s = 18;

printf("Time is %d:%d:%d\n", t.h, t.m, t.s);
return EXIT_SUCCESS;

}
$ gcc -o bitfields1 -std=c99 -pedantic bitfields1.c
$ ./bitfields1
Time is 10:20:18

In our example, the member h (meaning hour) can be represented by five bits since it is in
the range [0-23]. Five bits can represent a number in the range [0-31]. Likewise, the
members m and s (minutes and seconds) can be represented by six bits since they are in the
range [0-59]. Six bits can represent a number in the range [0-63].

In C99, a bit-field must be declared with the type _Bool, int, signed int or unsigned int (a
compiler may accept other types as an extension). You cannot take the address of a
bit-field, so a pointer cannot point to one. Bit-fields can be of great help in low-level
programming, but most programs never need them. The following example, which tries to
store the address of a bit-field in a pointer, fails to compile:
$ cat bitfields2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct my_time my_time;
struct my_time {
unsigned int h: 5; /* h in range [0-23] */
unsigned int m: 6; /* m in range [0-59] */
unsigned int s: 6; /* s in range [0-59] */
};

unsigned int *p;

my_time t;
/* set time 10:20:18 */
t.h = 10;
t.m = 20;
t.s = 18;

p = &(t.h);

return EXIT_SUCCESS;
}
$ gcc -o bitfields2 -std=c99 -pedantic bitfields2.c
bitfields2.c: In function 'main':
bitfields2.c:20:2: error: cannot take address of bit-field 'h'

The following example, which uses ordinary members instead of bit-fields, is correct:


$ cat bitfields3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct my_time my_time;
struct my_time {
unsigned int h; /* h in range [0-23] */
unsigned int m; /* m in range [0-59] */
unsigned int s; /* s in range [0-59] */
};

unsigned int *p;

my_time t;
/* set time 10:20:18 */
t.h = 10;
t.m = 20;
t.s = 18;

p = &(t.h);

return EXIT_SUCCESS;
}

VI.5 Alignments
VI.5.1 Structure alignment
The compiler aligns structures correctly, so you do not have to worry about it.
However, it is interesting to understand how a structure is aligned and how its members
are laid out. To ease our discussion, we assume computers use natural alignment: a value
is aligned according to its type. A structure is an aggregate type grouping a set of
objects, each of which has its own type, representation and storage. The members are
stored in the order they appear within the structure.

The first member starts at the address of the structure. The starting address may be subject
to alignment constraints depending on the computer. On computers having data alignment
constraints, the compiler aligns each member properly. Since the storage for each member
is allocated in order, padding bytes may be inserted within the structure to ensure a
correct alignment of each member. As an example, consider the following structure:
struct str {
char c;
int j;
};

The member c can be stored at any address while j will have to be stored at an address that
is a multiple of its size, say 4 bytes (see Figure VI-3). To meet this requirement, the
compiler adds unused bytes, called padding bytes, before the member j to ensure the right
alignment. This is shown by the following example (your computer may display different
values):
$ cat struct_align1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct str {
char c; // 1 byte
int j; // 4 bytes
}; // the sizeof of the structure may be naively computed as 5 bytes

printf( "sizeof(char)=%zu\n", sizeof(char) );
printf( "sizeof(int)=%zu\n", sizeof(int) );
printf( "sizeof(struct str)=%zu\n", sizeof(struct str) );

return EXIT_SUCCESS;
}
$ gcc -o struct_align1 -std=c99 -pedantic struct_align1.c
$ ./struct_align1
sizeof(char)=1
sizeof(int)=4
sizeof(struct str)=8

In the example above, the member j would not be correctly aligned without the padding
bytes inserted after c. We might think that if we swapped the members, padding bytes
would become useless:
struct str {
int j;
char c;
};

In this structure, the member j is properly aligned without any internal padding, yet the
size of the structure is still 8 on our computer, as shown by the following example:


$ cat struct_align2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct str {
int j; // 4 bytes
char c; // 1 byte
}; // the sizeof of the structure may be naively computed as 5 bytes

printf( "sizeof(char)=%zu\n", sizeof(char) );
printf( "sizeof(int)=%zu\n", sizeof(int) );
printf( "sizeof(struct str)=%zu\n", sizeof(struct str) );

return EXIT_SUCCESS;
}
$ gcc -o struct_align2 -std=c99 -pedantic struct_align2.c
$ ./struct_align2
sizeof(char)=1
sizeof(int)=4
sizeof(struct str)=8

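The compiler inserted three trailing padding bytes. Why? Suppose you declared an array
of two structures str:
struct str arr[2];

The elements of an array are stored contiguously. Without trailing padding, arr[1] would
start 5 bytes after arr[0], and arr[1].j would no longer lie on a 4-byte boundary. The
trailing padding bytes make the size of the structure a multiple of its alignment, so that
every element of the array is properly aligned.

Figure VI-3 Example of padding bytes inside structures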


In summary:
o The address of the first member of a structure is the address of the structure
o A structure has at least the alignment of its member with the strictest alignment.

It is interesting to note that the size of a structure depends on the order in which you
declare its members, as shown by the following example (on our computer, sizeof(int)=4,
sizeof(short)=2):
$ cat struct_align3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct struct1 {
char c1; //1 byte + 3 padding bytes
int j; // 4 bytes
short int c; // 2 bytes + 2 padding bytes
}; // Total=12 bytes

struct struct2 {
char c1; //1 byte + 1 padding byte
short int c; // 2 bytes
int j; // 4 bytes
}; // Total=8 bytes


printf( "sizeof(char)=%zu\n", sizeof(char) );
printf( "sizeof(short)=%zu\n", sizeof(short) );
printf( "sizeof(int)=%zu\n", sizeof(int) );

printf( "sizeof(struct struct1)=%zu\n", sizeof(struct struct1) );
printf( "sizeof(struct struct2)=%zu\n", sizeof(struct struct2) );

return EXIT_SUCCESS;
}
$ gcc -o struct_align3 -std=c99 -pedantic struct_align3.c
$ ./struct_align3
sizeof(char)=1
sizeof(short)=2
sizeof(int)=4
sizeof(struct struct1)=12
sizeof(struct struct2)=8

If you want the padding to be visible and under your control rather than inserted silently
by the compiler, you can declare your own padding members. Of course, such a program is
not portable: it depends on the processor architecture on which you intend to run it. For
example, struct1 and struct2 could be written as follows (not portable):
struct struct1 {
char c1; //1 byte
char padd1[3]; // 3 bytes
int j; // 4 bytes
short int c; // 2 bytes
char padd2[2]; // 2 bytes
}; // Total=12 bytes


struct struct2 {
char c1; //1 byte
char padd1[1]; // 1 byte
short int c; // 2 bytes
int j; // 4 bytes
}; // Total=8 bytes

The size of a structure is the sum of the sizes of its members plus the padding bytes. If you
wish to write portable programs, let the compiler manage the padding bytes and never
assume a particular layout.

VI.5.2 Union alignment


A union is different from a structure in that a single storage block is allocated for all
members. This implies a union has at least the alignment of the member having the strictest
alignment constraint and its size is at least the size of the largest member type. Trailing
bytes may be used for padding to meet the alignment requirements.

Figure VI-4 Example of padding bytes in unions


Consider the following union:
union u {
int i;
char s[5]; // 5 bytes
};

What could be the size of such a union? According to the C standard, it must be large
enough to hold the largest member. Since sizeof(int)=4 on our computer, the largest member
is the array s, and the union must be at least five bytes long. The compiler may compute a
larger size because of alignment restrictions. For example, if the type int is 4 bytes wide
and the computer requires it to be aligned on 4-byte boundaries, the compiler can add
three trailing padding bytes so that the union is aligned on 4-byte boundaries (the
member i has the strictest alignment constraint). The union u then has a size of eight bytes
(see Figure VI-4). On our computer, we get this:
$ cat union_align.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
union u {
int i;
char s[5]; // 5 bytes
};

printf( "sizeof(int)=%zu\n", sizeof(int) );
printf( "sizeof(union u)=%zu\n", sizeof(union u) );

return EXIT_SUCCESS;
}
$ gcc -o union_align -std=c99 -pedantic union_align.c
$ ./union_align
sizeof(int)=4
sizeof(union u)=8

Normally, you do not have to worry about the padding bytes within unions if you wish to
write portable programs. It is better to let the compiler deal with the padding bytes.

VI.6 Compatible types


The following sections are incomplete. We will complete them after describing the scopes
of identifiers in Chapter VII Section VII.6.

Remember that two compatible types have the same representation and alignment. No conversion is
performed between compatible types.

VI.6.1 Structure and union compatible types

Within a program consisting of a single source file, two distinct structure or union types
are incompatible even if they have the same members declared in the same order. In the
following example, the structure types struct1 and struct2 are not compatible:
$ cat struct_compatible_types1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct struct1 { int k; };
struct struct2 { int k; };

struct struct1 s1;
struct struct2 s2;

s1 = s2; // invalid. Incompatible types
return EXIT_SUCCESS;
}
$ gcc -o struct_compatible_types1 -std=c99 -pedantic struct_compatible_types1.c
struct_compatible_types1.c: In function 'main':
struct_compatible_types1.c:11:6: error: incompatible types when assigning to type 'struct struct1' from type 'struct
struct2'

The two unnamed structures (declared with no tag) in the following program are not
compatible either for the same reason:
$ cat struct_compatible_types2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct { int k; } s1;
struct { int k; } s2;

s1 = s2; // invalid. Incompatible types
return EXIT_SUCCESS;
}
$ gcc -o struct_compatible_types2 -std=c99 -pedantic struct_compatible_types2.c
struct_compatible_types2.c: In function 'main':
struct_compatible_types2.c:8:6: error: incompatible types when assigning to type 'struct <anonymous>' from type
'struct <anonymous>'

VI.6.2 Enumerated types

Within the same source file, two distinct enumeration types are incompatible with each
other. Each enumeration type is an integer type compatible with the integer type used to
represent it. That type can be char, a signed integer type or an unsigned integer type. The
compiler is free to choose it provided it can represent the values of all the enumeration
constants. The choice is implementation-defined, but it rarely matters since an
enumerated type is used as an integer type whose main purpose is to make programs more
readable.

Keep in mind enumeration constants are of type int but an enumeration type is an integer
type that may not be the type int.

Take note that, unlike structure and union types, enumerated types cannot be incomplete.

VI.7 Conversions
VI.7.1 Structures and unions
In C, there is no way to cast a value to a structure or a union type. Conversion rules for
structures and unions are those of the simple assignment operator =. An object of
structure or union type can be assigned a value having a compatible type. Qualifiers do not
matter.
#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct struct1 { int k; } struct1;
typedef struct struct2 { int k; } struct2;

struct1 s1;
struct2 s2;
const struct1 cs1 = s1; // OK

s1 = s2; // invalid. Incompatible types
s1 = cs1; // OK.
return EXIT_SUCCESS;
}

VI.7.2 Enumerated types


Since enumerated types are integer types and enumerated constants are of type int,
conversion rules for arithmetic types apply to enumerated types and enumerated constants
(see Chapter II Section II.11 and Chapter III Section III.14). You can work with
enumerated types and enumerated constants as with integers. An object of enumerated
type can be used as an integer type in expressions. It is unlikely you need to do that, and
you should avoid doing it, but nothing prevents someone from assigning a value of
enumeration type to a variable of another enumeration type since both are arithmetic
types. This denotes a poor programming style:
enum shape { CIRCLE=0, RECTANGLE=4, TRIANGLE=3 } s1, s2;
enum myBool { FALSE=0, TRUE=1 } b1, b2;

b1 = TRUE;
s1 = b1;
s2 = FALSE;
b2 = TRIANGLE;

Take note that enumerated constants are of type int while enumerated types can be
represented by char, a signed integer or an unsigned integer. The compiler is free to choose
how an enumerated type is actually represented. This implies assigning an integer to a
variable of enumerated type may lead to a behavior that you do not expect. Suppose you
declare an enumeration as follows:
enum myBool {FALSE=0, TRUE=1};

The compiler might choose to represent such an enumeration as char. If you assign an
integer value that cannot be represented by char, you will not get the expected result:
enum myBool s = 12345;

If you wish to write a portable program, the integer value you assign should range from
0 to SCHAR_MAX or from the minimum enumeration constant to the maximum enumeration
constant. However, it is better to assign to a variable of enumerated type only one of the
enumeration constants of that enumeration or the value of a variable of the same type.

Take note that the compiler may choose different integer types to represent different
enumeration types. The C standard permits the compiler to choose the right integer type
(char, signed integer or unsigned integer) for each enumeration type independently from
each other. However, generally, enumeration types are represented by int.

VI.8 Exercises
Exercise 1. Correct the following code:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
typedef struct student student;

struct student {
char first_name[64];
char last_name[64];
int age;
};

student st1;

st1.first_name = "Christine";
st1.last_name = "Sun";
st1.age = 35;

printf("First Name: %s\n", st1.first_name);
printf("Last Name: %s\n", st1.last_name);
printf("Age: %d\n", st1.age);

return EXIT_SUCCESS;
}


Exercise 2. Explain why the first program is wrong while the second one is correct:
$ cat exercise2_1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_ARRAY_LEN 10

struct array_int {
int *a;
size_t nb_elt;
size_t len;
};

int main(void) {

struct array_int a1, a2;



a1.a = calloc(DEFAULT_ARRAY_LEN, sizeof *a1.a);
a2.a = calloc(DEFAULT_ARRAY_LEN, sizeof *a2.a);

printf("a1.a=%p a2.a=%p\n", (void *)a1.a, (void *)a2.a);

a1.a[0] = 1;
a1.a[1] = 2;
a1.len=DEFAULT_ARRAY_LEN;
a1.nb_elt = 2;

memcpy(&a2, &a1, sizeof a1);

printf("a2.a[0]=%d a2.a[1]=%d a2.len=%zu a2.nb_elt=%zu\n",
a2.a[0], a2.a[1], a2.len, a2.nb_elt );
printf("a1.a=%p a2.a=%p\n", (void *)a1.a, (void *)a2.a);

return EXIT_SUCCESS;
}



$ cat exercise2_2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_ARRAY_LEN 10

struct array_int {
int a[20];
size_t nb_elt;
size_t len;
};

int main(void) {
struct array_int a1, a2;

printf("a1.a=%p a2.a=%p\n", (void *)a1.a, (void *)a2.a);
a1.a[0] = 1;

a1.a[1] = 2;
a1.len=DEFAULT_ARRAY_LEN;
a1.nb_elt = 2;

memcpy(&a2, &a1, sizeof a1);

printf("a2.a[0]=%d a2.a[1]=%d\n a2.len=%zu a2.nb_elt=%zu\n",
a2.a[0], a2.a[1], a2.len, a2.nb_elt );

printf("a1.a=%p a2.a=%p\n", (void *)a1.a, (void *)a2.a);

return EXIT_SUCCESS;
}


Exercise 2. Write a program implementing a stack data structure onto which we push the
numbers from 1 to 10 and from which those numbers are then popped and printed in
reverse order.

Exercise 3. Write a program implementing a generic array in which we put the number
3.14 of type float, the number of type int, and the character A of type char.

Exercise 4. Write a program that prompts the user to provide 3 values and their type
(allowed types float, int and char) and stores them. Then, once the user has typed the string
quit, the program displays the values with their type.

Exercise 5. Write a program that prompts the user to type any number of values and their
type (allowed types float, int and char) and stores them. Then, once the user has typed the
string quit, the program displays the values with their type.

Exercise 6. Write a program that shows the alignment of types int, long, and double.
Exercise 7. Using a union, write a program that displays the internal representation of the
number 5 of type int.

Exercise 8. Consider the following structure
struct my_string{
int len;
char s[];
};


o What is the size of the structure?
o Write a piece of code that stores the string Hello! into str1, an object of type my_string.
o Write a piece of code that copies the object str1 into another object of type my_string
called str2.

Exercise 9. Explain why the following program is not correct:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct rate {
float f;
};

struct currency {
float f;
};

struct rate r = { 1.2} ;
struct currency c;

c = r;
return EXIT_SUCCESS;
}


Exercise 10. Write a piece of code implementing a data structure that would store a list of
strings. The number of strings is not known before run time.



CHAPTER VII FUNCTIONS


VII.1 Introduction
Amongst good programming practices, readability and maintainability are some of the most
important for programmers. Could you imagine debugging your own program of thousands
of lines, all embedded in the main() function, months after writing it? Imagine the time
spent testing it fully...

For this reason, programmers split their code into several subprograms called functions in
the C language (also known as routines or subroutines in computing science), each
performing a specific task. The underlying idea is to have several independent pieces of
code that can be tested and debugged separately. As long as a routine produces the same
effect, the way it performs it does not matter. For example, you can even change
completely an algorithm within a routine without having any impact on your program
provided its output and input remain the same.

In addition to easing maintenance and improving readability, functions can be reused as
many times as you wish. For example, you could write a function that calculates the
average value of a list of numbers. Instead of writing the same piece of code several
times, you will just have to invoke the function with the list of numbers as arguments, and
it will return the average value. This will save you a great deal of time and avoid
introducing errors.

Before programmers start writing a program, they first think about the way they will split
it. In the same way as a book is broken into chapters and sections, a program is divided
into one or more parts known as modules, and modules are split into functions. Modules
will be described in the next chapter: they can be compared to the chapters of a book.
Functions can be compared to sections.

A function is a set of statements, identified by a name, that performs a specific task. A
function identifier is composed of letters, digits and underscores, starting with a letter or
an underscore.

There are two kinds of functions: functions provided by C libraries and functions defined
by users. In this chapter, you will learn how to create and use your own functions.

In this chapter, we will also go into details about declarations, definitions, variable
scopes, storage durations and initializations of identifiers. We will refine several
features of the C language we studied in previous chapters.

VII.2 Definition
Before a function can be called, it must be defined somewhere. Defining a function means
providing a declaration and the code corresponding to the tasks to perform. A function
cannot be defined within another function. Let us start with a simple example. In the
following example, the function add() adds two given numbers and returns the resulting
value:
double add(double a, double b) {
return a+b;
}

The definition of a function is composed of two parts:

o The declaration, which consists of:
  - The return type: at the leftmost side lies the return type, which is the type of the
    value that the function returns. In the example above, the return type is double.
  - The identifier of the function. In our example, the function is named add.
  - The parameters of the function. In our example, the parameters are a and b, of type
    double.

o The body of the function. It comprises a set of statements, between braces, defining the
tasks to perform.

More generally, a function is defined as follows (C standard style):
type_ret function_name(type1 arg1, type2 arg2, ..., typeN argN) {
statement1;
...
statementN;
}

A declaration of a function describes the types of its parameters and its return type. The
definition of a function consists in its declaration and its body.

If a function specifies a return type, it should return a value of that type with the return
statement. A function may have several return statements as in the following example:
int compare_string(char *s1, char *s2) {
    if ( s1 == NULL || s2 == NULL )
        return 0;

    if ( ! strcmp(s1, s2) ) { /* s1 and s2 hold the same string */
        return 1;
    } else { /* s1 and s2 hold different strings */
        return 0;
    }
}

The function compare_string() returns 1 if the given strings are the same and 0 otherwise.

A function that has no parameter is defined as follows:
type_ret function_name(void) {
statement1;
...
statementN;
}

The void parameter means the function takes no parameter as in the example below.
int print_starting_header(void) {
printf("=====================================\n");
printf("========STARTING OF PROGRAM==========\n");
printf("=====================================\n");

return 1;
}

A function that returns nothing, called a procedure in other programming languages, is
defined as follows:
void function_name(type1 arg1, type2 arg2, ..., typeN argN) {
statement1;
...
statementN;
}

The keyword void in place of the return type means the function returns nothing. Here is an
example:
void print_header(char *header) {
if ( ! header ) /* if pointer is NULL */
return;

printf("=====================================\n");
printf("========%s==========\n", header);
printf("=====================================\n");
}

When a function returns nothing, the return statement with no argument can be used to give
control back to the caller (that is, to return to the point where the function was called).

VII.3 Function calls


Though programmers often use the words argument and parameter interchangeably,
as we sometimes do, it is worth noting that these words do not have exactly the
same meaning according to the C standard. So far, we have not made a clear distinction; we
will now. A parameter (or formal parameter) is an object declared in the declaration
of the function, while an argument (or actual argument) is a value (or an expression)
passed to the function when it is called.

Figure VII-1 Function call


Let us consider our function add():
double add(double a, double b) {
return a+b;
}

The variables a and b are parameters of the function. When we call the function, we pass
real values as below:
x = add(5, 8);

Above, the values 5 and 8 are the arguments of the function. The parameter a takes the first
argument, of value 5, and the parameter b is assigned the second argument, of value 8.
The parameters work like any object declared within the function. The function performs its
expected tasks and returns to the caller with the value specified by the return statement (see
Figure VII-1). In summary, parameters are assigned the arguments passed to the function.

Arguments can be literals, variables and more generally expressions:
y = 9;
x = add(5*2, 8-y);

The expressions are evaluated before being passed to the function, but the order in which
they are evaluated is unspecified: do not rely on it.

Once a function has been defined, you can call it to perform the expected tasks as in the
following example:
$ cat function_call1.c
#include <stdio.h>
#include <stdlib.h>

/*
NAME: add()
DESCRIPTION: add two input numbers
PARAMETERS:
- double a
- double b
RETURN: the resulting value of the addition of the input numbers.
*/
double add(double a, double b) {
return a+b;
}

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}
$ gcc -o function_call1 -std=c99 -pedantic function_call1.c
$ ./function_call1

10.000000 + 2.100000 = 12.100000

In the example function_call1.c, the add() function is invoked with the arguments x and y: add(x,
y). Before executing the function, the variables x and y are first evaluated: they are replaced
by their value. Then, the function add() returns its value that is assigned to the z variable.

In the following example, we call the function compare_string() that takes two strings and
compares them. If they are identical, it returns 1. Otherwise, it returns 0.
$ cat function_call2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
NAME: compare_string()
DESCRIPTION: tells if two strings are identical or not
PARAMETERS:
- char *s1: input string
- char *s2: input string
RETURN: 0 if s1 and s2 are different and 1 otherwise.
*/
int compare_string(char *s1, char *s2) {
if ( s1 == NULL || s2 == NULL )
return 0;
if (! strcmp(s1, s2) ) { /* s1 and s2 hold the same string */
return 1;
} else { /* s1 and s2 hold different strings */
return 0;
}
}

int main(void) {
char *msg[] = {"different", "same"};
char s1[] = "OK";
char s2[] = "OK";
int cmp1 = compare_string(s1, s2);

char s3[] = "OK";
char s4[] = "KO";
int cmp2 = compare_string(s3, s4);

printf("%s and %s are %s\n", s1, s2, msg[ cmp1 ] );
printf("%s and %s are %s\n", s3, s4, msg[ cmp2 ] );

return EXIT_SUCCESS;
}
$ gcc -o function_call2 -std=c99 -pedantic function_call2.c
$ ./function_call2
OK and OK are same
OK and KO are different


In the following example, we call the functions print_header() and add():
$ cat function_call3.c
#include <stdio.h>
#include <stdlib.h>

/*
NAME: add()
DESCRIPTION: add two input numbers
PARAMETERS:
- double a
- double b
RETURN: the resulting value of the addition of the input numbers.
*/
double add(double const a, double const b) {
return a+b;
}

/*
NAME: print_header()
DESCRIPTION: display a banner containing the passed string
PARAMETERS:
- char *header
RETURN: None
*/
void print_header(char *header) {
if ( ! header ) /* if pointer is NULL */
return;

printf("======================================\n");
printf("========%s==========\n", header);
printf("======================================\n");
}

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

print_header("BEGINNING OF PROGRAM");
printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}
$ gcc -o function_call3 -std=c99 -pedantic function_call3.c
$ ./function_call3
======================================
========BEGINNING OF PROGRAM==========
======================================
10.000000 + 2.100000 = 12.100000

VII.4 Return statement, part1


The return statement leaves the function that contains it and returns to the caller. The return
statement takes an argument if the function returns a value. Below, the program
function_return1.c takes two strings as arguments and compares them using the function
compare_string():
$ cat function_return1.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
NAME: compare_string()
DESCRIPTION: tells if two strings are identical or not
PARAMETERS:
- char *s1: input string
- char *s2: input string
RETURN: 0 if s1 and s2 are different and 1 otherwise.
*/
int compare_string(char *s1, char *s2) {
if ( s1 == NULL || s2 == NULL )
return 0;


if ( ! strcmp(s1, s2) ) { /* s1 and s2 hold the same string */
return 1;
} else { /* s1 and s2 hold different strings */
return 0;
}
}

int main(int argc, char **argv) {
char *s1, *s2;

if ( argc != 3 ) {
printf("USAGE: %s string1 string2\n", argv[0]);
return EXIT_FAILURE;
}

s1 = argv[1];
s2 = argv[2];

switch ( compare_string(s1, s2) ) {
case 0:
printf("%s != %s\n", s1, s2 );
break;
case 1:
printf("%s = %s\n", s1, s2 );
}

return EXIT_SUCCESS;
}
$ gcc -o function_return1 -std=c99 -pedantic function_return1.c
$ ./function_return1 HELLO hello
HELLO != hello
$ ./function_return1 hello hello
hello = hello

Within the function compare_string(), we used the return statement three times, with an
argument that depends on the case.

In some cases, the return statement takes no argument. This occurs when the function
returns nothing (void) and you want control to return to the caller before reaching the end
of the function: in the example below, the function print_header() invokes return with no
value if the passed argument is a null pointer.


void print_header(char *header) {
if ( ! header ) /* if pointer is NULL */
return;

printf("=====================================\n");
printf("========%s==========\n", header);
printf("=====================================\n");
}

If a function is declared as returning void, you need not use the return statement at all: when
the end of the function body is reached (marked by the right brace }), control
automatically returns to the caller. In the example above, if the parameter header is not a
null pointer, a banner is printed, the function terminates (with no return statement) and
control is given back to the caller as if a return statement had been executed.

If the argument of the return statement is an expression, it is evaluated before the resulting
value is finally returned. In the following example, the expression a % 2 == 0 is evaluated to a
value (1 if a is even, 0 otherwise) that is then returned.
int is_even(int a) {
return a % 2 == 0;
}

A return statement can return values of arithmetic types, pointers, structures, unions, and
enumerations, but it cannot return an array. The following example duplicates a passed
string and returns a pointer to the allocated memory chunk holding the duplicated string:
$ cat function_return2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
NAME: duplicate_string()
DESCRIPTION: allocate memory and copy the passed string into it
PARAMETERS:
- char *s: input string to duplicate
RETURN: the pointer to the memory block holding a copy of the passed string
*/
char *duplicate_string(char *s) {
char *duplicate_s;
size_t len; /* strlen() returns a value of type size_t */


if (s == NULL)
return NULL;

len = strlen ( s );
duplicate_s = malloc (len + 1);

if ( duplicate_s != NULL )
strcpy( duplicate_s, s);

return duplicate_s;
}

int main(void) {
char *s = "Duplicate String";
char *dup_s = duplicate_string( s );

if ( dup_s != NULL )
printf("dup_s=%s\n", dup_s);
else
printf("dup_s=NULL\n");

free(dup_s);
return EXIT_SUCCESS;
}
$ gcc -o function_return2 -std=c99 -pedantic function_return2.c
$ ./function_return2
dup_s=Duplicate String

Of course, since malloc() has been invoked, the free() function must be called at some point to release
the memory allocated by the function duplicate_string().

What happens if we return a value that has a type different from the return type? The
return value is just implicitly converted to the return type as it would be done in a simple
assignment operation.
$ cat function_return3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int ret_int(double a) {

return a;
}

int main(void) {
double val = 3.14159;
printf("return value=%d\n", ret_int(val) );
}
$ gcc -o function_return3 -std=c99 -pedantic function_return3.c
$ ./function_return3
return value=3

VII.5 Function declarations


You may ask yourself what the use of a declaration could be. Before answering the
question, we first need to define three terms: declaration, prototype, and definition.

As of C99, a function must be declared before it is called, through either a simple
declaration or a definition. A declaration is a way to specify the type bound to a given name. For example,
int x tells the compiler we will use the name x as a variable of type int. Similarly, declaring
a function tells the compiler that a given name identifies a function:
int is_even(int a) tells the compiler that the name is_even is bound to a function.

In standard C, when a declaration is part of a definition, the names of the parameters and
their types must be specified:
double add(double a, double b) {
return a + b;
}

In standard C, if a function declaration is not part of a definition, declaring the types of the
parameters is sufficient (the names of the parameters are optional in this case). The
following simple declarations are allowed and equivalent:
double add(double a, double b);
double add(double, double);

In the K&R style, the old C style, still permitted by the C99 and C11 standards though obsolescent, you
can declare a function without specifying the types of its parameters (i.e. its type signature).
In K&R style, when a declaration is part of a definition, the names of the parameters are
listed without their types. The old C style would define a function like this:
type_ret function_name(arg1, arg2, ..., argN)
type1 arg1;
type2 arg2;
...
typeN argN;
{
statement1;
...
statementN;
}

For example:
double add(a, b)
double a;
double b;
{
return a + b;
}

The types appear between the parameter list and the body of the function, not in the parameter
list itself. This kind of definition should be avoided, and we will explain why.

In K&R style (old C style, also known as pre-ANSI C), if a declaration is not part of a
definition, the parameter types are omitted as follows:
return_type function_name();

For example, the function add() is declared like this in K&R style:
double add();

There is no information about the parameters. This kind of declaration should be avoided.
You may see it in old C programs.

The prototype of a function is a declaration that specifies the types of the parameters it
accepts. For example, double add(double a, double b) is a prototype: it tells the compiler the name
add identifies a function that takes two parameters of type double. In C standard style, a
declaration is a prototype. In K&R style, a declaration is not a prototype.

A definition of a function comprises a declaration and the code of the function. It provides
the statements that will be executed when the function is called.

Before the inception of the C standard, there were no function prototypes at all.
ANSI C (C89/C90) introduced function prototypes, but prototypes and
even declarations were not required (though recommended). As of C99, functions must be
declared before being used, preferably with prototypes, though this is not required. As of C99, if
you call a function that has not been declared, the compiler must issue a diagnostic message
(a warning or an error, depending on the compiler).

Here are some examples of declarations, definitions and prototypes:
double add(); /* declaration K&R style*/

double mult(double, double); /* prototype */

double mult(double a, double b); /* prototype */

void printf(); /* declaration K&R style */

int is_even(int a) { /* definition with prototype */
return a % 2 == 0;
}

int is_even(a) /* definition in K&R style */
int a;
{
return a % 2 == 0;
}

Unless otherwise stated throughout the book, we will use the word function declaration as
synonym for function prototype or just prototype. We will not use the K&R function
declaration style that is obsolete.

Now that you have understood the difference between a prototype, a declaration and a definition,
we can explain why declarations are important. One of the most useful features of the C
language is its modularity. As we will find out in the next chapter, you can split your
program into several source files and create your own set of functions that can
be used by other programs. You can also use functions written by other programmers. To
call them, you just need their binaries, containing the code of the functions, and the header files
holding their declarations.

Suppose you have written a set of functions and built a library from the compiled binaries
(object files). A library is just a set of binary modules (known as object files) containing
the code of the provided functions (we will learn how to build one in Chapter XIII). Since the
functions are packaged as binaries, programmers and compilers have no access to their
definitions. How, then, could the compiler and programmers check the arguments passed to the
functions and their return values?

The answer lies in declarations: the compiler uses them to check that functions are called
properly. For example, if the function add() were defined outside your program, you would
have to provide the declaration of the function in your program:
double add(double a, double b);

Generally, the declarations of functions are placed in a text file called a header file, such as
stdio.h [49], as we will explain in the next chapter.


So far, we have considered programs composed of a single file (source file)
holding the complete C code, and our source files were organized like this:
#include <...>
#include <...>

function1(...) {
...
}

int main(...) {
...
}
Thus, our program was split into three sections:


o include section that includes header files
o function section that defines functions
o main section containing the main() function

What happens if our function section is placed after the definition of the main() function?
In other words, if we define our functions after they are actually called, does it work? We
have already answered the question. Here is an example clarifying the answer:
$ cat function_decl1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
float x = 10;

float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {
return a+b;
}
$ gcc -o function_decl1 -std=c99 -pedantic function_decl1.c
function_decl1.c: In function main:
function_decl1.c:7:4: warning: implicit declaration of function add
function_decl1.c: At top level:
function_decl1.c:13:8: error: conflicting types for add
function_decl1.c:7:15: note: previous implicit declaration of add was here

The call to the function add() occurs before the declaration of the function. That is why the
compiler complained. To correct it, we can place the definition of the add() function (which is
also a declaration) before the main() function (as we did in the example function_call1.c), or we can
give the declaration of the function before it is called, as in the following example:
$ cat function_decl2.c
#include <stdio.h>
#include <stdlib.h>

double add(double a, double b);

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {
return a+b;
}
$ gcc -o function_decl2 -std=c99 -pedantic function_decl2.c
$ ./function_decl2
10.000000 + 2.100000 = 12.100000

When a declaration is not part of the definition of a function, you may omit the parameter
names:
$ cat function_decl3.c
#include <stdio.h>
#include <stdlib.h>

double add(double, double);

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {
return a+b;
}
$ gcc -o function_decl3 -std=c99 -pedantic function_decl3.c
$ ./function_decl3
10.000000 + 2.100000 = 12.100000

The parameter types in the declaration are used to check the arguments and perform the
appropriate conversions (explained later in the chapter) if an argument has a type different
from the type of the corresponding parameter. If an argument cannot be converted
implicitly, an error is displayed as shown below:
$ cat function_decl4.c
#include <stdio.h>
#include <stdlib.h>

double add(double, double);

int main(void) {
float x = 10;
float y = 2.1;
double z = add( &x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}


double add(double a, double b) {
return a+b;
}
$ gcc -o function_decl4 -std=c99 -pedantic function_decl4.c
function_decl4.c: In function main:
function_decl4.c:9:4: error: incompatible type for argument 1 of add
function_decl4.c:4:8: note: expected double but argument is of type float *

The argument &x is a pointer to float and therefore cannot be converted to double.

In the same way, if we move the include section after the main() function, we get
the same kind of error:
$ cat function_decl5.c
double add(double a, double b) {
return a+b;
}

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}
#include <stdio.h>
#include <stdlib.h>
$ gcc -o function_decl5 -std=c99 -pedantic function_decl5.c
function_decl5.c: In function main:
function_decl5.c:10:4: warning: implicit declaration of function printf
function_decl5.c:10:4: warning: incompatible implicit declaration of built-in function printf
function_decl5.c:11:11: error: EXIT_SUCCESS undeclared (first use in this function)
function_decl5.c:11:11: note: each undeclared identifier is reported only once for each function it appears in

The compiler complained for two reasons:


o The printf() function, declared in the header file stdio.h, was not declared before being
used
o The EXIT_SUCCESS macro, defined in the header file stdlib.h, was not defined before
being used

If we move the inclusion of the header files just before the main() function, it works again:
$ cat function_decl6.c
double add(double a, double b) {
return a+b;
}

#include <stdio.h>
#include <stdlib.h>

int main(void) {
float x = 10;
float y = 2.1;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}
$ gcc -o function_decl6 -std=c99 -pedantic function_decl6.c
$ ./function_decl6
10.000000 + 2.100000 = 12.100000

Traditionally, the inclusions of header files are placed at the beginning of the source file
allowing functions within the source file to call the functions declared in header files.

Historically, before the inception of the C standard, function declarations could appear
with an empty parameter list (K&R style) or even be omitted. Though compilers still
accept this obsolescent feature, you should never use it because it prevents the compiler
from doing its job correctly. In the C standard style, the declarations of functions specify the
types of the parameters, or the keyword void if the function takes no parameter. In the
original C style, known as K&R style (Kernighan & Ritchie style), we could declare a
function like this:
return_type function_name();

Let us show why you should not use the old style. Let us start with K&R declarations as in
the example below:
$ cat old_style1.c
#include <stdio.h>
#include <stdlib.h>

double add(); /* K&R style declaration */

int main(void) {
double x = 10;
double y = 2;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {
return a+b;
}
$ gcc -o old_style1 -std=c99 -pedantic old_style1.c
$ ./old_style1
10.000000 + 2.000000 = 12.000000

It works but now try this one:


$ cat old_style2.c
#include <stdio.h>
#include <stdlib.h>

double add(); /* K&R style declaration */

int main(void) {
int x = 10;
int y = 2;
double z = add( x, y );

printf("%d + %d = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {
return a+b;
}
$ gcc -o old_style2 -std=c99 -pedantic old_style2.c
$ ./old_style2
10 + 2 = -2124375231618922398463637855521183204518847099

The program does not yield the expected result because the declaration is not a
prototype: the compiler can neither check the arguments nor convert them if required.
In our example, the arguments of type int are passed to the function without being converted
to type double, and the behavior of the call is undefined. The following example shows it more explicitly:


$ cat old_style3.c
#include <stdio.h>
#include <stdlib.h>

double display_arg(); /* K&R style declaration */

int main(void) {
int x = 20;

printf("call display_arg(%d)\n", x);
display_arg( x );

return EXIT_SUCCESS;
}

double display_arg(double a) {
printf("passed argument = %f\n", a);
}
$ gcc -o old_style3 -std=c99 -pedantic old_style3.c
$ ./old_style3
call display_arg(20)
passed argument = 0.000000

Therefore, the K&R declaration does not allow the compiler to convert the arguments if
required. The following example shows that the compiler even lets you pass the wrong number of arguments!
$ cat old_style4.c
#include <stdio.h>
#include <stdlib.h>

double add(); /* K&R style declaration */

int main(void) {
double x = 10;
double y = 2;
double z = add( x );

printf("%d + %d = %f\n", x, y, z);
return EXIT_SUCCESS;
}

double add(double a, double b) {


return a+b;
}
$ gcc -o old_style4 -std=c99 -pedantic old_style4.c
$ ./old_style4
0 + 1076101120 = 2.000000

Now, it is the turn of the K&R definition. A definition in the old style resembles a
definition in the C standard syntax, but they behave differently. Try this:
$ cat old_style5.c
#include <stdio.h>
#include <stdlib.h>

/* K&R style definition */
double add(a, b)
double a;
double b;
{
return a+b;
}
int main(void) {
double x = 10;
double y = 2;
double z = add( x, y );

printf("%f + %f = %f\n", x, y, z);
return EXIT_SUCCESS;
}
$ gcc -o old_style5 -std=c99 -pedantic old_style5.c
$ ./old_style5
10.000000 + 2.000000 = 12.000000

The arguments have the same types as the parameters, so all is fine. But see what happens if you pass
other types:
$ cat old_style6.c
#include <stdio.h>
#include <stdlib.h>

/* K&R style definition */
double add(a, b)
double a;
double b;

{
return a+b;
}
int main(void) {
int x = 10;
int y = 2;
double z = add( x, y );

printf("%d + %d = %f\n", x, y, z);
return EXIT_SUCCESS;
}
$ gcc -o old_style6 -std=c99 -pedantic old_style6.c
$ ./old_style6
10 + 2 = -21243752316189223984636378555211832045188470999510

The arguments are not converted to the corresponding types of the parameters, which
yields erroneous output.

VII.5.1 Name spaces


There are four different name spaces for identifiers:
o Ordinary identifiers: functions, objects, user-defined types (typedef) and enumeration
constants
o Labels (used by the goto statement)
o Members of structures and unions (each structure or union has its own name space for its members)
o Tags for structures, unions and enumerations

There will be no collision if two or more identical identifiers pertain to different name
spaces. In the following example, the identifier s refers to elements in different name
spaces:
$ cat name_space1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
char *s = "Hello"; /* identifier s for object */
struct s { /* identifier s is a tag */
int s[10]; /* identifier s for structure member */
};

return EXIT_SUCCESS;
}


In the following example, the identifier string refers to an object, a structure and a member
of a structure:
$ cat name_space2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
struct string { /* identifier string is a tag */
char string[255]; /* identifier of structure member */
} string; /* identifier of an object */

return EXIT_SUCCESS;
}

VII.6 Scope of identifiers


VII.6.1 Definition
There is an important point that we are going to discuss here and complete in the next chapter:
the scope of identifiers.

An identifier is a symbol composed of letters, digits and underscores that represents a function,
an object (variable), a typedef type, a union, a structure, an enumeration, a macro, a label
(used by the statement goto) or a member of a structure, union or enumeration type. Natural
questions that arise are:
o Is an identifier accessible everywhere in the program?
o Could we hide an identifier?
o Are identifiers within a function visible outside the function?
o What is the lifetime of an identifier?
o And so on.

An identifier is said to be visible if it is accessible. The scope of an identifier (also known
as a lexical scope) is the portion of code where it is visible. There are four kinds of scopes:
file scope, function scope, block scope, and function prototype scope.

VII.6.2 Prototype scope


Parameters declared within a prototype of a function (that is not part of a definition) are
visible only within the declaration. Within a function prototype, each parameter name must be unique;
otherwise, an error is generated at compilation time, as in the following example:
double f(double a, int a);

The following is valid. The parameters a and b have function prototype scope:
double add(double a, double b);

VII.6.3 Function scope


Only labels (used by the goto statement) have function scope. They can be used anywhere
within a function and, unlike other identifiers, they cannot be hidden. That is, within a
function, a label is unique, and therefore you cannot use another label with the same name, even
within another block. The following example, using two labels of the same name, is not
correct:
$ cat function_scope1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int max = 10;
int i;

for (i=0; i < 10; i++) {
if ( i == 3 ) goto MSG;
printf("%d ", i);
MSG:
printf("goto label MSG. i=%d\n", i);
}

MSG:
printf("Goto label MSG. End of Program\n");

return EXIT_SUCCESS;
}
$ gcc -o function_scope1 -std=c99 -pedantic function_scope1.c
function_scope1.c: In function main:
function_scope1.c:16:4: error: duplicate label MSG
function_scope1.c:12:7: note: previous definition of MSG was here

VII.6.4 Block scope


An identifier declared within a block has block scope. It is visible within the block in
which it is declared. It is often known as a local identifier in programming languages. Recall
that a block starts with a left brace ({) and terminates with the corresponding right
brace (}). In the following example, the variable j has block scope since it is declared in
the body of the main() function [50].
$ cat block_scope1.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int j = 500;

printf("j=%d\n", j);

return EXIT_SUCCESS;
}
$ gcc -o block_scope1 -std=c99 -pedantic block_scope1.c
$ ./block_scope1
j=500

In the example below, the variable j is declared in two different blocks. The variable j in
the if block hides the variable j declared in the block enclosing it (body of the main()
function):
$ cat block_scope2.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int j = 500;
int cond = 1;

if ( cond ) {
int j = 10;
printf("IF BODY: j=%d\n", j);
}

printf("main() BODY: j=%d\n", j);

return EXIT_SUCCESS;
}
$ gcc -o block_scope2 -std=c99 -pedantic block_scope2.c
$ ./block_scope2
IF BODY: j=10
main() BODY: j=500

This example shows that an identifier or a user-defined type declared within a block (block
scope) hides any declaration of the same name at file scope or in the blocks that enclose it.

Within the same block, an identifier can be declared only once in a given name space. The following example is
wrong:
$ cat block_scope3.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
int j = 500;
float j = 1.9;

return EXIT_SUCCESS;
}
$ gcc -o block_scope3 -std=c99 -pedantic block_scope3.c
block_scope3.c: In function main:
block_scope3.c:6:10: error: conflicting types for j
block_scope3.c:5:8: note: previous definition of j was here

In the following example, the variables s and j are declared in both the functions f() and main(), but
they do not reference the same object since they are declared in different blocks (body of
function f() and body of function main()):
$ cat block_scope4.c
#include <stdio.h>
#include <stdlib.h>

void f(void) {
char *s = "function f()";
int j = 10;

printf("s=%s, j=%d\n", s, j);
}

int main(void) {
f();
char *s = "function main()";
int j = 500;

printf("s=%s, j=%d\n", s, j);

return EXIT_SUCCESS;
}
$ gcc -o block_scope4 -std=c99 -pedantic block_scope4.c
$ ./block_scope4
s=function f(), j=10
s=function main(), j=500

An identifier declared within a function is visible only in the body of the function in which
it is declared (block scope).

The parameters of a function are visible in the body of the function as if they were
declared in it: they have block scope as shown below.
$ cat block_scope5.c
#include <stdio.h>
#include <stdlib.h>

void f(int j) {
   int cond = 1;

   if ( cond ) {
      int j = 10;
      printf("IF BODY: j=%d\n", j);
   }

   printf("f() BODY: j=%d\n", j);
}

int main(void) {
   f(500);
   return EXIT_SUCCESS;
}
$ gcc -o block_scope5 -std=c99 -pedantic block_scope5.c
$ ./block_scope5
IF BODY: j=10
f() BODY: j=500

In the example above, the variable j in the if body hides the parameter j. As soon as the if
statement terminates, the parameter j is no longer hidden.

The same rule applies to user-defined types. User-defined types defined within a block are
visible only within the block in which they are declared (block scope):
$ cat block_scope6.c
#include <stdio.h>
#include <stdlib.h>

void display_parity(int x) {
   typedef enum { EVEN = 0, ODD = 1 } parity;   /* type local to this function */
   parity remainder;

   /* use the argument passed by the caller */
   remainder = (x % 2 == 0) ? EVEN : ODD;

   if ( remainder == EVEN )
      printf("%d is even\n", x);
   else if ( remainder == ODD )
      printf("%d is odd\n", x);
}

int main(void) {
   display_parity(10);
   return EXIT_SUCCESS;
}
$ gcc -o block_scope6 -std=c99 -pedantic block_scope6.c
$ ./block_scope6
10 is even

In the example above, the enumeration type parity is visible only within the body of the
function display_parity().

VII.6.5 File scope


An identifier declared outside a function has file scope. It is visible anywhere within the
file in which it is declared, from its declaration onward, except within a block in which
there is another declaration of the identifier (there, it is hidden). Such an identifier is
also said to be external (sometimes called global). Throughout the book, we will use the
adjective global as a synonym for external, meaning having file scope [51].


A function cannot be defined within another function, so a function definition always has
file scope. The identifier of a function (its name) is accessible from its declaration to the
end of the file in which it is declared. Like any identifier with file scope, it can be
hidden only by a declaration of the same name inside a block. In the following example,
the functions f() and g() are accessible to any function that follows them in the file
file_scope1.c:
$ cat file_scope1.c
#include <stdio.h>
#include <stdlib.h>

void f(void) {
   printf("function f() called\n");
}

void g(void) {
   f();
}

int main(void) {
   g();
   f();
   return EXIT_SUCCESS;
}
$ gcc -o file_scope1 -std=c99 -pedantic file_scope1.c
$ ./file_scope1
function f() called
function f() called

An object can also have file scope: it is visible within the body of any function of the file
in which it is declared. Such an object is declared outside functions. For this reason, such
an object is often described as external. In the following example, the variable j and the
pointer s have file scope:
$ cat file_scope2.c
#include <stdio.h>
#include <stdlib.h>

char *s = "global object";
int j = 500;

void f(void) {
   printf("s=%s, j=%d\n", s, j);
}

int main(void) {
   f();
   printf("s=%s, j=%d\n", s, j);

   return EXIT_SUCCESS;
}
$ gcc -o file_scope2 -std=c99 -pedantic file_scope2.c
$ ./file_scope2
s=global object, j=500
s=global object, j=500

In the following example, the identifiers s and j are declared at file scope (global) and
declared again at block scope (local) in the f() function and in the main() function:
$ cat file_scope3.c
#include <stdio.h>
#include <stdlib.h>

/* variables with file scope */
char *s = "global object";
int j = 500;

void f(void) {
   char *s = "block f()";
   int j = 10;

   printf("s and j are local: s=%s, j=%d\n", s, j);
}

void g(void) {
   printf("s and j are global: s=%s, j=%d\n", s, j);
}

int main(void) {
   char *s = "block main()";
   int j = 20;

   f();
   g();
   printf("s and j are local: s=%s, j=%d\n", s, j);

   return EXIT_SUCCESS;
}
$ gcc -o file_scope3 -std=c99 -pedantic file_scope3.c
$ ./file_scope3
s and j are local: s=block f(), j=10
s and j are global: s=global object, j=500
s and j are local: s=block main(), j=20

Local objects (block scope) hide global objects (file scope). The pointer s and the variable
j of the function f() hide the pointer s and the variable j having file scope. In the same
way, the pointer s and the variable j in the main() function hide the pointer s and the
variable j having file scope. The function g() declares neither of them and therefore sees
the global objects.

A global (external) user-defined type, visible to any function within a source file (file
scope), is declared outside functions. In the following example, the structure string is
visible to all the functions of the source file file_scope4.c:
$ cat file_scope4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Global structure string */
struct string {
   char *s;
   int len;
};

typedef struct string string;

/* create a structure string from a string passed as argument */
string create_string (char *s) {
   string ret_s = { NULL, 0 };
   int len = 0;

   if ( s == NULL )
      return ret_s;

   len = strlen(s);
   ret_s.s = malloc( len + 1 );

   if ( ret_s.s == NULL ) {
      printf("Cannot allocate memory\n");
      return ret_s;
   }

   ret_s.len = len;
   strcpy (ret_s.s, s);
   return ret_s;
}

/* display the string stored in the structure string */
void display_string (string s) {
   s.s != NULL ? printf("String=%s\n", s.s) : printf("String=NULL\n");
}

int main(void) {
   string msg1 = create_string("This is a struct string");
   string msg2 = create_string(NULL);
   display_string(msg1);
   display_string(msg2);
   free(msg1.s);   /* release the memory allocated by create_string() */
   return EXIT_SUCCESS;
}
$ gcc -o file_scope4 -std=c99 -pedantic file_scope4.c
$ ./file_scope4
String=This is a struct string
String=NULL

VII.6.6 Same scope


Two identifiers are said to have the same scope if their scope ends at the same point within
a program. Two identifiers with file scope have the same scope. Two identifiers declared
in the same block have the same scope. Two identifiers having function prototype scope
have the same scope if they belong to the same declaration of a function.

VII.6.7 Scope and visibility


Let us summarize what we said about the visibility of identifiers. Two identifiers belonging
to the same name space may have the same name if they are declared in different scopes. As
scopes may overlap (a scope s1 may enclose a scope s2), an identifier declared in the larger
scope may be hidden by identifiers declared in embedded scopes (see Figure VII-2).

Figure VII-2 Scope overlaps

VII.7 Storage duration


Every object is stored in the computer's memory so that it can be read or updated. An object
exists as long as it has a memory location storing it. What happens if we try to use an
object that no longer exists? So far, we have always worked with objects within their
scope, so their lifetime seemed obvious: they existed in their scope. What do you think
about the following code?

$ cat function_lifetime1.c
#include <stdio.h>
#include <stdlib.h>

int *f(void) {
   int s[10] = {10, 18, 20};

   return s;
}

int main(void) {
   int *p = f();

   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime1 -std=c99 -pedantic function_lifetime1.c
function_lifetime1.c: In function 'f':
function_lifetime1.c:7:4: warning: function returns address of local variable
$ ./function_lifetime1

The compiler detected that our code was wrong. In our program, the f() function returned a
pointer to the first element of an array. The problem is that the array was a local variable
(block scope) that was destroyed as soon as the function f() terminated. This means the
pointer returned by the f() function pointed to an object that no longer existed. Hence the
question: what is the lifetime of an object?

The time during which an object exists, while the program is running, is the lifetime of the
object. An object exists as long as it is bound to a memory chunk in which it is stored. In
other words, the storage duration determines the lifetime of an object. There are three
kinds of storage durations: automatic, static and allocated. The storage-class specifiers
(auto, extern, static, register) are the keywords that, together with the place of the
declaration, determine the storage duration of the declared object. At most one
storage-class specifier is allowed in a declaration. Moreover, register is the only
storage-class specifier allowed in the declaration of a function parameter.

Storage duration must not be confused with scope. A scope defines the portion of a
program where you can use an identifier. The storage duration defines the lifetime of the
object the identifier designates. Thus, a variable may exist as long as the program is
running while it can be used only within a specific block (local variable declared with the
keyword static).

VII.7.1 Automatic duration


An object declared within a block (block scope) with the storage-class specifier auto has
automatic storage duration. The reserved word auto is generally omitted: it is the default
when objects having block scope are declared without the storage-class specifier static or
extern. This means that local objects have automatic storage duration unless stated
otherwise.

The storage-class specifier register also declares an object with automatic storage duration.
It suggests to the compiler that access to the variable should be as fast as possible. This
is not a requirement: the compiler may ignore the hint and treat the object as if it had
been declared with the keyword auto. The C standard does not specify how to make the
access faster. In practice, it usually means the variable may be kept in a processor
register rather than in the computer's memory. The storage-class specifier register is not
frequently used because of its constraints and because compilers are smart enough to
optimize the code according to the processor architecture. Since registers have no address,
the address of an object declared with the keyword register cannot be taken, even if the
compiler actually stores the object in memory. This means the operator & cannot be applied
to an object declared with the storage-class specifier register. When the specifier is
applied to an array, since the address of the array cannot be computed, you cannot use
subscripts to access its elements as shown below:
$ cat register.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
   register int v =10;
   register int s[10] = { 1, 2 , 3};
   printf("&v=%p\n", &v);
   printf("s[1]=%d\n", s[1]);

   return EXIT_SUCCESS;
}
$ gcc -o register -std=c99 -pedantic register.c
register.c: In function 'main':
register.c:7:4: error: address of register variable 'v' requested
register.c:8:25: warning: ISO C forbids subscripting 'register' array


An object having automatic storage duration (local object) is created within its block and
is destroyed when the block is left: it is temporary. When an object is created, storage is
allocated for storing its value. It is destroyed when its storage is freed and becomes
available for another object. This implies you must not use the address of an object with
automatic storage duration once its block has been left, as we did in the example
function_lifetime1.c.

If a block is entered several times, such as in the case of a loop body, the local objects of
the block are created and initialized each time the block is entered and destroyed each
time it is left.

VII.7.2 Static storage duration


An object has static storage duration in the following cases:
o It is declared with the storage-class specifier static. Its scope can be file or block.
o It has file scope (global object).
o It is declared with the storage-class specifier extern.

Throughout the book, we call static identifier an identifier declared with the storage-class
specifier static. Therefore, a static identifier has static storage duration and can have file
scope (global) or block scope (local).

VII.7.2.1 Global objects (file scope)
An object declared outside functions (file scope) without the keyword static is said to be
external or global. Not only is it visible within the source file in which it is declared,
but it can also be accessed from the other source files of the program (through an extern
declaration, described in the next section): a global object is available throughout the
whole program. It exists until the program terminates: it is permanent. It is created and
initialized once, before the program starts, and destroyed when the program ends.
Functions are also global by design. In the following example, the variable status is
visible throughout the source file function_lifetime2.c and exists as long as the program
is running:
$ cat function_lifetime2.c
#include <stdio.h>
#include <stdlib.h>

int status = 10; /* global variable */

void f(void) {
   printf ("function f() status=%d\n", status);
   status = 20;
   printf ("function f() set status to %d\n\n", status);
}

void g(void) {
   printf ("function g() status=%d\n", status);
   status = 30;
   printf ("function g() set status to %d\n\n", status);
}

int main(void) {
   f();
   g();
   printf ("function main() status=%d\n", status);
   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime2 -std=c99 -pedantic function_lifetime2.c
$ ./function_lifetime2
function f() status=10
function f() set status to 20

function g() status=20
function g() set status to 30

function main() status=30


VII.7.2.2 Extern storage-class specifier
The extern storage-class specifier will be better understood in the next chapter. So far, our
programs have been composed of a single source file holding all our code. As a matter of
fact, a program can be composed of several source files. In each source file, you can define
global objects and functions (that are global by design). The extern storage-class specifier
used in a declaration tells the compiler the object is defined elsewhere, typically in
another source file, as an external object (file scope). For example, the declaration
extern int status in a translation unit indicates the variable status is defined in another
file as a global object (file scope) and we wish to access it throughout this source file.
Such an object holds the same identifier throughout the whole program and exists until the
program terminates. It is created once, before the program starts, and destroyed when the
program ends: it is permanent.

Let us suppose our program is made of two source files function_lifetime_main1.c and
function_lifetime_dummy1.c:
$ cat function_lifetime_main1.c
#include <stdio.h>
#include <stdlib.h>

extern int status; /* global variable defined elsewhere */

int main(void) {
   printf ("status=%d\n\n", status);
   return EXIT_SUCCESS;
}
$ cat function_lifetime_dummy1.c
int status = 40; /* global variable declared and initialized here */

$ gcc -c function_lifetime_dummy1.c
$ gcc -c function_lifetime_main1.c
$ gcc -o function_lifetime_main1 function_lifetime_main1.o function_lifetime_dummy1.o
$ ./function_lifetime_main1
status=40

We will talk more about modules in the next chapter. The command gcc -c creates an object
file (binary code) from a source file. The command gcc -o creates an executable from
object files.

By design, a function is global. In the following example the function f() is visible
throughout the whole program composed of two source files function_lifetime_main2.c and
function_lifetime_dummy2.c:
$ cat function_lifetime_main2.c
#include <stdlib.h>

extern void f(void); /* function f() is defined elsewhere */

int main(void) {
   f();
   return EXIT_SUCCESS;
}
$ cat function_lifetime_dummy2.c
#include <stdio.h>

void f(void) {
   printf ("function f()\n");
}
$ gcc -c function_lifetime_dummy2.c
$ gcc -c function_lifetime_main2.c
$ gcc -o function_lifetime_main2 function_lifetime_main2.o function_lifetime_dummy2.o
$ ./function_lifetime_main2
function f()


VII.7.2.3 Static storage-class specifier
The static storage-class specifier can be used in two ways: at file scope or block scope. An
object declared with the storage-class specifier static exists until the program terminates: a
static object is permanent.


VII.7.2.3.1 File scope

Used outside functions (file scope), the static storage-class specifier makes an object visible
only within the source file in which it is declared. Without the storage-class specifier static,
a global object can be accessed within other source files. Let us reuse our previous
example and place the static keyword before our variable status. What do you think will
happen?
$ cat function_lifetime_main3.c
#include <stdio.h>
#include <stdlib.h>

extern int status; /* global variable defined elsewhere */

int main(void) {
   printf ("status=%d\n\n", status);
   return EXIT_SUCCESS;
}
$ cat function_lifetime_dummy3.c
static int status = 40; /* global variable declared and initialized here */

$ gcc -c function_lifetime_dummy3.c
$ gcc -c function_lifetime_main3.c
$ gcc -o function_lifetime_main3 function_lifetime_main3.o function_lifetime_dummy3.o
Undefined                       first referenced
 symbol                             in file
status                              function_lifetime_main3.o
ld: fatal: symbol referencing errors. No output written to function_lifetime_main3
collect2: ld returned 1 exit status

The build failed at the link step because the variable status is no longer visible from the
source file function_lifetime_main3.c. The variable status is visible only throughout the
source file function_lifetime_dummy3.c.

What we said about objects holds true for functions. For example:
$ cat function_lifetime_main4.c
#include <stdlib.h>

extern void f(void); /* function f() is defined elsewhere */

int main(void) {
   f();

   return EXIT_SUCCESS;
}
$ cat function_lifetime_dummy4.c
#include <stdio.h>

static void f(void) {
   printf ("function f()\n");
}
$ gcc -c function_lifetime_dummy4.c
$ gcc -c function_lifetime_main4.c
$ gcc -o function_lifetime_main4 function_lifetime_main4.o function_lifetime_dummy4.o
Undefined                       first referenced
 symbol                             in file
f                                   function_lifetime_main4.o
ld: fatal: symbol referencing errors. No output written to function_lifetime_main4
collect2: ld returned 1 exit status

The build failed at the link step because the function f() in the source file
function_lifetime_dummy4.c is visible only within this file.

We will say more about static objects in the next chapter. For now, just remember that the
keyword static, used with identifiers having file scope, makes them visible only in the
source file in which they are declared.

VII.7.2.3.2 Block scope

Used with an identifier having block scope, the static storage-class specifier turns what
would otherwise be a temporary local object (automatic) into a permanent object. The object
is created and initialized at program startup and keeps its value until the program
terminates. Let us consider the first program:
$ cat function_lifetime5.c
#include <stdlib.h>
#include <stdio.h>

void f(void) {
   static int j = 10;
   printf ("j=%d\n", j);
   j++;
}

int main(void) {
   f();
   f();
   f();
   f();
   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime5 -std=c99 -pedantic function_lifetime5.c
$ ./function_lifetime5
j=10
j=11
j=12
j=13

Compare with the following one:


$ cat function_lifetime6.c
#include <stdlib.h>
#include <stdio.h>

void f(void) {
   int j = 10;
   printf ("j=%d\n", j);
   j++;
}

int main(void) {
   f();
   f();
   f();
   f();
   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime6 -std=c99 -pedantic function_lifetime6.c
$ ./function_lifetime6
j=10
j=10
j=10
j=10

In the program function_lifetime5.c, the variable j has static storage duration. It is created
(and initialized) at program startup and exists as long as the program runs, keeping its
value until it is changed. The variable j is permanent even though it is local (block scope).

In the program function_lifetime6.c, the variable j has automatic storage duration. It is created
and initialized each time the function f() is executed. It is destroyed as the function f() is
left. The variable j is temporary.



This means that if we rewrite our program function_lifetime1.c using the static keyword, it will
work as expected:
$ cat function_lifetime7.c
#include <stdio.h>
#include <stdlib.h>

int *f(void) {
   static int s[10] = {10, 18, 20};

   return s;
}

int main(void) {
   int *p = f();

   printf ("p[0]=%d\n", p[0]);
   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime7 -std=c99 -pedantic function_lifetime7.c
$ ./function_lifetime7
p[0]=10

Yes, it will work, but it implies you will always get the same array each time you call the
function f() as shown below:
$ cat function_lifetime8.c
#include <stdio.h>
#include <stdlib.h>

int *f(void) {
   static int s[10] = {10, 18, 20};

   return s;
}

int main(void) {
   int *p;
   int *q;

   p = f();
   p[0] = 200;
   printf ("p[0]=%d\n", p[0]);

   q = f();
   printf ("q[0]=%d\n", q[0]);

   return EXIT_SUCCESS;
}
$ gcc -o function_lifetime8 -std=c99 -pedantic function_lifetime8.c
$ ./function_lifetime8
p[0]=200
q[0]=200

If this is what you want, it is fine, but if you want to get a new array at each call, you
have to use a memory block dynamically allocated by malloc() or calloc(). Such objects are
more flexible since they have allocated storage duration.

VII.7.3 Allocated storage duration


A valid pointer holds an address pointing to an existing memory block. As we explained,
a valid pointer references an object created automatically (such as a local variable), an
object with static storage duration, or a memory area allocated by the malloc(), calloc() or
realloc() function. An automatic object is created in the block in which it is declared and
destroyed when the block is left. A pointer referencing such an object can be used only
while the block in which the object is declared is being executed. A pointer to an object
with static storage duration can be returned by a function and used throughout a program
until it terminates. A memory area allocated by the malloc(), calloc() or realloc() function
can be used until the free() function is invoked: such an object has allocated storage
duration. You decide the lifetime of such an object. As soon as you no longer need it, you
just call the free() function. You can view it as a dynamic storage duration controlled by
the user.

We can rewrite our program function_lifetime1.c using an allocated memory area:
$ cat function_lifetime9.c
#include <stdio.h>
#include <stdlib.h>

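int *f(void) {
   int len = 10;
   int *s = malloc(len * sizeof *s);

   if ( s == NULL )   /* allocation failed */
      return NULL;

   s[0] = 10;
   s[1] = 18;
   s[2] = 20;

   return s;
}

int main(void) {
   int *p;
   int *q;

   p = f();
   if ( p == NULL )
      return EXIT_FAILURE;
   p[0] = 200;
   printf ("p[0]=%d\n", p[0]);

   q = f();
   if ( q == NULL )
      return EXIT_FAILURE;
   printf ("q[0]=%d\n", q[0]);

   return EXIT_SUCCESS;
}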
$ gcc -o function_lifetime9 -std=c99 -pedantic function_lifetime9.c
$ ./function_lifetime9
p[0]=200
q[0]=10

As soon as you no longer need the allocated me