
CSE211 Lecture Notes – 2004/2005-II

CSE 211: Data Structures

Lecture Notes I
by
Ender Ozcan, Şebnem Baydere

Data, Data Structures, C Language Features Revisited: Typing and Pointers, Abstract Data Types, Sequences

1. DATA AND STRUCTURING DATA

Solving commercial, engineering and research problems with computers requires many data
items. Each individual data item is an atomic piece of information containing a value about
some aspect of the problem.

A data item has a type: the set of values from which it can take its value. For example, the
data item “age” has the type “non-negative integer”, and the data item “salary” has the type
“real number”. Some types are simply reused: the data item “student id” is just a
“non-negative integer”, although we do not usually do arithmetic on student ids. Other types
are derived from the real world. Some of these arise from constraints on the data, and some
are user-defined types for structuring data. For example, the possible values of a month are
restricted to twelve: January, February, ..., December. Real-world problems are complex, so
their data must be processed and stored in a structured manner to keep it manageable.

Before starting to work at a company, you fill out an employee form. There are some data
items that you have to provide, such as your first name, last name, date of birth and so on.
These forms are received by the human resources department and kept in a file, a structure
that holds a set of records. To reach records in a file quickly, a key may be assigned to each
record. For example, the key 2382912 can be assigned to the record of the employee Mr. Ay Sies.
There are two forms of structure in this example:
• a record, representing a form and holding many individual but linked items of data
• an array of items of the same type, that is, the file.
There are two types of data:
• Employee information is real-world data.
• Structural information describes how the real data is stored and accessed, e.g. keys.
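
The record and the file described above can be sketched in C as follows. This is only an illustration: the type name EmployeeRecord, its field names and sizes, and the helper find_by_key are assumptions made for this sketch; only the idea of a key attached to each record comes from the text.

```c
#include <stddef.h>

/* A record: one employee form, holding linked items of different types.
   Field names and sizes are assumptions made for this sketch. */
typedef struct {
    long key;             /* structural data: used to locate the record */
    char firstName[30];   /* real-world data */
    char lastName[30];
    int  birthYear;
} EmployeeRecord;

/* The "file" is an array of records. Search it by key and return the
   index of the matching record, or -1 if no record has that key. */
int find_by_key(const EmployeeRecord file[], size_t count, long key)
{
    for (size_t i = 0; i < count; i++) {
        if (file[i].key == key)
            return (int)i;
    }
    return -1;
}
```

The key plays no role in describing the employee; it exists only so that the record can be found, which is what makes it structural rather than real-world data.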

Language Support For Typing Data

Primitive data types allow direct typing of objects

int nextEntry;
Synonyms

typedef float salary;

salary totalSalaries, nextSalary;
totalSalaries = totalSalaries + nextSalary;

Enumerated Types

enum month_t {January, February, March, April, May, June, July, August,
              September, October, November, December};
enum month_t monthOfSeminars;
monthOfSeminars = January;

Structure (Record)
struct struct-name {
Struct-items // list of declarations
};

typedef struct struct-name type-name;

A structure is a collection of one or more variables, possibly of different types, grouped
together under a single name. (In Pascal they are called "records".) Structures help to organize
complicated data because they permit a group of variables to be treated as a single unit.

Remember that the array has two major benefits: we can index it, and therefore reach each
element of the array directly, and we can pass just the name of the array to a function.

One of the differences between arrays and structures is that we cannot simply loop over the
items of a structure as we do with arrays, because the items may not all have the same type.

Structures may be copied and assigned to, passed to a function, and returned by a function.

The variables named in the declarations of a structure are called members. The member access
operator is the dot operator (.). If the structure is reached through a pointer, the arrow
operator (->) is used instead.

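struct_object.member_name

struct student_r {
    long long ID;           /* 9802070203 does not fit in a 32-bit int */
    char name[50];
    char lname[50];
};
typedef struct student_r Student;

Student s1;
s1.ID = 9802070203LL;
s1.name[0] = 'A';
s1.name[1] = 'L';
s1.name[2] = 'I';           /* ASCII 'I'; whether 'İ' fits in one char depends on the encoding */
s1.name[3] = '\0';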
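
The statements about copying, passing and returning structures, and about the -> operator, can be tried with a short sketch. The helper names make_student and set_id are hypothetical, and the ID member is declared long long here on the assumption that int is 32 bits wide.

```c
#include <string.h>

struct student_r {
    long long ID;
    char name[50];
    char lname[50];
};
typedef struct student_r Student;

/* A function may build a structure and return it by value. */
Student make_student(long long id, const char *name, const char *lname)
{
    Student s;
    s.ID = id;
    strncpy(s.name, name, sizeof s.name - 1);
    s.name[sizeof s.name - 1] = '\0';      /* strncpy may not terminate */
    strncpy(s.lname, lname, sizeof s.lname - 1);
    s.lname[sizeof s.lname - 1] = '\0';
    return s;
}

/* A function that receives a pointer reaches the members with ->. */
void set_id(Student *s, long long id)
{
    s->ID = id;                            /* same as (*s).ID = id */
}
```

Assigning one Student to another (Student b = a;) copies every member, including the character arrays, so a later set_id(&b, ...) leaves a untouched.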

2. ARRAYS

#define MaxSize 5
typedef int myIntArray_t[MaxSize];
myIntArray_t ids;
Let's clarify the difference between arrays and structures by an abstract example given below.
We start with the following definitions:

The Array is the basic mechanism for storing a collection of identically typed objects.
The Structure stores a collection of objects that need not be of the same type.

Abstract example: Consider the layout of an apartment building. Each floor might have a one-
bedroom unit, a two-bedroom unit, a three-bedroom unit and a utility room for the floor.

Q. Which data structures can we use to define this apartment?


A. Each floor is stored as a structure and the building is an array of floors.
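
One possible C rendering of this answer is sketched here. The type names Unit, UtilityRoom and Floor, their fields, and the helper total_bedrooms are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* A dwelling unit and a utility room are different kinds of object. */
typedef struct {
    int    bedrooms;
    double rent;            /* hypothetical field */
} Unit;

typedef struct {
    int hasWaterHeater;     /* hypothetical field */
} UtilityRoom;

/* A floor groups items of different types, so it is a structure. */
typedef struct {
    Unit        oneBedroom;
    Unit        twoBedroom;
    Unit        threeBedroom;
    UtilityRoom utility;
} Floor;

/* All floors have the same type, so the building is an array of floors
   and can be traversed with a loop. */
int total_bedrooms(const Floor building[], size_t floors)
{
    int total = 0;
    for (size_t f = 0; f < floors; f++) {
        total += building[f].oneBedroom.bedrooms
               + building[f].twoBedroom.bedrooms
               + building[f].threeBedroom.bedrooms;
    }
    return total;
}
```

Note that the loop runs over the array of floors, while the members inside each floor have to be named one by one.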

In this section we will see arrays and their relationship with pointers.
Declaration:

int a[10];
// defines an array of size 10, that is, a block of 10 consecutive objects named a[0], a[1], ..., a[9].

a:  | a[0] | a[1] | ... | a[9] |

a[i] refers to the element at index i of the array (counting from 0). The compiler allocates
consecutive space for 10 integers.

Common errors related to arrays:

• No index range checking is performed in C/C++, so access outside the array bounds is not
caught by the compiler. No explicit run-time error is guaranteed either; the behaviour is
undefined and often mysterious.
• If an array is passed as an actual argument to a function, the function has no idea how
large the array is.
• Arrays cannot be copied with the = operator.
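
The usual ways of living with these three limitations are sketched here. The function names sum_ints and copy_ints are hypothetical; memcpy is the standard library routine declared in <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Inside the function, a is only a pointer to the first element, so the
   element count n must be passed separately by the caller. */
long sum_ints(const int a[], size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)   /* i < n keeps every access in bounds */
        total += a[i];
    return total;
}

/* Arrays cannot be assigned with =, so the elements are copied explicitly.
   dst must have room for at least n elements. */
void copy_ints(int dst[], const int src[], size_t n)
{
    memcpy(dst, src, n * sizeof src[0]);
}
```

Where the array definition itself is visible, the caller can compute the element count as sizeof arr / sizeof arr[0] and pass it along.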

Dynamic allocation of arrays

• In C, the functions calloc and malloc are used to allocate memory, and free is used to
deallocate it.
  int *Arr2 = malloc( sizeof( int ) * Size );
  free(Arr2);
• In C++, new [ ] and delete [ ] are used.
  int Arr1[ Size ]; // Size is a compile-time constant
  int *Arr2 = new int [ Size ];
  delete [ ] Arr2;
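
A slightly fuller C sketch of the same idea follows, with the check for a failed allocation that real code needs. The function names make_filled_array and make_zeroed_array are hypothetical; malloc, calloc and free are the standard functions from <stdlib.h>.

```c
#include <stdlib.h>

/* Allocates n ints with malloc and sets each one to value.
   Returns NULL if the allocation fails; the caller must free the result. */
int *make_filled_array(size_t n, int value)
{
    int *arr = malloc(n * sizeof *arr);
    if (arr == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        arr[i] = value;
    return arr;
}

/* calloc takes the element count and element size separately and
   returns memory that is already set to zero. */
int *make_zeroed_array(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Each successful allocation is paired with exactly one call to free once the array is no longer needed.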

Example – structuring data


Given a problem, a genetic algorithm (GA) generates a population of individuals, where each
individual is a candidate solution for the problem. An individual consists of a chromosome and
a fitness value. A fitness function evaluates the quality of the solution (individual),
indicating how fit a given individual is. A chromosome consists of a sequence of genes; the
gene at each locus takes a value from the related allele set. In a traditional GA, an allele
value is either 0 or 1, but integer encoding is also allowed. The initial generation of
individuals is then evolved towards a final generation using genetic operators, namely
crossover and mutation. At each evolutionary step, the next generation is determined based on
survival of the fittest. When the termination criteria are satisfied, evolution ends, in the
hope that the best individual obtained so far is the solution being sought. The average fitness
(total fitness / population size) of the population is expected to improve from one generation
to the next.

Q: How can we structure the data?

Static versus Dynamic Implementation

Static implementation:

#define chromosomeLength 30
#define populationSize 100

typedef int Gene;

typedef struct {
    Gene chromosome[chromosomeLength];
    double fitness;
} Individual;

typedef struct {
    Individual indivs[populationSize];
    double totalFitness;
} Population;

Dynamic implementation:

typedef struct {
    int allele;
} Gene;

typedef struct {
    Gene **chromosome;
    double fitness;
    int chromosomeLength;
} Individual;

typedef struct {
    Individual **indivs;
    double totalFitness;
    int populationSize;
} Population;
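
To show how the static layout is used, here is a small sketch that evaluates a population and returns its average fitness (total fitness / population size). The fitness function count_ones is an assumption made purely for illustration, as is the name evaluate_population; the notes do not specify a fitness function.

```c
#include <stddef.h>

#define chromosomeLength 30
#define populationSize   100

typedef int Gene;

typedef struct {
    Gene   chromosome[chromosomeLength];
    double fitness;
} Individual;

typedef struct {
    Individual indivs[populationSize];
    double     totalFitness;
} Population;

/* Assumed fitness function, for illustration only: the number of genes
   whose allele value is 1. */
static double count_ones(const Individual *ind)
{
    double ones = 0.0;
    for (size_t g = 0; g < chromosomeLength; g++) {
        if (ind->chromosome[g] == 1)
            ones += 1.0;
    }
    return ones;
}

/* Evaluates every individual, stores the total fitness in the population
   and returns the average fitness. */
double evaluate_population(Population *pop)
{
    pop->totalFitness = 0.0;
    for (size_t i = 0; i < populationSize; i++) {
        pop->indivs[i].fitness = count_ones(&pop->indivs[i]);
        pop->totalFitness += pop->indivs[i].fitness;
    }
    return pop->totalFitness / populationSize;
}
```

With the dynamic layout the same loops would run up to the chromosomeLength and populationSize members stored in the structures instead of the compile-time constants.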

3. POINTERS
Sophisticated C (C++) programmers make heavy use of pointers to access variables (objects)
in a program, because:
• Sometimes a pointer is the only way to express a computation.
• Pointers often lead to more compact and efficient code.

Pointer: a variable (object) that holds the address of another variable (object). It provides
indirect rather than direct access to an object.

Pointers and arrays are closely related. We will see how shortly.

If pointers are used carelessly, it is easy to create pointers that point to an unexpected
location. Used with discipline, they simplify and clarify the code.

Let's look at pointer-like indirection in real-life situations:
• Somebody asks you for directions. If you do not know the answer, you may give an
indirect answer: "Go to shop X and ask for directions."
• Someone asks you for a phone number. Rather than giving an immediate answer, you may
give an indirect reply: "Let me look it up in the phone book."
• When a professor says, "Do problem 1.2 in the textbook", the actual homework assignment is
being stated indirectly.
• Looking up a topic in the index of a book: the index tells you where a full description
can be found.
• A street address is a pointer (it tells you where someone resides). A forwarding address is
a pointer to a pointer.

Pointers and Addresses


A typical machine has an array of consecutively numbered (addressed) memory cells that can
be manipulated individually or in contiguous groups.
• Any byte can be a character (char)
• A pair of one-byte cells can be treated as a short integer (short)
• Four adjacent bytes form a long integer (long)

In C/C++, a pointer is a variable stored in a group of cells that can hold an address, that is,
a location in memory where other data are stored. Since an address is an integer, a pointer can
usually be represented internally as an unsigned int or unsigned long, depending on the
machine architecture.

Question: why unsigned?

What makes a pointer more than just a plain integer is that we can access the data that it
points at. This is called dereferencing.

We can define a pointer to the elements of an array, a pointer to a structure, a pointer to an
integer, a pointer to a character, etc.

For pointer operations, two unary operators are used:

• Address-of operator &: To make a pointer point at an object, we need to know the target
object's memory address. The unary operator & is used for this purpose.

Let's say we define p to be a character pointer and c to be a character variable:

char c;
char *p;
p = &c;     assigns the address of c to the pointer variable p (p is said to "point to" c)

The unary operator & applies only to objects in memory: variables, structures and array
elements. It cannot be applied to expressions, constants or register variables.

• Indirection (dereferencing) operator *: When applied to a pointer, it accesses the object
the pointer points to.

c = *p;     c is assigned the value of the data p points to


Example: say x and y are two integers and ptr is an integer pointer.

Declaration:
int x = 5;
int y = 7;
int *ptr;       (the value held by ptr is a memory address; this declaration does not
                initialize ptr to any particular value)

Using a pointer before assigning an address to it produces bad results and may crash your program.

ptr = &x;       (ptr points at x: the memory location of x is assigned to ptr)

Address   Variable   Value
1000      x          5
1004      y          7
1200      ptr        1000   (address of x)

ptr ---> x: 5       y: 7

y = *ptr;       (dereferencing) the value of the data being pointed at is obtained with the
                dereferencing operator, so y becomes 5.

*ptr = 10;      dereferencing also works for writing a new value to the object, so x becomes 10.

Address   Variable   Value
1000      x          10
1004      y          5
1200      ptr        1000

ptr ---> x: 10      y: 5
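
The same sequence of statements can be run as a small C function. The name pointer_demo and its two output parameters are assumptions added so that the final values can be inspected by the caller; the actual addresses printed by a real program will differ from the 1000 and 1004 used in the tables.

```c
#include <stdio.h>

/* Repeats the example step by step and reports the final values of
   x and y through the two output parameters. */
void pointer_demo(int *xOut, int *yOut)
{
    int x = 5;
    int y = 7;
    int *ptr = &x;      /* ptr holds the address of x */

    y = *ptr;           /* read through ptr: y becomes 5 */
    *ptr = 10;          /* write through ptr: x becomes 10 */

    printf("x=%d y=%d *ptr=%d\n", x, y, *ptr);   /* prints "x=10 y=5 *ptr=10" */

    *xOut = x;
    *yOut = y;
}
```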

We could also have initialized ptr at declaration time.

int x = 5;
int y = 7;
int *ptr = &x;      correct

int *ptr = &x;
int x = 5;          incorrect (x is not declared yet)

int *ptr = x;       incorrect (x is not an address)

int *ptr;           legal, but the pointer is uninitialized

Address   Variable   Value
1000      x          5
1004      y          7
1200      ptr        ?

ptr ---> ?          x: 5       y: 7


We have already seen the correct syntax for the assignment:

ptr = &x;

Suppose that we forget the address-of operator. Then the assignment

ptr = x;        is flagged by the compiler, because x is an int and not an address. (A C++
                compiler rejects it; a C compiler issues at least a warning.)

*ptr = x;       is semantically incorrect but produces no compiler error, because the statement
says that the integer to which ptr is pointing should get the value of x. For instance, if ptr
is &y, then y is assigned the value of x. The assignment is legal, but it does not make ptr
point at x. Moreover, if ptr is uninitialized, dereferencing it is likely to cause a run-time
error.

Pointer Arithmetic
Every pointer points to a specific data type. If ptr points to an integer x, then *ptr can occur in
any context where x could.

x = 3;
ptr = &x;
*ptr = *ptr + 10;   adds 10 to *ptr (x becomes 13)

y = *ptr + 1;       takes the value ptr is pointing at, adds 1 to it and assigns the result to y
                    (y becomes 14)

*ptr += 1;          increments what ptr points to by 1 (same as *ptr = *ptr + 1)

++*ptr;             (same as above)

*ptr++;             is parsed as *(ptr++): it yields the value ptr currently points to and then
                    increments the pointer itself, so after this operation ptr points to the
                    next memory location

(*ptr)++;           increments what ptr points to; the parentheses are required
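
The difference between these forms is easiest to see on an array, where moving the pointer to the next location is well defined. The function below is a sketch with a hypothetical name, precedence_demo; it returns 1 if every claim made in its comments holds.

```c
/* Checks the effect of each pointer expression on a small array.
   Returns 1 if all the checks succeed, 0 otherwise. */
int precedence_demo(void)
{
    int a[3] = {10, 20, 30};
    int *ptr = a;
    int v;
    int ok = 1;

    v = *ptr++;         /* v gets a[0]; then ptr moves on to a[1] */
    ok = ok && v == 10 && ptr == &a[1] && a[0] == 10;

    v = (*ptr)++;       /* v gets a[1]; then a[1] becomes 21; ptr stays */
    ok = ok && v == 20 && a[1] == 21 && ptr == &a[1];

    v = ++*ptr;         /* a[1] becomes 22 first; v gets 22 */
    ok = ok && v == 22 && a[1] == 22;

    *ptr += 1;          /* a[1] becomes 23 */
    ok = ok && a[1] == 23;

    return ok;
}
```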

Finally, since pointers are variables, they can be used without dereferencing.

For example, if both ptr1 and ptr2 are pointers to the same data type, such as integer, then

ptr1 = ptr2     sets ptr1 to point to the same location as ptr2, while
*ptr1 = *ptr2   copies the value that ptr2 points to into the object that ptr1 points to.
Question: give an example.

initial state:
ptr1 ---> x: 5      ptr2 ---> y: 7

after ptr1 = ptr2 (from the initial state):
x: 5                ptr1 ---> y: 7 <--- ptr2

after *ptr1 = *ptr2 (from the initial state):
ptr1 ---> x: 7      ptr2 ---> y: 7
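
The two states in the diagrams can be checked with a short sketch. The function name assignment_demo is hypothetical; the initial values 5 and 7 are the ones used in the diagrams.

```c
/* Starts from ptr1 -> x (5) and ptr2 -> y (7) and tries both kinds of
   assignment. Returns 1 if all the checks succeed, 0 otherwise. */
int assignment_demo(void)
{
    int x = 5, y = 7;
    int *ptr1 = &x;
    int *ptr2 = &y;
    int ok = 1;

    *ptr1 = *ptr2;      /* copies the value: x becomes 7, pointers unchanged */
    ok = ok && x == 7 && y == 7 && ptr1 == &x && ptr2 == &y;

    x = 5;              /* restore the initial state */

    ptr1 = ptr2;        /* copies the address: both now point at y */
    ok = ok && ptr1 == &y && *ptr1 == 7 && x == 5;

    *ptr1 = 9;          /* writing through ptr1 now changes y, not x */
    ok = ok && y == 9 && x == 5;

    return ok;
}
```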

3.2 The C/C++ implementation: An Array Name is a Pointer

In C, there is a strong relationship between pointers and arrays, so they should be
discussed together. Any operation that can be achieved by array subscripting can also
be done with pointers. The pointer version can be faster but is harder for beginners to understand.
In this section we discuss the array implementation in C/C++ and see the relationship
between arrays and pointers. When a new array is allocated, the compiler computes its total
size by multiplying the size in bytes of the element type by the number of elements.

Ex. Let's say we define the following arrays:

char c[4];          4 bytes allocated
short int a[4];     8 bytes allocated (assuming a 2-byte short)
long int b[4];      16 bytes allocated (assuming a 4-byte long)

The actual number of bytes used for types such as int, float and double is machine
dependent.
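
Because the sizes are machine dependent, it is safer to ask the compiler than to assume them. The sketch below uses the sizeof operator for this; the macro ELEMENT_COUNT and the function print_array_sizes are hypothetical names introduced for the example.

```c
#include <stdio.h>

/* Number of elements in an array whose definition is visible here:
   total size in bytes divided by the size of one element. */
#define ELEMENT_COUNT(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Prints the number of bytes the compiler reserves for each array.
   The numbers depend on the machine, so none are promised here. */
void print_array_sizes(void)
{
    char      c[4];
    short int a[4];
    long int  b[4];

    printf("char c[4]:      %zu bytes\n", sizeof c);
    printf("short int a[4]: %zu bytes\n", sizeof a);
    printf("long int b[4]:  %zu bytes\n", sizeof b);
    printf("elements in b:  %zu\n", ELEMENT_COUNT(b));
}
```

Whatever the machine, sizeof b always equals 4 * sizeof(long int), which is exactly the multiplication described above.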

After the allocation the size is irrelevant because the name of the array represents a pointer to
the beginning of allocated memory for that array.

Ex.
int a[3];
int i;

The compiler allocates three consecutive blocks for the array elements and then allocates
storage for i.

Address   Object
5000      a[0]      (&a[0])
5004      a[1]      (&a[1])
5008      a[2]      (&a[2])
...
6000      i         (&i)
...
a         evaluates to 5000, the address of a[0]

When used in an expression, a has the same value as &a[0]. This equivalence is always
guaranteed, and it tells us that the array name (a in this example) acts as a pointer to the
first element.

Let's clarify this by extending the example.

If pa is a pointer to an integer declared as

int *pa;

then the assignment

pa = &a[0];     sets pa to point to element zero of a; pa contains the address of a[0].

Now the assignment x = *pa; copies the contents of a[0] into x.

If pa points to a particular element of an array, then by definition pa+1 points to the next
element.

*(pa+1) refers to a[1].

Since the array name is a pointer to the beginning of the allocated memory,

pa = a;         is also valid (it makes pa point to the start of array a, the same as pa = &a[0]).

A reference to a[i] can also be written as *(a+i). In evaluating a[i], C/C++ converts it to
*(a+i) immediately; the two forms are identical.

Important: In short, an array-index expression is equivalent to one written as a pointer and
offset. There is one difference between an array name and a pointer.

A pointer is a variable, so pa = a and pa++ are legal. But an array name is not a variable;
constructions like a = pa and a++ are illegal.
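
The equivalence of subscripts and pointer offsets can be seen by writing the same loop both ways. The function names sum_indexed and sum_pointer are hypothetical.

```c
#include <stddef.h>

/* Subscript version: a[i] is evaluated as *(a + i). */
int sum_indexed(const int a[], size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer version: pa is a variable, so it may be incremented. */
int sum_pointer(const int *pa, size_t n)
{
    const int *end = pa + n;    /* one past the last element */
    int total = 0;
    while (pa < end)
        total += *pa++;         /* read the element, then advance */
    return total;
}
```

Both functions accept an array name as their first argument, because the name is converted to a pointer to element zero when it is passed.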

Parameter Passing Mechanisms


Call by value, call by reference

Call by reference (C++ reference parameters):

void Swap(int &first, int &second)
{
    int temp = first;
    first = second;
    second = temp;
}

Swap(x, y);

Passing pointers (C and C++):

void Swap(int *first, int *second)
{
    int temp = *first;
    *first = *second;
    *second = temp;
}

Swap(&x, &y);
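
For contrast, the sketch below shows in plain C what happens when the arguments are passed by value. The names swap_by_value and swap_by_pointer are hypothetical; the second function is the pointer version of Swap under a different name.

```c
/* Call by value: the function receives copies of the arguments, so it
   swaps its own copies and the caller's variables are left unchanged. */
void swap_by_value(int first, int second)
{
    int temp = first;
    first = second;
    second = temp;
}

/* Passing addresses: the function reaches the caller's variables through
   the pointers, so the swap is visible after the call. */
void swap_by_pointer(int *first, int *second)
{
    int temp = *first;
    *first = *second;
    *second = temp;
}
```

C itself always passes arguments by value; in the second function it is the addresses that are copied, which is why the caller writes swap_by_pointer(&x, &y).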

4. ABSTRACT DATA TYPES

Computer programs manipulate data.

• What they do to that data is important.


• How they do it or what form the data takes is not.

So we create new data types and declare exactly what the program can do to manipulate a
variable of that type. This provides a form of data integrity, and it provides flexibility for
future modification of the data's form (e.g. for efficiency, bug fixing, or enhancement).

A data type is a set of values and a set of operations defined on those values. An abstract
data type is a data type where the specification of the values, and the operations’ effects on
the values, is separated from the representation of the values and the implementation of the
operations.

A complex number consists of a real part and an imaginary part. We should be able to add,
subtract, multiply and divide complex numbers.

Complex Number ADT

Data Items
    Real Part
    Imaginary Part

Operations
    Add         Adds two complex numbers and returns the result
    Subtract    Subtracts a complex number from another and returns the result
    Multiply    Returns the product of two complex numbers
    Divide      Divides a complex number by another and returns the result

C Implementation

struct complex_struct {
    int realPart;
    int imagPart;
};

typedef struct complex_struct CN;

/*
Adds the two complex numbers a=xi+y and b=pi+q.
The result (x+p)i+(y+q) is stored in sum.
*/
void add(CN *sum, const CN *a, const CN *b);

/*
Subtracts b=pi+q from a=xi+y and
the result (x-p)i+(y-q) is stored in diff.
*/
void sub(CN *diff, const CN *a, const CN *b);

/*
Multiplies the two complex numbers a=xi+y and b=pi+q.
The result (x*q+y*p)i+(y*q-x*p) is stored in product.
*/
void mult(CN *product, const CN *a, const CN *b);

/*
Divides a=xi+y by b=pi+q.
The result (x*q - y*p)/(p^2+q^2) i + (y*q+x*p)/(p^2+q^2)
is stored in quotient.
*/
void div(CN *quotient, const CN *a, const CN *b);
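
One possible implementation of three of these operations is sketched below, following the formulas in the comments above (x and p are the imaginary parts, y and q the real parts). Division is left out of the sketch for two reasons: with int parts the quotient would be truncated, and the name div is already used by a function in the standard header <stdlib.h>, so a real implementation would probably rename it.

```c
struct complex_struct {
    int realPart;
    int imagPart;
};

typedef struct complex_struct CN;

/* sum = a + b, computed part by part. */
void add(CN *sum, const CN *a, const CN *b)
{
    sum->realPart = a->realPart + b->realPart;
    sum->imagPart = a->imagPart + b->imagPart;
}

/* diff = a - b, computed part by part. */
void sub(CN *diff, const CN *a, const CN *b)
{
    diff->realPart = a->realPart - b->realPart;
    diff->imagPart = a->imagPart - b->imagPart;
}

/* product = a * b. Both parts are computed before anything is stored,
   so product may point to the same object as a or b. */
void mult(CN *product, const CN *a, const CN *b)
{
    int re = a->realPart * b->realPart - a->imagPart * b->imagPart;
    int im = a->imagPart * b->realPart + a->realPart * b->imagPart;
    product->realPart = re;
    product->imagPart = im;
}
```

Code that uses CN only through these functions keeps working if the representation changes later, for example from int to double parts, which is the flexibility an abstract data type is meant to provide.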

5. ORDERED LISTS (SEQUENCES) as an ADT

If someone asks you to define an array, you will most probably say that it is a consecutive set
of memory locations. This reveals a common point of confusion: the distinction between a data
structure and its representation. It is true that arrays are implemented with consecutive
memory, but intuitively an array is a set of pairs, each consisting of an index and a value.
For each index that is defined, there is a value associated with that index. In mathematical
terms we call this a correspondence or a mapping. An array is declared by giving it a name and
telling the compiler the type of its elements. When an array is defined, a size must also be
provided. The size can be omitted if the array is initialized; the compiler then counts the
initializers and takes that number as the size.

Examples of ordered lists are the days of the week, the values in a deck of cards, and the floors of a building.

If we consider an ordered list (sequence) more abstractly, we say that it is either empty or it
can be written as <a1, a2, a3, ..., an>, where the ai are atoms from some set S. A variety of
operations are performed on these lists. These operations include:

• Find the length of the list, n
• Read the list from left to right (or right to left)
• Retrieve the i-th element
• Store a new value into the i-th position
• Insert a new element at position i, causing the elements numbered i, i+1, ..., n to move one
position right and become numbered i+1, i+2, ..., n+1.

In the study of data structures we are interested in ways of representing ordered lists so that
these operations can be carried out efficiently. Perhaps the most common way to represent an
ordered list is by an array, where we associate the list element ai with the array index i. This
may be called sequential mapping because, using the conventional array representation, we
store ai and ai+1 in consecutive locations i and i+1 of the array. This gives us the ability to
retrieve or modify the value of any element in the list in a constant amount of time,
essentially because computer memory provides random access to any word. We can access the
list element values in either direction by changing the subscript values in a controlled way.
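
A minimal sketch of this sequential mapping for a list of integers is given below. The type IntList, the capacity MAX_LEN and the function names are assumptions made for the example. Positions are counted from 1, as in <a1, ..., an>, so the element at position i is kept at array index i-1.

```c
#include <stddef.h>

#define MAX_LEN 100             /* assumed capacity of the list */

typedef struct {
    int    items[MAX_LEN];
    size_t length;              /* n, the current number of elements */
} IntList;

/* Retrieve the i-th element in constant time.
   Returns 0 on success, -1 if i is not a valid position. */
int list_get(const IntList *list, size_t i, int *out)
{
    if (i < 1 || i > list->length)
        return -1;
    *out = list->items[i - 1];
    return 0;
}

/* Insert value at position i (1 <= i <= n+1). The elements at positions
   i..n are each moved one place to the right, starting from the end.
   Returns 0 on success, -1 if i is invalid or the list is full. */
int list_insert(IntList *list, size_t i, int value)
{
    if (i < 1 || i > list->length + 1 || list->length == MAX_LEN)
        return -1;
    for (size_t k = list->length; k >= i; k--)
        list->items[k] = list->items[k - 1];
    list->items[i - 1] = value;
    list->length++;
    return 0;
}
```

Retrieval touches one array cell, but insertion at position i has to move n-i+1 elements, so inserting near the front of a long list is expensive.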

In the following lectures we will see the problems with array implementations.
