Tuesday, June 5, 2007

C++ : Multiple Access Specifiers in a Class

On May 27th, Ramshankar in one of the C++ communities at Orkut, asked what seemed
like a pretty innocuous question:

class TUid
{
public:
IMPORT_C TInt operator==(const TUid& aUid) const;
static inline TUid Null();
public:
TInt32 iUid;
};

What is the purpose of defining "public" section again? Is it for allowing:

TUid myType = { 0x01232423 };

Multiple public/protected/private sections are very much allowed in C++. In fact, they are
seen in MFC wizard-generated code. But the real question was not whether this is
allowed, but why it is allowed. The C++ standard states:

12. Nonstatic data members of a (non-union) class declared without an intervening
access-specifier are allocated so that later members have higher addresses within a class
object. The order of allocation of nonstatic data members separated by an
access-specifier is unspecified (_class.access.spec_).

So, to find a reliable answer, I emailed Bjarne, and this was his reply:

Consider

struct S {
public:
int a;
private:
int b;
public:
int c;
private:
int d;
};
Is the compiler allowed to allocate the private members next to each other? (the answer is
yes).

The reason for the rule was early ideas of separating private data from public data for
some implementations to be able to alleviate code evolution problems when the data
layout changed.
For example if you allocated public before private, then adding a private member could
be done without affecting the public interface (after creation). As far as I know, no
compiler has ever done that.

However, some compilers do use rearrangement to create more compact layouts. For
example:

struct SS {
char a;
public:
int b;
public:
char c;
public:
int d;
public:
char e;
};

If members are allocated in declaration order, the size will be 5 words, but you can
(legally) reorder to get 3 words (assuming a 4-byte word).

Personally, I have never found this useful.

I'll leave as an exercise how to reorder to get 3 words . <(^_^)>
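
One possible answer to that exercise, as a sketch: since compilers in practice keep declaration order, the regrouping is done here by hand. The struct names DeclarationOrder and Regrouped are mine, not Bjarne's, and the "5 words versus 3 words" result assumes a 4-byte int with 4-byte alignment, which is typical but not guaranteed.

```cpp
#include <cstddef>

// Members in the same order as struct SS above.
struct DeclarationOrder {
    char a;
    int  b;
    char c;
    int  d;
    char e;
};

// Hand-regrouped: the two ints first, then the three chars side by side,
// so the chars share one word instead of each dragging padding along.
struct Regrouped {
    int  b;
    int  d;
    char a;
    char c;
    char e;
};

// Size of each layout measured in "words" of sizeof(int) bytes.
inline std::size_t words_in_declaration_order() {
    return sizeof(DeclarationOrder) / sizeof(int);
}

inline std::size_t words_when_regrouped() {
    return sizeof(Regrouped) / sizeof(int);
}
```

On a platform that matches the assumption above, the first function returns 5 and the second returns 3; elsewhere the exact numbers may differ, but the regrouped layout should not be larger.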

But, a mystery still remains as André asked:

Can you write any piece of strictly conforming code for which 9p12 (the above stated
standard snippet) makes ANY difference for a non-POD (Plain Old Data) type?

Put another way (this is a different formulation of the same question):


Can you write any piece of code that uses 9p12 (the fact that the order is specified) for a
non-POD type, without invoking undefined behaviour?

Can you?!


Tuesday, March 27, 2007


C++ : 'this' pointer representation

How is the 'this' pointer represented? The C++ standard states:

9.3.2/1

In the body of a nonstatic (9.3) member function, the keyword this is a non-lvalue
expression whose value is the address of the object for which the function is called. The
type of this in a member function of a class X is X*. If the member function is declared
const, the type of this is const X*, if the member function is declared volatile, the type of
this is volatile X*, and if the member function is declared const volatile, the type of this is
const volatile X*.

But in many texts I have read that the 'this' pointer is considered a constant pointer. So I did
what I usually do when faced with extreme difficulty: I shot a mail to Mr Stroustrup.
Here's a part of the conversation that followed:

Dear Mr Stroustrup,

I am sorry if I am disturbing you.


In the C++ standard following is stated for the 'this' pointer.

-----------
9.3.2/1

In the body of a nonstatic (9.3) member function, the keyword this is a non-lvalue
expression whose value is the address of the object for which the function is called. The
type of this in a member function of a class X is X*. If the member function is declared
const, the type of this is const X*, if the member function is declared volatile, the type of
this is volatile X*, and if the member function is declared const volatile, the type of this is
const volatile X*.
----------

While reading 'Inside the C++ Object Model' by Mr Stanley Lippman, I came across
'this' being a constant pointer: myclass *const this, and const myclass *const this. And he
writes that this technique was used in 'cfront'.

My point is, I always considered 'this' to be a constant pointer so that we can validate
it being a non-lvalue as the standard requires it to be. But if we stick to the standard then
what Mr Lippman writes turns out to be non-standard. What should I assume? In the C++
community at Orkut, I replied with the following

and it must be

void zaman::shout (const zaman *const this, .....) const {}


and

void zaman::shout(zaman *const this, .....) {}

Waiting anxiously for reply

----------------------------

His reply:

The committee decided (to simplify overloading rules, I think) to express the fact that
you cannot change the value of "this" as it being a non-lvalue. I originally expressed its
immutability by saying that it was a *const. Undoubtedly, you can find some example
where that difference matters, but you'll have to look hard and the likelihood of such an
example appearing in real code must be minimal.

-------------------------------

Answer: so, as far as the standard is concerned, 'this' is not a constant pointer. It is a
non-lvalue, and that is what makes it immutable.
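
For the curious, here is a small sketch (C++11 or later, since it relies on decltype) of one place where the difference is visible. Widget is a made-up class. If 'this' were declared as Widget *const, decltype would report a const pointer; because it is a non-lvalue expression of type Widget*, it does not.

```cpp
#include <type_traits>

struct Widget {
    int value = 0;

    // Non-const member function: 'this' has type Widget*, not Widget* const.
    bool this_is_plain_pointer() {
        return std::is_same<decltype(this), Widget*>::value;
    }

    // Const member function: 'this' has type const Widget*.
    bool this_is_pointer_to_const() const {
        return std::is_same<decltype(this), const Widget*>::value;
    }

    void set(int v) {
        // this = nullptr;        // ill-formed: 'this' is not an lvalue
        // Widget** pp = &this;   // ill-formed for the same reason
        this->value = v;          // the object it points to can still change
    }
};
```

Both checks return true, which matches the wording of 9.3.2/1 quoted above.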


Sunday, March 4, 2007


C++ : Are NULL and C++ standard library part of the C++ language?

No. The C++ standard library is not part of the C++ language itself; it supports the language.

The question was raised in an Orkut forum, and who better to ask than Bjarne. This is
what his reply to my email was:

Zaman Bakshi wrote:


> Dear Mr Stroustrup,
>
> I hope this email finds you in good health. Sir, can we say that the
> standard C and C++ libraries are a part of the C++ 'language'? Or
> should they be considered as a support for the language and not a part
> of it? This point was raised in the C/C++ programmers' community (that
> I am moderating) with reference to NULL. I cited your TC++PL and wrote
> const int NULL = 0; to be the correct implementation of NULL in C++,
> if it has to be defined. We know that 0 should be used instead of
> NULL, but what if NULL has to be defined. A null pointer is defined in
> the 'C++ standard' but NULL is used with reference to the C libraries.
> My answer was that NULL is not a part of the language but part of the
> C standard libraries referred in the C++ standard.

I distinguish between the C++ language and the C++ standard library.
They are both part of the C++ standard, though, and shipped with every
implementation. Some people (slightly incorrectly I think, but
understandably) refer to all that is in "The C++ language standard" as
"the C++ language".

>
> So, are NULL and the libraries part of the 'C++ language'?

I would say no. Even though you can use NULL after #including the
appropriate standard library header, you don't have to.

In C++0x, we'll get nullptr as a keyword indicating the null pointer.
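
With hindsight, nullptr did arrive in C++11. A minimal sketch of why a keyword helps: the two overloads named which() below are hypothetical and exist only to show which one a call selects.

```cpp
#include <string>

// Two hypothetical overloads, used only to show overload selection.
inline std::string which(int)         { return "int"; }
inline std::string which(const char*) { return "pointer"; }

// The literal 0 is an int first and a null pointer constant second,
// so the int overload is an exact match and wins.
inline std::string with_literal_zero() { return which(0); }

// nullptr converts to any pointer type but never to int,
// so only the pointer overload is viable.
inline std::string with_nullptr() { return which(nullptr); }
```

I have left NULL out of the sketch on purpose: what which(NULL) does depends on how the implementation defines the macro.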


Friday, February 23, 2007


C++ : Free-store versus Heap

What's the difference between the heap and the free-store? The C++ Programming
Language keeps referring to them interchangeably. There was a huge outcry over this issue
in the C/C++ programmers' community on Orkut. I had to shoot a mail to Dr. Bjarne
Stroustrup. Here's our conversation:

My Mail:

Dear Mr Stroustrup,

Sorry to disturb you again. You have mentioned several times in the TC++PL that 'new'
allocates memory from the 'free store (or heap)'. There has been a huge cry on the C++
community at Orkut (that I am moderating) as to whether free-store is the same as heap.
The argument given against is that Mr Herb Sutter has mentioned that the free-store is
different from the heap:

http://www.gotw.ca/gotw/009.htm

and that global 'new' has nothing to do with the heap.

So, if so, why has TC++PL used 'free store (or heap)' instead of mentioning the use of
'heap' separately.

Waiting anxiously for the response.

Regards,
Zaman Bakshi

His Reply:

Note that Herb says: "Note about Heap vs. Free Store: We distinguish between "heap"
and "free store" because the draft deliberately leaves unspecified the question of whether
these two areas are related. For example, when memory is deallocated via operator
delete, 18.4.1.1 states:"

In other words, the "free store" vs "heap" distinction is Herb's attempt to distinguish
malloc() allocation from new allocation.

>
> So, if so, why has TC++PL used 'free store (or heap)' instead of
> mentioning the use of 'heap' separately.

Because even though it is undefined from where new and malloc() get their memory, they
typically get them from exactly the same place. It is common for new and malloc() to
allocate and free storage from the same part of the computer's memory. In that case,
"free store" and "heap" are synonyms. I consistently use "free store" and "heap" is not a
defined term in the C++ standard (outside the heap standard library algorithms, which
are unrelated to new and malloc()). In relation to new, "heap" is simply a word someone
uses (typically as a synonym to "free store") - usually because they come from a different
language background.
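
Whatever the two areas are called, the practical rule is simple, and the sketch below assumes nothing beyond it: storage from new goes back through delete, storage from malloc() goes back through free(), and the two are never mixed, because the standard does not promise that they share a pool. The function names are mine.

```cpp
#include <cstdlib>
#include <new>

// Storage obtained with new is released with delete ...
inline int round_trip_new(int v) {
    int* p = new int(v);
    int result = *p;
    delete p;
    return result;
}

// ... and storage obtained with malloc() is released with free().
inline int round_trip_malloc(int v) {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p == nullptr) {
        return -1;   // allocation failed; -1 is just this sketch's error marker
    }
    *p = v;
    int result = *p;
    std::free(p);
    return result;
}
```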

My Reply:

Thank you Mr. Stroustrup, I had inferred the same thing (about using free store as a
general -- or better, synonymous -- term) and had explained it to the community. But I had been
requested to reconfirm.

With warm regards,


Zaman Bakshi


Sunday, February 18, 2007


C++ : Indeterminate Value

The C language standard clearly 'defines' what an indeterminate value is in C. But the
C++ standard is missing this definition. Naturally, we can't simply adopt the definition in the C
standard for C++. I wanted to know, and who could be more reliable than Mr.
Bjarne Stroustrup to clear this cloud of uncertainty? So, I dashed off an email to him. Here's
the conversation that followed:

----------------------------------

Dear Mr Stroustrup,

I am reading D&E, and let me congratulate you for writing such a great book. It has
been of a lot of help. Mr. Stroustrup, I am moderating a C++ community on Orkut and
there has been a very big issue over what 'indeterminate value' means for the C++
standard. The C standard clearly states what 'indeterminate value' means, but the C++
standard though using (indeterminate value) many times doesn't specify its definition.
Should we regard 'indeterminate value' in C++ as being undefined, or should we stick to
the C standard's definition (for 'indeterminate value')?

Anxiously waiting for the response.

Regards,
Zaman Bakshi.

----------------------------------

Thanks.

I never used "indeterminate value" and hadn't noticed that it had "snuck
into" the C++ standard. I have raised an issue and "indeterminate value"
will be defined in C++0x. You can't "stick to C's definition" because
that definition has never been approved for C++ (was introduced into C
relatively lately, I believe). "indeterminate" simply means that you
don't know what that value is (it could be absolutely any bit pattern
that fits in the object). I believe the C++ standard is specific about
which operations require a properly initialized object.

Does this address the issues raised in your discussion? If not, please
ask again.

----------------------------------

Dear Mr Stroustrup,

Thank you for replying to my e-mail. Yes, it does answer my issue. This is exactly what I
had inferred, but as you know, developers like me can't argue over the standard, so I had
to clear the doubt. I, like others, am anxiously waiting for C++0x to be out with the
standard. Good luck with it. And I thank you again for your response.

Regards,
Zaman Bakshi

----------------------------------

Thanks. I'm working hard for C++0x to become C++09. Doing that requires
a complete feature freeze and complete WP by the end of 2007.
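
To make the "indeterminate value" discussion above concrete, here is a small sketch in C++11 syntax. Sample and the function names are invented for illustration. The point is only the contrast between an object left with an indeterminate value and one that is value-initialized.

```cpp
struct Sample {
    int n;   // no initializer of its own
};

inline int value_initialized_int() {
    int x{};      // value-initialized: guaranteed to be 0
    return x;
}

inline int value_initialized_member() {
    Sample s{};   // empty braces zero the member as well
    return s.n;
}

inline void default_initialized() {
    int y;        // default-initialized: holds an indeterminate value
    y = 42;       // assigning to it is fine; reading y before this line is not
    (void)y;
}
```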


Sunday, December 3, 2006


C/C++ : Speed Variation

People believe that C code executes faster than C++ code. I argue otherwise.
Many C++ gurus have spent their valuable time explaining that there's no difference.
Now, I may lack credibility but the creator of C++, Bjarne Stroustrup, certainly doesn't.
So, would you believe his words? While browsing the Internet, I came across
an email thread in which Bjarne explains his views on this contentious topic.
Here's the email (in full):

TITLE: Speed and size of C versus C++

PROBLEM: ???

I also heard the size of C++ program is generally bigger than C. I am a C programmer
trying to learn C++. So, don't blame me if I have created any misconception about C++.

RESPONSE: ???

For one minor detail: including printf/scanf can include more code than is actually used.
An intelligent C++ linker will only get the parts of the stream library REALLY needed.

RESPONSE: cshaver@informix.com (Craig Shaver @ Informix Software, Inc.)

Who implements an intelligent C++ linker??? I was under the impression that you get all
the functionality in a class when you link, whether you use it or not.

RESPONSE: bs@alice.att.com (Bjarne Stroustrup), 12 Jan 93


AT&T Bell Laboratories, Murray Hill NJ

Let me try to clear up one or two points. Consider first a somewhat minimal C and C++
program x1.c:

main()
{
int i;
for (i = 0; i < 1000000; i++ ) printf("Hi,mom!\n") ;
}

and its more C++ looking cousin x2.c:

main()
{
int i;
for (i = 0; i<1000000; i++ ) cout << "Hi,mon!\n" ;
}

I compiled and ran x1.c and x2.c:

c: cc x1.c
c: size a.out
text  | data | bss  | dec   | hex
12288 | 6144 | 7608 | 26040 | 65b8
c: time a.out > /dev/null
25.5u 0.5s 37r a.out

c: PTCC x1.c
c: size a.out
text  | data | bss  | dec   | hex
12288 | 6144 | 7620 | 26052 | 65c4
c: time a.out > /dev/null
25.7u 0.4s 33r a.out

PTCC is the driver for my standard off-the-shelf Cfront 3.0 (i.e. I'm not using any
technology you couldn't buy half a year ago). Note that the size of the generated code is
essentially the same. So is the speed. Running these examples a few times to eliminate
random error in the timing mechanism shows that the run-time isn't biased one way or
the other.

This is what you should expect for a program in the common sub-set of C and C++.
There are fundamental reasons for that. You should expect identical code from two C and
C++ compilers using the same technology. The only possibly SYSTEMATIC difference I
can think of is that a C++ compiler can use better function call sequences than a C
compiler that doesn't apply global optimization because in many cases a C compiler must
guard against possible calls with differing numbers of arguments where a C++ compiler
doesn't need to because of C++'s stronger type checking. In most C and C++ compilers,
this difference is theoretical, but I'm told that in Zortech C++ it is real (i.e. C++
programs are ever so slightly faster than their C equivalents). However, this is all noise,
I doubt the difference between C and C++ in this kind of comparison matters to any real
programmers. The difference is far smaller than differences between different C
compilers - but surprisingly, it is in C++'s favor.

Programs in the common subset of C and C++ result in
equal-sized code that executes at equal speed.

If that conclusion doesn't appear to hold, check if your C and C++ compilers are of
similar quality. If your C++ compiler appears to lose badly you have the option of
using a Cfront variant to get the benefits of your C compiler's code generation facilities.
If your C compiler loses badly, switch to C++ even if you aren't ready to use the ``++
features.''

Now, a common argument is ``OK, so C++ can match C for C programs but as soon as
you use the REAL C++ features your programs get bigger and slower.'' Clearly you can
write big and slow programs in any language (even C), but you don't necessarily take a
performance hit when you start using C++. Consider x2.c. It uses the C++ stream I/O
library that is certainly bigger than C's stdio and is unlikely to be tuned to the same
degree as stdio. It is also a library that uses a very large sub-set of C++'s features in its
interface and implementation (operator overloading, multiple inheritance, virtual
functions, etc.):

c: PTCC x2.c
c: size a.out
text  | data | bss | dec   | hex
17408 | 2048 | 0   | 19456 | 4c00
c: time a.out > /dev/null
32.8u 1.0s 43r a.out

Surprisingly enough, the code generated for x2.c is noticeably smaller than the code
generated for x1.c (75% of x1.o) though - as expected - it runs a bit slower (29% more user
CPU time, 16% more elapsed time).

I claim, but cannot prove, that the run-time overhead is primarily a difference in tuning.
Other programs that rely heavily on C++ features show improvements over their C
counterparts - and others again show overhead. The differences do not appear
systematic to me; that is, they are differences in design and effort, rather than inherent
overhead in C or C++.

The space advantage of the C++ program is an advantage of the same kind; that is, it is
there because a little extra care and thought was spent. Other implementations of stream
I/O will show different space and time usage, as will different implementations of stdio.
To do simple things only the essential parts of the stream I/O library are brought in. You
don't actually need a very ``intelligent'' linker, the dumb old Unix ld will do: Just
manually split your implementation into several .c files. A simple example:

X.h:
class X {
// details
public:
void f(); // common function
void g(); // uncommon function
// more functions
};

X1.c:
// common functions:

void X::f() { ... }

X2.c:
// uncommon functions:

void X::g() { ... }

Now, any half-way decent archive program can bring in the object code for X1.c (only)
for programs that use the common functions (only) and leave the expense of bringing in
the object code for X2.c for the programs that actually use functions defined in X2.c.

There exist linkers that can do that without human help (mostly in the PC world), I just
happen not to have one. I think it is important to note that this technique and the tools
that support it carried over from C to C++. We weren't at the mercy of some ``smart''
and possibly expensive or unavailable technology. We don't have to forget or lose all of
our effective techniques in moving from C to C++. We should - as ever - use them with a
suitable amount of judgement.

C++ was designed not to leave room ``below'' for a lower level language, except
assembler for machine specific operations.

I found this email at the following link:

http://nkari.uw.hu/Tutorials/CPPTips/split_impl

You can follow it to check for any inconsistencies. So, I take his word on this issue; do
you?


Saturday, November 18, 2006


C++ : All About Temporaries
Even the most trivial statements, like A = B, in a computer language may produce
temporaries. Moreover, the generation of these temporaries has to be standardized to
maintain a language's efficacy. The C++ language is no exception to that rule.

What follows is a lightly formatted excerpt from the C++ standard.

Temporary Objects

1 Temporaries of class type are created in various contexts: binding an rvalue to a
reference, returning an rvalue, a conversion that creates an rvalue, throwing an exception,
entering a handler, and in some initializations. Even when the creation of the temporary
object is avoided, all the semantic restrictions must be respected as if the temporary
object was created. [Example: even if the copy constructor is not called, all the semantic
restrictions, such as accessibility, shall be satisfied. ]

2 [Example:

class X {
    // ...
public:
    // ...
    X(int);
    X(const X&);
    ~X();
};

X f(X);

void g()
{
    X a(1);
    X b = f(X(2));
    a = f(a);
}

Here, an implementation might use a temporary in which to construct X(2) before passing
it to f() using X’s copy-constructor; alternatively, X(2) might be constructed in the space
used to hold the argument. Also, a temporary might be used to hold the result of f(X(2))
before copying it to b using X’s copy-constructor; alternatively, f()’s result might be
constructed (directly) in b. On the other hand, the expression a=f(a) requires a temporary
for either the argument a or the result of f(a) to avoid undesired aliasing of a. ]

3 When an implementation introduces a temporary object of a class that has a non-trivial
constructor, it shall ensure that a constructor is called for the temporary object. Similarly,
the destructor shall be called for a temporary with a non-trivial destructor. Temporary
objects are destroyed as the last step in evaluating the full-expression (a full-expression is
an expression that is not a subexpression of another expression) that (lexically) contains
the point where they were created. This is true even if that evaluation ends in throwing an
exception.

4 There are two contexts in which temporaries are destroyed at a different point than the
end of the full-expression. The first context is when an expression appears as an
initializer for a declarator defining an object. In that context, the temporary that holds the
result of the expression shall persist until the object’s initialization is complete. The
object is initialized from a copy of the temporary; during this copying, an implementation
can call the copy constructor many times; the temporary is destroyed after it has been
copied, before or when the initialization completes. If many temporaries are created by
the evaluation of the initializer, the temporaries are destroyed in reverse order of the
completion of their construction.

5 The second context is when a reference is bound to a temporary. The temporary to
which the reference is bound or the temporary that is the complete object to a subobject
of which the temporary is bound persists for the lifetime of the reference except as
specified below. A temporary bound to a reference member in a constructor’s ctor-
initializer persists until the constructor exits. A temporary bound to a reference parameter
in a function call persists until the completion of the full expression containing the call. A
temporary bound to the returned value in a function return statement persists until the
function exits. In all these cases, the temporaries created during the evaluation of the
expression initializing the reference, except the temporary to which the reference is
bound, are destroyed at the end of the full-expression in which they are created and in the
reverse order of the completion of their construction. If the lifetime of two or more
temporaries to which references are bound ends at the same point, these temporaries are
destroyed at that point in the reverse order of the completion of their construction. In
addition, the destruction of temporaries bound to references shall take into account the
ordering of destruction of objects with static or automatic storage duration; that is, if obj1
is an object with static or automatic storage duration created before the temporary is
created, the temporary shall be destroyed before obj1 is destroyed; if obj2 is an object
with static or automatic storage duration created after the temporary is created, the
temporary shall be destroyed after obj2 is destroyed. [Example:

class C {
    // ...
public:
    C();
    C(int);
    friend C operator+( const C&, const C& );
    ~C();
};

C obj1;
const C& cr = C(16)+C(23);
C obj2;

the expression C(16)+C(23) creates three temporaries. A first temporary T1 to hold the
result of the expression C(16), a second temporary T2 to hold the result of the expression
C(23), and a third temporary T3 to hold the result of the addition of these two
expressions. The temporary T3 is then bound to the reference cr. It is unspecified whether
T1 or T2 is created first. On an implementation where T1 is created before T2, it is
guaranteed that T2 is destroyed before T1. The temporaries T1 and T2 are bound to the
reference parameters of operator+; these temporaries are destroyed at the end of the full
expression containing the call to operator+. The temporary T3 bound to the reference cr
is destroyed at the end of cr’s lifetime, that is, at the end of the program. In addition, the
order in which T3 is destroyed takes into account the destruction order of other objects
with static storage duration. That is, because obj1 is constructed before T3, and T3 is
constructed before obj2, it is guaranteed that obj2 is destroyed before T3, and that T3 is
destroyed before obj1. ]
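
The two lifetimes described in paragraphs 3 and 5 can be watched directly with a small logging class. This is only a sketch: Noisy, name_length and lifetime_log are names I made up, and the code assumes C++11 or later.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that records its own construction and destruction.
struct Noisy {
    std::string name;
    std::vector<std::string>* log;

    Noisy(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {
        log->push_back("ctor " + name);
    }
    ~Noisy() { log->push_back("dtor " + name); }
};

inline std::size_t name_length(const Noisy& n) { return n.name.size(); }

inline std::vector<std::string> lifetime_log() {
    std::vector<std::string> log;
    {
        // Bound to a reference: lives as long as the reference does.
        const Noisy& kept = Noisy("bound", &log);
        // Bound to a reference parameter: destroyed at the end of
        // the full-expression containing the call.
        std::size_t n = name_length(Noisy("argument", &log));
        log.push_back("after call");
        (void)kept;
        (void)n;
    }   // "bound" is destroyed here, when the reference goes out of scope
    return log;
}
```

The recorded order is: ctor bound, ctor argument, dtor argument, after call, dtor bound.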


Friday, November 17, 2006


C++ : The Object Destruction Process
A user-defined destructor is augmented in much the same way as are the constructors, except
in reverse order:

1. If the object contains a vptr, it is reset to the virtual table associated with the class.

2. The body of the destructor is then executed; that is, the vptr is reset prior to evaluating
the user-supplied code.

3. If the class has member class objects with destructors, these are invoked in the reverse
order of their declaration.

4. If there are any immediate non-virtual base classes with destructors, these are invoked
in the reverse order of their declaration.

5. If there are any virtual base classes with destructors and this class represents the most-
derived class, these are invoked in the reverse order of their original construction.
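
The visible part of that sequence (body first, then members in reverse declaration order, then base classes) can be checked with a small logging sketch. Part, Base and Derived are made-up names, and the vptr reset in steps 1 and 2 is a compiler detail that this code cannot observe.

```cpp
#include <string>
#include <vector>

inline std::vector<std::string>& destruction_log() {
    static std::vector<std::string> log;
    return log;
}

struct Part {
    const char* label;
    explicit Part(const char* l) : label(l) {}
    ~Part() { destruction_log().push_back(label); }
};

struct Base {
    ~Base() { destruction_log().push_back("Base"); }
};

struct Derived : Base {
    Part first{"first member"};
    Part second{"second member"};
    ~Derived() { destruction_log().push_back("Derived body"); }
};

// Creates and destroys one Derived, then returns what was logged.
inline std::vector<std::string> destroy_one_derived() {
    destruction_log().clear();
    { Derived d; }
    return destruction_log();
}
```

The log reads: Derived body, second member, first member, Base.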


Monday, November 13, 2006


C++ : The Object Construction Process
When we define an object, such as
T object;

exactly what happens? If there is a constructor associated with T (either user supplied or synthesized by the
compiler), it is invoked. That's obvious. What is sometimes less obvious is what the invocation of a
constructor actually entails.

Constructors can contain a great deal of hidden program code because the compiler augments every
constructor to a greater or lesser extent depending on the complexity of T's class hierarchy. The general
sequence of compiler augmentations is as follows:

1. The data members initialized in the member initialization list have to be entered within the body of
the constructor in the order of member declaration.

2. If a member class object is not present in the member initialization list but has an associated default
constructor, that default constructor must be invoked.

3. Prior to that, if there is a virtual table pointer (or pointers) contained within the class object, it (they)
must be initialized with the address of the appropriate virtual table(s).

4. Prior to that, all immediate base class constructors must be invoked in the order of base class
declaration (the order within the member initialization list is not relevant).

   - If the base class is listed within the member initialization list, the explicit arguments, if any,
     must be passed.

   - If the base class is not listed within the member initialization list, the default constructor (or
     default memberwise copy constructor -- bitwise copy) must be invoked, if present.

   - If the base class is a second or subsequent base class, the this pointer must be adjusted.

5. Prior to that, all virtual base class constructors must be invoked in a left-to-right, depth-first search of
the inheritance hierarchy defined by the derived class.

   - If the class is listed within the member initialization list, the explicit arguments, if any, must
     be passed. Otherwise, if there is a default constructor associated with the class, it must be
     invoked.

   - In addition, the offset of each virtual base class subobject within the class must somehow be
     made accessible at runtime.

   - These constructors, however, may be invoked if, and only if, the class object represents the
     "most-derived class." Some mechanism supporting this must be put into place.

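The order that results from the augmentations listed above (virtual bases first, then immediate bases in declaration order, then members, then the constructor body) can be observed with a small diamond hierarchy. The class names are invented for this sketch; the vptr setup and this-pointer adjustment are compiler internals and do not show up in the log.

```cpp
#include <string>
#include <vector>

inline std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

struct Member {
    Member() { construction_log().push_back("Member"); }
};

struct Top {
    Top() { construction_log().push_back("Top (virtual base)"); }
};

struct Left : virtual Top {
    Left() { construction_log().push_back("Left"); }
};

struct Right : virtual Top {
    Right() { construction_log().push_back("Right"); }
};

// Bottom is the most-derived class, so it alone constructs Top.
struct Bottom : Left, Right {
    Member m;
    Bottom() { construction_log().push_back("Bottom body"); }
};

// Constructs one Bottom and returns what was logged.
inline std::vector<std::string> construct_one_bottom() {
    construction_log().clear();
    Bottom b;
    (void)b;
    return construction_log();
}
```

The log reads: Top (virtual base), Left, Right, Member, Bottom body. Top appears once even though both Left and Right inherit from it.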
C++ : Is Member Initialization List Efficient ?

Yes it is, and should be made a habit. Consider the following example:

class Anjelina {

public:
    // very naive ....
    Anjelina( ) {
        _name = 0; //_name is NULL now
        _blemish = 0;
    }

private:
    String _name;
    int _blemish;
};

What happens before the constructor body starts executing?

By then the members are already constructed. What !!! Yes, they are. Before the constructor body
runs, the 'this' pointer is set up and every member that has a non-trivial default constructor
has had it called. So, what's wrong with the above constructor? Nothing, it's just naive. The
reason is that the default constructor of _name has already been called and the String already
initialized to 0 (the default behavior assumed here for a String object), so the assignment in the
body does the work a second time. The code transformation that takes place is:

// Pseudo C++ Code

Anjelina::Anjelina (/* 'this' pointer goes here */) {
    // invoke default String constructor
    _name.String::String ( ) ;
    // generate temporary
    String temp = String ( 0 ) ;
    // memberwise copy _name
    _name.String::operator= ( temp ) ;
    // destroy temporary
    temp.String::~String( );

    _blemish = 0 ;
}

So the proper code for constructor should be:

Anjelina( ):_name(0) {
_blemish = 0;
}

or even better (non-confusing code) ...

Anjelina( ):_name(0), _blemish(0) { }

This will call the constructor with 0 as an argument. If you want String's default
constructor to be called, you can write _name( ).
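
If you want to count the extra work yourself, here is a sketch. Counted is a made-up stand-in for String that does nothing except count how often each of its functions runs.

```cpp
// Hypothetical stand-in for String; it only counts calls.
struct Counted {
    static int default_ctors;
    static int value_ctors;
    static int assignments;

    Counted()    { ++default_ctors; }
    Counted(int) { ++value_ctors; }
    Counted& operator=(const Counted&) { ++assignments; return *this; }

    static void reset() { default_ctors = value_ctors = assignments = 0; }
    static int total()  { return default_ctors + value_ctors + assignments; }
};
int Counted::default_ctors = 0;
int Counted::value_ctors = 0;
int Counted::assignments = 0;

struct AssignsInBody {
    Counted name;
    AssignsInBody() { name = 0; }   // default-construct, build a temporary, assign
};

struct UsesInitList {
    Counted name;
    UsesInitList() : name(0) {}     // a single constructor call
};

inline int calls_when_assigning_in_body() {
    Counted::reset();
    AssignsInBody a;
    (void)a;
    return Counted::total();
}

inline int calls_when_using_init_list() {
    Counted::reset();
    UsesInitList u;
    (void)u;
    return Counted::total();
}
```

The first function reports 3 calls (default constructor, converting constructor for the temporary, assignment) and the second reports 1.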


C++ : Is Member Initialization List a Must ?

No, but it is a must in the following cases:

1. When initializing a reference member.
2. When initializing a 'const' member.
3. When invoking a base or member class constructor with a set of arguments.
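
All three cases fit in one small sketch. Base, Holder and bump_once are names invented for the illustration; none of the three initializations could be moved into the constructor body.

```cpp
struct Base {
    int id;
    explicit Base(int i) : id(i) {}   // no default constructor
};

struct Holder : Base {
    const int limit;    // const member
    int&      target;   // reference member

    Holder(int id, int lim, int& t)
        : Base(id),     // base constructor that needs an argument
          limit(lim),   // const members cannot be assigned later
          target(t)     // references must be bound when they are created
    {}

    void bump() { if (target < limit) ++target; }
};

// Increments a local value through the reference member, up to the limit.
inline int bump_once(int start, int limit) {
    int value = start;
    Holder h(1, limit, value);
    h.bump();
    return value;
}
```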


C++ : What's the value of i++ + i++?

It's undefined !! Basically, in C and C++, if you read a variable twice in an expression
where you also write it, the result is undefined. Don't do that. Another example is:

v[i] = i++;

Related example:

f(v[i],i++);

Here, the result is undefined because the order of evaluation of function arguments is
undefined.
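
The cure is to split the read and the write into separate statements. A minimal sketch, assuming the intent of v[i] = i++ was "store the old i, then advance"; the function names are mine.

```cpp
#include <cstddef>
#include <vector>

// Store i in v[i], then advance i, as two separate statements.
inline std::size_t store_then_advance(std::vector<std::size_t>& v, std::size_t i) {
    v[i] = i;   // uses the old value of i
    ++i;        // the increment happens in its own statement
    return i;
}

inline std::size_t demo_store_then_advance() {
    std::vector<std::size_t> v(3, 99);
    std::size_t next = store_then_advance(v, 1);
    return v[1] * 10 + next;   // v[1] is 1 and next is 2, so this is 12
}
```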

You can even check it in Bjarne's own C++ FAQ.


C++ : What's Undefined behavior in C++ ?


Let me quote here the 2003 standard for C++

...
1.3.12 undefined behavior

behavior, such as might arise upon use of an erroneous program construct or erroneous
data, for which this International Standard imposes no requirements. Undefined
behavior may also be expected when this International Standard omits the description of
any explicit definition of behavior. [Note: permissible undefined behavior ranges from
ignoring the situation completely with unpredictable results, to behaving during
translation or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to terminating a
translation or execution (with the issuance of a diagnostic message). Many erroneous
program constructs do not engender undefined behavior; they are required to be
diagnosed. ]

1.3.13 unspecified behavior

behavior, for a well-formed program construct and correct data, that depends on the
implementation. The implementation is not required to document which behavior occurs.
[Note: usually, the range of possible behaviors is delineated by this International
Standard. ]
...

One of the frequently asked questions to Bjarne is:

Why are some things left undefined in C++?

And here's his answer....

Because machines differ and because C left many things undefined. For details,
including definitions of the terms "undefined", "unspecified", "implementation defined",
and "well-formed"; see the ISO C++ standard. Note that the meaning of those terms
differs from their definition in the ISO C standard and from some common usage. You can
get wonderfully confused discussions when people don't realize that not everybody shares
definitions.

This is a correct, if unsatisfactory, answer. Like C, C++ is meant to exploit hardware
directly and efficiently. This implies that C++ must deal with hardware entities such as
bits, bytes, words, addresses, integer computations, and floating-point computations the
way they are on a given machine, rather than how we might like them to be. Note that
many "things" that people refer to as "undefined" are in fact "implementation defined",
so that we can write perfectly specified code as long as we know which machine we are
running on. Sizes of integers and the rounding behavior of floating-point computations
fall into that category.

Consider what is probably the best known and most infamous example of undefined
behavior:

int a[10];
a[100] = 0; // range error
int* p = a;
// ...
p[100] = 0; // range error (unless we gave p a better value before that assignment)

The C++ (and C) notion of array and pointer are direct representations of a machine's
notion of memory and addresses, provided with no overhead. The primitive operations on
pointers map directly onto machine instructions. In particular, no range checking is
done. Doing range checking would impose a cost in terms of run time and code size. C
was designed to outcompete assembly code for operating systems tasks, so that was a
necessary decision. Also, C -- unlike C++ -- has no reasonable way of reporting a
violation had a compiler decided to generate code to detect it: There are no exceptions in
C. C++ followed C for reasons of compatibility and because it too competes directly with
assembler (in OS, embedded systems, and some numeric computation areas). If you want
range checking, use a suitable checked class (vector, smart pointer, string, etc.). A good
compiler could catch the range error for a[100] at compile time, catching the one for
p[100] is far more difficult, and in general it is impossible to catch every range error at
compile time.
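
As a minimal sketch of the "suitable checked class" advice, the snippet below uses std::vector::at(), which checks the index and throws std::out_of_range, whereas operator[] performs no check. The helper name checked_write_fails is made up for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: tries a checked write to v[index].
// Returns true if at() rejected the index with std::out_of_range,
// false if the write succeeded.
bool checked_write_fails(std::vector<int>& v, std::size_t index)
{
    try {
        v.at(index) = 0;   // at() verifies the index; operator[] would not
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a ten-element vector, a write at index 100 is reported instead of silently corrupting memory, at the cost of one comparison per access.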

Other examples of undefined behavior stem from the compilation model. A compiler
cannot detect an inconsistent definition of an object or a function in separately-compiled
translation units. For example:

// file1.c:
struct S { int x,y; };
int f(struct S* p) { return p->x; }

// file2.c:
struct S { int y,x; };

int main()
{
struct S s;
s.x = 1;
int x = f(&s); // x!=s.x !!
return 2;
}

Compiling file1.c and file2.c and linking the results into the same program is illegal in
both C and C++. A linker could catch the inconsistent definition of S, but is not obliged
to do so (and most don't). In many cases, it can be quite difficult to catch inconsistencies
between separately compiled translation units. Consistent use of header files helps
minimize such problems and there are some signs that linkers are improving. Note that
C++ linkers do catch almost all errors related to inconsistently declared functions.

Finally, we have the apparently unnecessary and rather annoying undefined behavior of
individual expressions. For example:

void out1() { cout << 1 ; }
void out2() { cout << 2 ; }

int main()
{
int i = 10;
int j = ++i + i++; // value of j unspecified
f(out1(),out2()); // prints 12 or 21
}

The value of j is unspecified to allow compilers to produce optimal code. It is claimed
that the difference between what can be produced giving the compiler this freedom and
requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but
with innumerable compilers "out there" taking advantage of the freedom and some
people passionately defending that freedom, a change would be difficult and could take
decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed
that not all compilers warn against code such as ++i+i++. Similarly, the order of
evaluation of arguments is unspecified.
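
One way to sidestep both problems is to split the expression into separate statements, because each full statement is completed before the next one starts. The sketch below assumes hypothetical helpers (out1, out2 and sum take a log string so the order can be observed); it is an illustration, not code from the FAQ.

```cpp
#include <string>

// Hypothetical helpers: each appends its digit to a log and returns it.
int out1(std::string& log) { log += '1'; return 1; }
int out2(std::string& log) { log += '2'; return 2; }
int sum(int a, int b) { return a + b; }

// sum(out1(log), out2(log)) could log "12" or "21".
// Evaluating the arguments in separate statements fixes the order.
std::string ordered_calls()
{
    std::string log;
    int first = out1(log);    // always runs first
    int second = out2(log);   // always runs second
    (void)sum(first, second);
    return log;
}

// The same idea applied to ++i + i++: one modification per statement.
int incremented_sum(int i)
{
    int a = ++i;   // i is incremented, a gets the new value
    int b = i++;   // b gets the current value, then i is incremented
    return a + b;
}
```

Named temporaries cost nothing after optimization and make the intended order visible to the reader as well as to the compiler.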

IMO far too many "things" are left undefined, unspecified, implementation-defined, etc.
However, that's easy to say and even to give examples of, but hard to fix. It should also
be noted that it is not all that difficult to avoid most of the problems and produce
portable code.

This is Bjarne at his best.


Wednesday, November 8, 2006


C++ / C: Should main ( ) return void?

The answer is 'NO'. 'void' has never been the return type of main() in either C++ or C.
You should return 'int'. For example:

#include <iostream>

int main ( ) {
std::cout<< "Hello Mama";
return 0;
}

What does 'The C++ Standard' say?


Let me quote 'The C++ Standard' here

3.6.1 Main function

....

2 An implementation shall not predefine the main function. This function shall not be
overloaded. It shall have a return type of type int, but otherwise its type is
implementation-defined. All implementations shall allow both of the following definitions
of main:

int main() { /* ... */ }

and

int main(int argc, char* argv[]) { /* ... */ }

.... and it continues to add ...

5 A return statement in main has the effect of leaving the main function (destroying any
objects with automatic storage duration) and calling exit with the return value as the
argument. If control reaches the end of main without encountering a return statement,
the effect is that of executing return 0;

So, now you know what the standard states. Shun the books that don't understand the
standard properly.


C++: I am sure that vptr is stored as a first member of the object !

No, you're wrong. You need to read on...

Eligibility: You should know what vptr stands for and what are its functions, and you
should have basic familiarity with the C++ language.

What does 'The C++ Standard' say?

The C++ Standard is noncommittal about it, because the standard describes the C++
language and not its implementation.

The first implementation of C++ (by Bjarne Stroustrup) had the vptr as the last member of
the object/sub-object, but the Microsoft compiler, like some others, has it as the first
member. What are the consequences of its position within the object? Let's see...

The C language portability

One current topic of debate within the C++ community concerns where best to locate the
vptr within the class object. In the original cfront implementation, it was placed at the
end of the class object in order to support the following inheritance pattern, shown in the
figure below:

struct no_virts {
int d1, d2;
};

class has_virts: public no_virts {
public:
virtual void foo();
// ...
private:
int d3;
};

no_virts *p = new has_virts;

Placing the vptr at the end of the class object preserves the object layout of the base class
C struct, thus permitting its use within C code. This inheritance idiom is believed by
many to have been more common when C++ was first introduced than currently.

Subsequent to release 2.0, with its addition of support for multiple inheritance and
abstract base classes, and the general rise in popularity of the OO paradigm, some
implementations began placing the vptr at the start of the class object.

Placing the vptr at the start of the class is more efficient in supporting some virtual
function invocations through pointer to class members under multiple inheritance.
Otherwise, not only must the offset to the start of the class be made available at runtime,
but also the offset to the location of the vptr of that class must be made available. The
trade-off, however, is a loss in C language interoperability. How significant a loss? What
percentage of programs derive a polymorphic class from a C-language struct? There are
currently no empirical numbers to support either position.
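
Where the vptr actually sits cannot be queried portably, but the loss of C-compatible layout can. The sketch below (assuming a C++11 compiler for the type traits) reuses the two classes from the example above, with a function body and an accessor added for illustration. It shows that adding a virtual function removes the standard-layout guarantee, while the conversion to the base pointer stays valid wherever the vptr is placed.

```cpp
#include <type_traits>

struct no_virts {
    int d1, d2;
};

class has_virts : public no_virts {
public:
    virtual void foo() {}
    int get_d3() const { return d3; }   // accessor added for this sketch only
private:
    int d3 = 0;
};

// The plain struct keeps a layout that C code can rely on ...
static_assert(std::is_standard_layout<no_virts>::value,
              "no_virts is usable from C");
// ... but the polymorphic class gives no such guarantee.
static_assert(std::is_polymorphic<has_virts>::value,
              "has_virts carries a vptr in typical implementations");
static_assert(!std::is_standard_layout<has_virts>::value,
              "has_virts has no C-compatible layout guarantee");

// The derived-to-base conversion adjusts the pointer if needed,
// so this works whether the vptr is first or last.
int read_d1(const no_virts* p) { return p->d1; }
```

C code should therefore only ever be handed the no_virts pointer obtained through that conversion, never the raw address of the has_virts object.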


Tuesday, November 7, 2006


C++: I have heard that if I don't write one or more of default constructor,
copy constructor, and copy assignment operator the compiler
creates them.

Ha! They have taken you half-way through your voyage. This is incomplete information.
The answer is 'NOT ALWAYS'! Surprised? Well then read on...

Eligibility: You should be familiar with basic formulation of the C++ language. You can
check for 'The C++ Standard' here.

Triviality...

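Strictly speaking, the C++ Standard implicitly declares a default .ctor, copy .ctor, and
copy assignment operator whenever you do not declare them yourself. What matters in
practice is whether they are trivial: the compiler needs to synthesize actual code for them
only if they are non-trivial. Let's consider the conditions under which each of the three
is non-trivial, one by one.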

The Default Constructor

Remember that if you provide constructors that take arguments but no default
constructor, no default constructor is prepared for you. The compiler assumes that this
is exactly the behavior you wanted. The standard states that the default constructor
is non-trivial if:

1. Any member class object has a non-trivial default constructor.
2. A base class has a non-trivial default constructor.
3. The class has virtual functions.
4. The class has a virtual base class.

If none of these conditions is met, the compiler doesn't synthesize a constructor for
you. So, if we consider the following code

class MyClass{

public:
void SomeFunction( ) { /* ... */ }

private:
int* somedata;
int _x, _y;

};

no default constructor code is generated and the members are left uninitialized. This may
result in unwanted behavior: here, somedata may point to restricted memory, and
accessing it may crash the program. So it's best to define a default constructor, and you
should make a habit of it.
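
The four conditions can be checked with the C++11 type trait std::is_trivially_default_constructible. The sketch below uses made-up type names (Plain, WithMember and so on), one per condition, followed by a version of MyClass with an explicit default constructor; the isEmpty() accessor is added only so the result can be observed.

```cpp
#include <string>
#include <type_traits>

struct Plain       { int* somedata; int x, y; };  // meets none of the conditions
struct WithMember  { std::string name; };         // 1. member with a default ctor
struct Base        { Base() {} };
struct WithBase : Base { int x; };                // 2. base with a default ctor
struct WithVirtual { virtual void f() {} };       // 3. virtual function
struct VBase       { int v; };
struct WithVBase : virtual VBase { int x; };      // 4. virtual base class

static_assert(std::is_trivially_default_constructible<Plain>::value,
              "trivial: members are left uninitialized");
static_assert(!std::is_trivially_default_constructible<WithMember>::value, "1");
static_assert(!std::is_trivially_default_constructible<WithBase>::value, "2");
static_assert(!std::is_trivially_default_constructible<WithVirtual>::value, "3");
static_assert(!std::is_trivially_default_constructible<WithVBase>::value, "4");

// The habit recommended above: initialize every member explicitly.
class MyClass {
public:
    MyClass() : somedata(nullptr), x_(0), y_(0) {}
    bool isEmpty() const { return somedata == nullptr && x_ == 0 && y_ == 0; }
private:
    int* somedata;
    int x_, y_;
};
```

Once MyClass has a user-provided constructor it is no longer trivially default constructible, which is the price of having well-defined initial values.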

The Copy Constructor

The Standard states that a copy constructor, if not defined, needs to be generated only if
it will not be trivial. A trivial copy constructor is one that exhibits bitwise copy
semantics, that is, a bit-by-bit copy of the object. A class does not exhibit bitwise copy
semantics:

1. When the class contains a member object of a class for which a copy constructor
exists (either explicitly declared by the class designer, or synthesized by the
compiler).
2. When the class is derived from a base class for which a copy constructor exists
(again, either explicitly declared or synthesized).
3. When a class declares one or more virtual functions. This is essential for proper
initialization of vptr if the parent object is initialized with an object of derived
class.
4. When the class is derived from an inheritance chain in which one or more base
classes are virtual. This is essential to properly initialize the virtual base class
subobject if the parent object is initialized with an object of a derived class. Failing
to do so would result in improper initialization of the virtual base class pointer/offset
within the vtbl.

So, the following class doesn't get a synthesized copy constructor. That is, no function
is produced (or called). Rather, the copying code is inlined wherever it is needed.

class MyClass {
public:
MyClass( ):name(0), x(0){ }
MyClass (const char* );

~MyClass( ) {
delete[] name ;
}

void ShowMe( ) { /* ... */ }

private:
char *name;
int x;
};

Now if we write

MyClass M;

MyClass M2 =M; //invoke inlining

someFunction(M2); //invoke inlining

or

MyClass someOtherFunction() {
MyClass M;
return M; //invoke inlining
}

There is no function call; instead, the code for a bitwise copy of the members is inlined. That is,

M2.name = M.name;
M2.x=M.x;

This leads to both name pointers pointing to the same location. So, if either object
changes or deletes its name (for example, when it goes out of scope), the other object is
left with a dangling pointer: it still holds the address, but the memory behind it has been
invalidated. When the second object is destroyed, the same memory is deleted twice.

So, it is advisable to always declare a copy constructor for such a class. But if you are
smart and know that the bitwise copy won't lead to bugs, you can give it a miss. This can
increase program speed, as there won't be any function calls.
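
For a class that owns memory, as MyClass does, a deep-copying constructor might look like the sketch below. The duplicate() helper and the getName() accessor are additions for illustration, and copy assignment is deliberately disabled to keep the sketch short.

```cpp
#include <cstring>

class MyClass {
public:
    explicit MyClass(const char* n = "") : name(duplicate(n)), x(0) {}

    // Deep copy: each object gets its own buffer, so destroying one
    // object cannot invalidate the name held by the other.
    MyClass(const MyClass& other) : name(duplicate(other.name)), x(other.x) {}

    // Assignment is left out of this sketch; forbid it rather than
    // fall back to the bitwise version.
    MyClass& operator=(const MyClass&) = delete;

    ~MyClass() { delete[] name; }

    const char* getName() const { return name; }

private:
    // Hypothetical helper: allocates and returns a copy of a C string.
    static char* duplicate(const char* s) {
        char* copy = new char[std::strlen(s) + 1];
        std::strcpy(copy, s);
        return copy;
    }

    char* name;
    int x;
};
```

After MyClass M2 = M; the two objects hold equal strings at different addresses, so each destructor releases only its own buffer.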

Copy Assignment Operator

The condition for synthesizing a copy assignment operator is the same as for the copy
constructor: it must not exhibit bitwise copy semantics, i.e. it must be non-trivial.
A copy assignment operator does not show bitwise copy semantics:

1. When the class contains a member object of a class for which a copy assignment
operator exists (either explicitly declared by the class designer, or synthesized by
the compiler).
2. When the class is derived from a base class for which a copy assignment operator
exists (again, either explicitly declared or synthesized).
3. When a class declares one or more virtual functions. This is essential for keeping
the vptr correct if the parent object is assigned from an object of a derived
class.
4. When the class is derived from an inheritance chain in which one or more base
classes are virtual. This is essential to properly handle the virtual base class
subobject if the parent object is assigned from an object of a derived class. Failing
to do so would leave the virtual base class pointer/offset within the vtbl
improperly set.

So, if the operator is trivial, it results in inlined code rather than function calls,
which increases speed. As the assignment operator is used widely in a program, it is best
not to declare one unless it is necessary, such as when a bitwise copy can result in
dangling pointers. If you are not sure about the behavior, go ahead and declare one
anyhow.
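
When a bitwise copy would leave dangling pointers, one common way to write the operator is to copy first and then swap. The sketch below uses a hypothetical Name class (not from the posts above) that owns a character buffer.

```cpp
#include <cstring>
#include <utility>

class Name {
public:
    explicit Name(const char* s = "") : text(duplicate(s)) {}
    Name(const Name& other) : text(duplicate(other.text)) {}

    Name& operator=(const Name& other) {
        if (this != &other) {
            Name copy(other);            // allocate first; *this is untouched if this throws
            std::swap(text, copy.text);  // take the new buffer, hand the old one to copy
        }                                // copy's destructor frees the old buffer
        return *this;
    }

    ~Name() { delete[] text; }

    const char* c_str() const { return text; }

private:
    // Hypothetical helper: allocates and returns a copy of a C string.
    static char* duplicate(const char* s) {
        char* copy = new char[std::strlen(s) + 1];
        std::strcpy(copy, s);
        return copy;
    }

    char* text;
};
```

The operator is no longer trivial, so assignments become real function calls, but each object keeps sole ownership of its buffer.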

Now you have the full picture. So bon voyage, O reader.


C++: What are vptr and vtbl in C++ ?

Well, this is one of the most-asked C++ questions of all time. Let me be clear first:
these two data structures are not mentioned in 'The C++ Standard'. Rather, they are
implementation details, and so are not part of standard C++.

Eligibility: you should have a clear understanding of polymorphism with C++.

Why do we need them?


The question that arises is 'Why do we need them?' The answer is simple. To support
polymorphism and RTTI. Let's look at a piece of code in C++.

#include <iostream>
using std::cout;

class Parent {
public:
Parent():data(0) { }

void doSomeThing ( ) {
//.......
showData( ) ; //equivalent to this->showData( );
}

virtual void someFunction( ) { /* ...... */ }

private:
int data;
virtual void showData ( ) {
cout << data ;
}
};

class Child : public Parent {
public:
Child():data(10) { }

private:
int data;
void showData ( ) {
cout << data ;
}
};

int main ( ) {
Parent *p = 0;
p = new Child ( ); // p points to Child Now
p -> doSomeThing ( ); //gives 10
return 0;
}

How to produce intermediate code capable of handling polymorphic behavior?

The crux of the problem for the compiler is that, just like main ( ) shown here, many other
functions in a large program would call doSomeThing( ) through a Parent class pointer.
This pointer may actually (as in this case) point to a derived class object. Now, the
compiler has to produce intermediate code that can decide which showData( ) to
call whenever doSomeThing( ) is called through a Parent class pointer. If the pointer
points to a Parent object, cout should output 0; if it points to a Child object,
it should output 10.

To make the decision, it employs the technique of vptr (virtual pointer) and vtbl (virtual
table). Every object of a class that exhibits polymorphic behavior (has a virtual function)
embeds a pointer. This pointer is the vptr.

What does vptr point to?

Now the question arises, 'What does this pointer (vptr) point to?' To answer it,
we need to understand where member functions are stored. Every member function of
a class is stored statically under a mangled name. Say doSomeThing( ) is stored as
__Parent__d001( ), and the two virtual functions as __Parent__showData001( )
and __Child__showData001( ). So, code like

Parent *p = new Parent ( );
p->doSomeThing( );

is changed to

//...
__Parent__d001(p);

which means: call __Parent__d001( ) with p as the object. Now, let's see what the effects
of the keyword 'virtual' before a function declaration are. A derived class declaration can:

1. Override the virtual function.
2. Not override the virtual function.

To accommodate both situations, the compiler creates a data structure called vtbl (for
virtual table). Every polymorphic class has one or more vtbls, and each of its objects has
an equal number of vptrs. In the simple scenario (as with the above code) we have one vptr
and one vtbl, and we will consider that for the explanation. The vtbl can be considered as
having the 'type_info' of the class as its first member, with the remaining members (a
variable number, but at least one) being pointers to the virtual functions of the class.
The structure of the vtbl is given below:

When a derived class decides not to override the virtual function, the address of Parent's
virtual function is added to its vtbl. But if the derived class decides to override the
function, the address of the overriding function replaces the old address, yielding
the figure shown on the left-hand side. The positions of the functions remain fixed
within the hierarchy of vtbls. Here, &__XXX__showData001( ) will always be the first
function entry (after type_info) within the vtbls.

Where in object is vptr stored?

Now, the second question to be answered on our journey to find out what vptr points
to is, 'Where in the object structure is the vptr stored?' The answer is: anywhere the
implementation likes. Some compilers, following the original cfront implementation, embed
the vptr as the last member (for C language compatibility), whereas others, like the
Microsoft compilers, make it the first member of the object structure. Considering the
former case, we arrive at the following object layout:

Now we are in a position to answer the question: the vptr of
the object points to the respective vtbl.

Rewriting the intermediate code...

So the polymorphic code:

void doSomeThing ( ) {
//.......
showData( ) ;
}

can be written as

void doSomeThing ( ) {
//.......
this->vptr[1](this);
//[1] is the second entry of vtbl, which is showData( )
}

Now, depending upon the type of 'this', i.e. the type of the object on which
doSomeThing( ) was called, its respective virtual function is called.

Hurrah! That solves the problem of generating proper polymorphic intermediate code. In
the same way, the type_info for every object can be checked to see if it can be
properly cast to another type, thus enabling Runtime Type Information (RTTI).
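
To make the mechanism concrete, here is a hand-written emulation of a vptr and vtbl using plain structs and function pointers. It is only a sketch of the idea: real compilers choose their own layouts, and every name in it (VTable, ParentObj, ChildObj, do_something and so on) is made up for illustration. The vptr is placed first here purely to keep the sketch simple, and the show functions return the value instead of printing it so the result can be checked.

```cpp
struct ParentObj;                                // forward declaration
typedef int (*ShowDataFn)(const ParentObj*);     // signature of the virtual slot

struct VTable {
    const char* type_name;   // stands in for the type_info entry
    ShowDataFn  show_data;   // slot for the single virtual function
};

struct ParentObj {
    const VTable* vptr;      // hidden pointer the compiler would embed
    int data;
};

struct ChildObj {
    ParentObj base;          // base subobject comes first
    int data;                // Child's own data member
};

int parent_show_data(const ParentObj* self) { return self->data; }

int child_show_data(const ParentObj* self) {
    // Valid here because base is the first member of the plain struct ChildObj.
    return reinterpret_cast<const ChildObj*>(self)->data;
}

const VTable parent_vtbl = { "Parent", &parent_show_data };
const VTable child_vtbl  = { "Child",  &child_show_data };   // overriding entry

ParentObj make_parent() { ParentObj p = { &parent_vtbl, 0 }; return p; }
ChildObj  make_child()  { ChildObj c = { { &child_vtbl, 0 }, 10 }; return c; }

// What doSomeThing() boils down to: an indirect call through the vptr.
int do_something(const ParentObj* self) {
    return self->vptr->show_data(self);
}
```

do_something() never tests the dynamic type itself. It simply follows whichever vtbl the object was constructed with, which is why the same call yields 0 for a parent object and 10 for a child object.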
