
C++??

A Critique of C++

2nd Edition

Ian Joyner

c/- Unisys - ACUS


115 Wicks Rd, North Ryde
Australia 2113
Tel: +61-2-390 1328 Fax +61-2-390-1391

ian@syacus.acus.oz.au

© Ian Joyner 1992


1. Introduction
2. The Role of a Programming Language
    2.1. Safety and Courtesy Concerns
3. C++ Specific Criticisms
    3.1. Virtual Functions
    3.2. Pure Virtual Functions
    3.3. The Nature of Inheritance
    3.4. Function Overloading
    3.5. Virtual Classes
    3.6. Name overloading
    3.7. Polymorphism and Inheritance
    3.8. ‘.’ and ‘->’
    3.9. Anonymous parameters in Class Definitions
    3.10. Nameless Constructors
    3.11. Constructors and Temporaries
    3.12. Optional Parameters
    3.13. Bad deletions
    3.14. Local entity declarations
    3.15. Members
    3.16. Friends
    3.17. Static
    3.18. Union
    3.19. Nested Classes
    3.20. Global Environments
    3.21. Header Files
    3.22. Class Interfaces
    3.23. Class header declarations
    3.24. Garbage Collection
    3.25. Type-safe linkage
    3.26. C++ and the software lifecycle
    3.27. Reusability and Communication
    3.28. Reusability and Trust
    3.29. Reusability and Compatibility
    3.30. Reusability and Portability
    3.31. Idiomatic Programming
    3.32. Concurrent Programming
4. The role of Language
5. On Writing
6. Generic C criticisms
    6.1. Pointers
    6.2. Arrays
    6.3. Function Parameters
    6.4. void *
    6.5. void fn ()
    6.6. fn ()
    6.7. Metadata in Strings
    6.8. ++, --
    6.9. Defines
    6.10. NULL vs 0
    6.11. Case Distinction
    6.12. Assignment Operator
    6.13. Type Casting
    6.14. Semicolons
7. Conclusions
8. Bibliography

1. Introduction

The C++ programming language is becoming widely used, so it is important and timely to question its success. Two books have already been published on the subject, [Sakkinen 92] and [Yoshida 92]. This critique addresses the following questions. How well does C++ implement object-oriented concepts? Can it easily implement small, quick projects? Does it scale up well for large projects? Does it support or hinder good programming practices? As a result, does it ease the production of quality software? What is the relationship between a language, compiler and software developers; and between the language, compiler and the target system? This last question addresses issues of correctness, compatibility, portability, and efficiency.

A paper on the recommended practices for use in C++ [Ellemtel 92] suggests “C++ is a difficult language in which there may be a very fine line between a feature and a bug. This places a large responsibility upon the programmer.” Is this a responsibility or a costly burden? The ‘fine line’ is a result of poor language definition. The C++ standardisation committee warns “C++ is already too large and complicated for our taste” [X3J16 92].

While it is true that C++ is immediately usable by many C programmers, and many see this as a strength, the C base is C++’s greatest weakness. This is the engineering compromise that C++ devotees talk about. Adoption of C++ does not suddenly transform C programmers into object-oriented programmers. A complete change of thinking is required, and C++ actually makes this difficult. A critique of C++ cannot be separated from criticism of the C base language, as it is essential for the C++ programmer to be fluent in C. Many of C’s problems affect the way that object-orientation is implemented and used in C++. This critique is not exhaustive of the weaknesses of C++, but it illustrates the practical consequences of these weaknesses with respect to the timely and economic production of quality software.

This critique criticises C++ in its own right, without comparison to other languages. Section 2 considers the role of a programming language. Section 3 examines some specific aspects of C++. Section 4 examines the general role of language. Section 5 is a short comment on writing. Section 6 looks specifically at C. The conclusion examines where C++ has left us, and considers the future. The approach taken is to criticise specific aspects of C++ and C. Each section tries to be self-contained. It is expected that not everyone will agree with all of the sections. It is probably best to approach the paper not by reading it in its entirety, but by reading those sections that interest you. One section, however, is fundamental to the criticism of C++: that on virtual functions. This is also the most technical and most difficult section of the paper to understand, but it is fundamental to the understanding of the weaknesses of C++.

Having said that, I hope that you find this critique useful and enjoyable. If by any chance you do, please feel free to distribute it to your management, peers and friends.

2. The Role of a Programming Language

A programming language functions at many different levels and has many roles. It should be critiqued with respect to those levels and roles. Historically, programming languages had a very limited role, that of writing executable programs. As programs have grown in complexity, this role alone has proved insufficient. Many design and analysis techniques have arisen to support other necessary roles. The organisation of projects also required tools external to the language and compiler, such as ‘make’. Object-oriented techniques have arisen to help in the analysis and design phases, and object-oriented languages to support the implementation phase of OO. Traditional, tried and tested but failed software practices are infiltrating the object-oriented world. Object-orientation, however, offers a better, rational approach to software development. The complementary roles of analysis, design, implementation and project organisation should be better integrated in the object-oriented scheme. This results in economical software production.

C++ is an interesting experiment in adapting the advantages of object-orientation to a traditional programming language. Bjarne Stroustrup is to be applauded for having the insight to put the two technologies together. C++, however, retains the problems of the old order of software production. C++ has an advantage over C as it supports many facets of object-orientation. These can be used for limited analysis and design. The processes of analysis, design, and organisation, however, are still largely external to C++. Thus C++ has not realised the important advantages of object-orientation that will indeed lead to the economic production of software.

A language should not only be critiqued from a technical point of view, considering its syntactic and semantic features. It should also be critiqued from the viewpoint of its contribution to the entire software development process. It should enable communication between project members acting at different levels, from management, who have a requirement for the product, to testers, who must test the result. It should also enable communication between project members separated in space and time. Often one programmer is not responsible for a task over its entire lifetime.

C++?? 2nd Edition page 1


The primary purpose of any language is communication. A programming language should support the exchange of ideas, intentions, and decisions between project members. A programming language should provide a formal, yet readable, notation to support consistent descriptions of systems that satisfy the requirements of diverse problems. A language should also provide methods for automated project tracking. This ensures that modules (classes and functionality) that satisfy project requirements are completed in a timely and economic fashion. A programming language aids reasoning about the design, implementation, extension, correction, and optimisation of a system.

A language definition should enable the development of integrated automated tools to support software development, for example browsers, editors and debuggers. The compiler is another such tool. The role of a compiler is twofold. Firstly, it generates code for the target machine. The role of the machine is to execute the produced programs. A compiler has to check that a program conforms to the language syntax and grammar, so it can ‘understand’ the program in order to translate it into an executable form. Secondly, and more importantly, the compiler should check that the programmer’s expression of the system is complete, valid and consistent. A compiler should perform semantics checking, that is, checking that a program is internally consistent. Generating a system that has detectable inconsistencies is pointless.

Semantics checking is done by ensuring that a specification conforms to some schema. For example, the sentence “The boy drank the computer and switched on the glass of water” is grammatically correct. But the sentence is nonsense. It does not conform to the mental schema we have of computers and glasses of water. A programming language should include techniques for the detection of similar nonsense. The language definition provides the framework that makes this role of the compiler possible.

Checking is often enabled by the specification of redundant information. Declarations are an example of redundancy that helps check for misspellings. Declarations define the vocabulary of a program, i.e. the elements in its universe. The compiler uses redundant information for consistency checking, and strips it away to produce efficient executable systems. Type safety is another technique. Declarations also associate an entity with a type, to define the entity’s role. Typing ensures that you can’t drink computers or switch on glasses of water. C++ is an improvement over C in type safety.

It is a misconception that consistency checks are ‘training wheels’ for student programmers, and that ‘syntax’ errors are a hindrance to professional programmers. Languages that exploit techniques of schema checking are often criticised as being restrictive and therefore unusable for real world software. This is nonsense and misunderstands the power of these languages. It is an immature conception; the best programmers realise that programming is difficult. As a whole, the computing profession is still learning to program.

Another example of consistency checking comes from the user interface world. Instead of correcting a user after an erroneous action, a good user interface will not offer the action as a possibility in the first place. It is cheaper to avoid error than to fix it. Most people drive their cars with this principle in mind. Smash repair is time-consuming and expensive.

Program development is a dynamic process. A program description is constantly modified during development. Modifications often lead to inconsistencies and error. Languages and compilers that provide consistency checks help prevent such ‘bugs’, which can creep into a previously working system. These checks help verify that as a program is modified, previous decisions and work are not invalidated.

It is interesting to consider how much checking could be integrated in an editor. The focus of many current-generation editors is text. What happens if we change this focus from text to program components? Such editors might check not only syntax, but semantics. Alerting programmers to potential errors earlier and interactively will shorten development times. Future languages should be defined very cleanly in order to enable such editor technology.

A programming language should provide a formal notation. During requirements analysis and design phases, formal and semi-formal notations are required. Notations used in analysis, design, and implementation phases should be complementary, rather than contradictory. Currently, analysis, design and modelling notations are too far removed from programming, while programming languages are in general too low level. Both designers and programmers must compromise to fill the gap. Current notations provide difficult transition paths between stages. This ‘semantic gap’ contributes to errors and omissions between the requirements, design and implementation phases. Future programming languages will be an implementation extension of the high-level notations used for requirements analysis and design. This will lead to improved consistency between analysis, design and implementation. Object-oriented techniques emphasise the importance of this, as abstract definition and concrete implementation can be separate, yet provided by the same syntax.

Programming languages also provide notations to formally document a system. Program source is the only reliable documentation of a system, so a language should explicitly
support documentation. As with all language, the effectiveness of communication is dependent upon the skill of the writer. Good program writers require languages that support the role of documentation. They require that the syntax of a language is perspicuous and easy to learn. Those not trained in the skill of ‘writing’ programs can read them to gain understanding of the system. After all, it is not necessary for newspaper readers to be journalists.

Chris Reade [Reade 89] gives the following explanation of programming and languages. “One, rather narrow, view is that a program is a sequence of instructions for a machine. We hope to show that there is much to be gained from taking the much broader view that programs are descriptions of values, properties, methods, problems and solutions. The role of the machine is to speed up the manipulation of these descriptions to provide solutions to particular problems. A programming language is a convention for writing descriptions which can be evaluated.”

[Reade 89] also describes programming as being a “Separation of concerns”. He says:

“The programmer is having to do several things at the same time, namely,

(1) describe what is to be computed;
(2) organise the computation sequencing into small steps;
(3) organise memory management during the computation.”

Reade continues, “Ideally, the programmer should be able to concentrate on the first of the three tasks (describing what is to be computed) without being distracted by the other two, more administrative, tasks. Clearly, administration is important but by separating it from the main task we are likely to get more reliable results and we can ease the programming problem by automating much of the administration.

“The separation of concerns has other advantages as well. For example, program proving becomes much more feasible when details of sequencing and memory management are absent from the program. Furthermore, descriptions of what is to be computed should be free of such detailed step-by-step descriptions of how to do it if they are to be evaluated with different machine architectures. Sequences of small changes to a data object held in a store may be an inappropriate description of how to compute something when a highly parallel machine is being used with thousands of processors distributed throughout the machine and local rather than global storage facilities.

“Automating the administrative aspects means that the language implementor has to deal with them, but he/she has far more opportunity to make use of very different computation mechanisms with different machine architectures.”

These quotes from Reade are a good summary of the principles from which I criticise C++. What Reade calls administrative tasks, I call bookkeeping. C and C++ are often criticised for being cryptic. The reason is that C concentrates on points 2 and 3, while the description of what is to be computed is obscured. High-level languages describe ‘what’ is to be computed. This is the problem domain. ‘How’ a computation is achieved is in the low-level machine-oriented domain. The conflict between these aspects recurs frequently throughout this critique. Automating the bookkeeping tasks enhances correctness, compatibility, portability and efficiency. Bookkeeping tasks arise from having to specify ‘how’ a computation is done. Specifying ‘how’ things are done in some environments hinders portability to other platforms.

The industry should be moving towards these ideals. They will help in the economic production of software, rather than the costly techniques of today. We should consider what we need, and assess the problems of what we have against that. Object-orientation provides one solution to these problems. Its effectiveness, however, depends on the quality of its implementation.

It is relevant to ask whether grafting OO concepts onto a conventional language realises the full benefits of OO. Perhaps a biblical quote can be considered: “No one sews a patch of unshrunk cloth on to an old garment; if he does, the patch tears away from it, the new from the old, and leaves a bigger hole. No one puts new wine into old wineskins; if he does, the wine will burst the skins, and then wine and skins are both lost. New wine goes into fresh skins.” Mark 2:21-22

We must abandon disorganised and error-prone practices, not adapt them to new contexts. How well can hybrid languages support the sophisticated requirements of modern software production? Surely a basic premise of object-oriented programming is to enable the development of sophisticated systems through the adoption of the simplest techniques possible? Software development technologies and methodologies should not impede the production of such sophisticated systems.

2.1. Safety and Courtesy Concerns

This critique makes two general types of criticism, about ‘safety’ concerns and ‘courtesy’ concerns. These themes recur throughout this critique, as C and C++ have flaws that compromise them frequently. Safety concerns affect the external perception of the quality of the program. Failure to meet safety concerns results in unfulfilled requirements and program crashes.

Courtesy concerns affect the internal view of the quality of a program in the development and maintenance process. Courtesy concerns are usually stylistic and syntactic, whereas safety concerns are semantic. The two often go together.
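
As a rough illustration of how the two kinds of concern differ and yet go together, consider the following hypothetical C++ sketch. It is not taken from the paper; the function names f and mean_reading and the data are invented for the example. The first function is discourteous (its names say nothing about intent) and unsafe (an empty input divides by zero). The second addresses both.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example, not from the paper.

// Discourteous and unsafe: the names carry no meaning, and an empty
// vector causes an integer division by zero (undefined behaviour).
int f(const std::vector<int>& v) {
    int s = 0;
    for (int x : v) s += x;
    return s / static_cast<int>(v.size());
}

// Courteous and safer: the names state the intent, and the empty case
// is reported through the return value instead of being left to crash.
// Returns false and leaves mean_out untouched when there are no readings.
bool mean_reading(const std::vector<int>& readings, double& mean_out) {
    if (readings.empty()) {
        return false;
    }
    long total = 0;
    for (int reading : readings) {
        total += reading;
    }
    mean_out = static_cast<double>(total) / static_cast<double>(readings.size());
    return true;
}
```

A caller of mean_reading has to look at the returned flag before using the result, so the empty case is visible in the interface. The courtesy fix (clear names) and the safety fix (an explicit check) are separate changes, but here they reinforce each other.
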
It is courtesy for an airline to keep its fleet well maintained. This courtesy concern is also very much a safety concern.

Courtesy issues are even more important in the context of reusable software. Reusability depends on the clear communication of the purpose of a module. Courtesy is important to establish social interactions, such as communication. Courtesy implies inconvenience to the provider, but provides convenience to others.

Courtesy issues include choosing meaningful identifiers, consistent layout and typography, meaningful and non-redundant commentary, etc. Courtesy issues are more than just a style consideration. A language design should directly support courtesy issues. A language, however, cannot enforce courtesy issues, and it is often pointed out that poor, discourteous programs can be written in any language. But this is no reason for being careless about the languages that we develop and choose for software development.

3. C++ Specific Criticisms

3.1. Virtual Functions

Polymorphism is a key concept of OOP. Virtual functions are one way to implement polymorphism. A language designer’s choice is whether this should be specified in the parent or the inheriting class. Is it the decision of the designer of the parent or descendant class? Cases can be made for both. They are not mutually exclusive and can be catered for quite easily in an object-oriented language.

There are three options, corresponding to ‘must not’, ‘can’, and ‘must’ be redefined:

1) The redefinition of a routine is prohibited; descendant classes must use the routine as is.
2) A routine could be redefined. Descendant classes can use the routine as provided, or provide their own implementation as long as it conforms to the original interface definition and accomplishes at least as much.
3) A routine is abstract. No implementation is provided and each non-abstract descendant class must provide its own implementation. This is polymorphism.

The base class designer must decide options 1 and 3. Descendant class designers must decide option 2. A language should provide direct syntax for these options.

Option 1

C++ does not cater for the first option. Not using a virtual function is the closest. But in that case the routine can be completely replaced. This causes two problems. Firstly, a routine can be unintentionally replaced in a descendant. The compiler should report a syntax error due to ‘duplicate declaration’. This is logical as descendant classes are part of the same name space as classes they inherit from. The redeclaration of a name within the same scope should cause a name clash. Allowing two entities to have the same name within one scope causes ambiguity and other problems. (See the section on name overloading.)

The following example illustrates the second problem:

    class A
    {
    public:
        void nonvirt ();
        virtual void virt ();
    };

    class B : public A
    {
    public:
        void nonvirt ();
        void virt ();
    };

    A a;
    B b;
    A *ap = &b;
    B *bp = &b;

    bp->nonvirt ();     // calls B::nonvirt as you would expect
    ap->nonvirt ();     // calls A::nonvirt, even though this
                        // object is of type B.
    ap->virt ();        // calls B::virt, the correct version of
                        // the routine for B objects.

In this example, class B has extended or replaced routines in class A. B::nonvirt is the routine that should be called for objects of type B. It could be pointed out that C++ gives the client programmer flexibility to call either A::nonvirt or B::nonvirt. But this can be provided in a simpler, more direct way. A::nonvirt and B::nonvirt should be given different names. That way the programmer calls the correct routine explicitly, not by an obscure and error-prone trick of the language, as follows:

    class B : public A
    {
    public:
        void b_nonvirt ();
        void virt ();
    };

    B b;
    B *bp = &b;
    bp->nonvirt ();     // calls A::nonvirt
    bp->b_nonvirt ();   // calls B::b_nonvirt

Now the designer of class B has direct control over B’s interface. The application requires that clients of B can call both A::nonvirt and B::b_nonvirt. B’s designer has explicitly provided for this. This is good object-oriented design, which provides strongly defined interfaces. C++ allows client programmers to play tricks with the class interfaces, external to the class, and B’s designer cannot prevent A::nonvirt from being called. This is the opposite of good modular design.

This shows the unsafeness of C++’s virtual mechanism. Objects of class B have their own specialised ‘nonvirt’. But B’s designer does not have control over B’s interface to ensure that the correct version of nonvirt is called.

C++ also does not protect class B from other changes in the system. Suppose we need to write a class C that needs ‘nonvirt’ to be virtual. Then ‘nonvirt’ in A will be changed to virtual. But this breaks the B::nonvirt trick. The requirement of class C to have a virtual routine forces a change in the base class. This has an effect on all other descendants of the base class, instead of the specific new requirement being localised to the new class. This is contrary to the reason OOP has loosely coupled classes: so that new requirements and modifications have localised effects, and do not require changes elsewhere that can potentially break other existing parts of the system.

Rumbaugh et al. put their criticism of C++’s virtual as follows: “C++ contains facilities for inheritance and run-time method resolution, but a C++ data structure is not automatically object-oriented. Method resolution and the ability to override an operation in a subclass are only available if the operation is declared virtual in the superclass. Thus, the need to override a method must be anticipated and written into the origin class definition. Unfortunately, the writer of a class may not expect the need to define specialized subclasses or may not know what operations will have to be redefined by a subclass. This means that the superclass often must be modified when a subclass is defined and places a serious restriction on the ability to reuse library classes by creating subclasses, especially if the source code library is not available. (Of course, you could declare all operations as virtual, at a slight cost in memory and function-calling overhead.)” [RBPEL91]

A further argument is that any statement should consistently have the same semantics. The object-oriented interpretation of a statement like a->f () is that the most suitable implementation of f() is invoked for the object referred to by ‘a’, whether the object is of type A, or a descendant of A. In C++, however, the programmer must know whether the function f() is defined virtual or non-virtual in order to interpret exactly what a->f () means. Therefore, the statement a->f () is not implementation independent. A change in the declaration of f () will change the semantics of the invocation. Implementation independence means that a change in the implementation DOES NOT change the semantics of executable statements.

If a change in the declaration changes the semantics, this should generate a compiler-detected error. The programmer should make the statement semantically consistent with the changed declaration. This reflects the dynamic nature of software development, where the program text is subject to perpetual change.

For yet another case of the inconsistent semantics of the statement a->f () vs constructors, consult section 10.9c, p 232 of the C++ ARM. [Sakkinen 92] points out that a descendant class can redefine a private virtual function even though it cannot access that function in other ways. When the ancestor class calls the function it instead invokes the function in the descendant class.

Option 2

The second option should be left open for the programmers of descendant classes. In C++, however, the decision must be made in the base class. In object-oriented design, the decisions you decide not to make are as important as the decisions you make. Decisions should be made as late as possible. This strategy prevents mistakes being built into the system at early stages. By making early decisions, you are often stuck with assumptions that later prove to be incorrect. C++ requires the parent class to specify potential polymorphism by virtual (although an intermediate class in the inheritance chain can introduce virtual). This prejudges that a routine might be redefined in descendants. This can be a problem because routines that aren’t actually polymorphic are accessed via the slightly less efficient virtual table technique instead of a straight procedure call. (This is never a large overhead but object-oriented programs tend to use more and smaller routines, making routine invocation a more significant overhead.) The policy in C++ should be that routines that might be redefined should be declared virtual.

Virtual, however, is the wrong mechanism for the programmer to deal with. A compilation system can detect polymorphism, and generate the underlying virtual code, where and only where necessary. Having to specify virtual burdens the programmer with another bookkeeping task. This is the main reason why C++ is a weak object-oriented language, as the programmer must constantly be concerned with low-level details. The compiler should take care of such detail and so relieve the programmer.

Another problem in C++ is mistaken redefinition. The base class routine can be
redefined unwittingly. The compiler should report an erroneous name redefinition within the same name space unless the descendant class programmer specifies that the routine redefinition is really intended. The same name can be used, but the programmer must be conscious of this, and state this explicitly, especially in environments where systems are assembled out of preexisting components. Unless the programmer explicitly overrides the original name, a syntax error should report that the name is a duplicate declaration. C++, however, adopted the original approach of Simula. This approach has been improved upon, and other languages have adopted better, more explicit approaches that avoid the error of mistaken redefinition.

Eiffel and Object Pascal cater for this situation as the descendant class programmer is required to specify that redefinition is intended. This has the extra benefit that a later reader or maintainer of the class can easily identify the routines that have been redefined, and that this definition is related to a definition in an ancestor class without having to refer to ancestor class definitions. Thus option 2 is exactly where it should be, in descendant classes.

Option 3

The pure virtual function caters for the third option. The routine is undefined, the class is abstract and cannot be directly instantiated. A descendant class must define the routine if it is to be instantiated. Any descendants that do not define the routine are also abstract classes. This concept is correct, but see the section on pure virtual functions for a criticism of the syntax.

Virtual is a difficult notion to grasp. The related concepts of polymorphism and dynamic binding, redefinition, and overloading are easier to grasp, being oriented towards the problem domain. Virtual routines are an implementation mechanism for polymorphism. Polymorphism is the ‘what’, and virtual is the ‘how’. Smalltalk and Objective-C use a different mechanism to implement polymorphism. Virtual is an example of where C++ obscures the concepts of OOP. The programmer has to come to terms with low level concepts, rather than the higher level object-oriented concepts. Interesting as underlying mechanisms might be for the theoretician or compiler implementer, the practitioner should not be required to understand or use them to make sense of the higher level concepts. Having to use them in practice is tedious and error-prone, and can prevent the adaptation of software to further advances in the underlying technology and execution mechanisms (see concurrency).

3.2. Pure Virtual Functions

As mentioned above, pure virtual functions provide a means of leaving a function undefined and abstract. A class that has such an abstract function cannot be directly instantiated. A non-abstract descendant class must define the function. The C++ pure virtual syntax is:

    virtual void fn () = 0;

This leaves the reader to guess its meaning, even those well versed in object-oriented concepts. A better choice would have been a keyword such as ‘abstract’. Direct expression of concepts enhances communication, and the ease with which a language can be learnt. When learning a language it is often important to use the index of a text book. A keyword like ‘abstract’ would be easily found in an index. But what do you look for in the case of ‘= 0’? You might not even realise it is significant. It should have syntactic significance as abstract functions are a very important concept in object-oriented design.

The C++ decision is in keeping with the C philosophy of avoiding keywords. This is often at the expense of clarity. A keyword would implement this concept more clearly. For example:

    pure virtual void fn ();

or

    abstract void fn ();

The mathematical notation used in C++ suggests that values other than zero could be used. What if the function is equated to 13?

    virtual void fn () = 13;

A function is either pure, or it is not. This to any analyst suggests a boolean state, which a single keyword conveys. A simple suggestion to fix this is to define ‘= 0’ as abstract:

    #define abstract = 0

then

    virtual void fn () abstract;

‘Pure virtual’ is also an abuse of natural language. It is a combination of words that are somewhat opposite in meaning. Pure means something that really is what it appears to be. For example pure gold. Virtual means something that appears to be what it actually is not. For example virtual memory. Perhaps virtual gold could be fool’s gold. As has been said before, virtual is a difficult concept to grasp. When it is combined with a word such as ‘pure’, the meaning becomes even more obscure. Modern language designers should be very careful in the vocabulary they choose.
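As a concrete illustration of the ‘= 0’ syntax discussed in this section, the following is a minimal sketch rather than anything taken from the ARM. It assumes a C++11 or later compiler (std::is_abstract and static_assert postdate this critique), and the names Shape, Square and area_of are hypothetical, chosen only for the example. It shows that a class with a pure virtual function is abstract, and that a descendant which defines the function is not.

```cpp
#include <type_traits>

// Hypothetical abstract base class: area() is pure virtual ("= 0"),
// so Shape itself cannot be instantiated.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;   // left undefined in Shape
};

// Hypothetical concrete descendant: it defines area(),
// so objects of type Square can be created.
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const { return side_ * side_; }
private:
    double side_;
};

// Accepts any descendant of Shape; the call to area() is bound at run time.
double area_of(const Shape& s) { return s.area(); }

// Compile-time checks of the two properties described above.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");
static_assert(!std::is_abstract<Square>::value, "Square is concrete");
```

Writing `Shape s;` in this sketch would be rejected by the compiler, whereas `area_of(Square(3.0))` is accepted and dispatches to Square's definition.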



3.3. The Nature of Inheritance

Inheritance is a close relationship. It provides a fundamental way to assemble software components. Objects that are instances of a class are also instances of all ancestors of that class. For effective object-oriented design the consistency of this relationship should be preserved. Each redefinition in a subclass should be checked for consistency with the original definition in an ancestor class. A subclass should preserve the requirements of an ancestor class. Requirements that cannot be preserved indicate a design error and perhaps inheritance is not appropriate.

Consistency due to inheritance is fundamental to object-oriented design. C++’s implementation of non-virtual overloading, and overloading by signature (see below) means that the compiler cannot check for this consistency. C++ does not realise this aspect of object-oriented design. This contributes to a wide and costly gap between analysis and design, and implementation.

Inheritance has been classified as ‘syntactic’ inheritance and ‘semantic’ inheritance. Saake et al describe these as follows: “Syntactic inheritance denotes inheritance of structure or method definitions and is therefore related to the reuse of code (and to overriding of code for inherited methods). Semantic inheritance denotes inheritance of object semantics, ie of objects themselves. This kind of inheritance is known from semantic data models, where it is used to model one object that appears in several roles in an application.” [SJE91]. Saake et al concentrate on the semantic form of inheritance. Behavioural or semantic inheritance expresses the role of an object within a system.

Wegner, however, believes code inheritance to be of more practical value. He classifies the difference between syntactic and semantic inheritance as code and behaviour hierarchies [Weg90] (p43). He suggests these are rarely compatible with each other and are often negatively correlated. Wegner also poses the question of “How should modification of inherited attributes be constrained?” Code inheritance provides a basis for modularisation. Behavioural inheritance provides modelling by the ‘is-a’ relationship. Both are useful in their place. Both require consistency checks that combinations due to inheritance actually make sense.

It seems that inheritance is most powerful in the most restrictive form of a semantics preserving relationship. A subclass should not break the assumptions of an ancestor class. Software components are like jig-saw pieces. When assembling a jig-saw the shape of the pieces must fit, but more importantly, the resulting picture must make sense. Assembling software components is more difficult. A jig-saw is reassembling a picture that was complete before. Assembling software components is building a system that has never existed before. Inheritance in C++ is like a jig-saw where the pieces fit together, but the compiler has no way of checking that the resultant picture makes sense. In other words C++ has provided the syntax for classes and inheritance but not the semantics. Certainly, not very many reusable C++ libraries are available, which suggests that C++ might not support reusability as well as possible. C++ fails to provide this fundamental goal of object-oriented design and programming.

3.4. Function Overloading

C++ allows functions to be overloaded if the arguments in the signature are of different types. Such overloading can be useful as these examples show:

    max (int, int);
    max (real, real);

This will ensure that the best max routine for the types int and real will be invoked. Object-oriented programming, however, provides a variant on this. Since the object is passed to the routine as a hidden parameter (‘this’ in C++), an equivalent but more restricted form is already implicitly included in object-oriented concepts. A simple example such as the above would be expressed as:

    int i, j;
    real r, s;

    i.max (j);
    r.max (s);

but i.max (r) and r.max (j) result in compilation errors because the types of the arguments do not agree. (By operator overloading of course, these can be better expressed, i max j and r max s, but min and max are peculiar functions that might want to accept two or more parameters of the same type.)

The above shows that in most cases, the object-oriented paradigm can consistently express function overloading, without the need for the function overloading of C++. C++, however, does make the notion more general. The advantage is that more than one parameter can overload a function, not just the implicit current object parameter.

The disadvantage is that C++ introduces some inconsistencies that the compiler cannot detect. If the programmer intends to redefine a virtual routine, but makes a mistake in the declaration of the function signature, the compiler will erroneously assume an overloaded function. Any calls to the function using one or other of the signatures will also fail to detect the inconsistency.

When calling the routine, if the programmer makes a mistake in supplying the actual
parameters, a C++ compiler cannot be specific about the error. It can only report that no function with a matching signature could be found. Programmers make this sort of mistake for subtle reasons, and it can be time consuming to pinpoint the parameter at fault. Secondly, the incorrect parameter might accidentally match one of the other routines. In that case this error will be propagated into the production code, and could remain undetected a long time.

If it is felt that C++’s scheme of having parameters of different types is useful, it should be realised that object-oriented programming provides this in a more restricted and disciplined form. This is done by specifying that the parameter needs to conform to a base class. Any parameter passed to the routine can only be a type of the base class, or a subclass of the base class. For example:

    A.f (B someB) {...};

    class B ...;
    class D : public B ...

    A a;
    D d;

    a.f (d);

The entity ‘d’ must conform to the class ‘B’, and the compiler checks this.

The alternative to function overloading by signature is to require functions with different signatures to have different names. Names should be the basis of distinction of entities. This is known to work and solves the above problems. The compiler can cross check that the parameters supplied are correct for the given routine name. This also results in better self-documented software. It is often difficult to choose appropriate names for entities, but it is well worth the effort.

3.5. Virtual Classes

If class D multiply inherits class A via classes B and C, then if D wants to inherit only a single copy of A, the inheritance of A must be specified as virtual in both B and C. This raises two questions. Firstly, what happens if A is declared virtual in only one of B or C? Secondly, what if another class E wants to inherit multiple copies of A via B and C? In C++, the virtual class decision must be made early, reducing the flexibility that might be required in the assembly of derived classes. In a shared software environment different vendors might supply classes B and C. It should be left to the implementer of class D or E, exactly how to resolve this problem. And this is the simplest case. What if A is inherited via more than two paths, with more than two levels of inheritance? Such flexibility is key to reusable software. You cannot envisage when designing a base class all the possible uses in derived classes, and attempting to do so considerably complicates design.

3.6. Name overloading

Naming is fundamentally important in producing self-documenting software. Naming helps realise maintainable and reusable software components. Names are fundamental in freeing programmers from low level manipulation of addresses. Naming is the basis for differentiating between different entities in a software module. Name overloading allows the same name to refer to two or more different entities. The problem is whether the resultant ambiguity is useful, and how to resolve it, as ambiguity weakens the power of names to distinguish entities.

Name overloading is useful for two purposes. Firstly it allows programmers to work on two or more modules without concern about name clashes. The ambiguity can be tolerated as within the context of each module, the name unambiguously refers to a unique entity. Secondly, name overloading provides polymorphism, where the same name applied to different types refers to different implementations for those types. Polymorphism allows one word to describe ‘what’ is to be computed. Different classes might require different specifications of ‘how’ a computation is done. For example ‘draw’ is an operation that is applicable to all different shapes, even though circles and squares, etc are ‘drawn’ differently.

These two uses of name overloading provide a powerful concept. But use of the same name in the same context must be resolved. Errors can result from ambiguity. In this case the programmer needs to differentiate between entities in ways other than name alone. A common way to do this is to introduce extra distinguishing names. For example in a group of people, where two or more share the same first name, they can be distinguished by their surname. Similarly a unique first name will distinguish the members of a family with a common surname.

This is analogous to classes, where each class in a system is given a unique name. Each member within a class is also given a unique name. Where two objects with members of the same name are used within the same context, the object name can qualify the members. For example a.mem and b.mem.

[Reade 89] points out the difference between overloading and polymorphism. Overloading means the use of the same name in the same context for different entities with completely different definitions and types. Polymorphism though has one definition, and all types are subtypes of a principal type. C. Strachey referred to polymorphism as parametric polymorphism and overloading as ad hoc polymorphism.

Block structured languages provide overloading by scoping. Scoping allows the same
name to be used in different contexts without clash or confusion. Nested blocks provide a subtle problem. Names in an outer block are in scope in inner blocks. Many languages, however, allow a name to be overloaded in an inner block. This does more than overload the name, it hides it. The use of a name in the inner block does not indicate any relationship with the same name in the outer block. Textually nested blocks ‘inherit’ named entities from outer blocks. Inheritance accomplishes this in object-oriented languages. Inheritance eliminates the need to textually nest entities, and also accomplishes loose coupling. Nesting makes entities tightly coupled.

Contrary to most high level languages, a name should not be overloaded while it is in scope. This inconveniently hides the outer declaration, and the programmer cannot access the outer entity. It is also error prone. The following example illustrates this:

    {
        int i;
        {
            int i;    // hide the outer i.
            i = 13;   // assign to the inner i.
                      // Can't get to the outer i here.
                      // It is in scope, but hidden.
        }
    }

Now delete the inner declaration:

    {
        int i;
        {
            i = 13;   // Syntactically valid,
                      // but not the intention.
        }
    }

The inner overloaded declaration is removed, and references to that name do not result in syntax errors due to the same name being in the outer environment. The inner instruction now mistakenly changes the value of the outer entity. A compiler cannot detect this situation unless the language definition forbids nested redeclarations. E.W. Dijkstra uses similar reasoning in ‘An essay on the Notion: “The Scope of Variables”’ in “A Discipline of Programming”, [Dijkstra 76].

The above example demonstrates how nesting results in unmaintainable programs. This is because the inner block is tightly coupled to the outer block, and each is sensitive to changes in the other. The advantage of keeping components decoupled and separate is that a programmer can confidently make modifications to one component without affecting other components. Testing can be limited to the changed component, rather than a combination of components, which quickly leads to an exponentiation in the number of tests required.

C++ has an analogous form of hiding. A non-virtual function in a derived class hides a function in an ancestor class. This hiding is explained in section 13.1 of the C++ ARM. This is a discrepancy with declaring multiple functions with the same name in the same class with different signatures. A function in the derived class will hide the functions of the ancestor class, rather than add its signature to the list of possible functions which can be called. This is confusing and error prone. Learning all these ins and outs of the language is extremely burdensome to the programmer. Often they will only be learnt after falling into a trap.

3.7. Polymorphism and Inheritance

Inheritance provides a form of name overloading similar to overloading in subblocks. The scope of a name is the class in which it occurs. If a name occurs twice in a class, it is a syntax error. Inheritance introduces some questions over and above this simple consideration of scope. Should a name declared in a base class be in scope in a derived class? There are three choices:

1) Names are in scope only in the immediate class but not in subclasses. Subclasses can freely reuse names because there is no potential for a clash. This precludes software reusability. Subclasses will not inherit definitions of implementation. Therefore case 1 is not worth considering.

2) The name is in scope in a subclass, but the name can be overloaded without restriction. This is closest to the overloading of names in nested blocks. This is C++’s approach. Two problems arise. Firstly, the name can be unintentionally reused. Secondly, because the new entity is not assumed to have any relationship to the original, its signature cannot be type checked with the original entity. Since consistency checks between the superclass and subclass are not possible, the tight relationship that inheritance implies, which is fundamental to object-oriented design, is not guaranteed. This can lead to inconsistencies between the abstract definition of a base class, and the implementation of a derived class. If the derived class does not conform to the base class in this way, it should be questioned why the derived class is inheriting from the base class in the first place. (See the nature of inheritance.)

3) The name is in scope in the subclass, but can only be overloaded in a disciplined way to provide a specialisation of the original. Other uses of the name are reported as duplicate name errors. This form of overloading in a subclass ensures the entity referred to in the subclass is closely related to the entity in the ancestor class. This helps ensure design consistency. The relationship of



name scope is not symmetric. Names in a subclass are not in scope in a superclass (although this is not the case in typeless languages such as Smalltalk). In order to provide the consistent customisation of reusable software components, the same name should only be used by explicitly redefining the original entity. The programmer of the descendant class should indicate that this is not a syntax error due to a duplicate name, but that redefinition is intended. (This has already been covered in the virtual section.) This choice ensures that the resultant class is logically constructed. This might seem restrictive, but is analogous to strong typing, and makes inheritance a much more powerful concept.

3.8. ‘.’ and ‘->’

The ‘.’ and ‘->’ member access syntax came from C structures. It illustrates where the C base adversely affects flexibility. Semantically both access a member of an object. They are, however, operationally defined in terms of how they work. The dot (‘.’) syntax accesses a member in an object directly. For example ‘x.y’ means access the member y in the object x.

    obj x;     // declare object x of class obj
               // with a member y.
    x.y;       // access y in object x directly
    x->y;      // syntax error ". expected"

The ‘->’ syntax means access a member in an object referenced by a pointer. For example ‘x->y’ (or the equivalent (*x).y) means access the member y in the object that pointer x refers to.

    obj *x;    // declare a pointer x to an
               // object of class obj.
    x->y;      // access y via pointer x
    x.y;       // syntax error "-> expected"

In this example, ‘what’ is to be computed is “access the element y of object x.” In C++, however, the programmer must specify for every access the trivial detail of ‘how’ this is done. The compiler can easily remove this burden from the programmer, as in fact most languages do. Furthermore, this reduces flexibility: if the ‘obj x’ declaration is changed to ‘obj *x’, the effect is widespread as all ‘x.y’ must be changed to ‘x->y’. Since the compiler gives a syntax error if the wrong access is used, this shows it already knows what access code is required and can generate it automatically. Good programming centralises decisions. The decision to access the object directly or via a pointer should be centralised in the declaration.

3.9. Anonymous parameters in Class Definitions

C++ does not require parameters in function templates to be named. The type alone can be specified. For example a function f in a class header can be declared as f (int, int, char). This gives the client no clue to the purpose of the parameters, without referring to the implementation of the function. Meaningful identifiers are essential in this situation, because this is the abstract definition of a routine. A client of the class and routine must know that the first int represents a ‘count of apples’, etc. It is true that well known routines might not require a name, for example sqrt (int). But this is not appropriate for large scale software development. The use of anonymous parameters handicaps the purpose of abstract descriptions of classes and members: to facilitate the reusability of software. Program text captures the meaning of the system for some future activity, such as extension or maintenance. To achieve reusability, communication of intent of a software element is essential. A compiler strips away this level of communication, producing a machine executable entity. Languages and compilers that perform less than optimal translations should not penalise careful production of semantic entities. But neither should a language definition allow less than optimal expression to the human reader. Languages do not have to be cryptic to achieve efficiency. In fact cryptic languages impair efficiency, as they make it harder for the programmer to develop efficient systems, and furthermore, they make it harder for automatic code optimisers.

Names are not strictly necessary in programming. Naming exists to help the human reader identify different entities within the program, and to reason about their function. For this reason naming is essential. Without naming, development of sophisticated systems would be nearly impossible. Some languages access parameters by their address (position) in the parameter list ($1, $2, etc). This is quite unsatisfactory, even for shell scripts. Anonymous parameters can save typing in a function template, but then programming is not a matter of convenience. This is inconvenient for later readers. The redundancy is beneficial and saves later programmers having to look up the information in another place. A real convenience in function templates would be that abstract function templates be automatically generated from the implementation text (see header files for more details).

Anonymous parameters illustrate the link between courtesy and safety issues in programming. Due to pressure of work, a client programmer might wrongly guess the purpose of a parameter from the type. Thus the failure of the original programmer to provide a courtesy has



caused a later programmer to breach safety. An interface client must know the intention of the interface for it to be used effectively.

3.10. Nameless Constructors

Multiple constructors can have different signatures, similar to overloaded functions. This precludes two or more constructors having the same signature. Constructors are also not named (apart from the same name as the class). This makes it difficult to discern from the class header the purpose of the different constructors. It is difficult to match an object creation with the called constructor. Constructors suffer from all of the problems described with regards to functions with the same name but different signatures. It would be easy to mark routines as constructors, for example:

    constructor make (...)...
    constructor clone (...)...
    constructor initialise (...)...

where each constructor leaves the object in valid, but potentially different states. Named constructors would aid comprehension as to what the constructor is used for in the same way as function names document the purpose of a function. Secondly, named constructors would allow multiple constructors with the same signature. Thirdly, it is easier to match up an object creation with the constructor actually called.

3.11. Constructors and Temporaries

A ‘return <expression>’ can result in a different value than the result of <expression>. In section 6.6.3, the C++ ARM says “If required the expression is converted, as in an initialisation, to the return type of the function in which it appears. This may involve the construction and copy of a temporary object (§12.2).”

Section 12.2 explains “In some circumstances it may be necessary or convenient for the compiler to generate a temporary object. Such introduction of temporaries is implementation dependent. When a compiler introduces a temporary object of a class that has a constructor it must ensure that a constructor is called for the temporary object.”

A note says “The implementation’s use of temporaries can be observed, therefore, through the side effects produced by constructors and destructors.”

Putting this together, creation of a temporary is implementation dependent, so might or might not be done. If a temporary is created, a constructor is called as a side effect, which can change the state of the object. Different C++ implementations could therefore return different results for the same code.

3.12. Optional Parameters

Optional parameters that assume a default value according to the routine’s declaration are supposed to provide a shorthand notation. Shorthand notations are intended to speed up software development. Such shorthand notations can be convenient in shell scripts, and interactive systems. In large scale software production, however, precision is mandatory, and defaults can lead to ambiguities and mistakes. With optional parameters the programmer could assume the wrong default for a parameter. More importantly, optional parameters undermine type safety. The type of a function is defined by the composition of its input types, and its output type:

    f: T1 x T2 x T3... -> T4

The entire signature determines the type of the function, not just the return type. Optional parameters mean that C++ is not type safe, and that the compiler cannot check that the parameters in the call exactly match the function signature.

Furthermore, they do not provide a great deal of convenience. If a routine has five parameters, the last three of which are optional, and the caller wants to assume the defaults for parameters 3 and 4, but must specify parameter 5, then all five parameters must be specified. A better scheme would be to have a ‘default’ keyword in function calls:

    f (a, b, default, default, e);

Other means, already in the language, can easily provide this mechanism. For example, a call to another (possibly inline) function could provide the defaults for the optional parameters. This not only provides the convenience of optional parameters, but is more powerful. Any parameter or combination can be filled in with any combination of defaults, not just the last parameters. Multiple intermediate routines can provide multiple sets of defaults.

3.13. Bad deletions

The following example is given on p.63 in the C++ ARM as a warning about bad deletions that cannot be caught at compile-time, and probably not immediately at run-time:

    p = new int[10];
    p++;
    delete p;   // error
    p = 0;
    delete p;   // ok

One of the restrictions of the design of C++ is that it must remain compatible with C. This results in examples like the above, that are ill-defined language constructs, that can only be covered by warnings of potential disaster. Removal of such language deficiencies would result in loss of compatibility with C. This might



be a good thing if problems such as the above disappear. But then the
resultant language might be so far removed from C that C might be best
abandoned altogether.

3.14. Local entity declarations
Declaring an entity close to where it is used has both advantages and
disadvantages. It is convenient, but can make a routine appear more
complex and cluttered. A problem is that an identifier can be mistakenly
overloaded within a nested block in a function, with the resultant
problems covered in the sections on name overloading and nesting. C does
not have nested routines or blocks so does not have this problem. ALGOL
uses this simple form of name overloading. (A block in the ALGOL sense
contains both declarations and instructions.)

The ARM explains problems of local declarations with branching, which
shows the complications in intermingling declarations and instructions.
Caveats cannot make up for or fix faulty language definition.

The C++ FAQ [Cline] (Q83) is unclear on this point (although it is mostly
excellent), claiming that an object is created and initialised at the
moment it is declared. This only applies to auto, i.e. stack, objects.
Dynamic entities are not created and initialised until they are the
subject of a ‘new’ instruction. In well written object-oriented software,
routines will be small, typically performing one atomic action per
routine.

Small routines that implement atomic operations are fundamental to loose
coupling. For example, a base class that provides a single routine that
logically performs operations A and B is not useful to a subclass that
needs to provide its own implementation of B, but does not want to change
A. The descendant must reimplement the logic of both A and B, missing an
opportunity to reuse the logic of A. Tight coupling reduces flexibility.
Splitting A and B into different routines accomplishes loose coupling,
and therefore flexibility. Efficiency is also attained without the mess
of local entity declarations. Good design and clean modularisation
achieve efficiency, as the entities which would be locals to a block in
C++ are only created when the routine is entered.

3.15. Members
Care should be taken with the C++ use of the term member. In general use,
an object is a member of a class. This corresponds to members in set
theory. But in C++, the term member means a data item, or function of the
class. This ambiguity could have easily been avoided.

3.16. Friends
Friends are a mechanism to override data hiding. Friends of a class have
access to its private data. Friend is a ‘limited export’ mechanism.
Friends have three problems:

1) They can change the internal state of objects from outside the
definition of the class.
2) They introduce extra coupling between components, and therefore should
be used sparingly.
3) They have access to everything, rather than being restricted to the
members of interest to them.

Friends are useful, and a case can be made for shades of grey between
public, protected and private members. Multiple interfaces to a class
provide the functionality of friends and avoid the above problems. Each
interface to the class can be exported to everything, or selected classes
only. A selective export mechanism is more general than public, private,
protected and friend, and explicitly documents the couplings between
entities in the system. Selective export specifies not only that a member
is exported but to which classes it is exported.

One reason given for friends is that they allow more efficient access to
data members than a function call. The way C++ is often used is that data
members are not put in the public section, because this breaks the data
hiding principle.

Data hiding is better described as ‘implementation hiding’. Only a
class’s abstract functional interface should be visible to the outside
world. That is, data members can be exported, but are viewed externally
as functional entities. This is because, when used in expressions,
functions and variables have no semantic difference. They both return
values of a given type. (See fn () for an explanation of why variables
and functions are best regarded as similar entities.) (See also Marshall
Cline’s explanation of friends in the FAQ for further clarification of
the friend concept.)

The Cambridge Encyclopedia of Language has an interesting point about
public and private names. It says “Many primitive people do not like to
hear their name used, especially in unfavourable circumstances, for they
believe that the whole of their being resides in it, and they may thereby
fall under the influence of others. The danger is even greater in tribes
(in Australia and New Zealand, for example), where people are given two
names - a ‘public’ name, for general use, and a ‘secret’ name, which is
only known by God, or to the closest members of their group. To get to
know a secret name is to have total power over its owner.”

3.17. Static
The word ‘static’ is confusing in C++. Page 98 of the C++ Annotated
Reference Manual (ARM) mentions this confusion and gives two meanings.
Firstly, a class can have static
members, and a function can have static entities. The second meaning
comes from C, where a static entity is local in scope to the current
file. The choice of different keywords would easily solve this trivial
problem. There is also a third more general meaning that objects are
statically or automatically allocated and deallocated on the stack when a
block is entered and exited, as opposed to dynamically allocated in free
space.

Static class members are useful. Page 181 of the ARM states that statics
reduce the need for global variables. It is good to reduce global
variables, but the C syntax obscures the purpose.

Entities declared in functions can also be static. These are not needed
in an object-oriented language. The reason and history is this. ALGOL has
the notion of ‘OWN’ locals in blocks. The semantics of an OWN entity is
that when a block is exited, the value of the OWN is preserved for the
next entry to the block. I.e. the value is persistent. The implementation
is that at compile time, the OWN entity is limited in scope to the block,
but at run time, it is located in the global stack frame. The same
instance of the variable is used in all invocations of the procedure,
rather than each invocation using separate local storage on the stack.
This causes complication in recursion.

Simula’s designers generalised the ALGOL notion of block into class, and
so object-orientation was born. Instead of discarding a class block on
exit, it is made ‘persistent’. Declarations within the class block are
persistent, and therefore provide the functionality of static and OWN.
Classes are more flexible than statics. Statics are persistent in the
same way as globals, i.e. for the duration of the program. Class member
lifetime is governed by the lifetime of the object. Object-oriented
languages do not need OWNs or statics.

3.18. Union
Union is another construct that is superfluous in OOP. Similar constructs
in other languages are recognised as problematic. For example, FORTRAN’s
equivalences, COBOL’s REDEFINES, and Pascal’s variant records. When used
to overload memory space these force the programmer to think about memory
allocation. Recursive languages use a stack mechanism that makes
overloading memory space unnecessary, as it is allocated and deallocated
automatically for locals when procedures are entered and exited. The
compiler and run time system automatically allocate and deallocate
storage as required, ensuring that two pieces of data never clash for the
same memory space. This is essential so that the programmer can
concentrate on the problem domain, rather than machine oriented details.
When union is used similarly to FORTRAN’s equivalences it is not needed.

Union is also not needed to provide the equivalent to COBOL REDEFINES or
Pascal’s variants. Inheritance and polymorphism provide this in OOP. A
reference to a superclass can also be used to refer to any subclass, and
thus provides the same semantics as union, only in a type safe manner, as
the alternatives can never be confused. An object reference is implicitly
a union of all subclasses.

3.19. Nested Classes
Simula provided textually nested classes similar to nested procedures in
ALGOL. Textual (syntactic) nesting should not be confused with semantic
nesting, nor static modelling with dynamic run time nesting. Modelling is
done in the semantic domain, and should be divorced from syntax. You do
not need textually nested classes to have nested objects. Nested classes
are contrary to good object-oriented design, and the free spirit of
object-oriented decomposition, where classes should be loosely coupled,
to support software reusability. Semantic nesting is achieved
independently of textual nesting. In object-oriented design all objects
should interact only via well defined interfaces. Objects of a class that
is textually nested in another class have access to the outer object
without the benefit of a clean interface. C avoided the complexity of
nested functions, but C++ has chosen to implement this complexity for
classes, which is of less use than nested functions.

OOP achieves nesting in two ways: by inheritance and object-oriented
composition. Thus modelling nesting is achieved without tight textual
coupling. For example, consider a car. We know in the real world that the
engine is embedded within the car. In object-oriented modelling, however,
this embedding is modelled without textual nesting. Both car and engine
are separate classes. The car contains a reference to an engine object.
This also allows the vehicle and engine hierarchy to be independently
defined. Engine is derived independently into petrol, diesel, and
electric engines. This is simpler and more flexible than having to define
a petrol engine car, a diesel engine car, etc, which you have to do if
you textually nest the engine class in the car. Other examples can also
be structured without textual nesting, and no loss of generality.

In C++, not only can classes be nested within other classes, but also
within functions, thereby tightly coupling a class to a function. This
confuses class definition with object declaration. The class is the
fundamental structure in object-oriented programming and nothing has
existence separate from class (including globals). C++ is confused as to
whether it is procedure-oriented or object-oriented.

3.20. Global Environments
The global environment provides a special case of nested classes. When
classes are nested in a global environment, dependencies can arise that
make the classes difficult to decouple from that environment, and
therefore not reusable. Even if a class is not intended for use in
another context, it will benefit from the discipline of object-oriented
design. Each class is designed independently of the surrounding
environment, and relationships and dependencies between classes are
explicitly stated.

In C++ functions can change the global environment, beyond the object in
which they are encapsulated. Such changes are side-effects that limit the
opportunity to produce loosely-coupled objects, which is essential to
enable reusable software. This is a drawback of both global and nested
environments.

A good OO language will only permit routines in an object to change its
state. Removing the global environment is trivial. It is simply
encapsulated in an object or set of objects of its own. Therefore global
entities are subject to the discipline of object-oriented design. Having
globals in a system circumvents OOD. Objects can also provide a clean
interface to the external environment, or operating system, without loss
of generality, for a negligible performance penalty. Thus classes are
independent of a surrounding environment, and the project for which they
were first developed, and are more easily adaptable to new environments
and projects.

3.21. Header Files
In C++ a class interface must be maintained separately from its body.
While an abstract interface should be distinct from a concrete
implementation, the interface and implementation can both be derived from
one source. In C++ though, programmers must maintain the two sets of
information. Replicated information has well known drawbacks. In the
event of change, both copies must be updated. This can lead to
inconsistencies that must be detected and corrected. Tools can
automatically extract abstract class descriptions from class
implementations, and guarantee consistency.

The programmer must also use #includes to manually import class headers.
#include is an old and unsophisticated mechanism to provide modularity.
#include is a weak form of inheritance and import. C++ still uses this 30
year old technique for modularisation, while other languages have adopted
more sophisticated approaches, for example, Pascal with Units, Modula
with modules, Ada with packages. In Eiffel the unit of modularisation is
the class itself, and includes are handled automatically. The OOP class
is a more sophisticated way to modularise programs. Inheritance
implements reusability and modularisation, so #include is superfluous.

Another problem is that if header A includes header B, and header B
includes header A, a circular dependency occurs. The same problem occurs
if header A includes headers B and C, and header B also includes header
C. A simple but messy fix in all headers solves this problem:

    #ifndef thismod
    #define thismod
    ... rest of header
    #endif

Headers show how C++ addresses the problem of independent modules by a
non-object-oriented approach that is sub-optimal; the programmer must
supply this bookkeeping information manually. A class interface is
equivalent to a module header. A module header contains data and routines
exported to other modules. This is exactly the purpose of the class
interface. A class definition contains all knowledge of component classes
and their dependencies (inheritance and client) in the class text.
Dependency analysis is derivable from the class text. Tools like ‘make’
can be integrated into the compiler itself, and the errors and tedium
encountered in the use of ‘make’ are avoided. #includes relate to the
organisation and administration of a project. Rational language design
eliminates such bookkeeping mechanisms.

A traditional system is assembled by combining modules. An
object-oriented system is assembled by combining classes. Modules are a
primitive form of classes. Classes are more sophisticated. They express
more precisely relationships with other classes. C++ #includes and
modules have problems. This primitive method is not required in an
object-oriented language.

3.22. Class Interfaces
Section 9.1c of the C++ ARM points out that C++ has no direct support for
“interface definition” and “implementation module”. In a C++ class
definition, all private and protected members must be included in the
public text of the class. The ARM points out that whenever the private or
protected parts are changed, the whole program must be recompiled.
Further to what the ARM says, all modules that are dependent on the
header file must be recompiled, even though the private and protected
members do not affect other modules. Private members should not be in the
abstract class interface, as this exposes implementation details to
programmers of other modules.

3.23. Class header declarations
C’s syntax for function declarations is
[<type>] <identifier> (<parameters>). For (a very simple) example:

    class C
    {
        a ();
        b ();
        int c ();
        d ();
        char e ();
        virtual void f ();
    }

To find an identifier in this layout, the eye must trace a course around
the type specifications. This is a tiring activity. The eye has a greater
chance of missing the sought identifier, and the programmer must resort
to using the search function of a text editor to help out.

Other languages place the entity names first. For example:

    class C
    {
        a ();
        b ();
        c () int;
        d ();
        e () char;
        f () virtual void;
    }

To those used to the ALGOL and FORTRAN style of type first, this seems
backwards. But name first is logical as a real world example illustrates.
Imagine if a dictionary is published, and the keywords are not placed
first, but rather the entry order is -

    noun /obvrzen/ obversion, the act or result of obverting

Such a dictionary would not sell many copies, unless the marketers
managed to fool many people that the explanation of the meaning was more
correct because the order of layout was mysteriously magical. This
example illustrates how important subtle syntax decisions are, and why
PASCAL style languages might have ordered things contrary to FORTRAN,
ALGOL and others. The language designer must consider these trivial but
important alternatives. The layout of programming entities is essential
for effective communication. The dual roles of language syntax, and
programming style affect comprehension. A dictionary or index style
layout suggests placing entity names first, followed by their definition.

3.24. Garbage Collection
One of the hallmarks of high level languages is that programmers declare
data without regard to how the data is allocated in memory. In block
structured languages, local variables are allocated on the stack, and
automatically deallocated when the block exits. This relieves the
programmer of a great burden. Garbage collection provides equivalent
relief in languages with dynamic entity allocation.

In C++ the programmer must manually manage storage due to the lack of
garbage collection. This is a difficult bookkeeping task that leads to
two opposite problems. Firstly, an object can be deallocated prematurely,
while valid references still exist (dangling pointers). Secondly, dead
objects might not be deallocated, leading to memory filling up with dead
objects (memory leaks). Attempts to correct either problem can lead to
overcompensation and the other problem occurring. A correct system is a
fine balance. This is illustrated in the figure below.

[Figure: Dangling Pointers | Correct System | Memory Leaks]

These problems contribute to the fragility of C++ programs, and usually
result in system failure. Garbage-collection solves both problems.
Garbage-collection has an undeserved bad reputation due to some early
garbage-collectors having performance problems, instead of working
transparently in the background, as they can and should. These problems
are often over-emphasised as a justification for C++ ignoring garbage
collection. A possible solution is to build garbage collection into the
run time architecture, but allow the programmer to activate and
deactivate it manually. Garbage collection can be disabled in systems
where it is inappropriate.

In C++ it might be argued that the lack of garbage-collection is not an
engineering compromise. Its inclusion is nearly an engineering
impossibility, as a programmer can undermine the structures required for
implementing correctly working garbage-collection. While
garbage-collection might not actually be an impossibility in C++ (EC++),
it is difficult, and programmers would have to settle for a more
restricted way of programming. This could be a good thing. But then the
compromise to remain compatible with C becomes difficult, if the compiler
is to detect practices inconsistent with the operation of
garbage-collection.

3.25. Type-safe linkage
The C++ ARM explains that type-safe linkage is not 100% type safe. If it
is not 100% type-safe, then it is unsafe. It is the subtle errors that
cause the most problems, not the simple or obvious ones. Often such
errors remain undetected in the system until critical moments. The
seriousness of this situation cannot be overestimated. Many forms of
transport, such as planes, and space programs depend on software to
provide safety in their operation. The financial survival of
organisations can also depend on software. To accept such unsafe
situations is at best irresponsible.

The C++ ARM summarises the situation as follows - “Handling all
inconsistencies - thus making a C++ implementation 100% type-safe - would
require either linker support or a mechanism (an environment) allowing
the compiler access to information from separate compilations.”

So why does the C++ compiler (at least AT&T’s) not provide for accessing
information from separate compilations? Why is there not a specialised
linker for C++, that actually provides 100% type safety? There is no
reason why C++ should not be implemented this way. Building systems out
of preexisting elements is the common Unix style of software production.
This implements a form of reusability, but not in the truly flexible
manner of object-oriented reusability.

In the future, Unix could be replaced by object-oriented operating
systems, that are indeed ‘open’ to be tailored to best suit the purpose
at hand. By the use of pipes and flags, Unix software elements can be
reused to provide functionality that approximates what is desired. This
approach is valid and works with efficacy in some instances, like small
in-house applications, or perhaps for research prototyping, but is
unacceptable for widespread and expensive software, or safety critical
applications. In the last ten years the advantages of integrated software
have been acknowledged. Classic Unix systems don’t provide those
advantages. Integrated systems are more ambitious, and place more demands
on developers. But this is the sort of software now being demanded by end
users. Systems that are cobbled together are unacceptable.

A further problem with linking is that different compilation and linking
systems should use different name encoding schemes. This problem is
related to type-safe linkage, but is covered in the section on
‘reusability and compatibility’.

3.26. C++ and the software lifecycle
The software lifecycle has attracted a great deal of attention. It is at
least generally accepted that the activities in the lifecycle are
analysis of requirements, design, implementation, testing and error
correction, extension. Unfortunately, identifying these activities has
resulted in a school of thought that the boundaries between these
activities are fixed, and that they should be systematically separate,
each being completed before the next is commenced. It is often argued
that if they are not cleanly separated, then you are not practicing
disciplined system development. This view is incorrect. Someone who
writes a program straight away is actually doing all the steps in
parallel. It might not be the best way to do things in many
circumstances, might or might not suit the style and thinking of
different people, but this works in some scenarios, and can be the
methodology of choice of disciplined thinkers. Some people can hold a
whole problem and solution in their head and work in a disciplined
fashion until the solution is complete. Mozart is said to have composed
this way, producing his last three symphonies in as many months in 1788.
Beethoven toiled far more over the production of his works, taking years
to complete one symphony. Both composers produced masterpieces. Mozart
wrote music directly, whereas Beethoven wrote themes and ideas in his
famous sketchbooks. The production of masterpieces depends on skill, not
on methodologies.

It is becoming accepted that the software lifecycle should be an
integrated process. Analysis, design and implementation should be a
seamless continuum. The activities of the lifecycle should progress in
parallel to expedite software development. Facts found out only as late
as the implementation stage can be fed back into the analysis and design
stages. The object-oriented approach supports this process. Artificial
separation of the steps leads to a large semantic gap between the steps.
The transformations required to bridge such semantic gaps are prone to
misinterpretation, time consuming and costly.

The same people should be responsible for all stages. This way they take
responsibility for the system as a whole, rather than passing the buck
and blame which occurs when analysts, designers and implementers are
different groups. This is not a popular viewpoint in traditional
hierarchical management structures where programmers get promoted to
designers who get promoted to analysts. Hierarchical management also
discourages people from feeling responsible for a product. This culture
must radically change if we are to produce quality systems.

We should have learnt from the extremes of SA/SD. Some quarters believed
that methodology was all important, while programming and programming
languages were unimportant. Arcane and machine-oriented programming
languages strengthened this attitude. These languages concentrate on the
‘how’ of computation, whereas the modellers correctly demand notations
that express the ‘what’, in order to be implementation independent. A
modern software language supports the integration of the activities of
design and implementation by being readable, and problem-oriented. A
language should be as close to design as possible. The needs and
requirements of an enterprise can change much more rapidly than
programmers can keep up, especially in a highly competitive and
commercial world.

So how does C++ fit into this picture? Well it is based on C that was
designed mainly as an implementation and machine-oriented language. It is
an old language, that did not need to consider the integrated lifecycle
approach. C++ might have
some of the trappings of object-oriented concepts, but it is the marriage
of a problem-oriented technique with a machine-oriented language. It
addresses implementation, but not so well the other aspects of the
software lifecycle. Since C++ is not so well integrated with analysis and
design, the transformation required to go from analysis and design to
implementation is costly. The semantic gap between design languages and
the implementation language is great.

We should have learnt from the structured world that this is the
incorrect approach to the software lifecycle. But in the OO world we are
again falling into the trap of dividing the lifecycle into artificially
distinct activities of OOA, OOD and OOP, instead of adopting an
integrated approach to these. Modern languages provide a much more
integrated approach to the complete software development process than
C++. C++ supports classes and inheritance and other concepts of
object-orientation, but fails to address the entire software lifecycle.

3.27. Reusability and Communication
Reusability is a matter of communication.

In order to use a software component, you must be able to understand it.
The writer must communicate the purpose, intent, and correct usage of the
component to the client. In the object-oriented world, clear and concise
definition of software modules is not a mere nicety, but essential for
reusability. Arising out of the issue of reusability is extendibility. In
order to maximise the reuse of software, it often must be tailored for
new applications. The client programmer must decide whether the software
component is suitable for the new task. If so, what is the best way to
extend it? Clear communication to clients is a courtesy concern.

3.28. Reusability and Trust
Reusability is a matter of trust.

Trust results from confidence that safety concerns have been met. If you
do not have confidence in a software component, then it is difficult to
consider it for reuse. There could be doubt that the software component
provides enough functionality, or correct functionality. There could be
doubt that the component is efficient enough, or worse it might crash.
The C/C++ philosophy of not building checks into the language and
compiler because programmers can be trusted, works against trust and
reusability.

In the real world of reusability, the ideal of trusting programmers is
inappropriate. Trusting programmers results in less trustworthy software.
In reality, customers doubt the claims of suppliers. It is the onus of
the supplier to prove their claims, and thus trustworthiness of the
software. The client is not required to trust the supplier’s programmers.
Potential clients of a software component require assurance that the
component is trustworthy. Trusting programmers is against the commercial
interest of both parties. This is not to cast aspersions on programmers,
but merely recognises that computers are good at performing mundane tasks
and checks, but people are not. If people were good at such things, we
would not need computers in the first place. Building trustworthy
components is a safety concern.

3.29. Reusability and Compatibility
Different compiler implementations need to be compatible in order to
realise reusability between components. Different C++ compilers generate
different class layouts, virtual function calling techniques, etc. The
name encoding schemes used for type safe linkage can also be different.
If two different compilers generate different run-time organisations,
then different name encodings are desirable as it will prevent two
incompatible libraries from being linked. The C++ ARM (p122) states “If
two C++ implementations for the same system use different calling
sequences or in other ways are not link compatible it would be unwise to
use identical encodings of type signatures.”

This can be solved in two ways. Firstly, a library vendor could provide
the entire source of a library so it can be compiled by the customer’s
compiler. This is not satisfactory if the sources are proprietary.
Otherwise the vendor will need a separate release for every environment,
and every compiler in that environment.

Because of this problem a strong case exists for a universal intermediate
machine readable representation of programs. Interestingly, some systems
are already using C as a ‘universal assembler’, notably AT&T C++ and
Eiffel. But this cannot solve the above problems of compatibility between
components without a standardisation effort on run time layouts and name
encoding schemes.

3.30. Reusability and Portability
Since true OOP ensures that objects are loosely coupled to the external
environment, portability to diverse environments is possible. C is
tightly coupled to the Unix environment, and as such is not particularly
portable to diverse environments.

3.31. Idiomatic Programming
The ability to program in different idioms is argued as a strength of
C++. Idiomatic programming, however, is a weak form of paradigmatic
programming. It is programming in a paradigm without necessarily having
compiler support for that paradigm. The compiler cannot check for
inconsistencies with the idiom, or paradigm. Defines can often be used to
invent idioms. Anyone who has attempted to do object-
oriented programming in a conventional language using defines will
realise that it is impossible to realise all the benefits easily, if at
all, without compiler support.

3.32. Concurrent Programming
In the next ten years multiple processor arrays that execute programs
concurrently will probably become common. Concurrency requires much
cleaner languages than the single processor languages of today.
Object-oriented concepts support concurrent programming. Objects can
execute state changing code independently of each other. Concurrent
programming will be enabled by the division of the state space of a
system into modules to achieve a high degree of independent processing.
Objects provide a scheme to cleanly divide state spaces. The demand that
everything be broken down into loosely coupled modules, that only
interact through well defined interfaces might be perceived as
inefficient. But it is precisely this scheme that will mean that
concurrent solutions can be developed efficiently and transparently to
the programmer. Concurrency should be transparent to the programmer, as
concurrency is a low level implementation consideration. That is,
concurrency is how a computation is done, not what is to be computed. The
programmer should be concerned with what is to be computed, not how. How
something is computed is the concern of the target environment, i.e. the
compilers, operating system, and hardware. When programmers are not
concerned with this level, efficiency and portability follow
automatically.

The aim of concurrent processing is to keep all the processors in a
processor array as fully utilised as possible, so that processor
resources are not wasted. This is as good as can be expected. There is
nothing more mysterious to concurrent programming than the efficient use
of resources. Keeping all processors busy is an inherently dynamic
problem, which the programmer cannot determine statically at compile
time. All the processors can be kept busy, as long as there are enough
threads in the system.

In concurrent programming, a thread is a unit of sequential execution.
Concurrency is achieved by the splitting of threads. A thread can be
split when a state changing routine is invoked, but not a value returning
function, because it must wait for the value. State changing routines can
easily be invoked on another processor. Object level granularity seems to
be a natural candidate for concurrent processing. An object can have only
one update thread at a time to avoid simultaneous

Object level is natural for the programmer, and has the advantage that a
programmer can implement a system without taking into account parallel
processing at all. The same program will run and produce identical
results irrespective of whether the customer is running a single
processor, or a processor array.

Side effects must be avoided in concurrent systems. Suppose a computation
depends on combining the results of two functions f and g, such as f + g.
If f and g are independent, then they can be computed concurrently. If,
however, f produces side effects that g depends on, they must be computed
sequentially. f and g are parameters to the + function. Routine
parameters can be computed concurrently, as long as the computation of
each causes no side effects. Side effects are avoided by restrictive
practices that C devotees would object to.

C++ does not preclude the use of a global environment. Access to shared
global data potentially causes a thread to lock, and if many such
accesses occur, the advantage of concurrency is lost. This is because
updates to a global environment are side effects. Programming in such an
environment requires complex locking mechanisms to ensure that things
happen in the correct order. Locks are rather like waiting for a plane to
take off when it has to wait for another connecting flight. This cannot
be entirely avoided, but should be reduced as much as possible.

4. The role of Language
For an intermission between sections, I’ll mention some interesting
points that the Cambridge Encyclopedia of Language [Crystal 87] makes. It
says that language is an emotional subject. “It is not easy to be
systematic and objective about language study. Popular linguistic debate
regularly deteriorates into invective and polemic. Language belongs to
everyone; so most people feel they have a right to hold an opinion about
it. And when opinions differ, emotions can run high. Arguments can flare
over minor points of usage as over major policies of linguistic planning
and education.”

While natural language is difficult to be “systematic and objective
about”, should this apply to computer languages? The definition of
natural language is generally beyond our control, with the exception of
languages such as Esperanto. Programming language definition, however, is
within our control. Programming languages must have expressiveness like
natural language, yet be precise and semantically
update problems. Other levels of concurrency are consistent. As programming languages have
instruction level, and task or process level. Task rigorous requirements, we should be even more
or process level is the level used in conventional critical and objective about them. It is a measure
multi-processing systems currently commercially of immaturity in the programming profession that
produced, and instruction level is quite difficult, emotional and irrational defensiveness often
best being left to instruction pipelines. denies valid criticism. Many dismiss the choice of

programming language as a religious issue. If language choice is merely religious, then we might as well still program in assembler, or maybe even binary, because the adoption of high level languages would have no technical merit. Language choice, however, is a technical consideration. Technical measures should judge the effectiveness of a language. Understanding the role of language helps quantify what must be measured.

The Cambridge encyclopedia lists several functions of language. “To communicate our ideas”, it says, is the most common answer, and this must surely be the most widely recognised function of language. It lists several other functions of language. One function is emotional expression. For instance, when we stub our toe, we often emit words, even when there is no one to hear. Another is social interaction. For example, if someone sneezes, we often “bless” them. Another is the power of sound, as in poetry and rhyming jingles etc. Another is the control of reality, as in spells and incantations. Perhaps computer programs and spells are similar in purpose. Another is recording facts. This includes record keeping, historical and geographical documents, etc. Another is the instrument of thought. We often reason about things to ourselves in language. Another function is the expression of identity. Language can express who we are, or affirm our belonging to certain groups. Perhaps the most important role of computer languages is to enable description, and recording the decisions made during the design and implementation of a system.

Since language and communication are two closely related concepts, it is important to understand their relationship, and the nature of communication. Language is the set of aural and written symbols with which we communicate. Laurence Wylie in the foreword to “French in Action” [Capretz 87] describes communication as “To understand this [communication] we must know the basic meaning of the words common, communicate, and communication. They are derived from two Indo-European stems that mean “to bind together.” In this ordered universe, no human being can live in isolation. We must be bound together in order to participate in an organised effort to accomplish the necessary activities of existence. This relationship is so vital to us that we must constantly be reassured of it. We test this connection each time we have contact with each other.”

The concept of binding is also important in computing. In networks, binding establishes communication links between two or more entities. This forms a greeting, so that a relationship is established, and communication is possible. In programming we have the concepts of static and dynamic binding. Binding in this paradigm makes it possible for a message to be sent from one object to another, so that they can communicate and interact. Static binding determines this message in advance as the receptor is always the same type, or descendant of that type. Static typing ensures in advance that the receptor object can process the message. Dynamic binding means that the exact message to be sent is determined by the dynamic type of the receptor when the message is actually sent. For example, on your telephone, you talk to your friends differently than a client, even though you are using the same piece of equipment. These are the concepts that C++’s virtual does not express well.

Designing an object-oriented system is like designing a language by which objects interact. Thus tools used for formal programming language design, BNF, denotational semantics, and axiomatic semantics can help in the design of an object-oriented system.

“Language shapes the way we think, and determines what we can think about.” - B.L.Whorf. Bjarne Stroustrup quotes this in “The C++ Programming Language”. But is this correct? The encyclopedia says, “It seems evident that there is the closest relationship between language and thought: everyday experience suggests that much of our thinking is facilitated by language. But is there identity between the two? Is it possible to think without language? Or does language dictate the ways in which we are able to think? Such matters have exercised generations of philosophers, psychologists, and linguists, who have uncovered layers of complexity in these straightforward questions. A simple answer is certainly not possible; but at least we can be clear about the main factors which give rise to complications.”

The above Whorf quote is a statement of the Sapir-Whorf hypothesis on language and thought. Edward Sapir (1884-1939) formulated this with his pupil Benjamin Lee Whorf (1897-1941). It reflects the view of its day when great value was placed on the diversity of the languages and cultures of the world. The Sapir-Whorf hypothesis combined two principles. The first is ‘linguistic determinism’, which states that language determines the way we think. The second, ‘linguistic relativity’, that the distinctions found in one language are not found in any other. There can be both verbal and non-verbal thought; following a road map in a car for example. Street directions are often difficult to put into words.

The Sapir-Whorf hypothesis in its strongest form, as in the Whorf quote, is now not generally accepted. For one reason, it is known that concepts can be translated from one language into another. This is even if in one language, the concept can be expressed in one word, but takes a phrase of words in another. A weaker version of the Sapir-Whorf hypothesis is accepted. That is “language may not determine the way we think, but it does influence the way we perceive and
remember, and it affects the ease with which we perform mental tasks. Several experiments have shown that people recall things more easily if the things correspond to readily available words and phrases. And people certainly find it easier to make a conceptual distinction if it neatly corresponds to words available in their language. Some salvation for the Sapir-Whorf hypothesis can therefore be found in these studies, which are being carried out within the developing field of psycholinguistics.”

The important question to the programming community is do programming languages ‘shape’ the way we think about and design systems? The negative argument is that it is the concepts behind languages that are important, not the languages themselves. Languages only provide a framework for the expression of the concepts. A language can only be as good as the concepts it implements. A programming language influences the way we program, and the way we use the concepts it implements. It can clarify the concepts, or obscure them as in the case of C++. A language must implement the concepts cleanly and simply. It must express the concepts in as few words and constructs as possible. But this does not just mean avoiding keywords as in C. Programmers who understand the concepts should have no difficulty in adapting to different languages, as long as the new language implements the concepts elegantly.

A language can be judged like a wine connoisseur judges wine, by holding it up to the light to judge for clarity and colour. Ultimately, it is the taste that matters, but good colour and clarity suggests that the taste is more likely to be good. Clear programming language definition helps in the goal of the production of quality software.

So where does this leave Sapir-Whorf with respect to programming languages? Programming languages do not shape the way we think. It is the concepts that shape the languages, and it is the way we think that shapes the concepts. Those who have attempted to learn a language in order to learn object-oriented programming realise that it is the concepts which must be grasped in order to be effective. Once the concepts have been learnt, object-oriented programming seems a natural way to program. It matches very effectively the way we think. If C++ has been designed according to the Sapir-Whorf hypothesis, its philosophical basis does not serve a computer industry that should shape tools best suited to its purposes, processes, thinking, and concepts.

5. On Writing

During the development of this critique, I realised it had grown larger than I had intended, and that my writing style needed some polish for such a large work. During my research, some colleagues recommended a small book, “The Elements of Style,” [S & W 79]. This has been around for most of this century in one form or another. William Strunk, the original author had some stern advice for his students:

“Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts.”

The machines that the software professional develops are not built, but written. Strunk’s last sentence prompts consideration of the relationship of writing to software development. A common situation is taking several thousand lines of incomprehensible ‘code’, and making it execute efficiently. After spending considerable time we often realise what the program does, and reduce it to several hundred lines of program ‘text’ that runs ten times faster. Strunk’s quote should be applied to programming. A routine should contain no unnecessary declarations or instructions, a system no unnecessary routines. It can also be applied to programming languages. A programming language should contain no unnecessary constructs. This is the root of my dissatisfaction with C++. Much of it is unnecessary, even for the most complex systems. Its syntax is ugly. C++ has become what the C world has constantly criticised in languages like PL/1 and Ada. Only C++ is worse. We need to regain artistic elegance and simplicity.

6. Generic C criticisms

These criticisms apply to the C base language, but in general adversely affect C++. R.P.Mody [Mody 91] gives an excellent general criticism of C. He says that to properly understand C you must understand the insides of the compiler. He gives many examples of how C obscures rather than clarifies software engineering. He concludes that he is “appalled at the monstrous messes that computer scientists can produce under the name of ‘improvements’. It is to efforts such as C++ that I here refer. These artifacts are filled with frills and features but lack coherence, simplicity, understandability and implementability. If computer scientists could see that art is at the root of the best science, such ugly creatures could never take birth.”

C’s popularity is based on several myths. Firstly, that it is a high level language. It is not. It is a structured assembler oriented towards the low level machine domain, not to the problem domain of a high level language. Secondly, that it is small and simple. Its semantics are not simple, but it is very simple to make catastrophic errors. Thirdly, that it is portable. Certainly compilers are available on many platforms, but this does not make programs portable, especially to diverse, and future architectures. Platform independence achieves portability. Fourthly, that it is efficient.
What seems efficient on some platforms is the very antithesis of efficiency on other platforms. It seems efficient on certain platforms because it allows the lower level machine-oriented architecture to be visible at a higher level, instead of being handled transparently by the compiler. This means that programs will be locked into certain styles of architecture, or into current styles of technology, instead of protecting program investment against future technological change. And lastly, that the semantics are mathematically rigorous. Anyone who reads the C++ ARM will realise just how poorly defined the language is. Anyone who has practiced C will know how many traps there are to fall into.

6.1. Pointers

C pointers are a low level mechanism that should not be the concern of programmers. Pointers mean the programmer must manipulate low level address mechanisms, and be concerned with lvalue and rvalue semantics, which are machine oriented and not problem oriented as you would expect of a high level language. A compiler can easily handle such issues without loss of generality or efficiency. Memory models of different environments often affect the definition of pointers. Memory model details such as near and far pointers should be transparent to the programmer.

The programmer must also be concerned with correct dereferencing of pointers to access referenced entities. Use of pointers to emulate by-reference function parameters is an example. The programmer has to worry about the correct use of &s and *s. (See the section on function parameters.)

Pointer arithmetic is error prone. Pointers can be incremented past the end of the entities they reference, with subsequent updates possibly corrupting other entities. How many lurking and undetected errors are in programs because of this? This illustrates how C undermines OOP by providing a mechanism where state outside an object’s boundaries can be changed. Since pointers are intrinsic to writing software in C this exacerbates the problem. Pointers as implemented in C make the introduction of advanced concepts like garbage collection and concurrency difficult.

Another consideration is that dynamic memory implementations vary between platforms. Some environments make memory block relocation easier by having all pointers reference objects via a master pointer which contains the actual address of the block. The location of the master pointer never changes, so relocation of the block is hidden from all pointers that reference it. When the block is relocated, only the master pointer needs to be updated.

On the Macintosh, for example, the double indirection mechanism of ‘handles’ facilitates relocation of objects. Object Pascal makes these handles transparent to the programmer. This is similar to the Unisys A Series approach where object ‘descriptors’ access the target object via a master descriptor that stores the actual address of the object. On the A Series this is transparent to programmers in all languages, as this transparency is realised at a level lower than languages. The A Series descriptor mechanism also provides hardware safety checks that mean that pointers cannot overrun, and arrays cannot be indexed out of bounds. C cannot be implemented particularly well on such machines, as C’s mechanisms are lower level than the target environment.

Other environments do not provide object relocation, so double indirection is an unnecessary overhead. In order for programs to be portable and to be at their most efficient in different target environments, such system details should be the concern of the target compilation system, not of the programmer.

C’s pointer declaration syntax causes another small problem:

    int* i, j;

This does not mean, as might be easily read:

    int *i, *j;

but

    int *i, j;

and should be written thus to avoid confusion.

6.2. Arrays

Page 137 of the C++ ARM notes that C arrays are low level, yet not very general, and unsafe. Page 212 admits, “the C array concept is weak and beyond repair.” Modern software production is far less dependent on arrays than in the past, especially in the object-oriented environment. The trade off to be optimal, rather than general and safe, no longer applies for most applications. C arrays provide no run-time bounds checking, not even in test versions of software. This compromises safety and undermines the semantics of an array declaration, ie an array is declared to be a particular size, and can only be indexed by values within the given bounds. An index to an array is a parameter in the domain of the array function. An index out of bounds is not a member of the domain, and should be treated as severely as divide by zero. C has no notion of dynamically allocated arrays, whose bounds are determined at run time, as in ALGOL 60. This limits the flexibility of arrays. The C definition of arrays compromises both safety and flexibility.

One view of an array is that it is just another object-oriented entity which should be treated in an object-oriented manner as a class of data structure. It should have interface definitions, and consistency checks inherent in object-oriented systems. Another view is that an array is an
implementation of a function, where pairs of values explicitly map the domain to the range, rather than being computed. This suggests that Algol was incorrect in distinguishing arrays by using square brackets. An array just maps the input argument (the index) to a value of the type of the array. An array can be viewed as a random access stack.

[Ince 92] considers that arrays and pointers need not be relied upon so heavily in modern software production, as higher level abstractions such as sets, sequences, etc are better suited to the problem domain. Arrays and pointers can be provided in an object-oriented framework, and used as low level implementation techniques for the higher level data abstractions. As has already been mentioned, object-oriented programming is very useful for the encapsulation of implementation and environment oriented details. Ince suggests that arrays and pointers should be regarded in the same way as gotos in the seventies. He suggests that languages such as Pascal and Modula-2 should be regarded in the same way as assembler languages in the seventies. This applies even more to C and C++, because pointers and arrays are far more intrinsic in the use of C and C++.

I agree with Ince that we have less dependence on arrays, and that pointers in programming languages can be considered harmful. But I disagree in as far as the concept of array is useful for mapping one set of values onto another, where this mapping cannot be described computationally, but can only be expressed by pairs or tuples of coordinates.

6.3. Function Parameters

Parameters are used to pass routines simple values (by-value parameters), or references to entities (by-reference parameters). Parameters are inputs to routines, and should not be changed. When memory was expensive, reusing parameter space could conserve space. Changing parameters, however, is semantic nonsense, and most languages get this wrong.

By-reference parameters enable a routine to change the value of an entity external to the routine. Such updates beyond the environment of a routine are side-effects. This introduces a mechanism of updating the state space, other than straight assignment (although the routine can use assignment to achieve the ‘dirty deed’.) The danger is that the state of an object can be changed without using the well defined interface of the object. By-reference parameters should not be used to change the external world. Values should only be passed to the external world by the return value of a function. Semantically, this is quite different to assignment to a reference parameter; data flows through the program in one direction, in via parameters, and out via return values. Mathematically this maps compositions of inputs to outputs. Abstract data types can be used to design such systems. Also this will help target environments to increase parallelism and concurrency in a way transparent to programmers.

In object-oriented programming, by-reference parameters are used to pass the original object, not a copy. The called routine, however, cannot change the state of the referenced object. Only calling a routine in the object’s interface can change the state. This has the desired effect of the object being given to you, without being yours to change, although you can effect change in the object.

C shares faulty parameters with many other languages. The interaction of C’s pointer mechanism with a faulty parameter mechanism, however, makes C considerably worse than most other languages. In C, pointers are used to simulate by-reference parameters with by-value parameters. The programmer must perform tedious bookkeeping by specifying *s and &s for referencing and dereferencing. Distinguishing between by-value and by-reference parameters is not just a syntactic nicety, included in most high level languages, but a valuable compiler technique, as the compiler can automatically generate the referencing and dereferencing, without burdening the programmer.

6.4. void *

“Passing paths that climb half way into the void” - Close to the Edge, Yes.

Is void * the C equivalent of an oxymoron? A pointer to void suggests some sort of semantic nonsense, a dangling pointer perhaps? Maybe we should tell the astronomers we have found a black hole! While we can have some fun conjecturing what some of the obscure syntax of C suggests, a serious problem is that void * declarations are used to defeat the type system, and so compromise its purpose. A well thought out type system does not require such a facility. In an object-oriented type system, the root class of the inheritance hierarchy provides the equivalent of void.

When a typed entity is assigned to a reference of void *, it loses its static type information. When it is assigned back to a typed reference the programmer must explicitly specify to the compiler the type information. This is error prone and should at least result in a run-time check, to make sure that the correct type actually is being assigned. Without type checks, the routines of one class can be mistakenly applied to objects of another class.

6.5. void fn ()

The default type that a function returns is int. A typeless routine returning nothing should be the default. Instead this must be specified by another confusing use of void. This is an example of
where C’s syntax is not well matched to the concepts and semantics. Syntactically no <type> should suggest nothing to return. Also a typed function can be invoked independently of an expression. This is a shorthand way of discarding the returned value. Values should be returned because they need to communicate with the outside world, and ignoring returned values is often dangerous. In other words, using a typed function as a void should result in a type error.

In fact there should be no such thing as a void function. A void function is a procedure. Procedures and functions should be distinguished. This distinction belongs to the problem ‘what’ domain. A procedure is a routine that changes the state of its object, but returns no value. A function should, in general, not cause any change to the state of an object, but just return some result dependent upon the object’s state. Mathematically, a function is an entity that returns a value of a given type. Procedures are untyped, and do not return a value. So it is incorrect to regard procedures as functions. Functions, as will be explained below, have more in common with variables than procedures. Procedures can cause side effects, functions should not cause side effects. These distinctions are useful when considering concurrency.

6.6. fn ()

Empty parentheses represent the function invocation operator in C. Even though ‘()’ is mathematical looking, it is semantically equivalent to FORTRAN’s CALL, COBOL’s PERFORM, and JSR in assembler. The design of these operators was influenced by the underlying machine architectures. The invocation operator is low level, machine and execution oriented, and in the ‘how’ domain.

This is opposite to most Unix shells, where invocation operators such as ‘run’ and ‘exec’ are not needed. The ability to execute file names as commands extends the command repertoire. The shell runs executables and interprets shell scripts. There is no distinction as far as the shell user is concerned. This is widely accepted as an elegant and effective convenience. C’s () operator introduces the equivalent of a run command into the language.

No invocation operator exists in the problem oriented domain of high level languages. This is because the semantics of a function is to return a value of a given type. How this value is computed is unimportant. The value could be computed by a routine invocation, by sending a message across a network, by forking an asynchronous process, or by retrieving a precomputed result from a memory location, ie a variable.

The distinction that languages like C make between variables and value returning routines is artificial. It could be pointed out that variables are fundamentally different to functions, as you can assign values to variables. Functions, however, are the target of assignment. The return statement accomplishes this in C. Algol has no return statement, but uses assignment to the name of the function. The assignment of a value to a variable sets the return value for subsequent invocations of that function.

It is trivial for a compiler to realise this transparency of view for variables and functions. In ALGOL style languages, the compiler automatically deduces invocation when it sees a name that was declared as a routine, rather than a variable. The compiler knows that the identifier refers to a routine. This compiler technology was not realised when FORTRAN and COBOL were developed. This compiler technology is possible, because the compiler stores much information about an entity. A compiler can check that the programmer uses the entity consistently with the declaration. A compiler can generate correct code, without burdening the programmer with having to redundantly use an invocation operator. This enhances flexibility and implementation independence.

In fact the Unisys A Series architecture elegantly achieves this level of transparency at the hardware level. The value call (VALC) operator loads a value onto the top of stack. When VALC hits a data value, that value is retrieved from memory and loaded onto the stack. When VALC hits a program control word (PCW), a routine that computes the value to be loaded onto the stack is invoked.

Variables and functions should be interchangeable for programmer optimisation. In C, it is not possible to change a function to a variable without removing all the (). This might be spread over many files, and the programmer might not bother with optimisation to avoid the tedium of the task. So the () operator reduces flexibility. Thus implementation detail is visible for the outside world to see. The () operator is another bookkeeping task imposed on the C programmer. Pure functional languages such as SML remove the variable/function distinction altogether, by not having variables at all.

The removal of the variable/function distinction would remove the need for a common use of C++’s inline functions. Inlines clutter the name space of a class and add work for the programmer. All that is required is to directly export a data member as a function.

C also has pointers to functions. Function pointers are analogous to the call by name facility in ALGOL, and this was recognised as having pitfalls. Consistent application of the object-oriented paradigm avoids these pitfalls. A common use of function pointers is to explicitly set up jump tables. The mechanism behind virtual functions is a jump table of function pointers. The design of a program can take advantage of this fact, without resorting to explicit jump tables.
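
As a rough illustration of the point about jump tables, the following C++ sketch contrasts an explicit table of function pointers with the same dispatch written using virtual functions. All of the names (ShapeKind, describe_table, Shape, Circle, Square and the describe routines) are hypothetical and appear nowhere else in this critique. The sketch assumes only that the compiler, as is typical, implements virtual calls through a table it builds itself.

```cpp
#include <string>

// Hypothetical example types; none of these names come from the critique.

// Style 1: an explicit jump table of function pointers, indexed by a tag.
enum ShapeKind { CIRCLE = 0, SQUARE = 1, SHAPE_KIND_COUNT };

std::string describe_circle() { return "circle"; }
std::string describe_square() { return "square"; }

typedef std::string (*DescribeFn)();

// The programmer must keep this table in step with ShapeKind by hand.
static const DescribeFn describe_table[SHAPE_KIND_COUNT] = {
    describe_circle,
    describe_square
};

std::string describe_by_table(ShapeKind kind) {
    return describe_table[kind]();  // no bounds or consistency check here
}

// Style 2: the same dispatch expressed with virtual functions.
// Any table behind these calls is built and maintained by the compiler.
class Shape {
public:
    virtual ~Shape() {}
    virtual std::string describe() const = 0;
};

class Circle : public Shape {
public:
    std::string describe() const { return "circle"; }
};

class Square : public Shape {
public:
    std::string describe() const { return "square"; }
};

std::string describe_by_virtual(const Shape& shape) {
    return shape.describe();  // selected by the dynamic type of shape
}
```

In the first style, adding a new kind means editing both the enumeration and the table, and nothing checks that they agree. In the second, adding a class is enough, and the table stays out of the program text.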

Another use is to jump to a function in a table that is indexed by an input character. A switch statement can cater for this mechanism in a way that makes what is meant explicit, while keeping underlying mechanisms (and possibly optimisations) transparent. C++ allows function pointers to member functions to be stored in tables (via the .* and ->* operators).

6.7. Metadata in Strings

The implementation of strings in C mixes metadata with data. Metadata is data about an object, but is not part of the data itself. Examples of metadata are addresses, size and type information. Such metadata is often referred to as data descriptors, and can be kept independently of the data, with the advantage that the programmer cannot mistakenly corrupt the metadata.

In C strings, metadata about where a string terminates is stored in the data as a terminating byte. This means that the distinction between data and metadata is lost. The value chosen as the terminator cannot occur in the data itself. The common alternative implementation is to store a length byte in a fixed location preceding the string. This length metadata can be hidden from the programmer who does not need to know where the length metadata is stored. This implementation also has the advantage that the length of a string can be easily obtained, without having to count the number of elements up to the terminating null.

6.8. ++, --

The increment and decrement operators are often used as an example that C was designed as a high level assembler for PDP machines. These operators provide a shorthand convenience, but are unnecessary. There are no less than three ways to perform the same thing:

    a = a + 1
    a += 1
    a++
    ++a

For full generality, only the first form is required, the others are a mere convenience. The last two forms a++ and ++a are the postfix and prefix forms. They are often used in the context of

appears to be left to the implementation, which contributes to non-portability. If this can’t be defined for a sequential processor, then it is even worse for a concurrent environment.

The shorthand += and -= are more powerful as values other than 1 can increment the variable. It has been suggested that there should also be &&= and ||= operators.

If it is mistakenly believed that a multiplicity of operators is required to produce more optimal code, then it should be pointed out that code generators, especially for expressions, can produce the best code for a target architecture. A plethora of operators complicates the task of an optimiser. A compiler can optimise well beyond what a programmer can do. An optimising compiler will analyse the surrounding code, and if an entity is used several times in a local scope, it will keep the value of that entity handy locally at the top of a stack, or in a register, rather than retrieve it from slow main memory several times. The nature of such optimisations depends on the machine’s architecture, which a programmer should not have to be aware of. Open systems demand that programs can be ported amongst diverse architectures and environments, very different to the original machine, and not only run, but run efficiently. Optimisers work best with simple, well defined languages.

In fact constructs such as:

    while (*s1++ = *s2++);

might look optimal to C programmers, but are the antithesis of efficiency. Such constructs preclude compiler optimisation for processors with specific string handling instructions. A simple assignment is better for strings, as it will allow the compiler to generate optimal code for different target platforms. String assignment will also hide the implementation details of strings. If the target processor does not have string instructions, then the compiler should be responsible for generating the above loop code, rather than requiring the programmer to write such low level constructs. The above loop construct for string copying is also contrary to safety, as there is no check that the destination does not overflow. The above code also makes explicit the underlying C implementation of strings, which are null terminated. Such examples show why C
another expression. Thus several updates can be cannot be regarded as a high level language, but
performed in one expression. This is a very rather as a high level assembler.
powerful and convenient feature, but introduces As with name overloading, memory storage
side effects into an expression that sometimes update is a problematic, but necessary part of
have surprising effects, and can lead to program programming. A language should provide it in a
errors. The following example is given on p.46 of consistent and expected way. Many languages
the C++ ARM - recognise that memory update is problematic, and
i = v[i++]; // the value of ‘i’ is typically only provide limited but sufficient ways
// undefined of updating, by an assignment operation. (Many
languages have block memory copies as well, but
The ARM points out that compilers should assignment can also provide block copy.)
detect such cases, but the exact interpretation Furthermore, many languages avoid side-effects

by limiting updates to only one per statement. C provides too many ways to update memory. These add nothing to the generality of the language, increase the opportunity for error, and complicate automatic optimisation. Restrictive practices are justifiable in order to accomplish correctly functioning and efficient software.

6.9. Defines

The define declaration -

    #define d(<parameters>)

has a different effect to -

    #define d (<parameters>)

The second form defines d as (<parameters>). Extra white space between tokens should not affect the semantics of constructs.

#defines are poorly integrated with the language. The ‘#define’ must be in column 1, and knows nothing about scope rules. Errors in defines can lead to obscure errors, as the preprocessor does not detect them, but leaves them for the compiler. Programmers must be familiar with the particular preprocessor implementation on their system, as preprocessor implementations are different, particularly between Classic C and ANSI C.

6.10. NULL vs 0

[Ellemtel 92] recommends that pointers should not be compared to, or assigned to NULL, but to 0. Stylistically, NULL would be preferable. It would also allow for environments where null pointers have a value other than 0. ANSI-C, however, has subtle problems with the definition of NULL.

6.11. Case Distinction

It is good to adopt typographic conventions for names, but distinguishing between upper and lower case in names can cause confusion. Confusion leads to errors and systems that are difficult to maintain and modify. Case distinction is based on the implementation paradigm of how character codes work. Why do we have names? To give entities identity, and aid our memory of that identity. Philosophically, case distinction is contrary to the fundamental purpose of names.

Case distinction in interactive systems is a poor user interface. It is clumsy having to continually use the shift key, and will slow a good typist. More importantly, case distinction makes names harder to remember, and so is contrary to the purpose of aiding memory. It is difficult enough for users to remember command mnemonics or file names, let alone exactly the case. Names are used instead of difficult to remember addresses. If we did not have names, we would have to retrieve files by addresses, or call people by their social security number.

Consider the paradigm of letters and words. Words are spelt by assembling letters in order. There are 26 distinct letters. With the addition of digits 0 to 9, and the underscore character, we have a complete and correct definition for identifiers. Letters can be written in a number of styles. They can be bold, italic, upper or lower case. Such typographic representations, however, do not change the semantics of a word. Thus if we write ALGOL, Algol or algol, we recognise the word to represent a computer language. The case of the letters does not change the semantics. Letter case is only a typographic device. Typographic conventions make program text more readable, but should not affect the semantics of a program.

Case distinction is based on the low level paradigm of character codes such as ASCII used internally in the computer. This weakens the purpose of using names to replace addresses, as names are reduced to a string of character codes.

Case distinction also contributes to errors. It introduces ambiguity, and as has already been mentioned, ambiguity weakens the purpose of names, as identity is lost. As every programmer will have experienced, one character errors are more difficult to find than one would think. For example if an identifier is declared Fred, another one can be declared fred. Such names are easily mistyped and confused. We are in general poor proof-readers. The psychological reason for this is that the brain tends to straighten out errors for our perception automatically. The human brain is an excellent instrument for working out what was intended, even in the presence of radical error. (This makes us good at difficult tasks like speech recognition.) In order to overcome this, programmers must use their powers of concentration to override this natural tendency of the brain. Distinguishing upper from lower case in names only adds another level of difficulty. Good language design takes into account such psychological considerations in these small but important details, being designed towards the way humans work, not computers. Such considerations of cognitive science make a big difference to the effectiveness of people, but do not have any impact at all on the efficiency of code generated for the computer. What is more important, people or computers?

Case distinction provides another form of name overloading. Name overloading is a double-edged sword. It leads to ambiguity, confusion and error. Name overloading, as has been suggested in the section on name overloading, should only be provided in controlled and expected ways, where overloading provides a useful function such as module independence or polymorphism. Where a name is overloaded in the same scope the compiler should report an error.

As another example, a commonly used technique is -

    class obj
    {
        int Entry;

        void set_entry (int entry)
        {
            entry = Entry;
        }
    }

If you have not spotted the error in the above example, what was it supposed to mean?

6.12. Assignment Operator

Using the mathematical equality symbol for the assignment operator is a poor choice of symbols. Programming assignment is not equal to mathematical equality (:= != =). Language designers of ALGOL style languages realised they were semantically quite different, so took the care to distinguish, only using ‘=’ to mean equality in the sense of mathematical assertion. In C the lack of distinction leads to error. It is easy to use = (assignment) where == (equality) is intended, which looks reasonable, but results in errors that are difficult to detect.

This leads to a more general criticism of C, in that it has a pseudo-mathematical appearance. Few people are proficient at interpreting mathematical theorems, most passing over such sections in text, making the assumption that the mathematics proves the surrounding text. The pseudo-mathematical nature of C has this bad attribute of mathematical notation. It is difficult to read, while lacking the semantic consistency and precision of mathematical notation. One of the keys of reusability is readability.

6.13. Type Casting

Type casting is just a mechanism to map values of one type onto values of another type. This means type casting is no more than a specific form of mathematical function. Type casting has been useful in computer systems. Often it is required to map one type onto another, where the bit representation of the value remains the same. Type casting is therefore a trick to optimise certain operations. Type casting provides no useful concept that general functions cannot implement. Furthermore, type casting undermines the purpose of strongly typed systems. In many languages, the type system has not been consistently defined, so programmers often feel that type casting is necessary.

Mathematically, all functions perform type casting. An example often used in programming is to cast between characters and integers. Type casts between integers and characters are easily expressed as functions using abstract data types (ADTs).

    TYPE
        CHARACTER
    FUNCTIONS
        ord: CHARACTER -> INTEGER
            // convert input character to integer
        char: INTEGER /-> CHARACTER
            // convert input integer to character
    PRECONDITION
        // check i is in range
        pre char (i: INTEGER) =
            0 <= i and
            i <= ord (last character)

The notation ‘->’ means every character will map to an integer. The partial function notation ‘/->’ means that not every integer will map to a character, and a precondition, given in the ‘pre char’ statement, specifies the subset of integers that maps to characters. Object-oriented syntax provides this consistently with member functions on a class:

    i : INTEGER
    ch : CHARACTER

    i := ch.ord
        // i becomes the integer value of
        // the character.
    ch := i.char
        // ch becomes the character
        // corresponding to the value i.

but a routine char would probably not be defined on the integer type so this would more likely be:

    ch.char (i);
        // set ch to the character
        // corresponding to the value i.

The hardware of many machines caters for such basic data types as character and integer, and it is entirely possible that a compiler will generate code that is optimal for any target hardware architecture. So many languages have character and integer as built in types. The object-oriented paradigm, however, can treat such basic data types consistently and elegantly, by the implicit definition of their own classes.

Another example of type conversion is from real to integer. Here though, the programmer might wish to specify the use of two type conversion functions to truncate or round.

    TYPE
        REAL
    FUNCTIONS
        truncate: REAL -> INTEGER
        round: REAL -> INTEGER

    r: REAL

    i: INTEGER

    i := r.truncate
        // i becomes the closest integer
        // <= r
    i := r.round
        // i becomes the closest integer
        // to r

Again many hardware platforms provide specific instructions to achieve this, and an efficient object-oriented language compiler will generate code best suited to the target machine. Such inbuilt class definitions might be a part of the standard language definition.

6.14. Semicolons

I am not concerned whether the semicolon is defined as a terminator or separator. Arguments that languages that define the semicolon as terminator are superior to those that define it as separator are, however, baseless. The semicolon as separator is really quite logical. It is based on viewing the semicolon as a statement sequencing or concatenation operator. It is therefore a binary operator, requiring both a left and a right hand side. Some people claim to find this concept difficult to understand, but if we consider it in the context of a mathematical expression, it would be silly to expect that an addition be written as:

    a + b +

Another way to look at a separator is to consider the structure of a program. A program is a list of elements. The executable part of a program is a list of sequentially executed instructions. Elements in a list must be separated, and the semicolon is one syntax to separate elements in a list. The semicolon is therefore part of the syntax of the list, not part of the syntax of the individual instructions. Languages such as FORTRAN separated instructions by requiring that they be placed on different lines or cards. If an instruction overflowed a line, a continuation character was required, like the backslash in C. Well defined languages do not require continuation characters, as line breaks are unimportant, and have no effect on semantics. Languages should have very regular grammars, so that the semicolon could be an entirely optional typographic separator.

In natural language both the comma and semicolon are separators; only the full stop is a terminator. If the comma was a terminator, function invocations would look like:

    fn (a, b+c, d, e,);

It is often argued that the semicolon as separator leads to irregularities. C’s handling of the grammar of semicolons, however, leads to an irregularity in if/else’s:

    if (condition)
        statement1;    /* Semicolon required */
    else
        statement2;

    if (condition)
    {
        statement1;
    }                  /* Semicolon must be omitted */
    else
        statement2;

This is an irregularity, as a parser will reduce both of the above to the grammatical form:

    if (condition) statement
    else statement

(In fact why do conditions in C if and while statements have to have parentheses around them?)

7. Conclusions

C++ is overly complex. C is widely recognised as being a simple language. But even this is doubtful, as it has many operators, and a difficult precedence system. Its pointer style of programming is difficult. Overall, C has many traps that lead to difficult to detect errors in software. Object-oriented languages should provide sophisticated concepts in the simplest possible framework. Where the framework is not simple, the concepts are obscured. OOP addresses many issues in order to facilitate the production of complex and sophisticated programs. Many of these issues are addressed in implicit and subtle ways, but are lost in C++. Subtle errors can be introduced into C++ software in many ways, and furthermore, the combination of these will cause even further problems. C++ has devices for petty convenience, while sacrificing major conveniences and long-term correctness and safety. C++ forces the programmer to perform many administrative bookkeeping tasks that a compiler can easily do.

It can be asked what application domain C++ is relevant for. The answer to this is that C++ might be used as a better C. But for what applications is C relevant? C is relevant for low level Unix systems programming. It is not a generally applicable language in view of its low level nature, and its flaws. C is not applicable for large scale production. Hence C++’s attempt to improve it. C++, however, has not solved C’s flaws, as I once hoped it would, but painfully magnified them. Better languages exist for higher level functions such as communications and networks, scientific work, compilers, etc. I envisage that C has a place as a high level assembler that can be used to implement small pieces of code on suitable platforms, where

efficiency is of prime importance. Thus the use of C would be limited and well controlled, rather as small assembler routines are currently used in some systems for the same purpose. Indeed the move to C++ should only be considered in the case of upgrading a body of C programs for backwards compatibility. In the case of new projects, alternatives to C and C++ should seriously be considered.

Programming is the orchestration of change within a large state space. Object-oriented techniques provide a method of simple division and management of such state spaces. Managing such state spaces requires the simplest techniques, in order to guard against detectable inconsistencies that lead to errors in executable systems. C and C++ do not implement the simple management of a large state space, and allow many potential errors to go undetected. The role of a language as a tool cannot seriously be regarded as some authoritarian that stops us doing what we want or need to do, as many languages with type safety and consistency checks are often viewed. Programming languages should embody the collective wisdom of common sense practices that have been learnt over many years, by common and painful experience. C++ lacks the implementation of much of this wisdom.

[Sakkinen 92] observes that much of the C++ literature has few references to external work or research. It fails to draw on the insights and progress made by many researchers. This leads me to believe that C++ is parochial and removed from the many advances that will make production of systems easier and more cost effective.

It is better to detect and avoid errors than to fix them. The fixing of errors happens many times during the development process. This slows down the development process, and is therefore costly. Good programmers in this context (often called ‘gurus’) are those who recognise symptoms, and recommend fixes. Good programmers in the better sense (often called ‘impractical idealistic dreamers’) adopt better practices (programming languages being a subset of these) that avoid error in the first place.

C encourages gurus who spout false wisdom on obscure subjects. Writing programs in C is often called ‘coding’. Coding is writing obscure encryptions that will later have to be decoded, by none other than a guru! C also encourages programming by guesswork. C programmers often solve ‘bugs’ by adding extra ()s, *s and &s, without understanding the problem. People who attain proficiency at this guesswork are known as, well you guessed it, gurus!!

The view that correctness checks are training wheels for students, which gurus don’t need, must be dispelled. Many disciplines have techniques to ensure correctness. For example, the metronome in music is not just for students, but will help an advanced musician ensure that the tempo of a piece is correct, and since playing to a metronome is more difficult, will help sharpen the musician’s performance of the piece. The musician does not just view the metronome as an aid for beginners, or as something that restricts him to a set beat, but as a tool that helps produce a polished and professional performance. C should not be seen as a language to which you graduate after you have learnt to program in languages with safety checks. In fact changing to C or C++ is a great step backwards. Languages with consistency and semantic checks are essential aids to the production of professional software.

This paper has shown many cases where C++ uses old C mechanisms to provide things that can and should be expressed consistently within the object-oriented paradigm, for example type casting. The move to pure object-oriented languages will facilitate more consistent programming and avoid many typical errors that occur in software production. C++ also makes distinctions that belong in the ‘how’ implementation domain. For example, ‘.’ vs ‘->’, and variables and functions. These make bookkeeping work for programmers, which should be handled by a compiler. But then C++ fails to make distinctions that belong in the ‘what’ problem domain. For example, procedures vs functions. Making distinctions in the ‘how’ domain adds inconvenience to the language. Failing to make distinctions in the ‘what’ domain limits the power and expressiveness of the language. The amount of change required in C++ to address the issues raised in this paper is seen as largely insurmountable.

A programming language is just a tool, in the same way that an axe is a tool. If the axe is blunt when chopping down a tree, then procedures, processes and methodologies could be invented to make it as effective as possible. But that leaves the real problem unsolved: that the axe that does the real work is blunt. So it is with programming languages. To develop a system, it must be implemented, and a programming language is the tool to do the real work. If the language is blunt, then procedures, processes and methodologies might alleviate the situation, but they do not solve the problem. Once the axe is sharpened, then real progress is made, and the procedures, processes and methodologies also become more effective. A good axeman will have good axe wielding technique, but given a choice of axes will choose the sharpest implement. A poor axeman could be ineffective with even a sharp axe, but the axe maker will still strive to produce the sharpest axe for the good axeman. The argument that poor programmers will produce bad programs in any language so we shouldn’t bother with better languages is fallacious.

As mentioned in the introduction, both sides of the analysis/design vs implementation debate

need to compromise in order to bridge the semantic gap. The perpetuation of low level languages such as C into OOP is proof that the programming community is not willing to compromise, or sharpen its axe enough in order to bridge this costly gap.

The critique began with certain questions, and as no work can be absolute (particularly a programming language), it will end with more questions that it is hoped will create more debate, and more questioning into what we are really trying to achieve with program development.

Does C++ provide effective communication between programmers separated in both space and time? Does C++ provide communication between the levels of analysis, design, implementation and maintenance?

Are the compromises made by C and C++ still relevant to today’s environments, and the environments of the not very near future?

Could C++ be regarded as the PL/1 of the object-oriented world, as PL/1 was the marriage of FORTRAN and structured ALGOL concepts, and C++ is the marriage of C with object-oriented concepts?

Are the compromises made for the restricted machines and environments of 20 years ago still appropriate for today? Are languages based on 20 year old compromises appropriate in modern software development environments?

Should new software developments be forced to accept such compromises?

Is C++ patching old material with new cloth, or pouring new wine into old wineskins?

What are we really trying to achieve in programming anyway?

Ian Joyner
November 1992

8. Bibliography

[C++ ARM] Ellis and Stroustrup “The Annotated C++ Reference Manual” AT&T 1990.

[Capretz 87] Pierre J. Capretz “French in Action, A Beginning Course in Language and Culture” Yale University Press.

[Cline] Marshall Cline “C++ Frequently Asked Questions” comp.lang.c++ newsgroup.

[Crystal 87] David Crystal “The Cambridge Encyclopedia of Language” Cambridge University Press.

[DDH 72] Dahl, Dijkstra, Hoare “Structured Programming”

[Dijkstra 76] E.W. Dijkstra “A Discipline of Programming” Prentice Hall.

[Ellemtel 92] “Programming in C++: Rules and Recommendations” Ellemtel Telecommunication Systems Laboratories, Sweden.

[Ince 92] D.C. Ince “Arrays and Pointers Considered Harmful” ACM SIGPLAN Notices, January 1992.

[Mody 91] R.P. Mody “C in Education and Software Engineering” ACM SIGCSE Bulletin Vol. 23 No. 3 September 1991.

[Reade 89] Chris Reade “Elements of Functional Programming” Addison-Wesley, 1989.

[RBPEL91] Rumbaugh, Blaha, Premerlani, Eddy, Lorensen “Object-Oriented Modelling and Design” Prentice-Hall, 1991.

[S & W 79] William Strunk and E.B. White “The Elements of Style” MacMillan Publishing, 1979.

[Sakkinen 92] Markku Sakkinen “Inheritance and Other Main Principles of C++ and Other Object-oriented Languages” University of Jyväskylä, 1992. (Also published as selected papers in ECOOP ‘88, Computing Systems Vol. 5 No. 1, and Structured Programming Vol. 13 (1992).)

[SJE91] Saake, Jungclaus, Ehrich “Object-Oriented Specification and Stepwise Refinement” in IFIP Workshop on Open Distributed Processing, Berlin, 1991.

[Weg91] Peter Wegner “Concepts and Paradigms of Object-Oriented Programming” ACM SIGPLAN OOPS Messenger Volume 1 No. 1 August 1990.

[X3J16 92] Members of the X3J16 working group on extensions “How to write a C++ Language Extension Proposal for ANSI-X3J16/ISO-WG21” ACM SIGPLAN Notices Vol. 27 No. 6 June 1992.

[Yoshida 92] Koichiro Yoshida Title and book in Japanese.
