C++?? A Critique of C++, 2nd Edition
Copyright: Attribution Non-Commercial (BY-NC)
Contents

1. Introduction
2. The Role of a Programming Language
    2.1. Safety and Courtesy Concerns
3. C++ Specific Criticisms
    3.1. Virtual Functions
    3.2. Pure Virtual Functions
    3.3. The Nature of Inheritance
    3.4. Function Overloading
    3.5. Virtual Classes
    3.6. Name overloading
    3.7. Polymorphism and Inheritance
    3.8. ‘.’ and ‘->’
    3.9. Anonymous parameters in Class Definitions
    3.10. Nameless Constructors
    3.11. Constructors and Temporaries
    3.12. Optional Parameters
    3.13. Bad deletions
    3.14. Local entity declarations
    3.15. Members
    3.16. Friends
    3.17. Static
    3.18. Union
    3.19. Nested Classes
    3.20. Global Environments
    3.21. Header Files
    3.22. Class Interfaces
    3.23. Class header declarations
    3.24. Garbage Collection
    3.25. Type-safe linkage
    3.26. C++ and the software lifecycle
    3.27. Reusability and Communication
    3.28. Reusability and Trust
    3.29. Reusability and Compatibility
    3.30. Reusability and Portability
    3.31. Idiomatic Programming
    3.32. Concurrent Programming
4. The role of Language
5. On Writing
6. Generic C criticisms
    6.1. Pointers
    6.2. Arrays
    6.3. Function Parameters
    6.4. void *
    6.5. void fn ()
    6.6. fn ()
    6.7. Metadata in Strings
    6.8. ++, --
    6.9. Defines
    6.10. NULL vs 0
    6.11. Case Distinction
    6.12. Assignment Operator
    6.13. Type Casting
    6.14. Semicolons
7. Conclusions
8. Bibliography

1. Introduction

The C++ programming language is becoming widely used. So it is important and timely to question its success. Two books are already published on the subject [Sakkinen 92] and [Yoshida 92]. This critique addresses the following questions. How well does C++ implement object-oriented concepts? Can it easily implement small, quick projects? Does it scale up well for large projects? Does it support or hinder good programming practices? As a result, does it ease the production of quality software? What is the relationship between a language, compiler and software developers; and between the language, compiler and the target system? This last question addresses issues of correctness, compatibility, portability, and efficiency.

A paper on the recommended practices for use in C++ [Ellemtel 92] suggests “C++ is a difficult language in which there may be a very fine line between a feature and a bug. This places a large responsibility upon the programmer.” Is this a responsibility or a costly burden? The ‘fine line’ is a result of poor language definition. The C++ standardisation committee warns “C++ is already too large and complicated for our taste” [X3J16 92].

While it is true that C++ is immediately usable by many C programmers, and many see this as a strength, the C base is C++’s greatest weakness. This is the engineering compromise that C++ devotees talk about. Adoption of C++ does not suddenly transform C programmers into object-oriented programmers. A complete change of thinking is required, and C++ actually makes this difficult. A critique of C++ cannot be separated from criticism of the C base language, as it is essential for the C++ programmer to be fluent in C. Many of C’s problems affect the way that object-orientation is implemented and used in C++. This critique is not exhaustive of the weaknesses of C++, but it illustrates the practical consequences of these weaknesses with respect to the timely and economic production of quality software.

This critique criticises C++ in its own right, without comparison to other languages. Section 2 considers the role of a programming language. Section 3 examines some specific aspects of C++. Section 4 examines the general role of language. Section 5 is a short comment on writing. Section 6 looks specifically at C. The conclusion examines where C++ has left us, and considers the future. The approach taken is to criticise specific aspects of C++ and C. Each section tries to be self contained. It is expected that not everyone will agree with all of the sections. It is probably best to approach the paper, not by reading it entirely, but to read those sections that interest you. One section, however, is fundamental to the criticism of C++, that on virtual functions. This is also the most difficult to understand and technical section of the paper, but it is fundamental to the understanding of the weaknesses of C++.

Having said that, I hope that you find this critique useful, and enjoyable. If by any chance you do, please feel free to distribute it to your management, peers and friends.

2. The Role of a Programming Language

A programming language functions at many different levels and has many roles. It should be critiqued with respect to those levels and roles. Historically, programming languages had a very limited role, that of writing executable programs. As programs have grown in complexity, this role alone has proved insufficient. Many design and analysis techniques have arisen to support other necessary roles. The organisation of projects also required tools external to the language and compiler, like ‘make.’ Object-oriented techniques have arisen to help in the analysis and design phases, and object-oriented languages to support the implementation phase of OO. Traditional, tried and tested but failed software practices are infiltrating the object-oriented world. Object-orientation, however, offers a better rational approach to software development. The complementary roles of analysis, design, implementation and project organisation should be better integrated in the object-oriented scheme. This results in economical software production.

C++ is an interesting experiment in adapting the advantages of object-orientation to a traditional programming language. Bjarne Stroustrup is to be applauded for having the insight to put the two technologies together. C++, however, retains the problems of the old order of software production. C++ has an advantage over C as it supports many facets of object-orientation. These can be used for limited analysis and design. The processes of analysis, design, and organisation, however, are still largely external to C++. Thus C++ has not realised the important advantages of object-orientation that will indeed lead to the economic production of software.

A language should not only be critiqued from a technical point of view, considering its syntactic and semantic features. It should also be critiqued from the viewpoint of its contribution to the entire software development process. It should enable communication between project members acting at different levels, from management, who have a requirement for the product, to testers, who must test the result. It should also enable communication between project members separated in space and time. Often one programmer is not responsible for a task over its entire lifetime.
C++?? 2nd Edition page 1
The primary purpose of any language is communication. A programming language should support the exchange of ideas, intentions, and decisions between project members. A programming language should provide a formal, yet readable, notation to support consistent descriptions of systems that satisfy the requirements of diverse problems. A language should also provide methods for automated project tracking. This ensures that modules (classes and functionality) that satisfy project requirements are completed in a timely and economic fashion. A programming language aids reasoning about the design, implementation, extension, correction, and optimisation of a system.

A language definition should enable the development of integrated automated tools to support software development. For example, browsers, editors and debuggers. The compiler is another such tool. The role of a compiler is twofold. Firstly, to generate code for the target machine. The role of the machine is to execute the produced programs. A compiler has to check that a program conforms to the language syntax and grammar, so it can ‘understand’ the program in order to translate it into an executable form. Secondly, and more importantly, the compiler should check that the programmer’s expression of the system is complete, valid and consistent. A compiler should perform semantics checking. This is checking that a program is internally consistent. Generating a system that has detectable inconsistencies is pointless.

Semantics checking is done by ensuring that a specification conforms to some schema. For example, the sentence “The boy drank the computer and switched on the glass of water” is grammatically correct. But the sentence is nonsense. It does not conform to the mental schema we have of computers and glasses of water. A programming language should include techniques for the detection of similar nonsense. The language definition provides the framework that makes this role of the compiler possible.

Checking is often enabled by the specification of redundant information. Declarations are an example of redundancy that helps check for misspellings. Declarations define the vocabulary of a program, i.e. the elements in its universe. The compiler uses redundant information for consistency checking, and strips it away to produce efficient executable systems. Type safety is another technique. Declarations also associate an entity with a type, to define the entity’s role. Typing ensures that you can’t drink computers or switch on glasses of water. C++ is an improvement over C in type safety.

It is a misconception that consistency checks are ‘training wheels’ for student programmers, and that ‘syntax’ errors are a hindrance to professional programmers. Languages that exploit techniques of schema checking are often criticised as being restrictive and therefore unusable for real world software. This is nonsense and misunderstands the power of these languages. It is an immature conception; the best programmers realise that programming is difficult. As a whole, the computing profession is still learning to program.

Another example of consistency checking comes from the user interface world. Instead of correcting a user after an erroneous action, a good user interface will not offer the action as a possibility in the first place. It is cheaper to avoid error than to fix it. Most people drive their cars with this principle in mind. Smash repair is time consuming and expensive.

Program development is a dynamic process. A program description is constantly modified during development. Modifications often lead to inconsistencies and error. Languages and compilers that provide consistency checks help prevent such ‘bugs’, which can creep into a previously working system. These checks help verify that as a program is modified, previous decisions and work are not invalidated.

It is interesting to consider how much checking could be integrated in an editor. The focus of many current generation editors is text. What happens if we change this focus from text to program components? Such editors might check not only syntax, but semantics. Alerting programmers of potential errors earlier and interactively will shorten development times. Future languages should be defined very cleanly in order to enable such editor technology.

A programming language should provide a formal notation. During requirements analysis and design phases, formal and semi-formal notations are required. Notations used in analysis, design, and implementation phases should be complementary, rather than contradictory. Currently, analysis, design and modelling notations are too far removed from programming, while programming languages are in general too low level. Both designers and programmers must compromise to fill the gap. Current notations provide difficult transition paths between stages. This ‘semantic gap’ contributes to errors and omissions between the requirements, design and implementation phases. Future programming languages will be an implementation extension of the high level notations used for requirements analysis and design. This will lead to improved consistency between analysis, design and implementation. Object-oriented techniques emphasise the importance of this, as abstract definition and concrete implementation can be separate, yet provided by the same syntax.

Programming languages also provide notations to formally document a system. Program source is the only reliable documentation of a system, so a language should explicitly
support documentation. As with all language, the effectiveness of communication is dependent upon the skill of the writer. Good program writers require languages that support the role of documentation. They require that the syntax of a language is perspicuous, and easy to learn. Those not trained in the skill of ‘writing’ programs can read them to gain understanding of the system. After all, it is not necessary for newspaper readers to be journalists.

Chris Reade [Reade 89] gives the following explanation of programming and languages. “One, rather narrow, view is that a program is a sequence of instructions for a machine. We hope to show that there is much to be gained from taking the much broader view that programs are descriptions of values, properties, methods, problems and solutions. The role of the machine is to speed up the manipulation of these descriptions to provide solutions to particular problems. A programming language is a convention for writing descriptions which can be evaluated.”

[Reade 89] also describes programming as being a “Separation of concerns”. He says:

“The programmer is having to do several things at the same time, namely,
    (1) describe what is to be computed;
    (2) organise the computation sequencing into small steps;
    (3) organise memory management during the computation.”

Reade continues, “Ideally, the programmer should be able to concentrate on the first of the three tasks (describing what is to be computed) without being distracted by the other two, more administrative, tasks. Clearly, administration is important but by separating it from the main task we are likely to get more reliable results and we can ease the programming problem by automating much of the administration.

“The separation of concerns has other advantages as well. For example, program proving becomes much more feasible when details of sequencing and memory management are absent from the program. Furthermore, descriptions of what is to be computed should be free of such detailed step-by-step descriptions of how to do it if they are to be evaluated with different machine architectures. Sequences of small changes to a data object held in a store may be an inappropriate description of how to compute something when a highly parallel machine is being used with thousands of processors distributed throughout the machine and local rather than global storage facilities.

“Automating the administrative aspects means that the language implementor has to deal with them, but he/she has far more opportunity to make use of very different computation mechanisms with different machine architectures.”

These quotes from Reade are a good summary of the principles from which I criticise C++. What Reade calls administrative tasks, I call bookkeeping. C and C++ are often criticised for being cryptic. The reason is that C concentrates on points 2 and 3, while the description of what is to be computed is obscured. High level languages describe ‘what’ is to be computed. This is the problem domain. ‘How’ a computation is to be achieved is in the low-level machine-oriented domain. The conflict between these aspects recurs frequently throughout this critique. Automating the bookkeeping tasks enhances correctness, compatibility, portability and efficiency. Bookkeeping tasks arise from having to specify ‘how’ a computation is done. Specifying ‘how’ things are done in some environments hinders portability to other platforms.

The industry should be moving towards these ideals. They will help in the economic production of software, rather than the costly techniques of today. We should consider what we need, and assess the problems of what we have against that. Object-orientation provides one solution to these problems. Its effectiveness, however, depends on the quality of its implementation.

It is relevant to ask if grafting OO concepts onto a conventional language realises the full benefits of OO. Perhaps a biblical quote can be considered: “No one sews a patch of unshrunk cloth on to an old garment; if he does, the patch tears away from it, the new from the old, and leaves a bigger hole. No one puts new wine into old wineskins; if he does, the wine will burst the skins, and then wine and skins are both lost. New wine goes into fresh skins.” Mark 2:22

We must abandon disorganised and error-prone practices, not adapt them to new contexts. How well can hybrid languages support the sophisticated requirements of modern software production? Surely a basic premise of object-oriented programming is to enable the development of sophisticated systems through the adoption of the simplest techniques possible? Software development technologies and methodologies should not impede the production of such sophisticated systems.

2.1. Safety and Courtesy Concerns

This critique makes two general types of criticism, about ‘safety’ concerns and ‘courtesy’ concerns. These themes recur throughout this critique, as C and C++ have flaws that compromise them frequently. Safety concerns affect the external perception of the quality of the program. Failure to meet safety concerns results in unfulfilled requirements and program crashes.

Courtesy concerns affect the internal view of the quality of a program in the development and maintenance process. Courtesy concerns are usually stylistic and syntactic, whereas safety concerns are semantic. The two often go together. It is courtesy for an airline to keep its fleet well maintained. This courtesy concern is also very much a safety concern.

Courtesy issues are even more important in the context of reusable software. Reusability depends on the clear communication of the purpose of a module. Courtesy is important to establish social interactions, such as communication. Courtesy implies inconvenience to the provider, but provides convenience to others. Courtesy issues include choosing meaningful identifiers, consistent layout and typography, meaningful and non-redundant commentary, etc. Courtesy issues are more than just a style consideration. A language design should directly support courtesy issues. A language, however, cannot enforce courtesy issues, and it is often pointed out that poor, discourteous programs can be written in any language. But this is no reason for being careless about the languages that we develop and choose for software development.

3. C++ Specific Criticisms

3.1. Virtual Functions

Polymorphism is a key concept of OOP. Virtual functions are one way to implement polymorphism. A language designer’s choice is whether this should be specified in the parent or the inheriting class. Is it the decision of the designer of the parent or descendant class? Cases can be made for both. They are not mutually exclusive and can be catered for quite easily in an object-oriented language.

There are three options, corresponding to the ‘must not’, ‘can’, and ‘must’ be redefined:

    1) The redefinition of a routine is prohibited; descendant classes must use the routine as is.
    2) A routine could be redefined. Descendant classes can use the routine as provided, or provide their own implementation as long as it conforms to the original interface definition and accomplishes at least as much.
    3) A routine is abstract. No implementation is provided and each non-abstract descendant class must provide its own implementation. This is polymorphism.

The base class designer must decide options 1 and 3. Descendant class designers must decide option 2. A language should provide direct syntax for these options.

Option 1

C++ does not cater for the first option. Not using a virtual function is the closest. But in that case the routine can be completely replaced. This causes two problems. Firstly, a routine can be unintentionally replaced in a descendant. The compiler should report a syntax error due to ‘duplicate declaration’. This is logical as descendant classes are part of the same name space as classes they inherit from. The redeclaration of a name within the same scope should cause a name clash. Allowing two entities to have the same name within one scope causes ambiguity and other problems. (See the section on name overloading.)

The following example illustrates the second problem:

    class A
    {
    public:
        void nonvirt ();
        virtual void virt ();
    };

    class B : public A
    {
    public:
        void nonvirt ();
        void virt ();
    };

    A a;
    B b;
    A *ap = &b;
    B *bp = &b;

    bp->nonvirt (); // calls B::nonvirt as you would expect
    ap->nonvirt (); // calls A::nonvirt, even though this object is of type B.
    ap->virt ();    // calls B::virt, correct version of the routine for B objects.

In this example, class B has extended or replaced routines in class A. B::nonvirt is the routine that should be called for objects of type B. It could be pointed out that C++ gives the client programmer flexibility to call either A::nonvirt or B::nonvirt. But this can be provided in a simpler, more direct way. A::nonvirt and B::nonvirt should be given different names. That way the programmer calls the correct routine explicitly, not by an obscure and error prone trick of the language, as follows:

    class B : public A
    {
    public:
        void b_nonvirt ();
        void virt ();
    };

    B b;
    B *bp = &b;
    bp->nonvirt ();   // calls A::nonvirt
    bp->b_nonvirt (); // calls B::b_nonvirt

Now the designer of class B has direct control over B’s interface. The application requires that clients of B can call both A::nonvirt and B::b_nonvirt. B’s designer has explicitly provided for this. This is good object-oriented design, which provides strongly defined interfaces. C++ allows client programmers to play tricks with the class interfaces, external to the class, and B’s designer cannot prevent A::nonvirt from being called. This is opposite to good modular design. This shows the unsafeness of C++’s virtual mechanism. Objects of class B have their own specialised ‘nonvirt’. But B’s designer does not have control over B’s interface to ensure that the correct version of nonvirt is called.

C++ also does not protect class B from other changes in the system. Suppose we need to write a class C that needs ‘nonvirt’ to be virtual. Then ‘nonvirt’ in A will be changed to virtual. But this breaks the B::nonvirt trick. The requirement of class C to have a virtual routine forces a change in the base class. This has an effect on all other descendants of the base class, instead of the specific new requirement being localised to the new class. This is opposite to the reason for OOP having loosely coupled classes, so that new requirements and modifications will have localised effects, and not require changes elsewhere which can potentially break other existing parts of the system.

Rumbaugh et al. put their criticism of C++’s virtual as follows: “C++ contains facilities for inheritance and run-time method resolution, but a C++ data structure is not automatically object-oriented. Method resolution and the ability to override an operation in a subclass are only available if the operation is declared virtual in the superclass. Thus, the need to override a method must be anticipated and written into the origin class definition. Unfortunately, the writer of a class may not expect the need to define specialized subclasses or may not know what operations will have to be redefined by a subclass. This means that the superclass often must be modified when a subclass is defined and places a serious restriction on the ability to reuse library classes by creating subclasses, especially if the source code library is not available. (Of course, you could declare all operations as virtual, at a slight cost in memory and function-calling overhead.)” [RBPEL91]

A further argument is that any statement should consistently have the same semantics. The object-oriented interpretation of a statement like a->f () is that the most suitable implementation of f() is invoked for the object referred to by ‘a’, whether the object is of type A, or a descendant of A. In C++, however, the programmer must know whether the function f() is defined virtual or non-virtual in order to interpret exactly what a->f () means. Therefore, the statement a->f () is not implementation independent. A change in the declaration of f () will change the semantics of the invocation. Implementation independence means that a change in the implementation DOES NOT change the semantics of executable statements.

If a change in the declaration changes the semantics, this should generate a compiler detected error. The programmer should make the statement semantically consistent with the changed declaration. This reflects the dynamic nature of software development, where the program text is subject to perpetual change.

For yet another case of the inconsistent semantics of the statement a->f () vs constructors, consult section 10.9c, p 232 of the C++ ARM. [Sakkinen 92] points out that a descendant class can redefine a private virtual function even though it cannot access that function in other ways. When the ancestor class calls the function it instead invokes the function in the descendant class.

Option 2

The second option should be left open for the programmers of descendant classes. In C++, however, the decision must be made in the base class. In object-oriented design, the decisions you decide not to make are as important as the decisions you make. Decisions should be made as late as possible. This strategy prevents mistakes being built into the system at early stages. By making early decisions, you are often stuck with assumptions that later prove to be incorrect. C++ requires the parent class to specify potential polymorphism by virtual (although an intermediate class in the inheritance chain can introduce virtual). This prejudges that a routine might be redefined in descendants. This can be a problem because routines that aren’t actually polymorphic are accessed via the slightly less efficient virtual table technique instead of a straight procedure call. (This is never a large overhead but object-oriented programs tend to use more and smaller routines, making routine invocation a more significant overhead.) The policy in C++ should be that routines that might be redefined should be declared virtual.

Virtual, however, is the wrong mechanism for the programmer to deal with. A compilation system can detect polymorphism, and generate the underlying virtual code, where and only where necessary. Having to specify virtual burdens the programmer with another bookkeeping task. This is the main reason why C++ is a weak object-oriented language, as the programmer must constantly be concerned with low level details. The compiler should take care of such detail and so relieve the programmer.

Another problem in C++ is mistaken redefinition. The base class routine can be
redefined unwittingly. The compiler should report an erroneous name redefinition within the same name space unless the descendant class programmer specifies that the routine redefinition is really intended. The same name can be used, but the programmer must be conscious of this, and state this explicitly, especially in environments where systems are assembled out of preexisting components. Unless the programmer explicitly overrides the original name a syntax error should report that the name is a duplicate declaration. C++, however, adopted the original approach of Simula. This approach has been improved upon, and other languages have adopted better, more explicit approaches, that avoid the error of mistaken redefinition.

Eiffel and Object Pascal cater for this situation as the descendant class programmer is required to specify that redefinition is intended. This has the extra benefit that a later reader or maintainer of the class can easily identify the routines that have been redefined, and that this definition is related to a definition in an ancestor class without having to refer to ancestor class definitions. Thus option 2 is exactly where it should be, in descendant classes.

Option 3

The pure virtual function caters for the third option. The routine is undefined, the class is abstract and cannot be directly instantiated. A descendant class must define the routine if it is to be instantiated. Any descendants that do not define the routine are also abstract classes. This concept is correct, but see the section on pure virtual functions for a criticism of the syntax.

Virtual is a difficult notion to grasp. The related concepts of polymorphism and dynamic binding, redefinition, and overloading are easier to grasp, being oriented towards the problem domain. Virtual routines are an implementation mechanism for polymorphism. Polymorphism is the ‘what’, and virtual is the ‘how’. Smalltalk and Objective-C use a different mechanism to implement polymorphism. Virtual is an example of where C++ obscures the concepts of OOP. The programmer has to come to terms with low level concepts, rather than the higher level object-oriented concepts. Interesting as underlying mechanisms might be for the theoretician or compiler implementer, the practitioner should not be required to understand or use them to make sense of the higher level concepts. Having to use them in practice is tedious and error-prone, and can prevent the adaptation of software to further advances in the underlying technology and execution mechanisms (see concurrency).

3.2. Pure Virtual Functions

As mentioned above, pure virtual functions provide a means of leaving a function undefined and abstract. A class that has such an abstract function cannot be directly instantiated. A non-abstract descendant class must define the function. The C++ pure virtual syntax is:

    virtual void fn () = 0;

This leaves readers, even those well versed in object-oriented concepts, to guess its meaning. A better choice would have been a keyword such as ‘abstract’. Direct expression of concepts enhances communication, and the ease with which a language can be learnt. When learning a language it is often important to use the index of a text book. A keyword like ‘abstract’ would be easily found in an index. But what do you look for in the case of ‘= 0’? You might not even realise it is significant. It should have syntactic significance as abstract functions are a very important concept in object-oriented design.

The C++ decision is in keeping with the C philosophy of avoiding keywords. This is often at the expense of clarity. A keyword would implement this concept more clearly. For example:

    pure virtual void fn ();

or

    abstract void fn ();

The mathematical notation used in C++ suggests that values other than zero could be used. What if the function is equated to 13?

    virtual void fn () = 13;

A function is either pure, or it is not. To any analyst this suggests a boolean state, which a single keyword conveys. A simple suggestion to fix this is to define ‘= 0’ as abstract:

    #define abstract = 0

then

    virtual void fn () abstract;

‘Pure virtual’ is also an abuse of natural language. It is a combination of words that are somewhat opposite in meaning. Pure means something that really is what it appears to be. For example pure gold. Virtual means something that appears to be what it actually is not. For example virtual memory. Perhaps virtual gold could be fool’s gold. As has been said before, virtual is a difficult concept to grasp. When it is combined with a word such as ‘pure’, the meaning becomes even more obscure. Modern language designers should be very careful in the vocabulary they choose.
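As a minimal sketch of the ‘#define abstract = 0’ suggestion in this section, the fragment below shows that the macro form behaves exactly like the ‘= 0’ notation. It assumes a C++11 or later compiler (for static_assert and <type_traits>); the class names Shape and Square and the function area_of are hypothetical, chosen only for illustration, and do not come from the original text.

```cpp
#include <type_traits>

// The macro suggested in the text. 'abstract' is not a standard C++ keyword,
// so this is only a readability aid layered over '= 0'. It is defined after
// the includes so that it cannot disturb identifiers in library headers.
#define abstract = 0

// Hypothetical example classes (an assumption for illustration).
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const abstract;   // expands to: ... const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const { return side_ * side_; }   // defines the abstract routine
private:
    double side_;
};

// Calls area() through the abstract interface.
double area_of(const Shape& s) { return s.area(); }

// The compiler sees the macro form as an ordinary pure virtual function:
// Shape is abstract, while Square, which defines area(), is not.
static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");
static_assert(!std::is_abstract<Square>::value, "Square can be instantiated");
```

A call such as area_of(Square(3.0)) dispatches to Square::area. The macro only changes how the declaration reads; the compiler still reports an attempt to instantiate Shape in terms of a pure virtual function, which is the vocabulary this section objects to.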
3.3. The Nature of Inheritance

Inheritance is a close relationship. It provides a fundamental way to assemble software components. Objects that are instances of a class are also instances of all ancestors of that class. For effective object-oriented design the consistency of this relationship should be preserved. Each redefinition in a subclass should be checked for consistency with the original definition in an ancestor class. A subclass should preserve the requirements of an ancestor class. Requirements that cannot be preserved indicate a design error and perhaps inheritance is not appropriate. Consistency due to inheritance is fundamental to object-oriented design. C++’s implementation of non-virtual overloading, and overloading by signature (see below) means that the compiler cannot check for this consistency. C++ does not realise this aspect of object-oriented design. This contributes to a wide and costly gap between analysis and design, and implementation.

Inheritance has been classified as ‘syntactic’ inheritance and ‘semantic’ inheritance. Saake et al describe these as follows: “Syntactic inheritance denotes inheritance of structure or method definitions and is therefore related to the reuse of code (and to overriding of code for inherited methods). Semantic inheritance denotes inheritance of object semantics, ie of objects themselves. This kind of inheritance is known from semantic data models, where it is used to model one object that appears in several roles in an application.” [SJE91]. Saake et al concentrate on the semantic form of inheritance. Behavioural or semantic inheritance expresses the role of an object within a system.

Wegner, however, believes code inheritance to be of more practical value. He classifies the difference between syntactic and semantic inheritance as code and behaviour hierarchies [Weg90] (p43). He suggests these are rarely compatible with each other and are often negatively correlated. Wegner also poses the question of “How should modification of inherited attributes be constrained?” Code inheritance provides a basis for modularisation. Behavioural inheritance provides modelling by the ‘is-a’ relationship. Both are useful in their place. Both require consistency checks that combinations due to inheritance actually make sense.

It seems that inheritance is most powerful in the most restrictive form of a semantics preserving relationship. A subclass should not break the assumptions of an ancestor class. Software components are like jig-saw pieces. When assembling a jig-saw the shape of the pieces must fit, but more importantly, the resulting picture must make sense. Assembling software components is more difficult. A jig-saw is reassembling a picture that was complete before. Assembling software components is building a system that has never existed before.

Inheritance in C++ is like a jig-saw where the pieces fit together, but the compiler has no way of checking that the resultant picture makes sense. In other words C++ has provided the syntax for classes and inheritance but not the semantics. Certainly, not very many reusable C++ libraries are available, which suggests that C++ might not support reusability as well as possible. C++ fails to provide this fundamental goal of object-oriented design and programming.

3.4. Function Overloading

C++ allows functions to be overloaded if the arguments in the signature are of different types. Such overloading can be useful as these examples show:

    max (int, int);
    max (real, real);

This will ensure that the best max routine for the types int and real will be invoked. Object-oriented programming, however, provides a variant on this. Since the object is passed to the routine as a hidden parameter (‘this’ in C++), an equivalent but more restricted form is already implicitly included in object-oriented concepts. A simple example such as the above would be expressed as:

    int i, j;
    real r, s;
    i.max (j);
    r.max (s);

but i.max (r) and r.max (j) result in compilation errors because the types of the arguments do not agree. (By operator overloading of course, these can be better expressed, i max j and r max s, but min and max are peculiar functions that might want to accept two or more parameters of the same type.)

The above shows that in most cases, the object-oriented paradigm can consistently express function overloading, without the need for the function overloading of C++. C++, however, does make the notion more general. The advantage is that more than one parameter can overload a function, not just the implicit current object parameter.

The disadvantage is that C++ introduces some inconsistencies that the compiler cannot detect. If the programmer intends to redefine a virtual routine, but makes a mistake in the declaration of the function signature, the compiler will erroneously assume an overloaded function. Any calls to the function using one or other of the signatures will also fail to detect the inconsistency.

When calling the routine, if the programmer makes a mistake in supplying the actual parameters, a C++ compiler cannot be specific about the error. It can only report that no function with a matching signature could be found. Programmers make this sort of mistake for subtle reasons, and it can be time consuming to pinpoint the parameter at fault. Secondly, the incorrect parameter might accidentally match one of the other routines. In that case this error will be propagated into the production code, and could remain undetected a long time.

If it is felt that C++’s scheme of having parameters of different types is useful, it should be realised that object-oriented programming provides this in a more restricted and disciplined form. This is done by specifying that the parameter needs to conform to a base class. Any parameter passed to the routine can only be a type of the base class, or a subclass of the base class. For example:

    A.f (B someB) {...};

    class B ...;
    class D : public B ...
    A a;
    D d;
    a.f (d);

The entity ‘d’ must conform to the class ‘B’, and the compiler checks this.

The alternative to function overloading by signature, is to require functions with different signatures to have different names. Names should be the basis of distinction of entities. This is known to work and solves the above problems. The compiler can cross check that the parameters supplied are correct for the given routine name. This also results in better self-documented software. It is often difficult to choose appropriate names for entities, but it is well worth the effort.

3.5. Virtual Classes

If class D multiply inherits class A via classes B and C, then if D wants to inherit only a single copy of A, the inheritance of A must be specified as virtual in both B and C. This raises two questions. Firstly, what happens if A is declared virtual in only one of B or C? Secondly, what if another class E wants to inherit multiple copies of A via B and C? In C++, the virtual class decision must be made early, reducing the flexibility that might be required in the assembly of derived classes. In a shared software environment different vendors might supply classes B and C. It should be left to the implementer of class D or E, exactly how to resolve this problem. And this is the simplest case. What if A is inherited via more than two paths, with more than two levels of inheritance? Such flexibility is key to reusable software. You cannot envisage when designing a base class all the possible uses in derived classes, and attempting to do so considerably complicates design.

3.6. Name overloading

Naming is fundamentally important in producing self-documenting software. Naming helps realise maintainable and reusable software components. Names are fundamental in freeing programmers from low level manipulation of addresses. Naming is the basis for differentiating between different entities in a software module. Name overloading allows the same name to refer to two or more different entities. The problem is whether the resultant ambiguity is useful, and how to resolve it, as ambiguity weakens the power of names to distinguish entities.

Name overloading is useful for two purposes. Firstly it allows programmers to work on two or more modules without concern about name clashes. The ambiguity can be tolerated as within the context of each module, the name unambiguously refers to a unique entity.

Secondly, name overloading provides polymorphism, where the same name applied to different types refers to different implementations for those types. Polymorphism allows one word to describe ‘what’ is to be computed. Different classes might require different specifications of ‘how’ a computation is done. For example ‘draw’ is an operation that is applicable to all different shapes, even though circles and squares, etc are ‘drawn’ differently.

These two uses of name overloading provide a powerful concept. But use of the same name in the same context must be resolved. Errors can result from ambiguity. In this case the programmer needs to differentiate between entities in ways other than name alone. A common way to do this is to introduce extra distinguishing names. For example in a group of people, where two or more share the same first name, they can be distinguished by their surname. Similarly a unique first name will distinguish the members of a family with a common surname.

This is analogous to classes, where each class in a system is given a unique name. Each member within a class is also given a unique name. Where two objects with members of the same name are used within the same context, the object name can qualify the members. For example a.mem and b.mem.

[Reade 89] points out the difference between overloading and polymorphism. Overloading means the use of the same name in the same context for different entities with completely different definitions and types. Polymorphism though has one definition, and all types are subtypes of a principal type. C. Strachey referred to polymorphism as parametric polymorphism and overloading as ad hoc polymorphism.

Block structured languages provide overloading by scoping. Scoping allows the same name to be used in different contexts without clash or confusion. Nested blocks provide a subtle problem. Names in an outer block are in scope in inner blocks. Many languages, however, allow a name to be overloaded in an inner block. This does more than overload the name, it hides it. The use of a name in the inner block does not indicate any relationship with the same name in the outer block. Textually nested blocks ‘inherit’ named entities from outer blocks. Inheritance accomplishes this in object-oriented languages. Inheritance eliminates the need to textually nest entities, and also accomplishes loose coupling. Nesting makes entities tightly coupled.

Contrary to most high level languages, a name should not be overloaded while it is in scope. This inconveniently hides the outer declaration, and the programmer cannot access the outer entity.

It is also error prone. The following example illustrates this:

    {
        int i;
        {
            int i;    // hide the outer i.
            i = 13;   // assign to the inner i.
                      // Can’t get to the outer i here.
                      // It is in scope, but hidden.
        }
    }

Now delete the inner declaration:

    {
        int i;
        {
            i = 13;   // Syntactically valid,
                      // but not the
                      // intention.
        }
    }

The inner overloaded declaration is removed, and references to that name do not result in syntax errors due to the same name being in the outer environment. The inner instruction now mistakenly changes the value of the outer entity. A compiler cannot detect this situation unless the language definition forbids nested redeclarations. E.W. Dijkstra uses similar reasoning in ‘An essay on the Notion: “The Scope of Variables”’ in “A Discipline of Programming”, [Dijkstra 76].

The above example demonstrates how nesting results in unmaintainable programs. This is because the inner block is tightly coupled to the outer block, and each is sensitive to changes in the other. The advantage of keeping components decoupled and separate is that a programmer can confidently make modifications to one component without affecting other components. Testing can be limited to the changed component, rather than a combination of components, which quickly leads to an exponentiation in the number of tests required.

C++ has an analogous form of hiding. A non-virtual function in a derived class hides a function in an ancestor class. This hiding is explained in section 13.1 of the C++ ARM. This is a discrepancy with declaring multiple functions with the same name in the same class with different signatures. A function in the derived class will hide the functions of the ancestor class, rather than add its signature to the list of possible functions which can be called. This is confusing and error prone. Learning all these ins and outs of the language is extremely burdensome to the programmer. Often they will only be learnt after falling into a trap.

3.7. Polymorphism and Inheritance

Inheritance provides a form of name overloading similar to overloading in subblocks. The scope of a name is the class in which it occurs. If a name occurs twice in a class, it is a syntax error. Inheritance introduces some questions over and above this simple consideration of scope. Should a name declared in a base class be in scope in a derived class? There are three choices:

1) Names are in scope only in the immediate class but not in subclasses. Subclasses can freely reuse names because there is no potential for a clash. This precludes software reusability. Subclasses will not inherit definitions of implementation. Therefore case 1 is not worth considering.

2) The name is in scope in a subclass, but the name can be overloaded without restriction. This is closest to the overloading of names in nested blocks. This is C++’s approach. Two problems arise. Firstly, the name can be unintentionally reused. Secondly, because the new entity is not assumed to have any relationship to the original, its signature cannot be type checked with the original entity. Since consistency checks between the superclass and subclass are not possible, the tight relationship that inheritance implies, which is fundamental to object-oriented design, is not guaranteed. This can lead to inconsistencies between the abstract definition of a base class, and the implementation of a derived class. If the derived class does not conform to the base class in this way, it should be questioned why the derived class is inheriting from the base class in the first place. (See the nature of inheritance.)

3) The name is in scope in the subclass, but can only be overloaded in a disciplined way to provide a specialisation of the original. Other uses of the name are reported as duplicate name errors. This form of overloading in a subclass ensures the entity referred to in the subclass is closely related to the entity in the ancestor class. This helps ensure design consistency. The relationship of
name scope is not symmetric. Names in a subclass are not in scope in a superclass (although this is not the case in typeless languages such as Smalltalk). In order to provide the consistent customisation of reusable software components, the same name should only be used by explicitly redefining the original entity. The programmer of the descendant class should indicate that this is not a syntax error due to a duplicate name, but that redefinition is intended. (This has already been covered in the virtual section.) This choice ensures that the resultant class is logically constructed. This might seem restrictive, but is analogous to strong typing, and makes inheritance a much more powerful concept.

3.8. ‘.’ and ‘->’

The ‘.’ and ‘->’ member access syntax came from C structures. It illustrates where the C base adversely affects flexibility. Semantically both access a member of an object. They are, however, operationally defined in terms of how they work.

The dot (‘.’) syntax accesses a member in an object directly. For example ‘x.y’ means access the member y in the object x.

    obj x;   // declare object x of class obj
             // with a member y.
    x.y;     // access y in object x directly
    x->y;    // syntax error “. expected”

The ‘->’ syntax means access a member in an object referenced by a pointer. For example ‘x->y’ (or the equivalent (*x).y) means access the member y in the object pointer x refers to.

    obj *x;  // declare a pointer x to an
             // object of class obj.
    x->y;    // access y via pointer x
    x.y;     // syntax error “-> expected”

In this example, ‘what’ is to be computed is “access the element y of object x.” In C++, however, the programmer must specify for every access the trivial detail of ‘how’ this is done. The compiler can easily remove this burden from the programmer, as in fact most languages do.

Furthermore, this reduces flexibility as if the ‘obj x’ declaration is changed to ‘obj *x’, the effect is widespread as all ‘x.y’ must be changed to ‘x->y’. Since the compiler gives a syntax error if the wrong access is used, this shows it already knows what access code is required and can generate it automatically. Good programming centralises decisions. The decision to access the object directly or via a pointer should be centralised in the declaration.

3.9. Anonymous parameters in Class Definitions

C++ does not require parameters in function templates to be named. The type alone can be specified. For example a function f in a class header can be declared as f (int, int, char). This gives the client no clue to the purpose of the parameters, without referring to the implementation of the function. Meaningful identifiers are essential in this situation, because this is the abstract definition of a routine. A client of the class and routine must know that the first int represents a ‘count of apples’, etc. It is true that well known routines might not require a name, for example sqrt (int). But this is not appropriate for large scale software development.

The use of anonymous parameters handicaps the purpose of abstract descriptions of classes and members: to facilitate the reusability of software. Program text captures the meaning of the system for some future activity, such as extension or maintenance. To achieve reusability, communication of intent of a software element is essential. A compiler strips away this level of communication, producing a machine executable entity. Languages and compilers that perform less than optimal translations should not penalise careful production of semantic entities. But neither should a language definition allow less than optimal expression to the human reader. Languages do not have to be cryptic to achieve efficiency. In fact cryptic languages impair efficiency, as they make it harder for the programmer to develop efficient systems, and furthermore, they make it harder for automatic code optimisers.

Names are not strictly necessary in programming. Naming exists to help the human reader identify different entities within the program, and to reason about their function. For this reason naming is essential. Without naming, development of sophisticated systems would be nearly impossible. Some languages access parameters by their address (position) in the parameter list ($1, $2, etc). This is quite unsatisfactory, even for shell scripts. Anonymous parameters can save typing in a function template, but then programming is not a matter of convenience. This is inconvenient for later readers. The redundancy is beneficial and saves later programmers having to look up the information in another place. A real convenience in function templates would be that abstract function templates be automatically generated from the implementation text (see header files for more details).

Anonymous parameters illustrate the link between courtesy and safety issues in programming. Due to pressure of work, a client programmer might wrongly guess the purpose of a parameter from the type. Thus the failure of the original programmer to provide a courtesy has
caused a later programmer to breach safety. An interface client must know the intention of the interface for it to be used effectively.

3.10. Nameless Constructors

Multiple constructors can have different signatures, similar to overloaded functions. This precludes two or more constructors having the same signature. Constructors are also not named (apart from the same name as the class). This makes it difficult to discern from the class header the purpose of the different constructors. It is difficult to match an object creation with the called constructor. Constructors suffer from all of the problems described with regards to functions with the same name but different signatures. It would be easy to mark routines as constructors, for example:

    constructor make (...)...
    constructor clone (...)...
    constructor initialise (...)...

where each constructor leaves the object in valid, but potentially different states. Named constructors would aid comprehension as to what the constructor is used for in the same way as function names document the purpose of a function. Secondly, named constructors would allow multiple constructors with the same signature. Thirdly, it is easier to match up an object creation with the constructor actually called.

3.11. Constructors and Temporaries

A ‘return <expression>’ can result in a different value than the result of <expression>. In section 6.6.3, the C++ ARM says “If required the expression is converted, as in an initialisation, to the return type of the function in which it appears. This may involve the construction and copy of a temporary object (S12.2).”

Section 12.2 explains “In some circumstances it may be necessary or convenient for the compiler to generate a temporary object. Such introduction of temporaries is implementation dependent. When a compiler introduces a temporary object of a class that has a constructor it must ensure that a constructor is called for the temporary object.”

A note says “The implementation’s use of temporaries can be observed, therefore, through the side effects produced by constructors and destructors.”

Putting this together, creation of a temporary is implementation dependent, so might or might not be done. If a temporary is created, a constructor is called as a side effect, which can change the state of the object. Different C++ implementations could therefore return different results for the same code.

3.12. Optional Parameters

Optional parameters that assume a default value according to the routine’s declaration are supposed to provide a shorthand notation. Shorthand notations are intended to speed up software development. Such shorthand notations can be convenient in shell scripts, and interactive systems. In large scale software production, however, precision is mandatory, and defaults can lead to ambiguities and mistakes. With optional parameters the programmer could assume the wrong default for a parameter. More importantly, optional parameters undermine type safety. The type of a function is defined by the composition of its input types, and its output type:

    f: T1 x T2 x T3... -> T4

The entire signature determines the type of the function, not just the return type. Optional parameters mean that C++ is not type safe, and that the compiler cannot check that the parameters in the call exactly match the function signature.

Furthermore, they do not provide a great deal of convenience. If a routine has five parameters, the last three of which are optional, and the caller wants to assume the defaults for parameters 3 and 4, but must specify parameter 5, then all five parameters must be specified. A better scheme would be to have a ‘default’ keyword in function calls:

    f (a, b, default, default, e);

Other means, already in the language, can easily provide this mechanism. For example, a call to another (possibly inline) function could provide the defaults for the optional parameters. This not only provides the convenience of optional parameters, but is more powerful. Any parameter or combination can be filled in with any combination of defaults, not just the last parameters. Multiple intermediate routines can provide multiple sets of defaults.

3.13. Bad deletions

The following example is given on p.63 in the C++ ARM as a warning about bad deletions that cannot be caught at compile-time, and probably not immediately at run-time:

    p = new int[10];
    p++;
    delete p;   // error
    p = 0;
    delete p;   // ok

One of the restrictions of the design of C++ is that it must remain compatible with C. This results in examples like the above, that are ill-defined language constructs, that can only be covered by warnings of potential disaster. Removal of such language deficiencies would result in loss of compatibility with C. This might
be a good thing if problems such as the above disappear. But then the resultant language might be so far removed from C that C might be best abandoned altogether.

3.14. Local entity declarations

Declaring an entity close to where it is used has both advantages and disadvantages. It is convenient, but can make a routine appear more complex and cluttered. A problem is that an identifier can be mistakenly overloaded within a nested block in a function, with the resultant problems covered in the sections on name overloading and nesting. C does not have nested routines or blocks so does not have this problem. ALGOL uses this simple form of name overloading. (A block in the ALGOL sense contains both declarations and instructions.)

The ARM explains problems of local declarations with branching, which shows the complications in intermingling declarations and instructions. Caveats cannot make up for or fix faulty language definition.

The C++ FAQ [Cline] (Q83) is unclear on this point (although it is mostly excellent), claiming that an object is created and initialised at the moment it is declared. This only applies to auto (stack) objects. Dynamic entities are not created and initialised until they are the subject of a ‘new’ instruction. In well written object-oriented software, routines will be small, typically performing one atomic action per routine.

Small routines that implement atomic operations are fundamental to loose coupling. For example, a base class that provides a single routine that logically performs operations A and B is not useful to a subclass that needs to provide its own implementation of B, but does not want to change A. The descendant must reimplement the logic of both A and B, missing an opportunity to reuse the logic of A. Tight coupling reduces flexibility. Splitting A and B into different routines accomplishes loose coupling, and therefore flexibility. Efficiency is also attained without the mess of local entity declarations. Good design and clean modularisation achieve efficiency, as the entities which would be locals to a block in C++ are only created when the routine is entered.

3.15. Members

Care should be taken with the C++ use of the term member. In general use, an object is a member of a class. This corresponds to members in set theory. But in C++, the term member means a data item, or function of the class. This ambiguity could have easily been avoided.

3.16. Friends

Friends are a mechanism to override data hiding. Friends of a class have access to its private data. Friend is a ‘limited export’ mechanism. Friends have three problems:

1) They can change the internal state of objects from outside the definition of the class.
2) They introduce extra coupling between components, and therefore should be used sparingly.
3) They have access to everything, rather than being restricted to the members of interest to them.

Friends are useful, and a case can be made for shades of grey between public, protected and private members. Multiple interfaces to a class provide the functionality of friends and avoid the above problems. Each interface to the class can be exported to everything, or selected classes only. A selective export mechanism is more general than public, private, protected and friend, and explicitly documents the couplings between entities in the system. Selective export specifies not only that a member is exported but to which classes it is exported.

One reason given for friends is that they allow more efficient access to data members than a function call. The way C++ is often used is that data members are not put in the public section, because this breaks the data hiding principle. Data hiding is better described as ‘implementation hiding’. Only a class’s abstract functional interface should be visible to the outside world. That is, data members can be exported, but are viewed externally as functional entities. This is because, when used in expressions, functions and variables have no semantic difference. They both return values of a given type. (See fn () for an explanation of why variables and functions are best regarded as similar entities.) (See also Marshall Cline’s explanation of friends in the FAQ for further clarification of the friend concept.)

The Cambridge Encyclopedia of Language has an interesting point about public and private names. It says “Many primitive people do not like to hear their name used, especially in unfavourable circumstances, for they believe that the whole of their being resides in it, and they may thereby fall under the influence of others. The danger is even greater in tribes (in Australia and New Zealand, for example), where people are given two names - a ‘public’ name, for general use, and a ‘secret’ name, which is only known by God, or to the closest members of their group. To get to know a secret name is to have total power over its owner.”

3.17. Static

The word ‘static’ is confusing in C++. Page 98 of the C++ Annotated Reference Manual (ARM) mentions this confusion and gives two meanings. Firstly, a class can have static
members, and a function can have static entities. The second meaning comes from C, where a static entity is local in scope to the current file. The choice of different keywords would easily solve this trivial problem. There is also a third more general meaning that objects are statically or automatically allocated and deallocated on the stack when a block is entered and exited, as opposed to dynamically allocated in free space.

Static class members are useful. Page 181 of the ARM states that statics reduce the need for global variables. It is good to reduce global variables, but the C syntax obscures the purpose.

Entities declared in functions can also be static. These are not needed in an object-oriented language. The reason and history is this. ALGOL has the notion of ‘OWN’ locals in blocks. The semantics of an OWN entity is that when a block is exited, the value of the OWN is preserved for the next entry to the block, i.e. the value is persistent. The implementation is that at compile time, the OWN entity is limited in scope to the block, but at run time, it is located in the global stack frame. The same instance of the variable is used in all invocations of the procedure, rather than each invocation using separate local storage on the stack. This causes complication in recursion.

Simula’s designers generalised the ALGOL notion of block into class, and so object-orientation was born. Instead of discarding a class block on exit, it is made ‘persistent’. Declarations within the class block are persistent, and therefore provide the functionality of static and OWN. Classes are more flexible than statics. Statics are persistent in the same way as globals, ie for the duration of the program. Class member lifetime is governed by the lifetime of the object. Object-oriented languages do not need OWNs or statics.

3.18. Union

Union is another construct that is superfluous in OOP. Similar constructs in other languages are recognised as problematic. For example, FORTRAN’s equivalences, COBOL’s REDEFINES, and Pascal’s variant records. When used to overload memory space these force the programmer to think about memory allocation. Recursive languages use a stack mechanism that makes overloading memory space unnecessary, as it is allocated and deallocated automatically for locals when procedures are entered and exited. The compiler and run time system automatically allocate and deallocate storage as required, ensuring that two pieces of data never clash for the same memory space. This is essential so that the programmer can concentrate on the problem domain, rather than machine oriented details. When union is used similarly to FORTRAN’s equivalences it is not needed.

Union is also not needed to provide the equivalent to COBOL REDEFINES or Pascal’s variants. Inheritance and polymorphism provide this in OOP. A reference to a superclass can also be used to refer to any subclass, and thus provides the same semantics as union, only in a type safe manner, as the alternatives can never be confused. An object reference is implicitly a union of all subclasses.

3.19. Nested Classes

Simula provided textually nested classes similar to nested procedures in ALGOL. Textual (syntactic) nesting should not be confused with semantic nesting, nor static modelling with dynamic run time nesting. Modelling is done in the semantic domain, and should be divorced from syntax. You do not need textually nested classes to have nested objects. Nested classes are contrary to good object-oriented design, and the free spirit of object-oriented decomposition, where classes should be loosely coupled, to support software reusability. Semantic nesting is achieved independently of textual nesting. In object-oriented design all objects should interact only via well defined interfaces. Objects of a class that is textually nested in another class have access to the outer object without the benefit of a clean interface. C avoided the complexity of nested functions, but C++ has chosen to implement this complexity for classes, which is of less use than nested functions.

OOP achieves nesting in two ways: by inheritance and object-oriented composition. Thus modelling nesting is achieved without tight textual coupling. For example, consider a car. We know in the real world that the engine is embedded within the car. In object-oriented modelling, however, this embedding is modelled without textual nesting. Both car and engine are separate classes. The car contains a reference to an engine object. This also allows the vehicle and engine hierarchy to be independently defined. Engine is derived independently into petrol, diesel, and electric engines. This is simpler and more flexible than having to define a petrol engine car, a diesel engine car, etc, which you have to do if you textually nest the engine class in the car. Other examples can also be structured without textual nesting, and no loss of generality.

In C++, not only can classes be nested within other classes, but also within functions, thereby tightly coupling a class to a function. This confuses class definition with object declaration. The class is the fundamental structure in object-oriented programming and nothing has existence separate from class (including globals). C++ is confused as to whether it is procedure-oriented or object-oriented.

3.20. Global Environments

The global environment provides a special case of nested classes. When classes are nested in a global environment, dependencies can arise that make the classes difficult to decouple from that environment, and therefore not reusable. Even if a class is not intended for use in another context, it will benefit from the discipline of object-oriented design. Each class is designed independently of the surrounding environment, and relationships and dependencies between classes are explicitly stated.

In C++ functions can change the global environment, beyond the object in which they are encapsulated. Such changes are side-effects that limit the opportunity to produce loosely-coupled objects, which is essential to enable reusable software. This is a drawback of both global and nested environments.

A good OO language will only permit routines in an object to change its state. Removing the global environment is trivial. It is simply encapsulated in an object or set of objects of its own. Therefore global entities are subject to the discipline of object-oriented design. Having globals in a system circumvents OOD. Objects can also provide a clean interface to the external environment, or operating system, without loss of generality, for a negligible performance penalty. Thus classes are independent of a surrounding environment, and the project for which they were first developed, and are more easily adaptable to new environments and projects.

3.21. Header Files

In C++ a class interface must be maintained separately from its body. While an abstract interface should be distinct from a concrete implementation, the interface and implementation can both be derived from one source. In C++ though, programmers must maintain the two sets of information. Replicated information has well known drawbacks. In the event of change, both copies must be updated. This can lead to inconsistencies that must be detected and corrected. Tools can automatically extract abstract class descriptions from class implementations, and guarantee consistency.

The programmer must also use #includes to manually import class headers. #include is an old and unsophisticated mechanism to provide modularity. #include is a weak form of inheritance and import. C++ still uses this 30 year old technique for modularisation, while other languages have adopted more sophisticated approaches, for example, Pascal with Units, Modula with modules, Ada with packages. In Eiffel the unit of modularisation is the class itself, and includes are handled automatically. The OOP class is a more sophisticated way to modularise programs. Inheritance implements reusability and modularisation, so #include is superfluous.

Another problem is that if header A includes header B, and header B includes header A, a circular dependency occurs. The same problem occurs if header A includes headers B and C, and header B also includes header C. A simple but messy fix in all headers solves this problem:

    #ifndef thismod
    #define thismod
    ... rest of header
    #endif

Headers show how C++ addresses the problem of independent modules by a non-object-oriented approach that is sub-optimal; the programmer must supply this bookkeeping information manually. A class interface is equivalent to a module header. A module header contains data and routines exported to other modules. This is exactly the purpose of the class interface. A class definition contains all knowledge of component classes and their dependencies (inheritance and client) in the class text. Dependency analysis is derivable from the class text. Tools like ‘make’ can be integrated into the compiler itself, and the errors and tedium encountered in the use of ‘make’ are avoided. #includes relate to the organisation and administration of a project. Rational language design eliminates such bookkeeping mechanisms.

A traditional system is assembled by combining modules. An object-oriented system is assembled by combining classes. Modules are a primitive form of classes. Classes are more sophisticated. They express more precisely relationships with other classes. C++ #includes and modules have problems. This primitive method is not required in an object-oriented language.

3.22. Class Interfaces

Section 9.1c of the C++ ARM points out that C++ has no direct support for “interface definition” and “implementation module”. In a C++ class definition, all private and protected members must be included in the public text of the class. The ARM points out that whenever the private or protected parts are changed, the whole program must be recompiled. Further to what the ARM says, all modules that are dependent on the header file must be recompiled, even though the private and protected members do not affect other modules. Private members should not be in the abstract class interface, as this exposes implementation details to programmers of other modules.

3.23. Class header declarations

C’s syntax for function declarations is [<type>] <identifier> (<parameters>). For (a very simple) example:

    class C
    {
    a ();
    b ();
    int c ();
    d ();
    char e ();
    virtual void f ();
    }

To find an identifier in this layout, the eye must trace a course around the type specifications. This is a tiring activity. The eye has a greater chance of missing the sought identifier, and the programmer must resort to using the search function of a text editor to help out.

Other languages place the entity names first. For example:

    class C
    {
    a ();
    b ();
    c () int;
    d ();
    e () char;
    f () virtual void;
    }

To those used to the ALGOL and FORTRAN style of type first, this seems backwards. But name first is logical as a real world example illustrates. Imagine if a dictionary is published, and the keywords are not placed first, but rather the entry order is -

    noun /obvrzen/ obversion, the act or result of obverting

Such a dictionary would not sell many copies, unless the marketers managed to fool many people that the explanation of the meaning was more correct because the order of layout was mysteriously magical. This example illustrates how important subtle syntax decisions are, and why PASCAL style languages might have ordered things contrary to FORTRAN, ALGOL and others. The language designer must consider these trivial but important alternatives. The layout of programming entities is essential for effective communication. The dual roles of language syntax and programming style affect comprehension. A dictionary or index style layout suggests placing entity names first, followed by their definition.

3.24. Garbage Collection

One of the hallmarks of high level languages is that programmers declare data without regard to how the data is allocated in memory. In block structured languages, local variables are allocated on the stack, and automatically deallocated when the block exits. This relieves the programmer of a great burden. Garbage collection provides equivalent relief in languages with dynamic entity allocation.

In C++ the programmer must manually manage storage due to the lack of garbage collection. This is a difficult bookkeeping task that leads to two opposite problems. Firstly, an object can be deallocated prematurely, while valid references still exist (dangling pointers). Secondly, dead objects might not be deallocated, leading to memory filling up with dead objects (memory leaks). Attempts to correct either problem can lead to overcompensation and the other problem occurring. A correct system is a fine balance. This is illustrated in the figure below.

[Figure: Dangling Pointers - Correct System - Memory Leaks]

These problems contribute to the fragility of C++ programs, and usually result in system failure. Garbage-collection solves both problems. Garbage-collection has an undeserved bad reputation due to some early garbage-collectors having performance problems, instead of working transparently in the background, as they can and should. These problems are often over-emphasised as a justification for C++ ignoring garbage collection. A possible solution is to build garbage collection into the run time architecture, but allow the programmer to activate and deactivate it manually. Garbage collection can be disabled in systems where it is inappropriate.

In C++ it might be argued that the lack of garbage-collection is not an engineering compromise. Its inclusion is nearly an engineering impossibility, as a programmer can undermine the structures required for implementing correctly working garbage-collection. While garbage-collection might not actually be an impossibility in C++ (EC++), it is difficult, and programmers would have to settle for a more restricted way of programming. This could be a good thing. But then the compromise to remain compatible with C becomes difficult, if the compiler is to detect practices inconsistent with the operation of garbage-collection.

3.25. Type-safe linkage

The C++ ARM explains that type-safe linkage is not 100% type safe. If it is not 100% type-safe, then it is unsafe. It is the subtle errors that cause the most problems, not the simple or obvious ones. Often such errors remain undetected in the system until critical moments. The seriousness of this situation should not be underestimated. Many forms of transport, such as planes, and space programs depend on software to provide safety in their operation. The financial survival of organisations can also depend on software. To accept such unsafe situations is at best irresponsible.
The C++ ARM summarises the situation as follows - “Handling all inconsistencies - thus making a C++ implementation 100% type-safe - would require either linker support or a mechanism (an environment) allowing the compiler access to information from separate compilations.”

So why does the C++ compiler (at least AT&T’s) not provide for accessing information from separate compilations? Why is there not a specialised linker for C++, that actually provides 100% type safety? There is no reason why C++ should not be implemented this way. Building systems out of preexisting elements is the common Unix style of software production. This implements a form of reusability, but not in the truly flexible manner of object-oriented reusability.

In the future, Unix could be replaced by object-oriented operating systems, that are indeed ‘open’ to be tailored to best suit the purpose at hand. By the use of pipes and flags, Unix software elements can be reused to provide functionality that approximates what is desired. This approach is valid and works with efficacy in some instances, like small in-house applications, or perhaps for research prototyping, but is unacceptable for widespread and expensive software, or safety critical applications. In the last ten years the advantages of integrated software have been acknowledged. Classic Unix systems don’t provide those advantages. Integrated systems are more ambitious, and place more demands on developers. But this is the sort of software now being demanded by end users. Systems that are cobbled together are unacceptable.

A further problem with linking is that different compilation and linking systems should use different name encoding schemes. This problem is related to type-safe linkage, but is covered in the section on ‘reusability and compatibility’.

3.26. C++ and the software lifecycle

The software lifecycle has attracted a great deal of attention. It is at least generally accepted that the activities in the lifecycle are analysis of requirements, design, implementation, testing and error correction, extension. Unfortunately, identifying these activities has resulted in a school of thought that the boundaries between these activities are fixed, and that they should be systematically separate, each being completed before the next is commenced. It is often argued that if they are not cleanly separated, then you are not practicing disciplined system development. This view is incorrect. Someone who writes a program straight away is actually doing all the steps in parallel. It might not be the best way to do things in many circumstances, might or might not suit the style and thinking of different people, but this works in some scenarios, and can be the methodology of choice of disciplined thinkers. Some people can hold a whole problem and solution in their head and work in a disciplined fashion until the solution is complete. Mozart is said to have composed this way, producing his last three symphonies in as many months in 1788. Beethoven toiled far more over the production of his works, taking years to complete one symphony. Both composers produced masterpieces. Mozart wrote music directly, whereas Beethoven wrote themes and ideas in his famous sketchbooks. The production of masterpieces depends on skill, not on methodologies.

It is becoming accepted that the software lifecycle should be an integrated process. Analysis, design and implementation should be a seamless continuum. The activities of the lifecycle should progress in parallel to expedite software development. Facts found out only as late as the implementation stage can be fed back into the analysis and design stages. The object-oriented approach supports this process. Artificial separation of the steps leads to a large semantic gap between the steps. The transformations required to bridge such semantic gaps are prone to misinterpretation, time consuming and costly.

The same people should be responsible for all stages. This way they take responsibility for the system as a whole, rather than passing the buck and blame which occurs when analysts, designers and implementers are different groups. This is not a popular viewpoint in traditional hierarchical management structures where programmers get promoted to designers who get promoted to analysts. Hierarchical management also discourages people from feeling responsible for a product. This culture must radically change if we are to produce quality systems.

We should have learnt from the extremes of SA/SD. Some quarters believed that methodology was all important, while programming and programming languages were unimportant. Arcane and machine-oriented programming languages strengthened this attitude. These languages concentrate on the ‘how’ of computation, whereas the modellers correctly demand notations that express the ‘what’, in order to be implementation independent. A modern software language supports the integration of the activities of design and implementation by being readable, and problem-oriented. A language should be as close to design as possible. The needs and requirements of an enterprise can change much more rapidly than programmers can keep up, especially in a highly competitive and commercial world.

So how does C++ fit into this picture? Well it is based on C that was designed mainly as an implementation and machine-oriented language. It is an old language, that did not need to consider the integrated lifecycle approach. C++ might have
some of the trappings of object-oriented concepts, but it is the marriage of a problem-oriented technique with a machine-oriented language. It addresses implementation, but not so well the other aspects of the software lifecycle. Since C++ is not so well integrated with analysis and design, the transformation required to go from analysis and design to implementation is costly. The semantic gap between design languages and the implementation language is great.

We should have learnt from the structured world that this is the incorrect approach to the software lifecycle. But in the OO world we are again falling into the trap of dividing the lifecycle into artificially distinct activities of OOA, OOD and OOP, instead of adopting an integrated approach to these. Modern languages provide a much more integrated approach to the complete software development process than C++. C++ supports classes and inheritance and other concepts of object-orientation, but fails to address the entire software lifecycle.

3.27. Reusability and Communication

Reusability is a matter of communication.

In order to use a software component, you must be able to understand it. The writer must communicate the purpose, intent, and correct usage of the component to the client. In the object-oriented world, clear and concise definition of software modules is not a mere nicety, but essential for reusability. Arising out of the issue of reusability is extendibility. In order to maximise the reuse of software, it often must be tailored for new applications. The client programmer must decide whether the software component is suitable for the new task. If so, what is the best way to extend it? Clear communication to clients is a courtesy concern.

3.28. Reusability and Trust

Reusability is a matter of trust.

Trust results from confidence that safety concerns have been met. If you do not have confidence in a software component, then it is difficult to consider it for reuse. There could be doubt that the software component provides enough functionality, or correct functionality. There could be doubt that the component is efficient enough, or worse it might crash. The C/C++ philosophy of not building checks into the language and compiler because programmers can be trusted, works against trust and reusability.

In the real world of reusability, the ideal of trusting programmers is inappropriate. Trusting programmers results in less trustworthy software. In reality, customers doubt the claims of suppliers. It is the onus of the supplier to prove their claims, and thus trustworthiness of the software. The client is not required to trust the supplier’s programmers. Potential clients of a software component require assurance that the component is trustworthy. Trusting programmers is against the commercial interest of both parties. This is not to cast aspersions on programmers, but merely recognises that computers are good at performing mundane tasks and checks, but people are not. If people were good at such things, we would not need computers in the first place. Building trustworthy components is a safety concern.

3.29. Reusability and Compatibility

Different compiler implementations need to be compatible in order to realise reusability between components. Different C++ compilers generate different class layouts, virtual function calling techniques, etc. The name encoding schemes used for type safe linkage can also be different. If two different compilers generate different run-time organisations, then different name encodings are desirable as it will prevent two incompatible libraries from being linked. The C++ ARM (p122) states “If two C++ implementations for the same system use different calling sequences or in other ways are not link compatible it would be unwise to use identical encodings of type signatures.”

This can be solved in two ways. Firstly, a library vendor could provide the entire source of a library so it can be compiled by the customer’s compiler. This is not satisfactory if the sources are proprietary. In that case, the vendor will need a separate release for every environment, and every compiler in that environment.

Because of this problem a strong case exists for a universal intermediate machine readable representation of programs. Interestingly, some systems are already using C as a ‘universal assembler’, notably AT&T C++ and Eiffel. But this cannot solve the above problems of compatibility between components without a standardisation effort on run time layouts and name encoding schemes.

3.30. Reusability and Portability

Since true OOP ensures that objects are loosely coupled to the external environment, portability to diverse environments is possible. C is tightly coupled to the Unix environment, and as such is not particularly portable to diverse environments.

3.31. Idiomatic Programming

The ability to program in different idioms is argued as a strength of C++. Idiomatic programming, however, is a weak form of paradigmatic programming. It is programming in a paradigm without necessarily having compiler support for that paradigm. The compiler cannot check for inconsistencies with the idiom, or paradigm. Defines can often be used to invent idioms. Anyone who has attempted to do object-oriented programming in a conventional language using defines will realise that it is impossible to realise all the benefits easily, if at all, without compiler support.

3.32. Concurrent Programming

In the next ten years multiple processor arrays that execute programs concurrently will probably become common. Concurrency requires much cleaner languages than the single processor languages of today. Object-oriented concepts support concurrent programming. Objects can execute state changing code independently of each other. Concurrent programming will be enabled by the division of the state space of a system into modules to achieve a high degree of independent processing. Objects provide a scheme to cleanly divide state spaces. The demand that everything be broken down into loosely coupled modules, that only interact through well defined interfaces might be perceived as inefficient. But it is precisely this scheme that will mean that concurrent solutions can be developed efficiently and transparently to the programmer. Concurrency should be transparent to the programmer, as concurrency is a low level implementation consideration. That is, concurrency is how a computation is done, not what is to be computed. The programmer should be concerned with what is to be computed, not how. How something is computed is the concern of the target environment, ie the compilers, operating system, and hardware. When programmers are not concerned with this level, efficiency and portability follow automatically.

The aim of concurrent processing is to keep all the processors in a processor array as fully utilised as possible, so that processor resources are not wasted. This is as good as can be expected. There is nothing more mysterious to concurrent programming than the efficient use of resources. Keeping all processors busy is an inherently dynamic problem, which the programmer cannot determine statically at compile time. All the processors can be kept busy, as long as there are enough threads in the system.

In concurrent programming, a thread is a unit of sequential execution. Concurrency is achieved by the splitting of threads. A thread can be split when a state changing routine is invoked, but not a value returning function, because it must wait for the value. State changing routines can easily be invoked on another processor. Object level granularity seems to be a natural candidate for concurrent processing. An object can have only one update thread at a time to avoid simultaneous update problems. Other levels of concurrency are instruction level, and task or process level. Task or process level is the level used in conventional multi-processing systems currently commercially produced, and instruction level is quite difficult, best being left to instruction pipelines.

Object level is natural for the programmer, and has the advantage that a programmer can implement a system without taking into account parallel processing at all. The same program will run and produce identical results irrespective of whether the customer is running a single processor, or a processor array.

Side effects must be avoided in concurrent systems. Suppose a computation depends on combining the results of two functions f and g, such as f + g. If f and g are independent, then they can be computed concurrently. If however, f produces side effects that g depends on, they must be computed sequentially. F and g are parameters to the + function. Routine parameters can be computed concurrently, as long as the computation of each causes no side effects. Side effects are avoided by restrictive practices that C devotees would object to.

C++ does not preclude the use of a global environment. Access to shared global data potentially causes a thread to lock, and if many such accesses occur, the advantage of concurrency is lost. This is because updates to a global environment are side effects. Programming in such an environment requires complex locking mechanisms to ensure that things happen in the correct order. Locks are rather like waiting for a plane to take off when it has to wait for another connecting flight. This cannot be entirely avoided, but should be reduced as much as possible.

4. The role of Language

For an intermission between sections, I’ll mention some interesting points that the Cambridge Encyclopedia of Language [Crystal 87] makes. It says that language is an emotional subject. “It is not easy to be systematic and objective about language study. Popular linguistic debate regularly deteriorates into invective and polemic. Language belongs to everyone; so most people feel they have a right to hold an opinion about it. And when opinions differ, emotions can run high. Arguments can as easily flare over minor points of usage as over major policies of linguistic planning and education.”

While natural language is difficult to be “systematic and objective about”, should this apply to computer languages? The definition of natural language is generally beyond our control, with the exception of languages such as Esperanto. Programming language definition, however, is within our control. Programming languages must have expressiveness like natural language, yet be precise and semantically consistent. As programming languages have rigorous requirements, we should be even more critical and objective about them. It is a measure of immaturity in the programming profession that emotional and irrational defensiveness often denies valid criticism. Many dismiss the choice of
C++?? 2nd Edition page 18
programming language as a religious issue. If language choice is merely religious, then we might as well still program in assembler, or maybe even binary, because the adoption of high level languages would have no technical merit. Language choice, however, is a technical consideration. Technical measures should judge the effectiveness of a language. Understanding the role of language helps quantify what must be measured.

The Cambridge encyclopedia lists several functions of language. “To communicate our ideas”, it says is the most common answer, and this must surely be the most widely recognised function of language. It lists several other functions of language. One function is emotional expression. For instance, when we stub our toe, we often emit words, even when there is no one to hear. Another is social interaction. For example if someone sneezes, we often “bless” them. Another is the power of sound, as in poetry and rhyming jingles etc. Another is the control of reality, as in spells and incantations. Perhaps computer programs and spells are similar in purpose. Another is recording facts. This includes record keeping, historical and geographical documents, etc. Another is the instrument of thought. We often reason about things to ourselves in language. Another function is the expression of identity. Language can express who we are, or affirm our belonging to certain groups. Perhaps the most important role of computer languages is to enable description, and recording the decisions made during the design and implementation of a system.

Since language and communication are two closely related concepts, it is important to understand their relationship, and the nature of communication. Language is the set of aural and written symbols with which we communicate. Laurence Wylie in the foreword to “French in Action” [Capretz 87] describes communication as “To understand this [communication] we must know the basic meaning of the words common, communicate, and communication. They are derived from two Indo-European stems that mean “to bind together.” In this ordered universe, no human being can live in isolation. We must be bound together in order to participate in an organised effort to accomplish the necessary activities of existence. This relationship is so vital to us that we must constantly be reassured of it. We test this connection each time we have contact with each other.”

The concept of binding is also important in computing. In networks, binding establishes communication links between two or more entities. This forms a greeting, so that a relationship is established, and communication is possible. In programming we have the concepts of static and dynamic binding. Binding in this paradigm makes it possible for a message to be sent from one object to another, so that they can communicate and interact. Static binding determines this message in advance as the receptor is always the same type, or descendant of that type. Static typing ensures in advance that the receptor object can process the message. Dynamic binding, means that the exact message to be sent is determined by the dynamic type of the receptor when the message is actually sent. For example, on your telephone, you talk to your friends differently than a client, even though you are using the same piece of equipment. These are the concepts that C++’s virtual do not express well.

Designing an object-oriented system is like designing a language by which objects interact. Thus tools used for formal programming language design, BNF, denotational semantics, and axiomatic semantics can help in the design of an object-oriented system.

“Language shapes the way we think, and determines what we can think about.” - B.L.Whorf. Bjarne Stroustrup quotes this in “The C++ Programming Language”. But is this correct? The encyclopedia says, “It seems evident that there is the closest relationship between language and thought: everyday experience suggests that much of our thinking is facilitated by language. But is there identity between the two? Is it possible to think without language? Or does language dictate the ways in which we are able to think? Such matters have exercised generations of philosophers, psychologists, and linguists, who have uncovered layers of complexity in these straightforward questions. A simple answer is certainly not possible; but at least we can be clear about the main factors which give rise to complications.”

The above Whorf quote is a statement of the Sapir-Whorf hypothesis on language and thought. Edward Sapir (1884-1939) formulated this with his pupil Benjamin Lee Whorf (1897-1941). It reflects the view of its day when great value was placed on the diversity of the languages and cultures of the world. The Sapir-Whorf hypothesis combined two principles. The first is ‘linguistic determinism’, which states that language determines the way we think. The second, ‘linguistic relativity’, that the distinctions found in one language are not found in any other. There can be both verbal and non-verbal thought; following a road map in a car for example. Street directions are often difficult to put into words.

The Sapir-Whorf hypothesis in its strongest form, as in the Whorf quote, is now not generally accepted. For one reason, it is known that concepts can be translated from one language into another. This is even if in one language, the concept can be expressed in one word, but takes a phrase of words in another. A weaker version of the Sapir-Whorf hypothesis is accepted. That is “language may not determine the way we think, but it does influence the way we perceive and
remember, and it affects the ease with which we perform mental tasks. Several experiments have shown that people recall things more easily if the things correspond to readily available words and phrases. And people certainly find it easier to make a conceptual distinction if it neatly corresponds to words available in their language. Some salvation for the Sapir-Whorf hypothesis can therefore be found in these studies, which are being carried out within the developing field of psycholinguistics.”

The important question to the programming community is do programming languages ‘shape’ the way we think about and design systems? The negative argument is that it is the concepts behind languages that are important, not the languages themselves. Languages only provide a framework for the expression of the concepts. A language can only be as good as the concepts it implements. A programming language influences the way we program, and the way we use the concepts it implements. It can clarify the concepts, or obscure them as in the case of C++. A language must implement the concepts cleanly and simply. It must express the concepts in as few words and constructs as possible. But this does not just mean avoiding keywords as in C. Programmers who understand the concepts should have no difficulty in adapting to different languages, as long as the new language implements the concepts elegantly.

A language can be judged like a wine connoisseur judges wine, by holding it up to the light to judge for clarity and colour. Ultimately, it is the taste that matters, but good colour and clarity suggests that the taste is more likely to be good. Clear programming language definition helps in the goal of the production of quality software.

So where does this leave Sapir-Whorf with respect to programming languages? Programming languages do not shape the way we think. It is the concepts that shape the languages, and it is the way we think that shapes the concepts. Those who have attempted to learn a language in order to learn object-oriented programming realise that it is the concepts which must be grasped in order to be effective. Once the concepts have been learnt, object-oriented programming seems a natural way to program. It matches very effectively the way we think. If C++ has been designed according to the Sapir-Whorf hypothesis, its philosophical basis does not serve a computer industry that should shape tools best suited to its purposes, processes, thinking, and concepts.

5. On Writing

During the development of this critique, I realised it had grown larger than I had intended, and that my writing style needed some polish for such a large work. During my research, some colleagues recommended a small book, “The Elements of Style,” [S & W 79]. This has been around for most of this century in one form or another. William Strunk, the original author had some stern advice for his students:

“Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts.”

The machines that the software professional develops are not built, but written. Strunk’s last sentence prompts consideration of the relationship of writing to software development. A common situation is taking several thousand lines of incomprehensible ‘code’, and making it execute efficiently. After spending considerable time we often realise what the program does, and reduce it to several hundred lines of program ‘text’ that runs ten times faster. Strunk’s quote should be applied to programming. A routine should contain no unnecessary declarations or instructions, a system no unnecessary routines. It can also be applied to programming languages. A programming language should contain no unnecessary constructs. This is the root of my dissatisfaction with C++. Much of it is unnecessary, even for the most complex systems. Its syntax is ugly. C++ has become what the C world has constantly criticised in languages like PL/1 and Ada. Only C++ is worse. We need to regain artistic elegance and simplicity.

6. Generic C criticisms

These criticisms apply to the C base language, but in general adversely affect C++. R.P.Mody [Mody 91] gives an excellent general criticism of C. He says that to properly understand C you must understand the insides of the compiler. He gives many examples of how C obscures rather than clarifies software engineering. He concludes that he is “appalled at the monstrous messes that computer scientists can produce under the name of ‘improvements’. It is to efforts such as C++ that I here refer. These artifacts are filled with frills and features but lack coherence, simplicity, understandability and implementability. If computer scientists could see that art is at the root of the best science, such ugly creatures could never take birth.”

C’s popularity is based on several myths. Firstly, that it is a high level language. It is not. It is a structured assembler oriented towards the low level machine domain, not to the problem domain of a high level language. Secondly, that it is small and simple. Its semantics are not simple, but it is very simple to make catastrophic errors. Thirdly, that it is portable. Certainly compilers are available on many platforms, but this does not make programs portable, especially to diverse, and future architectures. Platform independence achieves portability. Fourthly, that it is efficient.
What seems efficient on some platforms is the very antithesis of efficiency on other platforms. It seems efficient on certain platforms because it allows the lower level machine-oriented architecture to be visible at a higher level, instead of being handled transparently by the compiler. This means that programs will be locked into certain styles of architecture, or into current styles of technology, instead of protecting program investment against future technological change. And lastly, that the semantics are mathematically rigorous. Anyone who reads the C++ ARM will realise just how poorly defined the language is. Anyone who has practiced C will know how many traps there are to fall into.

6.1. Pointers

C pointers are a low level mechanism that should not be the concern of programmers. Pointers mean the programmer must manipulate low level address mechanisms, and be concerned with lvalue and rvalue semantics, which are machine oriented and not problem oriented as you would expect of a high level language. A compiler can easily handle such issues without loss of generality or efficiency. Memory models of different environments often affect the definition of pointers. Memory model details such as near and far pointers should be transparent to the programmer.

The programmer must also be concerned with correct dereferencing of pointers to access referenced entities. Use of pointers to emulate by-reference function parameters is an example. The programmer has to worry about the correct use of &s and *s. (See the section on function parameters.)

Pointer arithmetic is error prone. Pointers can be incremented past the end of the entities they reference, with subsequent updates possibly corrupting other entities. How many lurking and undetected errors are in programs because of this? This illustrates how C undermines OOP by providing a mechanism where state outside an object’s boundaries can be changed. Since pointers are intrinsic to writing software in C this exacerbates the problem. Pointers as implemented in C make the introduction of advanced concepts like garbage collection and concurrency difficult.

Another consideration is that dynamic memory implementations vary between platforms. Some environments make memory block relocation easier by having all pointers reference objects via a master pointer which contains the actual address of the block. The location of the master pointer never changes, so relocation of the block is hidden from all pointers that reference it. When the block is relocated, only the master pointer needs to be updated.

On the Macintosh, for example, the double indirection mechanism of ‘handles’ facilitates relocation of objects. Object Pascal makes these handles transparent to the programmer. This is similar to the Unisys A Series approach where object ‘descriptors’ access the target object via a master descriptor that stores the actual address of the object. On the A Series this is transparent to programmers in all languages, as this transparency is realised at a level lower than languages. The A Series descriptor mechanism also provides hardware safety checks that mean that pointers cannot overrun, and arrays cannot be indexed out of bounds. C cannot be implemented particularly well on such machines, as C’s mechanisms are lower level than the target environment.

Other environments do not provide object relocation, so double indirection is an unnecessary overhead. In order for programs to be portable and to be at their most efficient in different target environments, such system details should be the concern of the target compilation system, not of the programmer.

C’s pointer declaration syntax causes another small problem:

    int* i, j;

This does not mean, as might be easily read -

    int *i, *j;

but

    int *i, j;

and should be written thus to avoid confusion.

6.2. Arrays

Page 137 of the C++ ARM notes that C arrays are low level, yet not very general, and unsafe. Page 212 admits, “the C array concept is weak and beyond repair.” Modern software production is far less dependent on arrays than in the past, especially in the object-oriented environment. The trade off to be optimal, rather than general and safe no longer applies for most applications. C arrays provide no run-time bounds checking, not even in test versions of software. This compromises safety and undermines the semantics of an array declaration, ie an array is declared to be a particular size, and can only be indexed by values within the given bounds. An index to an array is a parameter in the domain of the array function. An index out of bounds is not a member of the domain, and should be treated as severely as divide by zero. C has no notion of dynamically allocated arrays, whose bounds are determined at run time, as in ALGOL 60. This limits the flexibility of arrays. The C definition of arrays compromises both safety and flexibility.

One view of arrays is just another object-oriented entity which should be treated in an object-oriented manner as a class of data structure. It should have interface definitions, and consistency checks inherent in object-oriented systems. Another view is that an array is an
implementation of a function, where pairs of values explicitly map the domain to the range, rather than being computed. This suggests that Algol was incorrect in distinguishing arrays by using square brackets. An array just maps the input argument (the index) to a value of the type of the array. An array can be viewed as a random access stack.

[Ince 92] considers that arrays and pointers need not be relied upon so heavily in modern software production, as higher level abstractions such as sets, sequences, etc are better suited to the problem domain. Arrays, and pointers can be provided in an object-oriented framework, and used as low level implementation techniques for the higher level data abstractions. As has already been mentioned object-oriented programming is very useful for the encapsulation of implementation and environment oriented details. Ince suggests that arrays and pointers should be regarded in the same way as gotos in the seventies. He suggests that languages such as Pascal and Modula-2 should be regarded in the same way as assembler languages in the seventies. This applies even more to C and C++, because pointers and arrays are far more intrinsic in the use of C and C++.

I agree with Ince that we have less dependence on arrays, and that pointers in programming languages can be considered harmful. But I disagree in as far as the concept of array is useful for mapping one set of values onto another, where this mapping cannot be described computationally, but can only be expressed by pairs or tuples of coordinates.

6.3. Function Parameters

Parameters are used to pass routines simple values (by-value parameters), or references to entities (by-reference parameters). Parameters are inputs to routines, and should not be changed. When memory was expensive, reusing parameter space could conserve space. Changing parameters, however, is semantic nonsense, and most languages get this wrong.

By-reference parameters enable a routine to change the value of an entity external to the routine. Such updates beyond the environment of a routine are side-effects. This introduces a mechanism of updating the state space, other than straight assignment (although the routine can use assignment to achieve the ‘dirty deed’.) The danger is that the state of an object can be changed without using the well defined interface of the object. By-reference parameters should not be used to change the external world. Values should only be passed to the external world by the return value of a function. Semantically, this is quite different to assignment to a reference parameter; data flows through the program in one direction, in via parameters, and out via return values. Mathematically this maps compositions of inputs to outputs. Abstract data types can be used to design such systems. Also this will help target environments to increase parallelism and concurrency in a way transparent to programmers.

In object-oriented programming, by-reference parameters are used to pass the original object, not a copy. The called routine, however, cannot change the state of the referenced object. Only calling a routine in the object’s interface can change the state. This has the desired effect of the object being given to you, without being yours to change, although you can effect change in the object.

C shares faulty parameters with many other languages. The interaction of C’s pointer mechanism with a faulty parameter mechanism, however, makes C considerably worse than most other languages. In C, pointers are used to simulate by-reference parameters with by-value parameters. The programmer must perform tedious bookkeeping by specifying *s and &s for referencing and dereferencing. Distinguishing between by-value and by-reference parameters is not just a syntactic nicety, included in most high level languages, but a valuable compiler technique, as the compiler can automatically generate the referencing and dereferencing, without burdening the programmer.

6.4. void *

“Passing paths that climb half way into the void” - Close to the Edge, Yes.

Is void * the C equivalent of an oxymoron? A pointer to void suggests some sort of semantic nonsense, a dangling pointer perhaps? Maybe we should tell the astronomers we have found a black hole! While we can have some fun conjecturing what some of the obscure syntax of C suggests, a serious problem is that void * declarations are used to defeat the type system, and so compromise its purpose. A well thought out type system does not require such a facility. In an object-oriented type system, the root class of the inheritance hierarchy provides the equivalent of void.

When a typed entity is assigned to a reference of void *, it loses its static type information. When it is assigned back to a typed reference the programmer must explicitly specify to the compiler the type information. This is error prone and should at least result in a run-time check, to make sure that the correct type actually is being assigned. Without type checks, the routines of one class can be mistakenly applied to objects of another class.

6.5. void fn ()

The default type that a function returns is int. A typeless routine returning nothing should be the default. Instead this must be specified by another confusing use of void. This is an example of
where C’s syntax is not well matched to the concepts and semantics. Syntactically no <type> should suggest nothing to return. Also a typed function can be invoked independently of an expression. This is a shorthand way of discarding the returned value. Values should be returned because they need to communicate with the outside world, and ignoring returned values is often dangerous. In other words, using a typed function as a void should result in a type error.

In fact there should be no such thing as a void function. A void function is a procedure. Procedures and functions should be distinguished. This distinction belongs to the problem ‘what’ domain. A procedure is a routine that changes the state of its object, but returns no value. A function should, in general, not cause any change to the state of an object, but just return some result dependent upon the object’s state. Mathematically, a function is an entity that returns a value of a given type. Procedures are untyped, and do not return a value. So it is incorrect to regard procedures as functions. Functions as will be explained below have more in common with variables than procedures. Procedures can cause side effects, functions should not cause side effects. These distinctions are useful when considering concurrency.

6.6. fn ()

Empty parentheses represent the function invocation operator in C. Even though ‘()’ is mathematical looking, it is semantically equivalent to FORTRAN’s CALL, COBOL’s PERFORM, and JSR in assembler. The design of these operators was influenced by the underlying machine architectures. The invocation operator is low level, machine and execution oriented, and in the ‘how’ domain.

This is opposite to most Unix shells, where invocation operators such as ‘run’ and ‘exec’ are not needed. The ability to execute file names as commands extends the command repertoire. The shell runs executables and interprets shell scripts. There is no distinction as far as the shell user is concerned. This is widely accepted as an elegant and effective convenience. C’s () operator introduces the equivalent of a run command into the language.

No invocation operator exists in the problem oriented domain of high level languages. This is because the semantics of a function is to return a value of a given type. How this value is computed is unimportant. The value could be computed by a routine invocation, by sending a message across a network, by forking an asynchronous process, or by retrieving a precomputed result from a memory location, ie a variable.

The distinction that languages like C make between variables and value returning routines is artificial. It could be pointed out that variables are fundamentally different to functions, as you can assign values to variables. Functions, however, are the target of assignment. The return statement accomplishes this in C. Algol has no return statement, but uses assignment to the name of the function. The assignment of a value to a variable sets the return value for subsequent invocations of that function.

It is trivial for a compiler to realise this transparency of view for variables and functions. In ALGOL style languages, the compiler automatically deduces invocation when it sees a name that was declared as a routine, rather than a variable. The compiler knows that the identifier refers to a routine. This compiler technology was not realised when FORTRAN and COBOL were developed. This compiler technology is possible, because the compiler stores much information about an entity. A compiler can check that the programmer uses the entity consistently with the declaration. A compiler can generate correct code, without burdening the programmer with having to redundantly use an invocation operator. This enhances flexibility and implementation independence.

In fact the Unisys A Series architecture elegantly achieves this level of transparency at the hardware level. The value call (VALC) operator loads a value onto the top of stack. When VALC hits a data value, that value is retrieved from memory and loaded onto the stack. When VALC hits a program control word (PCW), a routine that computes the value to be loaded onto the stack is invoked.

Variables and functions should be interchangeable for programmer optimisation. In C, it is not possible to change a function to a variable without removing all the (). This might be spread over many files, and the programmer might not bother with optimisation to avoid the tedium of the task. So the () operator reduces flexibility. Thus implementation detail is visible for the outside world to see. The () operator is another bookkeeping task imposed on the C programmer. Pure functional languages such as SML remove the variable/function distinction altogether, by not having variables at all.

The removal of the variable/function distinction would remove the need for a common use of C++’s inline functions. Inlines clutter the name space of a class and add work for the programmer. All that is required is to directly export a data member as a function.

C also has pointers to functions. Function pointers are analogous to the call by name facility in ALGOL, and this was recognised as having pitfalls. Consistent application of the object-oriented paradigm avoids these pitfalls. A common use of function pointers is to explicitly set up jump tables. The mechanism behind virtual functions is a jump table of function pointers. The design of a program can take advantage of this fact, without resorting to explicit jump tables.
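The contrast between an explicit jump table and virtual functions can be sketched in C++ as follows. This is an illustrative sketch only: the names Shape, Circle, Square, draw_table, draw_via_table and draw_via_virtual are hypothetical and do not come from this critique, and the code assumes a C++11 compiler. The first half maintains a table of function pointers by hand; the second half expresses the same dispatch with virtual functions, leaving the table (typically a vtable, although that is an implementation detail) to the compiler.

```cpp
#include <cassert>
#include <string>

// Explicit jump table: the programmer builds the table and must keep
// the enum values in step with the table order by hand.
static std::string draw_circle() { return "circle"; }
static std::string draw_square() { return "square"; }

using DrawFn = std::string (*)();
static const DrawFn draw_table[] = { draw_circle, draw_square };
enum ShapeKind { kCircle = 0, kSquare = 1 };

// Dispatch by indexing the hand-written table.
std::string draw_via_table(ShapeKind kind) { return draw_table[kind](); }

// Virtual functions: the compiler generates and indexes the table.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string draw() const = 0;
};
struct Circle : Shape {
    std::string draw() const override { return "circle"; }
};
struct Square : Shape {
    std::string draw() const override { return "square"; }
};

// Dispatch on the dynamic type of the object passed by reference.
std::string draw_via_virtual(const Shape& shape) { return shape.draw(); }
```

In the first form an out-of-range ShapeKind indexes past the end of the table, which is the kind of unchecked access criticised in the section on arrays; in the second form there is no index for the programmer to get wrong.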
Another use is to jump to a function in a table that is indexed by an input character. A switch statement can cater for this mechanism that makes what is meant explicit, while keeping underlying mechanisms (and possibly optimisations) transparent. C++ allows function pointers to member functions to be stored in tables (via the .* and ->* operators).

6.7. Metadata in Strings

The implementation of strings in C mixes metadata with data. Metadata is data about an object, but is not part of the data itself. Examples of metadata are addresses, size and type information. Such metadata is often referred to as data descriptors, and can be kept independently of the data, with the advantage that the programmer cannot mistakenly corrupt the metadata.

In C strings, metadata about where a string terminates is stored in the data as a terminating byte. This means that the distinction between data and metadata is lost. The value chosen as the terminator cannot occur in the data itself. The common alternative implementation is to store a length byte in a fixed location preceding the string. This length metadata can be hidden from the programmer who does not need to know where the length metadata is stored. This implementation also has the advantage that the length of a string can be easily obtained, without having to count the number of elements up to the terminating null.

6.8. ++, --

The increment and decrement operators are often used as an example that C was designed as a high level assembler for PDP machines. These operators provide a shorthand convenience, but are unnecessary. There are no less than three ways to perform the same thing -

    a = a + 1
    a += 1
    a++
    ++a

For full generality, only the first form is required, the others are a mere convenience. The last two forms a++ and ++a are the postfix and prefix forms. They are often used in the context of another expression. Thus several updates can be performed in one expression. This is a very powerful and convenient feature, but introduces side effects into an expression that sometimes have surprising effects, and can lead to program errors. The following example is given on p.46 of the C++ ARM -

    i = v[i++]; // the value of ‘i’ is undefined

The ARM points out that compilers should detect such cases, but the exact interpretation appears to be left to the implementation, which contributes to non-portability. If this can’t be defined for a sequential processor, then it is even worse for a concurrent environment.

The shorthand += and -= are more powerful as values other than 1 can increment the variable. It has been suggested that there should also be &&= and ||= operators.

If it is mistakenly believed that a multiplicity of operators is required to produce more optimal code, then it should be pointed out that code generators, especially for expressions, can produce the best code for a target architecture. A plethora of operators complicates the task of an optimiser. A compiler can optimise well beyond what a programmer can do. An optimising compiler will analyse the surrounding code, and if an entity is used several times in a local scope, it will keep the value of that entity handy locally at the top of a stack, or in a register, rather than retrieve it from slow main memory several times. The nature of such optimisations depends on the machine’s architecture, which a programmer should not have to be aware of. Open systems demands that programs can be ported amongst diverse architectures and environments, very different to the original machine, and not only run, but run efficiently. Optimisers work best with simple, well defined languages.

In fact constructs such as:

    while (*s1++ = *s2++);

might look optimal to C programmers, but are the antithesis of efficiency. Such constructs preclude compiler optimisation for processors with specific string handling instructions. A simple assignment is better for strings, as it will allow the compiler to generate optimal code for different target platforms. String assignment will also hide the implementation details of strings. If the target processor does not have string instructions, then the compiler should be responsible for generating the above loop code, rather than requiring the programmer to write such low level constructs. The above loop construct for string copying is also contrary to safety, as there is no check that the destination does not overflow. The above code also makes explicit the underlying C implementation of strings, that are null terminated. Such examples show why C cannot be regarded as a high level language, but rather as a high level assembler.

As with name overloading, memory storage update is a problematic, but necessary part of programming. A language should provide it in a consistent and expected way. Many languages recognise that memory update is problematic, and typically only provide limited but sufficient ways of updating, by an assignment operation. (Many languages have block memory copies as well, but assignment can also provide block copy.) Furthermore, many languages avoid side-effects
by limiting updates to only one per statement. C provides too many ways to update memory. These add nothing to the generality of the language, increase the opportunity for error, and complicate automatic optimisation. Restrictive practices are justifiable in order to accomplish correctly functioning and efficient software.

6.9. Defines

The define declaration -

    #define d(<parameters>)

has a different effect to -

    #define d (<parameters>)

The second form defines d as (<parameters>). Extra white space between tokens should not affect semantics of constructs.

#defines are poorly integrated with the language. The ‘#define’ must be in column 1, and knows nothing about scope rules. Errors in defines can lead to obscure errors, as the preprocessor does not detect them, but leaves them for the compiler. Programmers must be familiar with the particular preprocessor implementation on their system, as preprocessor implementations are different, particularly between Classic C and ANSI C.

6.10. NULL vs 0

[Ellemtel 92] recommends that pointers should not be compared to, or assigned to NULL, but to 0. Stylistically, NULL would be preferable. It would also allow for environments where null pointers have a value other than 0. ANSI-C, however, has subtle problems with the definition of NULL.

6.11. Case Distinction

It is good to adopt typographic conventions for names, but distinguishing between upper and lower case in names can cause confusion. Confusion leads to errors and systems that are difficult to maintain and modify. Case distinction is based on the implementation paradigm of how character codes work. Why do we have names? To give entities identity, and aid our memory of that identity. Philosophically, case distinction is contrary to the fundamental purpose of names.

Case distinction in interactive systems is a poor user interface. It is clumsy having to continually use the shift key, and will slow a good typist. More importantly, case distinction makes names harder to remember, and so is contrary to the purpose of aiding memory. It is difficult enough for users to remember command mnemonics or file names, let alone exactly the case. Names are used instead of difficult to remember addresses. If we did not have names, we would have to retrieve files by addresses, or call people by their social security number.

Consider the paradigm of letters and words. Words are spelt by assembling letters in order. There are 26 distinct letters. With the addition of digits 0 to 9, and the underscore character, we have a complete and correct definition for identifiers. Letters can be written in a number of styles. They can be bold, italic, upper or lower case. Such typographic representations, however, do not change the semantics of a word. Thus if we write ALGOL, Algol or algol, we recognise the word to represent a computer language. The case of the letters does not change the semantics. Letter case is only a typographic device. Typographic conventions make program text more readable, but should not affect the semantics of a program.

Case distinction is based on the low level paradigm of character codes such as ASCII used internally in the computer. This weakens the purpose of using names to replace addresses, as names are reduced to a string of character codes.

Case distinction also contributes to errors. It introduces ambiguity, and as has already been mentioned, ambiguity weakens the purpose of names, as identity is lost. As every programmer will have experienced, one character errors are more difficult to find than one would think. For example if an identifier is declared Fred, another one can be declared fred. Such names are easily mistyped and confused. We are in general poor proof-readers. The psychological reason for this is that the brain tends to straighten out errors for our perception automatically. The human brain is an excellent instrument for working out what was intended, even in the presence of radical error. (This makes us good at difficult tasks like speech recognition.) In order to overcome this, programmers must use their powers of concentration to override this natural tendency of the brain. Distinguishing upper from lower case in names only adds another level of difficulty. Good language design takes into account such psychological considerations in these small but important details, being designed towards the way humans work, not computers. Such considerations of cognitive science make a big difference to the effectiveness of people, but do not have any impact at all on the efficiency of code generated for the computer. What is more important, people or computers?

Case distinction provides another form of name overloading. Name overloading is a double-edged sword. It leads to ambiguity, confusion and error. Name overloading, as has been suggested in the section on name overloading, should only be provided in controlled and expected ways, where overloading provides a useful function such as module independence or polymorphism. Where a name is overloaded in the same scope the compiler should report an error.

As another example, a commonly used technique is -
    class obj
    {
        int Entry;

        void set_entry (int entry)
        {
            entry = Entry;
        }
    }

If you have not spotted the error in the above example, what was it supposed to mean?

6.12. Assignment Operator

Using the mathematical equality symbol for the assignment operator is a poor choice of symbols. Programming assignment is not equal to mathematical equality (:= != =). Language designers of ALGOL style languages realised they were semantically quite different, so took the care to distinguish, only using ‘=’ to mean equality in the sense of mathematical assertion. In C the lack of distinction leads to error. It is easy to use = (assignment) where == (equality) is intended, which looks reasonable, but results in errors that are difficult to detect.

This leads to a more general criticism of C, in that it has a pseudo mathematical appearance. Few people are proficient at interpreting mathematical theorems, most passing over such sections in text, making the assumption that the mathematics proves the surrounding text. The pseudo-mathematical nature of C has this bad attribute of mathematical notation. It is difficult to read, while lacking the semantic consistency and precision of mathematical notation. One of the keys of reusability is readability.

6.13. Type Casting

Type casting is just a mechanism to map values of one type onto values of another type. This means type casting is no more than a specific form of mathematical function. Type casting has been useful in computer systems. Often it is required to map one type onto another, where the bit representation of the value remains the same. Type casting is therefore a trick to optimise certain operations. Type casting provides no useful concept that general functions cannot implement. Furthermore, type casting undermines the purpose of strongly typed systems. In many languages, the type system has not been consistently defined, so programmers often feel that type casting is necessary.

Mathematically, all functions perform type casting. An example often used in programming is to cast between characters and integers. Type casts between integers and characters are easily expressed as functions using abstract data types (ADTs).

    TYPE
        CHARACTER
    FUNCTIONS
        ord: CHARACTER -> INTEGER
            // convert input character to integer
        char: INTEGER /-> CHARACTER
            // convert input integer to character
    PRECONDITION
        // check i is in range
        pre char (i: INTEGER) =
            0 <= i and i <= ord (last character)

The notation ‘->’ means every character will map to an integer. The partial function notation ‘/->’ means that not every integer will map to a character, and a precondition, given in the ‘pre char’ statement, specifies the subset of integers that maps to characters. Object-oriented syntax provides this consistently with member functions on a class:

    i : INTEGER
    ch : CHARACTER

    i := ch.ord
    // i becomes the integer value of
    // the character.
    ch := i.char
    // ch becomes the character
    // corresponding to the value i.

but a routine char would probably not be defined on the integer type so this would more likely be:

    ch.char (i);
    // set ch to the character
    // corresponding to the value i.

The hardware of many machines caters for such basic data types as character and integer, and it is entirely possible that a compiler will generate code that is optimal for any target hardware architecture. So many languages have character and integer as built in types. The object-oriented paradigm, however, can treat such basic data types consistently and elegantly, by the implicit definition of their own classes.

Another example of type conversion is from real to integer. Here though, the programmer might wish to specify the use of two type conversion functions to truncate or round.

    TYPE
        REAL
    FUNCTIONS
        truncate: REAL -> INTEGER
        round: REAL -> INTEGER

    r: REAL
    i: INTEGER

    i := r.truncate
    // i becomes the closest integer
    // <= r
    i := r.round
    // i becomes the closest integer
    // to r

Again many hardware platforms provide specific instructions to achieve this, and an efficient object-oriented language compiler will generate code best suited to the target machine. Such inbuilt class definitions might be a part of the standard language definition.

6.14. Semicolons

I am not concerned whether the semicolon is defined as a terminator or separator. Arguments that languages that define the semicolon as terminator are superior to those that define it as separator are, however, baseless. The semicolon as separator is really quite logical. It is based on viewing the semicolon as a statement sequencing or concatenation operator. It is therefore a binary operator, requiring both a left and a right hand side. Some people claim to find this concept difficult to understand, but if we consider it in the context of a mathematical expression, it would be silly to expect that an addition be written as:

    a + b +

Another way to look at a separator is to consider the structure of a program. A program is a list of elements. The executable part of a program is a list of sequentially executed instructions. Elements in a list must be separated, and the semicolon is one syntax to separate elements in a list. The semicolon is therefore part of the syntax of the list, not part of the syntax of the individual instructions. Languages such as FORTRAN separated instructions by requiring that they be placed on different lines or cards. If an instruction overflowed a line, a continuation character was required, like the backslash in C. Well defined languages do not require continuation characters, as line breaks are unimportant, and have no effect on semantics. Languages should have very regular grammars, so that the semicolon could be an entirely optional typographic separator.

In natural language both the comma and semicolon are separators, only the full stop is a terminator. If the comma was a terminator, function invocations would look like:

    fn (a, b+c, d, e,);

It is often argued that the semicolon as separator leads to irregularities. C’s handling of the grammar of semicolons, however, leads to an irregularity in if/else’s:

    if (condition)
        statement1;   /* Semicolon required */
    else
        statement2;

    if (condition)
    {
        statement1;
    }   /* Semicolon must be omitted */
    else
        statement2;

This is an irregularity, as a parser will reduce both of the above to the grammatical form:

    if (condition) statement
    else statement

(In fact why do conditions in C if and while statements have to have parentheses around them?)

7. Conclusions

C++ is overly complex. C is widely recognised as being a simple language. But even this is doubtful, as it has many operators, and a difficult precedence system. Its pointer style of programming is difficult. Overall, C has many traps that lead to difficult to detect errors in software. Object-oriented languages should provide sophisticated concepts in the simplest possible framework. Where the framework is not simple, the concepts are obscured. OOP addresses many issues in order to facilitate the production of complex and sophisticated programs. Many of these issues are addressed in implicit and subtle ways, but are lost in C++. Subtle errors can be introduced into C++ software in many ways, and furthermore, the combination of these will cause even further problems. C++ has devices for petty convenience, while sacrificing major conveniences and long-term correctness and safety. C++ forces the programmer to perform many administrative bookkeeping tasks that a compiler can easily do.

It can be asked what application domain C++ is relevant for. The answer to this is that C++ might be used as a better C. But for what applications is C relevant? C is relevant for low level Unix systems programming. It is not a generally applicable language in view of its low level nature, and its flaws. C is not applicable for large scale production. Hence C++’s attempt to improve it. C++, however, has not solved C’s flaws, as I once hoped it would, but painfully magnified them. Better languages exist for higher level functions such as communications and networks, scientific work, compilers, etc. I envisage that C has a place as a high level assembler that can be used to implement small pieces of code on suitable platforms, where
efficiency is of prime importance. Thus the use of C would be limited and well controlled, rather like small assembler routines are currently used in some systems for the same purpose. Indeed the move to C++ should only be considered in the case of upgrading a body of C programs for backwards compatibility. In the case of new projects alternatives to C and C++ should seriously be considered.

Programming is the orchestration of change within a large state space. Object-oriented techniques provide a method of simple division and management of such state spaces. Managing such state spaces requires the simplest techniques, in order to guard against detectable inconsistencies that lead to errors in executable systems. C and C++ do not implement the simple management of a large state space, and allow many potential errors to go undetected. The role of a language as a tool cannot seriously be regarded as some authoritarian that stops us doing what we want or need to do, as many languages with type safety and consistency checks are often viewed. Programming languages should embody the collective wisdom of common sense practices that have been learnt over many years, by common and painful experience. C++ lacks the implementation of much of this wisdom.

[Sakkinen 92] observes that much of the C++ literature has few references to external work or research. It fails to draw on the insights and progress made by many researchers. This leads me to believe that C++ is parochial and removed from the many advances that will make production of systems easier and more cost effective.

It is better to detect and avoid errors than to fix them. The fixing of errors happens many times during the development process. This slows down the development process, and is therefore costly. Good programmers in this context (often called ‘gurus’), are those who recognise symptoms, and recommend fixes. Good programmers in the better sense (often called ‘impractical idealistic dreamers’) adopt better practices (programming languages being a subset of these), that avoid error in the first place.

C encourages gurus who spout false wisdom on obscure subjects. Writing programs in C is often called ‘coding’. Coding is writing obscure encryptions that will later have to be decoded, by none other than a guru! C also encourages programming by guesswork. C programmers often solve ‘bugs’ by adding extra ()s, *s and &s, without understanding the problem. People who attain proficiency at this guesswork are known as, well you guessed it, gurus!!

The view that correctness checks are training wheels for students, which gurus don’t need, must be dispelled. Many disciplines have techniques to ensure correctness. For example, the metronome in music is not just for students, but will help an advanced musician ensure that the tempo of a piece is correct, and since playing to a metronome is more difficult, will help sharpen the musician’s performance of the piece. The musician does not just view the metronome as an aid for beginners, or as something that restricts him to a set beat, but as a tool that helps produce a polished and professional performance. C should not be seen as a language to which you graduate after you have learnt to program in languages with safety checks. In fact changing to C or C++ is a great step backwards. Languages with consistency and semantic checks are essential aids to the production of professional software.

This paper has shown many cases where C++ uses old C mechanisms to provide things that can and should be expressed consistently within the object-oriented paradigm. For example type casting. The move to pure object-oriented languages will facilitate more consistent programming and avoid many typical errors that occur in software production. C++ also makes distinctions that belong in the ‘how’ implementation domain. For example, ‘.’ vs ‘->’, and variables and functions. These make bookkeeping work for programmers, which should be handled by a compiler. But then C++ fails to make distinctions that belong in the ‘what’ problem domain. For example, procedures vs functions. Making distinctions in the ‘how’ domain adds inconvenience to the language. Failing to make distinctions in the ‘what’ domain limits the power and expressiveness of the language. The amount of change required in C++ to address the issues raised in this paper is seen as largely insurmountable.

A programming language is just a tool, in the same way that an axe is a tool. If the axe is blunt when chopping down a tree, then procedures, processes and methodologies could be invented to make it as effective as possible. But that leaves the real problem unsolved; that the axe that does the real work is blunt. So it is with programming languages. To develop a system, it must be implemented, and a programming language is the tool to do the real work. If the language is blunt, then procedures, processes and methodologies might alleviate the situation, but they do not solve the problem. Once the axe is sharpened, then real progress is made, and the procedures, processes and methodologies also become more effective. A good axeman will have good axe wielding technique, but given a choice of axes will choose the sharpest implement. A poor axeman could be ineffective with even a sharp axe, but the axe maker will still strive to produce the sharpest axe for the good axeman. The argument that poor programmers will produce bad programs in any language so we shouldn’t bother with better languages is fallacious.

As mentioned in the introduction, both sides of the analysis/design vs implementation debate
need to compromise in order to bridge the semantic gap. The perpetuation of low level languages such as C into OOP is proof that the programming community is not willing to compromise, or sharpen its axe enough in order to bridge this costly gap.

The critique began with certain questions, and as no work can be absolute (particularly a programming language), it will end with more questions that it is hoped will create more debate, and more questioning into what we are really trying to achieve with program development.

Does C++ provide effective communication between programmers separated in both space and time? Does C++ provide communication between the levels of analysis, design, implementation and maintenance?

Are the compromises made by C and C++ still relevant to today’s environments, and the environments of the not very near future?

Could C++ be regarded as the PL/1 of the object-oriented world, as PL/1 was the marriage of FORTRAN and structured ALGOL concepts, and C++ is the marriage of C with object-oriented concepts?

Are the compromises made for the restricted machines and environments of 20 years ago still appropriate for today? Are languages based on 20 year old compromises appropriate in modern software development environments?

Should new software developments be forced to accept such compromises?

Is C++ patching old material with new cloth, or pouring new wine into old wineskins?

What are we really trying to achieve in programming anyway?

Ian Joyner
November 1992

8. Bibliography

[C++ ARM] Ellis and Stroustrup “The Annotated C++ Reference Manual” AT&T 1990.
[Capretz 87] Pierre J. Capretz “French in Action, A Beginning Course in Language and Culture” Yale University Press.
[Cline] Marshall Cline “C++ Frequently Asked Questions” comp.lang.c++ newsgroup.
[Crystal 87] David Crystal “The Cambridge Encyclopedia of Language” Cambridge University Press.
[DDH 72] Dahl, Dijkstra, Hoare “Structured Programming”
[Dijkstra 76] E.W. Dijkstra “A Discipline of Programming” Prentice Hall.
[Ellemtel 92] “Programming in C++: Rules and Recommendations” Ellemtel Telecommunication Systems Laboratories, Sweden.
[Ince 92] D.C. Ince “Arrays and Pointers Considered Harmful”, ACM SIGPLAN Notices, January 1992.
[Mody 91] R.P. Mody “C in Education and Software Engineering” ACM SIGCSE Bulletin Vol. 23 No. 3 September 1991.
[Reade 89] Chris Reade “Elements of Functional Programming” Addison-Wesley, 1989.
[RBPEL91] Rumbaugh, Blaha, Premerlani, Eddy, Lorensen “Object-Oriented modelling and Design”. Prentice-Hall, 1991.
[S & W 79] William Strunk and E.B. White “The Elements of Style”, MacMillan Publishing, 1979.
[Sakkinen 92] Markku Sakkinen “Inheritance and Other Main Principles of C++ and Other Object-oriented Languages”. University of Jyväskylä, 1992. (Also published as selected papers in ECOOP ‘88, Computing Systems Vol. 5 No. 1, and Structured Programming Vol. 13 (1992).)
[SJE91] Saake, Jungclaus, Ehrich “Object-Oriented Specification and Stepwise Refinement” in IFIP Workshop on Open Distributed Processing Berlin, 1991.
[Weg91] Peter Wegner “Concepts and Paradigms of Object-Oriented Programming” ACM SIGPLAN OOPS Messenger Volume 1 no. 1 August 1990.
[X3J16 92] Members of the X3J16 working group on extensions “How to write a C++ Language Extension Proposal for ANSI-X3J16/ISO-WG21” ACM SIGPLAN Notices Vol. 27 No. 6 June 1992.
[Yoshida 92] Koichiro Yoshida Title and book in Japanese.