· 7 years ago · Feb 19, 2019, 12:14 PM
Part II
Creating High-Quality Code
In this part:
Chapter 5: Design in Construction
Chapter 6: Working Classes
Chapter 7: High-Quality Routines
Chapter 8: Defensive Programming
Chapter 9: The Pseudocode Programming Process
Chapter 5
Design in Construction
cc2e.com/0578
Contents
■ 5.1 Design Challenges
■ 5.2 Key Design Concepts
■ 5.3 Design Building Blocks: Heuristics
■ 5.4 Design Practices
■ 5.5 Comments on Popular Methodologies
Related Topics
■ Software architecture: Section 3.5
■ Working classes: Chapter 6
■ Characteristics of high-quality routines: Chapter 7
■ Defensive programming: Chapter 8
■ Refactoring: Chapter 24
■ How program size affects construction: Chapter 27
Some people might argue that design isn't really a construction activity, but on small
projects, many activities are thought of as construction, often including design. On
some larger projects, a formal architecture might address only the system-level issues
and much design work might intentionally be left for construction. On other large
projects, the design might be intended to be detailed enough for coding to be fairly
mechanical, but design is rarely that complete; the programmer usually designs part
of the program, officially or otherwise.
Cross-Reference: For details on the different levels of formality required on large
and small projects, see Chapter 27, "How Program Size Affects Construction."
On small, informal projects, a lot of design is done while the programmer sits at the
keyboard. "Design" might be just writing a class interface in pseudocode before writing
the details. It might be drawing diagrams of a few class relationships before coding
them. It might be asking another programmer which design pattern seems like a better
choice. Regardless of how it's done, small projects benefit from careful design just
as larger projects do, and recognizing design as an explicit activity maximizes the benefit
you will receive from it.
Design is a huge topic, so only a few aspects of it are considered in this chapter. A large
part of good class or routine design is determined by the system architecture, so be
sure that the architecture prerequisite discussed in Section 3.5 has been satisfied.
Even more design work is done at the level of individual classes and routines,
described in Chapter 6, "Working Classes," and Chapter 7, "High-Quality Routines."
If you're already familiar with software design topics, you might want to just hit the
highlights in the sections about design challenges in Section 5.1 and key heuristics in
Section 5.3.
5.1 Design Challenges
Cross-Reference: The difference between heuristic and deterministic processes is
described in Chapter 2, "Metaphors for a Richer Understanding of Software
Development."
The phrase "software design" means the conception, invention, or contrivance of a
scheme for turning a specification for computer software into operational software.
Design is the activity that links requirements to coding and debugging. A good top-level
design provides a structure that can safely contain multiple lower-level designs.
Good design is useful on small projects and indispensable on large projects.
Design is also marked by numerous challenges, which are outlined in this section.
Design Is a Wicked Problem
The picture of the software designer deriving his design in a rational, error-free way
from a statement of requirements is quite unrealistic. No system has ever been
developed in that way, and probably none ever will. Even the small program
developments shown in textbooks and papers are unreal. They have been revised and
polished until the author has shown us what he wishes he had done, not what actually
did happen.
-- David Parnas and Paul Clements
Horst Rittel and Melvin Webber defined a "wicked" problem as one that could be
clearly defined only by solving it, or by solving part of it (1973). This paradox implies,
essentially, that you have to "solve" the problem once in order to clearly define it and
then solve it again to create a solution that works. This process has been motherhood
and apple pie in software development for decades (Peters and Tripp 1976).
In my part of the world, a dramatic example of such a wicked problem was the design
of the original Tacoma Narrows bridge. At the time the bridge was built, the main
consideration in designing a bridge was that it be strong enough to support its planned
load. In the case of the Tacoma Narrows bridge, wind created an unexpected,
side-to-side harmonic ripple. One blustery day in 1940, the ripple grew uncontrollably
until the bridge collapsed, as shown in Figure 5-1.
This is a good example of a wicked problem because, until the bridge collapsed, its
engineers didn't know that aerodynamics needed to be considered to such an extent.
Only by building the bridge (solving the problem) could they learn about the additional
consideration in the problem that allowed them to build another bridge that
still stands.
Figure 5-1 The Tacoma Narrows bridge: an example of a wicked problem.
One of the main differences between programs you develop in school and those you
develop as a professional is that the design problems solved by school programs are
rarely, if ever, wicked. Programming assignments in school are devised to move you in a
beeline from beginning to end. You'd probably want to tar and feather a teacher who gave
you a programming assignment, then changed the assignment as soon as you finished
the design, and then changed it again just as you were about to turn in the completed
program. But that very process is an everyday reality in professional programming.
Design Is a Sloppy Process (Even If It Produces a Tidy Result)
The finished software design should look well organized and clean, but the process
used to develop the design isn't nearly as tidy as the end result.
Further Reading: For a fuller exploration of this viewpoint, see "A Rational Design
Process: How and Why to Fake It" (Parnas and Clements 1986).
Design is sloppy because you take many false steps and go down many blind alleys;
you make a lot of mistakes. Indeed, making mistakes is the point of design: it's
cheaper to make mistakes and correct designs than it would be to make the same
mistakes, recognize them after coding, and have to correct full-blown code. Design is
sloppy because a good solution is often only subtly different from a poor one.
Photo credit for Figure 5-1: Morning News Tribune
Cross-Reference: For a better answer to this question, see "How Much Design Is
Enough?" in Section 5.4 later in this chapter.
Design is also sloppy because it's hard to know when your design is "good enough."
How much detail is enough? How much design should be done with a formal design
notation, and how much should be left to be done at the keyboard? When are you
done? Since design is open-ended, the most common answer to that question is
"When you're out of time."
Design Is About Tradeoffs and Priorities
In an ideal world, every system could run instantly, consume zero storage space, use
zero network bandwidth, never contain any errors, and cost nothing to build. In the real
world, a key part of the designer's job is to weigh competing design characteristics and
strike a balance among those characteristics. If a fast response rate is more important
than minimizing development time, a designer will choose one design. If minimizing
development time is more important, a good designer will craft a different design.
Design Involves Restrictions
The point of design is partly to create possibilities and partly to restrict possibilities. If
people had infinite time, resources, and space to build physical structures, you would
see incredible sprawling buildings with one room for each shoe and hundreds of rooms.
This is how software can turn out without deliberately imposed restrictions. The
constraints of limited resources for constructing buildings force simplifications of the
solution that ultimately improve the solution. The goal in software design is the same.
Design Is Nondeterministic
If you send three people away to design the same program, they can easily return with
three vastly different designs, each of which could be perfectly acceptable. There
might be more than one way to skin a cat, but there are usually dozens of ways to
design a computer program.
Design Is a Heuristic Process
Because design is nondeterministic, design techniques tend to be heuristics ("rules of
thumb" or "things to try that sometimes work") rather than repeatable processes that
are guaranteed to produce predictable results. Design involves trial and error. A
design tool or technique that worked well on one job or on one aspect of a job might
not work as well on the next project. No tool is right for everything.
Design Is Emergent
cc2e.com/0539
A tidy way of summarizing these attributes of design is to say that design is
"emergent." Designs don't spring fully formed directly from someone's brain. They
evolve and improve through design reviews, informal discussions, experience writing
the code itself, and experience revising the code.
Further Reading: Software isn't the only kind of structure that changes over time.
Physical structures evolve, too; see How Buildings Learn (Brand 1995).
Virtually all systems undergo some degree of design changes during their initial
development, and then they typically change to a greater extent as they're extended into
later versions. The degree to which change is beneficial or acceptable depends on the
nature of the software being built.
5.2 Key Design Concepts
Good design depends on understanding a handful of key concepts. This section
discusses the role of complexity, desirable characteristics of designs, and levels of design.
Software's Primary Technical Imperative: Managing Complexity
Cross-Reference: For discussion of the way complexity affects programming issues
other than design, see Section 34.1, "Conquer Complexity."
To understand the importance of managing complexity, it's useful to refer to Fred
Brooks's landmark paper, "No Silver Bullet: Essence and Accidents of Software
Engineering" (1987).
Accidental and Essential Difficulties
Brooks argues that software development is made difficult because of two different
classes of problems: the essential and the accidental. In referring to these two terms,
Brooks draws on a philosophical tradition going back to Aristotle. In philosophy, the
essential properties are the properties that a thing must have in order to be that thing.
A car must have an engine, wheels, and doors to be a car. If it doesn't have any of those
essential properties, it isn't really a car.
Accidental properties are the properties a thing just happens to have, properties that
don't really bear on whether the thing is what it is. A car could have a V8, a turbocharged
4-cylinder, or some other kind of engine and be a car regardless of that detail.
A car could have two doors or four; it could have skinny wheels or mag wheels. All
those details are accidental properties. You could also think of accidental properties
as incidental, discretionary, optional, and happenstance.
Cross-Reference: Accidental difficulties are more prominent in early-wave development
than in late-wave development. For details, see Section 4.3, "Your Location on the
Technology Wave."
Brooks observes that the major accidental difficulties in software were addressed long
ago. For example, accidental difficulties related to clumsy language syntaxes were
largely eliminated in the evolution from assembly language to third-generation
languages and have declined in significance incrementally since then. Accidental
difficulties related to noninteractive computers were resolved when time-share operating
systems replaced batch-mode systems. Integrated programming environments further
eliminated inefficiencies in programming work arising from tools that worked
poorly together.
Brooks argues that progress on software's remaining essential difficulties is bound to
be slower. The reason is that, at its essence, software development consists of working
out all the details of a highly intricate, interlocking set of concepts. The essential
difficulties arise from the necessity of interfacing with the complex, disorderly real
world; accurately and completely identifying the dependencies and exception cases;
designing solutions that can't be just approximately correct but that must be exactly
correct; and so on. Even if we could invent a programming language that used the
same terminology as the real-world problem we're trying to solve, programming
would still be difficult because of the challenge in determining precisely how the real
world works. As software addresses ever-larger real-world problems, the interactions
among the real-world entities become increasingly intricate, and that in turn increases
the essential difficulty of the software solutions.
The root of all these essential difficulties is complexity, both accidental and essential.
Importance of Managing Complexity
There are two ways of constructing a software design: one way is to make it so simple
that there are obviously no deficiencies, and the other is to make it so complicated
that there are no obvious deficiencies.
-- C. A. R. Hoare
When software-project surveys report causes of project failure, they rarely identify
technical reasons as the primary causes of project failure. Projects fail most often
because of poor requirements, poor planning, or poor management. But when
projects do fail for reasons that are primarily technical, the reason is often
uncontrolled complexity. The software is allowed to grow so complex that no one really
knows what it does. When a project reaches the point at which no one completely
understands the impact that code changes in one area will have on other areas,
progress grinds to a halt.
Managing complexity is the most important technical topic in software development.
In my view, it's so important that Software's Primary Technical Imperative has to be
managing complexity.
Complexity is not a new feature of software development. Computing pioneer Edsger
Dijkstra pointed out that computing is the only profession in which a single mind is
obliged to span the distance from a bit to a few hundred megabytes, a ratio of 1 to 10^9,
or nine orders of magnitude (Dijkstra 1989). This gigantic ratio is staggering. Dijkstra
put it this way: "Compared to that number of semantic levels, the average mathematical
theory is almost flat. By evoking the need for deep conceptual hierarchies, the
automatic computer confronts us with a radically new intellectual challenge that has
no precedent in our history." Of course software has become even more complex
since 1989, and Dijkstra's ratio of 1 to 10^9 could easily be more like 1 to 10^15 today.
One symptom that you have bogged down in complexity overload is when you find
yourself doggedly applying a method that is clearly irrelevant, at least to any outside
observer. It is like the mechanically inept person whose car breaks down - so he puts
water in the battery and empties the ashtrays.
-- P. J. Plauger
Dijkstra pointed out that no one's skull is really big enough to contain a modern
computer program (Dijkstra 1972), which means that we as software developers
shouldn't try to cram whole programs into our skulls at once; we should try to organize
our programs in such a way that we can safely focus on one part at a time.
The goal is to minimize the amount of a program you have to think about at any one
time. You might think of this as mental juggling: the more mental balls the program
requires you to keep in the air at once, the more likely you'll drop one of the balls,
leading to a design or coding error.
At the software-architecture level, the complexity of a problem is reduced by dividing
the system into subsystems. Humans have an easier time comprehending several simple
pieces of information than one complicated piece. The goal of all software-design
techniques is to break a complicated problem into simple pieces. The more independent
the subsystems are, the more you make it safe to focus on one bit of complexity
at a time. Carefully defined objects separate concerns so that you can focus on one
thing at a time. Packages provide the same benefit at a higher level of aggregation.
Keeping routines short helps reduce your mental workload. Writing programs in
terms of the problem domain, rather than in terms of low-level implementation
details, and working at the highest level of abstraction reduce the load on your brain.
The bottom line is that programmers who compensate for inherent human limitations
write code that's easier for themselves and others to understand and that has
fewer errors.
How to Attack Complexity
Overly costly, ineffective designs arise from three sources:
■ A complex solution to a simple problem
■ A simple, incorrect solution to a complex problem
■ An inappropriate, complex solution to a complex problem
As Dijkstra pointed out, modern software is inherently complex, and no matter how
hard you try, you'll eventually bump into some level of complexity that's inherent in the
real-world problem itself. This suggests a two-prong approach to managing complexity:
■ Minimize the amount of essential complexity that anyone's brain has to deal
with at any one time.
■ Keep accidental complexity from needlessly proliferating.
Once you understand that all other technical goals in software are secondary to
managing complexity, many design considerations become straightforward.
Desirable Characteristics of a Design
When I am working on a problem I never think about beauty. I think only how to
solve the problem. But when I have finished, if the solution is not beautiful, I know it
is wrong.
-- R. Buckminster Fuller
A high-quality design has several general characteristics. If you could achieve all these
goals, your design would be very good indeed. Some goals contradict other goals, but
that's the challenge of design: creating a good set of tradeoffs from competing
objectives. Some characteristics of design quality are also characteristics of a good
program: reliability, performance, and so on. Others are internal characteristics of
the design.
Cross-Reference: These characteristics are related to general software-quality
attributes. For details on general attributes, see Section 20.1, "Characteristics of
Software Quality."
Here's a list of internal design characteristics:
Minimal complexity: The primary goal of design should be to minimize complexity
for all the reasons just described. Avoid making "clever" designs. Clever designs are
usually hard to understand. Instead make "simple" and "easy-to-understand" designs.
If your design doesn't let you safely ignore most other parts of the program when
you're immersed in one specific part, the design isn't doing its job.
Ease of maintenance: Ease of maintenance means designing for the maintenance
programmer. Continually imagine the questions a maintenance programmer would
ask about the code you're writing. Think of the maintenance programmer as your
audience, and then design the system to be self-explanatory.
Loose coupling: Loose coupling means designing so that you hold connections
among different parts of a program to a minimum. Use the principles of good
abstractions in class interfaces, encapsulation, and information hiding to design classes
with as few interconnections as possible. Minimal connectedness minimizes work during
integration, testing, and maintenance.
Extensibility: Extensibility means that you can enhance a system without causing
violence to the underlying structure. You can change a piece of a system without
affecting other pieces. The most likely changes cause the system the least trauma.
Reusability: Reusability means designing the system so that you can reuse pieces of
it in other systems.
High fan-in: High fan-in refers to having a high number of classes that use a given
class. High fan-in implies that a system has been designed to make good use of utility
classes at the lower levels in the system.
Low-to-medium fan-out: Low-to-medium fan-out means having a given class use a
low-to-medium number of other classes. High fan-out (more than about seven)
indicates that a class uses a large number of other classes and may therefore be overly
complex. Researchers have found that the principle of low fan-out is beneficial
whether you're considering the number of routines called from within a routine or the
number of classes used within a class (Card and Glass 1990; Basili, Briand, and Melo
1996).
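As a rough illustration of the two measures, the following Python sketch counts fan-in and fan-out from a map of which classes use which. The class names and the `uses` map are hypothetical, and the limit of seven simply restates the rule of thumb above; it is not taken from any particular tool.

```python
# Hypothetical dependency map: each class -> the set of classes it uses.
uses = {
    "ReportEngine": {"Logger", "DateUtils", "Database"},
    "UserInterface": {"Logger", "DateUtils"},
    "Database": {"Logger"},
    "Logger": set(),
    "DateUtils": set(),
}

HIGH_FAN_OUT = 7  # "more than about seven" from the text; an assumed cutoff


def fan_out(uses):
    """Number of other classes that each class uses."""
    return {cls: len(deps) for cls, deps in uses.items()}


def fan_in(uses):
    """Number of classes that use each class."""
    counts = {cls: 0 for cls in uses}
    for deps in uses.values():
        for dep in deps:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


def overly_connected(uses, limit=HIGH_FAN_OUT):
    """Classes whose fan-out exceeds the limit, in alphabetical order."""
    return sorted(cls for cls, n in fan_out(uses).items() if n > limit)
```

In this made-up system, `Logger` has the highest fan-in (three classes use it), which is what you would hope for in a low-level utility class, and no class exceeds the fan-out limit.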
Portability: Portability means designing the system so that you can easily move it to
another environment.
Leanness: Leanness means designing the system so that it has no extra parts (Wirth
1995, McConnell 1997). Voltaire said that a book is finished not when nothing more
can be added but when nothing more can be taken away. In software, this is especially
true because extra code has to be developed, reviewed, tested, and considered when
the other code is modified. Future versions of the software must remain
backward-compatible with the extra code. The fatal question is "It's easy, so what will
we hurt by putting it in?"
Stratification: Stratification means trying to keep the levels of decomposition
stratified so that you can view the system at any single level and get a consistent view.
Design the system so that you can view it at one level without dipping into other levels.
381Design the system so that you can view it at one level without dipping into other levels.
382Cross-Reference For more
383on working with old systems,
384see Section 24.5, “Refactoring
385Strategies.â€
For example, if you're writing a modern system that has to use a lot of older, poorly
designed code, write a layer of the new system that's responsible for interfacing with
the old code. Design the layer so that it hides the poor quality of the old code,
presenting a consistent set of services to the newer layers. Then have the rest of the
system use those classes rather than the old code. The beneficial effects of stratified
design in such a case are (1) it compartmentalizes the messiness of the bad code and
(2) if you're ever allowed to jettison the old code or refactor it, you won't need to
modify any new code except the interface layer.
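One way to picture such an interface layer is the short Python sketch below. Everything in it is invented for illustration: `LEGACY_getcust` stands in for an awkward old routine, and `LegacyCustomerGateway` is the only new class that is allowed to know about it.

```python
from dataclasses import dataclass


def LEGACY_getcust(id_str, mode, flag2):
    """Hypothetical old routine: cryptic name, magic flags, and a status
    code mixed into the return value along with a packed record string."""
    if mode != 1:
        return (-1, None)
    return (0, "SMITH;JOHN;" + id_str)


@dataclass
class Customer:
    customer_id: int
    first_name: str
    last_name: str


class CustomerNotFound(Exception):
    pass


class LegacyCustomerGateway:
    """Interface layer: hides the legacy calling convention from new code."""

    def find_customer(self, customer_id):
        status, record = LEGACY_getcust(str(customer_id), 1, 0)
        if status != 0:
            raise CustomerNotFound(customer_id)
        last, first, _ = record.split(";")
        return Customer(customer_id, first.title(), last.title())


customer = LegacyCustomerGateway().find_customer(42)
```

The rest of the new system would work only with `Customer` objects and `CustomerNotFound`. If the old routine were later replaced, only the gateway class would need to change.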
Cross-Reference: An especially valuable kind of standardization is the use of design
patterns, which are discussed in "Look for Common Design Patterns" in Section 5.3.
Standard techniques: The more a system relies on exotic pieces, the more intimidating
it will be for someone trying to understand it the first time. Try to give the whole
system a familiar feeling by using standardized, common approaches.
Levels of Design
Design is needed at several different levels of detail in a software system. Some design
techniques apply at all levels, and some apply at only one or two. Figure 5-2 illustrates
the levels.
Figure 5-2 The levels of design in a program. The system (1) is first organized into
subsystems (2). The subsystems are further divided into classes (3), and the classes are
divided into routines and data (4). The inside of each routine is also designed (5).
Level 1: Software System
In other words - and this is the rock-solid principle on which the whole of the
Corporation's Galaxywide success is founded - their fundamental design flaws are
completely hidden by their superficial design flaws.
-- Douglas Adams
The first level is the entire system. Some programmers jump right from the system
level into designing classes, but it's usually beneficial to think through higher-level
combinations of classes, such as subsystems or packages.
Level 2: Division into Subsystems or Packages
The main product of design at this level is the identification of all major subsystems. The
subsystems can be big: database, user interface, business rules, command interpreter,
report engine, and so on. The major design activity at this level is deciding how to
partition the program into major subsystems and defining how each subsystem is allowed
to use each other subsystem. Division at this level is typically needed on any project that
takes longer than a few weeks. Within each subsystem, different methods of design
might be used, choosing the approach that best fits each part of the system. In
Figure 5-2, design at this level is marked with a 2.
Of particular importance at this level are the rules about how the various subsystems
can communicate. If all subsystems can communicate with all other subsystems, you
lose the benefit of separating them at all. Make each subsystem meaningful by
restricting communications.
Suppose for example that you define a system with six subsystems, as shown in
Figure 5-3. When there are no rules, the second law of thermodynamics will come into
play and the entropy of the system will increase. One way in which entropy increases
is that, without any restrictions on communications among subsystems, communication
will occur in an unrestricted way, as in Figure 5-4.
Figure 5-3 An example of a system with six subsystems.
Figure 5-4 An example of what happens with no restrictions on intersubsystem
communications.
[Both figures show the same six subsystems: User Interface, Graphics, Data Storage,
Application Level Classes, Business Rules, and Enterprise-Level Tools.]
As you can see, every subsystem ends up communicating directly with every other
subsystem, which raises some important questions:
■ How many different parts of the system does a developer need to understand at
least a little bit to change something in the graphics subsystem?
■ What happens when you try to use the business rules in another system?
■ What happens when you want to put a new user interface on the system, perhaps
a command-line UI for test purposes?
■ What happens when you want to put data storage on a remote machine?
You might think of the lines between subsystems as being hoses with water running
through them. If you want to reach in and pull out a subsystem, that subsystem is
going to have some hoses attached to it. The more hoses you have to disconnect and
reconnect, the more wet you're going to get. You want to architect your system so that
if you pull out a subsystem to use elsewhere, you won't have many hoses to reconnect
and those hoses will reconnect easily.
With forethought, all of these issues can be addressed with little extra work. Allow
communication between subsystems only on a "need to know" basis, and it had better
be a good reason. If in doubt, it's easier to restrict communication early and relax it
later than it is to relax it early and then try to tighten it up after you've coded several
hundred intersubsystem calls. Figure 5-5 shows how a few communication guidelines
could change the system depicted in Figure 5-4.
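To make the idea of communication rules concrete, here is a small Python sketch. It uses the six subsystem names from the figures, but the `ALLOWED_CALLS` table is a hypothetical rule set made up for this example and is not read off Figure 5-5. The sketch counts how many subsystem pairs can end up connected when there are no rules, and it flags observed calls that break the assumed rules.

```python
from itertools import combinations

# The six subsystems named in the figures.
SUBSYSTEMS = [
    "User Interface", "Graphics", "Data Storage",
    "Application Level Classes", "Business Rules", "Enterprise-Level Tools",
]

# Hypothetical "need to know" rules: caller -> subsystems it may call.
ALLOWED_CALLS = {
    "User Interface": {"Graphics", "Application Level Classes"},
    "Graphics": {"Enterprise-Level Tools"},
    "Application Level Classes": {
        "Business Rules", "Data Storage", "Enterprise-Level Tools",
    },
    "Business Rules": {"Data Storage", "Enterprise-Level Tools"},
    "Data Storage": {"Enterprise-Level Tools"},
    "Enterprise-Level Tools": set(),
}


def unrestricted_pairs(subsystems):
    """Every pair of subsystems that may end up talking with no rules."""
    return list(combinations(subsystems, 2))


def violations(observed_calls, allowed=ALLOWED_CALLS):
    """Return the (caller, callee) pairs that break the rules."""
    return [(a, b) for a, b in observed_calls if b not in allowed.get(a, set())]


# Hypothetical calls found in the code base.
observed = [
    ("User Interface", "Graphics"),
    ("Business Rules", "Data Storage"),
    ("Data Storage", "User Interface"),  # a lower layer reaching upward
]

print(len(unrestricted_pairs(SUBSYSTEMS)))  # 15 possible connections
print(violations(observed))  # [('Data Storage', 'User Interface')]
```

With six subsystems there are 15 possible pairwise connections. The assumed rule table permits only 9 directed calls, and a check like `violations` could run as part of a build to catch new intersubsystem calls before several hundred of them accumulate.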
Figure 5-5 With a few communication rules, you can simplify subsystem interactions
significantly. [The figure shows the same six subsystems as Figures 5-3 and 5-4.]
To keep the connections easy to understand and maintain, err on the side of simple
intersubsystem relations. The simplest relationship is to have one subsystem call
routines in another. A more involved relationship is to have one subsystem contain
classes from another. The most involved relationship is to have classes in one
subsystem inherit from classes in another.
A good general rule is that a system-level diagram like Figure 5-5 should be an acyclic
graph. In other words, a program shouldn't contain any circular relationships in
which Class A uses Class B, Class B uses Class C, and Class C uses Class A.
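The acyclic rule can be checked mechanically. The sketch below, in Python, runs a depth-first search over a "uses" map and returns the first circular relationship it finds. The function name `find_cycle` and the sample maps are assumptions made for this example.

```python
def find_cycle(uses):
    """Return one dependency cycle as a list of names, or None if acyclic.

    `uses` maps each class (or subsystem) to the names it uses.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in uses}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for dep in uses.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: we closed a loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(uses):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# The circular relationship described in the text.
circular = {"A": ["B"], "B": ["C"], "C": ["A"]}
# The same classes with the C-to-A dependency removed.
layered = {"A": ["B", "C"], "B": ["C"], "C": []}

print(find_cycle(circular))  # ['A', 'B', 'C', 'A']
print(find_cycle(layered))   # None
```

The recursive search is adequate for diagrams of the size discussed here. A very deep dependency chain would need an iterative version to stay within Python's recursion limit.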
On large programs and families of programs, design at the subsystem level makes a
difference. If you believe that your program is small enough to skip subsystem-level
design, at least make the decision to skip that level of design a conscious one.
Common Subsystems: Some kinds of subsystems appear again and again in different
systems. Here are some of the usual suspects.
Cross-Reference: For more on simplifying business logic by expressing it in tables, see
Chapter 18, "Table-Driven Methods."
Business rules: Business rules are the laws, regulations, policies, and procedures
that you encode into a computer system. If you're writing a payroll system, you
might encode rules from the IRS about the number of allowable withholdings and
the estimated tax rate. Additional rules for a payroll system might come from a
union contract specifying overtime rates, vacation and holiday pay, and so on. If
you're writing a program to quote automobile insurance rates, rules might come
from government regulations on required liability coverages, actuarial rate tables, or
underwriting restrictions.
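As a small illustration of keeping a business rule in one place, the Python sketch below stores overtime multipliers in a table and computes weekly pay from it. The tier boundaries and multipliers are invented for the example; they do not come from any real contract or regulation.

```python
from decimal import Decimal

# Hypothetical overtime tiers: (hours at which the tier starts, pay multiplier).
# Tiers must be sorted by their starting hour.
PAY_TIERS = [
    (Decimal("0"), Decimal("1.0")),
    (Decimal("40"), Decimal("1.5")),
    (Decimal("60"), Decimal("2.0")),
]


def weekly_pay(hours, hourly_rate, tiers=PAY_TIERS):
    """Compute pay by applying each tier's multiplier to the hours in it."""
    hours = Decimal(str(hours))
    hourly_rate = Decimal(str(hourly_rate))
    total = Decimal("0")
    for i, (start, multiplier) in enumerate(tiers):
        if hours <= start:
            break
        # A tier ends where the next one starts; the last tier is open-ended.
        end = tiers[i + 1][0] if i + 1 < len(tiers) else hours
        worked = min(hours, end) - start
        total += worked * hourly_rate * multiplier
    return total


print(weekly_pay(45, 20))  # 40 regular hours plus 5 overtime hours
```

If the contract changed, only the `PAY_TIERS` table would need editing; the code that uses the rule would stay the same.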
User interface: Create a subsystem to isolate user-interface components so that the
user interface can evolve without damaging the rest of the program. In most cases, a
user-interface subsystem uses several subordinate subsystems or classes for the GUI
interface, command line interface, menu operations, window management, help
system, and so forth.
Database access: You can hide the implementation details of accessing a database so
that most of the program doesn't need to worry about the messy details of
manipulating low-level structures and can deal with the data in terms of how it's used at
the business-problem level. Subsystems that hide implementation details provide a
valuable level of abstraction that reduces a program's complexity. They centralize
database operations in one place and reduce the chance of errors in working with the
data. They make it easy to change the database design structure without changing most
of the program.
540System dependencies Package operating-system dependencies into a subsystem for
541the same reason you package hardware dependencies. If you’re developing a program
542for Microsoft Windows, for example, why limit yourself to the Windows environment?
543Isolate the Windows calls in a Windows-interface subsystem. If you later
544want to move your program to Mac OS or Linux, all you’ll have to change is the
545interface subsystem. An interface subsystem can be too extensive for you to implement
546on your own, but such subsystems are readily available in any of several commercial
547code libraries.
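As a minimal sketch of an interface subsystem, the C++ fragment below confines one operating-system difference, the path separator, to a single place. The names PlatformInterface, WindowsPlatform, PosixPlatform, and JoinPath are hypothetical and chosen only for illustration; a real interface subsystem would wrap far more than this.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface subsystem: the only code that knows which
// operating system the program is running on.
class PlatformInterface {
public:
    virtual ~PlatformInterface() = default;
    virtual char PathSeparator() const = 0;
};

class WindowsPlatform : public PlatformInterface {
public:
    char PathSeparator() const override { return '\\'; }
};

class PosixPlatform : public PlatformInterface {  // for example Linux or Mac OS
public:
    char PathSeparator() const override { return '/'; }
};

// Application code depends only on PlatformInterface, never on a
// specific operating system.
std::string JoinPath(const PlatformInterface& platform,
                     const std::string& directory,
                     const std::string& fileName) {
    assert(!fileName.empty());
    return directory + platform.PathSeparator() + fileName;
}
```

Moving the program to another environment would then mean adding one more class derived from PlatformInterface; JoinPath and its callers would stay as they are.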
Level 3: Division into Classes
Further Reading For a good discussion of database design, see Agile Database
Techniques (Ambler 2003).
Design at this level includes identifying all classes in the system. For example, a
database-interface subsystem might be further partitioned into data access classes and
persistence framework classes as well as database metadata. Figure 5-2, Level 3,
shows how one of Level 2’s subsystems might be divided into classes, and it implies
that the other three subsystems shown at Level 2 are also decomposed into classes.
Details of the ways in which each class interacts with the rest of the system are also
specified as the classes are specified. In particular, the class’s interface is defined.
Overall, the major design activity at this level is making sure that all the subsystems
have been decomposed to a level of detail fine enough that you can implement their
parts as individual classes.
Cross-Reference For details on characteristics of high-quality classes, see Chapter 6,
“Working Classes.”
The division of subsystems into classes is typically needed on any project that takes
longer than a few days. If the project is large, the division is clearly distinct from the
program partitioning of Level 2. If the project is very small, you might move directly
from the whole-system view of Level 1 to the classes view of Level 3.
Classes vs. Objects A key concept in object-oriented design is the differentiation
between objects and classes. An object is any specific entity that exists in your program
at run time. A class is the static thing you look at in the program listing. An
object is the dynamic thing with specific values and attributes you see when you run
the program. For example, you could declare a class Person that had attributes of
name, age, gender, and so on. At run time you would have the objects nancy, hank,
diane, tony, and so on; that is, specific instances of the class. If you’re familiar with
database terms, it’s the same as the distinction between “schema” and “instance.” You
could think of the class as the cookie cutter and the object as the cookie. This book
uses the terms informally and generally refers to classes and objects more or less interchangeably.
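The distinction can be sketched in a few lines of C++. The Person class below is the static description; the values returned by MakeExamplePeople() are objects that exist only at run time. The ages and the helper routine are made up for illustration, and the attribute list is cut down to two fields.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The class: the static thing you read in the program listing.
class Person {
public:
    Person(const std::string& name, int age) : name_(name), age_(age) {
        assert(age >= 0);
    }
    const std::string& Name() const { return name_; }
    int Age() const { return age_; }

private:
    std::string name_;
    int age_;
};

// The objects: specific instances with specific values, created at run time.
// The ages are arbitrary illustration values.
std::vector<Person> MakeExamplePeople() {
    return { Person("nancy", 34), Person("hank", 41),
             Person("diane", 29), Person("tony", 52) };
}
```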
Level 4: Division into Routines
Design at this level includes dividing each class into routines. The class interface
defined at Level 3 will define some of the routines. Design at Level 4 will detail the
class’s private routines. When you examine the details of the routines inside a class,
you can see that many routines are simple boxes but a few are composed of hierarchically
organized routines, which require still more design.
The act of fully defining the class’s routines often results in a better understanding of
the class’s interface, and that causes corresponding changes to the interface; that is,
changes back at Level 3.
This level of decomposition and design is often left up to the individual programmer,
and it’s needed on any project that takes more than a few hours. It doesn’t need to be
done formally, but it at least needs to be done mentally.
Level 5: Internal Routine Design
Cross-Reference For details on creating high-quality routines, see Chapter 7,
“High-Quality Routines,” and Chapter 8, “Defensive Programming.”
Design at the routine level consists of laying out the detailed functionality of the individual
routines. Internal routine design is typically left to the individual programmer
working on an individual routine. The design consists of activities such as writing
pseudocode, looking up algorithms in reference books, deciding how to organize the
paragraphs of code in a routine, and writing programming-language code. This level
of design is always done, though sometimes it’s done unconsciously and poorly
rather than consciously and well. In Figure 5-2, design at this level is marked with a 5.
5.3 Design Building Blocks: Heuristics
Software developers tend to like our answers cut and dried: “Do A, B, and C, and X, Y,
Z will follow every time.” We take pride in learning arcane sets of steps that produce
desired effects, and we become annoyed when instructions don’t work as advertised.
This desire for deterministic behavior is highly appropriate to detailed computer programming,
where that kind of strict attention to detail makes or breaks a program. But
software design is a much different story.
Because design is nondeterministic, skillful application of an effective set of heuristics
is the core activity in good software design. The following subsections describe a number
of heuristics: ways to think about a design that sometimes produce good design
insights. You might think of heuristics as the guides for the trials in “trial and error.”
You undoubtedly have run across some of these before. Consequently, the following
subsections describe each of the heuristics in terms of Software’s Primary Technical
Imperative: managing complexity.
Find Real-World Objects
Ask not first what the system
does; ask WHAT it does it to!
(Bertrand Meyer)
The first and most popular approach to identifying design alternatives is the “by the
book” object-oriented approach, which focuses on identifying real-world and synthetic
objects.
The steps in designing with objects are
■ Identify the objects and their attributes (methods and data).
■ Determine what can be done to each object.
■ Determine what each object is allowed to do to other objects.
■ Determine the parts of each object that will be visible to other objects: which
parts will be public and which will be private.
■ Define each object’s public interface.
Cross-Reference For more details on designing using classes, see Chapter 6,
“Working Classes.”
These steps aren’t necessarily performed in order, and they’re often repeated. Iteration
is important. Each of these steps is summarized below.
Identify the objects and their attributes Computer programs are usually based on
real-world entities. For example, you could base a time-billing system on real-world
employees, clients, timecards, and bills. Figure 5-6 shows an object-oriented view of
such a billing system.
Figure 5-6 This billing system is composed of four major objects. The objects have been
simplified for this example.
Identifying the objects’ attributes is no more complicated than identifying the objects
themselves. Each object has characteristics that are relevant to the computer program.
For example, in the time-billing system, an employee object has a name, a title, and a
billing rate. A client object has a name, a billing address, and an account balance. A bill
object has a billing amount, a client name, a billing date, and so on.
Objects in a graphical user interface system would include windows, dialog boxes,
buttons, fonts, and drawing tools. Further examination of the problem domain might
produce better choices for software objects than a one-to-one mapping to real-world
objects, but the real-world objects are a good place to start.
Determine what can be done to each object A variety of operations can be performed
on each object. In the billing system shown in Figure 5-6, an employee object
could have a change in title or billing rate, a client object could have its name or billing
address changed, and so on.
Determine what each object is allowed to do to other objects This step is just what it
sounds like. The two generic things objects can do to each other are containment and
inheritance. Which objects can contain which other objects? Which objects can inherit
from which other objects? In Figure 5-6, a timecard object can contain an employee
object and a client object, and a bill can contain one or more timecards. In addition, a
bill can indicate that a client has been billed, and a client can enter payments against
a bill. A more complicated system would include additional interactions.
[Figure 5-6 diagram, as recovered from the extracted text: Employee (name, title,
billingRate, GetHoursForMonth()); Client (name, billingAddress, accountBalance,
currentBillingAmount, EnterPayment()); Timecard (hours, date, projectCode); Bill
(billDate, BillForClient()). Association labels: billingEmployee, billingRecords,
clientToBill, bills.]
700a bill. A more complicated system would include additional interactions.
701Cross-Reference For details
702on classes and information
703hiding, see “Hide Secrets
704(Information Hiding)†in
705Section 5.3.
706Determine the parts of each object that will be visible to other objects One of the key
707design decisions is identifying the parts of an object that should be made public and those
708that should be kept private. This decision has to be made for both data and methods.
709Define each object’s interfaces Define the formal, syntactic, programming-languagelevel
710interfaces to each object. The data and methods the object exposes to every other
711object is called the object’s “public interface.†The parts of the object that it exposes to
712derived objects via inheritance is called the object’s “protected interface.†Think about
713both kinds of interfaces.
714When you finish going through the steps to achieve a top-level object-oriented system
715organization, you’ll iterate in two ways. You’ll iterate on the top-level system organization
716to get a better organization of classes. You’ll also iterate on each of the classes
717you’ve defined, driving the design of each class to a more detailed level.
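The steps above can be sketched for the time-billing example roughly as follows. This is one possible C++ reading of the billing system, not the book's own code: the attribute names follow the figure, while AddCharge(), Amount(), AddTimecard(), and the accessors are assumptions added so that the sketch is self-contained.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class Employee {
public:
    Employee(std::string name, std::string title, double billingRate)
        : name_(std::move(name)), title_(std::move(title)),
          billingRate_(billingRate) {
        assert(billingRate >= 0.0);
    }
    // What can be done to an employee: change the title or billing rate.
    void SetTitle(const std::string& title) { title_ = title; }
    void SetBillingRate(double rate) { assert(rate >= 0.0); billingRate_ = rate; }
    const std::string& Name() const { return name_; }
    double BillingRate() const { return billingRate_; }

private:  // hidden from other objects
    std::string name_;
    std::string title_;
    double billingRate_;
};

class Client {
public:
    explicit Client(std::string name) : name_(std::move(name)) {}
    void AddCharge(double amount) { accountBalance_ += amount; }  // assumed helper
    void EnterPayment(double amount) {
        assert(amount > 0.0);
        accountBalance_ -= amount;
    }
    double AccountBalance() const { return accountBalance_; }

private:
    std::string name_;
    double accountBalance_ = 0.0;
};

// A timecard refers to the employee who worked and the client to bill.
struct Timecard {
    const Employee* billingEmployee;
    Client* clientToBill;
    double hours;
};

// A bill contains one or more timecards.
class Bill {
public:
    void AddTimecard(const Timecard& card) { billingRecords_.push_back(card); }
    double Amount() const {
        double total = 0.0;
        for (const Timecard& card : billingRecords_) {
            total += card.hours * card.billingEmployee->BillingRate();
        }
        return total;
    }

private:
    std::vector<Timecard> billingRecords_;
};
```

Each class exposes only the operations named in the steps; the data members stay private, so a later iteration on the design can change them without touching the other three classes.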
Form Consistent Abstractions
Abstraction is the ability to engage with a concept while safely ignoring some of its
details, handling different details at different levels. Any time you work with an aggregate,
you’re working with an abstraction. If you refer to an object as a “house” rather
than a combination of glass, wood, and nails, you’re making an abstraction. If you
refer to a collection of houses as a “town,” you’re making another abstraction.
Base classes are abstractions that allow you to focus on common attributes of a set of
derived classes and ignore the details of the specific classes while you’re working on
the base class. A good class interface is an abstraction that allows you to focus on the
interface without needing to worry about the internal workings of the class. The interface
to a well-designed routine provides the same benefit at a lower level of detail, and
the interface to a well-designed package or subsystem provides that benefit at a higher
level of detail.
From a complexity point of view, the principal benefit of abstraction is that it allows
you to ignore irrelevant details. Most real-world objects are already abstractions of
some kind. As just mentioned, a house is an abstraction of windows, doors, siding,
wiring, plumbing, insulation, and a particular way of organizing them. A door is in
turn an abstraction of a particular arrangement of a rectangular piece of material with
hinges and a doorknob. And the doorknob is an abstraction of a particular formation
of brass, nickel, iron, or steel.
People use abstraction continuously. If you had to deal with individual wood fibers,
varnish molecules, and steel molecules every time you used your front door, you’d
hardly make it in or out of your house each day. As Figure 5-7 suggests, abstraction is
a big part of how we deal with complexity in the real world.
Figure 5-7 Abstraction allows you to take a simpler view of a complex concept.
Cross-Reference For more details on abstraction in class design, see “Good
Abstraction” in Section 6.2.
Software developers sometimes build systems at the wood-fiber, varnish-molecule,
and steel-molecule level. This makes the systems overly complex and intellectually
hard to manage. When programmers fail to provide larger programming abstractions,
the system itself sometimes fails to make it through the front door.
Good programmers create abstractions at the routine-interface level, class-interface
level, and package-interface level (in other words, the doorknob level, door level, and
house level), and that supports faster and safer programming.
Encapsulate Implementation Details
Encapsulation picks up where abstraction leaves off. Abstraction says, “You’re allowed
to look at an object at a high level of detail.” Encapsulation says, “Furthermore, you
aren’t allowed to look at an object at any other level of detail.”
Continuing with the housing-materials analogy: encapsulation is a way of saying that
you can look at the outside of the house but you can’t get close enough to make out
the door’s details. You are allowed to know that there’s a door, and you’re allowed to
know whether the door is open or closed, but you’re not allowed to know whether the
door is made of wood, fiberglass, steel, or some other material, and you’re certainly
not allowed to look at each individual wood fiber.
As Figure 5-8 suggests, encapsulation helps to manage complexity by forbidding you
to look at the complexity. The section titled “Good Encapsulation” in Section 6.2 provides
more background on encapsulation as it applies to class design.
Figure 5-8 Encapsulation says that, not only are you allowed to take a simpler view of a
complex concept, you are not allowed to look at any of the details of the complex concept.
What you see is what you get: it’s all you get!
Inherit When Inheritance Simplifies the Design
In designing a software system, you’ll often find objects that are much like other
objects, except for a few differences. In an accounting system, for instance, you might
have both full-time and part-time employees. Most of the data associated with both
kinds of employees is the same, but some is different. In object-oriented programming,
you can define a general type of employee and then define full-time employees
as general employees, except for a few differences, and part-time employees also as
general employees, except for a few differences. When an operation on an employee
doesn’t depend on the type of employee, the operation is handled as if the employee
were just a general employee. When the operation depends on whether the employee
is full-time or part-time, the operation is handled differently.
Defining similarities and differences among such objects is called “inheritance”
because the specific part-time and full-time employees inherit characteristics from the
general-employee type.
The benefit of inheritance is that it works synergistically with the notion of abstraction.
Abstraction deals with objects at different levels of detail. Recall the door that
was a collection of certain kinds of molecules at one level, a collection of wood fibers
at the next, and something that keeps burglars out of your house at the next level.
Wood has certain properties (for example, you can cut it with a saw or glue it with
wood glue), and two-by-fours or cedar shingles have the general properties of wood as
well as some specific properties of their own.
Inheritance simplifies programming because you write a general routine to handle
anything that depends on a door’s general properties and then write specific routines
to handle specific operations on specific kinds of doors. Some operations, such as
Open() or Close(), might apply regardless of whether the door is a solid door, interior
door, exterior door, screen door, French door, or sliding glass door. The ability of a
language to support operations like Open() or Close() without knowing until run time
what kind of door you’re dealing with is called “polymorphism.” Object-oriented languages
such as C++, Java, and later versions of Microsoft Visual Basic support inheritance
and polymorphism.
Inheritance is one of object-oriented programming’s most powerful tools. It can provide
great benefits when used well, and it can do great damage when used naively. For
details, see “Inheritance (“is a” Relationships)” in Section 6.3.
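The door example might look something like this in C++. It is only a sketch under assumed details: the latch on the sliding glass door and the OpenAll() routine are invented here to give the derived class something specific to do and the caller something polymorphic to call.

```cpp
#include <cassert>
#include <vector>

// General door: behavior that applies to every kind of door.
class Door {
public:
    virtual ~Door() = default;
    virtual void Open() { isOpen_ = true; }
    virtual void Close() { isOpen_ = false; }
    bool IsOpen() const { return isOpen_; }

private:
    bool isOpen_ = false;
};

// A screen door is a general door with no differences that matter here.
class ScreenDoor : public Door {};

// A sliding glass door is a general door, except for a latch (an assumed detail).
class SlidingGlassDoor : public Door {
public:
    void Open() override { latched_ = false; Door::Open(); }
    void Close() override { Door::Close(); latched_ = true; }
    bool IsLatched() const { return latched_; }

private:
    bool latched_ = true;
};

// Polymorphism: this routine calls Open() without knowing until run time
// which kind of door each pointer refers to.
void OpenAll(const std::vector<Door*>& doors) {
    for (Door* door : doors) {
        assert(door != nullptr);
        door->Open();
    }
}
```

OpenAll() is written once against the general Door type; adding a French door later would not require changing it.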
Hide Secrets (Information Hiding)
Information hiding is part of the foundation of both structured design and object-oriented
design. In structured design, the notion of “black boxes” comes from information
hiding. In object-oriented design, it gives rise to the concepts of encapsulation
and modularity and it is associated with the concept of abstraction. Information hiding
is one of the seminal ideas in software development, and so this subsection
explores it in depth.
Information hiding first came to public attention in a paper published by David Parnas
in 1972 called “On the Criteria to Be Used in Decomposing Systems Into Modules.”
Information hiding is characterized by the idea of “secrets,” design and
implementation decisions that a software developer hides in one place from the rest of
a program.
In the 20th Anniversary edition of The Mythical Man Month, Fred Brooks concluded
that his criticism of information hiding was one of the few ways in which the first edition
of his book was wrong. “Parnas was right, and I was wrong about information
hiding,” he proclaimed (Brooks 1995). Barry Boehm reported that information hiding
was a powerful technique for eliminating rework, and he pointed out that it was particularly
effective in incremental, high-change environments (Boehm 1987).
Information hiding is a particularly powerful heuristic for Software’s Primary Technical
Imperative because, beginning with its name and throughout its details, it emphasizes
hiding complexity.
Secrets and the Right to Privacy
In information hiding, each class (or package or routine) is characterized by the
design or construction decisions that it hides from all other classes. The secret might
be an area that’s likely to change, the format of a file, the way a data type is implemented,
or an area that needs to be walled off from the rest of the program so that
errors in that area cause as little damage as possible. The class’s job is to keep this
information hidden and to protect its own right to privacy. Minor changes to a system
might affect several routines within a class, but they should not ripple beyond the
class interface.
Strive for class interfaces
that are complete and minimal.
(Scott Meyers)
One key task in designing a class is deciding which features should be known outside
the class and which should remain secret. A class might use 25 routines and expose
only 5 of them, using the other 20 internally. A class might use several data types and
expose no information about them. This aspect of class design is also known as “visibility”
since it has to do with which features of the class are “visible” or “exposed” outside
the class.
The interface to a class should reveal as little as possible about its inner workings. As
shown in Figure 5-9, a class is a lot like an iceberg: seven-eighths is under water, and
you can see only the one-eighth that’s above the surface.
Figure 5-9 A good class interface is like the tip of an iceberg, leaving most of the class
unexposed.
Designing the class interface is an iterative process just like any other aspect of design.
If you don’t get the interface right the first time, try a few more times until it stabilizes.
If it doesn’t stabilize, you need to try a different approach.
An Example of Information Hiding
Suppose you have a program in which each object is supposed to have a unique ID
stored in a member variable called id. One design approach would be to use integers
for the IDs and to store the highest ID assigned so far in a global variable called
g_maxId. As each new object is allocated, perhaps in each object’s constructor, you
could simply use the id = ++g_maxId statement, which would guarantee a unique id,
and it would add the absolute minimum of code in each place an object is created.
What could go wrong with that?
A lot of things could go wrong. What if you want to reserve ranges of IDs for special
purposes? What if you want to use nonsequential IDs to improve security? What if you
want to be able to reuse the IDs of objects that have been destroyed? What if you want
to add an assertion that fires when you allocate more IDs than the maximum number
you’ve anticipated? If you allocated IDs by spreading id = ++g_maxId statements
throughout your program, you would have to change code associated with every one
of those statements. And, if your program is multithreaded, this approach won’t be
thread-safe.
The way that new IDs are created is a design decision that you should hide. If you use
the phrase ++g_maxId throughout your program, you expose the way a new ID is created,
which is simply by incrementing g_maxId. If instead you put the id = NewId()
statement throughout your program, you hide the information about how new IDs are
created. Inside the NewId() routine you might still have just one line of code, return
( ++g_maxId ) or its equivalent, but if you later decide to reserve certain ranges of IDs
for special purposes or to reuse old IDs, you could make those changes within the
NewId() routine itself, without touching dozens or hundreds of id = NewId() statements.
No matter how complicated the revisions inside NewId() might become, they
wouldn’t affect any other part of the program.
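A minimal sketch of the NewId() routine described above, in C++. The limit MAX_ID, its value, and the Account struct are hypothetical additions for illustration; the assertion shows one of the later changes the text mentions being made inside the routine without touching any caller.

```cpp
#include <cassert>

namespace {
// Both names are now private to this file. MAX_ID is a hypothetical limit.
int g_maxId = 0;
const int MAX_ID = 10000;
}

// The only routine that knows how new IDs are created.
int NewId() {
    // Fires when more IDs are allocated than anticipated.
    assert(g_maxId < MAX_ID);
    return ++g_maxId;
}

// Callers state only that they need a new ID, not how it is made.
struct Account {
    int id = NewId();
};
```

If the program later became multithreaded, the counter could be replaced with something like std::atomic<int> inside this one routine, and Account and every other caller would stay unchanged.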
Now suppose you discover you need to change the type of the ID from an integer to a
string. If you’ve spread variable declarations like int id throughout your program, your
use of the NewId() routine won’t help. You’ll still have to go through your program
and make dozens or hundreds of changes.
An additional secret to hide is the ID’s type. By exposing the fact that IDs are integers,
you encourage programmers to perform integer operations like >, <, = on them.
In C++, you could use a simple typedef to declare your IDs to be of IdType (a user-defined
type that resolves to int) rather than directly declaring them to be of type
int. Alternatively, in C++ and other languages you could create a simple IdType class.
Once again, hiding a design decision makes a huge difference in the amount of code
affected by a change.
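Both options can be sketched briefly in C++. The typedef version is a single line, shown in the comment; the class version below it is one possible shape for a simple IdType class, and its choice of operations (equality only) is an assumption made here to illustrate how the class can discourage integer-style comparisons.

```cpp
#include <cassert>

// Option 1, the simple typedef from the text:
//     typedef int IdType;
// Option 2, sketched here: a small class that hides the underlying type.
class IdType {
public:
    explicit IdType(int value) : value_(value) { assert(value > 0); }

    // Only equality is offered; < and > are deliberately left out.
    bool operator==(const IdType& other) const { return value_ == other.value_; }
    bool operator!=(const IdType& other) const { return value_ != other.value_; }

private:
    int value_;  // could later become a string without changing callers
};

// Callers declare IDs as IdType and never mention int.
IdType NewId() {
    static int maxId = 0;
    return IdType(++maxId);
}
```

With either option, changing the representation means editing the definition of IdType rather than dozens or hundreds of int id declarations.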
Information hiding is useful at all levels of design, from the use of named constants
instead of literals, to creation of data types, to class design, routine design, and subsystem
design.
Two Categories of Secrets
Secrets in information hiding fall into two general camps:
■ Hiding complexity so that your brain doesn’t have to deal with it unless you’re
specifically concerned with it
■ Hiding sources of change so that when change occurs, the effects are localized
Sources of complexity include complicated data types, file structures, boolean tests,
involved algorithms, and so on. A comprehensive list of sources of change is described
later in this chapter.
Barriers to Information Hiding
Further Reading Parts of this section are adapted from “Designing Software
for Ease of Extension and Contraction” (Parnas 1979).
In a few instances, information hiding is truly impossible, but most of the barriers to
information hiding are mental blocks built up from the habitual use of other techniques.
Excessive distribution of information One common barrier to information hiding is
an excessive distribution of information throughout a system. You might have hard-coded
the literal 100 throughout a system. Using 100 as a literal decentralizes references
to it. It’s better to hide the information in one place, in a constant
MAX_EMPLOYEES perhaps, whose value is changed in only one place.
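In C++ the constant might be declared as in the sketch below. The CanAddEmployee() routine is a hypothetical caller, included only to show code that refers to the name rather than to the literal.

```cpp
#include <cassert>
#include <cstddef>

// The value 100 appears in exactly one place.
const std::size_t MAX_EMPLOYEES = 100;

// Hypothetical caller: it refers to the name, never to the literal.
bool CanAddEmployee(std::size_t currentCount) {
    assert(currentCount <= MAX_EMPLOYEES);
    return currentCount < MAX_EMPLOYEES;
}
```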
Another example of excessive information distribution is interleaving interaction with
human users throughout a system. If the mode of interaction changes (say, from a
GUI interface to a command line interface), virtually all the code will have to be modified.
It’s better to concentrate user interaction in a single class, package, or subsystem
you can change without affecting the whole system.
Cross-Reference For more on accessing global data through class interfaces, see
“Using Access Routines Instead of Global Data” in Section 13.3.
Yet another example would be a global data element, perhaps an array of employee
data with 1000 elements maximum, that’s accessed throughout a program. If the program
uses the global data directly, information about the data item’s implementation
(such as the fact that it’s an array and has a maximum of 1000 elements) will be
spread throughout the program. If the program uses the data only through access routines,
only the access routines will know the implementation details.
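A sketch of such access routines in C++ follows. The class name EmployeeData and its routines are assumptions for illustration, and the employee record is reduced to a name; the point is that only this class knows how the data is stored and how many elements it may hold.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class EmployeeData {
public:
    static constexpr std::size_t MAX_EMPLOYEES = 1000;

    // Returns false when the table is full.
    bool Add(const std::string& name) {
        if (names_.size() >= MAX_EMPLOYEES) {
            return false;
        }
        names_.push_back(name);
        return true;
    }

    std::size_t Count() const { return names_.size(); }

    const std::string& NameAt(std::size_t index) const {
        assert(index < names_.size());
        return names_[index];
    }

private:
    // Implementation detail: could become a fixed array, a map, or a
    // database table without affecting callers.
    std::vector<std::string> names_;
};
```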
Circular dependencies A more subtle barrier to information hiding is circular dependencies,
as when a routine in class A calls a routine in class B, and a routine in class B
calls a routine in class A.
Avoid such dependency loops. They make it hard to test a system because you can’t
test either class A or class B until at least part of the other is ready.
Class data mistaken for global data If you’re a conscientious programmer, one of
the barriers to effective information hiding might be thinking of class data as global
data and avoiding it because you want to avoid the problems associated with global
data. While the road to programming hell is paved with global variables, class data
presents far fewer risks.
Global data is generally subject to two problems: routines operate on global data without
knowing that other routines are operating on it, and routines are aware that other routines
are operating on the global data but they don’t know exactly what they’re doing to
it. Class data isn’t subject to either of these problems. Direct access to the data is
restricted to a few routines organized into a single class. The routines are aware that other
routines operate on the data, and they know exactly which other routines they are.
Of course, this whole discussion assumes that your system makes use of well-designed,
small classes. If your program is designed to use huge classes that contain
dozens of routines each, the distinction between class data and global data will begin
to blur and class data will be subject to many of the same problems as global data.
Cross-Reference Code-level performance optimizations are discussed in Chapter 25,
“Code-Tuning Strategies” and Chapter 26, “Code-Tuning Techniques.”
Perceived performance penalties A final barrier to information hiding can be an
attempt to avoid performance penalties at both the architectural and the coding levels.
You don’t need to worry at either level. At the architectural level, the worry is unnecessary
because architecting a system for information hiding doesn’t conflict with
architecting it for performance. If you keep both information hiding and performance
in mind, you can achieve both objectives.
The more common worry is at the coding level. The concern is that accessing data
items indirectly incurs run-time performance penalties for additional levels of object
instantiations, routine calls, and so on. This concern is premature. Until you can measure
the system’s performance and pinpoint the bottlenecks, the best way to prepare
for code-level performance work is to create a highly modular design. When you
detect hot spots later, you can optimize individual classes and routines without affecting
the rest of the system.
Value of Information Hiding
Information hiding is one of the few theoretical techniques that has indisputably proven
its value in practice, which has been true for a long time (Boehm 1987a). Large programs
that use information hiding were found years ago to be easier to modify, by a factor
of 4, than programs that don’t (Korson and Vaishnavi 1986). Moreover, information
hiding is part of the foundation of both structured design and object-oriented design.
Information hiding has unique heuristic power, a unique ability to inspire effective
design solutions. Traditional object-oriented design provides the heuristic power of
modeling the world in objects, but object thinking wouldn’t help you avoid declaring
the ID as an int instead of an IdType. The object-oriented designer would ask, “Should
an ID be treated as an object?” Depending on the project’s coding standards, a “Yes”
answer might mean that the programmer has to write a constructor, destructor, copy
operator, and assignment operator; comment it all; and place it under configuration
control. Most programmers would decide, “No, it isn’t worth creating a whole class
just for an ID. I’ll just use ints.”
Note what just happened. A useful design alternative, that of simply hiding the ID’s
data type, was not even considered. If, instead, the designer had asked, “What about
the ID should be hidden?” he might well have decided to hide its type behind a simple
type declaration that substitutes IdType for int. The difference between object-oriented
design and information hiding in this example is more subtle than a clash of explicit
rules and regulations. Object-oriented design would approve of this design decision
as much as information hiding would. Rather, the difference is one of heuristics:
thinking about information hiding inspires and promotes design decisions that thinking
about objects does not.
1004Information hiding can also be useful in designing a class’s public interface. The gap
1005between theory and practice in class design is wide, and among many class designers
1006the decision about what to put into a class’s public interface amounts to deciding
1007what interface would be the most convenient to use, which usually results in exposing
1008as much of the class as possible. From what I’ve seen, some programmers would
1009rather expose all of a class’s private data than write 10 extra lines of code to keep the
1010class’s secrets intact.
1011Asking “What does this class need to hide?†cuts to the heart of the interface-design
1012issue. If you can put a function or data into the class’s public interface without compromising
1013its secrets, do. Otherwise, don’t.
1014Asking about what needs to be hidden supports good design decisions at all levels. It
1015promotes the use of named constants instead of literals at the construction level. It
1016helps in creating good routine and parameter names inside classes. It guides decisions
1017about class and subsystem decompositions and interconnections at the system level.
Get into the habit of asking “What should I hide?” You’ll be surprised at how many difficult
1019design issues dissolve before your eyes.
1020Identify Areas Likely to Change
Further Reading: The approach described in this section is adapted from “Designing Software for Ease of Extension and Contraction” (Parnas 1979).
1027A study of great designers found that one attribute they had in common was their ability
1028to anticipate change (Glass 1995). Accommodating changes is one of the most
1029challenging aspects of good program design. The goal is to isolate unstable areas so
1030that the effect of a change will be limited to one routine, class, or package. Here are the
1031steps you should follow in preparing for such perturbations.
1. Identify items that seem likely to change. If the requirements have been done
well, they include a list of potential changes and the likelihood of each change.
In such a case, identifying the likely changes is easy. If the requirements don’t
cover potential changes, see the discussion that follows of areas that are likely to
change on any project.
2. Separate items that are likely to change. Compartmentalize each volatile component
identified in step 1 into its own class or into a class with other volatile
components that are likely to change at the same time.
3. Isolate items that seem likely to change. Design the interclass interfaces to be
insensitive to the potential changes. Design the interfaces so that changes are
limited to the inside of the class and the outside remains unaffected. Any other
class using the changed class should be unaware that the change has occurred.
The class’s interface should protect its secrets.
1047Here are a few areas that are likely to change:
Cross-Reference: One of the most powerful techniques for anticipating change is to use table-driven methods. For details, see Chapter 18, “Table-Driven Methods.”
1054Business rules Business rules tend to be the source of frequent software changes.
1055Congress changes the tax structure, a union renegotiates its contract, or an insurance
1056company changes its rate tables. If you follow the principle of information hiding,
1057logic based on these rules won’t be strewn throughout your program. The logic will
1058stay hidden in a single dark corner of the system until it needs to be changed.
1059Hardware dependencies Examples of hardware dependencies include interfaces to
1060screens, printers, keyboards, mice, disk drives, sound facilities, and communications
1061devices. Isolate hardware dependencies in their own subsystem or class. Isolating
1062such dependencies helps when you move the program to a new hardware environment.
1063It also helps initially when you’re developing a program for volatile hardware.
1064You can write software that simulates interaction with specific hardware, have the
1065hardware-interface subsystem use the simulator as long as the hardware is unstable or
1066unavailable, and then unplug the hardware-interface subsystem from the simulator
1067and plug the subsystem into the hardware when it’s ready to use.
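The simulator arrangement described above might look roughly like the following sketch. Printer, SimulatedPrinter, and PrinterSubsystem are hypothetical names; a real device class would supply the same write() method and could be swapped in without touching the subsystem.

```python
from typing import Protocol


class Printer(Protocol):
    """Hypothetical hardware-facing interface for this sketch."""

    def write(self, text: str) -> None: ...


class SimulatedPrinter:
    """Stands in for the device while the real hardware is unstable or unavailable."""

    def __init__(self) -> None:
        self.pages: list[str] = []

    def write(self, text: str) -> None:
        self.pages.append(text)


class PrinterSubsystem:
    """Hardware-interface subsystem: the only code that talks to a Printer."""

    def __init__(self, device: Printer) -> None:
        self._device = device

    def print_report(self, lines: list[str]) -> None:
        self._device.write("\n".join(lines))


# Plug the subsystem into the simulator; later, pass the real device instead.
simulator = SimulatedPrinter()
PrinterSubsystem(simulator).print_report(["Q1 totals", "Q2 totals"])
```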
1068Input and output At a slightly higher level of design than raw hardware interfaces,
1069input/output is a volatile area. If your application creates its own data files, the file format
1070will probably change as your application becomes more sophisticated. User-level
1071input and output formats will also change—the positioning of fields on the page, the
1072number of fields on each page, the sequence of fields, and so on. In general, it’s a good
1073idea to examine all external interfaces for possible changes.
1074Nonstandard language features Most language implementations contain handy,
1075nonstandard extensions. Using the extensions is a double-edged sword because they
1076might not be available in a different environment, whether the different environment
1077is different hardware, a different vendor’s implementation of the language, or a new
1078version of the language from the same vendor.
1079If you use nonstandard extensions to your programming language, hide those extensions
1080in a class of their own so that you can replace them with your own code when
1081you move to a different environment. Likewise, if you use library routines that aren’t
1082available in all environments, hide the actual library routines behind an interface that
1083works just as well in another environment.
1084Difficult design and construction areas It’s a good idea to hide difficult design and
1085construction areas because they might be done poorly and you might need to do them
1086again. Compartmentalize them and minimize the impact their bad design or construction
1087might have on the rest of the system.
1088Status variables Status variables indicate the state of a program and tend to be
1089changed more frequently than most other data. In a typical scenario, you might originally
1090define an error-status variable as a boolean variable and decide later that it
1092would be better implemented as an enumerated type with the values ErrorType_None,
1093ErrorType_Warning, and ErrorType_Fatal.
You can add at least two levels of flexibility and readability to your use of status variables:
■ Don’t use a boolean variable as a status variable. Use an enumerated type
instead. It’s common to add a new state to a status variable, and adding a new
value to an enumerated type requires a mere recompilation rather than a major
revision of every line of code that checks the variable.
■ Use access routines rather than checking the variable directly. By checking the
access routine rather than the variable, you allow for the possibility of more
sophisticated state detection. For example, if you wanted to check combinations
of an error-state variable and a current-function-state variable, it would be easy
to do if the test were hidden in a routine and hard to do if it were a complicated
test hard-coded throughout the program.
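Both suggestions can be combined in a few lines. The sketch below is one possible arrangement, not a prescribed one: the enumeration's members follow the ErrorType values named in the text, while set_error_status() and can_continue() are hypothetical access routines.

```python
from enum import Enum


class ErrorType(Enum):
    # Members mirror ErrorType_None, ErrorType_Warning, and ErrorType_Fatal.
    NONE = 0
    WARNING = 1
    FATAL = 2


_error_status = ErrorType.NONE  # hypothetical module-level status variable


def set_error_status(status: ErrorType) -> None:
    global _error_status
    _error_status = status


def can_continue() -> bool:
    """Access routine: callers ask this instead of reading the variable."""
    return _error_status is not ErrorType.FATAL


set_error_status(ErrorType.WARNING)
print(can_continue())  # True: a warning does not stop the program
```

Because callers go through can_continue(), the test could later combine several status variables without any change to the calling code.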
1105Data-size constraints When you declare an array of size 100, you’re exposing information
1106to the world that the world doesn’t need to see. Defend your right to privacy!
1107Information hiding isn’t always as complicated as a whole class. Sometimes it’s as simple
1108as using a named constant such as MAX_EMPLOYEES to hide a 100.
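A small illustration of the named-constant idea follows. MAX_EMPLOYEES and its value of 100 come from the text; the employees list and add_employee() are hypothetical.

```python
MAX_EMPLOYEES = 100  # the single place that knows the limit

employees: list[str] = []


def add_employee(name: str) -> bool:
    """Add an employee unless the roster is full; report whether it was added."""
    if len(employees) >= MAX_EMPLOYEES:
        return False
    employees.append(name)
    return True
```

Raising the limit then means editing one line rather than hunting for every literal 100.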
1109Anticipating Different Degrees of Change
Cross-Reference: This section’s approach to anticipating change does not involve designing ahead or coding ahead. For a discussion of those practices, see “A program contains code that seems like it might be needed someday” in Section 24.2.
1119When thinking about potential changes to a system, design the system so that the
1120effect or scope of the change is proportional to the chance that the change will occur.
1121If a change is likely, make sure that the system can accommodate it easily. Only
1122extremely unlikely changes should be allowed to have drastic consequences for more
1123than one class in a system. Good designers also factor in the cost of anticipating
1124change. If a change is not terribly likely but easy to plan for, you should think harder
1125about anticipating it than if it isn’t very likely and is difficult to plan for.
Further Reading: This discussion draws on the approach described in “On the design and development of program families” (Parnas 1976).
1132A good technique for identifying areas likely to change is first to identify the minimal
1133subset of the program that might be of use to the user. The subset makes up the core
1134of the system and is unlikely to change. Next, define minimal increments to the system.
1135They can be so small that they seem trivial. As you consider functional changes,
1136be sure also to consider qualitative changes: making the program thread-safe, making
1137it localizable, and so on. These areas of potential improvement constitute potential
1138changes to the system; design these areas using the principles of information hiding.
1139By identifying the core first, you can see which components are really add-ons and
1140then extrapolate and hide improvements from there.
1142Keep Coupling Loose
Coupling describes how tightly a class or routine is related to other classes or routines.
The goal is to create classes and routines with small, direct, visible, and flexible
relations to other classes and routines, which is known as “loose coupling.” The concept
of coupling applies equally to classes and routines, so for the rest of this discussion
I’ll use the word “module” to refer to both classes and routines.
1148Good coupling between modules is loose enough that one module can easily be used
1149by other modules. Model railroad cars are coupled by opposing hooks that latch
1150when pushed together. Connecting two cars is easy—you just push the cars together.
1151Imagine how much more difficult it would be if you had to screw things together, or
1152connect a set of wires, or if you could connect only certain kinds of cars to certain
1153other kinds of cars. The coupling of model railroad cars works because it’s as simple
1154as possible. In software, make the connections among modules as simple as possible.
1155Try to create modules that depend little on other modules. Make them detached, as
1156business associates are, rather than attached, as Siamese twins are. A routine like sin()
1157is loosely coupled because everything it needs to know is passed in to it with one
value representing an angle in degrees. A routine such as InitVars( var1, var2, var3, ...,
1159varN ) is more tightly coupled because, with all the variables it must pass, the calling
1160module practically knows what is happening inside InitVars(). Two classes that
1161depend on each other’s use of the same global data are even more tightly coupled.
1162Coupling Criteria
1163Here are several criteria to use in evaluating coupling between modules:
1164Size Size refers to the number of connections between modules. With coupling,
1165small is beautiful because it’s less work to connect other modules to a module that has
1166a smaller interface. A routine that takes one parameter is more loosely coupled to
modules that call it than a routine that takes six parameters. A class with four well-defined
public methods is more loosely coupled to modules that use it than a class
1169that exposes 37 public methods.
1170Visibility Visibility refers to the prominence of the connection between two modules.
1171Programming is not like being in the CIA; you don’t get credit for being sneaky.
1172It’s more like advertising; you get lots of credit for making your connections as blatant
1173as possible. Passing data in a parameter list is making an obvious connection and is
1174therefore good. Modifying global data so that another module can use that data is a
1175sneaky connection and is therefore bad. Documenting the global-data connection
1176makes it more obvious and is slightly better.
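The contrast between a sneaky and a blatant connection can be sketched as follows. The routine names and the 0.25 rate are invented for illustration.

```python
# Sneaky connection: the two routines communicate through shared global data.
_tax_rate = 0.0


def load_tax_rate_sneaky() -> None:
    global _tax_rate
    _tax_rate = 0.25  # hypothetical rate


def tax_due_sneaky(income: float) -> float:
    # Silently depends on load_tax_rate_sneaky() having been called first.
    return income * _tax_rate


# Visible connection: everything the routine needs is in its parameter list.
def tax_due(income: float, tax_rate: float) -> float:
    return income * tax_rate
```

A reader of tax_due() can see its whole dependency in the signature. A reader of tax_due_sneaky() has to discover the global and the required call order.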
1177Flexibility Flexibility refers to how easily you can change the connections between
1178modules. Ideally, you want something more like the USB connector on your computer
1179than like bare wire and a soldering gun. Flexibility is partly a product of the other
1181coupling characteristics, but it’s a little different too. Suppose you have a routine that
1182looks up the amount of vacation an employee receives each year, given a hiring date and
1183a job classification. Name the routine LookupVacationBenefit(). Suppose in another
1184module you have an employee object that contains the hiring date and the job classification,
1185among other things, and that module passes the object to LookupVacationBenefit().
1186From the point of view of the other criteria, the two modules would look loosely coupled.
1187The employee connection between the two modules is visible, and there’s only
1188one connection. Now suppose that you need to use the LookupVacationBenefit() module
1189from a third module that doesn’t have an employee object but that does have a hiring
1190date and a job classification. Suddenly LookupVacationBenefit() looks less friendly,
1191unwilling to associate with the new module.
1192For the third module to use LookupVacationBenefit(), it has to know about the
1193Employee class. It could dummy up an employee object with only two fields, but that
1194would require internal knowledge of LookupVacationBenefit(), namely that those are
1195the only fields it uses. Such a solution would be a kludge, and an ugly one. The second
1196option would be to modify LookupVacationBenefit() so that it would take hiring date
1197and job classification instead of employee. In either case, the original module turns out
1198to be a lot less flexible than it seemed to be at first.
1199The happy ending to the story is that an unfriendly module can make friends if it’s
1200willing to be flexible—in this case, by changing to take hiring date and job classification
1201specifically instead of employee.
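Here is a rough sketch of the more flexible version. The Employee fields, the extra today parameter, and the benefit rule itself are all assumptions made for the example; only the idea of passing hiring date and job classification directly comes from the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    # Hypothetical fields; the text says only that the object holds these "among other things".
    name: str
    hiring_date: date
    job_classification: str


def lookup_vacation_benefit(hiring_date: date, job_classification: str, today: date) -> int:
    """Return vacation days per year. The rule is invented for this sketch."""
    years_of_service = today.year - hiring_date.year
    base_days = 15 if job_classification == "manager" else 10
    return base_days + min(years_of_service, 10)


# A module that has an Employee passes the two fields it holds...
pat = Employee("Pat", date(2015, 3, 1), "manager")
print(lookup_vacation_benefit(pat.hiring_date, pat.job_classification, date(2020, 6, 1)))  # 20

# ...and a module with no Employee object can call the same routine directly.
print(lookup_vacation_benefit(date(2019, 1, 15), "clerk", date(2020, 6, 1)))  # 11
```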
1202In short, the more easily other modules can call a module, the more loosely coupled
1203it is, and that’s good because it’s more flexible and maintainable. In creating a system
1204structure, break up the program along the lines of minimal interconnectedness. If a
1205program were a piece of wood, you would try to split it with the grain.
1206Kinds of Coupling
1207Here are the most common kinds of coupling you’ll encounter.
1208Simple-data-parameter coupling Two modules are simple-data-parameter coupled if
1209all the data passed between them are of primitive data types and all the data is passed
1210through parameter lists. This kind of coupling is normal and acceptable.
1211Simple-object coupling A module is simple-object coupled to an object if it instantiates
1212that object. This kind of coupling is fine.
1213Object-parameter coupling Two modules are object-parameter coupled to each
1214other if Object1 requires Object2 to pass it an Object3. This kind of coupling is tighter
1215than Object1 requiring Object2 to pass it only primitive data types because it requires
1216Object2 to know about Object3.
Semantic coupling The most insidious kind of coupling occurs when one module
makes use not of some syntactic element of another module but of some semantic
knowledge of another module’s inner workings. Here are some examples:
■ Module1 passes a control flag to Module2 that tells Module2 what to do. This
approach requires Module1 to make assumptions about the internal workings of
Module2, namely what Module2 is going to do with the control flag. If Module2
defines a specific data type for the control flag (enumerated type or object), this
usage is probably OK.
■ Module2 uses global data after the global data has been modified by Module1.
This approach requires Module2 to assume that Module1 has modified the data
in the ways Module2 needs it to be modified, and that Module1 has been called at
the right time.
■ Module1’s interface states that its Module1.Initialize() routine should be called
before its Module1.Routine() is called. Module2 knows that Module1.Routine()
calls Module1.Initialize() anyway, so it just instantiates Module1 and calls
Module1.Routine() without calling Module1.Initialize() first.
■ Module1 passes Object to Module2. Because Module1 knows that Module2 uses
only three of Object’s seven methods, it initializes Object only partially, with the
specific data those three methods need.
■ Module1 passes BaseObject to Module2. Because Module2 knows that Module1 is
really passing it DerivedObject, it casts BaseObject to DerivedObject and calls
methods that are specific to DerivedObject.
1240Semantic coupling is dangerous because changing code in the used module can break
1241code in the using module in ways that are completely undetectable by the compiler.
1242When code like this breaks, it breaks in subtle ways that seem unrelated to the change
1243made in the used module, which turns debugging into a Sisyphean task.
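The first example in the list notes that a control flag is probably acceptable when the called module defines a specific type for it. A minimal sketch of that milder case, with the hypothetical names ReportFormat and render_report, might be:

```python
from enum import Enum


class ReportFormat(Enum):
    """Hypothetical control-flag type published by the called module."""

    SUMMARY = "summary"
    DETAILED = "detailed"


def render_report(values: list[int], fmt: ReportFormat) -> str:
    """The flag's legal values are part of the interface, not private knowledge."""
    if fmt is ReportFormat.SUMMARY:
        return f"{len(values)} values, total {sum(values)}"
    if fmt is ReportFormat.DETAILED:
        return ", ".join(str(v) for v in values)
    raise ValueError(f"unsupported format: {fmt!r}")
```

The caller still selects behavior, but it does so through names the called module has chosen to expose rather than through a guess about what a bare integer or boolean means inside it.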
1244The point of loose coupling is that an effective module provides an additional level of
1245abstraction—once you write it, you can take it for granted. It reduces overall program
1246complexity and allows you to focus on one thing at a time. If using a module requires
1247you to focus on more than one thing at once—knowledge of its internal workings,
1248modification to global data, uncertain functionality—the abstractive power is lost and
1249the module’s ability to help manage complexity is reduced or eliminated.
1250Classes and routines are first and foremost intellectual tools for reducing complexity.
1251If they’re not making your job simpler, they’re not doing their jobs.
1254Look for Common Design Patterns
cc2e.com/0585
Design patterns provide the cores of ready-made solutions that can be used to solve
many of software’s most common problems. Some software problems require solutions
that are derived from first principles. But most problems are similar to past problems,
and those can be solved using similar solutions, or patterns. Common patterns include
Adapter, Bridge, Decorator, Facade, Factory Method, Observer, Singleton, Strategy, and
Template Method. The book Design Patterns by Erich Gamma, Richard Helm, Ralph
Johnson, and John Vlissides (1995) is the definitive description of design patterns.
1262Patterns provide several benefits that fully custom design doesn’t:
1263Patterns reduce complexity by providing ready-made abstractions If you say, “This
code uses a Factory Method to create instances of derived classes,” other programmers
1265on your project will understand that your code involves a fairly rich set of interrelationships
1266and programming protocols, all of which are invoked when you refer to
1267the design pattern of Factory Method.
1268The Factory Method is a pattern that allows you to instantiate any class derived from
1269a specific base class without needing to keep track of the individual derived classes
1270anywhere but the Factory Method. For a good discussion of the Factory Method pattern,
see “Replace Constructor with Factory Method” in Refactoring (Fowler 1999).
1272You don’t have to spell out every line of code for other programmers to understand
1273the design approach found in your code.
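For readers who have not met the pattern, here is a small sketch of a Factory Method in the sense described above: one routine that is the only place naming the derived classes. Shape, Square, Circle, and create_shape are hypothetical.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical base class for this sketch."""

    @abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, size: float) -> None:
        self.size = size

    def area(self) -> float:
        return self.size * self.size


class Circle(Shape):
    def __init__(self, size: float) -> None:
        self.radius = size

    def area(self) -> float:
        return math.pi * self.radius ** 2


def create_shape(kind: str, size: float) -> Shape:
    """Factory Method: the only routine that keeps track of the derived classes."""
    derived_classes = {"square": Square, "circle": Circle}
    if kind not in derived_classes:
        raise ValueError(f"unknown shape kind: {kind!r}")
    return derived_classes[kind](size)


print(create_shape("square", 2.0).area())  # 4.0
```

Adding a new derived class then touches the class itself and the table inside create_shape(), and nothing else.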
1274Patterns reduce errors by institutionalizing details of common solutions Software
1275design problems contain nuances that emerge fully only after the problem has been
1276solved once or twice (or three times, or four times, or...). Because patterns represent
1277standardized ways of solving common problems, they embody the wisdom accumulated
1278from years of attempting to solve those problems, and they also embody the corrections
1279to the false attempts that people have made in solving those problems.
1280Using a design pattern is thus conceptually similar to using library code instead of
1281writing your own. Sure, everybody has written a custom Quicksort a few times, but
1282what are the odds that your custom version will be fully correct on the first try? Similarly,
1283numerous design problems are similar enough to past problems that you’re better
1284off using a prebuilt design solution than creating a novel solution.
1285Patterns provide heuristic value by suggesting design alternatives A designer who’s
1286familiar with common patterns can easily run through a list of patterns and ask
“Which of these patterns fits my design problem?” Cycling through a set of familiar
1288alternatives is immeasurably easier than creating a custom design solution out of
1289whole cloth. And the code arising from a familiar pattern will also be easier for readers
1290of the code to understand than fully custom code would be.
1292Patterns streamline communication by moving the design dialog to a higher level In
1293addition to their complexity-management benefit, design patterns can accelerate
1294design discussions by allowing designers to think and discuss at a larger level of granularity.
1295If you say “I can’t decide whether I should use a Creator or a Factory Method
in this situation,” you’ve communicated a great deal with just a few words, as long as
1297you and your listener are both familiar with those patterns. Imagine how much longer
1298it would take you to dive into the details of the code for a Creator pattern and the code
1299for a Factory Method pattern and then compare and contrast the two approaches.
1300If you’re not already familiar with design patterns, Table 5-1 summarizes some of the
1301most common patterns to stimulate your interest.
If you haven’t seen design patterns before, your reaction to the descriptions in Table 5-1
might be “Sure, I already know most of these ideas.” That reaction is a big part of
1304why design patterns are valuable. Patterns are familiar to most experienced programmers,
1305and assigning recognizable names to them supports efficient and effective communication
1306about them.
Table 5-1 Popular Design Patterns
Pattern           Description
Abstract Factory  Supports creation of sets of related objects by specifying the kind of set but not the kinds of each specific object.
Adapter           Converts the interface of a class to a different interface.
Bridge            Builds an interface and an implementation in such a way that either can vary without the other varying.
Composite         Consists of an object that contains additional objects of its own type so that client code can interact with the top-level object and not concern itself with all the detailed objects.
Decorator         Attaches responsibilities to an object dynamically, without creating specific subclasses for each possible configuration of responsibilities.
Facade            Provides a consistent interface to code that wouldn’t otherwise offer a consistent interface.
Factory Method    Instantiates classes derived from a specific base class without needing to keep track of the individual derived classes anywhere but the Factory Method.
Iterator          A server object that provides access to each element in a set sequentially.
Observer          Keeps multiple objects in synch with one another by making an object responsible for notifying the set of related objects about changes to any member of the set.
Singleton         Provides global access to a class that has one and only one instance.
Strategy          Defines a set of algorithms or behaviors that are dynamically interchangeable with each other.
Template Method   Defines the structure of an algorithm but leaves some of the detailed implementation to subclasses.
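To make one row of Table 5-1 concrete, the following sketch shows Strategy in a deliberately small form. The pricing rules and the Checkout class are hypothetical; plain functions serve as the interchangeable behaviors.

```python
from typing import Callable


# Each strategy is an interchangeable pricing rule (hypothetical examples).
def regular_price(amount: float) -> float:
    return amount


def sale_price(amount: float) -> float:
    return amount * 0.5


class Checkout:
    def __init__(self, pricing: Callable[[float], float]) -> None:
        self.pricing = pricing  # can be swapped while the program runs

    def total(self, amounts: list[float]) -> float:
        return sum(self.pricing(amount) for amount in amounts)


checkout = Checkout(regular_price)
print(checkout.total([10.0, 20.0]))  # 30.0
checkout.pricing = sale_price        # exchange the behavior dynamically
print(checkout.total([10.0, 20.0]))  # 15.0
```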
1335One potential trap with patterns is force-fitting code to use a pattern. In some cases, shifting
1336code slightly to conform to a well-recognized pattern will improve understandability
1337of the code. But if the code has to be shifted too far, forcing it to look like a standard pattern
1338can sometimes increase complexity.
1339Another potential trap with patterns is feature-itis: using a pattern because of a desire
1340to try out a pattern rather than because the pattern is an appropriate design solution.
1341Overall, design patterns are a powerful tool for managing complexity. You can read more
1342detailed descriptions in any of the good books that are listed at the end of this chapter.
1343Other Heuristics
1344The preceding sections describe the major software design heuristics. Following are a few
1345other heuristics that might not be useful quite as often but are still worth mentioning.
1346Aim for Strong Cohesion
1347Cohesion arose from structured design and is usually discussed in the same context
1348as coupling. Cohesion refers to how closely all the routines in a class or all the code in
1349a routine support a central purpose—how focused the class is. Classes that contain
1350strongly related functionality are described as having strong cohesion, and the heuristic
1351goal is to make cohesion as strong as possible. Cohesion is a useful tool for managing
1352complexity because the more that code in a class supports a central purpose, the
1353more easily your brain can remember everything the code does.
1354Thinking about cohesion at the routine level has been a useful heuristic for decades
1355and is still useful today. At the class level, the heuristic of cohesion has largely been
1356subsumed by the broader heuristic of well-defined abstractions, which was discussed
1357earlier in this chapter and in Chapter 6. Abstractions are useful at the routine level,
1358too, but on a more even footing with cohesion at that level of detail.
1359Build Hierarchies
1360A hierarchy is a tiered information structure in which the most general or abstract representation
1361of concepts is contained at the top of the hierarchy, with increasingly
1362detailed, specialized representations at the hierarchy’s lower levels. In software,
1363hierarchies are found in class hierarchies, and, as Level 4 in Figure 5-2 illustrated, in
1364routine-calling hierarchies as well.
1365Hierarchies have been an important tool for managing complex sets of information for
1366at least 2000 years. Aristotle used a hierarchy to organize the animal kingdom.
1367Humans frequently use outlines to organize complex information (like this book).
1368Researchers have found that people generally find hierarchies to be a natural way to
1369organize complex information. When they draw a complex object such as a house,
1370they draw it hierarchically. First they draw the outline of the house, then the windows
1372and doors, and then more details. They don’t draw the house brick by brick, shingle
1373by shingle, or nail by nail (Simon 1996).
1374Hierarchies are a useful tool for achieving Software’s Primary Technical Imperative
1375because they allow you to focus on only the level of detail you’re currently concerned
1376with. The details don’t go away completely; they’re simply pushed to another level so
1377that you can think about them when you want to rather than thinking about all the
1378details all of the time.
1379Formalize Class Contracts
Cross-Reference: For more on contracts, see “Use assertions to document and verify preconditions and postconditions” in Section 8.2.
1385At a more detailed level, thinking of each class’s interface as a contract with the rest of
1386the program can yield good insights. Typically, the contract is something like “If you
1387promise to provide data x, y, and z and you promise they’ll have characteristics a, b,
and c, I promise to perform operations 1, 2, and 3 within constraints 8, 9, and 10.” The
promises the clients of the class make to the class are typically called “preconditions,”
and the promises the object makes to its clients are called the “postconditions.”
1391Contracts are useful for managing complexity because, at least in theory, the object can
1392safely ignore any noncontractual behavior. In practice, this issue is much more difficult.
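One lightweight way to write such a contract down is with assertions. The Account class below is a hypothetical example, and the particular conditions are assumptions chosen to keep the sketch short.

```python
class Account:
    """Hypothetical class whose withdraw() states its contract with assertions."""

    def __init__(self, balance_cents: int) -> None:
        self.balance_cents = balance_cents

    def withdraw(self, amount_cents: int) -> int:
        # Preconditions: promises the client makes to the class.
        assert amount_cents > 0, "amount must be positive"
        assert amount_cents <= self.balance_cents, "amount must not exceed balance"

        old_balance = self.balance_cents
        self.balance_cents -= amount_cents

        # Postcondition: promise the class makes to its clients.
        assert 0 <= self.balance_cents < old_balance
        return self.balance_cents


account = Account(1000)
print(account.withdraw(250))  # 750
```

Note that Python skips assert statements when run with the -O option, so this style documents and checks the contract during development rather than enforcing it in every deployment.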
1393Assign Responsibilities
1394Another heuristic is to think through how responsibilities should be assigned to
1395objects. Asking what each object should be responsible for is similar to asking what
1396information it should hide, but I think it can produce broader answers, which gives
1397the heuristic unique value.
1398Design for Test
1399A thought process that can yield interesting design insights is to ask what the system will
1400look like if you design it to facilitate testing. Do you need to separate the user interface
1401from the rest of the code so that you can exercise it independently? Do you need to organize
1402each subsystem so that it minimizes dependencies on other subsystems? Designing
1403for test tends to result in more formalized class interfaces, which is generally beneficial.
1404Avoid Failure
1405Civil engineering professor Henry Petroski wrote an interesting book, Design Paradigms:
1406Case Histories of Error and Judgment in Engineering (Petroski 1994), that chronicles the
1407history of failures in bridge design. Petroski argues that many spectacular bridge failures
1408have occurred because of focusing on previous successes and not adequately considering
1409possible failure modes. He concludes that failures like the Tacoma Narrows bridge
1410could have been avoided if the designers had carefully considered the ways the bridge
1411might fail and not just copied the attributes of other successful designs.
The high-profile security lapses of various well-known systems in the past few years
1414make it hard to disagree that we should find ways to apply Petroski’s design-failure
1415insights to software.
1416Choose Binding Time Consciously
Cross-Reference: For more on binding time, see Section 10.6, “Binding Time.”
1420Binding time refers to the time a specific value is bound to a variable. Code that binds
1421early tends to be simpler, but it also tends to be less flexible. Sometimes you can get a
1422good design insight from asking questions like these: What if I bound these values
1423earlier? What if I bound these values later? What if I initialized this table right here in
1424the code? What if I read the value of this variable from the user at run time?
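The questions above can be made concrete with a value bound at two different times. In this sketch the constant names and the APP_MAX_RETRIES environment variable are hypothetical, and reading the environment stands in for any source of run-time input.

```python
import os
from collections.abc import Mapping

# Early binding: the value is written directly into the code.
MAX_RETRIES_HARD_CODED = 3

# Later binding: a default that can be overridden when the program runs.
DEFAULT_MAX_RETRIES = 3


def max_retries_at_run_time(environ: Mapping[str, str] = os.environ) -> int:
    """Bind the value at run time, falling back to the default if unset."""
    return int(environ.get("APP_MAX_RETRIES", DEFAULT_MAX_RETRIES))
```

The early-bound version is simpler to read and test. The late-bound version is more flexible but adds a conversion step and a failure mode (a non-numeric setting) that the hard-coded constant never had.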
1425Make Central Points of Control
1426P.J. Plauger says his major concern is “The Principle of One Right Place—there should
1427be One Right Place to look for any nontrivial piece of code, and One Right Place to
make a likely maintenance change” (Plauger 1993). Control can be centralized in
1429classes, routines, preprocessor macros, #include files—even a named constant is an
1430example of a central point of control.
1431The reduced-complexity benefit is that the fewer places you have to look for something,
1432the easier and safer it will be to change.
1433Consider Using Brute Force
1434When in doubt, use brute
1435force.
1436—Butler Lampson
1437One powerful heuristic tool is brute force. Don’t underestimate it. A brute-force solution
1438that works is better than an elegant solution that doesn’t work. It can take a long
1439time to get an elegant solution to work. In describing the history of searching algorithms,
1440for example, Donald Knuth pointed out that even though the first description
1441of a binary search algorithm was published in 1946, it took another 16 years for someone
1442to publish an algorithm that correctly searched lists of all sizes (Knuth 1998). A
1443binary search is more elegant, but a brute-force, sequential search is often sufficient.
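To illustrate the trade-off, here is a brute-force sequential search next to a binary search. Both functions are sketches written for this comparison; the binary version leans on Python's standard bisect module rather than a hand-written loop.

```python
from bisect import bisect_left


def sequential_search(items: list[int], target: int) -> int:
    """Brute force: check every element; works on unsorted lists too."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1


def binary_search(sorted_items: list[int], target: int) -> int:
    """Binary search via the standard library; requires sorted input."""
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1


primes = [2, 3, 5, 7, 11]
print(sequential_search(primes, 7), binary_search(primes, 7))  # 3 3
```

The sequential version has almost nothing that can go wrong. The binary version is faster on large sorted lists but has boundary conditions that are easy to get wrong when written by hand.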
1444Draw a Diagram
1445Diagrams are another powerful heuristic tool. A picture is worth 1000 words—kind of.
1446You actually want to leave out most of the 1000 words because one point of using a
1447picture is that a picture can represent the problem at a higher level of abstraction.
1448Sometimes you want to deal with the problem in detail, but other times you want to be
1449able to work with more generality.
1450Keep Your Design Modular
Modularity’s goal is to make each routine or class like a “black box”: You know what
1452goes in, and you know what comes out, but you don’t know what happens inside. A
1453108 Chapter 5: Design in Construction
1454black box has such a simple interface and such well-defined functionality that for any
1455specific input you can accurately predict the corresponding output.
1456The concept of modularity is related to information hiding, encapsulation, and other
1457design heuristics. But sometimes thinking about how to assemble a system from a set
1458of black boxes provides insights that information hiding and encapsulation don’t, so
1459the concept is worth having in your back pocket.
Summary of Design Heuristics
“More alarming, the same programmer is quite capable of doing the same task
himself in two or three ways, sometimes unconsciously, but quite often
simply for a change, or to provide elegant variation.” (A. R. Brown and W. A. Sampson)
Here’s a summary of major design heuristics:
■ Find Real-World Objects
■ Form Consistent Abstractions
■ Encapsulate Implementation Details
■ Inherit When Possible
■ Hide Secrets (Information Hiding)
■ Identify Areas Likely to Change
■ Keep Coupling Loose
■ Look for Common Design Patterns
The following heuristics are sometimes useful too:
■ Aim for Strong Cohesion
■ Build Hierarchies
■ Formalize Class Contracts
■ Assign Responsibilities
■ Design for Test
■ Avoid Failure
■ Choose Binding Time Consciously
■ Make Central Points of Control
■ Consider Using Brute Force
■ Draw a Diagram
■ Keep Your Design Modular
Guidelines for Using Heuristics
Approaches to design in software can learn from approaches to design in other fields.
One of the original books on heuristics in problem solving was G. Polya’s How to Solve
It (1957). Polya’s generalized problem-solving approach focuses on problem solving
in mathematics. Figure 5-10 is a summary of his approach, adapted from a similar
summary in his book (emphases his).
cc2e.com/0592
Figure 5-10 G. Polya developed an approach to problem solving in mathematics that’s also
useful in solving problems in software design (Polya 1957).
1. Understanding the Problem. You have to understand the problem.
What is the unknown? What are the data? What is the condition? Is it possible to satisfy
the condition? Is the condition sufficient to determine the unknown? Or is it
insufficient? Or redundant? Or contradictory?
Draw a figure. Introduce suitable notation. Separate the various parts of the
condition. Can you write them down?
2. Devising a Plan. Find the connection between the data and the unknown. You
might be obliged to consider auxiliary problems if you can't find an intermediate
connection. You should eventually come up with a plan of the solution.
Have you seen the problem before? Or have you seen the same problem in a
slightly different form? Do you know a related problem? Do you know a theorem that
could be useful?
Look at the unknown! And try to think of a familiar problem having the same or a
similar unknown. Here is a problem related to yours and solved before. Can you use it?
Can you use its result? Can you use its method? Should you introduce some auxiliary
element in order to make its use possible?
Can you restate the problem? Can you restate it still differently? Go back to
definitions.
If you cannot solve the proposed problem, try to solve some related problem first.
Can you imagine a more accessible related problem? A more general problem? A
more special problem? An analogous problem? Can you solve a part of the problem?
Keep only a part of the condition, drop the other part; how far is the unknown then
determined, how can it vary? Can you derive something useful from the data? Can
you think of other data appropriate for determining the unknown? Can you change
the unknown or the data, or both if necessary, so that the new unknown and the new
data are nearer to each other?
Did you use all the data? Did you use the whole condition? Have you taken into
account all essential notions involved in the problem?
3. Carrying out the Plan. Carry out your plan.
Carrying out your plan of the solution, check each step. Can you see clearly that the
step is correct? Can you prove that it's correct?
4. Looking Back. Examine the solution.
Can you check the result? Can you check the argument? Can you derive the result
differently? Can you see it at a glance?
Can you use the result, or the method, for some other problem?
One of the most effective guidelines is not to get stuck on a single approach. If diagramming
the design in UML isn’t working, write it in English. Write a short test program.
Try a completely different approach. Think of a brute-force solution. Keep
outlining and sketching with your pencil, and your brain will follow. If all else fails,
walk away from the problem. Literally go for a walk, or think about something else
before returning to the problem. If you’ve given it your best and are getting nowhere,
putting it out of your mind for a time often produces results more quickly than sheer
persistence can.
You don’t have to solve the whole design problem at once. If you get stuck, remember
that a point needs to be decided but recognize that you don’t yet have enough information
to resolve that specific issue. Why fight your way through the last 20 percent
of the design when it will drop into place easily the next time through? Why make bad
decisions based on limited experience with the design when you can make good decisions
based on more experience with it later? Some people are uncomfortable if they
don’t come to closure after a design cycle, but after you have created a few designs
without resolving issues prematurely, it will seem natural to leave issues unresolved
until you have more information (Zahniser 1992, Beck 2000).
5.4 Design Practices
The preceding section focused on heuristics related to design attributes: what you
want the completed design to look like. This section describes design practice heuristics,
steps you can take that often produce good results.
Iterate
You might have had an experience in which you learned so much from writing a program
that you wished you could write it again, armed with the insights you gained
from writing it the first time. The same phenomenon applies to design, but the design
cycles are shorter and the effects downstream are bigger, so you can afford to whirl
through the design loop a few times.
Design is an iterative process. You don’t usually go from point A only to point B; you
go from point A to point B and back to point A.
As you cycle through candidate designs and try different approaches, you’ll look at
both high-level and low-level views. The big picture you get from working with high-level
issues will help you to put the low-level details in perspective. The details you
get from working with low-level issues will provide a foundation in solid reality for
the high-level decisions. The tug and pull between top-level and bottom-level
considerations is a healthy dynamic; it creates a stressed structure that’s more stable
than one built wholly from the top down or the bottom up.
Many programmers (many people, for that matter) have trouble ranging between high-level
and low-level considerations. Switching from one view of a system to another is
mentally strenuous, but it’s essential to creating effective designs. For entertaining exercises
to enhance your mental flexibility, read Conceptual Blockbusting (Adams 2001),
described in the “Additional Resources” section at the end of the chapter.
Cross-Reference: Refactoring is a safe way to try different alternatives in code. For
more on this, see Chapter 24, “Refactoring.”
When you come up with a first design attempt that seems good enough, don’t stop!
The second attempt is nearly always better than the first, and you learn things on each
attempt that can improve your overall design. After trying a thousand different materials
for a light bulb filament with no success, Thomas Edison was reportedly asked if
he felt his time had been wasted since he had discovered nothing. “Nonsense,” Edison
is supposed to have replied. “I have discovered a thousand things that don’t work.” In
many cases, solving the problem with one approach will produce insights that will
enable you to solve the problem using another approach that’s even better.
Divide and Conquer
As Edsger Dijkstra pointed out, no one’s skull is big enough to contain all the details
of a complex program, and that applies just as well to design. Divide the program into
different areas of concern, and then tackle each of those areas individually. If you run
into a dead end in one of the areas, iterate!
Incremental refinement is a powerful tool for managing complexity. As Polya recommended
in mathematical problem solving, understand the problem, devise a plan,
carry out the plan, and then look back to see how you did (Polya 1957).
Top-Down and Bottom-Up Design Approaches
“Top down” and “bottom up” might have an old-fashioned sound, but they provide
valuable insight into the creation of object-oriented designs. Top-down design begins
at a high level of abstraction. You define base classes or other nonspecific design elements.
As you develop the design, you increase the level of detail, identifying derived
classes, collaborating classes, and other detailed design elements.
Bottom-up design starts with specifics and works toward generalities. It typically
begins by identifying concrete objects and then generalizes aggregations of objects
and base classes from those specifics.
Some people argue vehemently that starting with generalities and working toward
specifics is best, and some argue that you can’t really identify general design principles
until you’ve worked out the significant details. Here are the arguments on both sides.
Argument for Top Down
The guiding principle behind the top-down approach is the idea that the human brain
can concentrate on only a certain amount of detail at a time. If you start with general
classes and decompose them into more specialized classes step by step, your brain
isn’t forced to deal with too many details at once.
The divide-and-conquer process is iterative in a couple of senses. First, it’s iterative
because you usually don’t stop after one level of decomposition. You keep going for
several levels. Second, it’s iterative because you don’t usually settle for your first
attempt. You decompose a program one way. At various points in the decomposition,
you’ll have choices about which way to partition the subsystems, lay out the inheritance
tree, and form compositions of objects. You make a choice and see what happens.
Then you start over and decompose it another way and see whether that works
better. After several attempts, you’ll have a good idea of what will work and why.
How far do you decompose a program? Continue decomposing until it seems as if it
would be easier to code the next level than to decompose it. Work until you become
somewhat impatient at how obvious and easy the design seems. At that point, you’re
done. If it’s not clear, work some more. If the solution is even slightly tricky for you
now, it’ll be a bear for anyone who works on it later.
Argument for Bottom Up
Sometimes the top-down approach is so abstract that it’s hard to get started. If you
need to work with something more tangible, try the bottom-up design approach. Ask
yourself, “What do I know this system needs to do?” Undoubtedly, you can answer
that question. You might identify a few low-level responsibilities that you can assign to
concrete classes. For example, you might know that a system needs to format a particular
report, compute data for that report, center its headings, display the report on the
screen, print the report on a printer, and so on. After you identify several low-level
responsibilities, you’ll usually start to feel comfortable enough to look at the top again.
In some other cases, major attributes of the design problem are dictated from the bottom.
You might have to interface with hardware devices whose interface requirements
dictate large chunks of your design.
Here are some things to keep in mind as you do bottom-up composition:
■ Ask yourself what you know the system needs to do.
■ Identify concrete objects and responsibilities from that question.
■ Identify common objects, and group them using subsystem organization, packages,
composition within objects, or inheritance, whichever is appropriate.
■ Continue with the next level up, or go back to the top and try again to work down.
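The steps above can be sketched in Python using the report example from this section. All names here (center_heading, compute_total, Report) are hypothetical; the sketch only shows low-level responsibilities being identified first and then grouped into a higher-level class by composition.

```python
from typing import List


def center_heading(text: str, width: int) -> str:
    """Low-level responsibility: center a heading within a fixed width."""
    return text.center(width)


def compute_total(values: List[float]) -> float:
    """Low-level responsibility: compute data for the report."""
    return sum(values)


class Report:
    """Higher-level class composed from the pieces identified above."""

    def __init__(self, title: str, values: List[float], width: int = 20) -> None:
        self.title = title
        self.values = values
        self.width = width

    def format(self) -> str:
        """Assemble the report text from the low-level routines."""
        heading = center_heading(self.title, self.width)
        total_line = f"Total: {compute_total(self.values):.2f}"
        return "\n".join([heading, total_line])
```

Whether the pieces end up grouped by composition, as here, or by packages or inheritance is exactly the judgment the third bullet leaves open.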
No Argument, Really
The key difference between top-down and bottom-up strategies is that one is a decomposition
strategy and the other is a composition strategy. One starts from the general
problem and breaks it into manageable pieces; the other starts with manageable
pieces and builds up a general solution. Both approaches have strengths and weaknesses
that you’ll want to consider as you apply them to your design problems.
The strength of top-down design is that it’s easy. People are good at breaking something
big into smaller components, and programmers are especially good at it.
Another strength of top-down design is that you can defer construction details. Since
systems are often perturbed by changes in construction details (for example, changes
in a file structure or a report format), it’s useful to know early on that those details
should be hidden in classes at the bottom of the hierarchy.
One strength of the bottom-up approach is that it typically results in early identification
of needed utility functionality, which results in a compact, well-factored design. If
similar systems have already been built, the bottom-up approach allows you to start
the design of the new system by looking at pieces of the old system and asking “What
can I reuse?”
A weakness of the bottom-up composition approach is that it’s hard to use exclusively.
Most people are better at taking one big concept and breaking it into smaller concepts
than they are at taking small concepts and making one big one. It’s like the old
assemble-it-yourself problem: I thought I was done, so why does the box still have parts in it?
Fortunately, you don’t have to use the bottom-up composition approach exclusively.
Another weakness of the bottom-up design strategy is that sometimes you find that
you can’t build a program from the pieces you’ve started with. You can’t build an airplane
from bricks, and you might have to work at the top before you know what kinds
of pieces you need at the bottom.
To summarize, top down tends to start simple, but sometimes low-level complexity
ripples back to the top, and those ripples can make things more complex than they
really needed to be. Bottom up tends to start complex, but identifying that complexity
early on leads to better design of the higher-level classes, if the complexity doesn’t torpedo
the whole system first!
In the final analysis, top-down and bottom-up design aren’t competing strategies;
they’re mutually beneficial. Design is a heuristic process, which means that no solution
is guaranteed to work every time. Design contains elements of trial and error. Try
a variety of approaches until you find one that works well.
Experimental Prototyping
cc2e.com/0599
Sometimes you can’t really know whether a design will work until you better understand
some implementation detail. You might not know if a particular database organization
will work until you know whether it will meet your performance goals. You
might not know whether a particular subsystem design will work until you select the
specific GUI libraries you’ll be working with. These are examples of the essential
“wickedness” of software design: you can’t fully define the design problem until
you’ve at least partially solved it.
A general technique for addressing these questions at low cost is experimental prototyping.
The word “prototyping” means lots of different things to different people
(McConnell 1996). In this context, prototyping means writing the absolute minimum
amount of throwaway code that’s needed to answer a specific design question.
Prototyping works poorly when developers aren’t disciplined about writing the absolute
minimum of code needed to answer a question. Suppose the design question is,
“Can the database framework we’ve selected support the transaction volume we
need?” You don’t need to write any production code to answer that question. You
don’t even need to know the database specifics. You just need to know enough to
approximate the problem space: number of tables, number of entries in the tables,
and so on. You can then write very simple prototyping code that uses tables with
names like Table1, Table2, Column1, and Column2, populate the tables with junk
data, and do your performance testing.
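Here is a minimal sketch of what such throwaway prototyping code might look like in Python. It assumes the standard-library sqlite3 module as a stand-in for whatever database framework is actually under evaluation, and the function name measure_insert_rate is invented for this example. It measures only bulk insert speed for junk rows, so treat the number as a rough signal rather than a benchmark.

```python
import random
import sqlite3
import time


def measure_insert_rate(row_count: int) -> float:
    """Insert junk rows into a throwaway in-memory table; return rows per second."""
    connection = sqlite3.connect(":memory:")  # assumed stand-in for the real database
    connection.execute("CREATE TABLE Table1 (Column1 INTEGER, Column2 TEXT)")
    rows = [(random.randint(0, 10**6), "junk") for _ in range(row_count)]

    start = time.perf_counter()
    with connection:  # commits all rows as a single transaction
        connection.executemany("INSERT INTO Table1 VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start

    stored = connection.execute("SELECT COUNT(*) FROM Table1").fetchone()[0]
    connection.close()
    assert stored == row_count  # sanity check that the junk data landed
    return row_count / elapsed if elapsed > 0 else float("inf")
```

Calling measure_insert_rate(100_000) gives a first rough figure. Nothing here is meant to survive into production, which is the point.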
Prototyping also works poorly when the design question is not specific enough. A
design question like “Will this database framework work?” does not provide enough
direction for prototyping. A design question like “Will this database framework support
1,000 transactions per second under assumptions X, Y, and Z?” provides a more
solid basis for prototyping.
A final risk of prototyping arises when developers do not treat the code as throwaway
code. I have found that it is not possible for people to write the absolute minimum
amount of code to answer a question if they believe that the code will eventually end
up in the production system. They end up implementing the system instead of prototyping.
By adopting the attitude that once the question is answered the code will be
thrown away, you can minimize this risk. One way to avoid this problem is to create
prototypes in a different technology than the production code. You could prototype a
Java design in Python or mock up a user interface in Microsoft PowerPoint. If you do
create prototypes using the production technology, a practical standard that can help
is requiring that class names or package names for prototype code be prefixed with
prototype. That at least makes a programmer think twice before trying to extend prototype
code (Stephens 2003).
Used with discipline, prototyping is the workhorse tool a designer has to combat design
wickedness. Used without discipline, prototyping adds some wickedness of its own.
Collaborative Design
Cross-Reference: For more details on collaborative development, see Chapter 21,
“Collaborative Construction.”
In design, two heads are often better than one, whether those two heads are organized
formally or informally. Collaboration can take any of several forms:
■ You informally walk over to a co-worker’s desk and ask to bounce some ideas
around.
■ You and your co-worker sit together in a conference room and draw design alternatives
on a whiteboard.
■ You and your co-worker sit together at the keyboard and do detailed design in
the programming language you’re using; that is, you can use pair programming,
described in Chapter 21, “Collaborative Construction.”
■ You schedule a meeting to walk through your design ideas with one or more co-workers.
■ You schedule a formal inspection with all the structure described in Chapter 21.
■ You don’t work with anyone who can review your work, so you do some initial
work, put it into a drawer, and come back to it a week later. You will have forgotten
enough that you should be able to give yourself a fairly good review.
■ You ask someone outside your company for help: send questions to a specialized
forum or newsgroup.
If the goal is quality assurance, I tend to recommend the most structured review practice,
formal inspections, for the reasons described in Chapter 21. But if the goal is to
foster creativity and to increase the number of design alternatives generated, not just
to find errors, less structured approaches work better. After you’ve settled on a specific
design, switching to a more formal inspection might be appropriate, depending on
the nature of your project.
How Much Design Is Enough?
“We try to solve the problem by rushing through the design process so that
enough time is left at the end of the project to uncover the errors that were made
because we rushed through the design process.” (Glenford Myers)
Sometimes only the barest sketch of an architecture is mapped out before coding
begins. Other times, teams create designs at such a level of detail that coding
becomes a mostly mechanical exercise. How much design should you do before you
begin coding?
A related question is how formal to make the design. Do you need formal, polished
design diagrams, or would digital snapshots of a few drawings on a whiteboard be
enough?
Deciding how much design to do before beginning full-scale coding and how much
formality to use in documenting that design is hardly an exact science. The experience
of the team, expected lifetime of the system, desired level of reliability, and size of
project and team should all be considered. Table 5-2 summarizes how each of these
factors influences the design approach.
Two or more of these factors might come into play on any specific project, and in
some cases the factors might provide contradictory advice. For example, you might
have a highly experienced team working on safety-critical software. In that case, you’d
probably want to err on the side of the higher level of design detail and formality. In
such cases, you’ll need to weigh the significance of each factor and make a judgment
about what matters most.
If the level of design is left to each individual, then, when the design descends to the
level of a task that you’ve done before or to a simple modification or extension of such
a task, you’re probably ready to stop designing and begin coding.
Table 5-2 Design Formality and Level of Detail Needed

Factor                             Level of Detail Needed   Documentation
                                   in Design Before         Formality
                                   Construction
---------------------------------  -----------------------  ---------------------
Design/construction team has deep  Low Detail               Low Formality
experience in applications area.

Design/construction team has deep  Medium Detail            Medium Formality
experience but is inexperienced
in the applications area.

Design/construction team is        Medium to High Detail    Low-Medium Formality
inexperienced.

Design/construction team has       Medium Detail            -
moderate-to-high turnover.

Application is safety-critical.    High Detail              High Formality

Application is mission-critical.   Medium Detail            Medium-High Formality

Project is small.                  Low Detail               Low Formality

Project is large.                  Medium Detail            Medium Formality

Software is expected to have a     Low Detail               Low Formality
short lifetime (weeks or months).

Software is expected to have a     Medium Detail            Medium Formality
long lifetime (months or years).
If I can’t decide how deeply to investigate a design before I begin coding, I tend to err
on the side of going into more detail. The biggest design errors arise from cases in
which I thought I went far enough, but it later turns out that I didn’t go far enough to
realize there were additional design challenges. In other words, the biggest design
problems tend to arise not from areas I knew were difficult and created bad designs
for, but from areas I thought were easy and didn’t create any designs for at all. I rarely
encounter projects that are suffering from having done too much design work.
“I've never met a human being who would want to read 17,000 pages of documentation,
and if there was, I'd kill him to get him out of the gene pool.” (Joseph Costello)
On the other hand, occasionally I have seen projects that are suffering from too much
design documentation. Gresham’s Law states that “programmed activity tends to drive
out nonprogrammed activity” (Simon 1965). A premature rush to polish a design
description is a good example of that law. I would rather see 80 percent of the design
effort go into creating and exploring numerous design alternatives and 20 percent go
into creating less polished documentation than to have 20 percent go into creating
mediocre design alternatives and 80 percent go into polishing documentation of
designs that are not very good.
Capturing Your Design Work
cc2e.com/0506
“The bad news is that, in our opinion, we will never find the philosopher’s stone. We will
never find a process that allows us to design software in a perfectly rational way. The good
news is that we can fake it.” (David Parnas and Paul Clements)
The traditional approach to capturing design work is to write up the designs in a formal
design document. However, you can capture designs in numerous alternative
ways that work well on small projects, informal projects, or projects that need a lightweight
way to record a design:
Insert design documentation into the code itself. Document key design decisions in
code comments, typically in the file or class header. When you couple this approach
with a documentation extractor like JavaDoc, this assures that design documentation
will be readily available to a programmer working on a section of code, and it
improves the chance that programmers will keep the design documentation reasonably
up to date.
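The text mentions JavaDoc; as a comparable sketch in Python, design decisions can go in a class docstring, which tools such as the standard pydoc module or the built-in help() function can display. The class OrderQueue and the decisions listed in it are hypothetical, made up purely for illustration.

```python
class OrderQueue:
    """Hold pending orders in arrival order.

    Design decisions (hypothetical, for illustration):
    - Orders are kept in a plain list because queues are expected to stay small.
    - The class hides the list so the storage can change without affecting callers.
    """

    def __init__(self) -> None:
        self._orders = []

    def add(self, order_id: str) -> None:
        """Append an order to the end of the queue."""
        self._orders.append(order_id)

    def next_order(self) -> str:
        """Remove and return the oldest order."""
        return self._orders.pop(0)
```

Running help(OrderQueue) in an interpreter shows the design notes next to the method list, so a programmer reading the class sees the reasoning without opening a separate document.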
Capture design discussions and decisions on a Wiki. Have your design discussions
in writing, on a project Wiki (that is, a collection of Web pages that can be edited easily
by anyone on your project using a Web browser). This will capture your design discussions
and decisions automatically, albeit with the extra overhead of typing rather
than talking. You can also use the Wiki to capture digital pictures to supplement the
text discussion, links to websites that support the design decision, white papers, and
other materials. This technique is especially useful if your development team is geographically
distributed.
Write e-mail summaries. After a design discussion, adopt the practice of designating
someone to write a summary of the discussion (especially what was decided) and send
it to the project team. Archive a copy of the e-mail in the project’s public e-mail folder.
Use a digital camera. One common barrier to documenting designs is the tedium of
creating design drawings in some popular drawing tools. But the documentation
choices are not limited to the two options of “capturing the design in a nicely formatted,
formal notation” vs. “no design documentation at all.”
Taking pictures of whiteboard drawings with a digital camera and then embedding
those pictures into traditional documents can be a low-effort way to get 80 percent of
the benefit of saving design drawings by doing about 1 percent of the work required
if you use a drawing tool.
Save design flip charts. There’s no law that says your design documentation has to
fit on standard letter-size paper. If you make your design drawings on large flip chart
paper, you can simply archive the flip charts in a convenient location or, better yet,
post them on the walls around the project area so that people can easily refer to them
and update them when needed.
cc2e.com/0513
Use CRC (Class, Responsibility, Collaborator) cards. Another low-tech alternative
for documenting designs is to use index cards. On each card, designers write a class
name, responsibilities of the class, and collaborators (other classes that cooperate
with the class). A design group then works with the cards until they’re satisfied that
they’ve created a good design. At that point, you can simply save the cards for future
reference. Index cards are cheap, unintimidating, and portable, and they encourage
group interaction (Beck 1991).
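If you later want to transcribe the cards into a project file, one possible representation is sketched below in Python. The CrcCard class and its render method are assumptions for illustration only; the technique itself needs nothing more than index cards.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrcCard:
    """One index card: a class name, its responsibilities, its collaborators."""

    class_name: str
    responsibilities: List[str] = field(default_factory=list)
    collaborators: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the card as plain text, one section per line."""
        return "\n".join(
            [
                f"Class: {self.class_name}",
                "Responsibilities: " + "; ".join(self.responsibilities),
                "Collaborators: " + ", ".join(self.collaborators),
            ]
        )
```

A card for the report example earlier in the chapter might be created as CrcCard("Report", ["format output", "compute totals"], ["Printer", "Screen"]).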
Create UML diagrams at appropriate levels of detail. One popular technique for
diagramming designs is called Unified Modeling Language (UML), which is defined
by the Object Management Group (Fowler 2004). Figure 5-6 earlier in this chapter
was one example of a UML class diagram. UML provides a rich set of formalized representations
for design entities and relationships. You can use informal versions of
UML to explore and discuss design approaches. Start with minimal sketches and add
detail only after you’ve zeroed in on a final design solution. Because UML is standardized,
it supports common understanding in communicating design ideas and it can
accelerate the process of considering design alternatives when working in a group.
These techniques can work in various combinations, so feel free to mix and match these
approaches on a project-by-project basis or even within different areas of a single project.
5.5 Comments on Popular Methodologies
The history of design in software has been marked by fanatic advocates of wildly conflicting
design approaches. When I published the first edition of Code Complete in the
early 1990s, design zealots were advocating dotting every design i and crossing every
design t before beginning coding. That recommendation didn’t make any sense.
“People who preach software design as a disciplined activity spend considerable
energy making us all feel guilty. We can never be structured enough or object-oriented
enough to achieve nirvana in this lifetime. We all truck around a kind of
original sin from having learned Basic at an impressionable age. But my bet is
that most of us are better designers than the purists will ever acknowledge.” (P. J. Plauger)
1935As I write this edition in the mid-2000s, some software swamis are arguing for not
1936doing any design at all. “Big Design Up Front is BDUF,†they say. “BDUF is bad. You’re
1937better off not doing any design before you begin coding!â€
1938In ten years the pendulum has swung from “design everything†to “design nothing.â€
1939But the alternative to BDUF isn’t no design up front, it’s a Little Design Up Front
1940(LDUF) or Enough Design Up Front—ENUF.
1941How do you tell how much is enough? That’s a judgment call, and no one can make
1942that call perfectly. But while you can’t know the exact right amount of design with any
1943confidence, two amounts of design are guaranteed to be wrong every time: designing
1944every last detail and not designing anything at all. The two positions advocated by
1945extremists on both ends of the scale turn out to be the only two positions that are
1946always wrong!
1947As P.J. Plauger says, “The more dogmatic you are about applying a design method, the
fewer real-life problems you are going to solve” (Plauger 1993). Treat design as a
1949wicked, sloppy, heuristic process. Don’t settle for the first design that occurs to you.
1950Collaborate. Strive for simplicity. Prototype when you need to. Iterate, iterate, and iterate
1951again. You’ll be happy with your designs.
1952Additional Resources
1953cc2e.com/0520 Software design is a rich field with abundant resources. The challenge is identifying
1954which resources will be most useful. Here are some suggestions.
1955Software Design, General
1956Weisfeld, Matt. The Object-Oriented Thought Process, 2d ed. SAMS, 2004. This is an
1957accessible book that introduces object-oriented programming. If you’re already familiar
1958with object-oriented programming, you’ll probably want a more advanced book,
1959but if you’re just getting your feet wet in object orientation, this book introduces fundamental
1960object-oriented concepts, including objects, classes, interfaces, inheritance,
1961polymorphism, overloading, abstract classes, aggregation and association, constructors/
1962destructors, exceptions, and others.
1963Riel, Arthur J. Object-Oriented Design Heuristics. Reading, MA: Addison-Wesley, 1996.
1964This book is easy to read and focuses on design at the class level.
1965Plauger, P. J. Programming on Purpose: Essays on Software Design. Englewood Cliffs, NJ:
1966PTR Prentice Hall, 1993. I picked up as many tips about good software design from
reading this book as from any other book I’ve read. Plauger is well-versed in a wide variety
1968of design approaches, he’s pragmatic, and he’s a great writer.
1970Meyer, Bertrand. Object-Oriented Software Construction, 2d ed. New York, NY: Prentice
1971Hall PTR, 1997. Meyer presents a forceful advocacy of hard-core object-oriented
1972programming.
1973Raymond, Eric S. The Art of UNIX Programming. Boston, MA: Addison-Wesley, 2004.
1974This is a well-researched look at software design through UNIX-colored glasses. Section
19751.6 is an especially concise 12-page explanation of 17 key UNIX design principles.
1976Larman, Craig. Applying UML and Patterns: An Introduction to Object-Oriented Analysis
1977and Design and the Unified Process, 2d ed. Englewood Cliffs, NJ: Prentice Hall, 2001.
1978This book is a popular introduction to object-oriented design in the context of the
1979Unified Process. It also discusses object-oriented analysis.
1980Software Design Theory
1981Parnas, David L., and Paul C. Clements. “A Rational Design Process: How and Why to
Fake It.” IEEE Transactions on Software Engineering SE-12, no. 2 (February 1986): 251–57.
1983This classic article describes the gap between how programs are really designed and
1984how you sometimes wish they were designed. The main point is that no one ever
1985really goes through a rational, orderly design process but that aiming for it makes for
1986better designs in the end.
I’m not aware of any comprehensive treatment of information hiding. Most software-engineering
1988textbooks discuss it briefly, frequently in the context of object-oriented
1989techniques. The three Parnas papers listed below are the seminal presentations of the
1990idea and are probably still the best resources on information hiding.
Parnas, David L. “On the Criteria to Be Used in Decomposing Systems into Modules.”
Communications of the ACM 15, no. 12 (December 1972): 1053–58.
Parnas, David L. “Designing Software for Ease of Extension and Contraction.” IEEE
Transactions on Software Engineering SE-5, no. 2 (March 1979): 128–38.
Parnas, David L., Paul C. Clements, and D. M. Weiss. “The Modular Structure of Complex
Systems.” IEEE Transactions on Software Engineering SE-11, no. 3 (March 1985):
259–66.
1998Design Patterns
1999Gamma, Erich, et al. Design Patterns. Reading, MA: Addison-Wesley, 1995. This book
by the “Gang of Four” is the seminal book on design patterns.
2001Shalloway, Alan, and James R. Trott. Design Patterns Explained. Boston, MA: Addison-
2002Wesley, 2002. This book contains an easy-to-read introduction to design patterns.
2004Design in General
2005Adams, James L. Conceptual Blockbusting: A Guide to Better Ideas, 4th ed. Cambridge,
2006MA: Perseus Publishing, 2001. Although not specifically about software design, this
2007book was written to teach design to engineering students at Stanford. Even if you
2008never design anything, the book is a fascinating discussion of creative thought processes.
2009It includes many exercises in the kinds of thinking required for effective
2010design. It also contains a well-annotated bibliography on design and creative thinking.
2011If you like problem solving, you’ll like this book.
2012Polya, G. How to Solve It: A New Aspect of Mathematical Method, 2d ed. Princeton, NJ:
2013Princeton University Press, 1957. This discussion of heuristics and problem solving
2014focuses on mathematics but is applicable to software development. Polya’s book was
2015the first written about the use of heuristics in mathematical problem solving. It draws
2016a clear distinction between the messy heuristics used to discover solutions and the
2017tidier techniques used to present them once they’ve been discovered. It’s not easy
2018reading, but if you’re interested in heuristics, you’ll eventually read it whether you
2019want to or not. Polya’s book makes it clear that problem solving isn’t a deterministic
2020activity and that adherence to any single methodology is like walking with your feet in
2021chains. At one time, Microsoft gave this book to all its new programmers.
2022Michalewicz, Zbigniew, and David B. Fogel. How to Solve It: Modern Heuristics. Berlin:
2023Springer-Verlag, 2000. This is an updated treatment of Polya’s book that’s quite a bit
2024easier to read and that also contains some nonmathematical examples.
2025Simon, Herbert. The Sciences of the Artificial, 3d ed. Cambridge, MA: MIT Press, 1996.
2026This fascinating book draws a distinction between sciences that deal with the natural
2027world (biology, geology, and so on) and sciences that deal with the artificial world created
2028by humans (business, architecture, and computer science). It then discusses the
2029characteristics of the sciences of the artificial, emphasizing the science of design. It has
2030an academic tone and is well worth reading for anyone intent on a career in software
development or any other “artificial” field.
2032Glass, Robert L. Software Creativity. Englewood Cliffs, NJ: Prentice Hall PTR, 1995. Is
2033software development controlled more by theory or by practice? Is it primarily creative
2034or is it primarily deterministic? What intellectual qualities does a software developer
2035need? This book contains an interesting discussion of the nature of software
2036development with a special emphasis on design.
2037Petroski, Henry. Design Paradigms: Case Histories of Error and Judgment in Engineering.
2038Cambridge: Cambridge University Press, 1994. This book draws heavily from the field of
2039civil engineering (especially bridge design) to explain its main argument that successful
2040design depends at least as much upon learning from past failures as from past successes.
2042Standards
2043IEEE Std 1016-1998, Recommended Practice for Software Design Descriptions. This document
2044contains the IEEE-ANSI standard for software-design descriptions. It describes
2045what should be included in a software-design document.
2046IEEE Std 1471-2000. Recommended Practice for Architectural Description of Software Intensive
2047Systems. Los Alamitos, CA: IEEE Computer Society Press. This document is the
2048IEEE-ANSI guide for creating software architecture specifications.
cc2e.com/0527 CHECKLIST: Design in Construction
Design Practices
❑ Have you iterated, selecting the best of several attempts rather than the first attempt?
❑ Have you tried decomposing the system in several different ways to see which way will work best?
❑ Have you approached the design problem both from the top down and from the bottom up?
❑ Have you prototyped risky or unfamiliar parts of the system, creating the absolute minimum amount of throwaway code needed to answer specific questions?
❑ Has your design been reviewed, formally or informally, by others?
❑ Have you driven the design to the point that its implementation seems obvious?
❑ Have you captured your design work using an appropriate technique such as a Wiki, e-mail, flip charts, digital photography, UML, CRC cards, or comments in the code itself?
Design Goals
❑ Does the design adequately address issues that were identified and deferred at the architectural level?
❑ Is the design stratified into layers?
❑ Are you satisfied with the way the program has been decomposed into subsystems, packages, and classes?
❑ Are you satisfied with the way the classes have been decomposed into routines?
❑ Are classes designed for minimal interaction with each other?
❑ Are classes and subsystems designed so that you can use them in other systems?
❑ Will the program be easy to maintain?
❑ Is the design lean? Are all of its parts strictly necessary?
❑ Does the design use standard techniques and avoid exotic, hard-to-understand elements?
❑ Overall, does the design help minimize both accidental and essential complexity?
2084Key Points
■ Software’s Primary Technical Imperative is managing complexity. This is greatly aided by a design focus on simplicity.
■ Simplicity is achieved in two general ways: minimizing the amount of essential complexity that anyone’s brain has to deal with at any one time, and keeping accidental complexity from proliferating needlessly.
■ Design is heuristic. Dogmatic adherence to any single methodology hurts creativity and hurts your programs.
■ Good design is iterative; the more design possibilities you try, the better your final design will be.
■ Information hiding is a particularly valuable concept. Asking “What should I hide?” settles many difficult design issues.
■ Lots of useful, interesting information on design is available outside this book. The perspectives presented here are just the tip of the iceberg.
2098
2100Chapter 6
2101Working Classes
cc2e.com/0665 Contents
■ 6.1 Class Foundations: Abstract Data Types (ADTs)
■ 6.2 Good Class Interfaces
■ 6.3 Design and Implementation Issues
■ 6.4 Reasons to Create a Class
■ 6.5 Language-Specific Issues
■ 6.6 Beyond Classes: Packages
Related Topics
■ Design in construction: Chapter 5
■ Software architecture: Section 3.5
■ High-quality routines: Chapter 7
■ The Pseudocode Programming Process: Chapter 9
■ Refactoring: Chapter 24
2115In the dawn of computing, programmers thought about programming in terms of
2116statements. Throughout the 1970s and 1980s, programmers began thinking about
2117programs in terms of routines. In the twenty-first century, programmers think about
2118programming in terms of classes.
2119A class is a collection of data and routines that share a cohesive, well-defined responsibility.
2120A class might also be a collection of routines that provides a cohesive set of services
2121even if no common data is involved. A key to being an effective programmer is
2122maximizing the portion of a program that you can safely ignore while working on any
2123one section of code. Classes are the primary tool for accomplishing that objective.
2124This chapter contains a distillation of advice in creating high-quality classes. If you’re
2125still warming up to object-oriented concepts, this chapter might be too advanced.
Make sure you’ve read Chapter 5, “Design in Construction.” Then start with Section
6.1, “Class Foundations: Abstract Data Types (ADTs),” and ease your way into the
remaining sections. If you’re already familiar with class basics, you might skim Section
6.1 and then dive into the discussion of class interfaces in Section 6.2. The “Additional
Resources” section at the end of this chapter contains pointers to introductory reading,
2131advanced reading, and programming-language-specific resources.
21346.1 Class Foundations: Abstract Data Types (ADTs)
2135An abstract data type is a collection of data and operations that work on that data. The
2136operations both describe the data to the rest of the program and allow the rest of the
program to change the data. The word “data” in “abstract data type” is used loosely.
2138An ADT might be a graphics window with all the operations that affect it, a file and file
2139operations, an insurance-rates table and the operations on it, or something else.
Cross-Reference: Thinking about ADTs first and classes second is an example of programming into a language vs. programming in one. See Section 4.3, “Your Location on the Technology Wave,” and Section 34.4, “Program into Your Language, Not in It.”
2149Understanding ADTs is essential to understanding object-oriented programming.
Without understanding ADTs, programmers create classes that are “classes” in name
2151only—in reality, they are little more than convenient carrying cases for loosely related
2152collections of data and routines. With an understanding of ADTs, programmers can
2153create classes that are easier to implement initially and easier to modify over time.
2154Traditionally, programming books wax mathematical when they arrive at the topic of
2155abstract data types. They tend to make statements like “One can think of an abstract
data type as a mathematical model with a collection of operations defined on it.” Such
2157books make it seem as if you’d never actually use an abstract data type except as a
2158sleep aid.
2159Such dry explanations of abstract data types completely miss the point. Abstract data
2160types are exciting because you can use them to manipulate real-world entities rather
2161than low-level, implementation entities. Instead of inserting a node into a linked list,
2162you can add a cell to a spreadsheet, a new type of window to a list of window types, or
2163another passenger car to a train simulation. Tap into the power of being able to work
2164in the problem domain rather than at the low-level implementation domain!
2165Example of the Need for an ADT
2166To get things started, here’s an example of a case in which an ADT would be useful.
2167We’ll get to the details after we have an example to talk about.
2168Suppose you’re writing a program to control text output to the screen using a variety
2169of typefaces, point sizes, and font attributes (such as bold and italic). Part of the program
2170manipulates the text’s fonts. If you use an ADT, you’ll have a group of font routines
2171bundled with the data—the typeface names, point sizes, and font attributes—they
2172operate on. The collection of font routines and data is an ADT.
2173If you’re not using ADTs, you’ll take an ad hoc approach to manipulating fonts. For
2174example, if you need to change to a 12-point font size, which happens to be 16 pixels
2175high, you’ll have code like this:
currentFont.size = 16
If you’ve built up a collection of library routines, the code might be slightly more
readable:
currentFont.size = PointsToPixels( 12 )
Or you could provide a more specific name for the attribute, something like
currentFont.sizeInPixels = PointsToPixels( 12 )
2183But what you can’t do is have both currentFont.sizeInPixels and currentFont.sizeInPoints,
2184because, if both the data members are in play, currentFont won’t have any way to know
2185which of the two it should use. And if you change sizes in several places in the program,
2186you’ll have similar lines spread throughout your program.
If you need to set a font to bold, you might have code like this that uses a logical or and
a hexadecimal constant 0x02:
currentFont.attribute = currentFont.attribute or 0x02
If you’re lucky, you’ll have something cleaner than that, but the best you’ll get with an
ad hoc approach is something like this:
currentFont.attribute = currentFont.attribute or BOLD
Or maybe something like this:
currentFont.bold = True
2195As with the font size, the limitation is that the client code is required to control the
2196data members directly, which limits how currentFont can be used.
2197If you program this way, you’re likely to have similar lines in many places in your
2198program.
2199Benefits of Using ADTs
2200The problem isn’t that the ad hoc approach is bad programming practice. It’s that you
2201can replace the approach with a better programming practice that produces these
2202benefits:
You can hide implementation details. Hiding information about the font data type
means that if the data type changes, you can change it in one place without affecting
the whole program. For example, unless you hid the implementation details in an
ADT, changing the data type from the first representation of bold to the second would
entail changing your program in every place in which bold was set rather than in just
one place. Hiding the information also protects the rest of the program if you decide
to store data in external storage rather than in memory or to rewrite all the font-manipulation
routines in another language.
Changes don’t affect the whole program. If fonts need to become richer and support
more operations (such as switching to small caps, superscripts, strikethrough, and so
on), you can change the program in one place. The change won’t affect the rest of the
program.
You can make the interface more informative. Code like currentFont.size = 16 is
ambiguous because 16 could be a size in either pixels or points. The context doesn’t
tell you which is which. Collecting all similar operations into an ADT allows you to
define the entire interface in terms of points, or in terms of pixels, or to clearly differentiate
between the two, which helps avoid confusing them.
It’s easier to improve performance. If you need to improve font performance, you can
recode a few well-defined routines rather than wading through an entire program.
The program is more obviously correct. You can replace the more tedious task of verifying
that statements like currentFont.attribute = currentFont.attribute or 0x02 are correct
with the easier task of verifying that calls to currentFont.SetBoldOn() are correct.
With the first statement, you can have the wrong structure name, the wrong field
name, the wrong operation (and instead of or), or the wrong value for the attribute
(0x20 instead of 0x02). In the second case, the only thing that could possibly be
wrong with the call to currentFont.SetBoldOn() is that it’s a call to the wrong routine
name, so it’s easier to see whether it’s correct.
The program becomes more self-documenting. You can improve statements like
currentFont.attribute or 0x02 by replacing 0x02 with BOLD or whatever 0x02 represents, but
that doesn’t compare to the readability of a routine call such as currentFont.SetBoldOn().
Woodfield, Dunsmore, and Shen conducted a study in which graduate and senior
undergraduate computer-science students answered questions about two programs:
one that was divided into eight routines along functional lines, and one that was
divided into eight abstract-data-type routines (1981). Students using the abstract-data-type
program scored over 30 percent higher than students using the functional version.
You don’t have to pass data all over your program. In the examples just presented,
you have to change currentFont directly or pass it to every routine that works with fonts.
If you use an abstract data type, you don’t have to pass currentFont all over the program
and you don’t have to turn it into global data either. The ADT has a structure that contains
currentFont’s data. The data is directly accessed only by routines that are part of the
ADT. Routines that aren’t part of the ADT don’t have to worry about the data.
You’re able to work with real-world entities rather than with low-level implementation
structures. You can define operations dealing with fonts so that most of the program
operates solely in terms of fonts rather than in terms of array accesses, structure definitions,
and True and False.
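The "more informative interface" benefit can be made concrete with a small C++ sketch. This is only one possible approach, not something the discussion above prescribes: the Points and Pixels wrapper types, the FontSize class, and the 96-pixels-per-inch conversion are all assumptions invented for the illustration.

```cpp
#include <cassert>

// Hypothetical unit wrappers: a size can no longer be passed without saying
// whether it is measured in points or in pixels.
struct Points { int value; };
struct Pixels { int value; };

// Assumed conversion: 72 points per inch on a 96-pixels-per-inch display.
inline Pixels ToPixels(Points size) {
    assert(size.value >= 0);
    return Pixels{size.value * 96 / 72};
}

// Hypothetical font-size ADT that accepts either unit explicitly.
class FontSize {
public:
    void Set(Points size) { pixels_ = ToPixels(size); }
    void Set(Pixels size) { pixels_ = size; }
    Pixels InPixels() const { return pixels_; }

private:
    Pixels pixels_{16};  // internal representation is hidden from callers
};
```

With types like these, a call reads as size.Set( Points{12} ) or size.Set( Pixels{16} ), and a bare size.Set( 16 ) does not compile, so the ambiguity described above has nowhere to hide.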
2254In this case, to define an abstract data type, you’d define a few routines to control
2255fonts—perhaps like this:
currentFont.SetSizeInPoints( sizeInPoints )
currentFont.SetSizeInPixels( sizeInPixels )
currentFont.SetBoldOn()
currentFont.SetBoldOff()
currentFont.SetItalicOn()
currentFont.SetItalicOff()
currentFont.SetTypeFace( faceName )
2263The code inside these routines would probably be short—it would probably be similar
2264to the code you saw in the ad hoc approach to the font problem earlier. The difference
2265is that you’ve isolated font operations in a set of routines. That provides a better level
2266of abstraction for the rest of your program to work with fonts, and it gives you a layer
2267of protection against changes in font operations.
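As a rough illustration, the routines listed above might be gathered into a C++ class along the following lines. Treat it as a minimal sketch under stated assumptions rather than a definitive implementation: the class name Font, the query routines, the bit values for bold and italic, and the points-to-pixels conversion (72 points per inch, 96 pixels per inch) are all invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical font ADT: clients call these routines and never touch the
// representation (pixel size, attribute bits) directly.
class Font {
public:
    void SetSizeInPoints(int sizeInPoints) {
        assert(sizeInPoints > 0);
        sizeInPixels_ = PointsToPixels(sizeInPoints);
    }
    void SetSizeInPixels(int sizeInPixels) {
        assert(sizeInPixels > 0);
        sizeInPixels_ = sizeInPixels;
    }
    void SetBoldOn()    { attributes_ |= kBold; }
    void SetBoldOff()   { attributes_ &= ~kBold; }
    void SetItalicOn()  { attributes_ |= kItalic; }
    void SetItalicOff() { attributes_ &= ~kItalic; }
    void SetTypeFace(std::string faceName) { faceName_ = std::move(faceName); }

    // Query routines added for the sketch; the text lists only the setters.
    int SizeInPixels() const { return sizeInPixels_; }
    bool IsBold() const      { return (attributes_ & kBold) != 0; }
    bool IsItalic() const    { return (attributes_ & kItalic) != 0; }
    const std::string& TypeFace() const { return faceName_; }

private:
    // Assumed conversion: 72 points per inch, 96 pixels per inch.
    static int PointsToPixels(int points) { return points * 96 / 72; }

    static constexpr unsigned kBold = 0x02;    // assumed bit assignments
    static constexpr unsigned kItalic = 0x04;

    int sizeInPixels_ = 16;
    unsigned attributes_ = 0;
    std::string faceName_ = "Default";
};
```

Notice that the bit twiddling from the ad hoc approach is still there, but it now lives in exactly one place. If bold later becomes a plain Boolean member, only SetBoldOn(), SetBoldOff(), and IsBold() change.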
2268More Examples of ADTs
2269Suppose you’re writing software that controls the cooling system for a nuclear reactor.
2270You can treat the cooling system as an abstract data type by defining the following
2271operations for it:
coolingSystem.GetTemperature()
coolingSystem.SetCirculationRate( rate )
coolingSystem.OpenValve( valveNumber )
coolingSystem.CloseValve( valveNumber )
2276The specific environment would determine the code written to implement each of
2277these operations. The rest of the program could deal with the cooling system through
2278these functions and wouldn’t have to worry about internal details of data-structure
2279implementations, data-structure limitations, changes, and so on.
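Here is one way the cooling-system operations might look as a C++ class. It is a sketch for illustration only: the valve count, the two extra query routines, and the fixed temperature that stands in for a real sensor are all assumptions, since the specific environment would dictate the real implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical cooling-system ADT. The rest of the program sees only these
// operations; sensor and valve details stay inside the class.
class CoolingSystem {
public:
    static constexpr std::size_t kValveCount = 4;  // assumed for the sketch

    double GetTemperature() const { return temperatureCelsius_; }

    void SetCirculationRate(double rate) {
        assert(rate >= 0.0);
        circulationRate_ = rate;
    }
    void OpenValve(std::size_t valveNumber)  { SetValve(valveNumber, true); }
    void CloseValve(std::size_t valveNumber) { SetValve(valveNumber, false); }

    // Extra queries, not in the text, so the sketch can be exercised.
    bool IsValveOpen(std::size_t valveNumber) const {
        assert(valveNumber < kValveCount);
        return valveOpen_[valveNumber];
    }
    double CirculationRate() const { return circulationRate_; }

private:
    void SetValve(std::size_t valveNumber, bool open) {
        assert(valveNumber < kValveCount);
        valveOpen_[valveNumber] = open;
    }

    double temperatureCelsius_ = 20.0;  // stand-in for a real sensor reading
    double circulationRate_ = 0.0;
    std::array<bool, kValveCount> valveOpen_{};  // all valves start closed
};
```

Client code such as coolingSystem.OpenValve( 2 ) stays the same whether the valves are array entries, as here, or memory-mapped hardware registers.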
2280Here are more examples of abstract data types and likely operations on them:
Cruise Control: Set speed; Get current settings; Resume former speed; Deactivate
Blender: Turn on; Turn off; Set speed; Start “Insta-Pulverize”; Stop “Insta-Pulverize”
Fuel Tank: Fill tank; Drain tank; Get tank capacity; Get tank status
List: Initialize list; Insert item in list; Remove item from list; Read next item from list
Light: Turn on; Turn off
Stack: Initialize stack; Push item onto stack; Pop item from stack; Read top of stack
You can derive several guidelines from a study of these examples; those guidelines are
2295described in the following subsections:
Build or use typical low-level data types as ADTs, not as low-level data types. Most
discussions of ADTs focus on representing typical low-level data types as ADTs. As you
can see from the examples, you can represent a stack, a list, and a queue, as well as virtually
any other typical data type, as an ADT.
The question you need to ask is, “What does this stack, list, or queue represent?” If a
2301stack represents a set of employees, treat the ADT as employees rather than as a stack.
2302If a list represents a set of billing records, treat it as billing records rather than a list. If
2303a queue represents cells in a spreadsheet, treat it as a collection of cells rather than a
2304generic item in a queue. Treat yourself to the highest possible level of abstraction.
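To picture this guideline, consider a sketch in which a stack-like container holds employees but the interface talks only about employees. The class name Employees and its routines are hypothetical, and using std::vector as the underlying stack is just one convenient choice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example: the program talks about hiring and removing
// employees; the stack-like vector underneath is an implementation detail.
class Employees {
public:
    void Hire(std::string name) { names_.push_back(std::move(name)); }

    // Removes and returns the most recently hired employee (last in, first out).
    std::string RemoveMostRecentHire() {
        assert(!names_.empty());
        std::string name = std::move(names_.back());
        names_.pop_back();
        return name;
    }

    std::size_t Count() const { return names_.size(); }

private:
    std::vector<std::string> names_;  // used as a stack; free to change later
};
```

Client code reads as staff.Hire( "Ada" ) rather than stack.Push( "Ada" ), so it keeps working at the level of the problem domain even if the container is replaced.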
Treat common objects such as files as ADTs. Most languages include a few abstract
data types that you’re probably familiar with but might not think of as ADTs. File operations
are a good example. While writing to disk, the operating system spares you the
grief of positioning the read/write head at a specific physical address, allocating a new
disk sector when you exhaust an old one, and interpreting cryptic error codes. The operating
system provides a first level of abstraction and the ADTs for that level. High-level
languages provide a second level of abstraction and ADTs for that higher level. A high-level
language protects you from the messy details of generating operating-system calls
and manipulating data buffers. It allows you to treat a chunk of disk space as a “file.”
You can layer ADTs similarly. If you want to use an ADT at one level that offers
data-structure-level operations (like pushing and popping a stack), that’s fine. You can create
another level on top of that one that works at the level of the real-world problem.
Set of Help Screens: Add help topic; Remove help topic; Set current help topic; Display help screen; Remove help display; Display help index; Back up to previous screen
Menu: Start new menu; Delete menu; Add menu item; Remove menu item; Activate menu item; Deactivate menu item; Display menu; Hide menu; Get menu choice
File: Open file; Read file; Write file; Set current file location; Close file
Pointer: Get pointer to new memory; Dispose of memory from existing pointer; Change amount of memory allocated
Elevator: Move up one floor; Move down one floor; Move to specific floor; Report current floor; Return to home floor
Treat even simple items as ADTs. You don’t have to have a formidable data type to
justify using an abstract data type. One of the ADTs in the example list is a light that
supports only two operations: turning it on and turning it off. You might think that it
would be a waste to isolate simple “on” and “off” operations in routines of their own,
but even simple operations can benefit from the use of ADTs. Putting the light and its
operations into an ADT makes the code more self-documenting and easier to change,
confines the potential consequences of changes to the TurnLightOn() and TurnLightOff()
routines, and reduces the number of data items you have to pass around.
Refer to an ADT independently of the medium it’s stored on. Suppose you have an
insurance-rates table that’s so big that it’s always stored on disk. You might be
tempted to refer to it as a “rate file” and create access routines such as RateFile.Read().
2346When you refer to it as a file, however, you’re exposing more information about the
2347data than you need to. If you ever change the program so that the table is in memory
2348instead of on disk, the code that refers to it as a file will be incorrect, misleading, and
2349confusing. Try to make the names of classes and access routines independent of how
2350the data is stored, and refer to the abstract data type, like the insurance-rates table,
2351instead. That would give your class and access routine names like rateTable.Read() or
2352simply rates.Read().
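A small sketch shows what a storage-neutral interface might look like. Everything beyond the names RateTable and Read() is an assumption made for the example: the SetRate() routine, the category strings, and the in-memory std::map are stand-ins, and a disk-backed version could sit behind the same interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical insurance-rates ADT. Nothing in the name or the interface
// says whether the rates live on disk, in memory, or somewhere else.
class RateTable {
public:
    void SetRate(const std::string& category, double rate) {
        assert(rate >= 0.0);
        rates_[category] = rate;
    }

    // Returns true and fills in 'rate' if the category is known.
    bool Read(const std::string& category, double& rate) const {
        auto it = rates_.find(category);
        if (it == rates_.end()) {
            return false;
        }
        rate = it->second;
        return true;
    }

private:
    // In-memory storage chosen for the sketch; a disk-backed version could
    // replace it without changing any caller.
    std::map<std::string, double> rates_;
};
```

Because callers write rates.Read( "auto", rate ), moving the table from disk to memory, or back, never makes their code misleading.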
Handling Multiple Instances of Data with ADTs in Non-Object-Oriented Environments
2355Object-oriented languages provide automatic support for handling multiple instances
2356of an ADT. If you’ve worked exclusively in object-oriented environments and you’ve
2357never had to handle the implementation details of multiple instances yourself, count
your blessings! (You can also move on to the next section, “ADTs and Classes.”)
2359If you’re working in a non-object-oriented environment such as C, you will have to
2360build support for multiple instances manually. In general, that means including services
2361for the ADT to create and delete instances and designing the ADT’s other services
2362so that they can work with multiple instances.
2363The font ADT originally offered these services:
currentFont.SetSize( sizeInPoints )
currentFont.SetBoldOn()
currentFont.SetBoldOff()
currentFont.SetItalicOn()
currentFont.SetItalicOff()
currentFont.SetTypeFace( faceName )
In a non-object-oriented environment, these functions would not be attached to a
class and would look more like this:
SetCurrentFontSize( sizeInPoints )
SetCurrentFontBoldOn()
SetCurrentFontBoldOff()
SetCurrentFontItalicOn()
SetCurrentFontItalicOff()
SetCurrentFontTypeFace( faceName )
If you want to work with more than one font at a time, you’ll need to add services to
create and delete font instances, maybe these:
CreateFont( fontId )
DeleteFont( fontId )
SetCurrentFont( fontId )
2384The notion of a fontId has been added as a way to keep track of multiple fonts as
2385they’re created and used. For other operations, you can choose from among three
2386ways to handle the ADT interface:
■ Option 1: Explicitly identify instances each time you use ADT services. In this
case, you don’t have the notion of a “current font.” You pass fontId to each routine
that manipulates fonts. The Font functions keep track of any underlying
data, and the client code needs to keep track only of the fontId. This requires
adding fontId as a parameter to each font routine.
■ Option 2: Explicitly provide the data used by the ADT services. In this approach,
you declare the data that the ADT uses within each routine that uses an ADT service.
In other words, you create a Font data type that you pass to each of the ADT
service routines. You must design the ADT service routines so that they use the
Font data that’s passed to them each time they’re called. The client code doesn’t
need a font ID if you use this approach because it keeps track of the font data
itself. (Even though the data is available directly from the Font data type, you
should access it only with the ADT service routines. This is called keeping the
structure “closed.”)
The advantage of this approach is that the ADT service routines don’t have to
look up font information based on a font ID. The disadvantage is that it exposes
font data to the rest of the program, which increases the likelihood that client
code will make use of the ADT’s implementation details that should have
remained hidden within the ADT.
■ Option 3: Use implicit instances (with great care). Design a new service to call to
make a specific font instance the current one, something like SetCurrentFont( fontId ).
Setting the current font makes all other services use the current font
when they’re called. If you use this approach, you don’t need fontId as a parameter
to the other services. For simple applications, this can streamline use of
multiple instances. For complex applications, this systemwide dependence on
state means that you must keep track of the current font instance throughout
code that uses the Font functions. Complexity tends to proliferate, and for applications
of any size, better alternatives exist.
2416Inside the abstract data type, you’ll have a wealth of options for handling multiple
2417instances, but outside, this sums up the choices if you’re working in a non-object-oriented
2418language.
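Option 1 is probably the easiest of the three to picture in code. The sketch below is written in C++ for consistency with the other examples, but it uses only free functions and an explicit ID, which is the shape a C module would take. It rests on several assumptions: here CreateFont() returns the new ID instead of taking one as a parameter, and the FontData fields, the default size of 12 points, and the Get/Is query routines are invented for the illustration.

```cpp
#include <cassert>
#include <map>

// Option 1 sketched with free functions plus an explicit id.
typedef int FontId;

namespace {
// Hidden from clients: only the functions below touch this table.
struct FontData {
    int sizeInPoints;
    bool bold;
};
std::map<FontId, FontData> g_fonts;
FontId g_nextId = 1;
}  // namespace

// Creates a font instance with assumed defaults and returns its id.
FontId CreateFont() {
    FontId id = g_nextId++;
    g_fonts[id] = FontData{12, false};
    return id;
}

void DeleteFont(FontId fontId) { g_fonts.erase(fontId); }

// Every service takes the fontId, so there is no notion of a "current font."
void SetFontSize(FontId fontId, int sizeInPoints) {
    assert(g_fonts.count(fontId) == 1);
    g_fonts[fontId].sizeInPoints = sizeInPoints;
}

void SetFontBoldOn(FontId fontId) {
    assert(g_fonts.count(fontId) == 1);
    g_fonts[fontId].bold = true;
}

int GetFontSize(FontId fontId) { return g_fonts.at(fontId).sizeInPoints; }
bool IsFontBold(FontId fontId) { return g_fonts.at(fontId).bold; }
```

The client holds nothing but the IDs. Two fonts created this way can be changed independently, and the lookup table could be swapped for an array or a linked list without touching any caller.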
ADTs and Classes
Abstract data types form the foundation for the concept of classes. In languages that
support classes, you can implement each abstract data type as its own class. Classes
usually involve the additional concepts of inheritance and polymorphism. One way of
thinking of a class is as an abstract data type plus inheritance and polymorphism.
6.2 Good Class Interfaces
The first and probably most important step in creating a high-quality class is creating
a good interface. This consists of creating a good abstraction for the interface to represent
and ensuring that the details remain hidden behind the abstraction.
Good Abstraction
As “Form Consistent Abstractions” in Section 5.3 described, abstraction is the ability
to view a complex operation in a simplified form. A class interface provides an abstraction
of the implementation that’s hidden behind the interface. The class’s interface
should offer a group of routines that clearly belong together.
You might have a class that implements an employee. It would contain data describing
the employee’s name, address, phone number, and so on. It would offer services to initialize
and use an employee. Here’s how that might look.
Cross-Reference: Code samples in this book are formatted using a coding
convention that emphasizes similarity of styles across multiple languages. For
details on the convention (and discussions about multiple coding styles), see
“Mixed-Language Programming Considerations” in Section 11.4.
C++ Example of a Class Interface That Presents a Good Abstraction
class Employee {
public:
    // public constructors and destructors
    Employee();
    Employee(
        FullName name,
        String address,
        String workPhone,
        String homePhone,
        TaxId taxIdNumber,
        JobClassification jobClass
    );
    virtual ~Employee();

    // public routines
    FullName GetName() const;
    String GetAddress() const;
    String GetWorkPhone() const;
    String GetHomePhone() const;
    TaxId GetTaxIdNumber() const;
    JobClassification GetJobClassification() const;
    ...
private:
    ...
};
Internally, this class might have additional routines and data to support these services,
but users of the class don’t need to know anything about them. The class interface
abstraction is great because every routine in the interface is working toward a
consistent end.
A class that presents a poor abstraction would be one that contained a collection of
miscellaneous functions. Here’s an example:
C++ Example of a Class Interface That Presents a Poor Abstraction
class Program {
public:
    ...
    // public routines
    void InitializeCommandStack();
    void PushCommand( Command command );
    Command PopCommand();
    void ShutdownCommandStack();
    void InitializeReportFormatting();
    void FormatReport( Report report );
    void PrintReport( Report report );
    void InitializeGlobalData();
    void ShutdownGlobalData();
    ...
private:
    ...
};
Suppose that a class contains routines to work with a command stack, to format
reports, to print reports, and to initialize global data. It’s hard to see any connection
among the command stack and report routines or the global data. The class interface
doesn’t present a consistent abstraction, so the class has poor cohesion. The routines
should be reorganized into more-focused classes, each of which provides a better
abstraction in its interface.
If these routines were part of a Program class, they could be revised to present a consistent
abstraction, like so:
C++ Example of a Class Interface That Presents a Better Abstraction
class Program {
public:
    ...
    // public routines
    void InitializeUserInterface();
    void ShutDownUserInterface();
    void InitializeReports();
    void ShutDownReports();
    ...
private:
    ...
};
The cleanup of this interface assumes that some of the original routines were moved
to other, more appropriate classes and some were converted to private routines used
by InitializeUserInterface() and the other routines.
This evaluation of class abstraction is based on the class’s collection of public routines,
that is, on the class’s interface. The routines inside the class don’t necessarily
present good individual abstractions just because the overall class does, but they need
to be designed to present good abstractions too. For guidelines on that, see Section
7.2, “Design at the Routine Level.”
The pursuit of good, abstract interfaces gives rise to several guidelines for creating
class interfaces.
Present a consistent level of abstraction in the class interface
A good way to think
about a class is as the mechanism for implementing the abstract data types described
in Section 6.1. Each class should implement one and only one ADT. If you find a class
implementing more than one ADT, or if you can’t determine what ADT the class
implements, it’s time to reorganize the class into one or more well-defined ADTs.
Here’s an example of a class that presents an interface that’s inconsistent because its
level of abstraction is not uniform:
C++ Example of a Class Interface with Mixed Levels of Abstraction
class EmployeeCensus: public ListContainer {
public:
    ...
    // public routines

    // The abstraction of these routines is at the "employee" level.
    void AddEmployee( Employee employee );
    void RemoveEmployee( Employee employee );

    // The abstraction of these routines is at the "list" level.
    Employee NextItemInList();
    Employee FirstItem();
    Employee LastItem();
    ...
private:
    ...
};
This class is presenting two ADTs: an Employee and a ListContainer. This sort of mixed
abstraction commonly arises when a programmer uses a container class or other
library classes for implementation and doesn’t hide the fact that a library class is used.
Ask yourself whether the fact that a container class is used should be part of the
abstraction. Usually that’s an implementation detail that should be hidden from the
rest of the program, like this:
C++ Example of a Class Interface with Consistent Levels of Abstraction
class EmployeeCensus {
public:
    ...
    // public routines

    // The abstraction of all these routines is now at the "employee" level.
    void AddEmployee( Employee employee );
    void RemoveEmployee( Employee employee );
    Employee NextEmployee();
    Employee FirstEmployee();
    Employee LastEmployee();
    ...
private:
    // That the class uses the ListContainer library is now hidden.
    ListContainer m_EmployeeList;
    ...
};
Programmers might argue that inheriting from ListContainer is convenient because it
supports polymorphism, allowing an external search or sort function that takes a
ListContainer object. That argument fails the main test for inheritance, which is, “Is
inheritance used only for ‘is a’ relationships?” To inherit from ListContainer would mean
that EmployeeCensus “is a” ListContainer, which obviously isn’t true. If the abstraction
of the EmployeeCensus object is that it can be searched or sorted, that should be incorporated
as an explicit, consistent part of the class interface.
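One way to make searching and sorting an explicit part of the interface might look like the sketch below. It is an illustration under assumptions, not the book's code: std::vector stands in for ListContainer, Employee is reduced to a name, and ContainsEmployee, SortByName, and EmployeeCount are hypothetical routine names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the book's Employee class.
struct Employee {
    std::string name;
};

class EmployeeCensus {
public:
    void AddEmployee(Employee employee) {
        m_employees.push_back(std::move(employee));
    }

    // Searching is offered at the "employee" level, not the "list" level.
    bool ContainsEmployee(const std::string& name) const {
        return std::any_of(m_employees.begin(), m_employees.end(),
            [&name](const Employee& e) { return e.name == name; });
    }

    // Sorting is likewise an explicit service of the census.
    void SortByName() {
        std::sort(m_employees.begin(), m_employees.end(),
            [](const Employee& a, const Employee& b) { return a.name < b.name; });
    }

    // Precondition: the census contains at least one employee.
    const Employee& FirstEmployee() const {
        assert(!m_employees.empty());
        return m_employees.front();
    }

    std::size_t EmployeeCount() const { return m_employees.size(); }

private:
    // The container is an implementation detail (std::vector assumed here
    // in place of ListContainer) and can change without affecting clients.
    std::vector<Employee> m_employees;
};
```

Client code sorts or searches the census without ever learning which container sits behind it, so the "is a ListContainer" claim never has to be made.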
If you think of the class’s public routines as an air lock that keeps water from getting
into a submarine, inconsistent public routines are leaky panels in the class. The leaky
panels might not let water in as quickly as an open air lock, but if you give them
enough time, they’ll still sink the boat. In practice, this is what happens when you mix
levels of abstraction. As the program is modified, the mixed levels of abstraction make
the program harder and harder to understand, and it gradually degrades until it
becomes unmaintainable.
Be sure you understand what abstraction the class is implementing
Some classes are
similar enough that you must be careful to understand which abstraction the class
interface should capture. I once worked on a program that needed to allow information
to be edited in a table format. We wanted to use a simple grid control, but the grid
controls that were available didn’t allow us to color the data-entry cells, so we decided
to use a spreadsheet control that did provide that capability.
The spreadsheet control was far more complicated than the grid control, providing
about 150 routines to the grid control’s 15. Since our goal was to use a grid control,
not a spreadsheet control, we assigned a programmer to write a wrapper class to hide
the fact that we were using a spreadsheet control as a grid control. The programmer
grumbled quite a bit about unnecessary overhead and bureaucracy, went away, and
came back a couple of days later with a wrapper class that faithfully exposed all 150
routines of the spreadsheet control.
This was not what was needed. We wanted a grid-control interface that encapsulated
the fact that, behind the scenes, we were using a much more complicated spreadsheet
control. The programmer should have exposed just the 15 grid-control routines plus
a 16th routine that supported cell coloring. By exposing all 150 routines, the programmer
created the possibility that, if we ever wanted to change the underlying implementation,
we could find ourselves supporting 150 public routines. The programmer
failed to achieve the encapsulation we were looking for and created a lot more
work for himself than necessary.
Depending on specific circumstances, the right abstraction might be either a spreadsheet
control or a grid control. When you have to choose between two similar abstractions,
make sure you choose the right one.
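The wrapper that was wanted can be sketched roughly as follows. Everything here is hypothetical: SpreadsheetControl is a tiny stand-in for the real third-party control, and the routine names are invented. The point is only that the wrapper exposes the narrow grid abstraction plus cell coloring and nothing else.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a large spreadsheet control. Imagine that the
// real one offers roughly 150 routines (formulas, charts, and so on).
class SpreadsheetControl {
public:
    void SetCellText(int row, int column, const std::string& text) {
        m_text[std::make_pair(row, column)] = text;
    }
    std::string GetCellText(int row, int column) const {
        auto it = m_text.find(std::make_pair(row, column));
        return it == m_text.end() ? std::string() : it->second;
    }
    void SetCellBackground(int row, int column, unsigned rgb) {
        m_background[std::make_pair(row, column)] = rgb;
    }
    unsigned GetCellBackground(int row, int column) const {
        auto it = m_background.find(std::make_pair(row, column));
        return it == m_background.end() ? 0xFFFFFFu : it->second;
    }
private:
    std::map<std::pair<int, int>, std::string> m_text;
    std::map<std::pair<int, int>, unsigned> m_background;
};

// The wrapper presents a grid abstraction only. Clients cannot reach the
// spreadsheet, so it could later be swapped for a real grid control.
class GridControl {
public:
    void SetText(int row, int column, const std::string& text) {
        m_spreadsheet.SetCellText(row, column, text);
    }
    std::string GetText(int row, int column) const {
        return m_spreadsheet.GetCellText(row, column);
    }
    // The extra routine beyond the basic grid services: cell coloring.
    void SetCellColor(int row, int column, unsigned rgb) {
        m_spreadsheet.SetCellBackground(row, column, rgb);
    }
    unsigned GetCellColor(int row, int column) const {
        return m_spreadsheet.GetCellBackground(row, column);
    }
private:
    SpreadsheetControl m_spreadsheet;
};
```

If the underlying control changes, only the handful of forwarding routines in GridControl need to be rewritten.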
Provide services in pairs with their opposites
Most operations have corresponding,
equal, and opposite operations. If you have an operation that turns a light on, you’ll
probably need one to turn it off. If you have an operation to add an item to a list, you’ll
probably need one to delete an item from the list. If you have an operation to activate
a menu item, you’ll probably need one to deactivate an item. When you design a class,
check each public routine to determine whether you need its complement. Don’t create
an opposite gratuitously, but do check to see whether you need one.
Move unrelated information to another class
In some cases, you’ll find that half a
class’s routines work with half the class’s data and half the routines work with the
other half of the data. In such a case, you really have two classes masquerading as one.
Break them up!
Make interfaces programmatic rather than semantic when possible
Each interface
consists of a programmatic part and a semantic part. The programmatic part consists of
the data types and other attributes of the interface that can be enforced by the compiler.
The semantic part of the interface consists of the assumptions about how the interface
will be used, which cannot be enforced by the compiler. The semantic interface includes
considerations such as “RoutineA must be called before RoutineB” or “RoutineA will crash
if dataMember1 isn’t initialized before it’s passed to RoutineA.” The semantic interface
should be documented in comments, but try to keep interfaces minimally dependent
on documentation. Any aspect of an interface that can’t be enforced by the compiler is
an aspect that’s likely to be misused. Look for ways to convert semantic interface elements
to programmatic interface elements by using Asserts or other techniques.
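A small sketch of the conversion, using invented class names: the first class relies on a "call Initialize() first" rule and backs it with an assert, which at least makes the rule checkable at run time in debug builds. The second class moves the rule into the constructor signature, so the compiler enforces it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Semantic rule: Initialize() must be called before Append().
// The assert turns a comment-only rule into a run-time check.
class ReportLogWithAssert {
public:
    void Initialize(const std::string& title) {
        m_title = title;
        m_initialized = true;
    }
    void Append(const std::string& line) {
        assert(m_initialized && "Initialize() must be called before Append()");
        m_body += line + "\n";
    }
    std::string Text() const { return m_title + "\n" + m_body; }
private:
    bool m_initialized = false;
    std::string m_title;
    std::string m_body;
};

// Programmatic rule: a ReportLog cannot exist without a title, so there is
// no ordering requirement left for a client to get wrong.
class ReportLog {
public:
    explicit ReportLog(std::string title) : m_title(std::move(title)) {}
    void Append(const std::string& line) { m_body += line + "\n"; }
    std::string Text() const { return m_title + "\n" + m_body; }
private:
    std::string m_title;
    std::string m_body;
};
```

With the second form, a declaration such as ReportLog log; simply fails to compile, which is the strongest documentation available.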
Cross-Reference: For more suggestions about how to preserve code quality as
code is modified, see Chapter 24, “Refactoring.”
Beware of erosion of the interface’s abstraction under modification
As a class is
modified and extended, you often discover additional functionality that’s needed, that
doesn’t quite fit with the original class interface, but that seems too hard to implement
any other way. For example, in the Employee class, you might find that the class
evolves to look like this:
C++ Example of a Class Interface That’s Eroding Under Maintenance
class Employee {
public:
    ...
    // public routines
    FullName GetName() const;
    Address GetAddress() const;
    PhoneNumber GetWorkPhone() const;
    ...
    bool IsJobClassificationValid( JobClassification jobClass );
    bool IsZipCodeValid( Address address );
    bool IsPhoneNumberValid( PhoneNumber phoneNumber );
    SqlQuery GetQueryToCreateNewEmployee() const;
    SqlQuery GetQueryToModifyEmployee() const;
    SqlQuery GetQueryToRetrieveEmployee() const;
    ...
private:
    ...
};
What started out as a clean abstraction in an earlier code sample has evolved into a
hodgepodge of functions that are only loosely related. There’s no logical connection
between employees and routines that check ZIP Codes, phone numbers, or job classifications.
The routines that expose SQL query details are at a much lower level of
abstraction than the Employee class, and they break the Employee abstraction.
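One possible way to stop the erosion, sketched under assumptions: let each value type validate itself and move the query-building routines into a separate persistence class. The text does not prescribe this particular split; EmployeeSqlMapper, the simplified SqlQuery struct, and the ten-digit phone rule are all invented for illustration.

```cpp
#include <string>
#include <utility>

// Validation lives with the type it validates (illustrative rule only:
// a phone number is valid if it is exactly ten digits).
class PhoneNumber {
public:
    explicit PhoneNumber(std::string digits) : m_digits(std::move(digits)) {}
    bool IsValid() const {
        if (m_digits.size() != 10) return false;
        for (char c : m_digits) {
            if (c < '0' || c > '9') return false;
        }
        return true;
    }
private:
    std::string m_digits;
};

// Employee is back to describing an employee and nothing else.
class Employee {
public:
    Employee(std::string name, PhoneNumber workPhone)
        : m_name(std::move(name)), m_workPhone(std::move(workPhone)) {}
    const std::string& GetName() const { return m_name; }
    const PhoneNumber& GetWorkPhone() const { return m_workPhone; }
private:
    std::string m_name;
    PhoneNumber m_workPhone;
};

// Hypothetical, simplified query type: statement text plus one parameter.
struct SqlQuery {
    std::string text;
    std::string nameParameter;
};

// Low-level SQL details are kept out of Employee's interface.
class EmployeeSqlMapper {
public:
    static SqlQuery QueryToRetrieveEmployee(const Employee& employee) {
        return SqlQuery{"SELECT * FROM employees WHERE name = ?",
                        employee.GetName()};
    }
};
```

Each class now answers one kind of question, which is the property the original Employee interface had before maintenance began.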
Don’t add public members that are inconsistent with the interface abstraction
Each
time you add a routine to a class interface, ask “Is this routine consistent with the
abstraction provided by the existing interface?” If not, find a different way to make the
modification and preserve the integrity of the abstraction.
Consider abstraction and cohesion together
The ideas of abstraction and cohesion
are closely related: a class interface that presents a good abstraction usually has
strong cohesion. Classes with strong cohesion tend to present good abstractions,
although that relationship is not as strong.
I have found that focusing on the abstraction presented by the class interface tends to
provide more insight into class design than focusing on class cohesion. If you see that
a class has weak cohesion and aren’t sure how to correct it, ask yourself whether the
class presents a consistent abstraction instead.
Good Encapsulation
Cross-Reference: For more on encapsulation, see “Encapsulate Implementation
Details” in Section 5.3.
As Section 5.3 discussed, encapsulation is a stronger concept than abstraction.
Abstraction helps to manage complexity by providing models that allow you to ignore
implementation details. Encapsulation is the enforcer that prevents you from looking
at the details even if you want to.
The two concepts are related because, without encapsulation, abstraction tends to
break down. In my experience, either you have both abstraction and encapsulation or
you have neither. There is no middle ground.
“The single most important factor that distinguishes a well-designed module from
a poorly designed one is the degree to which the module hides its internal data and
other implementation details from other modules.”
-- Joshua Bloch
Minimize accessibility of classes and members
Minimizing accessibility is one of
several rules that are designed to encourage encapsulation. If you’re wondering
whether a specific routine should be public, private, or protected, one school of
thought is that you should favor the strictest level of privacy that’s workable (Meyers
1998, Bloch 2001). I think that’s a fine guideline, but I think the more important
guideline is, “What best preserves the integrity of the interface abstraction?” If exposing
the routine is consistent with the abstraction, it’s probably fine to expose it. If
you’re not sure, hiding more is generally better than hiding less.
Don’t expose member data in public
Exposing member data is a violation of encapsulation
and limits your control over the abstraction. As Arthur Riel points out, a Point
class that exposes
    float x;
    float y;
    float z;
is violating encapsulation because client code is free to monkey around with Point’s
data and Point won’t necessarily even know when its values have been changed (Riel
1996). However, a Point class that exposes
    float GetX();
    float GetY();
    float GetZ();
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );
is maintaining perfect encapsulation. You have no idea whether the underlying implementation
is in terms of floats x, y, and z, whether Point is storing those items as doubles
and converting them to floats, or whether Point is storing them on the moon and
retrieving them from a satellite in outer space.
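To make the point concrete, here is a minimal sketch of one of the hidden implementations mentioned above: a Point that stores doubles and converts to float at the interface. The change counter (GetChangeCount) is an invented addition that shows Point now knows when its values change, which the public-data version could not.

```cpp
// A sketch only; the accessor names follow the text, while the
// double-based storage and change counter are assumptions for illustration.
class Point {
public:
    float GetX() const { return static_cast<float>(m_x); }
    float GetY() const { return static_cast<float>(m_y); }
    float GetZ() const { return static_cast<float>(m_z); }

    // Every change goes through a setter, so Point can react to it.
    void SetX(float x) { m_x = x; ++m_changeCount; }
    void SetY(float y) { m_y = y; ++m_changeCount; }
    void SetZ(float z) { m_z = z; ++m_changeCount; }

    int GetChangeCount() const { return m_changeCount; }

private:
    // Clients cannot tell that the coordinates are stored as doubles.
    double m_x = 0.0;
    double m_y = 0.0;
    double m_z = 0.0;
    int m_changeCount = 0;
};
```

Switching the storage back to float, or to something else entirely, would not require touching any client code.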
Avoid putting private implementation details into a class’s interface
With true
encapsulation, programmers would not be able to see implementation details at all.
They would be hidden both figuratively and literally. In popular languages, including
C++, however, the structure of the language requires programmers to disclose implementation
details in the class interface. Here’s an example:
C++ Example of Exposing a Class’s Implementation Details
class Employee {
public:
    ...
    Employee(
        FullName name,
        String address,
        String workPhone,
        String homePhone,
        TaxId taxIdNumber,
        JobClassification jobClass
    );
    ...
    FullName GetName() const;
    String GetAddress() const;
    ...
private:
    // Here are the exposed implementation details.
    String m_Name;
    String m_Address;
    int m_jobClass;
    ...
};
Including private declarations in the class header file might seem like a small transgression,
but it encourages other programmers to examine the implementation
details. In this case, the client code is intended to use the Address type for addresses
but the header file exposes the implementation detail that addresses are stored as
Strings.
Scott Meyers describes a common way to address this issue in Item 34 of Effective C++,
2d ed. (Meyers 1998). You separate the class interface from the class implementation.
Within the class declaration, include a pointer to the class’s implementation but don’t
include any other implementation details.
C++ Example of Hiding a Class’s Implementation Details
class Employee {
public:
    ...
    Employee( ... );
    ...
    FullName GetName() const;
    String GetAddress() const;
    ...
private:
    // Here the implementation details are hidden behind the pointer.
    EmployeeImplementation *m_implementation;
};
Now you can put implementation details inside the EmployeeImplementation class,
which should be visible only to the Employee class and not to the code that uses the
Employee class.
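For readers on a newer compiler, the same idea might be sketched as follows. This is an assumption-laden variant rather than the approach exactly as described: it uses std::unique_ptr (C++11) instead of a raw pointer, nests the implementation type inside Employee, and reduces the members to two strings. The two halves are shown in one listing; in practice they would live in a header and a source file.

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- What would live in the header ----
class Employee {
public:
    Employee(std::string name, std::string address);
    ~Employee();   // defined below, where the implementation type is complete

    // Copying is disabled to keep the sketch short.
    Employee(const Employee&) = delete;
    Employee& operator=(const Employee&) = delete;

    std::string GetName() const;
    std::string GetAddress() const;

private:
    // Only the name of the implementation type is visible to clients.
    struct EmployeeImplementation;
    std::unique_ptr<EmployeeImplementation> m_implementation;
};

// ---- What would live in the source file ----
struct Employee::EmployeeImplementation {
    std::string name;
    std::string address;
};

Employee::Employee(std::string name, std::string address)
    : m_implementation(
          new EmployeeImplementation{std::move(name), std::move(address)}) {}

Employee::~Employee() = default;

std::string Employee::GetName() const { return m_implementation->name; }
std::string Employee::GetAddress() const { return m_implementation->address; }
```

The destructor is declared in the class and defined after EmployeeImplementation is complete because std::unique_ptr needs the full type at the point where it deletes the object.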
If you’ve already written lots of code that doesn’t use this approach for your project,
you might decide it isn’t worth the effort to convert a mountain of existing code to use
this approach. But when you read code that exposes its implementation details, you
can resist the urge to comb through the private section of the class interface looking
for implementation clues.
Don’t make assumptions about the class’s users
A class should be designed and
implemented to adhere to the contract implied by the class interface. It shouldn’t
make any assumptions about how that interface will or won’t be used, other than
what’s documented in the interface. Comments like the following one are an indication
that a class is more aware of its users than it should be:
    -- initialize x, y, and z to 1.0 because DerivedClass blows
    -- up if they're initialized to 0.0
Avoid friend classes
In a few circumstances such as the State pattern, friend classes
can be used in a disciplined way that contributes to managing complexity (Gamma et al.
1995). But, in general, friend classes violate encapsulation. They expand the amount of
code you have to think about at any one time, thereby increasing complexity.
Don’t put a routine into the public interface just because it uses only public routines
The fact that a routine uses only public routines is not a significant consideration.
Instead, ask whether exposing the routine would be consistent with the abstraction
presented by the interface.
Favor read-time convenience to write-time convenience
Code is read far more times
than it’s written, even during initial development. Favoring a technique that speeds
write-time convenience at the expense of read-time convenience is a false economy.
This is especially applicable to creation of class interfaces. Even if a routine doesn’t
quite fit the interface’s abstraction, sometimes it’s tempting to add a routine to an
interface that would be convenient for the particular client of a class that you’re working
on at the time. But adding that routine is the first step down a slippery slope, and
it’s better not to take even the first step.
“It ain’t abstract if you have to look at the underlying implementation to understand
what’s going on.”
-- P. J. Plauger
Be very, very wary of semantic violations of encapsulation
At one time I thought
that when I learned how to avoid syntax errors I would be home free. I soon discovered
that learning how to avoid syntax errors had merely bought me a ticket to a
whole new theater of coding errors, most of which were more difficult to diagnose and
correct than the syntax errors.
The difficulty of semantic encapsulation compared to syntactic encapsulation is similar.
Syntactically, it’s relatively easy to avoid poking your nose into the internal workings of
another class just by declaring the class’s internal routines and data private. Achieving
semantic encapsulation is another matter entirely. Here are some examples of the ways
that a user of a class can break encapsulation semantically:
■ Not calling Class A’s InitializeOperations() routine because you know that Class
A’s PerformFirstOperation() routine calls it automatically.
■ Not calling the database.Connect() routine before you call employee.Retrieve(
database ) because you know that the employee.Retrieve() function will connect
to the database if there isn’t already a connection.
■ Not calling Class A’s Terminate() routine because you know that Class A’s
PerformFinalOperation() routine has already called it.
■ Using a pointer or reference to ObjectB created by ObjectA even after ObjectA has
gone out of scope, because you know that ObjectA keeps ObjectB in static storage
and ObjectB will still be valid.
■ Using Class B’s MAXIMUM_ELEMENTS constant instead of using
ClassA.MAXIMUM_ELEMENTS, because you know that they’re both equal to
the same value.
The problem with each of these examples is that they make the client code dependent
not on the class’s public interface, but on its private implementation. Anytime you
find yourself looking at a class’s implementation to figure out how to use the class,
you’re not programming to the interface; you’re programming through the interface to
the implementation. If you’re programming through the interface, encapsulation is
broken, and once encapsulation starts to break down, abstraction won’t be far behind.
If you can’t figure out how to use a class based solely on its interface documentation,
the right response is not to pull up the source code and look at the implementation.
That’s good initiative but bad judgment. The right response is to contact the author of
the class and say “I can’t figure out how to use this class.” The right response on the
class-author’s part is not to answer your question face to face. The right response for
the class author is to check out the class-interface file, modify the class-interface documentation,
check the file back in, and then say “See if you can understand how it
works now.” You want this dialog to occur in the interface code itself so that it will be
preserved for future programmers. You don’t want the dialog to occur solely in your
own mind, which will bake subtle semantic dependencies into the client code that
uses the class. And you don’t want the dialog to occur interpersonally so that it benefits
only your code but no one else’s.
Watch for coupling that’s too tight
“Coupling” refers to how tight the connection is
between two classes. In general, the looser the connection, the better. Several general
guidelines flow from this concept:
■ Minimize accessibility of classes and members.
■ Avoid friend classes, because they’re tightly coupled.
■ Make data private rather than protected in a base class to make derived classes
less tightly coupled to the base class.
■ Avoid exposing member data in a class’s public interface.
■ Be wary of semantic violations of encapsulation.
■ Observe the “Law of Demeter” (discussed in Section 6.3 of this chapter).
Coupling goes hand in glove with abstraction and encapsulation. Tight coupling
occurs when an abstraction is leaky, or when encapsulation is broken. If a class offers
an incomplete set of services, other routines might find they need to read or write its
internal data directly. That opens up the class, making it a glass box instead of a black
box, and it virtually eliminates the class’s encapsulation.
6.3 Design and Implementation Issues
Defining good class interfaces goes a long way toward creating a high-quality program.
The internal class design and implementation are also important. This section
discusses issues related to containment, inheritance, member functions and data,
class coupling, constructors, and value-vs.-reference objects.
Containment (“has a” Relationships)
Containment is the simple idea that a class contains a primitive data element or
object. A lot more is written about inheritance than about containment, but that’s
because inheritance is more tricky and error-prone, not because it’s better. Containment
is the work-horse technique in object-oriented programming.
Implement “has a” through containment
One way of thinking of containment is as a
“has a” relationship. For example, an employee “has a” name, “has a” phone number,
“has a” tax ID, and so on. You can usually accomplish this by making the name, phone
number, and tax ID member data of the Employee class.
Implement “has a” through private inheritance as a last resort
In some instances
you might find that you can’t achieve containment through making one object a member
of another. In that case, some experts suggest privately inheriting from the contained
object (Meyers 1998, Sutter 2000). The main reason you would do that is to set
up the containing class to access protected member functions or protected member
data of the class that’s contained. In practice, this approach creates an overly cozy relationship
with the ancestor class and violates encapsulation. It tends to point to design
errors that should be resolved some way other than through private inheritance.
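The two forms of “has a” can be contrasted in a short sketch. All names here are hypothetical: EmployeeRecord shows ordinary containment, and Session shows the last-resort case in which the only way to advance a library counter is a protected routine.

```cpp
#include <string>
#include <utility>

// Containment: an employee record "has a" name and "has a" work phone.
class EmployeeRecord {
public:
    EmployeeRecord(std::string name, std::string workPhone)
        : m_name(std::move(name)), m_workPhone(std::move(workPhone)) {}
    const std::string& GetName() const { return m_name; }
    const std::string& GetWorkPhone() const { return m_workPhone; }
private:
    std::string m_name;
    std::string m_workPhone;
};

// Hypothetical library class: advancing the counter is protected, so a
// class that merely contains a TickCounter could never call Tick().
class TickCounter {
public:
    int Count() const { return m_count; }
protected:
    void Tick() { ++m_count; }
private:
    int m_count = 0;
};

// Last resort: private inheritance gives Session access to Tick() without
// claiming that a Session "is a" TickCounter. Clients cannot convert a
// Session to a TickCounter or call Count() on it directly.
class Session : private TickCounter {
public:
    void RecordActivity() { Tick(); }
    int GetActivityCount() const { return Count(); }
};
```

Session now depends on a protected detail of TickCounter, which is the overly cozy relationship the paragraph above warns about.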
Be critical of classes that contain more than about seven data members
The number
“7±2” has been found to be a number of discrete items a person can remember while
performing other tasks (Miller 1956). If a class contains more than about seven data
members, consider whether the class should be decomposed into multiple smaller
classes (Riel 1996). You might err more toward the high end of 7±2 if the data members
are primitive data types like integers and strings, more toward the lower end of
7±2 if the data members are complex objects.
Inheritance (“is a” Relationships)
Inheritance is the idea that one class is a specialization of another class. The purpose of
inheritance is to create simpler code by defining a base class that specifies common elements
of two or more derived classes. The common elements can be routine interfaces,
implementations, data members, or data types. Inheritance helps avoid the need to
repeat code and data in multiple locations by centralizing it within a base class.
When you decide to use inheritance, you have to make several decisions:
■ For each member routine, will the routine be visible to derived classes? Will it
have a default implementation? Will the default implementation be overridable?
■ For each data member (including variables, named constants, enumerations,
and so on), will the data member be visible to derived classes?
The following subsections explain the ins and outs of making these decisions.
“The single most important rule in object-oriented programming with C++ is this:
public inheritance means ‘is a.’ Commit this rule to memory.”
-- Scott Meyers
Implement “is a” through public inheritance
When a programmer decides to create
a new class by inheriting from an existing class, that programmer is saying that the
new class “is a” more specialized version of the older class. The base class sets expectations
about how the derived class will operate and imposes constraints on how the
derived class can operate (Meyers 1998).
If the derived class isn’t going to adhere completely to the same interface contract
defined by the base class, inheritance is not the right implementation technique. Consider
containment or making a change further up the inheritance hierarchy.
Design and document for inheritance or prohibit it
Inheritance adds complexity to a
program, and, as such, it’s a dangerous technique. As Java guru Joshua Bloch says,
“Design and document for inheritance, or prohibit it.” If a class isn’t designed to be
inherited from, make its members non-virtual in C++, final in Java, or non-overridable
in Microsoft Visual Basic so that derived classes can’t override them.
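In C++11 and later, the final specifier offers a more direct way to prohibit inheritance than making members non-virtual. The text predates that feature, so the sketch below is an assumption about how the advice might be applied today; Shape and Square are invented names.

```cpp
// Designed and documented for inheritance: Area() is the extension point.
class Shape {
public:
    virtual ~Shape() = default;

    // Derived classes must supply the area computation.
    virtual double Area() const = 0;

    // Non-virtual: every shape answers this question the same way,
    // and derived classes cannot override it.
    bool IsLargerThan(const Shape& other) const {
        return Area() > other.Area();
    }
};

// Inheritance prohibited: 'final' makes deriving from Square a compile error.
class Square final : public Shape {
public:
    explicit Square(double side) : m_side(side) {}
    double Area() const override { return m_side * m_side; }
private:
    double m_side;
};

// class SpecialSquare : public Square {};   // would not compile
```

Unlike the non-virtual approach, final is checked by the compiler at the point where someone attempts to derive.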
2961Adhere to the Liskov Substitution Principle (LSP) In one of object-oriented programming’s
2962seminal papers, Barbara Liskov argued that you shouldn’t inherit from a
2963base class unless the derived class truly “is a†more specific version of the base class
2964(Liskov 1988). Andy Hunt and Dave Thomas summarize LSP like this: “Subclasses
2965must be usable through the base class interface without the need for the user to know
2966the difference†(Hunt and Thomas 2000).
29676.3 Design and Implementation Issues 145
2968In other words, all the routines defined in the base class should mean the same thing
2969when they’re used in each of the derived classes.
2970If you have a base class of Account and derived classes of CheckingAccount, SavingsAccount,
2971and AutoLoanAccount, a programmer should be able to invoke any of the routines
2972derived from Account on any of Account’s subtypes without caring about which
2973subtype a specific account object is.
If a program has been written so that the Liskov Substitution Principle is true, inheritance
is a powerful tool for reducing complexity because a programmer can focus on
the generic attributes of an object without worrying about the details. If a programmer
must be constantly thinking about semantic differences in subclass implementations,
then inheritance is increasing complexity rather than reducing it. Suppose a programmer
has to think this: "If I call the InterestRate() routine on CheckingAccount or SavingsAccount,
it returns the interest the bank pays, but if I call InterestRate() on
AutoLoanAccount I have to change the sign because it returns the interest the consumer
pays to the bank." According to LSP, AutoLoanAccount should not inherit from
the Account base class in this example because the semantics of the InterestRate() routine
are not the same as the semantics of the base class's InterestRate() routine.
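One way to picture the account example is the sketch below. It is only an illustration under stated assumptions: the rates, the AutoLoan class, and the TotalRatePaid() helper are hypothetical and do not come from the original text. Every class that derives from Account gives InterestRate() the same meaning, and the loan stays outside the hierarchy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: every Account subtype reports the rate the bank PAYS,
// so callers never need to know which subtype they hold.
class Account {
public:
    virtual ~Account() = default;
    // Contract: annual rate, in percent, that the bank pays the account holder.
    virtual double InterestRate() const = 0;
};

class CheckingAccount : public Account {
public:
    double InterestRate() const override { return 0.5; }
};

class SavingsAccount : public Account {
public:
    double InterestRate() const override { return 2.0; }
};

// An auto loan charges interest, so in this sketch it is deliberately NOT an Account.
class AutoLoan {
public:
    // Contract: annual rate, in percent, that the consumer pays the bank.
    double LoanRate() const { return 6.5; }
};

// Works for any Account subtype without checking the concrete type.
double TotalRatePaid(const std::vector<const Account*>& accounts) {
    double total = 0.0;
    for (const Account* account : accounts) {
        total += account->InterestRate();
    }
    return total;
}

void ExampleSubstitution() {
    CheckingAccount checking;
    SavingsAccount savings;
    std::vector<const Account*> accounts{&checking, &savings};
    assert(TotalRatePaid(accounts) == 2.5);
}
```

Because AutoLoan is not an Account, nobody can pass one to TotalRatePaid() and silently mix the two meanings of "interest rate."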
Be sure to inherit only what you want to inherit. A derived class can inherit member
routine interfaces, implementations, or both. Table 6-1 shows the variations of how
routines can be implemented and overridden.
As the table suggests, inherited routines come in three basic flavors:
■ An abstract overridable routine means that the derived class inherits the routine's
interface but not its implementation.
■ An overridable routine means that the derived class inherits the routine's interface
and a default implementation and it is allowed to override the default
implementation.
■ A non-overridable routine means that the derived class inherits the routine's interface
and its default implementation and it is not allowed to override the routine's
implementation.
Table 6-1 Variations on Inherited Routines

                                       Overridable                    Not Overridable
Implementation: Default Provided       Overridable Routine            Non-Overridable Routine
Implementation: No Default Provided    Abstract Overridable Routine   Not used (doesn't make sense to leave a routine
                                                                      undefined and not allow it to be overridden)
When you choose to implement a new class through inheritance, think through the
kind of inheritance you want for each member routine. Beware of inheriting implementation
just because you're inheriting an interface, and beware of inheriting an
interface just because you want to inherit an implementation. If you want to use a
class's implementation but not its interface, use containment rather than inheritance.
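The three flavors of inherited routine might look like this in C++. This is a minimal sketch; the Report classes and their strings are hypothetical names chosen only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical Report class showing the three flavors from Table 6-1.
class Report {
public:
    virtual ~Report() = default;

    // Abstract overridable: interface only, no default implementation.
    virtual std::string Body() const = 0;

    // Overridable: interface plus a default that derived classes may replace.
    virtual std::string Title() const { return "Untitled"; }

    // Non-overridable (non-virtual): interface and implementation are fixed.
    std::string Render() const { return Title() + ": " + Body(); }
};

class SalesReport : public Report {
public:
    std::string Body() const override { return "Q1 totals"; }
    std::string Title() const override { return "Sales"; }
};

class DraftReport : public Report {
public:
    std::string Body() const override { return "work in progress"; }
    // Title() is not overridden, so the default implementation is inherited.
};

void ExampleInheritedFlavors() {
    SalesReport sales;
    DraftReport draft;
    assert(sales.Render() == "Sales: Q1 totals");
    assert(draft.Render() == "Untitled: work in progress");
}
```

Deciding the flavor routine by routine, as the paragraph above recommends, shows up here as a deliberate choice among "= 0", "virtual" with a body, and a plain non-virtual member.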
Don't "override" a non-overridable member function. Both C++ and Java allow a programmer
to override a non-overridable member routine, kind of. If a function is private
in the base class, a derived class can create a function with the same name. To the
programmer reading the code in the derived class, such a function can create confusion
because it looks like it should be polymorphic, but it isn't; it just has the same
name. Another way to state this guideline is, "Don't reuse names of non-overridable
base-class routines in derived classes."
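The confusion is easy to reproduce. In the following sketch (Logger and AuditLogger are hypothetical names), the derived class reuses the name of a private, non-virtual base-class routine, and the base class keeps calling its own version.

```cpp
#include <cassert>
#include <string>

class Logger {
public:
    std::string Write(const std::string& message) const {
        return Prefix() + message;   // always calls Logger::Prefix()
    }

private:
    // Private and non-virtual: not overridable.
    std::string Prefix() const { return "[log] "; }
};

class AuditLogger : public Logger {
public:
    // Looks like an override but is an unrelated function that merely
    // shares the name; Logger::Write() never calls it.
    std::string Prefix() const { return "[audit] "; }
};

void ExampleNameReuse() {
    AuditLogger audit;
    assert(audit.Write("saved") == "[log] saved");   // not "[audit] saved"
    assert(audit.Prefix() == "[audit] ");
}
```

A reader of AuditLogger would reasonably expect "[audit] saved," which is exactly the misunderstanding the guideline tries to prevent.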
Move common interfaces, data, and behavior as high as possible in the inheritance
tree. The higher you move interfaces, data, and behavior, the more easily derived
classes can use them. How high is too high? Let abstraction be your guide. If you find
that moving a routine higher would break the higher object's abstraction, don't do it.
Be suspicious of classes of which there is only one instance. A single instance might
indicate that the design confuses objects with classes. Consider whether you could
just create an object instead of a new class. Can the variation of the derived class be
represented in data rather than as a distinct class? The Singleton pattern is one notable
exception to this guideline.
Be suspicious of base classes of which there is only one derived class. When I see a
base class that has only one derived class, I suspect that some programmer has been
"designing ahead," trying to anticipate future needs, usually without fully understanding
what those future needs are. The best way to prepare for future work is not to
design extra layers of base classes that "might be needed someday"; it's to make current
work as clear, straightforward, and simple as possible. That means not creating
any more inheritance structure than is absolutely necessary.
Be suspicious of classes that override a routine and do nothing inside the derived
routine. This typically indicates an error in the design of the base class. For instance,
suppose you have a class Cat and a routine Scratch() and suppose that you eventually
find out that some cats are declawed and can't scratch. You might be tempted to create
a class derived from Cat named ScratchlessCat and override the Scratch() routine to do
nothing. This approach presents several problems:
■ It violates the abstraction (interface contract) presented in the Cat class by
changing the semantics of its interface.
■ This approach quickly gets out of control when you extend it to other derived
classes. What happens when you find a cat without a tail? Or a cat that doesn't
catch mice? Or a cat that doesn't drink milk? Eventually you'll end up with
derived classes like ScratchlessTaillessMicelessMilklessCat.
■ Over time, this approach gives rise to code that's confusing to maintain because
the interfaces and behavior of the ancestor classes imply little or nothing about
the behavior of their descendants.
The place to fix this problem is not in the derived class, but in the original Cat class. Create
a Claws class and contain that within the Cat class. The root problem was the
assumption that all cats scratch, so fix that problem at the source, rather than just
bandaging it at the destination.
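A minimal sketch of that fix follows. The member names (Present(), CanScratch(), ScratchCount()) are assumptions made for the example; the text itself names only Cat, Claws, and Scratch().

```cpp
#include <cassert>

// Hypothetical sketch: whether a cat can scratch is data held by a contained
// Claws object, not a reason to derive a ScratchlessCat class.
class Claws {
public:
    explicit Claws(bool present) : present_(present) {}
    bool Present() const { return present_; }

private:
    bool present_;
};

class Cat {
public:
    explicit Cat(Claws claws) : claws_(claws) {}

    // Callers can ask first instead of assuming that every cat scratches.
    bool CanScratch() const { return claws_.Present(); }

    // Returns true if a scratch happened; a declawed cat reports false.
    bool Scratch() {
        if (!CanScratch()) {
            return false;
        }
        ++scratchCount_;
        return true;
    }

    int ScratchCount() const { return scratchCount_; }

private:
    Claws claws_;
    int scratchCount_ = 0;
};

void ExampleContainedClaws() {
    Cat alleyCat{Claws{true}};
    Cat declawedCat{Claws{false}};
    const bool alleyScratched = alleyCat.Scratch();
    const bool declawedScratched = declawedCat.Scratch();
    assert(alleyScratched && !declawedScratched);
    assert(alleyCat.ScratchCount() == 1);
    assert(declawedCat.ScratchCount() == 0);
}
```

There is one Cat class, and the variation lives in data, so a tailless or milkless cat would add another contained attribute rather than another derived class.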
Avoid deep inheritance trees. Object-oriented programming provides a large number
of techniques for managing complexity. But every powerful tool has its hazards, and
some object-oriented techniques have a tendency to increase complexity rather than
reduce it.
In his excellent book Object-Oriented Design Heuristics (1996), Arthur Riel suggests
limiting inheritance hierarchies to a maximum of six levels. Riel bases his recommendation
on the "magic number 7±2," but I think that's grossly optimistic. In my experience
most people have trouble juggling more than two or three levels of inheritance in
their brains at once. The "magic number 7±2" is probably better applied as a limit to
the total number of subclasses of a base class rather than the number of levels in an
inheritance tree.
Deep inheritance trees have been found to be significantly associated with increased
fault rates (Basili, Briand, and Melo 1996). Anyone who has ever tried to debug a complex
inheritance hierarchy knows why. Deep inheritance trees increase complexity,
which is exactly the opposite of what inheritance should be used to accomplish. Keep
the primary technical mission in mind. Make sure you're using inheritance to avoid
duplicating code and to minimize complexity.
Prefer polymorphism to extensive type checking. Frequently repeated case statements
sometimes suggest that inheritance might be a better design choice, although this is
not always true. Here is a classic example of code that cries out for a more object-oriented
approach:

C++ Example of a Case Statement That Probably Should Be Replaced by Polymorphism

    switch ( shape.type ) {
        case Shape_Circle:
            shape.DrawCircle();
            break;
        case Shape_Square:
            shape.DrawSquare();
            break;
        ...
    }
In this example, the calls to shape.DrawCircle() and shape.DrawSquare() should be
replaced by a single routine named shape.Draw(), which can be called regardless of
whether the shape is a circle or a square.
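The replacement could be sketched as follows. The Shape, Circle, and Square names follow the text; the DrawAll() helper is hypothetical, and Draw() returns a description instead of rendering so that the example stays small.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    // For this sketch, Draw() returns a description instead of rendering.
    virtual std::string Draw() const = 0;
};

class Circle : public Shape {
public:
    std::string Draw() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string Draw() const override { return "square"; }
};

// No switch on a type code: each shape knows how to draw itself.
std::string DrawAll(const std::vector<std::unique_ptr<Shape>>& shapes) {
    std::string output;
    for (const auto& shape : shapes) {
        output += shape->Draw();
        output += ';';
    }
    return output;
}

void ExamplePolymorphicDraw() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Circle));
    shapes.push_back(std::unique_ptr<Shape>(new Square));
    assert(DrawAll(shapes) == "circle;square;");
}
```

Adding a Triangle class would require no change to DrawAll(), whereas the case statement would need a new branch everywhere it is repeated.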
On the other hand, sometimes case statements are used to separate truly different
kinds of objects or behavior. Here is an example of a case statement that is appropriate
in an object-oriented program:

C++ Example of a Case Statement That Probably Should Not Be Replaced by Polymorphism

    switch ( ui.Command() ) {
        case Command_OpenFile:
            OpenFile();
            break;
        case Command_Print:
            Print();
            break;
        case Command_Save:
            Save();
            break;
        case Command_Exit:
            ShutDown();
            break;
        ...
    }
In this case, it would be possible to create a base class with derived classes and a polymorphic
DoCommand() routine for each command (as in the Command pattern). But
in a simple case like this one, the meaning of DoCommand() would be so diluted as to
be meaningless, and the case statement is the more understandable solution.
Make all data private, not protected. As Joshua Bloch says, "Inheritance breaks
encapsulation" (2001). When you inherit from an object, you obtain privileged access
to that object's protected routines and data. If the derived class really needs access to
the base class's attributes, provide protected accessor functions instead.
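Here is a hedged sketch of protected accessor functions. The no-negative-balance rule and the Withdraw() routine are assumptions invented for the example; the point is only that the data member stays private.

```cpp
#include <cassert>

class Account {
public:
    explicit Account(double balance) : balance_(balance) {}
    virtual ~Account() = default;
    double Balance() const { return balance_; }

protected:
    // Derived classes change the balance only through this accessor, so the
    // base class keeps enforcing its own rule (no negative balance).
    bool SetBalance(double newBalance) {
        if (newBalance < 0.0) {
            return false;
        }
        balance_ = newBalance;
        return true;
    }

private:
    double balance_;   // private, not protected
};

class SavingsAccount : public Account {
public:
    using Account::Account;   // reuse the base-class constructor
    bool Withdraw(double amount) { return SetBalance(Balance() - amount); }
};

void ExampleProtectedAccessor() {
    SavingsAccount savings(100.0);
    const bool firstAccepted = savings.Withdraw(30.0);
    const bool secondAccepted = savings.Withdraw(500.0);
    assert(firstAccepted && !secondAccepted);
    assert(savings.Balance() == 70.0);
}
```

If balance_ were protected, SavingsAccount could assign to it directly and bypass the check in SetBalance().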
Multiple Inheritance
The one indisputable fact about multiple inheritance in C++ is that it opens up a
Pandora's box of complexities that simply do not exist under single inheritance.
-- Scott Meyers
Inheritance is a power tool. It's like using a chain saw to cut down a tree instead of a
manual crosscut saw. It can be incredibly useful when used with care, but it's dangerous
in the hands of someone who doesn't observe proper precautions.
If inheritance is a chain saw, multiple inheritance is a 1950s-era chain saw with no
blade guard, no automatic shutoff, and a finicky engine. There are times when such a
tool is valuable; mostly, however, you're better off leaving the tool in the garage where
it can't do any damage.
Although some experts recommend broad use of multiple inheritance (Meyer 1997),
in my experience multiple inheritance is useful primarily for defining "mixins," simple
classes that are used to add a set of properties to an object. Mixins are called mixins
because they allow properties to be "mixed in" to derived classes. Mixins might be
classes like Displayable, Persistent, Serializable, or Sortable. Mixins are nearly always
abstract and aren't meant to be instantiated independently of other objects.
Mixins require the use of multiple inheritance, but they aren't subject to the classic
diamond-inheritance problem associated with multiple inheritance as long as all mixins
are truly independent of each other. They also make the design more comprehensible
by "chunking" attributes together. A programmer will have an easier time
understanding that an object uses the mixins Displayable and Persistent than understanding
that an object uses the 11 more-specific routines that would otherwise be
needed to implement those two properties.
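As a rough illustration, two independent mixins might be combined like this. Displayable and Persistent come from the text; the Invoice class and the routine names Display(), Serialize(), Show(), and Save() are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical mixins: small abstract classes, independent of each other.
class Displayable {
public:
    virtual ~Displayable() = default;
    virtual std::string Display() const = 0;
};

class Persistent {
public:
    virtual ~Persistent() = default;
    virtual std::string Serialize() const = 0;
};

// Invoice "mixes in" both properties through multiple inheritance. Because the
// mixins share no common base class, no diamond arises.
class Invoice : public Displayable, public Persistent {
public:
    explicit Invoice(int number) : number_(number) {}
    std::string Display() const override { return "Invoice #" + std::to_string(number_); }
    std::string Serialize() const override { return "invoice:" + std::to_string(number_); }

private:
    int number_;
};

// Client code depends only on the one property it needs.
std::string Show(const Displayable& item) { return item.Display(); }
std::string Save(const Persistent& item) { return item.Serialize(); }

void ExampleMixins() {
    Invoice invoice(42);
    assert(Show(invoice) == "Invoice #42");
    assert(Save(invoice) == "invoice:42");
}
```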
Java and Visual Basic recognize the value of mixins by allowing multiple inheritance
of interfaces but only single-class inheritance. C++ supports multiple inheritance of
both interface and implementation. Programmers should use multiple inheritance
only after carefully considering the alternatives and weighing the impact on system
complexity and comprehensibility.
Why Are There So Many Rules for Inheritance?
This section has presented numerous rules for staying out of trouble with inheritance.
The underlying message of all these rules is that inheritance tends to work against the primary
technical imperative you have as a programmer, which is to manage complexity. For the
sake of controlling complexity, you should maintain a heavy bias against inheritance.
Cross-Reference: For more on complexity, see "Software's Primary Technical Imperative: Managing Complexity" in Section 5.2.
Here's a summary of when to use inheritance and when to use containment:
■ If multiple classes share common data but not behavior, create a common object
that those classes can contain.
■ If multiple classes share common behavior but not data, derive them from a
common base class that defines the common routines.
■ If multiple classes share common data and behavior, inherit from a common
base class that defines the common data and routines.
■ Inherit when you want the base class to control your interface; contain when
you want to control your interface.
Member Functions and Data
Cross-Reference: For more discussion of routines in general, see Chapter 7, "High-Quality Routines."
Here are a few guidelines for implementing member functions and member data
effectively.
Keep the number of routines in a class as small as possible. A study of C++ programs
found that higher numbers of routines per class were associated with higher fault
rates (Basili, Briand, and Melo 1996). However, other competing factors were found to
be more significant, including deep inheritance trees, large number of routines called
within a class, and strong coupling between classes. Evaluate the tradeoff between
minimizing the number of routines and these other factors.
Disallow implicitly generated member functions and operators you don't want.
Sometimes you'll find that you want to disallow certain functions; perhaps you want
to disallow assignment, or you don't want to allow an object to be constructed. You
might think that, since the compiler generates operators automatically, you're stuck
allowing access. But in such cases you can disallow those uses by declaring the constructor,
assignment operator, or other function or operator private, which will prevent
clients from accessing it. (Making the constructor private is a standard technique
for defining a singleton class, which is discussed later in this chapter.)
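In C++ the technique described above might look like the following sketch. DatabaseConnection is a hypothetical class; the static_assert lines are there only to demonstrate that client code can no longer copy or assign the object.

```cpp
#include <cassert>
#include <type_traits>

class DatabaseConnection {
public:
    explicit DatabaseConnection(int handle) : handle_(handle) {}
    int Handle() const { return handle_; }

private:
    // Declared private and never defined: clients cannot copy or assign.
    // (Since C++11 the same intent is often written with "= delete".)
    DatabaseConnection(const DatabaseConnection&);
    DatabaseConnection& operator=(const DatabaseConnection&);

    int handle_;
};

static_assert(!std::is_copy_constructible<DatabaseConnection>::value,
              "copy construction is disallowed");
static_assert(!std::is_copy_assignable<DatabaseConnection>::value,
              "assignment is disallowed");

void ExampleNoCopies() {
    DatabaseConnection connection(7);
    assert(connection.Handle() == 7);
    // DatabaseConnection copy(connection);   // would not compile: copy constructor is private
}
```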
Minimize the number of different routines called by a class. One study found that
the number of faults in a class was statistically correlated with the total number of routines
that were called from within a class (Basili, Briand, and Melo 1996). The same
study found that the more classes a class used, the higher its fault rate tended to be.
These concepts are sometimes called "fan out."
Further Reading: Good accounts of the Law of Demeter can be found in Pragmatic Programmer
(Hunt and Thomas 2000), Applying UML and Patterns (Larman 2001), and Fundamentals
of Object-Oriented Design in UML (Page-Jones 2000).
Minimize indirect routine calls to other classes. Direct connections are hazardous
enough. Indirect connections, such as account.ContactPerson().DaytimeContactInfo().PhoneNumber(),
tend to be even more hazardous. Researchers have formulated
a rule called the "Law of Demeter" (Lieberherr and Holland 1989), which essentially
states that Object A can call any of its own routines. If Object A instantiates an Object
B, it can call any of Object B's routines. But it should avoid calling routines on objects
provided by Object B. In the account example above, that means account.ContactPerson()
is OK but account.ContactPerson().DaytimeContactInfo() is not.
This is a simplified explanation. See the additional resources at the end of this chapter
for more details.
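One possible way to remove the indirect chain from the account example is to let each class talk only to its direct collaborator. This is a sketch of one approach, not the only reading of the rule, and the routine names DaytimePhoneNumber() and ContactDaytimePhone() are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

class ContactInfo {
public:
    explicit ContactInfo(std::string phoneNumber) : phoneNumber_(std::move(phoneNumber)) {}
    std::string PhoneNumber() const { return phoneNumber_; }

private:
    std::string phoneNumber_;
};

class Person {
public:
    explicit Person(ContactInfo daytimeContact) : daytimeContact_(std::move(daytimeContact)) {}
    // Person talks to its own ContactInfo so that callers do not have to.
    std::string DaytimePhoneNumber() const { return daytimeContact_.PhoneNumber(); }

private:
    ContactInfo daytimeContact_;
};

class Account {
public:
    explicit Account(Person contactPerson) : contactPerson_(std::move(contactPerson)) {}
    // Account talks only to its direct collaborator, Person.
    std::string ContactDaytimePhone() const { return contactPerson_.DaytimePhoneNumber(); }

private:
    Person contactPerson_;
};

void ExampleDirectCallsOnly() {
    Account account{Person{ContactInfo{"555-0100"}}};
    assert(account.ContactDaytimePhone() == "555-0100");
}
```

The tradeoff is a few extra forwarding routines; in exchange, client code no longer depends on how Person stores its contact details.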
In general, minimize the extent to which a class collaborates with other classes. Try
to minimize all of the following:
■ Number of kinds of objects instantiated
■ Number of different direct routine calls on instantiated objects
■ Number of routine calls on objects returned by other instantiated objects
Constructors
Following are some guidelines that apply specifically to constructors. Guidelines for
constructors are pretty similar across languages (C++, Java, and Visual Basic, anyway).
Destructors vary more, so you should check out the materials listed in this chapter's
"Additional Resources" section for information on destructors.
Initialize all member data in all constructors, if possible. Initializing all data members
in all constructors is an inexpensive defensive programming practice.
Further Reading: The code to enforce a singleton in C++ would be similar to the Java
example below. For details, see More Effective C++, Item 26 (Meyers 1998).
Enforce the singleton property by using a private constructor. If you want to define a
class that allows only one object to be instantiated, you can enforce this by hiding all
the constructors of the class and then providing a static GetInstance() routine to access
the class's single instance. Here's an example of how that would work:
Java Example of Enforcing a Singleton with a Private Constructor

    public class MaxId {
        // constructors and destructors
        // Here is the private constructor.
        private MaxId() {
            ...
        }
        ...
        // public routines
        // Here is the public routine that provides access to the single instance.
        public static MaxId GetInstance() {
            return m_instance;
        }
        ...
        // private members
        // Here is the single instance.
        private static final MaxId m_instance = new MaxId();
        ...
    }
The private constructor is called only when the static object m_instance is initialized.
In this approach, if you want to reference the MaxId singleton, you would simply refer
to MaxId.GetInstance().
Prefer deep copies to shallow copies until proven otherwise. One of the major decisions
you'll make about complex objects is whether to implement deep copies or shallow
copies of the object. A deep copy of an object is a member-wise copy of the
object's member data; a shallow copy typically just points to or refers to a single reference
copy, although the specific meanings of "deep" and "shallow" vary.
The motivation for creating shallow copies is typically to improve performance.
Although creating multiple copies of large objects might be aesthetically offensive, it
rarely causes any measurable performance impact. A small number of objects might
cause performance issues, but programmers are notoriously poor at guessing which
code really causes problems. (For details, see Chapter 25, "Code-Tuning Strategies.")
Because it's a poor tradeoff to add complexity for dubious performance gains, a good
approach to deep vs. shallow copies is to prefer deep copies until proven otherwise.
Deep copies are simpler to code and maintain than shallow copies. In addition to the
code either kind of object would contain, shallow copies add code to count references,
ensure safe object copies, safe comparisons, safe deletes, and so on. This code can be
error-prone, and you should avoid it unless there's a compelling reason to create it.
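The difference in behavior can be seen in a small sketch. Customer, SharedCustomer, and Address are hypothetical classes, and std::shared_ptr stands in here for the hand-written reference counting that the text describes.

```cpp
#include <cassert>
#include <memory>

struct Address {
    int zipCode;
};

// Deep copy: each Customer owns its own Address.
class Customer {
public:
    explicit Customer(int zipCode) : address_(new Address{zipCode}) {}
    Customer(const Customer& other) : address_(new Address(*other.address_)) {}
    Customer& operator=(const Customer& other) {
        if (this != &other) {
            address_.reset(new Address(*other.address_));
        }
        return *this;
    }
    int ZipCode() const { return address_->zipCode; }
    void Relocate(int zipCode) { address_->zipCode = zipCode; }

private:
    std::unique_ptr<Address> address_;
};

// Shallow copy: copies share one reference-counted Address.
class SharedCustomer {
public:
    explicit SharedCustomer(int zipCode)
        : address_(std::make_shared<Address>(Address{zipCode})) {}
    int ZipCode() const { return address_->zipCode; }
    void Relocate(int zipCode) { address_->zipCode = zipCode; }

private:
    std::shared_ptr<Address> address_;
};

void ExampleDeepVersusShallow() {
    Customer original(98101);
    Customer deepCopy(original);
    original.Relocate(10001);
    assert(deepCopy.ZipCode() == 98101);      // unaffected by the change

    SharedCustomer shared(98101);
    SharedCustomer shallowCopy(shared);
    shared.Relocate(10001);
    assert(shallowCopy.ZipCode() == 10001);   // the change shows through
}
```

With the deep copy, a reader can reason about each object in isolation; with the shallow copy, every holder of the shared Address has to be considered whenever it changes.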
If you find that you do need to use a shallow-copy approach, Scott Meyers's More
Effective C++, Item 29 (1996) contains an excellent discussion of the issues in C++.
Martin Fowler's Refactoring (1999) describes the specific steps needed to convert
from shallow copies to deep copies and from deep copies to shallow copies. (Fowler
calls them reference objects and value objects.)
6.4 Reasons to Create a Class
Cross-Reference: Reasons for creating classes and routines overlap. See Section 7.1.
If you believe everything you read, you might get the idea that the only reason to create
a class is to model real-world objects. In practice, classes get created for many more
reasons than that. Here's a list of good reasons to create a class.
Cross-Reference: For more on identifying real-world objects, see "Find Real-World Objects" in Section 5.3.
Model real-world objects. Modeling real-world objects might not be the only reason
to create a class, but it's still a good reason! Create a class for each real-world object
type that your program models. Put the data needed for the object into the class, and
then build service routines that model the behavior of the object. See the discussion of
ADTs in Section 6.1 for examples.
Model abstract objects. Another good reason to create a class is to model an abstract
object: an object that isn't a concrete, real-world object but that provides an abstraction
of other concrete objects. A good example is the classic Shape object. Circle and
Square really exist, but Shape is an abstraction of other specific shapes.
On programming projects, the abstractions are not ready-made the way Shape is, so
we have to work harder to come up with clean abstractions. The process of distilling
abstract concepts from real-world entities is non-deterministic, and different designers
will abstract out different generalities. If we didn't know about geometric shapes like
circles, squares and triangles, for example, we might come up with more unusual
shapes like squash shape, rutabaga shape, and Pontiac Aztek shape. Coming up with
appropriate abstract objects is one of the major challenges in object-oriented design.
Reduce complexity. The single most important reason to create a class is to reduce a
program's complexity. Create a class to hide information so that you won't need to
think about it. Sure, you'll need to think about it when you write the class. But after it's
written, you should be able to forget the details and use the class without any knowledge
of its internal workings. Other reasons to create classes (minimizing code size,
improving maintainability, and improving correctness) are also good reasons, but
without the abstractive power of classes, complex programs would be impossible to
manage intellectually.
Isolate complexity. Complexity in all forms (complicated algorithms, large data sets,
intricate communications protocols, and so on) is prone to errors. If an error does
occur, it will be easier to find if it isn't spread through the code but is localized within
a class. Changes arising from fixing the error won't affect other code because only one
class will have to be fixed; other code won't be touched. If you find a better, simpler,
or more reliable algorithm, it will be easier to replace the old algorithm if it has been
isolated into a class. During development, it will be easier to try several designs and
keep the one that works best.
Hide implementation details. The desire to hide implementation details is a wonderful
reason to create a class whether the details are as complicated as a convoluted database
access or as mundane as whether a specific data member is stored as a number
or a string.
Limit effects of changes. Isolate areas that are likely to change so that the effects of
changes are limited to the scope of a single class or a few classes. Design so that areas
that are most likely to change are the easiest to change. Areas likely to change include
hardware dependencies, input/output, complex data types, and business rules. The
subsection titled "Hide Secrets (Information Hiding)" in Section 5.3 described several
common sources of change.
Cross-Reference: For a discussion of problems associated with using global data, see Section 13.3, "Global Data."
Hide global data. If you need to use global data, you can hide its implementation
details behind a class interface. Working with global data through access routines provides
several benefits compared to working with global data directly. You can change
the structure of the data without changing your program. You can monitor accesses to
the data. The discipline of using access routines also encourages you to think about
whether the data is really global; it often becomes apparent that the "global data" is
really just object data.
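A small sketch of access routines around a piece of global data follows. The SystemLimits class, the "maximum users" setting, and the read counter are hypothetical; they are meant only to show validation and monitoring happening in one place.

```cpp
#include <cassert>

// Hypothetical example: a program-wide "maximum users" setting that is
// reachable only through access routines.
class SystemLimits {
public:
    // Reading goes through a routine, so accesses can be monitored.
    static int MaxUsers() {
        ++readCount_;
        return maxUsers_;
    }

    // Writing goes through a routine, so bad values can be rejected.
    static bool SetMaxUsers(int value) {
        if (value <= 0) {
            return false;
        }
        maxUsers_ = value;
        return true;
    }

    static int ReadCount() { return readCount_; }

private:
    static int maxUsers_;    // how this is stored can change freely
    static int readCount_;
};

int SystemLimits::maxUsers_ = 100;
int SystemLimits::readCount_ = 0;

void ExampleGlobalAccess() {
    const bool accepted = SystemLimits::SetMaxUsers(250);
    const bool rejected = !SystemLimits::SetMaxUsers(-1);
    assert(accepted && rejected);
    assert(SystemLimits::MaxUsers() == 250);
}
```

If the setting later moves into a configuration file or an ordinary object, only the bodies of these routines need to change.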
Streamline parameter passing. If you're passing a parameter among several routines,
that might indicate a need to factor those routines into a class that shares the parameter
as object data. Streamlining parameter passing isn't a goal, per se, but passing lots of
data around suggests that a different class organization might work better.
Cross-Reference: For details on information hiding, see "Hide Secrets (Information Hiding)" in Section 5.3.
Make central points of control. It's a good idea to control each task in one place.
Control assumes many forms. Knowledge of the number of entries in a table is one
form. Control of devices (files, database connections, printers, and so on) is another.
Using one class to read from and write to a database is a form of centralized control.
If the database needs to be converted to a flat file or to in-memory data, the changes
will affect only one class.
The idea of centralized control is similar to information hiding, but it has unique heuristic
power that makes it worth adding to your programming toolbox.
Facilitate reusable code. Code put into well-factored classes can be reused in other
programs more easily than the same code embedded in one larger class. Even if a section
of code is called from only one place in the program and is understandable as
part of a larger class, it makes sense to put it into its own class if that piece of code
might be used in another program.
NASA's Software Engineering Laboratory studied ten projects that pursued reuse
aggressively (McGarry, Waligora, and McDermott 1989). In both the object-oriented
and the functionally oriented approaches, the initial projects weren't able to take
much of their code from previous projects because previous projects hadn't established
a sufficient code base. Subsequently, the projects that used functional design
were able to take about 35 percent of their code from previous projects. Projects that
used an object-oriented approach were able to take more than 70 percent of their code
from previous projects. If you can avoid writing 70 percent of your code by planning
ahead, do it!
Cross-Reference: For more on implementing the minimum amount of functionality required,
see "A program contains code that seems like it might be needed someday" in Section 24.2.
Notably, the core of NASA's approach to creating reusable classes does not involve
"designing for reuse." NASA identifies reuse candidates at the ends of their projects.
They then perform the work needed to make the classes reusable as a special project
at the end of the main project or as the first step in a new project. This approach helps
prevent "gold-plating," the creation of functionality that isn't required and that unnecessarily
adds complexity.
Plan for a family of programs. If you expect a program to be modified, it's a good
idea to isolate the parts that you expect to change by putting them into their own
classes. You can then modify the classes without affecting the rest of the program, or
you can put in completely new classes instead. Thinking through not just what one
program will look like but what the whole family of programs might look like is a
powerful heuristic for anticipating entire categories of changes (Parnas 1976).
Several years ago I managed a team that wrote a series of programs used by our clients
to sell insurance. We had to tailor each program to the specific client's insurance rates,
quote-report format, and so on. But many parts of the programs were similar: the
classes that input information about potential customers, that stored information in a
customer database, that looked up rates, that computed total rates for a group, and so
on. The team factored the program so that each part that varied from client to client
was in its own class. The initial programming might have taken three months or so,
but when we got a new client, we merely wrote a handful of new classes for the new
client and dropped them into the rest of the code. A few days' work and, voila, custom
software!
Package related operations. In cases in which you can't hide information, share data,
or plan for flexibility, you can still package sets of operations into sensible groups,
such as trig functions, statistical functions, string-manipulation routines, bit-manipulation
routines, graphics routines, and so on. Classes are one means of combining
related operations. You could also use packages, namespaces, or header files, depending
on the language you're working in.
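In C++, a namespace is one lightweight way to do this packaging. The StringUtil namespace and its two routines below are hypothetical examples of string-manipulation routines grouped together.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical string-manipulation package. The routines share no data,
// so a namespace groups them without pretending they form an object.
namespace StringUtil {

inline std::string ToUpper(std::string text) {
    for (char& ch : text) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return text;
}

inline std::string Repeat(const std::string& text, int count) {
    std::string result;
    for (int i = 0; i < count; ++i) {
        result += text;
    }
    return result;
}

}  // namespace StringUtil

void ExamplePackagedOperations() {
    assert(StringUtil::ToUpper("abc") == "ABC");
    assert(StringUtil::Repeat("ab", 3) == "ababab");
}
```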
Accomplish a specific refactoring. Many of the specific refactorings described in
Chapter 24, "Refactoring," result in new classes, including converting one class to
two, hiding a delegate, removing a middle man, and introducing an extension class.
These new classes could be motivated by a desire to better accomplish any of the
objectives described throughout this section.
Classes to Avoid
While classes in general are good, you can run into a few gotchas. Here are some
classes to avoid.
Avoid creating god classes. Avoid creating omniscient classes that are all-knowing
and all-powerful. If a class spends its time retrieving data from other classes using
Get() and Set() routines (that is, digging into their business and telling them what to
do), ask whether that functionality might better be organized into those other classes
rather than into the god class (Riel 1996).
Eliminate irrelevant classes. If a class consists only of data but no behavior, ask yourself
whether it's really a class and consider demoting it so that its member data just
becomes attributes of one or more other classes.
Cross-Reference: This kind of class is usually called a structure. For more on structures,
see Section 13.1, "Structures."
Avoid classes named after verbs. A class that has only behavior but no data is generally
not really a class. Consider turning a class like DatabaseInitialization() or
StringBuilder() into a routine on some other class.
Summary of Reasons to Create a Class
Here's a summary list of the valid reasons to create a class:
■ Model real-world objects
■ Model abstract objects
■ Reduce complexity
■ Isolate complexity
■ Hide implementation details
■ Limit effects of changes
■ Hide global data
■ Streamline parameter passing
■ Make central points of control
■ Facilitate reusable code
■ Plan for a family of programs
■ Package related operations
■ Accomplish a specific refactoring
6.5 Language-Specific Issues
Approaches to classes in different programming languages vary in interesting ways.
Consider how you override a member routine to achieve polymorphism in a derived
class. In Java, all routines are overridable by default and a routine must be declared
final to prevent a derived class from overriding it. In C++, routines are not overridable
by default. A routine must be declared virtual in the base class to be overridable. In
Visual Basic, a routine must be declared overridable in the base class and the derived
class should use the overrides keyword.
Here are some of the class-related areas that vary significantly depending on the
language:
■ Behavior of overridden constructors and destructors in an inheritance tree
■ Behavior of constructors and destructors under exception-handling conditions
■ Importance of default constructors (constructors with no arguments)
■ Time at which a destructor or finalizer is called
■ Wisdom of overriding the language's built-in operators, including assignment
and equality
■ How memory is handled as objects are created and destroyed or as they are
declared and go out of scope
Detailed discussions of these issues are beyond the scope of this book, but the "Additional
Resources" section points to good language-specific resources.
6.6 Beyond Classes: Packages
Cross-Reference: For more on the distinction between classes and packages, see
“Levels of Design” in Section 5.2.
Classes are currently the best way for programmers to achieve modularity. But modularity
is a big topic, and it extends beyond classes. Over the past several decades, software
development has advanced in large part by increasing the granularity of the aggregations
that we have to work with. The first aggregation we had was the statement, which
at the time seemed like a big step up from machine instructions. Then came subroutines,
and later came classes.
It’s evident that we could better support the goals of abstraction and encapsulation if
we had good tools for aggregating groups of objects. Ada supported the notion of
packages more than a decade ago, and Java supports packages today. If you’re programming
in a language that doesn’t support packages directly, you can create your
own poor-programmer’s version of a package and enforce it through programming
standards that include the following:
■ Naming conventions that differentiate which classes are public and which are
for the package’s private use
■ Naming conventions, code-organization conventions (project structure), or
both that identify which package each class belongs to
■ Rules that define which packages are allowed to use which other packages,
including whether the usage can be inheritance, containment, or both
These workarounds are good examples of the distinction between programming in a
language vs. programming into a language. For more on this distinction, see Section
34.4, “Program into Your Language, Not in It.”
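Here is one way such conventions might look in C++, which offers namespaces but no package-level access control. Everything in this sketch is an assumption made for illustration: the package name billing, the nested detail namespace used to mark private classes, and the classes CentsRounder and Invoice.

```cpp
#include <cmath>

// Hypothetical "package" named billing, expressed as a namespace.
namespace billing {

// Convention: classes in the nested "detail" namespace are for the package's
// private use. The compiler does not enforce this; standards and reviews do.
namespace detail {

class CentsRounder {
public:
    // Rounds an amount to the nearest hundredth.
    static double Round( double amount ) {
        return std::round( amount * 100.0 ) / 100.0;
    }
};

} // namespace detail

// Public to other packages: declared directly in the package namespace.
class Invoice {
public:
    Invoice( double subtotal, double taxRate )
        : subtotal_( subtotal ), taxRate_( taxRate ) {}

    // Total including tax, rounded by the package-private helper.
    double Total() const {
        return detail::CentsRounder::Round( subtotal_ * ( 1.0 + taxRate_ ) );
    }

private:
    double subtotal_;
    double taxRate_;
};

} // namespace billing
```

Code outside the package would be allowed to name billing::Invoice but not billing::detail::CentsRounder. That rule lives in the programming standard rather than in the language, which is what makes it a poor-programmer's package.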
cc2e.com/0672
Cross-Reference: This is a checklist of considerations about the quality of the
class. For a list of the steps used to build a class, see the checklist “The Pseudocode
Programming Process” in Chapter 9, page 233.
CHECKLIST: Class Quality
Abstract Data Types
❑ Have you thought of the classes in your program as abstract data types
and evaluated their interfaces from that point of view?
Abstraction
❑ Does the class have a central purpose?
❑ Is the class well named, and does its name describe its central purpose?
❑ Does the class’s interface present a consistent abstraction?
❑ Does the class’s interface make obvious how you should use the class?
❑ Is the class’s interface abstract enough that you don’t have to think about
how its services are implemented? Can you treat the class as a black box?
❑ Are the class’s services complete enough that other classes don’t have to
meddle with its internal data?
❑ Has unrelated information been moved out of the class?
❑ Have you thought about subdividing the class into component classes,
and have you subdivided it as much as you can?
❑ Are you preserving the integrity of the class’s interface as you modify the
class?
Encapsulation
❑ Does the class minimize accessibility to its members?
❑ Does the class avoid exposing member data?
❑ Does the class hide its implementation details from other classes as much
as the programming language permits?
❑ Does the class avoid making assumptions about its users, including its
derived classes?
❑ Is the class independent of other classes? Is it loosely coupled?
Inheritance
❑ Is inheritance used only to model “is a” relationships; that is, do derived
classes adhere to the Liskov Substitution Principle?
❑ Does the class documentation describe the inheritance strategy?
❑ Do derived classes avoid “overriding” non-overridable routines?
❑ Are common interfaces, data, and behavior as high as possible in the
inheritance tree?
❑ Are inheritance trees fairly shallow?
❑ Are all data members in the base class private rather than protected?
Other Implementation Issues
❑ Does the class contain about seven data members or fewer?
❑ Does the class minimize direct and indirect routine calls to other classes?
❑ Does the class collaborate with other classes only to the extent absolutely
necessary?
❑ Is all member data initialized in the constructor?
❑ Is the class designed to be used as deep copies rather than shallow copies
unless there’s a measured reason to create shallow copies?
Language-Specific Issues
❑ Have you investigated the language-specific issues for classes in your specific
programming language?
Additional Resources
Classes in General
cc2e.com/0679
Meyer, Bertrand. Object-Oriented Software Construction, 2d ed. New York, NY: Prentice
Hall PTR, 1997. This book contains an in-depth discussion of abstract data types and
explains how they form the basis for classes. Chapters 14–16 discuss inheritance in
depth. Meyer provides an argument in favor of multiple inheritance in Chapter 15.
Riel, Arthur J. Object-Oriented Design Heuristics. Reading, MA: Addison-Wesley, 1996.
This book contains numerous suggestions for improving program design, mostly at the
class level. I avoided the book for several years because it appeared to be too big; talk
about people in glass houses! However, the body of the book is only about 200 pages
long. Riel’s writing is accessible and enjoyable. The content is focused and practical.
C++
cc2e.com/0686
Meyers, Scott. Effective C++: 50 Specific Ways to Improve Your Programs and Designs, 2d
ed. Reading, MA: Addison-Wesley, 1998.
Meyers, Scott. More Effective C++: 35 New Ways to Improve Your Programs and
Designs. Reading, MA: Addison-Wesley, 1996. Both of Meyers’ books are canonical references
for C++ programmers. The books are entertaining and help to instill a
language-lawyer’s appreciation for the nuances of C++.
Java
cc2e.com/0693
Bloch, Joshua. Effective Java Programming Language Guide. Boston, MA: Addison-Wesley,
2001. Bloch’s book provides much good Java-specific advice as well as introducing
more general, good object-oriented practices.
Visual Basic
cc2e.com/0600
The following books are good references on classes in Visual Basic:
Foxall, James. Practical Standards for Microsoft Visual Basic .NET. Redmond, WA:
Microsoft Press, 2003.
Cornell, Gary, and Jonathan Morrison. Programming VB .NET: A Guide for Experienced
Programmers. Berkeley, CA: Apress, 2002.
Barwell, Fred, et al. Professional VB.NET, 2d ed. Wrox, 2002.
Key Points
■ Class interfaces should provide a consistent abstraction. Many problems arise
from violating this single principle.
■ A class interface should hide something: a system interface, a design decision,
or an implementation detail.
■ Containment is usually preferable to inheritance unless you’re modeling an “is
a” relationship.
■ Inheritance is a useful tool, but it adds complexity, which is counter to Software’s
Primary Technical Imperative of managing complexity.
■ Classes are your primary tool for managing complexity. Give their design as
much attention as needed to accomplish that objective.
Chapter 7
High-Quality Routines
cc2e.com/0778
Contents
■ 7.1 Valid Reasons to Create a Routine
■ 7.2 Design at the Routine Level
■ 7.3 Good Routine Names
■ 7.4 How Long Can a Routine Be?
■ 7.5 How to Use Routine Parameters
■ 7.6 Special Considerations in the Use of Functions
■ 7.7 Macro Routines and Inline Routines
Related Topics
■ Steps in routine construction: Section 9.3
■ Working classes: Chapter 6
■ General design techniques: Chapter 5
■ Software architecture: Section 3.5
Chapter 6 described the details of creating classes. This chapter zooms in on routines,
on the characteristics that make the difference between a good routine and a bad one.
If you’d rather read about issues that affect the design of routines before wading into
the nitty-gritty details, be sure to read Chapter 5, “Design in Construction,” first and
come back to this chapter later. Some important attributes of high-quality routines are
also discussed in Chapter 8, “Defensive Programming.” If you’re more interested in
reading about steps to create routines and classes, Chapter 9, “The Pseudocode Programming
Process,” might be a better place to start.
Before jumping into the details of high-quality routines, it will be useful to nail down
two basic terms. What is a “routine”? A routine is an individual method or procedure
invocable for a single purpose. Examples include a function in C++, a method in Java,
and a function or sub procedure in Microsoft Visual Basic. For some uses, macros in C and
C++ can also be thought of as routines. You can apply many of the techniques for creating
a high-quality routine to these variants.
What is a high-quality routine? That’s a harder question. Perhaps the easiest answer is
to show what a high-quality routine is not. Here’s an example of a low-quality routine:
C++ Example of a Low-Quality Routine
void HandleStuff( CORP_DATA & inputRec, int crntQtr, EMP_DATA empRec,
double & estimRevenue, double ytdRevenue, int screenX, int screenY,
COLOR_TYPE & newColor, COLOR_TYPE & prevColor, StatusType & status,
int expenseType )
{
int i;
for ( i = 0; i < 100; i++ ) {
inputRec.revenue[i] = 0;
inputRec.expense[i] = corpExpense[ crntQtr ][ i ];
}
UpdateCorpDatabase( empRec );
estimRevenue = ytdRevenue * 4.0 / (double) crntQtr;
newColor = prevColor;
status = SUCCESS;
if ( expenseType == 1 ) {
for ( i = 0; i < 12; i++ )
profit[i] = revenue[i] - expense.type1[i];
}
else if ( expenseType == 2 ) {
profit[i] = revenue[i] - expense.type2[i];
}
else if ( expenseType == 3 )
profit[i] = revenue[i] - expense.type3[i];
}
What’s wrong with this routine? Here’s a hint: you should be able to find at least 10
different problems with it. Once you’ve come up with your own list, look at the following
list:
■ The routine has a bad name. HandleStuff() tells you nothing about what the routine
does.
■ The routine isn’t documented. (The subject of documentation extends beyond
the boundaries of individual routines and is discussed in Chapter 32,
“Self-Documenting Code.”)
■ The routine has a bad layout. The physical organization of the code on the page
gives few hints about its logical organization. Layout strategies are used haphazardly,
with different styles in different parts of the routine. Compare the styles
where expenseType == 2 and expenseType == 3. (Layout is discussed in Chapter 31,
“Layout and Style.”)
■ The routine’s input variable, inputRec, is changed. If it’s an input variable, its
value should not be modified (and in C++ it should be declared const). If the
value of the variable is supposed to be modified, the variable should not be
called inputRec.
■ The routine reads and writes global variables: it reads from corpExpense and
writes to profit. It should communicate with other routines more directly than
by reading and writing global variables.
■ The routine doesn’t have a single purpose. It initializes some variables, writes to
a database, and does some calculations, none of which seem to be related to each
other in any way. A routine should have a single, clearly defined purpose.
■ The routine doesn’t defend itself against bad data. If crntQtr equals 0, the expression
ytdRevenue * 4.0 / (double) crntQtr causes a divide-by-zero error.
■ The routine uses several magic numbers: 100, 4.0, 12, 2, and 3. Magic numbers
are discussed in Section 12.1, “Numbers in General.”
■ Some of the routine’s parameters are unused: screenX and screenY are not referenced
within the routine.
■ One of the routine’s parameters is passed incorrectly: prevColor is labeled as a
reference parameter (&) even though it isn’t assigned a value within the routine.
■ The routine has too many parameters. The upper limit for an understandable
number of parameters is about 7; this routine has 11. The parameters are laid
out in such an unreadable way that most people wouldn’t try to examine them
closely or even count them.
■ The routine’s parameters are poorly ordered and are not documented. (Parameter
ordering is discussed in this chapter. Documentation is discussed in Chapter 32.)
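To make a few of these points concrete, here is a hedged sketch of how just the revenue-estimate fragment of HandleStuff() might look as a routine of its own. The name EstimateAnnualRevenue() and the constant QUARTERS_PER_YEAR are invented for the illustration. The text doesn't say how the program should react to a bad quarter, so returning 0.0 is purely an assumption.

```cpp
// Named constant replaces the magic number 4.0.
const int QUARTERS_PER_YEAR = 4;

// Projects full-year revenue from year-to-date revenue.
// Both inputs are passed by value, so the caller's data is never modified.
// Assumption: an out-of-range quarter yields 0.0; a real program might
// instead report the error to its caller.
double EstimateAnnualRevenue( double ytdRevenue, int currentQuarter ) {
    if ( currentQuarter < 1 || currentQuarter > QUARTERS_PER_YEAR ) {
        return 0.0;
    }
    return ytdRevenue * QUARTERS_PER_YEAR / currentQuarter;
}
```

The routine has one purpose, two parameters, no global data, and a guard against the divide-by-zero case. The rest of HandleStuff() would need the same treatment, one responsibility at a time.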
cc2e.com/0799
Cross-Reference: The class is also a good contender for the single greatest invention
in computer science. For details on how to use classes effectively, see Chapter 6,
“Working Classes.”
Aside from the computer itself, the routine is the single greatest invention in computer
science. The routine makes programs easier to read and easier to understand than any
other feature of any programming language, and it’s a crime to abuse this senior
statesman of computer science with code like that in the example just shown.
The routine is also the greatest technique ever invented for saving space and improving
performance. Imagine how much larger your code would be if you had to repeat
the code for every call to a routine instead of branching to the routine. Imagine how
hard it would be to make performance improvements in the same code used in a
dozen places instead of making them all in one routine. The routine makes modern
programming possible.
“OK,” you say, “I already know that routines are great, and I program with them all the
time. This discussion seems kind of remedial, so what do you want me to do about it?”
I want you to understand that many valid reasons to create a routine exist and that
there are right ways and wrong ways to go about it. As an undergraduate computer-science
student, I thought that the main reason to create a routine was to avoid duplicate
code. The introductory textbook I used said that routines were good because the
avoidance of duplication made a program easier to develop, debug, document, and
maintain. Period. Aside from syntactic details about how to use parameters and local
variables, that was the extent of the textbook’s coverage. It was not a good or complete
explanation of the theory and practice of routines. The following sections contain a
much better explanation.
7.1 Valid Reasons to Create a Routine
Here’s a list of valid reasons to create a routine. The reasons overlap somewhat, and
they’re not intended to make an orthogonal set.
Reduce complexity. The single most important reason to create a routine is to reduce
a program’s complexity. Create a routine to hide information so that you won’t need
to think about it. Sure, you’ll need to think about it when you write the routine. But
after it’s written, you should be able to forget the details and use the routine without
any knowledge of its internal workings. Other reasons to create routines (minimizing
code size, improving maintainability, and improving correctness) are also good reasons,
but without the abstractive power of routines, complex programs would be
impossible to manage intellectually.
One indication that a routine needs to be broken out of another routine is deep nesting
of an inner loop or a conditional. Reduce the containing routine’s complexity by
pulling the nested part out and putting it into its own routine.
Introduce an intermediate, understandable abstraction. Putting a section of code
into a well-named routine is one of the best ways to document its purpose. Instead of
reading a series of statements like
if ( node <> NULL ) then
   while ( node.next <> NULL ) do
      node = node.next
   end while
   leafName = node.name
else
   leafName = ""
end if
you can read a statement like this:
leafName = GetLeafName( node )
The new routine is so short that nearly all it needs for documentation is a good name.
The name introduces a higher level of abstraction than the original eight lines of code,
which makes the code more readable and easier to understand, and it reduces complexity
within the routine that originally contained the code.
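For readers who want to see the routine itself, here is a minimal C++ sketch of GetLeafName(). The Node structure is hypothetical, since the text doesn't define one. The sketch reads the name once the walk has stopped, so a chain of a single node also yields a name.

```cpp
#include <string>

// Hypothetical singly linked node type; the text does not define one.
struct Node {
    std::string name;
    const Node * next;
};

// Returns the name of the last node in the chain, or "" for an empty chain.
std::string GetLeafName( const Node * node ) {
    if ( node == nullptr ) {
        return "";
    }
    while ( node->next != nullptr ) {
        node = node->next;
    }
    return node->name;
}
```

The caller sees only leafName = GetLeafName( node ), and the pointer walking stays out of sight.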
Avoid duplicate code. Undoubtedly the most popular reason for creating a routine is
to avoid duplicate code. Indeed, creation of similar code in two routines implies an
error in decomposition. Pull the duplicate code from both routines, put a generic version
of the common code into a base class, and then move the two specialized routines
into subclasses. Alternatively, you could migrate the common code into its own
routine, and then let both call the part that was put into the new routine. With code in
one place, you save the space that would have been used by duplicated code. Modifications
will be easier because you’ll need to modify the code in only one location. The
code will be more reliable because you’ll have to check only one place to ensure that
the code is right. Modifications will be more reliable because you’ll avoid making successive
and slightly different modifications under the mistaken assumption that
you’ve made identical ones.
Support subclassing. You need less new code to override a short, well-factored routine
than a long, poorly factored routine. You’ll also reduce the chance of error in subclass
implementations if you keep overridable routines simple.
Hide sequences. It’s a good idea to hide the order in which events happen to be processed.
For example, if the program typically gets data from the user and then gets
auxiliary data from a file, neither the routine that gets the user data nor the routine
that gets the file data should depend on the other routine’s being performed first.
Another example of a sequence might be found when you have two lines of code that
read the top of a stack and decrement a stackTop variable. Put those two lines of code
into a PopStack() routine to hide the assumption about the order in which the two
operations must be performed. Hiding that assumption will be better than baking it
into code from one end of the system to the other.
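A small C++ sketch of the stack example follows. The class IntStack, the limit MAX_STACK_DEPTH, and the choice to throw on an empty or full stack are all assumptions made for illustration; only PopStack() and the stackTop variable come from the text. In this version stackTop_ indexes the next free slot, so the decrement happens before the read, and that is exactly the kind of detail callers should never need to know.

```cpp
#include <stdexcept>

const int MAX_STACK_DEPTH = 16;   // arbitrary limit chosen for the sketch

// Hypothetical fixed-size stack of ints.
class IntStack {
public:
    void PushStack( int value ) {
        if ( stackTop_ >= MAX_STACK_DEPTH ) {
            throw std::overflow_error( "stack is full" );
        }
        items_[ stackTop_ ] = value;
        stackTop_++;
    }

    // The two-step sequence lives here and nowhere else.
    int PopStack() {
        if ( stackTop_ == 0 ) {
            throw std::underflow_error( "stack is empty" );
        }
        stackTop_--;                   // step 1: move down to the top element
        return items_[ stackTop_ ];    // step 2: read it
    }

    bool IsEmpty() const { return stackTop_ == 0; }

private:
    int items_[ MAX_STACK_DEPTH ] = {};
    int stackTop_ = 0;                 // index of the next free slot
};
```

If the representation later changes so that stackTop_ points at the top element instead, only PushStack() and PopStack() have to change.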
Hide pointer operations. Pointer operations tend to be hard to read and error prone.
By isolating them in routines, you can concentrate on the intent of the operation
rather than on the mechanics of pointer manipulation. Also, if the operations are done
in only one place, you can be more certain that the code is correct. If you find a better
data type than pointers, you can change the program without traumatizing the code
that would have used the pointers.
Improve portability. Use of routines isolates nonportable capabilities, explicitly identifying
and isolating future portability work. Nonportable capabilities include nonstandard
language features, hardware dependencies, operating-system dependencies, and so on.
Simplify complicated boolean tests. Understanding complicated boolean tests in
detail is rarely necessary for understanding program flow. Putting such a test into a
function makes the code more readable because (1) the details of the test are out of
the way and (2) a descriptive function name summarizes the purpose of the test.
Giving the test a function of its own emphasizes its significance. It encourages extra
effort to make the details of the test readable inside its function. The result is that both
the main flow of the code and the test itself become clearer. Simplifying a boolean test
is an example of reducing complexity, which was discussed earlier.
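As a hedged illustration, suppose a program has to decide whether to ask the user about saving a document. The Document structure, its fields, and the function NeedsSavePrompt() are all invented for this sketch.

```cpp
// Hypothetical record; the fields are invented for illustration.
struct Document {
    bool isOpen;
    bool hasUnsavedChanges;
    bool isReadOnly;
    int  pageCount;
};

// Names the purpose of the test; the details stay out of the caller's way.
bool NeedsSavePrompt( const Document & doc ) {
    return doc.isOpen
        && doc.hasUnsavedChanges
        && !doc.isReadOnly
        && doc.pageCount > 0;
}
```

The calling code then reads if ( NeedsSavePrompt( doc ) ) rather than repeating the four-part expression, and the expression itself sits where it can be laid out one condition per line.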
Improve performance. You can optimize the code in one place instead of in several
places. Having code in one place will make it easier to profile to find inefficiencies.
Centralizing code into a routine means that a single optimization benefits all the code
that uses that routine, whether it uses it directly or indirectly. Having code in one
place makes it practical to recode the routine with a more efficient algorithm or in a
faster, more efficient language.
Cross-Reference: For details on information hiding, see “Hide Secrets (Information
Hiding)” in Section 5.3.
To ensure all routines are small? No. With so many good reasons for putting code
into a routine, this one is unnecessary. In fact, some jobs are performed better in a single
large routine. (The best length for a routine is discussed in Section 7.4, “How Long
Can a Routine Be?”)
Operations That Seem Too Simple to Put Into Routines
One of the strongest mental blocks to creating effective routines is a reluctance to create
a simple routine for a simple purpose. Constructing a whole routine to contain two
or three lines of code might seem like overkill, but experience shows how helpful a
good small routine can be.
Small routines offer several advantages. One is that they improve readability. I once
had the following single line of code in about a dozen places in a program:
Pseudocode Example of a Calculation
points = deviceUnits * ( POINTS_PER_INCH / DeviceUnitsPerInch() )
This is not the most complicated line of code you’ll ever read. Most people would
eventually figure out that it converts a measurement in device units to a measurement
in points. They would see that each of the dozen lines did the same thing. It could
have been clearer, however, so I created a well-named routine to do the conversion in
one place:
Pseudocode Example of a Calculation Converted to a Function
Function DeviceUnitsToPoints( deviceUnits: Integer ): Integer
   DeviceUnitsToPoints = deviceUnits *
      ( POINTS_PER_INCH / DeviceUnitsPerInch() )
End Function
When the routine was substituted for the inline code, the dozen lines of code all
looked more or less like this one:
Pseudocode Example of a Function Call to a Calculation Function
points = DeviceUnitsToPoints( deviceUnits )
This line is more readable, even approaching self-documenting.
This example hints at another reason to put small operations into functions: small
operations tend to turn into larger operations. I didn’t know it when I wrote the routine,
but under certain conditions and when certain devices were active,
DeviceUnitsPerInch() returned 0. That meant I had to account for division by zero, which
took three more lines of code:
Pseudocode Example of a Calculation That Expands Under Maintenance
Function DeviceUnitsToPoints( deviceUnits: Integer ): Integer
   if ( DeviceUnitsPerInch() <> 0 )
      DeviceUnitsToPoints = deviceUnits *
         ( POINTS_PER_INCH / DeviceUnitsPerInch() )
   else
      DeviceUnitsToPoints = 0
   end if
End Function
If that original line of code had still been in a dozen places, the test would have been
repeated a dozen times, for a total of 36 new lines of code. A simple routine reduced
the 36 new lines to 3.
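A C++ rendering of the same routine might look like the sketch below. Several things in it are my assumptions rather than part of the original program: the device resolution is passed in as a parameter instead of coming from a DeviceUnitsPerInch() call, so that the sketch is self-contained. POINTS_PER_INCH is given the conventional value of 72. The ratio is computed in double because integer division would truncate a ratio such as 72 / 300 to 0.

```cpp
const double POINTS_PER_INCH = 72.0;   // assumed value: 72 points per inch

// Converts a length in device units to points.
// Assumption: the device resolution arrives as a parameter rather than
// from a DeviceUnitsPerInch() call.
// Returns 0 when the resolution is reported as 0, as in the pseudocode.
double DeviceUnitsToPoints( int deviceUnits, int deviceUnitsPerInch ) {
    if ( deviceUnitsPerInch == 0 ) {
        return 0.0;
    }
    return deviceUnits * ( POINTS_PER_INCH / deviceUnitsPerInch );
}
```

A call such as DeviceUnitsToPoints( 288, 144 ) converts two inches' worth of device units, and the zero-resolution guard exists in exactly one place.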
Summary of Reasons to Create a Routine
Here’s a summary list of the valid reasons for creating a routine:
■ Reduce complexity
■ Introduce an intermediate, understandable abstraction
■ Avoid duplicate code
■ Support subclassing
■ Hide sequences
■ Hide pointer operations
■ Improve portability
■ Simplify complicated boolean tests
■ Improve performance
In addition, many of the reasons to create a class are also good reasons to create a routine:
■ Isolate complexity
■ Hide implementation details
■ Limit effects of changes
■ Hide global data
■ Make central points of control
■ Facilitate reusable code
■ Accomplish a specific refactoring
7.2 Design at the Routine Level
The idea of cohesion was introduced in a paper by Wayne Stevens, Glenford Myers,
and Larry Constantine (1974). Other more modern concepts, including abstraction
and encapsulation, tend to yield more insight at the class level (and have, in fact,
largely superseded cohesion at the class level), but cohesion is still alive and well as
the workhorse design heuristic at the individual-routine level.
Cross-Reference: For a discussion of cohesion in general, see “Aim for Strong
Cohesion” in Section 5.3.
For routines, cohesion refers to how closely the operations in a routine are related.
Some programmers prefer the term “strength”: how strongly related are the operations
in a routine? A function like Cosine() is perfectly cohesive because the whole routine
is dedicated to performing one function. A function like CosineAndTan() has lower
cohesion because it tries to do more than one thing. The goal is to have each routine
do one thing well and not do anything else.
The payoff is higher reliability. One study of 450 routines found that 50 percent of the
highly cohesive routines were fault free, whereas only 18 percent of routines with low
cohesion were fault free (Card, Church, and Agresti 1986). Another study of a different
450 routines (which is just an unusual coincidence) found that routines with the
highest coupling-to-cohesion ratios had 7 times as many errors as those with the lowest
coupling-to-cohesion ratios and were 20 times as costly to fix (Selby and Basili
1991).
Discussions about cohesion typically refer to several levels of cohesion. Understanding
the concepts is more important than remembering specific terms. Use the concepts
as aids in thinking about how to make routines as cohesive as possible.
Functional cohesion is the strongest and best kind of cohesion, occurring when a routine
performs one and only one operation. Examples of highly cohesive routines
include sin(), GetCustomerName(), EraseFile(), CalculateLoanPayment(), and
AgeFromBirthdate(). Of course, this evaluation of their cohesion assumes that the routines do
what their names say they do; if they do anything else, they are less cohesive and
poorly named.
Several other kinds of cohesion are normally considered to be less than ideal:
■ Sequential cohesion exists when a routine contains operations that must be performed
in a specific order, that share data from step to step, and that don’t make
up a complete function when done together.
An example of sequential cohesion is a routine that, given a birth date, calculates
an employee’s age and time to retirement. If the routine calculates the age and
then uses that result to calculate the employee’s time to retirement, it has
sequential cohesion. If the routine calculates the age and then calculates the
time to retirement in a completely separate computation that happens to use the
same birth-date data, it has only communicational cohesion.
How would you make the routine functionally cohesive? You’d create separate
routines to compute an employee’s age given a birth date and compute time to
retirement given a birth date. The time-to-retirement routine could call the age
routine. They’d both have functional cohesion. Other routines could call either
routine or both routines.
■ Communicational cohesion occurs when operations in a routine make use of the
same data and aren’t related in any other way. If a routine prints a summary
report and then reinitializes the summary data passed into it, the routine has
communicational cohesion: the two operations are related only by the fact that
they use the same data.
To give this routine better cohesion, the summary data should be reinitialized
close to where it’s created, which shouldn’t be in the report-printing routine.
Split the operations into individual routines. The first prints the report. The second
reinitializes the data, close to the code that creates or modifies the data. Call
both routines from the higher-level routine that originally called the communicationally
cohesive routine.
■ Temporal cohesion occurs when operations are combined into a routine because
they are all done at the same time. Typical examples would be Startup(), CompleteNewEmployee(),
and Shutdown(). Some programmers consider temporal
cohesion to be unacceptable because it’s sometimes associated with bad programming
practices such as having a hodgepodge of code in a Startup() routine.
To avoid this problem, think of temporal routines as organizers of other events.
The Startup() routine, for example, might read a configuration file, initialize a
scratch file, set up a memory manager, and show an initial screen. To make it
most effective, have the temporally cohesive routine call other routines to perform
specific activities rather than performing the operations directly itself. That
way, it will be clear that the point of the routine is to orchestrate activities rather
than to do them directly.
This example raises the issue of choosing a name that describes the routine at
the right level of abstraction. You could decide to name the routine
ReadConfigFileInitScratchFileEtc(), which would imply that the routine had only coincidental
cohesion. If you name it Startup(), however, it would be clear that it had a
single purpose and clear that it had functional cohesion.
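The fix suggested for the sequential-cohesion example in the list above, in which the time-to-retirement routine calls the age routine, can be sketched in C++ as follows. The Date structure, the today parameter, the name YearsToRetirement(), and the RETIREMENT_AGE value of 65 are assumptions made for the illustration; only AgeFromBirthdate() is named in the text.

```cpp
// Hypothetical calendar date; no validation is attempted in this sketch.
struct Date {
    int year;
    int month;
    int day;
};

const int RETIREMENT_AGE = 65;   // assumed policy value, not from the text

// Returns the age in whole years on the date "today".
int AgeFromBirthdate( const Date & birthDate, const Date & today ) {
    int age = today.year - birthDate.year;
    bool birthdayStillAhead =
        today.month < birthDate.month ||
        ( today.month == birthDate.month && today.day < birthDate.day );
    if ( birthdayStillAhead ) {
        age--;
    }
    return age;
}

// Returns whole years until retirement, or 0 if that age has been reached.
// Builds on AgeFromBirthdate() instead of repeating its calculation.
int YearsToRetirement( const Date & birthDate, const Date & today ) {
    int yearsLeft = RETIREMENT_AGE - AgeFromBirthdate( birthDate, today );
    return yearsLeft > 0 ? yearsLeft : 0;
}
```

Each routine does one thing, and a caller that needs only the age never pays for, or depends on, the retirement calculation.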
The remaining kinds of cohesion are generally unacceptable. They result in code
that’s poorly organized, hard to debug, and hard to modify. If a routine has bad cohesion,
it’s better to put effort into a rewrite to have better cohesion than to invest in a
pinpoint diagnosis of the problem. Knowing what to avoid can be useful, however, so
here are the unacceptable kinds of cohesion:
■ Procedural cohesion occurs when operations in a routine are done in a specified
order. An example is a routine that gets an employee name, then an address, and
then a phone number. The order of these operations is important only because
it matches the order in which the user is asked for the data on the input screen.
Another routine gets the rest of the employee data. The routine has procedural
cohesion because it puts a set of operations in a specified order and the operations
don’t need to be combined for any other reason.
To achieve better cohesion, put the separate operations into their own routines.
Make sure that the calling routine has a single, complete job: GetEmployee()
rather than GetFirstPartOfEmployeeData(). You’ll probably need to modify the
routines that get the rest of the data too. It’s common to modify two or more
original routines before you achieve functional cohesion in any of them.
■ Logical cohesion occurs when several operations are stuffed into the same routine
and one of the operations is selected by a control flag that’s passed in. It’s called
logical cohesion because the control flow or “logic” of the routine is the only
thing that ties the operations together: they’re all in a big if statement or case
statement together. It isn’t because the operations are logically related in any
other sense. Considering that the defining attribute of logical cohesion is that
the operations are unrelated, a better name might be “illogical cohesion.”
One example would be an InputAll() routine that inputs customer names,
employee timecard information, or inventory data depending on a flag passed to
the routine. Other examples would be ComputeAll(), EditAll(), PrintAll(), and
SaveAll(). The main problem with such routines is that you shouldn’t need to
pass in a flag to control another routine’s processing. Instead of having a routine
that does one of three distinct operations, depending on a flag passed to it, it’s
cleaner to have three routines, each of which does one distinct operation. If the
operations use some of the same code or share data, the code should be moved
into a lower-level routine and the routines should be packaged into a class.
Cross-Reference Although the routine might have better cohesion, a higher-level
design issue is whether the system should be using a case statement instead of
polymorphism. For more on this issue, see “Replace conditionals with polymorphism
(especially repeated case statements)” in Section 24.3.
It’s usually all right, however, to create a logically cohesive routine if its code consists
solely of a series of if or case statements and calls to other routines. In such
a case, if the routine’s only function is to dispatch commands and it doesn’t do
any of the processing itself, that’s usually a good design. The technical term for
this kind of routine is “event handler.” An event handler is often used in interactive
environments such as the Apple Macintosh, Microsoft Windows, and other
GUI environments.
■ Coincidental cohesion occurs when the operations in a routine have no discernible
relationship to each other. Other good names are “no cohesion” or “chaotic cohesion.”
The low-quality C++ routine at the beginning of this chapter had coincidental
cohesion. It’s hard to convert coincidental cohesion to any better kind of
cohesion; you usually need to do a deeper redesign and reimplementation.
None of these terms are magical or sacred. Learn the ideas rather than the terminology.
It’s nearly always possible to write routines with functional cohesion, so focus
your attention on functional cohesion for maximum benefit.
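As a rough illustration of the advice on logical cohesion, the following C++ sketch replaces a flag-controlled InputAll() routine with three separate routines plus a dispatcher that does none of the processing itself. The names InputKind, InputCustomerNames(), InputTimecards(), InputInventory(), and DispatchInput() are hypothetical, and the routine bodies are placeholders that only report what they would read.

```cpp
#include <string>

// Hypothetical command codes, used only for this illustration.
enum class InputKind { CustomerNames, Timecards, Inventory };

// Each routine does one distinct job, so each can have functional cohesion.
// The bodies are placeholders that just name the data they would read.
std::string InputCustomerNames() { return "customer names"; }
std::string InputTimecards()     { return "employee timecards"; }
std::string InputInventory()     { return "inventory data"; }

// A dispatcher in the "event handler" style: it only selects a routine
// and does none of the processing itself.
std::string DispatchInput( InputKind kind ) {
    switch ( kind ) {
        case InputKind::CustomerNames: return InputCustomerNames();
        case InputKind::Timecards:     return InputTimecards();
        case InputKind::Inventory:     return InputInventory();
    }
    return "";  // Not reached for valid enumerators.
}
```

Callers that already know which data they want can call the specific routine directly and never pass a flag at all.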
7.3 Good Routine Names
Cross-Reference For details on naming variables, see Chapter 11, “The Power of
Variable Names.”
A good name for a routine clearly describes everything the routine does. Here are
guidelines for creating effective routine names:
Describe everything the routine does. In the routine’s name, describe all the outputs
and side effects. If a routine computes report totals and opens an output file,
ComputeReportTotals() is not an adequate name for the routine.
ComputeReportTotalsAndOpenOutputFile() is an adequate name but is too long and silly.
If you have routines with side effects, you’ll have many long, silly names. The cure
is not to use less-descriptive routine names; the cure is to program so that you cause
things to happen directly rather than with side effects.
Avoid meaningless, vague, or wishy-washy verbs. Some verbs are elastic, stretched to
cover just about any meaning. Routine names like HandleCalculation(), PerformServices(),
OutputUser(), ProcessInput(), and DealWithOutput() don’t tell you what the routines
do. At the most, these names tell you that the routines have something to do
with calculations, services, users, input, and output. The exception would be when
the verb “handle” was used in the specific technical sense of handling an event.
Sometimes the only problem with a routine is that its name is wishy-washy; the routine
itself might actually be well designed. If HandleOutput() is replaced with FormatAndPrintOutput(),
you have a pretty good idea of what the routine does.
In other cases, the verb is vague because the operations performed by the routine are
vague. The routine suffers from a weakness of purpose, and the weak name is a symptom.
If that’s the case, the best solution is to restructure the routine and any related
routines so that they all have stronger purposes and stronger names that accurately
describe them.
Don’t differentiate routine names solely by number. One developer wrote all his
code in one big function. Then he took every 15 lines and created functions named
Part1, Part2, and so on. After that, he created one high-level function that called each
part. This method of creating and naming routines is especially egregious (and rare, I
hope). But programmers sometimes use numbers to differentiate routines with names
like OutputUser, OutputUser1, and OutputUser2. The numerals at the ends of these
names provide no indication of the different abstractions the routines represent, and
the routines are thus poorly named.
Make names of routines as long as necessary. Research shows that the optimum
average length for a variable name is 9 to 15 characters. Routines tend to be more
complicated than variables, and good names for them tend to be longer. On the other
hand, routine names are often attached to object names, which essentially provides
part of the name for free. Overall, the emphasis when creating a routine name should
be to make the name as clear as possible, which means you should make its name as
long or short as needed to make it understandable.
Cross-Reference For the distinction between procedures and functions, see
Section 7.6, “Special Considerations in the Use of Functions,” later in this chapter.
To name a function, use a description of the return value. A function returns a value,
and the function should be named for the value it returns. For example, cos(),
customerId.Next(), printer.IsReady(), and pen.CurrentColor() are all good function
names that indicate precisely what the functions return.
To name a procedure, use a strong verb followed by an object. A procedure with
functional cohesion usually performs an operation on an object. The name should
reflect what the procedure does, and an operation on an object implies a
verb-plus-object name. PrintDocument(), CalcMonthlyRevenues(), CheckOrderInfo(), and
RepaginateDocument() are samples of good procedure names.
In object-oriented languages, you don’t need to include the name of the object in the
procedure name because the object itself is included in the call. You invoke routines
with statements like document.Print(), orderInfo.Check(), and monthlyRevenues.Calc().
Names like document.PrintDocument() are redundant and can become inaccurate
when they’re carried through to derived classes. If Check is a class derived from Document,
check.Print() seems clearly to be printing a check, whereas check.PrintDocument()
sounds like it might be printing a checkbook register or monthly statement, but it
doesn’t sound like it’s printing a check.
Cross-Reference For a similar list of opposites in variable names, see “Common
Opposites in Variable Names” in Section 11.1.
Use opposites precisely. Using naming conventions for opposites helps consistency,
which helps readability. Opposite-pairs like first/last are commonly understood.
Opposite-pairs like FileOpen() and _lclose() are not symmetrical and are confusing.
Here are some common opposites:
    add/remove        increment/decrement    open/close
    begin/end         insert/delete          show/hide
    create/destroy    lock/unlock            source/target
    first/last        min/max                start/stop
    get/put           next/previous          up/down
    get/set           old/new
Establish conventions for common operations. In some systems, it’s important to distinguish
among different kinds of operations. A naming convention is often the easiest
and most reliable way of indicating these distinctions.
The code on one of my projects assigned each object a unique identifier. We neglected
to establish a convention for naming the routines that would return the object identifier,
so we had routine names like these:
    employee.id.Get()
    dependent.GetId()
    supervisor()
    candidate.id()
The Employee class exposed its id object, which in turn exposed its Get() routine. The
Dependent class exposed a GetId() routine. The Supervisor class made the id its default
return value. The Candidate class made use of the fact that the id object’s default
return value was the id, and exposed the id object. By the middle of the project, no one
could remember which of these routines was supposed to be used on which object,
but by that time too much code had been written to go back and make everything consistent.
Consequently, every person on the team had to devote an unnecessary
amount of gray matter to remembering the inconsequential detail of which syntax
was used on which class to retrieve the id. A naming convention for retrieving ids
would have eliminated this annoyance.
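A convention of this kind could be as small as the following C++ sketch, which assumes (purely for illustration) that every class with an identifier exposes it through a routine named Id(). The class bodies are hypothetical and are not the code from the project described above.

```cpp
// Assumed convention for this sketch: the identifier is always retrieved
// with a routine named Id(), whatever the class.
class Employee {
public:
    explicit Employee( int id ) : id_( id ) {}
    int Id() const { return id_; }
private:
    int id_;
};

class Dependent {
public:
    explicit Dependent( int id ) : id_( id ) {}
    int Id() const { return id_; }
private:
    int id_;
};
```

With one spelling for the operation, employee.Id() and dependent.Id() read the same way, and nobody has to remember a per-class syntax.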
7.4 How Long Can a Routine Be?
On their way to America, the Pilgrims argued about the best maximum length for a
routine. After arguing about it for the entire trip, they arrived at Plymouth Rock and
started to draft the Mayflower Compact. They still hadn’t settled the maximum-length
question, and since they couldn’t disembark until they’d signed the compact, they
gave up and didn’t include it. The result has been an interminable debate ever since
about how long a routine can be.
The theoretical best maximum length is often described as one screen or one or two
pages of program listing, approximately 50 to 150 lines. In this spirit, IBM once limited
routines to 50 lines, and TRW limited them to two pages (McCabe 1976). Modern
programs tend to have volumes of extremely short routines mixed in with a few longer
routines. Long routines are far from extinct, however. Shortly before finishing this
book, I visited two client sites within a month. Programmers at one site were wrestling
with a routine that was about 4,000 lines of code long, and programmers at the other
site were trying to tame a routine that was more than 12,000 lines long!
A mountain of research on routine length has accumulated over the years, some of
which is applicable to modern programs, and some of which isn’t:
■ A study by Basili and Perricone found that routine size was inversely correlated
with errors: as the size of routines increased (up to 200 lines of code), the number
of errors per line of code decreased (Basili and Perricone 1984).
■ Another study found that routine size was not correlated with errors, even
though structural complexity and amount of data were correlated with errors
(Shen et al. 1985).
■ A 1986 study found that small routines (32 lines of code or fewer) were not correlated
with lower cost or fault rate (Card, Church, and Agresti 1986; Card and
Glass 1990). The evidence suggested that larger routines (65 lines of code or
more) were cheaper to develop per line of code.
■ An empirical study of 450 routines found that small routines (those with fewer
than 143 source statements, including comments) had 23 percent more errors
per line of code than larger routines but were 2.4 times less expensive to fix than
larger routines (Selby and Basili 1991).
■ Another study found that code needed to be changed least when routines averaged
100 to 150 lines of code (Lind and Vairavan 1989).
■ A study at IBM found that the most error-prone routines were those that were
larger than 500 lines of code. Beyond 500 lines, the error rate tended to be proportional
to the size of the routine (Jones 1986a).
Where does all this leave the question of routine length in object-oriented programs?
A large percentage of routines in object-oriented programs will be accessor routines,
which will be very short. From time to time, a complex algorithm will lead to a longer
routine, and in those circumstances, the routine should be allowed to grow organically
up to 100–200 lines. (A line is a noncomment, nonblank line of source code.)
Decades of evidence say that routines of such length are no more error prone than
shorter routines. Let issues such as the routine’s cohesion, depth of nesting, number
of variables, number of decision points, number of comments needed to explain the
routine, and other complexity-related considerations dictate the length of the routine
rather than imposing a length restriction per se.
That said, if you want to write routines longer than about 200 lines, be careful. None
of the studies that reported decreased cost, decreased error rates, or both with larger
routines distinguished among sizes larger than 200 lines, and you’re bound to run
into an upper limit of understandability as you pass 200 lines of code.
7.5 How to Use Routine Parameters
Interfaces between routines are some of the most error-prone areas of a program. One
often-cited study by Basili and Perricone (1984) found that 39 percent of all errors
were internal interface errors, that is, errors in communication between routines. Here
are a few guidelines for minimizing interface problems:
Cross-Reference For details on documenting routine parameters, see “Commenting
Routines” in Section 32.5. For details on formatting parameters, see Section
31.7, “Laying Out Routines.”
Put parameters in input-modify-output order. Instead of ordering parameters randomly
or alphabetically, list the parameters that are input-only first, input-and-output
second, and output-only third. This ordering implies the sequence of operations happening
within the routine: inputting data, changing it, and sending back a result. Here
are examples of parameter lists in Ada:
Ada Example of Parameters in Input-Modify-Output Order
    -- Ada uses in and out keywords to make input and output parameters clear.
    procedure InvertMatrix(
       originalMatrix: in Matrix;
       resultMatrix: out Matrix
    );
    ...
    procedure ChangeSentenceCase(
       desiredCase: in StringCase;
       sentence: in out Sentence
    );
    ...
    procedure PrintPageNumber(
       pageNumber: in Integer;
       status: out StatusType
    );
This ordering convention conflicts with the C-library convention of putting the modified
parameter first. The input-modify-output convention makes more sense to me,
but if you consistently order parameters in some way, you will still do the readers of
your code a service.
Consider creating your own in and out keywords. Other modern languages don’t
support the in and out keywords as Ada does. In those languages, you might still be
able to use the preprocessor to create your own in and out keywords:
C++ Example of Defining Your Own In and Out Keywords
    #define IN
    #define OUT
    void InvertMatrix(
       IN Matrix originalMatrix,
       OUT Matrix *resultMatrix
    );
    ...
    void ChangeSentenceCase(
       IN StringCase desiredCase,
       IN OUT Sentence *sentenceToEdit
    );
    ...
    void PrintPageNumber(
       IN int pageNumber,
       OUT StatusType &status
    );
In this case, the IN and OUT macro-keywords are used for documentation purposes.
To make the value of a parameter changeable by the called routine, the parameter still
needs to be passed as a pointer or as a reference parameter.
Before adopting this technique, be sure to consider a pair of significant drawbacks. Defining
your own IN and OUT keywords extends the C++ language in a way that will be unfamiliar
to most people reading your code. If you extend the language this way, be sure to
do it consistently, preferably projectwide. A second limitation is that the IN and OUT keywords
won’t be enforceable by the compiler, which means that you could potentially
label a parameter as IN and then modify it inside the routine anyway. That could lull a
reader of your code into assuming code is correct when it isn’t. Using C++’s const keyword
will normally be the preferable means of identifying input-only parameters.
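To make the const alternative concrete, here is a minimal C++ sketch. It assumes a Matrix type defined as a vector of rows, and it uses a hypothetical TransposeMatrix() routine rather than InvertMatrix() only because transposition is short to write. The const reference marks the input-only parameter in a way the compiler enforces; the non-const reference marks the output.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Matrix type used in the examples above.
using Matrix = std::vector<std::vector<double>>;

// originalMatrix is input-only: the compiler rejects any attempt to modify it.
// resultMatrix is output-only: it is overwritten with the transpose.
// Assumes every row has the same length and that the two arguments are
// distinct objects.
void TransposeMatrix( const Matrix &originalMatrix, Matrix &resultMatrix ) {
    resultMatrix.clear();
    if ( originalMatrix.empty() ) {
        return;
    }
    resultMatrix.assign( originalMatrix[ 0 ].size(),
                         std::vector<double>( originalMatrix.size() ) );
    for ( std::size_t row = 0; row < originalMatrix.size(); ++row ) {
        for ( std::size_t col = 0; col < originalMatrix[ row ].size(); ++col ) {
            resultMatrix[ col ][ row ] = originalMatrix[ row ][ col ];
        }
    }
}
```

Unlike the IN macro, a statement such as originalMatrix.clear() inside this routine would fail to compile.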
If several routines use similar parameters, put the similar parameters in a consistent
order. The order of routine parameters can be a mnemonic, and inconsistent order
can make parameters hard to remember. For example, in C, the fprintf() routine is the
same as the printf() routine except that it adds a file as the first argument. A similar
routine, fputs(), is the same as puts() except that it adds a file as the last argument. This
is an aggravating, pointless difference that makes the parameters of these routines
harder to remember than they need to be.
On the other hand, the routine strncpy() in C takes the arguments target string, source
string, and maximum number of bytes, in that order, and the routine memcpy() takes
the same arguments in the same order. The similarity between the two routines helps
in remembering the parameters in either routine.
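The same idea applies to your own routines. In the following C++ sketch, CopyName() and CopyRecord() are hypothetical helpers that deliberately keep the target, source, size order of strncpy() and memcpy(), so a reader who knows one ordering knows them all.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: same target, source, size order as strncpy().
// Always leaves target null-terminated when targetSize is at least 1.
void CopyName( char *target, const char *source, std::size_t targetSize ) {
    if ( targetSize == 0 ) {
        return;
    }
    std::strncpy( target, source, targetSize - 1 );
    target[ targetSize - 1 ] = '\0';  // strncpy() does not always terminate.
}

// Hypothetical helper: same target, source, size order as memcpy().
void CopyRecord( void *target, const void *source, std::size_t recordSize ) {
    std::memcpy( target, source, recordSize );
}
```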
Use all the parameters. If you pass a parameter to a routine, use it. If you aren’t using
it, remove the parameter from the routine interface. Unused parameters are correlated
with an increased error rate. In one study, 46 percent of routines with no unused variables
had no errors, and only 17 to 29 percent of routines with more than one unreferenced
variable had no errors (Card, Church, and Agresti 1986).
This rule to remove unused parameters has one exception. If you’re compiling part of
your program conditionally, you might compile out parts of a routine that use a certain
parameter. Be nervous about this practice, but if you’re convinced it works, that’s
OK too. In general, if you have a good reason not to use a parameter, go ahead and
leave it in place. If you don’t have a good reason, make the effort to clean up the code.
Put status or error variables last. By convention, status variables and variables that
indicate an error has occurred go last in the parameter list. They are incidental to the
main purpose of the routine, and they are output-only parameters, so it’s a sensible
convention.
Don’t use routine parameters as working variables. It’s dangerous to use the parameters
passed to a routine as working variables. Use local variables instead. For example,
in the following Java fragment, the variable inputVal is improperly used to store
intermediate results of a computation:
Java Example of Improper Use of Input Parameters
    int Sample( int inputVal ) {
       inputVal = inputVal * CurrentMultiplier( inputVal );
       inputVal = inputVal + CurrentAdder( inputVal );
       ...
       // At this point, inputVal no longer contains the value that was input.
       return inputVal;
    }
4318}
4319In this code fragment, inputVal is misleading because by the time execution reaches the
4320last line, inputVal no longer contains the input value; it contains a computed value based
4321in part on the input value, and it is therefore misnamed. If you later need to modify the
4322routine to use the original input value in some other place, you’ll probably use inputVal
4323and assume that it contains the original input value when it actually doesn’t.
4324How do you solve the problem? Can you solve it by renaming inputVal? Probably not.
4325You could name it something like workingVal, but that’s an incomplete solution because
4326the name fails to indicate that the variable’s original value comes from outside the routine.
4327You could name it something ridiculous like inputValThatBecomesWorkingVal or
4328give up completely and name it x or val, but all these approaches are weak.
4329A better approach is to avoid current and future problems by using working variables
4330explicitly. The following code fragment demonstrates the technique:
Java Example of Good Use of Input Parameters
    int Sample( int inputVal ) {
       int workingVal = inputVal;
       workingVal = workingVal * CurrentMultiplier( workingVal );
       workingVal = workingVal + CurrentAdder( workingVal );
       ...
       // If you need to use the original value of inputVal here or
       // somewhere else, it's still available.
       ...
       return workingVal;
    }
4343}
4344Introducing the new variable workingVal clarifies the role of inputVal and eliminates
4345the chance of erroneously using inputVal at the wrong time. (Don’t take this reasoning
4346as a justification for literally naming a variable inputVal or workingVal. In general,
4347inputVal and workingVal are terrible names for variables, and these names are used in
4348this example only to make the variables’ roles clear.)
4349Assigning the input value to a working variable emphasizes where the value comes
4350from. It eliminates the possibility that a variable from the parameter list will be modified
4351accidentally. In C++, this practice can be enforced by the compiler using the keyword
4352const. If you designate a parameter as const, you’re not allowed to modify its value
4353within a routine.
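A C++ version of the Sample() routine, with the parameter declared const, might look like the sketch below. CurrentMultiplier() and CurrentAdder() are given arbitrary, hypothetical bodies here only so that the fragment is complete; the Java example above does not say what they compute.

```cpp
// Hypothetical helpers with arbitrary bodies, for illustration only.
int CurrentMultiplier( int value ) { return ( value < 0 ) ? -2 : 2; }
int CurrentAdder( int value )      { return ( value % 2 == 0 ) ? 10 : 1; }

int Sample( const int inputVal ) {
    // inputVal = inputVal * 2;   // Would not compile: inputVal is const.
    int workingVal = inputVal;
    workingVal = workingVal * CurrentMultiplier( workingVal );
    workingVal = workingVal + CurrentAdder( workingVal );
    // The original value of inputVal is still available here.
    return workingVal;
}
```

Because the parameter is passed by value, the const affects only the routine’s own body; callers are not restricted in what they pass.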
Cross-Reference For details on interface assumptions, see the introduction to
Chapter 8, “Defensive Programming.” For details on documentation, see Chapter
32, “Self-Documenting Code.”
Document interface assumptions about parameters. If you assume the data being
passed to your routine has certain characteristics, document the assumptions as you
make them. It’s not a waste of effort to document your assumptions both in the routine
itself and in the place where the routine is called. Don’t wait until you’ve written
the routine to go back and write the comments; you won’t remember all your assumptions.
Even better than commenting your assumptions, use assertions to put them
into code.
What kinds of interface assumptions about parameters should you document?
■ Whether parameters are input-only, modified, or output-only
■ Units of numeric parameters (inches, feet, meters, and so on)
■ Meanings of status codes and error values if enumerated types aren’t used
■ Ranges of expected values
■ Specific values that should never appear
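Several of the assumptions in the list above can be written down both as comments and as assertions. The following C++ sketch shows one way to do that; AverageSpeed(), its units, and its ranges are hypothetical choices made only for this example.

```cpp
#include <cassert>

// Hypothetical routine. Interface assumptions:
//   distanceInMeters  input-only; meters; expected range 0 to 100,000
//   elapsedSeconds    input-only; seconds; must never be zero or negative
// Returns the average speed in meters per second.
double AverageSpeed( const double distanceInMeters, const double elapsedSeconds ) {
    // The assertions put the documented assumptions into code.
    assert( distanceInMeters >= 0.0 && distanceInMeters <= 100000.0 );
    assert( elapsedSeconds > 0.0 );
    return distanceInMeters / elapsedSeconds;
}
```

Putting the units into the parameter names is a further, optional way of keeping that assumption in front of every caller.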
Limit the number of a routine’s parameters to about seven. Seven is a magic number
for people’s comprehension. Psychological research has found that people generally
cannot keep track of more than about seven chunks of information at once (Miller
1956). This discovery has been applied to an enormous number of disciplines, and it
seems safe to conjecture that most people can’t keep track of more than about seven
routine parameters at once.
In practice, how much you can limit the number of parameters depends on how your
language handles complex data types. If you program in a modern language that supports
structured data, you can pass a composite data type containing 13 fields and
think of it as one mental “chunk” of data. If you program in a more primitive language,
you might need to pass all 13 fields individually.
Cross-Reference For details on how to think about interfaces, see “Good Abstraction”
in Section 6.2.
If you find yourself consistently passing more than a few arguments, the coupling
among your routines is too tight. Design the routine or group of routines to reduce the
coupling. If you are passing the same data to many different routines, group the routines
into a class and treat the frequently used data as class data.
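Both ideas, passing a composite type as one chunk and turning frequently passed data into class data, can be sketched in C++ as follows. PageLayout and PageFormatter are hypothetical names, and the three fields are stand-ins for a longer list.

```cpp
// Hypothetical composite type: related fields travel as one "chunk".
struct PageLayout {
    int widthInPoints;
    int heightInPoints;
    int marginInPoints;
};

// Routines that all need the same data are grouped into a class, and the
// shared data becomes class data instead of a parameter on every call.
class PageFormatter {
public:
    explicit PageFormatter( const PageLayout &layout ) : layout_( layout ) {}

    int PrintableWidth() const {
        return layout_.widthInPoints - 2 * layout_.marginInPoints;
    }
    int PrintableHeight() const {
        return layout_.heightInPoints - 2 * layout_.marginInPoints;
    }

private:
    PageLayout layout_;
};
```

Here neither PrintableWidth() nor PrintableHeight() takes any parameters, although each depends on several values.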
Consider an input, modify, and output naming convention for parameters. If you
find that it’s important to distinguish among input, modify, and output parameters,
establish a naming convention that identifies them. You could prefix them with i_,
m_, and o_. If you’re feeling verbose, you could prefix them with Input_, Modify_, and
Output_.
Pass the variables or objects that the routine needs to maintain its interface
abstraction. There are two competing schools of thought about how to pass members
of an object to a routine. Suppose you have an object that exposes data through 10
access routines and the called routine needs three of those data elements to do its job.
Proponents of the first school of thought argue that only the three specific elements
needed by the routine should be passed. They argue that doing this will keep the connections
between routines to a minimum; reduce coupling; and make them easier to
understand, reuse, and so on. They say that passing the whole object to a routine violates
the principle of encapsulation by potentially exposing all 10 access routines to
the routine that’s called.
Proponents of the second school argue that the whole object should be passed. They
argue that the interface can remain more stable if the called routine has the flexibility
to use additional members of the object without changing the routine’s interface.
They argue that passing three specific elements violates encapsulation by exposing
which specific data elements the routine is using.
I think both these rules are simplistic and miss the most important consideration:
what abstraction is presented by the routine’s interface? If the abstraction is that the routine
expects you to have three specific data elements, and it is only a coincidence that
those three elements happen to be provided by the same object, then you should pass
the three specific data elements individually. However, if the abstraction is that you
will always have that particular object in hand and the routine will do something or
other with that object, then you truly do break the abstraction when you expose the
three specific data elements.
If you’re passing the whole object and you find yourself creating the object, populating
it with the three elements needed by the called routine, and then pulling those elements
out of the object after the routine is called, that’s an indication that you should
be passing the three specific elements rather than the whole object. (In general, code
that “sets up” for a call to a routine or “takes down” after a call to a routine is an indication
that the routine is not well designed.)
If you find yourself frequently changing the parameter list to the routine, with the
parameters coming from the same object each time, that’s an indication that you
should be passing the whole object rather than specific elements.
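The two cases can be put side by side in a short C++ sketch. The Employee class and the routines GrossAmount() and PayStubHeading() are hypothetical; what matters is which abstraction each routine’s interface presents.

```cpp
#include <string>

// Hypothetical class, used only for this illustration.
class Employee {
public:
    Employee( const std::string &name, double hourlyRate, double hoursWorked )
        : name_( name ), hourlyRate_( hourlyRate ), hoursWorked_( hoursWorked ) {}
    const std::string &Name() const { return name_; }
    double HourlyRate() const  { return hourlyRate_; }
    double HoursWorked() const { return hoursWorked_; }
private:
    std::string name_;
    double hourlyRate_;
    double hoursWorked_;
};

// Abstraction: "multiply a rate by a quantity." It is a coincidence that
// both values come from an Employee, so they are passed individually.
double GrossAmount( double rate, double quantity ) {
    return rate * quantity;
}

// Abstraction: "do something with this employee." The routine could later
// use other members without changing its interface, so the object is passed.
std::string PayStubHeading( const Employee &employee ) {
    return "Pay stub for " + employee.Name();
}
```

GrossAmount() stays usable for anything that has a rate and a quantity, while PayStubHeading() can grow to use more of the employee without any change at its call sites.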
Use named parameters. In some languages, you can explicitly associate formal parameters
with actual parameters. This makes parameter usage more self-documenting and
helps avoid errors from mismatching parameters. Here’s an example in Visual Basic:
Visual Basic Example of Explicitly Identifying Parameters
    ' Here's where the formal parameters are declared.
    Private Function Distance3d( _
       ByVal xDistance As Coordinate, _
       ByVal yDistance As Coordinate, _
       ByVal zDistance As Coordinate _
    )
       ...
    End Function
    ...
    Private Function Velocity( _
       ByVal latitude as Coordinate, _
       ByVal longitude as Coordinate, _
       ByVal elevation as Coordinate _
    )
       ...
       ' Here's where the actual parameters are mapped to the formal parameters.
       Distance = Distance3d( xDistance := latitude, yDistance := longitude, _
          zDistance := elevation )
       ...
    End Function
This technique is especially useful when you have longer-than-average lists of identically
typed arguments, which increases the chances that you can insert a parameter
mismatch without the compiler detecting it. Explicitly associating parameters may be
overkill in many environments, but in safety-critical or other high-reliability environments
the extra assurance that parameters match up the way you expect can be
worthwhile.
Make sure actual parameters match formal parameters. Formal parameters, also
known as “dummy parameters,” are the variables declared in a routine definition.
Actual parameters are the variables, constants, or expressions used in the actual routine
calls.
A common mistake is to put the wrong type of variable in a routine call, for example,
using an integer when a floating point is needed. (This is a problem only in weakly
typed languages like C when you’re not using full compiler warnings. Strongly typed
languages such as C++ and Java don’t have this problem.) When arguments are input
only, this is seldom a problem; usually the compiler converts the actual type to the formal
type before passing it to the routine. If it is a problem, usually your compiler gives
you a warning. But in some cases, particularly when the argument is used for both
input and output, you can get stung by passing the wrong type of argument.
Develop the habit of checking types of arguments in parameter lists and heeding compiler
warnings about mismatched parameter types.
7.6 Special Considerations in the Use of Functions
Modern languages such as C++, Java, and Visual Basic support both functions and procedures.
A function is a routine that returns a value; a procedure is a routine that does not.
In C++, all routines are typically called “functions”; however, a function with a void return
type is semantically a procedure. The distinction between functions and procedures is as
much a semantic distinction as a syntactic one, and semantics should be your guide.
When to Use a Function and When to Use a Procedure
Purists argue that a function should return only one value, just as a mathematical function
does. This means that a function would take only input parameters and return its
only value through the function itself. The function would always be named for the value
it returned, as sin(), CustomerID(), and ScreenHeight() are. A procedure, on the other
hand, could take input, modify, and output parameters, as many of each as it wanted to.
A common programming practice is to have a function that operates as a procedure and
returns a status value. Logically, it works as a procedure, but because it returns a value,
it’s officially a function. For example, you might have a routine called FormatOutput()
used with a report object in statements like this one:
    if ( report.FormatOutput( formattedReport ) = Success ) then ...
In this example, report.FormatOutput() operates as a procedure in that it has an output
parameter, formattedReport, but it is technically a function because the routine itself
returns a value. Is this a valid way to use a function? In defense of this approach, you
could maintain that the function return value has nothing to do with the main purpose
of the routine, formatting output, or with the routine name, report.FormatOutput(). In
that sense it operates more as a procedure does even if it is technically a function. The
use of the return value to indicate the success or failure of the procedure is not confusing
if the technique is used consistently.
4510if the technique is used consistently.
4511The alternative is to create a procedure that has a status variable as an explicit parameter,
4512which promotes code like this fragment:
4513report.FormatOutput( formattedReport, outputStatus )
4514if ( outputStatus = Success ) then ...
I prefer the second style of coding, not because I'm hard-nosed about the difference
4516between functions and procedures but because it makes a clear separation between
4517the routine call and the test of the status value. To combine the call and the test into
4518one line of code increases the density of the statement and, correspondingly, its complexity.
4519The following use of a function is fine too:
4520outputStatus = report.FormatOutput( formattedReport )
4521if ( outputStatus = Success ) then ...
4523In short, use a function if the primary purpose of the routine is to return the value
4524indicated by the function name. Otherwise, use a procedure.
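In C++ the status-returning style discussed above might look like the following sketch. Status, Report, and PrintableReport are hypothetical names invented for illustration; only FormatOutput and formattedReport come from the fragments above. The point is simply that the routine call and the status test sit on separate lines.

```cpp
#include <string>

// Hypothetical status type and report class, for illustration only.
enum class Status { Success, Failure };

class Report {
public:
    explicit Report( const std::string &title ) : m_title( title ) {}

    // Works as a procedure (it fills an output parameter) but reports its
    // status through the return value.
    Status FormatOutput( std::string &formattedReport ) const {
        if ( m_title.empty() ) {
            return Status::Failure;
        }
        formattedReport = "== " + m_title + " ==";
        return Status::Success;
    }

private:
    std::string m_title;
};

// Keeps the routine call and the test of the status value on separate lines.
bool PrintableReport( const Report &report, std::string &formattedReport ) {
    Status outputStatus = report.FormatOutput( formattedReport );
    return outputStatus == Status::Success;
}
```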
Setting the Function's Return Value
Using a function creates the risk that the function will return an incorrect return
value. This usually happens when the function has several possible paths and one of
the paths doesn't set a return value. To reduce this risk, do the following:
Check all possible return paths. When creating a function, mentally execute each
path to be sure that the function returns a value under all possible circumstances. It's
good practice to initialize the return value at the beginning of the function to a default
value; this provides a safety net in the event that the correct return value is not set.
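A minimal C++ sketch of the default-initialization habit follows. The routine name, the region codes, and the use of -1 as the "unknown" default are all assumptions made for the example.

```cpp
#include <string>

// Hypothetical shipping-cost lookup, for illustration only.
// Returns the cost in cents for a region code, or -1 if the code is unknown.
int ShippingCostInCents( const std::string &regionCode ) {
    // Initialize the return value up front so that every path returns
    // something deliberate, even a path that was overlooked.
    int costInCents = -1;

    if ( regionCode == "US" ) {
        costInCents = 500;
    }
    else if ( regionCode == "CA" ) {
        costInCents = 700;
    }
    // No final else: unknown regions fall through with the default.

    return costInCents;
}
```

Because the default is set before any branching, a path added later that forgets to assign costInCents still returns a recognizable value rather than garbage.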
Don't return references or pointers to local data. As soon as the routine ends and the
local data goes out of scope, the reference or pointer to the local data will be invalid. If
an object needs to return information about its internal data, it should save the information
as class member data. It should then provide accessor functions that return the values
of the member data items rather than references or pointers to local data.
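The accessor approach could be sketched in C++ as follows. The Customer class and its members are hypothetical; the risky variant appears only in a comment so that the block compiles cleanly.

```cpp
#include <string>

// Hypothetical class, for illustration only.
class Customer {
public:
    Customer( const std::string &firstName, const std::string &lastName )
        : m_fullName( firstName + " " + lastName ) {}

    // Risky alternative (not compiled here): building the name in a local
    // variable and returning a reference or pointer to it, for example
    //    const std::string &FullName() const { std::string n = ...; return n; }
    // leaves the caller holding a reference to data that no longer exists.

    // Safer: the information is saved as member data, and the accessor
    // returns the value itself.
    std::string FullName() const {
        return m_fullName;
    }

private:
    std::string m_fullName;
};
```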
45387.7 Macro Routines and Inline Routines
Cross-Reference: Even if your language doesn't have a macro preprocessor, you can build
your own. For details, see Section 30.5, "Building Your Own Programming Tools."
Routines created with preprocessor macros call for a few unique considerations. The
following rules and examples pertain to using the preprocessor in C++. If you're using
a different language or preprocessor, adapt the rules to your situation.
Fully parenthesize macro expressions. Because macros and their arguments are
expanded into code, be careful that they expand the way you want them to. One common
problem lies in creating a macro like this one:
C++ Example of a Macro That Doesn't Expand Properly
#define Cube( a ) a*a*a
If you pass this macro nonatomic values for a, it won't do the multiplication properly.
If you use the expression Cube( x+1 ), it expands to x+1 * x+1 * x+1, which, because
multiplication takes precedence over addition, is evaluated as x + (1*x) + (1*x) + 1.
That is not what you want.
A better, but still not perfect, version of the macro looks like this:
C++ Example of a Macro That Still Doesn't Expand Properly
#define Cube( a ) (a)*(a)*(a)
This is close, but still no cigar. If you use Cube() in an expression that has operators
with the same or higher precedence than multiplication (division, for example), the
(a)*(a)*(a) will be torn apart. To prevent that, enclose the whole expression in parentheses:
4565C++ Example of a Macro That Works
4566#define Cube( a ) ((a)*(a)*(a))
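The three versions can be compared side by side in a small sketch. The macros are renamed here (hypothetically) so that all three can coexist in one file, and the wrapper functions exist only to make the expansions easy to evaluate.

```cpp
// Three versions of the macro, renamed (hypothetically) so they can coexist.
#define CubeUnparenthesized( a )    a*a*a
#define CubeArgsParenthesized( a )  (a)*(a)*(a)
#define CubeFullyParenthesized( a ) ((a)*(a)*(a))

// Intended meaning: the cube of x+1.
int UnparenthesizedResult( int x ) {
    return CubeUnparenthesized( x+1 );          // x+1*x+1*x+1
}

// Intended meaning: 27 divided by the cube of x.
int ArgsParenthesizedResult( int x ) {
    return 27 / CubeArgsParenthesized( x );     // 27 / (x)*(x)*(x)
}

// Same intent as above, with the whole expansion parenthesized.
int FullyParenthesizedResult( int x ) {
    return 27 / CubeFullyParenthesized( x );    // 27 / ((x)*(x)*(x))
}
```

With x = 2, UnparenthesizedResult yields 7 rather than 27. With x = 3, ArgsParenthesizedResult yields 81 because the division binds to the first factor only, whereas FullyParenthesizedResult yields the intended 1.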
Surround multiple-statement macros with curly braces. A macro can have multiple
statements, which is a problem if you treat it as if it were a single statement. Here's an
example of a macro that's headed for trouble:
C++ Example of a Nonworking Macro with Multiple Statements
#define LookupEntry( key, index ) \
   index = (key - 10) / 5; \
   index = min( index, MAX_INDEX ); \
   index = max( index, MIN_INDEX );
...
for ( entryCount = 0; entryCount < numEntries; entryCount++ )
   LookupEntry( entryCount, tableIndex[ entryCount ] );
This macro is headed for trouble because it doesn't work as a regular function would.
As it's shown, the only part of the macro that's executed in the for loop is the first line
of the macro:
   index = (key - 10) / 5;
The remaining two statements run only once, after the loop has finished.
To avoid this problem, surround the macro with curly braces:
C++ Example of a Macro with Multiple Statements That Works
#define LookupEntry( key, index ) { \
   index = (key - 10) / 5; \
   index = min( index, MAX_INDEX ); \
   index = max( index, MIN_INDEX ); \
}
The practice of using macros as substitutes for function calls is generally considered
risky and hard to understand (bad programming practice), so use this technique only
if your specific circumstances require it.
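One refinement that the text above does not cover is worth a sketch. A braces-only macro followed by a semicolon can still misbehave in an unbraced if/else, because the semicolon ends the if statement and orphans the else. A widely used C and C++ idiom wraps the body in do { ... } while ( 0 ) instead. The bounds MIN_INDEX and MAX_INDEX and the routine IndexFor are assumed values and names for the example.

```cpp
#include <algorithm>

// Hypothetical table bounds, for illustration only.
const int MIN_INDEX = 0;
const int MAX_INDEX = 9;

// do { ... } while ( 0 ) makes the macro behave as one statement that
// still expects the caller's trailing semicolon.
#define LookupEntry( key, index ) \
    do { \
        index = ( (key) - 10 ) / 5; \
        index = std::min( index, MAX_INDEX ); \
        index = std::max( index, MIN_INDEX ); \
    } while ( 0 )

// With a braces-only macro, the semicolon after LookupEntry(...) would
// end the if statement and leave the else without a matching if.
int IndexFor( int key, bool keyIsValid ) {
    int index = MIN_INDEX;
    if ( keyIsValid )
        LookupEntry( key, index );
    else
        index = MAX_INDEX;
    return index;
}
```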
Name macros that expand to code like routines so that they can be replaced by routines
if necessary. The convention in C++ for naming macros is to use all capital letters. If
the macro can be replaced by a routine, however, name it using the naming convention
for routines instead. That way you can replace macros with routines and vice
versa without changing anything but the routine involved.
Following this recommendation entails some risk. If you commonly use ++ and -- as
side effects (as part of other statements), you'll get burned when you use macros that
you think are routines. Considering the other problems with side effects, this is yet
another reason to avoid using side effects.
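The risk can be made concrete with a sketch. Passing i++ directly to a Cube-style macro modifies i several times in one expression, which is undefined behavior in C++, so the example below uses a function call with a counter as the side effect instead. All names here are hypothetical.

```cpp
// Hypothetical names, for illustration only.
#define CubeMacro( a ) ((a)*(a)*(a))

// A routine with the same job evaluates its argument exactly once.
inline int CubeRoutine( int a ) {
    return a * a * a;
}

static int g_readCount = 0;

// Stand-in for any argument expression that has a side effect.
int NextReading() {
    ++g_readCount;
    return 2;
}

// Returns how many times NextReading() ran when passed to the macro.
int ReadsUsedByMacro() {
    g_readCount = 0;
    int result = CubeMacro( NextReading() );    // expands to three calls
    (void) result;
    return g_readCount;
}

// Returns how many times NextReading() ran when passed to the routine.
int ReadsUsedByRoutine() {
    g_readCount = 0;
    int result = CubeRoutine( NextReading() );  // one call
    (void) result;
    return g_readCount;
}
```

The macro triggers the side effect three times and the routine triggers it once, even though the two calls look identical at the call site.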
4604Limitations on the Use of Macro Routines
Modern languages like C++ provide numerous alternatives to the use of macros:
■ const for declaring constant values
■ inline for defining functions that will be compiled as inline code
■ template for defining standard operations like min, max, and so on in a type-safe way
■ enum for defining enumerated types
■ typedef for defining simple type substitutions
As Bjarne Stroustrup, designer of C++, points out, "Almost every macro demonstrates
a flaw in the programming language, in the program, or in the programmer.... When
you use macros, you should expect inferior service from tools such as debuggers,
cross-reference tools, and profilers" (Stroustrup 1997). Macros are useful for supporting
conditional compilation (see Section 8.6, "Debugging Aids"), but careful programmers
generally use a macro as an alternative to a routine only as a last resort.
4618Inline Routines
4619C++ supports an inline keyword. An inline routine allows the programmer to treat the
4620code as a routine at code-writing time, but the compiler will generally convert each
4621instance of the routine into inline code at compile time. The theory is that inline can
4622help produce highly efficient code that avoids routine-call overhead.
Use inline routines sparingly. Inline routines violate encapsulation because C++
requires the programmer to put the code for the implementation of the inline routine
in the header file, which exposes it to every programmer who uses the header file.
Inline routines require a routine's full code to be generated every time the routine is
invoked, which for an inline routine of any size will increase code size. That can create
problems of its own.
The bottom line on inlining for performance reasons is the same as the bottom line on
any other coding technique that's motivated by performance: profile the code and
measure the improvement. If the anticipated performance gain doesn't justify the
bother of profiling the code to verify the improvement, it doesn't justify the erosion in
code quality either.
4636cc2e.com/0792
Cross-Reference: This is a checklist of considerations about the quality of the routine.
For a list of the steps used to build a routine, see the checklist "The Pseudocode
Programming Process" in Chapter 9, page 215.
4645CHECKLIST: High-Quality Routines
Big-Picture Issues
❑ Is the reason for creating the routine sufficient?
❑ Have all parts of the routine that would benefit from being put into routines
of their own been put into routines of their own?
❑ Is the routine's name a strong, clear verb-plus-object name for a procedure
or a description of the return value for a function?
❑ Does the routine's name describe everything the routine does?
❑ Have you established naming conventions for common operations?
❑ Does the routine have strong, functional cohesion, doing one and only
one thing and doing it well?
❑ Do the routines have loose coupling? Are the routine's connections to
other routines small, intimate, visible, and flexible?
❑ Is the length of the routine determined naturally by its function and logic,
rather than by an artificial coding standard?
Parameter-Passing Issues
❑ Does the routine's parameter list, taken as a whole, present a consistent
interface abstraction?
❑ Are the routine's parameters in a sensible order, including matching the
order of parameters in similar routines?
❑ Are interface assumptions documented?
❑ Does the routine have seven or fewer parameters?
❑ Is each input parameter used?
❑ Is each output parameter used?
❑ Does the routine avoid using input parameters as working variables?
❑ If the routine is a function, does it return a valid value under all possible
circumstances?
Key Points
■ The most important reason for creating a routine is to improve the intellectual
manageability of a program, and you can create a routine for many other good
reasons. Saving space is a minor reason; improved readability, reliability, and
modifiability are better reasons.
■ Sometimes the operation that most benefits from being put into a routine of its
own is a simple one.
■ You can classify routines into various kinds of cohesion, but you can make most
routines functionally cohesive, which is best.
■ The name of a routine is an indication of its quality. If the name is bad and it's
accurate, the routine might be poorly designed. If the name is bad and it's inaccurate,
it's not telling you what the program does. Either way, a bad name means
that the program needs to be changed.
■ Functions should be used only when the primary purpose of the function is to
return the specific value described by the function's name.
■ Careful programmers use macro routines with care and only as a last resort.
4690Chapter 8
4691Defensive Programming
cc2e.com/0861 Contents
■ 8.1 Protecting Your Program from Invalid Inputs
■ 8.2 Assertions
■ 8.3 Error-Handling Techniques
■ 8.4 Exceptions
■ 8.5 Barricade Your Program to Contain the Damage Caused by Errors
■ 8.6 Debugging Aids
■ 8.7 Determining How Much Defensive Programming to Leave in Production Code
■ 8.8 Being Defensive About Defensive Programming
Related Topics
■ Information hiding: "Hide Secrets (Information Hiding)" in Section 5.3
■ Design for change: "Identify Areas Likely to Change" in Section 5.3
■ Software architecture: Section 3.5
■ Design in Construction: Chapter 5
■ Debugging: Chapter 23
Defensive programming doesn't mean being defensive about your programming ("It
does so work!"). The idea is based on defensive driving. In defensive driving, you adopt
the mind-set that you're never sure what the other drivers are going to do. That way,
you make sure that if they do something dangerous you won't be hurt. You take
responsibility for protecting yourself even when it might be the other driver's fault. In
defensive programming, the main idea is that if a routine is passed bad data, it won't
be hurt, even if the bad data is another routine's fault. More generally, it's the recognition
that programs will have problems and modifications, and that a smart programmer
will develop code accordingly.
This chapter describes how to protect yourself from the cold, cruel world of invalid
data, events that can "never" happen, and other programmers' mistakes. If you're an
experienced programmer, you might skip the next section on handling input data and
begin with Section 8.2, which reviews the use of assertions.
47238.1 Protecting Your Program from Invalid Inputs
In school you might have heard the expression, "Garbage in, garbage out." That expression
is essentially software development's version of caveat emptor: let the user beware.
For production software, garbage in, garbage out isn't good enough. A good program
never puts out garbage, regardless of what it takes in. A good program uses "garbage in,
nothing out," "garbage in, error message out," or "no garbage allowed in" instead. By
today's standards, "garbage in, garbage out" is the mark of a sloppy, nonsecure program.
There are three general ways to handle garbage in:
Check the values of all data from external sources. When getting data from a file, a
user, the network, or some other external interface, check to be sure that the data falls
within the allowable range. Make sure that numeric values are within tolerances and
that strings are short enough to handle. If a string is intended to represent a restricted
range of values (such as a financial transaction ID or something similar), be sure that
the string is valid for its intended purpose; otherwise reject it. If you're working on a
secure application, be especially leery of data that might attack your system:
attempted buffer overflows, injected SQL commands, injected HTML or XML code,
integer overflows, data passed to system calls, and so on.
Check the values of all routine input parameters. Checking the values of routine
input parameters is essentially the same as checking data that comes from an external
source, except that the data comes from another routine instead of from an external
interface. The discussion in Section 8.5, "Barricade Your Program to Contain the Damage
Caused by Errors," provides a practical way to determine which routines need to
check their inputs.
Decide how to handle bad inputs. Once you've detected an invalid parameter, what
do you do with it? Depending on the situation, you might choose any of a dozen different
approaches, which are described in detail in Section 8.3, "Error-Handling Techniques,"
later in this chapter.
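As a rough illustration of checking external data, the sketch below validates a transaction ID string and a numeric quantity before anything else uses them. The 10-digit ID format and the quantity limits are assumptions invented for the example, not rules from any real system.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical format, for illustration only: a transaction ID is exactly
// 10 characters, all of them digits.
const std::size_t TRANSACTION_ID_LENGTH = 10;

bool IsValidTransactionId( const std::string &id ) {
    if ( id.size() != TRANSACTION_ID_LENGTH ) {
        return false;                       // too short or too long
    }
    for ( std::size_t i = 0; i < id.size(); ++i ) {
        if ( !std::isdigit( static_cast<unsigned char>( id[ i ] ) ) ) {
            return false;                   // unexpected character
        }
    }
    return true;
}

// Hypothetical tolerance check for a numeric value from an external source.
bool IsValidQuantity( long quantity ) {
    const long MIN_QUANTITY = 1;
    const long MAX_QUANTITY = 10000;
    return MIN_QUANTITY <= quantity && quantity <= MAX_QUANTITY;
}
```

Accepting only what is known to be valid, rather than trying to list everything that is bad, means that a string carrying injected SQL fails the same simple test as a mistyped ID.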
4750Defensive programming is useful as an adjunct to the other quality-improvement techniques
4751described in this book. The best form of defensive coding is not inserting
4752errors in the first place. Using iterative design, writing pseudocode before code, writing
4753test cases before writing the code, and having low-level design inspections are all
4754activities that help to prevent inserting defects. They should thus be given a higher priority
4755than defensive programming. Fortunately, you can use defensive programming
4756in combination with the other techniques.
4757As Figure 8-1 suggests, protecting yourself from seemingly small problems can make
4758more of a difference than you might think. The rest of this chapter describes specific
4759options for checking data from external sources, checking input parameters, and handling
4760bad inputs.
Figure 8-1 Part of the Interstate-90 floating bridge in Seattle sank during a storm because
the flotation tanks were left uncovered, they filled with water, and the bridge became too
heavy to float. During construction, protecting yourself against the small stuff matters more
than you might think. (Photo: Mike Siegel/The Seattle Times)
8.2 Assertions
An assertion is code that's used during development (usually a routine or macro) that
allows a program to check itself as it runs. When an assertion is true, that means
everything is operating as expected. When it's false, that means it has detected an
unexpected error in the code. For example, if the system assumes that a customer-information
file will never have more than 50,000 records, the program might contain
an assertion that the number of records is less than or equal to 50,000. As long as the
number of records is less than or equal to 50,000, the assertion will be silent. If it
encounters more than 50,000 records, however, it will loudly "assert" that an error is
in the program.
Assertions are especially useful in large, complicated programs and in high-reliability
programs. They enable programmers to more quickly flush out mismatched interface
assumptions, errors that creep in when code is modified, and so on.
An assertion usually takes two arguments: a boolean expression that describes the
assumption that's supposed to be true, and a message to display if it isn't. Here's what a
Java assertion would look like if the variable denominator were expected to be nonzero:
4786Java Example of an Assertion
4787assert denominator != 0 : "denominator is unexpectedly equal to 0.";
This assertion asserts that denominator is not equal to 0. The first argument, denominator
!= 0, is a boolean expression that evaluates to true or false. The second argument
is a message to print if the first argument is false; that is, if the assertion is false.
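For comparison, a C++ counterpart to the Java example might look like the sketch below. The standard assert macro takes only the expression, so a common idiom attaches the message with &&. Ratio is a hypothetical routine used only to give the assertion a home.

```cpp
#include <cassert>

// Hypothetical routine, for illustration only.
double Ratio( double numerator, double denominator ) {
    // A string literal is a non-null pointer, so "&& message" never changes
    // the truth of the condition; it only makes the text appear in the
    // failure report that assert prints.
    assert( denominator != 0 && "denominator is unexpectedly equal to 0." );
    return numerator / denominator;
}
```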
Use assertions to document assumptions made in the code and to flush out unexpected
conditions. Assertions can be used to check assumptions like these:
■ That an input parameter's value falls within its expected range (or an output
parameter's value does)
■ That a file or stream is open (or closed) when a routine begins executing (or
when it ends executing)
■ That a file or stream is at the beginning (or end) when a routine begins executing
(or when it ends executing)
■ That a file or stream is open for read-only, write-only, or both read and write
■ That the value of an input-only variable is not changed by a routine
■ That a pointer is non-null
■ That an array or other container passed into a routine can contain at least X
number of data elements
■ That a table has been initialized to contain real values
■ That a container is empty (or full) when a routine begins executing (or when it
finishes)
■ That the results from a highly optimized, complicated routine match the results
from a slower but clearly written routine
4808from a slower but clearly written routine
4809Of course, these are just the basics, and your own routines will contain many more
4810specific assumptions that you can document using assertions.
4811Normally, you don’t want users to see assertion messages in production code; assertions
4812are primarily for use during development and maintenance. Assertions are normally
4813compiled into the code at development time and compiled out of the code for production.
4814During development, assertions flush out contradictory assumptions, unexpected
4815conditions, bad values passed to routines, and so on. During production, they can be
4816compiled out of the code so that the assertions don’t degrade system performance.
48178.2 Assertions 191
4818Building Your Own Assertion Mechanism
Cross-Reference: Building your own assertion routine is a good example of programming
"into" a language rather than just programming "in" a language. For more details on this
distinction, see Section 34.4, "Program into Your Language, Not in It."
Many languages have built-in support for assertions, including C++, Java, and
Microsoft Visual Basic. If your language doesn't directly support assertion routines,
they are easy to write. The standard C++ assert macro doesn't provide for text messages.
Here's an example of an improved ASSERT implemented as a C++ macro:
4833C++ Example of an Assertion Macro
4834#define ASSERT( condition, message ) { \
4835if ( !(condition) ) { \
4836LogError( "Assertion failed: ", \
4837#condition, message ); \
4838exit( EXIT_FAILURE ); \
4839} \
4840}
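The macro above depends on a LogError() routine that is not shown. A self-contained variant might look like the following sketch, in which LogError() is a hypothetical stand-in that writes to stderr and RecordsRemaining() is an invented caller. The do { ... } while ( 0 ) wrapper is an addition, not part of the macro above; it lets ASSERT( ... ); sit safely inside an unbraced if/else.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the LogError() routine used in the macro above.
inline void LogError( const char *prefix, const char *conditionText,
                      const char *message ) {
    std::fprintf( stderr, "%s%s: %s\n", prefix, conditionText, message );
}

// Same logic as the macro above, wrapped so that it acts as one statement.
#define ASSERT( condition, message ) \
    do { \
        if ( !(condition) ) { \
            LogError( "Assertion failed: ", #condition, message ); \
            std::exit( EXIT_FAILURE ); \
        } \
    } while ( 0 )

// Example use: the assertion is silent while the assumption holds.
int RecordsRemaining( int capacity, int recordCount ) {
    ASSERT( recordCount <= capacity, "more records than the file can hold" );
    return capacity - recordCount;
}
```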
4841Guidelines for Using Assertions
4842Here are some guidelines for using assertions:
Use error-handling code for conditions you expect to occur; use assertions for
conditions that should never occur. Assertions check for conditions that should
never occur. Error-handling code checks for off-nominal circumstances that might not
occur very often, but that have been anticipated by the programmer who wrote the
code and that need to be handled by the production code. Error handling typically
checks for bad input data; assertions check for bugs in the code.
If error-handling code is used to address an anomalous condition, the error handling
will enable the program to respond to the error gracefully. If an assertion is fired for an
anomalous condition, the corrective action is not merely to handle an error gracefully;
the corrective action is to change the program's source code, recompile, and
release a new version of the software.
A good way to think of assertions is as executable documentation: you can't rely on
them to make the code work, but they can document assumptions more actively than
program-language comments can.
Cross-Reference: You could view this as one of many problems associated with putting
multiple statements on one line. For more examples, see "Using Only One Statement per
Line" in Section 31.5.
Avoid putting executable code into assertions. Putting code into an assertion raises
the possibility that the compiler will eliminate the code when you turn off the assertions.
Suppose you have an assertion like this:
4869Visual Basic Example of a Dangerous Use of an Assertion
4870Debug.Assert( PerformAction() ) ' Couldn't perform action
The problem with this code is that, if you don't compile the assertions, you don't compile
the code that performs the action. Put executable statements on their own lines,
assign the results to status variables, and test the status variables instead. Here's an
example of a safe use of an assertion:
4875Visual Basic Example of a Safe Use of an Assertion
4876actionPerformed = PerformAction()
4877Debug.Assert( actionPerformed ) ' Couldn't perform action
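The same hazard exists in C++, where defining NDEBUG makes the standard assert macro expand to nothing. The sketch below shows the safe pattern; PerformAction() and the counter are hypothetical, and the dangerous form is left in a comment.

```cpp
#include <cassert>

static int g_actionsPerformed = 0;

// Hypothetical action with a side effect, for illustration only.
bool PerformAction() {
    ++g_actionsPerformed;
    return true;
}

// Dangerous: if NDEBUG is defined, assert( ... ) expands to nothing and
// PerformAction() is never called.
//    assert( PerformAction() );

// Safe: the action always runs; only the check disappears with NDEBUG.
void PerformActionSafely() {
    bool actionPerformed = PerformAction();
    assert( actionPerformed && "Couldn't perform action" );
    (void) actionPerformed;   // avoids an unused-variable warning under NDEBUG
}
```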
Further Reading: For much more on preconditions and postconditions, see Object-Oriented
Software Construction (Meyer 1997).
Use assertions to document and verify preconditions and postconditions. Preconditions
and postconditions are part of an approach to program design and development
known as "design by contract" (Meyer 1997). When preconditions and postconditions
are used, each routine or class forms a contract with the rest of the program.
Preconditions are the properties that the client code of a routine or class promises will
be true before it calls the routine or instantiates the object. Preconditions are the client
code's obligations to the code it calls.
Postconditions are the properties that the routine or class promises will be true when it
concludes executing. Postconditions are the routine's or class's obligations to the
code that uses it.
Assertions are a useful tool for documenting preconditions and postconditions. Comments
could be used to document preconditions and postconditions, but, unlike comments,
assertions can check dynamically whether the preconditions and
postconditions are true.
4897In the following example, assertions are used to document the preconditions and
4898postcondition of the Velocity routine.
4899Visual Basic Example of Using Assertions to Document Preconditions and
4900Postconditions
4901Private Function Velocity ( _
4902ByVal latitude As Single, _
4903ByVal longitude As Single, _
4904ByVal elevation As Single _
4905) As Single
4906' Preconditions
4907Debug.Assert ( -90 <= latitude And latitude <= 90 )
4908Debug.Assert ( 0 <= longitude And longitude < 360 )
4909Debug.Assert ( -500 <= elevation And elevation <= 75000 )
4911...
4912' Postconditions
4913Debug.Assert ( 0 <= returnVelocity And returnVelocity <= 600 )
4914' return value
4915Velocity = returnVelocity
4916End Function
If the variables latitude, longitude, and elevation were coming from an external source,
invalid values should be checked and handled by error-handling code rather than by
assertions. If the variables are coming from a trusted, internal source, however, and
the routine's design is based on the assumption that these values will be within their
valid ranges, then assertions are appropriate.
Cross-Reference: For more on robustness, see "Robustness vs. Correctness" in Section
8.3, later in this chapter.
For highly robust code, assert and then handle the error anyway. For any given
error condition, a routine will generally use either an assertion or error-handling code,
but not both. Some experts argue that only one kind is needed (Meyer 1997).
But real-world programs and projects tend to be too messy to rely solely on assertions.
On a large, long-lasting system, different parts might be designed by different designers
over a period of 5 to 10 years or more. The designers will be separated in time, across
numerous versions. Their designs will focus on different technologies at different
points in the system's development. The designers will be separated geographically,
especially if parts of the system are acquired from external sources. Programmers will
have worked to different coding standards at different points in the system's lifetime.
On a large development team, some programmers will inevitably be more conscientious
than others and some parts of the code will be reviewed more rigorously than
other parts of the code. Some programmers will unit test their code more thoroughly
than others. With test teams working across different geographic regions and subject
to business pressures that result in test coverage that varies with each release, you
can't count on comprehensive, system-level regression testing, either.
In such circumstances, both assertions and error-handling code might be used to
address the same error. In the source code for Microsoft Word, for example, conditions
that should always be true are asserted, but such errors are also handled by
error-handling code in case the assertion fails. For extremely large, complex, long-lived
applications like Word, assertions are valuable because they help to flush out as
many development-time errors as possible. But the application is so complex (millions
of lines of code) and has gone through so many generations of modification that
it isn't realistic to assume that every conceivable error will be detected and corrected
before the software ships, and so errors must be handled in the production version of
the system as well.
Here's an example of how that might work in the Velocity example:
4954Visual Basic Example of Using Assertions to Document Preconditions and
4955Postconditions
4956Private Function Velocity ( _
4957ByRef latitude As Single, _
4958ByRef longitude As Single, _
4959ByRef elevation As Single _
4960) As Single
' Preconditions. Here is the assertion code.
Debug.Assert ( -90 <= latitude And latitude <= 90 )
4963Debug.Assert ( 0 <= longitude And longitude < 360 )
4964Debug.Assert ( -500 <= elevation And elevation <= 75000 )
4965...
4966' Sanitize input data. Values should be within the ranges asserted above,
4967' but if a value is not within its valid range, it will be changed to the
4968' closest legal value
' Here is the code that handles bad input data at run time.
4971If ( latitude < -90 ) Then
4972latitude = -90
4973ElseIf ( latitude > 90 ) Then
4974latitude = 90
4975End If
4976If ( longitude < 0 ) Then
4977longitude = 0
4978ElseIf ( longitude > 360 ) Then
4979...
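A C++ sketch of the same assert-then-handle pattern follows. ClosestLegalValue() and SanitizePosition() are hypothetical helpers; the ranges are the latitude and elevation ranges asserted in the Visual Basic example above.

```cpp
#include <algorithm>
#include <cassert>

// Returns the closest legal value to "value" within [lowest, highest].
inline float ClosestLegalValue( float value, float lowest, float highest ) {
    return std::max( lowest, std::min( value, highest ) );
}

// Hypothetical validation step for the Velocity inputs, for illustration.
// In a development build the assertions flag the bad value; in a build with
// NDEBUG defined they vanish and the sanitizing code still runs.
void SanitizePosition( float &latitude, float &elevation ) {
    assert( -90.0f <= latitude && latitude <= 90.0f );
    assert( -500.0f <= elevation && elevation <= 75000.0f );

    latitude = ClosestLegalValue( latitude, -90.0f, 90.0f );
    elevation = ClosestLegalValue( elevation, -500.0f, 75000.0f );
}
```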
49808.3 Error-Handling Techniques
4981Assertions are used to handle errors that should never occur in the code. How do
4982you handle errors that you do expect to occur? Depending on the specific circumstances,
4983you might want to return a neutral value, substitute the next piece of valid
4984data, return the same answer as the previous time, substitute the closest legal value,
4985log a warning message to a file, return an error code, call an error-processing routine
or object, display an error message, or shut down, or you might want to use a combination
4987of these responses.
4988Here are some more details on these options:
Return a neutral value. Sometimes the best response to bad data is to continue operating
and simply return a value that's known to be harmless. A numeric computation
might return 0. A string operation might return an empty string, or a pointer operation
might return an empty pointer. A drawing routine that gets a bad input value for
color in a video game might use the default background or foreground color. A drawing
routine that displays x-ray data for cancer patients, however, would not want to
display a "neutral value." In that case, you'd be better off shutting down the program
than displaying incorrect patient data.
Substitute the next piece of valid data. When processing a stream of data, some circumstances
call for simply returning the next valid data. If you're reading records
from a database and encounter a corrupted record, you might simply continue reading
until you find a valid record. If you're taking readings from a thermometer 100
times per second and you don't get a valid reading one time, you might simply wait
another 1/100th of a second and take the next reading.
Return the same answer as the previous time. If the thermometer-reading software
doesn't get a reading one time, it might simply return the same value as last time.
Depending on the application, temperatures might not be very likely to change much in
1/100th of a second. In a video game, if you detect a request to paint part of the screen
an invalid color, you might simply return the same color used previously. But if you're
authorizing transactions at a cash machine, you probably wouldn't want to use the
"same answer as last time": that would be the previous user's bank account number!
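For the thermometer case, "return the same answer as the previous time" might be sketched as below. TemperatureFilter and its interface are hypothetical; in particular, the sketch assumes the caller already knows whether a given reading was valid.

```cpp
// Hypothetical thermometer wrapper, for illustration only.
class TemperatureFilter {
public:
    explicit TemperatureFilter( double initialCelsius )
        : m_lastGoodCelsius( initialCelsius ) {}

    // Accepts a raw reading plus a flag saying whether the sensor delivered
    // a valid value. A bad reading is replaced by the previous good one.
    double Filter( double rawCelsius, bool readingIsValid ) {
        if ( readingIsValid ) {
            m_lastGoodCelsius = rawCelsius;
        }
        return m_lastGoodCelsius;
    }

private:
    double m_lastGoodCelsius;
};
```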
Substitute the closest legal value. In some cases, you might choose to return the closest
legal value, as in the Velocity example earlier. This is often a reasonable approach
when taking readings from a calibrated instrument. The thermometer might be calibrated
between 0 and 100 degrees Celsius, for example. If you detect a reading less
than 0, you can substitute 0, which is the closest legal value. If you detect a value
greater than 100, you can substitute 100. For a string operation, if a string length is
reported to be less than 0, you could substitute 0. My car uses this approach to error
handling whenever I back up. Since my speedometer doesn't show negative speeds,
when I back up it simply shows a speed of 0, the closest legal value.
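The thermometer and string-length cases can be sketched in a few lines of Java. The class and method names are hypothetical; the 0 to 100 range comes from the example above.

```java
/** Hypothetical helpers that substitute the closest legal value for an out-of-range input. */
final class ClosestLegalValue {
    static final double MIN_CELSIUS = 0.0;
    static final double MAX_CELSIUS = 100.0;

    /** Clamps a reading to the calibrated range. NaN is not handled and passes through. */
    static double clampToCalibratedRange(double reading) {
        return Math.max(MIN_CELSIUS, Math.min(MAX_CELSIUS, reading));
    }

    /** A reported string length below 0 is replaced by 0. */
    static int clampStringLength(int reportedLength) {
        return Math.max(0, reportedLength);
    }
}
```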
Log a warning message to a file. When bad data is detected, you might choose to log
a warning message to a file and then continue on. This approach can be used in conjunction
with other techniques like substituting the closest legal value or substituting
the next piece of valid data. If you use a log, consider whether you can safely make it
publicly available or whether you need to encrypt it or protect it some other way.
Return an error code. You could decide that only certain parts of a system will handle
errors. Other parts will not handle errors locally; they will simply report that an
error has been detected and trust that some other routine higher up in the calling hierarchy
will handle the error. The specific mechanism for notifying the rest of the system
that an error has occurred could be any of the following:
■ Set the value of a status variable
■ Return status as the function's return value
■ Throw an exception by using the language's built-in exception mechanism
In this case, the specific error-reporting mechanism is less important than the decision
about which parts of the system will handle errors directly and which will just
report that they've occurred. If security is an issue, be sure that calling routines always
check return codes.
Call an error-processing routine/object. Another approach is to centralize error handling
in a global error-handling routine or error-handling object. The advantage of this
approach is that error-processing responsibility can be centralized, which can make
debugging easier. The tradeoff is that the whole program will know about this central
capability and will be coupled to it. If you ever want to reuse any of the code from the
system in another system, you'll have to drag the error-handling machinery along
with the code you reuse.
This approach has an important security implication. If your code has encountered a
buffer overrun, it's possible that an attacker has compromised the address of the handler
routine or object. Thus, once a buffer overrun has occurred while an application
is running, it is no longer safe to use this approach.
Display an error message wherever the error is encountered. This approach minimizes
error-handling overhead; however, it does have the potential to spread user
interface messages through the entire application, which can create challenges when
you need to create a consistent user interface, when you try to clearly separate the UI
from the rest of the system, or when you try to localize the software into a different
language. Also, beware of telling a potential attacker of the system too much. Attackers
sometimes use error messages to discover how to attack a system.
Handle the error in whatever way works best locally. Some designs call for handling
all errors locally: the decision of which specific error-handling method to use is left
up to the programmer designing and implementing the part of the system that
encounters the error.
5060This approach provides individual developers with great flexibility, but it creates a significant
5061risk that the overall performance of the system will not satisfy its requirements
5062for correctness or robustness (more on this in a moment). Depending on how
5063developers end up handling specific errors, this approach also has the potential to
5064spread user interface code throughout the system, which exposes the program to all
5065the problems associated with displaying error messages.
Shut down. Some systems shut down whenever they detect an error. This approach
is useful in safety-critical applications. For example, if the software that controls radiation
equipment for treating cancer patients receives bad input data for the radiation
dosage, what is its best error-handling response? Should it use the same value as last
time? Should it use the closest legal value? Should it use a neutral value? In this case,
shutting down is the best option. We'd much prefer to reboot the machine than to run
the risk of delivering the wrong dosage.
5073A similar approach can be used to improve the security of Microsoft Windows. By
5074default, Windows continues to operate even when its security log is full. But you can
5075configure Windows to halt the server if the security log becomes full, which can be
5076appropriate in a security-critical environment.
5078Robustness vs. Correctness
5079As the video game and x-ray examples show us, the style of error processing that is
5080most appropriate depends on the kind of software the error occurs in. These examples
5081also illustrate that error processing generally favors more correctness or more
5082robustness. Developers tend to use these terms informally, but, strictly speaking,
5083these terms are at opposite ends of the scale from each other. Correctness means never
5084returning an inaccurate result; returning no result is better than returning an inaccurate
5085result. Robustness means always trying to do something that will allow the software
5086to keep operating, even if that leads to results that are inaccurate sometimes.
Safety-critical applications tend to favor correctness over robustness. It is better to return
no result than to return a wrong result. The radiation machine is a good example of
this principle.
Consumer applications tend to favor robustness over correctness. Any result whatsoever is
usually better than the software shutting down. The word processor I'm using occasionally
displays a fraction of a line of text at the bottom of the screen. If it detects that condition,
do I want the word processor to shut down? No. I know that the next time I hit
Page Up or Page Down, the screen will refresh and the display will be back to normal.
5095High-Level Design Implications of Error Processing
With so many options, you need to be careful to handle invalid parameters in consistent
ways throughout the program. The way in which errors are handled affects the software's
ability to meet requirements related to correctness, robustness, and other nonfunctional
attributes. Deciding on a general approach to bad parameters is an architectural or
high-level design decision and should be addressed at one of those levels.
Once you decide on the approach, make sure you follow it consistently. If you decide
to have high-level code handle errors and low-level code merely report errors, make
sure the high-level code actually handles the errors! Some languages give you the
option of ignoring the fact that a function is returning an error code (in C++, you're
not required to do anything with a function's return value), but don't ignore error
information! Test the function return value. If you don't expect the function ever to
produce an error, check it anyway. The whole point of defensive programming is
guarding against errors you don't expect.
This guideline holds true for system functions as well as for your own functions.
Unless you've set an architectural guideline of not checking system calls for errors,
check for error codes after each call. If you detect an error, include the error number
and the description of the error.
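Java offers the same temptation as C++ here: java.io.File.delete() reports failure only through its boolean return value, which a caller is free to ignore. The sketch below checks that value and builds a description of the failure. The class and method names are hypothetical, and because File.delete() supplies no error number, the description carries the path and file state instead.

```java
import java.io.File;
import java.util.Optional;

/** Hypothetical wrapper that refuses to ignore the result of File.delete(). */
final class CheckedDelete {
    /** Returns an empty Optional on success, or a description of the failure. */
    static Optional<String> deleteAndReport(File file) {
        boolean deleted = file.delete(); // the return value is the only error signal
        if (deleted) {
            return Optional.empty();
        }
        return Optional.of("delete failed for '" + file.getPath()
            + "' (exists=" + file.exists()
            + ", isDirectory=" + file.isDirectory() + ")");
    }
}
```

Where a more specific cause is needed, java.nio.file.Files.delete() reports failures by throwing exceptions rather than returning a status.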
51158.4 Exceptions
Exceptions are a specific means by which code can pass along errors or exceptional
events to the code that called it. If code in one routine encounters an unexpected condition
that it doesn't know how to handle, it throws an exception, essentially throwing
up its hands and yelling, "I don't know what to do about this; I sure hope somebody
else knows how to handle it!" Code that has no sense of the context of an error can
return control to other parts of the system that might have a better ability to interpret
the error and do something useful about it.
Exceptions can also be used to straighten out tangled logic within a single stretch of
code, such as the "Rewrite with try-finally" example in Section 17.3. The basic structure
of an exception is that a routine uses throw to throw an exception object. Code in some
other routine up the calling hierarchy will catch the exception within a try-catch block.
5127Popular languages vary in how they implement exceptions. Table 8-1 summarizes the
5128major differences in three of them:
Table 8-1 Popular-Language Support for Exceptions
Exception Attribute | C++ | Java | Visual Basic
Try-catch support | yes | yes | yes
Try-catch-finally support | no | yes | yes
What can be thrown | Exception object or object derived from Exception class; object pointer; object reference; data type like string or int | Exception object or object derived from Exception class | Exception object or object derived from Exception class
Effect of uncaught exception | Invokes std::terminate(), which by default invokes abort() | Terminates thread of execution | Terminates program
Exceptions thrown must be defined in class interface | no | yes | no
Exceptions caught must be defined in class interface | no | yes | no
"Programs that use exceptions as part of their normal processing suffer from all
the readability and maintainability problems of classic spaghetti code."
- Andy Hunt and Dave Thomas
5185Exceptions have an attribute in common with inheritance: used judiciously, they can
5186reduce complexity. Used imprudently, they can make code almost impossible to follow.
5187This section contains suggestions for realizing the benefits of exceptions and
5188avoiding the difficulties often associated with them.
Use exceptions to notify other parts of the program about errors that should not be
ignored. The overriding benefit of exceptions is their ability to signal error conditions
in such a way that they cannot be ignored (Meyers 1996). Other approaches to
handling errors create the possibility that an error condition can propagate through a
code base undetected. Exceptions eliminate that possibility.
Throw an exception only for conditions that are truly exceptional. Exceptions
should be reserved for conditions that are truly exceptional; in other words, for conditions
that cannot be addressed by other coding practices. Exceptions are used in
similar circumstances to assertions: for events that are not just infrequent but that
should never occur.
Exceptions represent a tradeoff between a powerful way to handle unexpected conditions
on the one hand and increased complexity on the other. Exceptions weaken
encapsulation by requiring the code that calls a routine to know which exceptions
might be thrown inside the code that's called. That increases code complexity, which
works against what Chapter 5, "Design in Construction," refers to as Software's Primary
Technical Imperative: Managing Complexity.
Don't use an exception to pass the buck. If an error condition can be handled locally,
handle it locally. Don't throw an uncaught exception in a section of code if you can
handle the error locally.
Avoid throwing exceptions in constructors and destructors unless you catch them in the
same place. The rules for how exceptions are processed become very complicated
very quickly when exceptions are thrown in constructors and destructors. In C++, for
example, destructors aren't called unless an object is fully constructed, which means
if code within a constructor throws an exception, the destructor won't be called,
thereby setting up a possible resource leak (Meyers 1996, Stroustrup 1997). Similarly
complicated rules apply to exceptions within destructors.
Language lawyers might say that remembering rules like these is "trivial," but programmers
who are mere mortals will have trouble remembering them. It's better programming
practice simply to avoid the extra complexity such code creates by not
writing that kind of code in the first place.
Cross-Reference: For more on maintaining consistent interface abstractions, see
"Good Abstraction" in Section 6.2.
Throw exceptions at the right level of abstraction. A routine should present a consistent
abstraction in its interface, and so should a class. The exceptions thrown are part
of the routine interface, just like specific data types are.
When you choose to pass an exception to the caller, make sure the exception's level of
abstraction is consistent with the routine interface's abstraction. Here's an example of
what not to do:
Bad Java Example of a Class that Throws an Exception at an Inconsistent Level of Abstraction
class Employee {
   ...
   // Here is the declaration of the exception that's at an inconsistent
   // level of abstraction.
   public TaxId GetTaxId() throws EOFException {
      ...
   }
   ...
}
The GetTaxId() code passes the lower-level EOFException exception back to its caller. It
doesn't take ownership of the exception itself; it exposes some details about how it's
implemented by passing the lower-level exception to its caller. This effectively couples
the routine's client's code not to the Employee class's code but to the code below the
Employee class that throws the EOFException exception. Encapsulation is broken, and
intellectual manageability starts to decline.
Instead, the GetTaxId() code should pass back an exception that's consistent with the
class interface of which it's a part, like this:
Good Java Example of a Class that Throws an Exception at a Consistent Level of Abstraction
class Employee {
   ...
   // Here is the declaration of the exception that contributes
   // to a consistent level of abstraction.
   public TaxId GetTaxId() throws EmployeeDataNotAvailable {
      ...
   }
   ...
}
The exception-handling code inside GetTaxId() will probably just map a lower-level
exception, such as io_disk_not_ready or EOFException, onto the EmployeeDataNotAvailable
exception, which is fine because that's sufficient to preserve the interface abstraction.
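One way the mapping might look is sketched below. EmployeeDataNotAvailable and GetTaxId() are named in the text; the EmployeeStore interface, the constructor, and the use of String in place of the TaxId type are assumptions made only to keep the sketch self-contained.

```java
import java.io.IOException;

/** Exception at the Employee level of abstraction (named in the text, body assumed). */
class EmployeeDataNotAvailable extends Exception {
    EmployeeDataNotAvailable(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical low-level store; stands in for whatever I/O code sits below Employee. */
interface EmployeeStore {
    String readTaxId(int employeeId) throws IOException;
}

class Employee {
    private final int id;
    private final EmployeeStore store;

    Employee(int id, EmployeeStore store) {
        this.id = id;
        this.store = store;
    }

    /** Translates low-level I/O failures into an exception that fits the class interface. */
    public String GetTaxId() throws EmployeeDataNotAvailable {
        try {
            return store.readTaxId(id);
        } catch (IOException e) {
            // Keep the original exception as the cause so diagnostic detail is not lost.
            throw new EmployeeDataNotAvailable("Tax ID unavailable for employee " + id, e);
        }
    }
}
```

Chaining the low-level exception as the cause keeps the detail available for debugging without making it part of the interface that callers must handle.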
Include in the exception message all information that led to the exception. Every
exception occurs in specific circumstances that are detected at the time the code
throws the exception. This information is invaluable to the person who reads the
exception message. Be sure the message contains the information needed to understand
why the exception was thrown. If the exception was thrown because of an array
index error, be sure the exception message includes the upper and lower array limits
and the value of the illegal index.
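For the array-index case, a message carrying the limits and the offending index might be built as in the following sketch. BoundedIntArray is a hypothetical class invented for this illustration.

```java
/** Hypothetical array with explicit lower and upper index limits. */
final class BoundedIntArray {
    private final int lowerLimit;
    private final int upperLimit;
    private final int[] values;

    BoundedIntArray(int lowerLimit, int upperLimit) {
        if (upperLimit < lowerLimit) {
            throw new IllegalArgumentException(
                "upperLimit " + upperLimit + " is below lowerLimit " + lowerLimit);
        }
        this.lowerLimit = lowerLimit;
        this.upperLimit = upperLimit;
        this.values = new int[upperLimit - lowerLimit + 1];
    }

    int get(int index) {
        if (index < lowerLimit || index > upperLimit) {
            // The message names the illegal index and both limits.
            throw new IndexOutOfBoundsException(
                "Illegal index " + index + "; legal range is "
                + lowerLimit + " to " + upperLimit + " inclusive");
        }
        return values[index - lowerLimit];
    }
}
```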
Avoid empty catch blocks. Sometimes it's tempting to pass off an exception that you
don't know what to do with, like this:
Bad Java Example of Ignoring an Exception
try {
   ...
   // lots of code
   ...
} catch ( AnException exception ) {
}
Such an approach says that either the code within the try block is wrong because it
raises an exception for no reason, or the code within the catch block is wrong because
it doesn't handle a valid exception. Determine which is the root cause of the problem,
and then fix either the try block or the catch block.
You might occasionally find rare circumstances in which an exception at a lower level
really doesn't represent an exception at the level of abstraction of the calling routine. If
that's the case, at least document why an empty catch block is appropriate. You could
"document" that case with comments or by logging a message to a file, as follows:
Good Java Example of Ignoring an Exception
try {
   ...
   // lots of code
   ...
} catch ( AnException exception ) {
   LogError( "Unexpected exception" );
}
Know the exceptions your library code throws. If you're working in a language that
doesn't require a routine or class to define the exceptions it throws, be sure you know
what exceptions are thrown by any library code you use. Failing to catch an exception
generated by library code will crash your program just as fast as failing to catch an
exception you generated yourself. If the library code doesn't document the exceptions it
throws, create prototyping code to exercise the libraries and flush out the exceptions.
Consider building a centralized exception reporter. One approach to ensuring consistency
in exception handling is to use a centralized exception reporter. The centralized
exception reporter provides a central repository for knowledge about what kinds
of exceptions there are, how each exception should be handled, formatting of exception
messages, and so on.
5316Here is an example of a simple exception handler that simply prints a diagnostic
5317message:
Further Reading: For a more detailed explanation of this technique, see Practical
Standards for Microsoft Visual Basic .NET (Foxall 2003).
Visual Basic Example of a Centralized Exception Reporter, Part 1
Sub ReportException( _
   ByVal className As String, _
   ByVal thisException As Exception _
)
   Dim message As String
   Dim caption As String
   message = "Exception: " & thisException.Message & "." & ControlChars.CrLf & _
      "Class: " & className & ControlChars.CrLf & _
      "Routine: " & thisException.TargetSite.Name & ControlChars.CrLf
   caption = "Exception"
   MessageBox.Show( message, caption, MessageBoxButtons.OK, _
      MessageBoxIcon.Exclamation )
End Sub
5338You would use this generic exception handler with code like this:
Visual Basic Example of a Centralized Exception Reporter, Part 2
Try
   ...
Catch exceptionObject As Exception
   ReportException( CLASS_NAME, exceptionObject )
End Try
The code in this version of ReportException() is simple. In a real application, you
could make the code as simple or as elaborate as needed to meet your
exception-handling needs.
5348If you do decide to build a centralized exception reporter, be sure to consider the general
5349issues involved in centralized error handling, which are discussed in "Call an
5350error-processing routine/object" in Section 8.3.
Standardize your project's use of exceptions. To keep exception handling as intellectually
manageable as possible, you can standardize your use of exceptions in several
ways:
■ If you're working in a language like C++ that allows you to throw a variety of
kinds of objects, data, and pointers, standardize on what specifically you will
throw. For compatibility with other languages, consider throwing only objects
derived from the Exception base class.
■ Consider creating your own project-specific exception class, which can serve as
the base class for all exceptions thrown on your project. This supports centralizing
and standardizing logging, error reporting, and so on.
■ Define the specific circumstances under which code is allowed to use throw-catch
syntax to perform error processing locally.
■ Define the specific circumstances under which code is allowed to throw an
exception that won't be handled locally.
■ Determine whether a centralized exception reporter will be used.
■ Define whether exceptions are allowed in constructors and destructors.
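The project-specific base class suggested in the list above could be sketched in Java roughly as follows. The names ProjectException, ConfigurationException, and toLogLine() are hypothetical, and deriving from RuntimeException rather than Exception is a choice each project would make for itself.

```java
/** Hypothetical project-wide base class for every exception thrown on the project. */
class ProjectException extends RuntimeException {
    private final String component;

    ProjectException(String component, String message, Throwable cause) {
        super(message, cause);
        this.component = component;
    }

    String component() {
        return component;
    }

    /** One place to standardize how every project exception is formatted for logs. */
    String toLogLine() {
        return "[" + component + "] " + getClass().getSimpleName() + ": " + getMessage();
    }
}

/** Example of a specific exception derived from the project base class. */
class ConfigurationException extends ProjectException {
    ConfigurationException(String message) {
        super("config", message, null);
    }
}
```

Because every exception shares the base class, a single catch of ProjectException at the top level can log all of them in the same format.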
Cross-Reference: For numerous alternative error-handling approaches, see
Section 8.3, "Error-Handling Techniques," earlier in this chapter.
Consider alternatives to exceptions. Several programming languages have supported
exceptions for 5-10 years or more, but little conventional wisdom has emerged
about how to use them safely.
Some programmers use exceptions to handle errors just because their language provides
that particular error-handling mechanism. You should always consider the full
set of error-handling alternatives: handling the error locally, propagating the error by
using an error code, logging debug information to a file, shutting down the system, or
using some other approach. Handling errors with exceptions just because your language
provides exception handling is a classic example of programming in a language
rather than programming into a language. (For details on that distinction, see Section
4.3, "Your Location on the Technology Wave," and Section 34.4, "Program into Your
Language, Not in It.")
5386Finally, consider whether your program really needs to handle exceptions, period. As
5387Bjarne Stroustrup points out, sometimes the best response to a serious run-time error
5388is to release all acquired resources and abort. Let the user rerun the program with
5389proper input (Stroustrup 1997).
8.5 Barricade Your Program to Contain the Damage Caused by Errors
Barricades are a damage-containment strategy. The reason is similar to that for having
isolated compartments in the hull of a ship. If the ship runs into an iceberg and pops
open the hull, that compartment is shut off and the rest of the ship isn't affected. Barricades
are also similar to firewalls in a building. A building's firewalls prevent fire from spreading
from one part of a building to another part. (Barricades used to be called "firewalls,"
but the term "firewall" now commonly refers to blocking hostile network traffic.)
One way to barricade for defensive programming purposes is to designate certain
interfaces as boundaries to "safe" areas. Check data crossing the boundaries of a safe
area for validity, and respond sensibly if the data isn't valid. Figure 8-2 illustrates
this concept.
5403Figure 8-2 Defining some parts of the software that work with dirty data and some that
5404work with clean data can be an effective way to relieve the majority of the code of the
5405responsibility for checking for bad data.
This same approach can be used at the class level. The class's public methods assume
the data is unsafe, and they are responsible for checking the data and sanitizing it.
Once the data has been accepted by the class's public methods, the class's private
methods can assume the data is safe.
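A class-level barricade might look like the following sketch, in which the public method validates its input and the private method merely asserts what should already be true. The Account class and its methods are hypothetical, invented for this illustration.

```java
/** Hypothetical class whose public interface forms the barricade. */
final class Account {
    private long balanceCents;

    /** Public method: treats its argument as untrusted and validates it. */
    public void deposit(long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException(
                "Deposit must be positive, got " + amountCents);
        }
        applyDeposit(amountCents);
    }

    /** Private method: inside the barricade, so it only asserts its assumption. */
    private void applyDeposit(long amountCents) {
        assert amountCents > 0 : "public methods should have validated amountCents";
        balanceCents += amountCents;
    }

    public long balanceCents() {
        return balanceCents;
    }
}
```

If the assertion ever fires, the fault lies in the class's own code rather than in the caller's data, because nothing reaches applyDeposit() without passing through deposit().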
Another way of thinking about this approach is as an operating-room technique. Data
is sterilized before it's allowed to enter the operating room. Anything that's in the
operating room is assumed to be safe. The key design decision is deciding what to put
in the operating room, what to keep out, and where to put the doors: which routines
are considered to be inside the safety zone, which are outside, and which sanitize the
data. The easiest way to do this is usually by sanitizing external data as it arrives, but
data often needs to be sanitized at more than one level, so multiple levels of sterilization
are sometimes required.
Convert input data to the proper type at input time. Input typically arrives in the
form of a string or number. Sometimes the value will map onto a boolean type like
"yes" or "no." Sometimes the value will map onto an enumerated type like Color_Red,
Color_Green, and Color_Blue. Carrying data of questionable type for any length of time
in a program increases complexity and increases the chance that someone can crash
your program by inputting a color like "Yes." Convert input data to the proper form as
soon as possible after it's input.
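As a sketch of converting at input time, the following Java code turns raw strings into a boolean or an enumerated color the moment they arrive, and rejects anything else. The Color enum and the InputConversion helper are hypothetical names chosen for this example.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical enumerated type standing in for Color_Red, Color_Green, and Color_Blue. */
enum Color { RED, GREEN, BLUE }

/** Hypothetical helper that converts raw input strings to typed values. */
final class InputConversion {
    /** Returns the matching color, or empty if the input is not a legal color name. */
    static Optional<Color> parseColor(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Color.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            // Not a color name: reject it here instead of carrying it further.
            return Optional.empty();
        }
    }

    /** Maps "yes" and "no" (any letter case) onto a boolean; anything else is rejected. */
    static Optional<Boolean> parseYesNo(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "yes": return Optional.of(true);
            case "no":  return Optional.of(false);
            default:    return Optional.empty();
        }
    }
}
```

With this in place, a color entered as "Yes" is rejected at the boundary, and the rest of the program only ever sees a Color value.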
[Figure 8-2 labels: External sources (Graphical User Interface, Command Line Interface,
Real-time Data Feed, External Files, other external objects): data here is assumed to be
dirty and untrusted. Validation Classes 1 through n: these classes are responsible for
cleaning the data; they make up the barricade. Internal Classes 1 through n: these
classes can assume data is clean and trusted.]
5475Relationship Between Barricades and Assertions
The use of barricades makes the distinction between assertions and error handling
clean-cut. Routines that are outside the barricade should use error handling because it
isn't safe to make any assumptions about the data. Routines inside the barricade
should use assertions, because the data passed to them is supposed to be sanitized
before it's passed across the barricade. If one of the routines inside the barricade
detects bad data, that's an error in the program rather than an error in the data.
5482The use of barricades also illustrates the value of deciding at the architectural level
5483how to handle errors. Deciding which code is inside and which is outside the barricade
5484is an architecture-level decision.
54858.6 Debugging Aids
5486Another key aspect of defensive programming is the use of debugging aids, which can
5487be a powerful ally in quickly detecting errors.
Don't Automatically Apply Production Constraints to the Development Version
Further Reading: For more on using debug code to support defensive programming,
see Writing Solid Code (Maguire 1993).
A common programmer blind spot is the assumption that limitations of the production
software apply to the development version. The production version has to run
fast. The development version might be able to run slow. The production version has
to be stingy with resources. The development version might be allowed to use
resources extravagantly. The production version shouldn't expose dangerous operations
to the user. The development version can have extra operations that you can use
without a safety net.
One program I worked on made extensive use of a quadruply linked list. The
linked-list code was error prone, and the linked list tended to get corrupted. I added a menu
option to check the integrity of the linked list.
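The original integrity checker isn't shown, but a much simpler cousin for a doubly linked list gives the flavor of such a debugging aid. Everything in this sketch, including the class name CheckedLinkedList and the wording of the messages, is an assumption made for illustration.

```java
import java.util.Optional;

/** Hypothetical doubly linked list with a built-in integrity check for development use. */
final class CheckedLinkedList {
    static final class Node {
        final int value;
        Node prev;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    Node head;
    Node tail;
    int size;

    void addLast(int value) {
        Node node = new Node(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
            node.prev = tail;
        }
        tail = node;
        size++;
    }

    /** Debugging aid: empty if the links are consistent, else a description of the first problem. */
    Optional<String> checkIntegrity() {
        int count = 0;
        Node last = null;
        for (Node node = head; node != null; node = node.next) {
            if (node.prev != last) {
                return Optional.of("bad prev link at position " + count);
            }
            if (++count > size) {
                // Bounding the walk by size guarantees termination even if there is a cycle.
                return Optional.of("more nodes than size " + size + " (possible cycle)");
            }
            last = node;
        }
        if (last != tail) {
            return Optional.of("tail does not point at the last node");
        }
        if (count != size) {
            return Optional.of("size is " + size + " but " + count + " nodes are linked");
        }
        return Optional.empty();
    }
}
```

A check like this is too slow to run on every operation in production, which is exactly why it suits a development-only menu option.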
5505In debug mode, Microsoft Word contains code in the idle loop that checks the integrity
5506of the Document object every few seconds. This helps to detect data corruption
5507quickly, and it makes for easier error diagnosis.
5508Be willing to trade speed and resource usage during development in exchange for
5509built-in tools that can make development go more smoothly.
5512Introduce Debugging Aids Early
The earlier you introduce debugging aids, the more they'll help. Typically, you won't
go to the effort of writing a debugging aid until after you've been bitten by a problem
several times. If you write the aid after the first time, however, or use one from a previous
project, it will help throughout the project.
5517Use Offensive Programming
Cross-Reference: For more details on handling unanticipated cases, see "Tips for
Using case Statements" in Section 15.2.
Exceptional cases should be handled in a way that makes them obvious during
development and recoverable when production code is running. Michael Howard
and David LeBlanc refer to this approach as "offensive programming" (Howard and
LeBlanc 2003).
Suppose you have a case statement that you expect to handle only five kinds of
events. During development, the default case should be used to generate a warning
that says "Hey! There's another case here! Fix the program!" During production,
however, the default case should do something more graceful, like writing a message
to an error-log file.
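A sketch of that default case in Java follows. The five event names, the Mode enum, and the in-memory stand-in for the error-log file are all assumptions made so that the example is self-contained.

```java
/** Hypothetical dispatcher whose default case fails hard in development only. */
final class EventDispatcher {
    enum Mode { DEVELOPMENT, PRODUCTION }

    private final Mode mode;
    // Stand-in for an error-log file so the sketch has no I/O dependencies.
    private final StringBuilder errorLog = new StringBuilder();

    EventDispatcher(Mode mode) {
        this.mode = mode;
    }

    String handle(String eventKind) {
        switch (eventKind) {
            case "open":  return "opened";
            case "close": return "closed";
            case "save":  return "saved";
            case "print": return "printed";
            case "quit":  return "quitting";
            default:      return handleUnexpected(eventKind);
        }
    }

    private String handleUnexpected(String eventKind) {
        String message = "Unhandled event kind: " + eventKind;
        if (mode == Mode.DEVELOPMENT) {
            // Development: make the problem impossible to overlook.
            throw new IllegalStateException(message + ". Fix the program!");
        }
        // Production: record the problem and carry on.
        errorLog.append(message).append('\n');
        return "ignored";
    }

    String errorLog() {
        return errorLog.toString();
    }
}
```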
"A dead program normally does a lot less damage than a crippled one."
- Andy Hunt and Dave Thomas
5537Here are some ways you can program offensively:
■ Make sure asserts abort the program. Don't allow programmers to get into the
habit of just hitting the Enter key to bypass a known problem. Make the problem
painful enough that it will be fixed.
■ Completely fill any memory allocated so that you can detect memory allocation
errors.
■ Completely fill any files or streams allocated to flush out any file-format errors.
■ Be sure the code in each case statement's default or else clause fails hard (aborts
the program) or is otherwise impossible to overlook.
■ Fill an object with junk data just before it's deleted.
■ Set up the program to e-mail error log files to yourself so that you can see the
kinds of errors that are occurring in the released software, if that's appropriate
for the kind of software you're developing.
5550Sometimes the best defense is a good offense. Fail hard during development so that
5551you can fail softer during production.
5552Plan to Remove Debugging Aids
If you're writing code for your own use, it might be fine to leave all the debugging code
in the program. If you're writing code for commercial use, the performance penalty in
size and speed can be prohibitive. Plan to avoid shuffling debugging code in and out
of a program. Here are several ways to do that:
Cross-Reference: For details on version control, see Section 28.2, "Configuration
Management."
Use version-control tools and build tools like ant and make. Version-control and build
tools can build different versions of a program from the same source files. In development
mode, you can set the build tool to include all the debug code. In production mode,
you can set it to exclude any debug code you don't want in the commercial version.
Use a built-in preprocessor. If your programming environment has a preprocessor
(as C++ does, for example), you can include or exclude debug code at the flick of a compiler
switch. You can use the preprocessor directly or by writing a macro that works
with preprocessor definitions. Here's an example of writing code using the preprocessor
directly:
C++ Example of Using the Preprocessor Directly to Control Debug Code
// To include the debugging code, use #define to define the symbol DEBUG.
// To exclude the debugging code, don't define DEBUG.
#define DEBUG
...
#if defined( DEBUG )
// debugging code
...
#endif
5583This theme has several variations. Rather than just defining DEBUG, you can assign it
5584a value and then test for the value rather than testing whether it’s defined. That way
5585you can differentiate between different levels of debug code. You might have some
5586debug code that you want in your program all the time, so you surround that by a
5587statement like #if DEBUG > 0. Other debug code might be for specific purposes only,
5588so you can surround it by a statement like #if DEBUG == POINTER_ERROR. In other
5589places, you might want to set debug levels, so you could have statements like #if
5590DEBUG > LEVEL_A.
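Here's a minimal sketch of the leveled variation. The level names (DEBUG_LEVEL_NONE, DEBUG_LEVEL_BASIC, DEBUG_LEVEL_VERBOSE) and the routine name are assumptions made up for illustration; they don't come from any standard header.

```cpp
#include <cstdio>

// Hypothetical level names; pick whatever scheme suits your project.
#define DEBUG_LEVEL_NONE    0
#define DEBUG_LEVEL_BASIC   1
#define DEBUG_LEVEL_VERBOSE 2

// The build normally assigns DEBUG a value. This fallback is only here
// so that the sketch compiles on its own.
#ifndef DEBUG
#define DEBUG DEBUG_LEVEL_BASIC
#endif

// Runs whatever debug code was compiled in and returns how many checks ran.
int RunCompiledInDebugChecks() {
    int checksRun = 0;
#if DEBUG > DEBUG_LEVEL_NONE
    // debug code that you want in every debug build
    std::puts( "basic check" );
    ++checksRun;
#endif
#if DEBUG >= DEBUG_LEVEL_VERBOSE
    // slower checks that you want only at the most detailed level
    std::puts( "verbose check" );
    ++checksRun;
#endif
    return checksRun;
}
```

With the fallback level shown, only the first block is compiled in; raising DEBUG to DEBUG_LEVEL_VERBOSE would compile in both.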
5591If you don’t like having #if defined()s spread throughout your code, you can write a
5592preprocessor macro to accomplish the same task. Here’s an example:
C++ Example of Using a Preprocessor Macro to Control Debug Code
#define DEBUG
#if defined( DEBUG )
#define DebugCode( code_fragment ) { code_fragment }
#else
#define DebugCode( code_fragment )
#endif
...
// This code is included or excluded, depending on whether DEBUG has been defined.
DebugCode(
   statement 1;
   statement 2;
   ...
   statement n;
);
...
5613As in the first example of using the preprocessor, this technique can be altered in a
5614variety of ways that make it more sophisticated than completely including all debug
5615code or completely excluding all of it.
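One such alteration, sketched here under the assumption that your compiler supports variadic macros (C++11 or later), lets the code fragment contain top-level commas. A single-parameter macro would treat a declaration like "int low = first, high = second;" as two arguments. The routine name TraceSum is hypothetical.

```cpp
#include <cstdio>

#ifndef DEBUG
#define DEBUG
#endif

#if defined( DEBUG )
// __VA_ARGS__ collects the whole fragment, commas and all.
#define DebugCode( ... ) do { __VA_ARGS__ } while ( 0 )
#else
#define DebugCode( ... ) do { } while ( 0 )
#endif

// Returns the number of debug statements that ran, so the effect is visible.
int TraceSum( int first, int second ) {
    int debugStatementsRun = 0;
    DebugCode(
        int low = first, high = second;
        std::printf( "sum of %d and %d is %d\n", low, high, low + high );
        ++debugStatementsRun;
    );
    return debugStatementsRun;
}
```

Wrapping the fragment in do { } while ( 0 ) also makes the macro behave like a single statement after an if or else.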
5616Cross-Reference For more
5617information on preprocessors
5618and for direction to
5619sources of information on
5620writing one of your own, see
“Macro Preprocessors” in
5622Section 30.3.
5623Write your own preprocessor If a language doesn’t include a preprocessor, it’s fairly
5624easy to write one for including and excluding debug code. Establish a convention for
5625designating debug code, and write your precompiler to follow that convention. For
5626example, in Java you could write a precompiler to respond to the keywords //#BEGIN
5627DEBUG and //#END DEBUG. Write a script to call the preprocessor, and then compile
5628the processed code. You’ll save time in the long run, and you won’t mistakenly
5629compile the unpreprocessed code.
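The core of such a precompiler can be small. The following is a rough sketch, written in C++ for consistency with the other examples, that drops every line between the //#BEGIN DEBUG and //#END DEBUG markers. The function names are hypothetical, and a real tool would also read and write files and complain about unbalanced markers.

```cpp
#include <sstream>
#include <string>

// True if text begins with prefix.
static bool StartsWith( const std::string &text, const std::string &prefix ) {
    return text.compare( 0, prefix.size(), prefix ) == 0;
}

// Returns a copy of source without the debug blocks or their marker lines.
// Each kept line is written back with a trailing newline.
std::string StripDebugBlocks( const std::string &source ) {
    std::istringstream input( source );
    std::string output;
    std::string line;
    bool insideDebugBlock = false;

    while ( std::getline( input, line ) ) {
        // Skip leading blanks so that indented markers are still recognized.
        const std::string::size_type firstNonBlank = line.find_first_not_of( " \t" );
        const std::string trimmed =
            ( firstNonBlank == std::string::npos ) ? std::string() : line.substr( firstNonBlank );

        if ( StartsWith( trimmed, "//#BEGIN DEBUG" ) ) {
            insideDebugBlock = true;
        } else if ( StartsWith( trimmed, "//#END DEBUG" ) ) {
            insideDebugBlock = false;
        } else if ( !insideDebugBlock ) {
            output += line;
            output += '\n';
        }
    }
    return output;
}
```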
5630Cross-Reference For details
5631on stubs, see “Building Scaffolding
5632to Test Individual
Routines” in Section 22.5.
5634Use debugging stubs In many instances, you can call a routine to do debugging
5635checks. During development, the routine might perform several operations before control
5636returns to the caller. For production code, you can replace the complicated routine
5637with a stub routine that merely returns control immediately to the caller or that performs
5638a couple of quick operations before returning control. This approach incurs only
5639a small performance penalty, and it’s a quicker solution than writing your own preprocessor.
5640Keep both the development and production versions of the routines so that you
5641can switch back and forth during future development and production.
5642You might start with a routine designed to check pointers that are passed to it:
C++ Example of a Routine That Uses a Debugging Stub
void DoSomething(
   SOME_TYPE *pointer,
   ...
) {
   // check parameters passed in
   // (this line calls the routine to check the pointer)
   CheckPointer( pointer );
   ...
}
5654During development, the CheckPointer() routine would perform full checking on the
5655pointer. It would be slow but effective, and it could look like this:
C++ Example of a Routine for Checking Pointers During Development
// This routine checks any pointer that’s passed to it. It can be used during
// development to perform as many checks as you can bear.
void CheckPointer( void *pointer ) {
   // perform check 1--maybe check that it's not NULL
   // perform check 2--maybe check that its dogtag is legitimate
   // perform check 3--maybe check that what it points to isn't corrupted
   ...
   // perform check n--...
}
5670When the code is ready for production, you might not want all the overhead associated
5671with this pointer checking. You could swap out the preceding routine and swap
5672in this routine:
C++ Example of a Routine for Checking Pointers During Production
// This routine just returns immediately to the caller.
void CheckPointer( void *pointer ) {
   // no code; just return to caller
}
5679This is not an exhaustive survey of all the ways you can plan to remove debugging
5680aids, but it should be enough to give you an idea for some things that will work in your
5681environment.
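If you keep both versions of a stubbed routine, one way to switch between them is to let the build pick. This is only a sketch: the PRODUCTION_BUILD symbol and the g_badPointerCount tally are assumptions invented for the example, and the development version reports a bad pointer rather than halting so that the effect is easy to observe.

```cpp
#include <cstdio>

// Hypothetical tally of failed checks; it exists only to make the effect visible.
static int g_badPointerCount = 0;

#if defined( PRODUCTION_BUILD )   // hypothetical symbol set by the production build

// Production stub: returns immediately to the caller.
void CheckPointer( const void * ) {
}

#else

// Development version: performs the slow checks.
void CheckPointer( const void *pointer ) {
    if ( pointer == nullptr ) {
        std::fprintf( stderr, "CheckPointer: null pointer\n" );
        ++g_badPointerCount;
    }
    // further checks would go here
}

#endif

void DoSomething( int *pointer ) {
    // check parameters passed in
    CheckPointer( pointer );
    if ( pointer != nullptr ) {
        *pointer += 1;
    }
}
```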
56828.7 Determining How Much Defensive Programming to
5683Leave in Production Code
5684One of the paradoxes of defensive programming is that during development, you’d like
5685an error to be noticeable—you’d rather have it be obnoxious than risk overlooking it. But
5686during production, you’d rather have the error be as unobtrusive as possible, to have the
5687program recover or fail gracefully. Here are some guidelines for deciding which defensive
5688programming tools to leave in your production code and which to leave out:
5689Leave in code that checks for important errors Decide which areas of the program
5690can afford to have undetected errors and which areas cannot. For example, if you were
5691writing a spreadsheet program, you could afford to have undetected errors in the
5692screen-update area of the program because the main penalty for an error is only a
5693messy screen. You could not afford to have undetected errors in the calculation engine
5694because such errors might result in subtly incorrect results in someone’s spreadsheet.
5695Most users would rather suffer a messy screen than incorrect tax calculations and an
5696audit by the IRS.
5697Remove code that checks for trivial errors If an error has truly trivial consequences,
5698remove code that checks for it. In the previous example, you might remove the code
that checks the spreadsheet screen update. “Remove” doesn’t mean physically remove
5700the code. It means use version control, precompiler switches, or some other technique
5701to compile the program without that particular code. If space isn’t a problem, you
5702could leave in the error-checking code but have it log messages to an error-log file
5703unobtrusively.
5704Remove code that results in hard crashes As I mentioned, during development,
5705when your program detects an error, you’d like the error to be as noticeable as possible
5706so that you can fix it. Often, the best way to accomplish that goal is to have the program
5707print a debugging message and crash when it detects an error. This is useful
5708even for minor errors.
During production, your users need a chance to save their work before the program
crashes, and they are probably willing to tolerate a few anomalies in exchange for keeping
5712the program going long enough for them to do that. Users don’t appreciate anything
5713that results in the loss of their work, regardless of how much it helps debugging
5714and ultimately improves the quality of the program. If your program contains debugging
5715code that could cause a loss of data, take it out of the production version.
5716Leave in code that helps the program crash gracefully If your program contains
5717debugging code that detects potentially fatal errors, leave the code in that allows the
5718program to crash gracefully. In the Mars Pathfinder, for example, engineers left some
5719of the debug code in by design. An error occurred after the Pathfinder had landed. By
5720using the debug aids that had been left in, engineers at JPL were able to diagnose the
5721problem and upload revised code to the Pathfinder, and the Pathfinder completed its
5722mission perfectly (March 1999).
5723Log errors for your technical support personnel Consider leaving debugging aids in
5724the production code but changing their behavior so that it’s appropriate for the production
5725version. If you’ve loaded your code with assertions that halt the program during
5726development, you might consider changing the assertion routine to log messages
5727to a file during production rather than eliminating them altogether.
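As a rough sketch of that idea, the assertion macro below halts in a development build and logs quietly otherwise. The names APP_ASSERT, DEVELOPMENT_BUILD, LogAssertionFailure(), and the log file name error.log are all assumptions made for illustration.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical tally, so that the sketch has an effect you can observe.
static int g_assertionFailuresLogged = 0;

// Appends one line to a log file. The file name is an assumption.
void LogAssertionFailure( const char *expression, const char *file, int line ) {
    ++g_assertionFailuresLogged;
    std::FILE *log = std::fopen( "error.log", "a" );
    if ( log != nullptr ) {
        std::fprintf( log, "%s(%d): assertion failed: %s\n", file, line, expression );
        std::fclose( log );
    }
}

#if defined( DEVELOPMENT_BUILD )   // hypothetical symbol set by the development build
// Development: print the failure and halt so that it can't be overlooked.
#define APP_ASSERT( condition ) \
    do { \
        if ( !( condition ) ) { \
            std::fprintf( stderr, "%s(%d): assertion failed: %s\n", \
                __FILE__, __LINE__, #condition ); \
            std::abort(); \
        } \
    } while ( 0 )
#else
// Production: log the failure quietly and keep running.
#define APP_ASSERT( condition ) \
    do { \
        if ( !( condition ) ) { \
            LogAssertionFailure( #condition, __FILE__, __LINE__ ); \
        } \
    } while ( 0 )
#endif
```

The calling code stays the same in both builds; only the macro's behavior changes.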
5728Make sure that the error messages you leave in are friendly If you leave internal
5729error messages in the program, verify that they’re in language that’s friendly to the
5730user. In one of my early programs, I got a call from a user who reported that she’d gotten
a message that read “You’ve got a bad pointer allocation, Dog Breath!” Fortunately
for me, she had a sense of humor. A common and effective approach is to notify the
user of an “internal error” and list an e-mail address or phone number the user can
5734use to report it.
57358.8 Being Defensive About Defensive Programming
5736Too much of anything is bad,
5737but too much whiskey is just
5738enough.
5739—Mark Twain
5740Too much defensive programming creates problems of its own. If you check data
5741passed as parameters in every conceivable way in every conceivable place, your program
5742will be fat and slow. What’s worse, the additional code needed for defensive programming
5743adds complexity to the software. Code installed for defensive
5744programming is not immune to defects, and you’re just as likely to find a defect in
5745defensive-programming code as in any other code—more likely, if you write the code
5746casually. Think about where you need to be defensive, and set your defensive-programming
5747priorities accordingly.
5749cc2e.com/0868 CHECKLIST: Defensive Programming
5750General
❑ Does the routine protect itself from bad input data?
❑ Have you used assertions to document assumptions, including preconditions
and postconditions?
❑ Have assertions been used only to document conditions that should never
occur?
❑ Does the architecture or high-level design specify a specific set of error-handling
techniques?
❑ Does the architecture or high-level design specify whether error handling
should favor robustness or correctness?
❑ Have barricades been created to contain the damaging effect of errors and
reduce the amount of code that has to be concerned about error processing?
❑ Have debugging aids been used in the code?
❑ Have debugging aids been installed in such a way that they can be activated
or deactivated without a great deal of fuss?
❑ Is the amount of defensive programming code appropriate, neither too
much nor too little?
❑ Have you used offensive-programming techniques to make errors difficult
to overlook during development?
5769Exceptions
❑ Has your project defined a standardized approach to exception handling?
❑ Have you considered alternatives to using an exception?
❑ Is the error handled locally rather than throwing a nonlocal exception, if
possible?
❑ Does the code avoid throwing exceptions in constructors and destructors?
❑ Are all exceptions at the appropriate levels of abstraction for the routines
that throw them?
❑ Does each exception include all relevant exception background information?
❑ Is the code free of empty catch blocks? (Or if an empty catch block truly is
appropriate, is it documented?)
5781Security Issues
❑ Does the code that checks for bad input data check for attempted buffer
overflows, SQL injection, HTML injection, integer overflows, and other
malicious inputs?
❑ Are all error-return codes checked?
❑ Are all exceptions caught?
❑ Do error messages avoid providing information that would help an
attacker break into the system?
5789Additional Resources
5790cc2e.com/0875 Take a look at the following defensive-programming resources:
5791Security
5792Howard, Michael, and David LeBlanc. Writing Secure Code, 2d ed. Redmond, WA:
5793Microsoft Press, 2003. Howard and LeBlanc cover the security implications of trusting
5794input. The book is eye-opening in that it illustrates just how many ways a program can
5795be breached—some of which have to do with construction practices and many of which
5796don’t. The book spans a full range of requirements, design, code, and test issues.
5797Assertions
5798Maguire, Steve. Writing Solid Code. Redmond, WA: Microsoft Press, 1993. Chapter 2
5799contains an excellent discussion on the use of assertions, including several interesting
5800examples of assertions in well-known Microsoft products.
5801Stroustrup, Bjarne. The C++ Programming Language, 3d ed. Reading, MA: Addison-
5802Wesley, 1997. Section 24.3.7.2 describes several variations on the theme of implementing
5803assertions in C++, including the relationship between assertions and preconditions
5804and postconditions.
5805Meyer, Bertrand. Object-Oriented Software Construction, 2d ed. New York, NY: Prentice
5806Hall PTR, 1997. This book contains the definitive discussion of preconditions and
5807postconditions.
5808Exceptions
5809Meyer, Bertrand. Object-Oriented Software Construction, 2d ed. New York, NY: Prentice
5810Hall PTR, 1997. Chapter 12 contains a detailed discussion of exception handling.
5812Stroustrup, Bjarne. The C++ Programming Language, 3d ed. Reading, MA: Addison-
5813Wesley, 1997. Chapter 14 contains a detailed discussion of exception handling in C++.
5814Section 14.11 contains an excellent summary of 21 tips for handling C++ exceptions.
5815Meyers, Scott. More Effective C++: 35 New Ways to Improve Your Programs and Designs.
5816Reading, MA: Addison-Wesley, 1996. Items 9–15 describe numerous nuances of
5817exception handling in C++.
5818Arnold, Ken, James Gosling, and David Holmes. The Java Programming Language, 3d
5819ed. Boston, MA: Addison-Wesley, 2000. Chapter 8 contains a discussion of exception
5820handling in Java.
5821Bloch, Joshua. Effective Java Programming Language Guide. Boston, MA: Addison-Wesley,
58222001. Items 39–47 describe nuances of exception handling in Java.
5823Foxall, James. Practical Standards for Microsoft Visual Basic .NET. Redmond, WA:
5824Microsoft Press, 2003. Chapter 10 describes exception handling in Visual Basic.
5825Key Points
5826■Production code should handle errors in a more sophisticated way than “garbage
5827in, garbage out.â€
5828â– Defensive-programming techniques make errors easier to find, easier to fix, and
5829less damaging to production code.
5830â– Assertions can help detect errors early, especially in large systems, high-reliability
5831systems, and fast-changing code bases.
5832â– The decision about how to handle bad inputs is a key error-handling decision
5833and a key high-level design decision.
5834â– Exceptions provide a means of handling errors that operates in a different
5835dimension from the normal flow of the code. They are a valuable addition to the
5836programmer’s intellectual toolbox when used with care, and they should be
5837weighed against other error-processing techniques.
5838â– Constraints that apply to the production system do not necessarily apply to the
5839development version. You can use that to your advantage, adding code to the
5840development version that helps to flush out errors quickly.
5841
5843Chapter 9
5844The Pseudocode
5845Programming Process
5846cc2e.com/0936 Contents
■ 9.1 Summary of Steps in Building Classes and Routines
■ 9.2 Pseudocode for Pros
■ 9.3 Constructing Routines by Using the PPP
■ 9.4 Alternatives to the PPP
5851Related Topics
■ Creating high-quality classes: Chapter 6
■ Characteristics of high-quality routines: Chapter 7
■ Design in Construction: Chapter 5
■ Commenting style: Chapter 32
5856Although you could view this whole book as an extended description of the programming
5857process for creating classes and routines, this chapter puts the steps in context.
5858This chapter focuses on programming in the small—on the specific steps for building
5859an individual class and its routines, the steps that are critical on projects of all sizes.
5860The chapter also describes the Pseudocode Programming Process (PPP), which
5861reduces the work required during design and documentation and improves the quality
5862of both.
5863If you’re an expert programmer, you might just skim this chapter, but look at the summary
5864of steps and review the tips for constructing routines using the Pseudocode Programming
5865Process in Section 9.3. Few programmers exploit the full power of the
5866process, and it offers many benefits.
5867The PPP is not the only procedure for creating classes and routines. Section 9.4, at the
5868end of this chapter, describes the most popular alternatives, including test-first development
5869and design by contract.
58719.1 Summary of Steps in Building Classes and Routines
5872Class construction can be approached from numerous directions, but usually it’s an
5873iterative process of creating a general design for the class, enumerating specific routines
5874within the class, constructing specific routines, and checking class construction
5875as a whole. As Figure 9-1 suggests, class creation can be a messy process for all the reasons
5876that design is a messy process (reasons that are described in Section 5.1, “Design
Challenges”).
5878Figure 9-1 Details of class construction vary, but the activities generally occur in the order
5879shown here.
5880Steps in Creating a Class
5881The key steps in constructing a class are:
5882Create a general design for the class Class design includes numerous specific issues.
Define the class’s specific responsibilities, define what “secrets” the class will hide, and
5884define exactly what abstraction the class interface will capture. Determine whether the
5885class will be derived from another class and whether other classes will be allowed to
5886derive from it. Identify the class’s key public methods, and identify and design any nontrivial
data members used by the class. Iterate through these topics as many times as
needed to create a straightforward design for the class. These considerations and
many others are discussed in more detail in Chapter 6, “Working Classes.”
[Figure 9-1 boxes: Begin; Create a general design for the class; Construct the routines within the class; Review and test the class as a whole; Done.]
5902Construct each routine within the class Once you’ve identified the class’s major routines
5903in the first step, you must construct each specific routine. Construction of each
5904routine typically unearths the need for additional routines, both minor and major, and
5905issues arising from creating those additional routines often ripple back to the overall
5906class design.
5907Review and test the class as a whole Normally, each routine is tested as it’s created.
5908After the class as a whole becomes operational, the class as a whole should be
5909reviewed and tested for any issues that can’t be tested at the individual-routine level.
5910Steps in Building a Routine
5911Many of a class’s routines will be simple and straightforward to implement: accessor
5912routines, pass-throughs to other objects’ routines, and the like. Implementation of
5913other routines will be more complicated, and creation of those routines benefits from
5914a systematic approach. The major activities involved in creating a routine—designing
5915the routine, checking the design, coding the routine, and checking the code—are typically
5916performed in the order shown in Figure 9-2.
5917Figure 9-2 These are the major activities that go into constructing a routine. They’re usually
5918performed in the order shown.
5919Experts have developed numerous approaches to creating routines, and my favorite
5920approach is the Pseudocode Programming Process, described in the next section.
[Figure 9-2 boxes: Begin; Design the routine; Check the design; Code the routine; Review and test the code; Done. Steps are labeled “Repeat if necessary.”]
59349.2 Pseudocode for Pros
The term “pseudocode” refers to an informal, English-like notation for describing how
5936an algorithm, a routine, a class, or a program will work. The Pseudocode Programming
5937Process defines a specific approach to using pseudocode to streamline the creation
5938of code within routines.
5939Because pseudocode resembles English, it’s natural to assume that any English-like
5940description that collects your thoughts will have roughly the same effect as any other.
5941In practice, you’ll find that some styles of pseudocode are more useful than others.
5942Here are guidelines for using pseudocode effectively:
■ Use English-like statements that precisely describe specific operations.
■ Avoid syntactic elements from the target programming language. Pseudocode
allows you to design at a slightly higher level than the code itself. When you use
programming-language constructs, you sink to a lower level, eliminating the
main benefit of design at a higher level, and you saddle yourself with unnecessary
syntactic restrictions.
Cross-Reference For details
on commenting at the level
of intent, see “Kinds of Comments”
in Section 32.4.
■ Write pseudocode at the level of intent. Describe the meaning of the approach
rather than how the approach will be implemented in the target language.
■ Write pseudocode at a low enough level that generating code from it will be
nearly automatic. If the pseudocode is at too high a level, it can gloss over problematic
details in the code. Refine the pseudocode in more and more detail until
it seems as if it would be easier to simply write the code.
5959Once the pseudocode is written, you build the code around it and the pseudocode
5960turns into programming-language comments. This eliminates most commenting
5961effort. If the pseudocode follows the guidelines, the comments will be complete and
5962meaningful.
5963Here’s an example of a design in pseudocode that violates virtually all the principles
5964just described:
5965Example of Bad Pseudocode
5966increment resource number by 1
5967allocate a dlg struct using malloc
5968if malloc() returns NULL then return 1
5969invoke OSrsrc_init to initialize a resource for the operating system
5970*hRsrcPtr = resource number
5971return 0
5972What is the intent of this block of pseudocode? Because it’s poorly written, it’s hard to
5973tell. This so-called pseudocode is bad because it includes target language coding
details, such as *hRsrcPtr (in specific C-language pointer notation) and malloc() (a specific
C-language function). This pseudocode block focuses on how the code will be
5979written rather than on the meaning of the design. It gets into coding details—whether
5980the routine returns a 1 or a 0. If you think about this pseudocode from the standpoint
5981of whether it will turn into good comments, you’ll begin to understand that it isn’t
5982much help.
5983Here’s a design for the same operation in a much-improved pseudocode:
Example of Good Pseudocode
Keep track of current number of resources in use
If another resource is available
   Allocate a dialog box structure
   If a dialog box structure could be allocated
      Note that one more resource is in use
      Initialize the resource
      Store the resource number at the location provided by the caller
   Endif
Endif
Return true if a new resource was created; else return false
5995This pseudocode is better than the first because it’s written entirely in English; it
5996doesn’t use any syntactic elements of the target language. In the first example, the
5997pseudocode could have been implemented only in C. In the second example, the
5998pseudocode doesn’t restrict the choice of languages. The second block of pseudocode
5999is also written at the level of intent. What does the second block of pseudocode mean?
6000It is probably easier for you to understand than the first block.
6001Even though it’s written in clear English, the second block of pseudocode is precise
6002and detailed enough that it can easily be used as a basis for programming-language
6003code. When the pseudocode statements are converted to comments, they’ll be a good
6004explanation of the code’s intent.
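To see how the conversion might go, here's a hedged sketch of the good pseudocode turned into C++, with each pseudocode statement kept as a comment. Every name in it (DialogBox, MAX_RESOURCES, CreateDialogResource(), and so on) is an assumption, since the pseudocode deliberately doesn't prescribe any.

```cpp
#include <memory>
#include <new>
#include <utility>

// All names here are hypothetical; the pseudocode doesn't prescribe them.
struct DialogBox {
    int resourceNumber;
};

const int MAX_RESOURCES = 4;
static std::unique_ptr<DialogBox> g_dialogBoxes[ MAX_RESOURCES ];

// keep track of current number of resources in use
static int g_resourcesInUse = 0;

bool CreateDialogResource( int *resourceNumberOut ) {
    bool resourceCreated = false;

    // if another resource is available
    if ( g_resourcesInUse < MAX_RESOURCES ) {

        // allocate a dialog box structure
        std::unique_ptr<DialogBox> dialogBox( new ( std::nothrow ) DialogBox );

        // if a dialog box structure could be allocated
        if ( dialogBox ) {
            // note that one more resource is in use
            ++g_resourcesInUse;

            // initialize the resource
            dialogBox->resourceNumber = g_resourcesInUse;

            // store the resource number at the location provided by the caller
            *resourceNumberOut = dialogBox->resourceNumber;

            // keep the structure alive for as long as the resource is in use
            g_dialogBoxes[ g_resourcesInUse - 1 ] = std::move( dialogBox );
            resourceCreated = true;
        }
    }

    // return true if a new resource was created; else return false
    return resourceCreated;
}
```

Notice that the comments say what the code is for, while the statements under them say how it's done.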
6005Here are the benefits you can expect from using this style of pseudocode:
■ Pseudocode makes reviews easier. You can review detailed designs without
examining source code. Pseudocode makes low-level design reviews easier and
reduces the need to review the code itself.
■ Pseudocode supports the idea of iterative refinement. You start with a high-level
design, refine the design to pseudocode, and then refine the pseudocode to
source code. This successive refinement in small steps allows you to check your
design as you drive it to lower levels of detail. The result is that you catch high-level
errors at the highest level, mid-level errors at the middle level, and low-level
errors at the lowest level, before any of them becomes a problem or contaminates
work at more detailed levels.
6017Further Reading For more
6018information on the advantages
6019of making changes at
6020the least-value stage, see
6021Andy Grove’s High Output
6022Management (Grove 1983).
■ Pseudocode makes changes easier. A few lines of pseudocode are easier to change
6024than a page of code. Would you rather change a line on a blueprint or rip out a
6025wall and nail in the two-by-fours somewhere else? The effects aren’t as physically
6026dramatic in software, but the principle of changing the product when it’s most
6027malleable is the same. One of the keys to the success of a project is to catch errors
at the “least-value stage,” the stage at which the least effort has been invested.
6029Much less has been invested at the pseudocode stage than after full coding, testing,
6030and debugging, so it makes economic sense to catch the errors early.
■ Pseudocode minimizes commenting effort. In the typical coding scenario, you
6032write the code and add comments afterward. In the PPP, the pseudocode statements
6033become the comments, so it actually takes more work to remove the comments
6034than to leave them in.
■ Pseudocode is easier to maintain than other forms of design documentation.
6036With other approaches, design is separated from the code, and when one
6037changes, the two fall out of agreement. With the PPP, the pseudocode statements
6038become comments in the code. As long as the inline comments are maintained,
6039the pseudocode’s documentation of the design will be accurate.
6040As a tool for detailed design, pseudocode is hard to beat. One survey found that programmers
6041prefer pseudocode for the way it eases construction in a programming language,
6042for its ability to help them detect insufficiently detailed designs, and for the
6043ease of documentation and ease of modification it provides (Ramsey, Atwood, and
6044Van Doren 1983). Pseudocode isn’t the only tool for detailed design, but pseudocode
6045and the PPP are useful tools to have in your programmer’s toolbox. Try them. The
6046next section shows you how.
60479.3 Constructing Routines by Using the PPP
6048This section describes the activities involved in constructing a routine, namely these:
■ Design the routine.
■ Code the routine.
■ Check the code.
■ Clean up loose ends.
■ Repeat as needed.
6054Design the Routine
6055Cross-Reference For details
6056on other aspects of design,
6057see Chapters 5 through 8.
6058Once you’ve identified a class’s routines, the first step in constructing any of the class’s
6059more complicated routines is to design it. Suppose that you want to write a routine to
6062output an error message depending on an error code, and suppose that you call the routine
6063ReportErrorMessage(). Here’s an informal spec for ReportErrorMessage():
6064ReportErrorMessage() takes an error code as an input argument and outputs
6065an error message corresponding to the code. It’s responsible for handling
6066invalid codes. If the program is operating interactively, ReportErrorMessage()
6067displays the message to the user. If it’s operating in command-line mode,
6068ReportErrorMessage() logs the message to a message file. After outputting the
6069message, ReportErrorMessage() returns a status value, indicating whether it
6070succeeded or failed.
6071The rest of the chapter uses this routine as a running example. The rest of this section
6072describes how to design the routine.
6073Cross-Reference For details
6074on checking prerequisites,
6075see Chapter 3, “Measure
6076Twice, Cut Once: Upstream
Prerequisites,” and Chapter 4,
“Key Construction Decisions.”
6079Check the prerequisites Before doing any work on the routine itself, check to see that
6080the job of the routine is well defined and fits cleanly into the overall design. Check to
6081be sure that the routine is actually called for, at the very least indirectly, by the
6082project’s requirements.
6083Define the problem the routine will solve State the problem the routine will solve in
6084enough detail to allow creation of the routine. If the high-level design is sufficiently
6085detailed, the job might already be done. The high-level design should at least indicate
6086the following:
■ The information the routine will hide
■ Inputs to the routine
■ Outputs from the routine
Cross-Reference For details
on preconditions and postconditions,
see “Use assertions
to document and verify
preconditions and postconditions”
in Section 8.2.
■ Preconditions that are guaranteed to be true before the routine is called (input
values within certain ranges, streams initialized, files opened or closed, buffers
filled or flushed, etc.)
■ Postconditions that the routine guarantees will be true before it passes control
back to the caller (output values within specified ranges, streams initialized, files
opened or closed, buffers filled or flushed, etc.)
6102Here’s how these concerns are addressed in the ReportErrorMessage() example:
■ The routine hides two facts: the error message text and the current processing
method (interactive or command line).
■ There are no preconditions guaranteed to the routine.
■ The input to the routine is an error code.
■ Two kinds of output are called for: the first is the error message, and the second
is the status that ReportErrorMessage() returns to the calling routine.
■ The routine guarantees that the status value will have a value of either Success or
Failure.
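One way to record those answers is right at the routine's interface. The sketch below is an illustration only: the Status enumerators, the use of a plain int for the error code, and the stand-in message table are assumptions, and the body writes every message to stderr rather than distinguishing interactive from command-line mode.

```cpp
#include <cstdio>

// Hypothetical names; the spec says only that the status is Success or Failure.
enum Status { Status_Success, Status_Failure };

// Reports the message that corresponds to errorCode.
// Hides: the message text and the processing mode (interactive or command line).
// Preconditions: none.
// Postcondition: the return value is either Status_Success or Status_Failure.
Status ReportErrorMessage( int errorCode ) {
    // Stand-in message table; a real routine would hide its actual message store here.
    static const char *const MESSAGES[] = { "File not found", "Out of memory" };
    const int messageCount = sizeof( MESSAGES ) / sizeof( MESSAGES[ 0 ] );

    // The routine, not the caller, is responsible for handling invalid codes.
    if ( errorCode < 0 || errorCode >= messageCount ) {
        std::fprintf( stderr, "Internal error: invalid error code %d\n", errorCode );
        return Status_Failure;
    }

    std::fprintf( stderr, "%s\n", MESSAGES[ errorCode ] );
    return Status_Success;
}
```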
6112Cross-Reference For details
6113on naming routines, see Section
61147.3, “Good Routine
Names.”
6116Name the routine Naming the routine might seem trivial, but good routine names
6117are one sign of a superior program and they’re not easy to come up with. In general, a
6118routine should have a clear, unambiguous name. If you have trouble creating a good
name, that usually indicates that the purpose of the routine isn’t clear. A vague, wishy-washy
6120name is like a politician on the campaign trail. It sounds as if it’s saying something,
6121but when you take a hard look, you can’t figure out what it means. If you can
6122make the name clearer, do so. If the wishy-washy name results from a wishy-washy
6123design, pay attention to the warning sign. Back up and improve the design.
6124In the example, ReportErrorMessage() is unambiguous. It is a good name.
Further Reading: For a different approach to construction that focuses on writing
test cases first, see Test-Driven Development: By Example (Beck 2003).
Decide how to test the routine As you're writing the routine, think about how you
can test it. This is useful for you when you do unit testing and for the tester who tests
your routine independently.
In the example, the input is simple, so you might plan to test ReportErrorMessage()
with all valid error codes and a variety of invalid codes.
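A test plan like this can be written down as a small table of cases before the routine exists. The sketch below is only an illustration: the Status and ErrorCode definitions, the assumption that codes 1 through 3 are the valid ones, and the stub body of ReportErrorMessage() are all hypothetical stand-ins, not part of the original example.

```cpp
#include <vector>

// Hypothetical stand-ins: the chapter names these types but never defines them.
enum Status { Status_Success, Status_Failure };
typedef int ErrorCode;

// Stub that follows the contract described above. Valid codes are ASSUMED
// to be 1 through 3; every other code is treated as invalid.
Status ReportErrorMessage( ErrorCode errorToReport ) {
   return ( errorToReport >= 1 && errorToReport <= 3 ) ? Status_Success : Status_Failure;
}

// One planned test case: an input code and the status we expect back.
struct TestCase {
   ErrorCode code;
   Status expectedStatus;
};

// Runs every planned case and returns how many did not match expectations.
int CountFailedCases() {
   const std::vector<TestCase> plannedCases = {
      // all valid error codes
      { 1, Status_Success }, { 2, Status_Success }, { 3, Status_Success },
      // a variety of invalid codes: zero, negative, just past the range, far out
      { 0, Status_Failure }, { -1, Status_Failure }, { 4, Status_Failure },
      { 9999, Status_Failure }
   };

   int failedCases = 0;
   for ( const TestCase& testCase : plannedCases ) {
      if ( ReportErrorMessage( testCase.code ) != testCase.expectedStatus ) {
         ++failedCases;
      }
   }
   return failedCases;
}
```

Keeping the cases in a table makes it easy to see at a glance whether "all valid codes" and "a variety of invalid codes" are really covered.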
Research functionality available in the standard libraries The single biggest way to
improve both the quality of your code and your productivity is to reuse good code. If
you find yourself grappling to design a routine that seems overly complicated, ask
whether some or all of the routine's functionality might already be available in the
library code of the language, platform, or tools you're using. Ask whether the code
might be available in library code maintained by your company. Many algorithms
have already been invented, tested, discussed in the trade literature, reviewed, and
improved. Rather than spending your time inventing something when someone has
already written a Ph.D. dissertation on it, take a few minutes to look through the code
that's already been written and make sure you're not doing more work than necessary.
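As a small illustration of the point, here is a sketch of a routine that needs a sorted list without duplicates. Rather than hand-writing a sort and a duplicate-removal loop, it leans on std::sort and std::unique from the C++ standard library. The routine name SortedUnique is a hypothetical example, not something from the chapter.

```cpp
#include <algorithm>
#include <vector>

// Returns the input values in ascending order with duplicates removed.
// The parameter is taken by value so the caller's vector is left untouched.
std::vector<int> SortedUnique( std::vector<int> values ) {
   // std::sort puts equal values next to each other
   std::sort( values.begin(), values.end() );

   // std::unique moves the first of each run of equal values to the front and
   // returns the new logical end; erase() then drops the leftover tail
   values.erase( std::unique( values.begin(), values.end() ), values.end() );

   return values;
}
```

Two library calls replace what would otherwise be a page of code to write, review, and test.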
Think about error handling Think about all the things that could possibly go wrong
in the routine. Think about bad input values, invalid values returned from other routines,
and so on.
Routines can handle errors in numerous ways, and you should choose consciously how
to handle errors. If the program's architecture defines the program's error-handling
strategy, you can simply plan to follow that strategy. In other cases, you have to decide
what approach will work best for the specific routine.
Think about efficiency Depending on your situation, you can address efficiency in
one of two ways. In the first situation, in the vast majority of systems, efficiency isn't
critical. In such a case, see that the routine's interface is well abstracted and its code is
readable so that you can improve it later if you need to. If you have good encapsulation,
you can replace a slow, resource-hogging, high-level language implementation
with a better algorithm or a fast, lean, low-level language implementation, and you
won't affect any other routines.
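The following sketch shows what that kind of replacement might look like. All names here are hypothetical. Callers use only CountDistinctWords(); the first implementation is simple but quadratic, and the later one uses a hash set. Because the interface does not change, swapping one for the other touches a single line.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// First implementation: easy to read, but compares every word with all earlier words.
int CountDistinctWordsSimple( const std::vector<std::string>& words ) {
   int distinctCount = 0;
   for ( std::size_t i = 0; i < words.size(); ++i ) {
      bool seenEarlier = false;
      for ( std::size_t j = 0; j < i; ++j ) {
         if ( words[ j ] == words[ i ] ) {
            seenEarlier = true;
            break;
         }
      }
      if ( !seenEarlier ) {
         ++distinctCount;
      }
   }
   return distinctCount;
}

// Later replacement: same result, but a hash set does the duplicate detection.
int CountDistinctWordsFast( const std::vector<std::string>& words ) {
   std::unordered_set<std::string> distinctWords( words.begin(), words.end() );
   return static_cast<int>( distinctWords.size() );
}

// The interface callers depend on. Switching implementations changes only this body.
int CountDistinctWords( const std::vector<std::string>& words ) {
   return CountDistinctWordsFast( words );
}
```

Whether the second version is actually faster for a given workload is something to measure rather than assume.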
Cross-Reference: For details on efficiency, see Chapter 25, "Code-Tuning Strategies,"
and Chapter 26, "Code-Tuning Techniques."
In the second situation, in the minority of systems, performance is critical. The
performance issue might be related to scarce database connections, limited memory, few
available handles, ambitious timing constraints, or some other scarce resource. The
architecture should indicate how many resources each routine (or class) is allowed to
use and how fast it should perform its operations.
Design your routine so that it will meet its resource and speed goals. If either
resources or speed seems more critical, design so that you trade resources for speed or
vice versa. It's acceptable during initial construction of the routine to tune it enough to
meet its resource and speed budgets.
Aside from taking the approaches suggested for these two general situations, it's usually
a waste of effort to work on efficiency at the level of individual routines. The big
optimizations come from refining the high-level design, not the individual routines.
You generally use micro-optimizations only when the high-level design turns out not
to support the system's performance goals, and you won't know that until the whole
program is done. Don't waste time scraping for incremental improvements until you
know they're needed.
Research the algorithms and data types If the functionality isn't available in the
libraries you have, it might still be described in an algorithms book. Before you launch
into writing complicated code from scratch, check an algorithms book to see what's
already available. If you use a predefined algorithm, be sure to adapt it correctly to
your programming language.
Write the pseudocode You might not have much in writing after you finish the preceding
steps. The main purpose of the steps is to establish a mental orientation that's
useful when you actually write the routine.
Cross-Reference: This discussion assumes that good design techniques are used to
create the pseudocode version of the routine. For details on design, see Chapter 5,
"Design in Construction."
With the preliminary steps completed, you can begin to write the routine as high-level
pseudocode. Go ahead and use your programming editor or your integrated environment
to write the pseudocode, because the pseudocode will be used shortly as the basis for
programming-language code.
Start with the general and work toward something more specific. The most general
part of a routine is a header comment describing what the routine is supposed to do,
so first write a concise statement of the purpose of the routine. Writing the statement
will help you clarify your understanding of the routine. Trouble in writing the general
comment is a warning that you need to understand the routine's role in the program
better. In general, if it's hard to summarize the routine's role, you should probably
assume that something is wrong. Here's an example of a concise header comment
describing a routine:
Example of a Header Comment for a Routine
This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.
After you've written the general comment, fill in high-level pseudocode for the routine.
Here's the pseudocode for this example:
Example of Pseudocode for a Routine
This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.

set the default status to "fail"
look up the message based on the error code

if the error code is valid
   if doing interactive processing, display the error message
   interactively and declare success

   if doing command line processing, log the error message to the
   command line and declare success

if the error code isn't valid, notify the user that an internal error
has been detected

return status information
Again, note that the pseudocode is written at a fairly high level. It certainly isn't written
in a programming language. Instead, it expresses in precise English what the
routine needs to do.
Cross-Reference: For details on effective use of variables, see Chapters 10 through 13.
Think about the data You can design the routine's data at several different points in
the process. In this example, the data is simple and data manipulation isn't a prominent
part of the routine. If data manipulation is a prominent part of the routine, it's worthwhile
to think about the major pieces of data before you think about the routine's logic.
Definitions of key data types are useful to have when you design the logic of a routine.
Cross-Reference: For details on review techniques, see Chapter 21, "Collaborative
Construction."
Check the pseudocode Once you've written the pseudocode and designed the data,
take a minute to review the pseudocode you've written. Back away from it, and think
about how you would explain it to someone else.
Ask someone else to look at it or listen to you explain it. You might think that it's silly
to have someone look at 11 lines of pseudocode, but you'll be surprised. Pseudocode
can make your assumptions and high-level mistakes more obvious than
programming-language code does. People are also more willing to review a few lines of
pseudocode than they are to review 35 lines of C++ or Java.
Make sure you have an easy and comfortable understanding of what the routine does
and how it does it. If you don't understand it conceptually, at the pseudocode level,
what chance do you have of understanding it at the programming-language level? And
if you don't understand it, who else will?
Cross-Reference: For more on iteration, see Section 34.8, "Iterate, Repeatedly,
Again and Again."
Try a few ideas in pseudocode, and keep the best (iterate) Try as many ideas as you
can in pseudocode before you start coding. Once you start coding, you get emotionally
involved with your code and it becomes harder to throw away a bad design and start over.
The general idea is to iterate the routine in pseudocode until the pseudocode statements
become simple enough that you can fill in code below each statement and leave
the original pseudocode as documentation. Some of the pseudocode from your first
attempt might be high-level enough that you need to decompose it further. Be sure
you do decompose it further. If you're not sure how to code something, keep working
with the pseudocode until you are sure. Keep refining and decomposing the
pseudocode until it seems like a waste of time to write it instead of the actual code.
Code the Routine
Once you've designed the routine, construct it. You can perform construction steps in
a nearly standard order, but feel free to vary them as you need to. Figure 9-3 shows the
steps in constructing a routine.
Figure 9-3 You'll perform all of these steps as you design a routine but not necessarily in
any particular order.
[Figure 9-3, flowchart steps: Start with pseudocode; Write the routine declaration;
Write the first and last statements, and turn the pseudocode into high-level comments;
Fill in the code below each comment; Check the code; Clean up leftovers; Done.
A "Repeat as needed" loop leads back through the steps.]
Write the routine declaration Write the routine interface statement: the function
declaration in C++, method declaration in Java, function or sub procedure declaration
in Microsoft Visual Basic, or whatever your language calls for. Turn the original header
comment into a programming-language comment. Leave it in position above the
pseudocode you've already written. Here are the example routine's interface statement
and header in C++:
C++ Example of a Routine Interface and Header Added to Pseudocode
(Margin notes: the block comment is the header comment that's been turned into a
C++-style comment; the declaration that follows it is the interface statement.)
/* This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.
*/
Status ReportErrorMessage(
   ErrorCode errorToReport
   )
   set the default status to "fail"
   look up the message based on the error code
   if the error code is valid
      if doing interactive processing, display the error message
      interactively and declare success
      if doing command line processing, log the error message to the
      command line and declare success
   if the error code isn't valid, notify the user that an
   internal error has been detected
   return status information
This is a good time to make notes about any interface assumptions. In this case, the
interface variable errorToReport is straightforward and typed for its specific purpose,
so it doesn't need to be documented.
Turn the pseudocode into high-level comments Keep the ball rolling by writing the
first and last statements: { and } in C++. Then turn the pseudocode into comments.
Here's how it would look in the example:
C++ Example of Writing the First and Last Statements Around Pseudocode
(Margin note: the pseudocode statements from the opening brace down have been
turned into C++ comments.)
/* This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.
*/
Status ReportErrorMessage(
   ErrorCode errorToReport
   ) {
   // set the default status to "fail"
   // look up the message based on the error code
   // if the error code is valid
      // if doing interactive processing, display the error message
      // interactively and declare success
      // if doing command line processing, log the error message to the
      // command line and declare success
   // if the error code isn't valid, notify the user that an
   // internal error has been detected
   // return status information
}
At this point, the character of the routine is evident. The design work is complete, and
you can sense how the routine works even without seeing any code. You should feel that
converting the pseudocode to programming-language code will be mechanical, natural,
and easy. If you don't, continue designing in pseudocode until the design feels solid.
Cross-Reference: This is a case where the writing metaphor works well in the
small. For criticism of applying the writing metaphor in the large, see "Software
Penmanship: Writing Code" in Section 2.3.
Fill in the code below each comment Fill in the code below each line of pseudocode
comment. The process is a lot like writing a term paper. First you write an outline, and
then you write a paragraph for each point in the outline. Each pseudocode comment
describes a block or paragraph of code. Like the lengths of literary paragraphs, the
lengths of code paragraphs vary according to the thought being expressed, and the
quality of the paragraphs depends on the vividness and focus of the thoughts in them.
In this example, the first two pseudocode comments give rise to two lines of code:
C++ Example of Expressing Pseudocode Comments as Code
(Margin notes: the assignment to errorMessageStatus is the code that's been filled in;
the next statement introduces the new variable errorMessage.)
/* This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.
*/
Status ReportErrorMessage(
   ErrorCode errorToReport
   ) {
   // set the default status to "fail"
   Status errorMessageStatus = Status_Failure;
   // look up the message based on the error code
   Message errorMessage = LookupErrorMessage( errorToReport );
   // if the error code is valid
      // if doing interactive processing, display the error message
      // interactively and declare success
      // if doing command line processing, log the error message to the
      // command line and declare success
   // if the error code isn't valid, notify the user that an
   // internal error has been detected
   // return status information
}
This is a start on the code. The variable errorMessage is used, so it needs to be declared.
If you were commenting after the fact, two lines of comments for two lines of code
would nearly always be overkill. In this approach, however, it's the semantic content
of the comments that's important, not how many lines of code they comment. The
comments are already there, and they explain the intent of the code, so leave them in.
The code below each of the remaining comments needs to be filled in:
C++ Example of a Complete Routine Created with the Pseudocode
Programming Process
Margin notes for the listing:
■ The code for each comment has been filled in from the "if the error code is valid"
test down.
■ The command-line branch is a good candidate for being further decomposed
into a new routine: DisplayCommandLineMessage().
■ The first empty else clause and its comment are new and are the result of
fleshing out the if test.
■ The second empty else clause and its comment are also new.
/* This routine outputs an error message based on an error code
supplied by the calling routine. The way it outputs the message
depends on the current processing state, which it retrieves
on its own. It returns a value indicating success or failure.
*/
Status ReportErrorMessage(
   ErrorCode errorToReport
   ) {
   // set the default status to "fail"
   Status errorMessageStatus = Status_Failure;

   // look up the message based on the error code
   Message errorMessage = LookupErrorMessage( errorToReport );

   // if the error code is valid
   if ( errorMessage.ValidCode() ) {
      // determine the processing method
      ProcessingMethod errorProcessingMethod = CurrentProcessingMethod();

      // if doing interactive processing, display the error message
      // interactively and declare success
      if ( errorProcessingMethod == ProcessingMethod_Interactive ) {
         DisplayInteractiveMessage( errorMessage.Text() );
         errorMessageStatus = Status_Success;
      }

      // if doing command line processing, log the error message to the
      // command line and declare success
      else if ( errorProcessingMethod == ProcessingMethod_CommandLine ) {
         CommandLine messageLog;
         if ( messageLog.Status() == CommandLineStatus_Ok ) {
            messageLog.AddToMessageQueue( errorMessage.Text() );
            messageLog.FlushMessageQueue();
            errorMessageStatus = Status_Success;
         }
         else {
            // can't do anything because the routine is already error processing
         }
      }
      else {
         // can't do anything because the routine is already error processing
      }
   }

   // if the error code isn't valid, notify the user that an
   // internal error has been detected
   else {
      DisplayInteractiveMessage(
         "Internal Error: Invalid error code in ReportErrorMessage()"
      );
   }

   // return status information
   return errorMessageStatus;
}
Each comment has given rise to one or more lines of code. Each block of code forms a
complete thought based on the comment. The comments have been retained to provide
a higher-level explanation of the code. All variables have been declared and defined
close to the point they're first used. Each comment should normally expand to about 2
to 10 lines of code. (Because this example is just for purposes of illustration, the code
expansion is on the low side of what you should usually experience in practice.)
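The listing depends on several types and helper routines that the chapter never defines, so it can't be compiled as printed. For readers who want to try it, the sketch below adds minimal stand-ins. Everything in the scaffolding section is a hypothetical assumption made for illustration: the enum values, the Message and CommandLine classes, the single known error code, and the global that selects the processing method. The routine itself follows the listing, with the two empty else clauses left out for brevity.

```cpp
#include <iostream>
#include <string>

// ---- Hypothetical scaffolding: none of these definitions come from the chapter ----
enum Status { Status_Success, Status_Failure };
enum ProcessingMethod { ProcessingMethod_Interactive, ProcessingMethod_CommandLine };
enum CommandLineStatus { CommandLineStatus_Ok, CommandLineStatus_Error };
typedef int ErrorCode;

class Message {
public:
   Message( bool isValid, const std::string& text ) : isValid_( isValid ), text_( text ) {}
   bool ValidCode() const { return isValid_; }
   std::string Text() const { return text_; }
private:
   bool isValid_;
   std::string text_;
};

// Assumed lookup: in this sketch only error code 1 is known.
Message LookupErrorMessage( ErrorCode errorCode ) {
   return ( errorCode == 1 ) ? Message( true, "File not found" ) : Message( false, "" );
}

// Assumed to be set elsewhere in a real program.
ProcessingMethod g_processingMethod = ProcessingMethod_Interactive;
ProcessingMethod CurrentProcessingMethod() { return g_processingMethod; }

void DisplayInteractiveMessage( const std::string& text ) { std::cout << text << std::endl; }

class CommandLine {
public:
   CommandLineStatus Status() const { return CommandLineStatus_Ok; }
   void AddToMessageQueue( const std::string& text ) { queue_ += text + "\n"; }
   void FlushMessageQueue() { std::cerr << queue_; queue_.clear(); }
private:
   std::string queue_;
};

// ---- The routine, following the listing above ----
Status ReportErrorMessage( ErrorCode errorToReport ) {
   // set the default status to "fail"
   Status errorMessageStatus = Status_Failure;

   // look up the message based on the error code
   Message errorMessage = LookupErrorMessage( errorToReport );

   // if the error code is valid
   if ( errorMessage.ValidCode() ) {
      // determine the processing method
      ProcessingMethod errorProcessingMethod = CurrentProcessingMethod();

      // if doing interactive processing, display the error message
      // interactively and declare success
      if ( errorProcessingMethod == ProcessingMethod_Interactive ) {
         DisplayInteractiveMessage( errorMessage.Text() );
         errorMessageStatus = Status_Success;
      }
      // if doing command line processing, log the error message to the
      // command line and declare success
      else if ( errorProcessingMethod == ProcessingMethod_CommandLine ) {
         CommandLine messageLog;
         if ( messageLog.Status() == CommandLineStatus_Ok ) {
            messageLog.AddToMessageQueue( errorMessage.Text() );
            messageLog.FlushMessageQueue();
            errorMessageStatus = Status_Success;
         }
      }
   }
   // if the error code isn't valid, notify the user that an
   // internal error has been detected
   else {
      DisplayInteractiveMessage( "Internal Error: Invalid error code in ReportErrorMessage()" );
   }

   // return status information
   return errorMessageStatus;
}
```

With these stand-ins, calling ReportErrorMessage( 1 ) reports success and any other code reports failure, which matches the postcondition stated earlier: the status is always either success or failure.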
Now look again at the spec on page 221 and the initial pseudocode on page 224. The
original five-sentence spec expanded to 15 lines of pseudocode (depending on how
you count the lines), which in turn expanded into a page-long routine. Even though
the spec was detailed, creation of the routine required substantial design work in
pseudocode and code. That low-level design is one reason why "coding" is a nontrivial
task and why the subject of this book is important.
Check whether code should be further factored In some cases, you'll see an explosion
of code below one of the initial lines of pseudocode. In this case, you should consider
taking one of two courses of action:
Cross-Reference: For more on refactoring, see Chapter 24, "Refactoring."
■ Factor the code below the comment into a new routine. If you find one line of
pseudocode expanding into more code than you expected, factor the code
into its own routine. Write the code to call the routine, including the routine name.
If you've used the PPP well, the name of the new routine should drop out easily
from the pseudocode. Once you've completed the routine you were originally creating,
you can dive into the new routine and apply the PPP again to that routine.
■ Apply the PPP recursively. Rather than writing a couple dozen lines of code
below one line of pseudocode, take the time to decompose the original line of
pseudocode into several more lines of pseudocode. Then continue filling in the
code below each of the new lines of pseudocode.
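The first option can be sketched with the command-line branch of ReportErrorMessage(), which the example's margin note already flags as a candidate for a routine named DisplayCommandLineMessage(). The signature, the early return, and the stand-in Status, CommandLineStatus, and CommandLine definitions below are assumptions made for illustration, not the author's code.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-ins so the sketch compiles on its own.
enum Status { Status_Success, Status_Failure };
enum CommandLineStatus { CommandLineStatus_Ok, CommandLineStatus_Error };

class CommandLine {
public:
   CommandLineStatus Status() const { return CommandLineStatus_Ok; }
   void AddToMessageQueue( const std::string& text ) { queue_ += text + "\n"; }
   void FlushMessageQueue() { std::cerr << queue_; queue_.clear(); }
private:
   std::string queue_;
};

/* This routine logs a message to the command line. It returns a value
indicating success or failure. The name comes straight from the pseudocode
line it replaces: "log the error message to the command line".
*/
Status DisplayCommandLineMessage( const std::string& messageText ) {
   CommandLine messageLog;

   // if the command line isn't usable, report failure; nothing else
   // can be done because the caller is already error processing
   if ( messageLog.Status() != CommandLineStatus_Ok ) {
      return Status_Failure;
   }

   // queue the message, flush it, and declare success
   messageLog.AddToMessageQueue( messageText );
   messageLog.FlushMessageQueue();
   return Status_Success;
}
```

In the original routine, the body of the command-line branch would then shrink to a single statement along the lines of errorMessageStatus = DisplayCommandLineMessage( errorMessage.Text() ).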
Check the Code
After designing and implementing the routine, the third big step in constructing it is
checking to be sure that what you've constructed is correct. Any errors you miss at this
stage won't be found until later testing. They're more expensive to find and correct
then, so you should find all that you can at this stage.
Cross-Reference: For details on checking for errors in architecture and requirements,
see Chapter 3, "Measure Twice, Cut Once: Upstream Prerequisites."
A problem might not appear until the routine is fully coded for several reasons. An
error in the pseudocode might become more apparent in the detailed implementation
logic. A design that looks elegant in pseudocode might become clumsy in the
implementation language. Working with the detailed implementation might disclose an
error in the architecture, high-level design, or requirements. Finally, the code might
have an old-fashioned, mongrel coding error; nobody's perfect! For all these reasons,
review the code before you move on.
Mentally check the routine for errors The first formal check of a routine is mental.
The cleanup and informal checking steps mentioned earlier are two kinds of mental
checks. Another is executing each path mentally. Mentally executing a routine is difficult,
and that difficulty is one reason to keep your routines small. Make sure that you
check nominal paths and endpoints and all exception conditions. Do this both by
yourself, which is called "desk checking," and with one or more peers, which is called
a "peer review," a "walk-through," or an "inspection," depending on how you do it.
One of the biggest differences between hobbyists and professional programmers is
the difference that grows out of moving from superstition into understanding. The
word "superstition" in this context doesn't refer to a program that gives you the creeps
or generates extra errors when the moon is full. It means substituting feelings about
the code for understanding. If you often find yourself suspecting that the compiler or
the hardware made an error, you're still in the realm of superstition. A study conducted
many years ago found that only about five percent of all errors are hardware,
compiler, or operating-system errors (Ostrand and Weyuker 1984). Today, that
percentage would probably be even lower. Programmers who have moved into the realm
of understanding always suspect their own work first because they know that they
cause 95 percent of errors. Understand the role of each line of code and why it's
needed. Nothing is ever right just because it seems to work. If you don't know why it
works, it probably doesn't, and you just don't know it yet.
Bottom line: A working routine isn't enough. If you don't know why it works, study it,
discuss it, and experiment with alternative designs until you do.
Compile the routine After reviewing the routine, compile it. It might seem inefficient
to wait this long to compile since the code was completed several pages ago. Admittedly,
you might have saved some work by compiling the routine earlier and letting
the computer check for undeclared variables, naming conflicts, and so on.
You'll benefit in several ways, however, by not compiling until late in the process. The
main reason is that when you compile new code, an internal stopwatch starts ticking.
After the first compile, you step up the pressure: "I'll get it right with just one more
compile." The "Just One More Compile" syndrome leads to hasty, error-prone changes
that take more time in the long run. Avoid the rush to completion by not compiling
until you've convinced yourself that the routine is right.
The point of this book is to show how to rise above the cycle of hacking something
together and running it to see if it works. Compiling before you're sure your program
works is often a symptom of the hacker mindset. If you're not caught in the
hacking-and-compiling cycle, compile when you feel it's appropriate. But be conscious
of the tug most people feel toward "hacking, compiling, and fixing" their way
to a working program.
Here are some guidelines for getting the most out of compiling your routine:
■ Set the compiler's warning level to the pickiest level possible. You can catch an
amazing number of subtle errors simply by allowing the compiler to detect them.
■ Use validators. The compiler checking performed by languages like C can be
supplemented by use of tools like lint. Even code that isn't compiled, such as
HTML and JavaScript, can be checked by validation tools.
■ Eliminate the causes of all error messages and warnings. Pay attention to what
the messages tell you about your code. A large number of warnings often indicates
low-quality code, and you should try to understand each warning you get.
In practice, warnings you've seen again and again have one of two possible
effects: you ignore them and they camouflage other, more important, warnings,
or they simply become annoying. It's usually safer and less painful to rewrite the
code to solve the underlying problem and eliminate the warnings.
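Here is a small sketch of what "rewrite the code to eliminate the warning" can look like in C++. The routine name SumReadings is hypothetical. The commented-out loop header is the kind of code that draws a signed/unsigned comparison warning at strict warning levels (for example, with -Wall -Wextra on GCC or Clang, or /W4 on Microsoft Visual C++); the live code fixes the underlying mismatch instead of silencing the message.

```cpp
#include <cstddef>
#include <vector>

// Adds up all the readings in the vector.
int SumReadings( const std::vector<int>& readings ) {
   int total = 0;

   // Before: an int index compared against the unsigned result of size()
   //    for ( int i = 0; i < readings.size(); ++i ) { ... }
   // After: the index has the same unsigned type that size() returns,
   // so the comparison no longer mixes signed and unsigned values.
   for ( std::size_t i = 0; i < readings.size(); ++i ) {
      total += readings[ i ];
   }

   return total;
}
```

The fix costs one changed type, and the next warning that appears in this file will stand out instead of being lost among familiar noise.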
Step through the code in the debugger Once the routine compiles, put it into the
debugger and step through each line of code. Make sure each line executes as you
expect it to. You can find many errors by following this simple practice.
Cross-Reference: For details, see Chapter 22, "Developer Testing." Also see "Building
Scaffolding to Test Individual Classes" in Section 22.5.
Test the code Test the code using the test cases you planned or created while you
were developing the routine. You might have to develop scaffolding to support your
test cases; that is, code that's used to support routines while they're tested and that
isn't included in the final product. Scaffolding can be a test-harness routine that calls
your routine with test data, or it can be stubs called by your routine.
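Both kinds of scaffolding can be sketched in a few lines. The example below is hypothetical throughout: AverageReading() is an invented routine under test, StubSensorReading() is a stub that stands in for a real sensor routine it would normally call, and RunAverageReadingTests() is a small test harness that calls the routine with test data. Passing the sensor routine in as a parameter is one assumed way of making the stub easy to substitute.

```cpp
#include <functional>

// Routine under test: averages sampleCount readings obtained from readSensor.
// Returns 0.0 when no samples are requested.
double AverageReading( const std::function<double()>& readSensor, int sampleCount ) {
   if ( sampleCount <= 0 ) {
      return 0.0;
   }

   double total = 0.0;
   for ( int i = 0; i < sampleCount; ++i ) {
      total += readSensor();
   }
   return total / sampleCount;
}

// Stub: replaces the real sensor during testing and always reports 20.0.
double StubSensorReading() {
   return 20.0;
}

// Test harness: calls the routine with test data and reports whether every
// case produced the expected result. It is not part of the final product.
bool RunAverageReadingTests() {
   bool allPassed = true;

   // nominal case: four identical readings average to that reading
   allPassed = allPassed && ( AverageReading( StubSensorReading, 4 ) == 20.0 );

   // boundary cases: zero or negative sample counts yield 0.0
   allPassed = allPassed && ( AverageReading( StubSensorReading, 0 ) == 0.0 );
   allPassed = allPassed && ( AverageReading( StubSensorReading, -3 ) == 0.0 );

   return allPassed;
}
```

The stub's fixed value is chosen so the expected average is exact, which keeps the harness free of floating-point tolerance checks.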
Cross-Reference: For details, see Chapter 23, "Debugging."
Remove errors from the routine Once an error has been detected, it has to be
removed. If the routine you're developing is buggy at this point, chances are good that
it will stay buggy. If you find that a routine is unusually buggy, start over. Don't hack
around it; rewrite it. Hacks usually indicate incomplete understanding and guarantee
errors both now and later. Creating an entirely new design for a buggy routine pays
off. Few things are more satisfying than rewriting a problematic routine and never
finding another error in it.
Clean Up Leftovers
When you've finished checking your code for problems, check it for the general
characteristics described throughout this book. You can take several cleanup steps to
make sure that the routine's quality is up to your standards:
■ Check the routine's interface. Make sure that all input and output data is
accounted for and that all parameters are used. For more details, see Section 7.5,
"How to Use Routine Parameters."
■ Check for general design quality. Make sure the routine does one thing and does
it well, that it's loosely coupled to other routines, and that it's designed defensively.
For details, see Chapter 7, "High-Quality Routines."
■ Check the routine's variables. Check for inaccurate variable names, unused
objects, undeclared variables, improperly initialized objects, and so on. For
details, see the chapters on using variables, Chapters 10 through 13.
■ Check the routine's statements and logic. Check for off-by-one errors, infinite
loops, improper nesting, and resource leaks. For details, see the chapters on
statements, Chapters 14 through 19.
■ Check the routine's layout. Make sure you've used white space to clarify the logical
structure of the routine, expressions, and parameter lists. For details, see
Chapter 31, "Layout and Style."
■ Check the routine's documentation. Make sure the pseudocode that was translated
into comments is still accurate. Check for algorithm descriptions, for documentation
on interface assumptions and nonobvious dependencies, for
justification of unclear coding practices, and so on. For details, see Chapter 32,
"Self-Documenting Code."
■ Remove redundant comments. Sometimes a pseudocode comment turns out to be
redundant with the code the comment describes, especially when the PPP has been
applied recursively and the comment just precedes a call to a well-named routine.
Repeat Steps as Needed
If the quality of the routine is poor, back up to the pseudocode. High-quality programming
is an iterative process, so don't hesitate to loop through the construction
activities again.
9.4 Alternatives to the PPP
For my money, the PPP is the best method for creating classes and routines. Here are
some different approaches recommended by other experts. You can use these
approaches as alternatives or as supplements to the PPP.
Test-first development Test-first is a popular development style in which test cases
are written prior to writing any code. This approach is described in more detail in
"Test First or Test Last?" in Section 22.2. A good book on test-first programming is
Kent Beck's Test-Driven Development: By Example (Beck 2003).
Refactoring Refactoring is a development approach in which you improve code
through a series of semantics-preserving transformations. Programmers use patterns of
bad code or "smells" to identify sections of code that need to be improved. Chapter
24, "Refactoring," describes this approach in detail, and a good book on the topic is
Martin Fowler's Refactoring: Improving the Design of Existing Code (Fowler 1999).
Design by contract Design by contract is a development approach in which each
routine is considered to have preconditions and postconditions. This approach is
described in "Use assertions to document and verify preconditions and postconditions"
in Section 8.2. The best source of information on design by contract is Bertrand
Meyer's Object-Oriented Software Construction (Meyer 1997).
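As a rough illustration of the design-by-contract idea, the sketch below states a routine's precondition and postcondition as assertions using the standard C++ assert macro. The routine IndexOfLargest() is a hypothetical example, and plain assertions are only one simple way to approximate contracts in C++, not a full design-by-contract facility.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the largest element in values.
// Precondition: values is not empty.
// Postcondition: the returned index is within the bounds of values.
std::size_t IndexOfLargest( const std::vector<int>& values ) {
   // precondition: the caller must supply at least one element
   assert( !values.empty() );

   std::size_t largestIndex = 0;
   for ( std::size_t i = 1; i < values.size(); ++i ) {
      if ( values[ i ] > values[ largestIndex ] ) {
         largestIndex = i;
      }
   }

   // postcondition: the result refers to a real element
   assert( largestIndex < values.size() );
   return largestIndex;
}
```

When several elements tie for largest, this sketch returns the first of them, because the comparison uses > rather than >=.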
Hacking? Some programmers try to hack their way toward working code rather
than using a systematic approach like the PPP. If you've ever found that you've coded
yourself into a corner in a routine and have to start over, that's an indication that the
PPP might work better. If you find yourself losing your train of thought in the middle
of coding a routine, that's another indication that the PPP would be beneficial. Have
you ever simply forgotten to write part of a class or part of a routine? That hardly ever
happens if you're using the PPP. If you find yourself staring at the computer screen not
knowing where to start, that's a surefire sign that the PPP would make your programming
life easier.
6652cc2e.com/0943 CHECKLIST: The Pseudocode Programming Process
Cross-Reference The point of this list is to check whether you followed a good set of
steps to create a routine. For a checklist that focuses on the quality of the routine
itself, see the “High-Quality Routines” checklist in Chapter 7, page 185.
❑ Have you checked that the prerequisites have been satisfied?
❑ Have you defined the problem that the class will solve?
❑ Is the high-level design clear enough to give the class and each of its routines
a good name?
❑ Have you thought about how to test the class and each of its routines?
❑ Have you thought about efficiency mainly in terms of stable interfaces and
readable implementations or mainly in terms of meeting resource and
speed budgets?
❑ Have you checked the standard libraries and other code libraries for applicable
routines or components?
❑ Have you checked reference books for helpful algorithms?
❑ Have you designed each routine by using detailed pseudocode?
❑ Have you mentally checked the pseudocode? Is it easy to understand?
❑ Have you paid attention to warnings that would send you back to design
(use of global data, operations that seem better suited to another class or
another routine, and so on)?
❑ Did you translate the pseudocode to code accurately?
❑ Did you apply the PPP recursively, breaking routines into smaller routines
when needed?
❑ Did you document assumptions as you made them?
❑ Did you remove comments that turned out to be redundant?
❑ Have you chosen the best of several iterations, rather than merely stopping
after your first iteration?
❑ Do you thoroughly understand your code? Is it easy to understand?
Key Points
■ Constructing classes and constructing routines tend to be iterative processes.
Insights gained while constructing specific routines tend to ripple back through
the class’s design.
■ Writing good pseudocode calls for using understandable English, avoiding features
specific to a single programming language, and writing at the level of
intent (describing what the design does rather than how it will do it).
■ The Pseudocode Programming Process is a useful tool for detailed design and
makes coding easy. Pseudocode translates directly into comments, ensuring
that the comments are accurate and useful.
■ Don’t settle for the first design you think of. Iterate through multiple approaches
in pseudocode and pick the best approach before you begin writing code.
■ Check your work at each step, and encourage others to check it too. That way,
you’ll catch mistakes at the least expensive level, when you’ve invested the least
amount of effort.
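The key point that pseudocode translates directly into comments can be sketched as follows. The routine lookup_error_message and its error table are hypothetical, invented only for this illustration. Each comment is a line of pseudocode written at the level of intent, and the code beneath it was filled in afterward.

```python
# Hypothetical error table, used only for this sketch.
ERROR_MESSAGES = {
    1: "File not found",
    2: "Access denied",
}


def lookup_error_message(error_code):
    """Return (message, found) for the given error code."""
    # Set the default status to "fail".
    found = False

    # Look up the message based on the error code.
    message = ERROR_MESSAGES.get(error_code)

    if message is not None:
        # The error code is valid, so set the status to "success".
        found = True
    else:
        # The error code isn't valid, so report an internal error instead.
        message = "Internal error: invalid error code"

    # Return the message and the status.
    return message, found
```

Read on their own, the comments still form a complete description of the routine, which is a quick check that the pseudocode was written at the level of intent.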