1\input texinfo
2
3@c Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
4
5(The work of Trevis Rothwell and Nelson Beebe has been assigned or
6licensed to the FSF.)
7
8@c move alignment later?
9
10@setfilename ./c
11@settitle GNU C Language Manual
12@documentencoding UTF-8
13
14@smallbook
15@synindex vr fn
16
17@copying
18Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
19
20(The work of Trevis Rothwell and Nelson Beebe has been assigned or
21licensed to the FSF.)
22
23@quotation
24Permission is granted to copy, distribute and/or modify this document
25under the terms of the GNU Free Documentation License, Version 1.3 or
26any later version published by the Free Software Foundation; with the
27Invariant Sections being ``GNU General Public License,'' with the
28Front-Cover Texts being ``A GNU Manual,'' and with the Back-Cover
29Texts as in (a) below. A copy of the license is included in the
30section entitled ``GNU Free Documentation License.''
31
32(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
33modify this GNU manual. Buying copies from the FSF supports it in
34developing GNU and promoting software freedom.''
35@end quotation
36@end copying
37
38@dircategory Programming
39@direntry
40* C: (c). GNU C Language Intro and Reference Manual
41@end direntry
42
45@titlepage
46@sp 6
47@center @titlefont{GNU C}
48@center @titlefont{Language Intro}
49@center @titlefont{and}
50@center @titlefont{Reference Manual}
51@sp 4
52@c @center @value{EDITION} Edition
53@sp 5
54@center Richard Stallman
55@center and
56@center Trevis Rothwell
57@center plus Nelson Beebe
58@center on floating point
59@page
60@vskip 0pt plus 1filll
61
62@insertcopying
63
64@sp 2
65WILL BE Published by the Free Software Foundation @*
6651 Franklin Street, Fifth Floor @*
67Boston, MA 02110-1301 USA @*
68ISBN ?-??????-??-?
69
70@ignore
71@sp 1
72Cover art by J. Random Artist
73@end ignore
74
75@end titlepage
76
77@summarycontents
78@contents
79
80
81@node Top
82@ifnottex
83@top GNU C Manual
84@end ifnottex
85@iftex
86@top Preface
87@end iftex
88
89This manual explains the C language for use with the GNU Compiler
90Collection (GCC) on the GNU/Linux system and other systems. We refer
91to this dialect as GNU C. If you already know C, you can use this as
92a reference manual.
93
94If you understand basic concepts of programming but know nothing about
95C, you can read this manual sequentially from the beginning to learn
96the C language.
97
98If you are a beginner to programming, we recommend you first learn a
99language with automatic garbage collection and no explicit pointers,
100rather than starting with C@. Good choices include Lisp, Scheme,
101Python and Java. C's explicit pointers mean that programmers must be
102careful to avoid certain kinds of errors.
103
104C is a venerable language; it was first used in 1973. The GNU C
105Compiler, which was subsequently extended into the GNU Compiler
106Collection, was first released in 1987. Other important languages
107were designed based on C: once you know C, it gives you a useful base
108for learning C@t{++}, C#, Java, Scala, D, Go, and more.
109
110The special advantage of C is that it is fairly simple while allowing
111close access to the computer's hardware, which previously required
112writing in assembler language to describe the individual machine
113instructions. Some have called C a ``high-level assembler language''
114because of its explicit pointers and lack of automatic management of
115storage. As one wag put it, ``C combines the power of assembler
116language with the convenience of assembler language.'' However, C is
117far more portable, and much easier to read and write, than assembler
118language.
119
120This manual focuses on the GNU C language supported by the GNU
121Compiler Collection, version ???. When a construct may be absent or
122work differently in other C compilers, we say so. When it is not part
123of ISO standard C, we say it is a ``GNU C extension,'' because it is
124useful to know that; however, other dialects and standards are not the
125focus of this manual. We keep those notes short, unless it is vital
126to say more. For the same reason, we hardly mention C@t{++} or other
127languages that the GNU Compiler Collection supports.
128
129Some aspects of the meaning of C programs depend on the target
130platform: which computer, and which operating system, the compiled
131code will run on. Where this is the case, we say so.
132
133The C language provides no built-in facilities for performing such
134common operations as input/output, memory management, string
135manipulation, and the like. Instead, these facilities are defined in
136a standard library, which is automatically available in every C
137program. @xref{Top, The GNU C Library, , libc, The GNU C Library
138Reference Manual}.
139
140This manual incorporates the former GNU C Preprocessor Manual, which
141was among the earliest GNU Manuals. It also uses some text from the
142earlier GNU C Manual that was written by Trevis Rothwell and James
143Youngman.
144
145GNU C has many obscure features, each one either for historical
146compatibility or meant for very special situations. We have left them
147to a companion manual, the GNU C Obscurities Manual, which will be
148published digitally later.
149
150@menu
151* The First Example:: Getting started with basic C code.
152* Complete Program:: A whole example program
153 that can be compiled and run.
154* Storage:: Basic layout of storage; bytes.
155* Beyond Integers:: Exploring different numeric types.
156* Lexical Syntax:: The various lexical components of C programs.
157* Arithmetic:: Numeric computations.
158* Assignment Expressions:: Storing values in variables.
159* Execution Control Expressions:: Expressions combining values in various ways.
160* Binary Operator Grammar:: An overview of operator precedence.
161* Order of Execution:: The order of program execution.
162* Primitive Types:: More details about primitive data types.
163* Constants:: Explicit constant values:
164 details and examples.
165* Type Size:: The memory space occupied by a type.
166* Pointers:: Creating and manipulating memory pointers.
167* Structures:: Compound data types built
168 by grouping other types.
169* Arrays:: Creating and manipulating arrays.
170* Enumeration Types:: Sets of integers with named values.
171* Defining Typedef Names:: Using @code{typedef} to define type names.
* Statements:: Controlling program flow.
173* Variables:: Details about declaring, initializing,
174 and using variables.
175* Type Qualifiers:: Mark variables for certain intended uses.
176* Functions:: Declaring, defining, and calling functions.
177* Compatible Types:: How to tell if two types are compatible
178 with each other.
179* Type Conversions:: Converting between types.
180* Scope:: Different categories of identifier scope.
181* Preprocessing:: Using the GNU C preprocessor.
182* Integers in Depth:: How integer numbers are represented.
183* Floating Point in Depth:: How floating-point numbers are represented.
184* Compilation:: How to compile multi-file programs.
185* Directing Compilation:: Operations that affect compilation
186 but don't change the program.
187
188Appendices
189
190* Type Alignment:: Where in memory a type can validly start.
191* Aliasing:: Accessing the same data in two types.
192* Digraphs:: Two-character aliases for some characters.
193* Attributes:: Specifying additional information
194 in a declaration.
195* Signals:: Fatal errors triggered in various scenarios.
196* GNU Free Documentation License:: The license for this manual.
197* Symbol Index:: Keyword and symbol index.
198* Concept Index:: Detailed topical index.
199
200@detailmenu
201--- The Detailed Node Listing ---
202
203* Recursive Fibonacci:: Writing a simple function recursively.
204* Stack:: Each function call uses space in the stack.
205* Iterative Fibonacci:: Writing the same function iteratively.
206* Complete Example:: Turn the simple function into a full program.
207* Complete Explanation:: Explanation of each part of the example.
208* Complete Line-by-Line:: Explaining each line of the example.
209* Compile Example:: Using GCC to compile the example.
210* Float Example:: A function that uses floating-point numbers.
211* Array Example:: A function that works with arrays.
212* Array Example Call:: How to call that function.
213* Array Example Variations:: Different ways to write the call example.
214
215Lexical Syntax
216
217* English:: Write programs in English!
218* Characters:: The characters allowed in C programs.
219* Whitespace:: The particulars of whitespace characters.
220* Comments:: How to include comments in C code.
221* Identifiers:: How to form identifiers (names).
222* Operators/Punctuation:: Characters used as operators or punctuation.
223* Line Continuation:: Splitting one line into multiple lines.
224* Digraphs:: Two-character substitutes for some characters.
225
226Arithmetic
227
228* Basic Arithmetic:: Addition, subtraction, multiplication,
229 and division.
230* Integer Arithmetic:: How C performs arithmetic with integer values.
231* Integer Overflow:: When an integer value exceeds the range
232 of its type.
233* Mixed Mode:: Calculating with both integer values
234 and floating-point values.
235* Division and Remainder:: How integer division works.
236* Numeric Comparisons:: Comparing numeric values for
237 equality or order.
238* Shift Operations:: Shift integer bits left or right.
239* Bitwise Operations:: Bitwise conjunction, disjunction, negation.
240
241Assignment Expressions
242
243* Simple Assignment:: The basics of storing a value.
244* Lvalues:: Expressions into which a value can be stored.
245* Modifying Assignment:: Shorthand for changing an lvalue's contents.
246* Increment/Decrement:: Shorthand for incrementing and decrementing
247 an lvalue's contents.
248* Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
249* Assignment in Subexpressions:: How to avoid ambiguity.
250* Write Assignments Separately:: Write assignments as separate statements.
251
252Execution Control Expressions
253
254* Logical Operators:: Logical conjunction, disjunction, negation.
255* Logicals and Comparison:: Logical operators with comparison operators.
256* Logicals and Assignments:: Assignments with logical operators.
257* Conditional Expression:: An if/else construct inside expressions.
258* Comma Operator:: Build a sequence of subexpressions.
259
260Order of Execution
261
262* Reordering of Operands:: Operations in C are not necessarily computed
263 in the order they are written.
264* Associativity and Ordering:: Some associative operations are performed
265 in a particular order; others are not.
266* Sequence Points:: Some guarantees about the order of operations.
* Postincrement and Ordering:: Ambiguous execution order with postincrement.
268* Ordering of Operands:: Evaluation order of operands
269 and function arguments.
270* Optimization and Ordering:: Compiler optimizations can reorder operations
271 only if it has no impact on program results.
272
273Primitive Data Types
274
275* Integer Types:: Description of integer types.
276* Floating-Point Data Types:: Description of floating-point types.
277* Complex Data Types:: Description of complex number types.
278* The Void Type:: A type indicating no value at all.
279* Other Data Types:: A brief summary of other types.
280
281Constants
282
283* Integer Constants:: Literal integer values.
284* Integer Const Type:: Types of literal integer values.
285* Floating Constants:: Literal floating-point values.
286* Imaginary Constants:: Literal imaginary number values.
287* Invalid Numbers:: Avoiding preprocessing number misconceptions.
288* Character Constants:: Literal character values.
289* Unicode Character Codes:: Unicode characters represented
290 in either UTF-16 or UTF-32.
* Wide Character Constants:: Literal character values larger than 8 bits.
292* String Constants:: Literal string values.
293* UTF-8 String Constants:: Literal UTF-8 string values.
294* Wide String Constants:: Literal string values made up of
295 16- or 32-bit characters.
296
297Pointers
298
299* Address of Data:: Using the ``address-of'' operator.
300* Pointer Types:: For each type, there is a pointer type.
301* Pointer Declarations:: Declaring variables with pointer types.
302* Pointer Type Designators:: Designators for pointer types.
303* Pointer Dereference:: Accessing what a pointer points at.
304* Null Pointers:: Pointers which do not point to any object.
305* Invalid Dereference:: Dereferencing null or invalid pointers.
306* Void Pointers:: Totally generic pointers, can cast to any.
307* Pointer Comparison:: Comparing memory address values.
308* Pointer Arithmetic:: Computing memory address values.
309* Pointers and Arrays:: Using pointer syntax instead of array syntax.
310* Pointer Arithmetic Low Level:: More about computing memory address values.
311* Pointer Increment/Decrement:: Incrementing and decrementing pointers.
312* Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
313* Pointer-Integer Conversion:: Converting pointer types to integer types.
314* Printing Pointers:: Using @code{printf} for a pointer's value.
315
316Structures
317
318* Referencing Fields:: Accessing field values in a structure object.
319* Dynamic Memory Allocation:: Allocating space for objects
320 while the program is running.
321* Field Offset:: Memory layout of fields within a structure.
322* Structure Layout:: Planning the memory layout of fields.
323* Packed Structures:: Packing structure fields as close as possible.
324* Bit Fields:: Dividing integer fields
325 into fields with fewer bits.
326* Bit Field Packing:: How bit fields pack together in integers.
327* const Fields:: Making structure fields immutable.
328* Zero Length:: Zero-length array as a variable-length object.
329* Flexible Array Fields:: Another approach to variable-length objects.
330* Overlaying Structures:: Casting one structure type
331 over an object of another structure type.
332* Structure Assignment:: Assigning values to structure objects.
333* Unions:: Viewing the same object in different types.
334* Packing With Unions:: Using a union type to pack various types into
335 the same memory space.
* Cast to Union:: Casting a value of one of the union's alternative
 types to the type of the union itself.
338* Structure Constructors:: Building new structure objects.
339* Unnamed Types as Fields:: Fields' types do not always need names.
340* Incomplete Types:: Types which have not been fully defined.
* Intertwined Incomplete Types:: Defining mutually-recursive structure types.
342* Type Tags:: Scope of structure and union type tags.
343
344Arrays
345
346* Accessing Array Elements:: How to access individual elements of an array.
347* Declaring an Array:: How to name and reserve space for a new array.
348* Strings:: A string in C is a special case of array.
349* Incomplete Array Types:: Naming, but not allocating, a new array.
350* Limitations of C Arrays:: Arrays are not first-class objects.
351* Multidimensional Arrays:: Arrays of arrays.
352* Constructing Array Values:: Assigning values to an entire array at once.
353* Arrays of Variable Length:: Declaring arrays of non-constant size.
354
355Statements
356
357* Expression Statement:: Evaluate an expression, as a statement,
358 usually done for a side effect.
359* if Statement:: Basic conditional execution.
360* if-else Statement:: Multiple branches for conditional execution.
361* Blocks:: Grouping multiple statements together.
362* return Statement:: Return a value from a function.
363* Loop Statements:: Repeatedly executing a statement or block.
364* switch Statement:: Multi-way conditional choices.
365* switch Example:: A plausible example of using @code{switch}.
366* Duffs Device:: A special way to use @code{switch}.
367* Case Ranges:: Ranges of values for @code{switch} cases.
368* Null Statement:: A statement that does nothing.
369* goto Statement:: Jump to another point in the source code,
370 identified by a label.
371* Local Labels:: Labels with limited scope.
372* Labels as Values:: Getting the address of a label.
373* Statement Exprs:: A series of statements used as an expression.
374
375Variables
376
* Variable Declarations:: Name a variable and reserve space for it.
* Initializers:: Assigning initial values to variables.
379* Designated Inits:: Assigning initial values to array elements
380 at particular array indices.
381* Auto Type:: Obtaining the type of a variable.
382* Local Variables:: Variables declared in function definitions.
383* File-Scope Variables:: Variables declared outside of
384 function definitions.
385* Static Local Variables:: Variables declared within functions,
386 but with permanent storage allocation.
387* Extern Declarations:: Declaring a variable
388 which is allocated somewhere else.
389* Allocating File-Scope:: When is space allocated
390 for file-scope variables?
391* auto and register:: Historically used storage directions.
392* Omitting Types:: The bad practice of declaring variables
393 with implicit type.
394
395Type Qualifiers
396
397* const:: Variables whose values don't change.
398* volatile:: Variables whose values may be accessed
399 or changed outside of the control of
400 this program.
401* restrict Pointers:: Restricted pointers for code optimization.
402* restrict Pointer Example:: Example of how that works.
403
404Functions
405
406* Function Definitions:: Writing the body of a function.
407* Function Declarations:: Declaring the interface of a function.
408* Function Calls:: Using functions.
409* Function Call Semantics:: Call-by-value argument passing.
410* Function Pointers:: Using references to functions.
411* The main Function:: Where execution of a GNU C program begins.
412
413Type Conversions
414
415* Explicit Type Conversion:: Casting a value from one type to another.
416* Assignment Type Conversions:: Automatic conversion by assignment operation.
417* Argument Promotions:: Automatic conversion of function parameters.
418* Operand Promotions:: Automatic conversion of arithmetic operands.
419* Common Type:: When operand types differ, which one is used?
420
421Scope
422
423* Scope:: Different categories of identifier scope.
424
425Preprocessing
426
427* Preproc Overview:: Introduction to the C preprocessor.
428* Directives:: The form of preprocessor directives.
429* Preprocessing Tokens:: The lexical elements of preprocessing.
430* Header Files:: Including one source file in another.
431* Macros:: Macro expansion by the preprocessor.
* Conditionals:: Controlling whether to compile some lines
 or ignore them.
434* Diagnostics:: Reporting warnings and errors.
435* Line Control:: Reporting source line numbers.
436* Null Directive:: A preprocessing no-op.
437
438Integers in Depth
439
440* Integer Representations:: How integer values appear in memory.
441* Maximum and Minimum Values:: Value ranges of integer types.
442
443Floating Point in Depth
444
445* Floating Representations:: How floating-point values appear in memory.
446* Floating Type Specs:: Precise details of memory representations.
447* Special Float Values:: Infinity, Not a Number, and Subnormal Numbers.
448* Invalid Optimizations:: Don't mess up non-numbers and signed zeros.
449* Exception Flags:: Handling certain conditions in floating point.
450* Exact Floating-Point:: Not all floating calculations lose precision.
451* Rounding:: When a floating result can't be represented
452 exactly in the floating-point type in use.
453* Rounding Issues:: Avoid magnifying rounding errors.
454* Significance Loss:: Subtracting numbers that are almost equal.
455* Fused Multiply-Add:: Taking advantage of a special floating-point
456 instruction for faster execution.
457* Error Recovery:: Determining rounding errors.
458* Exact Floating Constants:: Precisely specified floating-point numbers.
459* Handling Infinity:: When floating calculation is out of range.
* Handling NaN:: When floating calculation is undefined.
461* Signed Zeros:: Positive zero vs. negative zero.
462* Scaling by the Base:: A useful exact floating-point operation.
463* Rounding Control:: Specifying some rounding behaviors.
464* Machine Epsilon:: The smallest number you can add to 1.0
465 and get a sum which is larger than 1.0.
466* Complex Arithmetic:: Details of arithmetic with complex numbers.
467* Round-Trip Base Conversion:: What happens between base-2 and base-10.
468* Further Reading:: References for floating-point numbers.
469
470Directing Compilation
471
* Pragmas:: Controlling compilation of some constructs.
473* Static Assertions:: Compile-time tests for conditions.
474
475@end detailmenu
476@end menu
477
478@node The First Example
479@chapter The First Example
480
481This chapter presents the source code for a very simple C program and
482uses it to explain a few features of the language. If you already
483know the basic points of C presented in this chapter, you can skim it
484or skip it.
485
486@menu
487* Recursive Fibonacci:: Writing a simple function recursively.
488* Stack:: Each function call uses space in the stack.
489* Iterative Fibonacci:: Writing the same function iteratively.
490@end menu
491
492@node Recursive Fibonacci
493@section Example: Recursive Fibonacci
494@cindex recursive Fibonacci function
495@cindex Fibonacci function, recursive
496
497To introduce the most basic features of C, let's look at code for a
498simple mathematical function that does calculations on integers. This
499function calculates the @var{n}th number in the Fibonacci series, in
500which each number is the sum of the previous two: 1, 1, 2, 3, 5, 8,
50113, 21, 34, 55, @dots{}.
502
@example
int
fib (int n)
@{
  if (n <= 2)  /* @r{This avoids infinite recursion.} */
    return 1;
  else
    return fib (n - 1) + fib (n - 2);
@}
@end example
513
514This very simple program illustrates several features of C:
515
516@itemize @bullet
517@item
518A function definition, whose first two lines constitute the function
519header. @xref{Function Definitions}.
520
521@item
522A function parameter @code{n}, referred to as the variable @code{n}
523inside the function body. @xref{Function Parameter Variables}.
524A function definition uses parameters to refer to the argument
525values provided in a call to that function.
526
527@item
528Arithmetic. C programs add with @samp{+} and subtract with
529@samp{-}. @xref{Arithmetic}.
530
531@item
532Numeric comparisons. The operator @samp{<=} tests for ``less than or
533equal.'' @xref{Numeric Comparisons}.
534
535@item
536Integer constants written in base 10.
537@xref{Integer Constants}.
538
539@item
540A function call. The function call @code{fib (n - 1)} calls the
541function @code{fib}, passing as its argument the value @code{n - 1}.
542@xref{Function Calls}.
543
544@item
545A comment, which starts with @samp{/*} and ends with @samp{*/}. The
546comment has no effect on the execution of the program. Its purpose is
547to provide explanations to people reading the source code. Including
548comments in the code is tremendously important---they provide
549background information so others can understand the code more quickly.
550@xref{Comments}.
551
552@item
553Two kinds of statements, the @code{return} statement and the
554@code{if}@dots{}@code{else} statement. @xref{Statements}.
555
556@item
557Recursion. The function @code{fib} calls itself; that is called a
558@dfn{recursive call}. These are valid in C, and quite common.
559
560The @code{fib} function would not be useful if it didn't return.
561Thus, recursive definitions, to be of any use, must avoid infinite
562recursion.
563
564This function definition prevents infinite recursion by specially
565handling the case where @code{n} is two or less. Thus the maximum
566depth of recursive calls is less than @code{n}.
567@end itemize
568
569@menu
570* Function Header:: The function's name and how it is called.
571* Function Body:: Declarations and statements that implement the function.
572@end menu
573
574@node Function Header
575@subsection Function Header
576@cindex function header
577
578In our example, the first two lines of the function definition are the
579@dfn{header}. Its purpose is to state the function's name and say how
580it is called:
581
@example
int
fib (int n)
@end example
586
587@noindent
588says that the function returns an integer (type @code{int}), its name is
589@code{fib}, and it takes one argument named @code{n} which is also an
590integer. (Data types will be explained later, in @ref{Primitive Types}.)
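
As a minimal sketch of why the header matters to callers: the same
header followed by a semicolon forms a declaration, which lets other
code call @code{fib} before its definition appears
(@pxref{Function Declarations}). The caller @code{fib_of_ten} is a
hypothetical name used only for this illustration.

```c
/* Declaration: the header of fib, followed by a semicolon.
   It states the return type and the type of the argument.  */
int fib (int n);

/* Hypothetical caller, written before the definition of fib.  */
int
fib_of_ten (void)
{
  return fib (10);
}

/* The definition from the example above.  */
int
fib (int n)
{
  if (n <= 2)  /* This avoids infinite recursion.  */
    return 1;
  else
    return fib (n - 1) + fib (n - 2);
}
```

With these definitions, @code{fib_of_ten ()} returns 55, the tenth
number in the series.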
591
592@node Function Body
593@subsection Function Body
594@cindex function body
595@cindex recursion
596
597The rest of the function definition is called the @dfn{function body}.
598Like every function body, this one starts with @samp{@{}, ends with
599@samp{@}}, and contains zero or more @dfn{statements} and
600@dfn{declarations}. Statements specify actions to take, whereas
601declarations define names of variables, functions, and so on. Each
602statement and each declaration ends with a semicolon (@samp{;}).
603
604Statements and declarations often contain @dfn{expressions}; an
605expression is a construct whose execution produces a @dfn{value} of
606some data type, but may also take actions through ``side effects''
607that alter subsequent execution. A statement, by contrast, does not
608have a value; it affects further execution of the program only through
609the actions it takes.
610
611This function body contains no declarations, and just one statement,
612but that one is a complex statement in that it contains nested
613statements. This function uses two kinds of statements:
614
615@table @code
616@item return
617The @code{return} statement makes the function return immediately.
618It looks like this:
619
@example
return @var{value};
@end example
623
624Its meaning is to compute the expression @var{value} and exit the
625function, making it return whatever value that expression produced.
626For instance,
627
@example
return 1;
@end example
631
632@noindent
633returns the integer 1 from the function, and
634
@example
return fib (n - 1) + fib (n - 2);
@end example
638
639@noindent
640returns a value computed by performing two function calls
641as specified and adding their results.
642
643@item @code{if}@dots{}@code{else}
644The @code{if}@dots{}@code{else} statement is a @dfn{conditional}.
645Each time it executes, it chooses one of its two substatements to execute
646and ignores the other. It looks like this:
647
@example
if (@var{condition})
  @var{if-true-statement}
else
  @var{if-false-statement}
@end example
654
655Its meaning is to compute the expression @var{condition} and, if it's
656``true,'' execute @var{if-true-statement}. Otherwise, execute
657@var{if-false-statement}. @xref{if-else Statement}.
658
659Inside the @code{if}@dots{}@code{else} statement, @var{condition} is
660simply an expression. It's considered ``true'' if its value is
661nonzero. (A comparison operation, such as @code{n <= 2}, produces the
662value 1 if it's ``true'' and 0 if it's ``false.'' @xref{Numeric
663Comparisons}.) Thus,
664
@example
if (n <= 2)
  return 1;
else
  return fib (n - 1) + fib (n - 2);
@end example
671
672@noindent
673first tests whether the value of @code{n} is less than or equal to 2.
674If so, the expression @code{n <= 2} has the value 1. So execution
675continues with the statement
676
@example
return 1;
@end example
680
681@noindent
682Otherwise, execution continues with this statement:
683
@example
return fib (n - 1) + fib (n - 2);
@end example
687
688Each of these statements ends the execution of the function and
689provides a value for it to return. @xref{return Statement}.
690@end table
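
The rule that a condition counts as ``true'' when it is nonzero can be
seen in a small sketch. The function names @code{is_small} and
@code{truth_of} are hypothetical, chosen only for this illustration.

```c
/* Return the value of the comparison itself:
   1 if n is less than or equal to 2, otherwise 0.  */
int
is_small (int n)
{
  return n <= 2;
}

/* Any nonzero value counts as true in a condition; only 0 is false.  */
int
truth_of (int value)
{
  if (value)
    return 1;
  else
    return 0;
}
```

Here @code{is_small (2)} returns 1 and @code{is_small (3)} returns 0,
while @code{truth_of (-5)} returns 1 because @minus{}5 is nonzero.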
691
692Calculating @code{fib} using ordinary integers in C works only for
693@var{n} < 47, because the value of @code{fib (47)} is too large to fit
694in type @code{int}. The addition operation that tries to add
695@code{fib (46)} and @code{fib (45)} cannot deliver the correct result.
696This occurrence is called @dfn{integer overflow}.
697
Overflow can manifest itself in various ways, but the one thing it
cannot do is produce the correct value, since that value does not fit
in the space available for it. @xref{Integer Overflow}.
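
One way to notice overflow, rather than silently receiving a wrong
value, is GCC's @code{__builtin_add_overflow}, which is a GNU C
extension. The following is only a sketch: the function name
@code{add_checked} and its convention of reporting success through a
@code{bool} result are assumptions made for this illustration, and
the numbers quoted afterward assume that @code{int} is 32 bits wide.

```c
#include <stdbool.h>

/* Add a and b.  Return true and store the sum in *sum if the sum
   fits in type int; return false if the addition overflowed.
   __builtin_add_overflow is a GNU C extension.  */
bool
add_checked (int a, int b, int *sum)
{
  return !__builtin_add_overflow (a, b, sum);
}
```

With a 32-bit @code{int}, adding 1134903170 and 701408733 (the values
of @code{fib (45)} and @code{fib (44)}) succeeds and stores
1836311903, whereas adding 1836311903 and 1134903170 (the addition
needed for @code{fib (47)}) makes @code{add_checked} return false.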
701
702@xref{Functions}, for a full explanation about functions.
703
704@node Stack
705@section The Stack, And Stack Overflow
706@cindex stack
707@cindex stack frame
708@cindex stack overflow
709@cindex recursion, drawbacks of
710
711@cindex stack frame
712Recursion has a drawback: there are limits to how many nested function
713calls a program can make. In C, each function call allocates a block
714of memory which it uses until the call returns. C allocates these
715blocks consecutively within a large area of memory known as the
716@dfn{stack}, so we refer to the blocks as @dfn{stack frames}.
717
718The size of the stack is limited; if the program tries to use too
719much, that causes the program to fail because the stack is full. This
720is called @dfn{stack overflow}.
721
722@cindex crash
723@cindex segmentation fault
724Stack overflow on GNU/Linux typically manifests itself as the
725@dfn{signal} named @code{SIGSEGV}, also known as a ``segmentation
726fault.'' By default, this signal terminates the program immediately,
727rather than letting the program try to recover, or reach an expected
728ending point. (We commonly say in this case that the program
729``crashes''). @xref{Signals}.
730
731It is inconvenient to observe a crash by passing too large
732an argument to recursive Fibonacci, because the program would run a
733long time before it crashes. This algorithm is simple but
734ridiculously slow: in calculating @code{fib (@var{n})}, the number of
735(recursive) calls @code{fib (1)} or @code{fib (2)} that it makes equals
736the final result.
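
The claim about the number of calls can be checked with a small
sketch that counts how often the recursion reaches @code{fib (1)} or
@code{fib (2)}. The counter @code{base_calls} and the function name
@code{fib_counted} are hypothetical additions for this illustration.

```c
/* Counts the calls that reach the base case, fib (1) or fib (2).  */
static long base_calls = 0;

/* Same algorithm as the recursive fib, plus the counting.  */
int
fib_counted (int n)
{
  if (n <= 2)
    {
      ++base_calls;
      return 1;
    }
  else
    return fib_counted (n - 1) + fib_counted (n - 2);
}
```

Starting with @code{base_calls} at zero, calling
@code{fib_counted (20)} returns 6765 and leaves @code{base_calls}
equal to 6765 as well.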
737
738However, you can observe stack overflow very quickly if you use
739this function instead:
740
@example
int
fill_stack (int n)
@{
  if (n <= 1)  /* @r{This limits the depth of recursion.} */
    return 1;
  else
    return fill_stack (n - 1);
@}
@end example
751
752Under gNewSense GNU/Linux on the Lemote Yeeloong, without optimization
753and using the default configuration, an experiment showed there is
754enough stack space to do 261906 nested calls to that function. One
755more, and the stack overflows and the program crashes. On another
756platform, with a different configuration, or with a different
757function, the limit might be bigger or smaller.
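
To see how the argument relates to the number of nested calls, here
is a sketch of a variant that records the deepest level it reaches.
The name @code{fill_stack_depth}, the extra @code{depth} parameter,
and the variable @code{deepest} are hypothetical; try it only with
arguments small enough that the stack does not overflow.

```c
/* Deepest nesting level reached so far (hypothetical bookkeeping).  */
static int deepest = 0;

/* Same recursion as fill_stack, but it also records the depth.
   The outermost call should pass 1 as depth.  */
int
fill_stack_depth (int n, int depth)
{
  if (depth > deepest)
    deepest = depth;
  if (n <= 1)  /* This limits the depth of recursion.  */
    return 1;
  else
    return fill_stack_depth (n - 1, depth + 1);
}
```

After @code{fill_stack_depth (1000, 1)} returns, @code{deepest} is
1000: one stack frame was in use for each value of @code{n} from 1000
down to 1.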
758
759@node Iterative Fibonacci
760@section Example: Iterative Fibonacci
761@cindex iterative Fibonacci function
762@cindex Fibonacci function, iterative
763
764Here's a much faster algorithm for computing the same Fibonacci
765series. It is faster for two reasons. First, it uses @dfn{iteration}
766(that is, repetition or looping) rather than recursion, so it doesn't
767take time for a large number of function calls. But mainly, it is
768faster because the number of repetitions is small---only @code{@var{n}}.
769
770@c If you change this, change the duplicate in node Example of for.
771
@example
int
fib (int n)
@{
  int last = 1;   /* @r{Initial value is @code{fib (1)}.} */
  int prev = 0;   /* @r{Initial value controls @code{fib (2)}.} */
  int i;

  for (i = 1; i < n; ++i)
    /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
    /* @r{since @code{i < n} is false the first time.} */
    @{
      /* @r{Now @code{last} is @code{fib (@code{i})}}
         @r{and @code{prev} is @code{fib (@code{i} @minus{} 1)}.} */
      /* @r{Compute @code{fib (@code{i} + 1)}.} */
      int next = prev + last;
      /* @r{Shift the values down.} */
      prev = last;
      last = next;
      /* @r{Now @code{last} is @code{fib (@code{i} + 1)}}
         @r{and @code{prev} is @code{fib (@code{i})}.}
         @r{But that won't stay true for long,}
         @r{because we are about to increment @code{i}.} */
    @}

  return last;
@}
@end example
800
801This definition computes @code{fib (@var{n})} in a time proportional
802to @code{@var{n}}. The comments in the definition explain how it works: it
803advances through the series, always keeps the last two values in
804@code{last} and @code{prev}, and adds them to get the next value.
805
806Here are the additional C features that this definition uses:
807
808@table @asis
809@item Internal blocks
810Within a function, wherever a statement is called for, you can write a
811@dfn{block}. It looks like @code{@{ @r{@dots{}} @}} and contains zero or
812more statements and declarations. (You can also use additional
813blocks as statements in a block.)
814
815The function body also counts as a block, which is why it can contain
816statements and declarations.
817
818@xref{Blocks}.
819
820@item Declarations of local variables
821This function body contains declarations as well as statements. There
822are three declarations directly in the function body, as well as a
823fourth declaration in an internal block. Each starts with @code{int}
824because it declares a variable whose type is integer. One declaration
825can declare several variables, but each of these declarations is
826simple and declares just one variable.
827
828Variables declared inside a block (either a function body or an
829internal block) are @dfn{local variables}. These variables exist only
830within that block; their names are not defined outside the block, and
831exiting the block deallocates their storage. This example declares
832four local variables: @code{last}, @code{prev}, @code{i}, and
833@code{next}.
834
835The most basic local variable declaration looks like this:
836
837@example
838@var{type} @var{variablename};
839@end example
840
841For instance,
842
843@example
844int i;
845@end example
846
847@noindent
848declares the local variable @code{i} as an integer.
849@xref{Variable Declarations}.
850
851@item Initializers
852When you declare a variable, you can also specify its initial value,
853like this:
854
855@example
856@var{type} @var{variablename} = @var{value};
857@end example
858
859For instance,
860
861@example
862int last = 1;
863@end example
864
865@noindent
866declares the local variable @code{last} as an integer (type
867@code{int}) and starts it off with the value 1. @xref{Initializers}.
868
869@item Assignment
An assignment is a specific kind of expression, written with the
@samp{=} operator, that stores a new value in a variable or other
place. Thus,
872
873@example
874@var{variable} = @var{value}
875@end example
876
877@noindent
878is an expression that computes @code{@var{value}} and stores the value in
879@code{@var{variable}}. @xref{Assignment Expressions}.
880
881@item Expression statements
882An expression statement is an expression followed by a semicolon.
883That computes the value of the expression, then ignores the value.
884
885An expression statement is useful when the expression changes some
886data or has other side effects---for instance, with function calls, or
887with assignments as in this example. @xref{Expression Statement}.
888
889Using an expression with no side effects in an expression statement is
890pointless except in very special cases. For instance, the expression
891statement @code{x;} would examine the value of @code{x} and ignore it.
892That is not useful.
893
894@item Increment operator
895The increment operator is @samp{++}. @code{++i} is an
896expression that is short for @code{i = i + 1}.
897@xref{Increment/Decrement}.
898
899@item @code{for} statements
900A @code{for} statement is a clean way of executing a statement
901repeatedly---a @dfn{loop} (@pxref{Loop Statements}). Specifically,
902
903@example
904for (i = 1; i < n; ++i)
905 @var{body}
906@end example
907
908@noindent
909means to start by doing @code{i = 1} (set @code{i} to one) to prepare
910for the loop. The loop itself consists of
911
912@itemize @bullet
913@item
914Testing @code{i < n} and exiting the loop if that's false.
915
916@item
917Executing @var{body}.
918
919@item
920Advancing the loop (executing @code{++i}, which increments @code{i}).
921@end itemize
922
The net result is to execute @var{body} with 1 in @code{i},
then with 2 in @code{i}, and so on, stopping just before the repetition
where @code{i} would equal @code{n}. If @code{n} is 1 or less, the
loop executes @var{body} zero times.
926
927The body of the @code{for} statement must be one and only one
928statement. You can't write two statements in a row there; if you try
929to, only the first of them will be treated as part of the loop.
930
931The way to put multiple statements in those places is to group them
932with a block, and that's what we do in this example.
933@end table
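
The loop rules in the table above can be tried out with a short
sketch.  The two functions below are hypothetical helpers written for
this illustration; they use the same loop header as @code{fib}.  The
first has a single statement as its loop body, and the second groups
two statements, one of them a declaration, in a block.

```c
/* Return how many times the loop body runs for a given n.  */
int
count_iterations (int n)
{
  int count = 0;
  int i;

  for (i = 1; i < n; ++i)
    ++count;            /* A single statement needs no braces.  */

  return count;
}

/* Return the sum of the values that i takes inside the loop.  */
int
sum_of_loop_values (int n)
{
  int sum = 0;
  int i;

  for (i = 1; i < n; ++i)
    {
      int value = i;    /* Local to this block, like next in fib.  */
      sum = sum + value;
    }

  return sum;
}
```

With 5 as the argument, the body runs four times, with @code{i} equal
to 1, 2, 3 and 4, so the sum is 10.  With 1 or 0 as the argument, the
body does not run at all.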
934
935@node Complete Program
936@chapter A Complete Program
937@cindex complete example program
938@cindex example program, complete
939
940It's all very well to write a Fibonacci function, but you cannot run
941it by itself. It is a useful program, but it is not a complete
942program.
943
944In this chapter we present a complete program that contains the
945@code{fib} function. This example shows how to make the program
946start, how to make it finish, how to do computation, and how to print
947a result.
948
949@menu
950* Complete Example:: Turn the simple function into a full program.
951* Complete Explanation:: Explanation of each part of the example.
952* Complete Line-by-Line:: Explaining each line of the example.
953* Compile Example:: Using GCC to compile the example.
954@end menu
955
956@node Complete Example
957@section Complete Program Example
958
959Here is the complete program that uses the simple, recursive version
960of the @code{fib} function (@pxref{Recursive Fibonacci}):
961
962@example
#include <stdio.h>

int
fib (int n)
@{
  if (n <= 2)  /* @r{This avoids infinite recursion.}  */
    return 1;
  else
    return fib (n - 1) + fib (n - 2);
@}

int
main (void)
@{
  printf ("Fibonacci series item %d is %d\n",
          20, fib (20));
  return 0;
@}
981@end example
982
983@noindent
984This program prints a message that shows the value of @code{fib (20)}.
985
986Now for an explanation of what that code means.
987
988@node Complete Explanation
989@section Complete Program Explanation
990
991@ifnottex
992Here's the explanation of the code of the example in the
993previous section.
994@end ifnottex
995
996This sample program prints a message that shows the value of @code{fib
997(20)}, and exits with code 0 (which stands for successful execution).
998
999Every C program is started by running the function named @code{main}.
1000Therefore, the example program defines a function named @code{main} to
1001provide a way to start it. Whatever that function does is what the
1002program does. @xref{The main Function}.
1003
1004The @code{main} function is the first one called when the program
1005runs, but it doesn't come first in the example code. The order of the
1006function definitions in the source code makes no difference to the
1007program's meaning.
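
One caveat is worth a sketch: a function should be declared before the
point where it is called.  In the example program that happens
automatically, because the definition of @code{fib} comes before
@code{main}.  If a caller is written first, a separate declaration
does the job.  The helper @code{twentieth_fib} below is a hypothetical
name used only for this illustration.

```c
/* Declaration only: it tells the compiler the type of fib
   before the definition appears further down.  */
int fib (int n);

/* Hypothetical caller, defined before fib itself.  */
int
twentieth_fib (void)
{
  return fib (20);
}

int
fib (int n)
{
  if (n <= 2)  /* This avoids infinite recursion.  */
    return 1;
  else
    return fib (n - 1) + fib (n - 2);
}
```

With the declaration in place, the two definitions can appear in
either order and the result is the same.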
1008
1009The initial call to @code{main} always passes certain arguments, but
1010@code{main} does not have to pay attention to them. To ignore those
1011arguments, define @code{main} with @code{void} as the parameter list.
1012(@code{void} as a function's parameter list normally means ``call with
1013no arguments,'' but @code{main} is a special case.)
1014
1015The function @code{main} returns 0 because that is
1016the conventional way for @code{main} to indicate successful execution.
1017It could instead return a positive integer to indicate failure, and
1018some utility programs have specific conventions for the meaning of
1019certain numeric @dfn{failure codes}. @xref{Values from main}.
1020
1021@cindex @code{printf}
1022The simplest way to print text in C is by calling the @code{printf}
1023function, so here we explain what that does.
1024
1025@cindex standard output
1026The first argument to @code{printf} is a @dfn{string constant}
1027(@pxref{String Constants}) that is a template for output. The
1028function @code{printf} copies most of that string directly as output,
1029including the newline character at the end of the string, which is
1030written as @samp{\n}. The output goes to the program's @dfn{standard
1031output} destination, which in the usual case is the terminal.
1032
1033@samp{%} in the template introduces a code that substitutes other text
1034into the output. Specifically, @samp{%d} means to take the next
1035argument to @code{printf} and substitute it into the text as a decimal
1036number. (The argument for @samp{%d} must be of type @code{int}; if it
1037isn't, @code{printf} will malfunction.) So the output is a line that
1038looks like this:
1039
1040@example
1041Fibonacci series item 20 is 6765
1042@end example
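
To see the substitution without looking at a terminal, here is a
sketch that formats the same message into a character array using the
standard function @code{snprintf}, which accepts the same template
codes as @code{printf}.  The function @code{format_fib_message} is a
hypothetical helper, and character arrays are a feature this chapter
has not introduced yet.

```c
#include <stdio.h>

int
fib (int n)
{
  if (n <= 2)  /* This avoids infinite recursion.  */
    return 1;
  else
    return fib (n - 1) + fib (n - 2);
}

/* Write the example's message for item n into buffer, which has
   room for size characters.  Return what snprintf returns: the
   number of characters the full message needs, not counting the
   terminating null character.  */
int
format_fib_message (char *buffer, size_t size, int n)
{
  return snprintf (buffer, size, "Fibonacci series item %d is %d\n",
                   n, fib (n));
}
```

Each @samp{%d} consumes one @code{int} argument, in order: first
@code{n}, then the value of @code{fib (n)}.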
1043
1044This program does not contain a definition for @code{printf} because
1045it is defined by the C library, which makes it available in all C
1046programs. However, each program does need to @dfn{declare}
1047@code{printf} so it will be called correctly. The @code{#include}
1048line takes care of that; it includes a @dfn{header file} called
1049@file{stdio.h} into the program's code. That file is provided by the
1050operating system and it contains declarations for the many standard
1051input/output functions in the C library, one of which is
1052@code{printf}.
1053
1054Don't worry about header files for now; we'll explain them later in
1055@ref{Header Files}.
1056
1057The first argument of @code{printf} does not have to be a string
1058constant; it can be any string (@pxref{Strings}). However, using a
1059constant is the most common case.
1060
1061To learn more about @code{printf} and other facilities of the C
1062library, see @ref{Top, The GNU C Library, , libc, The GNU C Library
1063Reference Manual}.
1064
1065@node Complete Line-by-Line
1066@section Complete Program, Line by Line
1067
1068Here's the same example, explained line by line.
1069@strong{Beginners, do you find this helpful or not?
1070Would you prefer a different layout for the example?
1071Please tell rms@@gnu.org.}
1072
1073@example
#include <stdio.h>      /* @r{Include declaration of usual}  */
                        /*   @r{I/O functions such as @code{printf}.}  */
                        /* @r{Most programs need these.}  */

int                     /* @r{This function returns an @code{int}.}  */
fib (int n)             /* @r{Its name is @code{fib};}  */
                        /*   @r{its argument is called @code{n}.}  */
@{                       /* @r{Start of function body.}  */
  /* @r{This stops the recursion from being infinite.}  */
  if (n <= 2)           /* @r{If @code{n} is 1 or 2,}  */
    return 1;           /*   @r{make @code{fib} return 1.}  */
  else                  /* @r{Otherwise, add the two previous}  */
                        /*   @r{Fibonacci numbers.}  */
    return fib (n - 1) + fib (n - 2);
@}

int                     /* @r{This function returns an @code{int}.}  */
main (void)             /* @r{Start here; ignore arguments.}  */
@{                       /* @r{Print message with numbers in it.}  */
  printf ("Fibonacci series item %d is %d\n",
          20, fib (20));
  return 0;             /* @r{Terminate program, report success.}  */
@}
1097@end example
1098
1099@node Compile Example
1100@section Compiling the Example Program
1101@cindex compiling
1102@cindex executable file
1103
1104To run a C program requires converting the source code into an
1105@dfn{executable file}. This is called @dfn{compiling} the program,
1106and the command to do that using GNU C is @command{gcc}.
1107
1108This example program consists of a single source file. If we
1109call that file @file{fib1.c}, the complete command to compile it is
1110this:
1111
1112@example
1113gcc -g -O -o fib1 fib1.c
1114@end example
1115
1116@noindent
1117Here, @option{-g} says to generate debugging information, @option{-O}
1118says to optimize at the basic level, and @option{-o fib1} says to put
1119the executable program in the file @file{fib1}.
1120
1121To run the program, use its file name as a shell command.
1122For instance,
1123
1124@example
1125./fib1
1126@end example
1127
1128@noindent
1129However, unless you are sure the program is correct, you should
1130expect to need to debug it. So use this command,
1131
1132@example
1133gdb fib1
1134@end example
1135
1136@noindent
1137which starts the GDB debugger (@pxref{Sample Session, Sample Session,
1138A Sample GDB Session, gdb, Debugging with GDB}) so you can run and
1139debug the executable program @code{fib1}.
1140
1141
1142@xref{Compilation}, for an introduction to compiling more complex
1143programs which consist of more than one source file.
1144
1145@node Storage
1146@chapter Storage and Data
1147@cindex bytes
1148@cindex storage organization
1149@cindex memory organization
1150
1151Storage in C programs is made up of units called @dfn{bytes}. On
1152nearly all computers, a byte consists of 8 bits, but there are a few
1153peculiar computers (mostly ``embedded controllers'' for very small
1154systems) where a byte is longer than that. This manual does not try
1155to explain the peculiarity of those computers; we assume that a byte
1156is 8 bits.
1157
1158Every C data type is made up of a certain number of bytes; that number
1159is the data type's @dfn{size}. @xref{Type Size}, for details. The
1160types @code{signed char} and @code{unsigned char} are one byte long;
1161use those types to operate on data byte by byte. @xref{Signed and
1162Unsigned Types}. You can refer to a series of consecutive bytes as an
1163array of @code{char} elements; that's what an ASCII string looks like
1164in memory. @xref{String Constants}.
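
As a rough sketch of operating on data byte by byte, the hypothetical
helper @code{byte_at} below views any object as a series of
@code{unsigned char} elements.  It uses pointers, which are explained
later in the manual.  The constant @code{CHAR_BIT} from
@file{limits.h} is the number of bits in a byte on the current
platform.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a byte.  */
#include <stddef.h>   /* size_t */

/* Return the byte at position index within the object that
   starts at address object.  */
unsigned char
byte_at (const void *object, size_t index)
{
  const unsigned char *bytes = object;
  return bytes[index];
}

/* Return the number of bits in an object whose size is size bytes.  */
size_t
bits_in (size_t size)
{
  return size * CHAR_BIT;
}
```

For instance, @code{byte_at ("GNU", 0)} is the character code for
@samp{G}, and the byte at index 3 is the null character that ends the
string.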
1165
1166@node Beyond Integers
1167@chapter Beyond Integers
1168
1169So far we've presented programs that operate on integers. In this
1170chapter we'll present examples of handling non-integral numbers and
1171arrays of numbers.
1172
1173@menu
1174* Float Example:: A function that uses floating-point numbers.
1175* Array Example:: A function that works with arrays.
1176* Array Example Call:: How to call that function.
1177* Array Example Variations:: Different ways to write the call example.
1178@end menu
1179
1180@node Float Example
1181@section An Example with Non-Integer Numbers
1182@cindex floating point example
1183
1184Here's a function that operates on and returns @dfn{floating point}
1185numbers that don't have to be integers. Floating point represents a
1186number as a fraction together with a power of 2. (For more detail,
1187@pxref{Floating-Point Data Types}.) This example calculates the
1188average of three floating point numbers that are passed to it as
1189arguments:
1190
1191@example
double
average_of_three (double a, double b, double c)
@{
  return (a + b + c) / 3;
@}
1197@end example
1198
The values of the parameters @code{a}, @code{b} and @code{c} do not
have to be integers, and even when they happen to be integers, most
likely their average is not an integer.
1202
1203@code{double} is the usual data type in C for calculations on
1204floating-point numbers.
1205
1206To print a @code{double} with @code{printf}, we must use @samp{%f}
1207instead of @samp{%d}:
1208
1209@example
1210printf ("Average is %f\n",
1211 average_of_three (1.1, 9.8, 3.62));
1212@end example
1213
1214The code that calls @code{printf} must pass a @code{double} for
1215printing with @samp{%f} and an @code{int} for printing with @samp{%d}.
1216If the argument has the wrong type, @code{printf} will produce garbage
1217output.
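
When the value to print with @samp{%f} is an integer, one way to avoid
the mismatch is to convert it to @code{double} explicitly in the call.
The sketch below does that with a cast, a feature described later in
the manual.  The function name @code{format_count_as_float} is
hypothetical, and it uses @code{snprintf} so that the result can be
examined as a string.

```c
#include <stdio.h>

/* Format the integer count for %f by converting it to double first.
   Passing count itself for %f would be the type mismatch that the
   text warns about.  */
int
format_count_as_float (char *buffer, size_t size, int count)
{
  return snprintf (buffer, size, "Count is %f\n", (double) count);
}
```

With 3 as the count, the text stored in the buffer is @samp{Count is
3.000000} followed by a newline; @samp{%f} prints six digits after the
decimal point unless told otherwise.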
1218
1219Here's a complete program that computes the average of three
1220specific numbers and prints the result:
1221
1222@example
#include <stdio.h>

double
average_of_three (double a, double b, double c)
@{
  return (a + b + c) / 3;
@}

int
main (void)
@{
  printf ("Average is %f\n",
          average_of_three (1.1, 9.8, 3.62));
  return 0;
@}
1236@end example
1237
From now on we will not present complete programs with a @code{main}
function. Instead we encourage you to write one for yourself when you
want to test executing some code.
1241
1242@node Array Example
1243@section An Example with Arrays
1244@cindex array example
1245
1246A function to take the average of three numbers is very specific and
1247limited. A more general function would take the average of any number
1248of numbers. That requires passing the numbers in an array. An array
1249is an object in memory that contains a series of values of the same
1250data type. This chapter presents the basic concepts and use of arrays
1251through an example; for the full explanation, see @ref{Arrays}.
1252
1253Here's a function definition to take the average of several
1254floating-point numbers, passed as type @code{double}. The first
1255parameter, @code{length}, specifies how many numbers are passed. The
1256second parameter, @code{input_data}, is an array that holds those
1257numbers.
1258
1259@example
double
avg_of_double (int length, double input_data[])
@{
  double sum = 0;
  int i;

  for (i = 0; i < length; i++)
    sum = sum + input_data[i];

  return sum / length;
@}
1271@end example
1272
1273This introduces the expression to refer to an element of an array:
1274@code{input_data[i]} means the element at index @code{i} in
1275@code{input_data}. The index of the element can be any expression
1276with an integer value; in this case, the expression is @code{i}.
1277@xref{Accessing Array Elements}.
1278
1279@cindex zero-origin indexing
1280The lowest valid index in an array is 0, @emph{not} 1, and the highest
1281valid index is one less than the number of elements. (This is known
1282as @dfn{zero-origin indexing}.)
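
A minimal sketch of zero-origin indexing: the two hypothetical
functions below return the first and the last element of an array.
The second one assumes that @code{length} is at least 1.

```c
/* Return the first element of data, which is at index 0.  */
double
first_element (double data[])
{
  return data[0];
}

/* Return the last of the length elements of data.
   Assumes length is at least 1.  */
double
last_element (int length, double data[])
{
  return data[length - 1];
}
```

For an array of three elements, the valid indices are 0, 1 and 2, so
@code{last_element} looks at index 2.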
1283
1284This example also introduces the way to declare that a function
1285parameter is an array. Such declarations are modeled after the syntax
1286for an element of the array. Just as @code{double foo} declares that
1287@code{foo} is of type @code{double}, @code{double input_data[]}
1288declares that each element of @code{input_data} is of type
1289@code{double}. Therefore, @code{input_data} itself has type ``array
1290of @code{double}.''
1291
1292When declaring an array parameter, it's not necessary to say how long
1293the array is. In this case, the parameter @code{input_data} has no
1294length information. That's why the function needs another parameter,
1295@code{length}, for the caller to provide that information to the
1296function @code{avg_of_double}.
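
Since the function depends entirely on the caller for the length, a
cautious variant might check that value before using it.  The sketch
below is one possible way to do that; the name
@code{avg_of_double_checked} and the choice to return 0 for an empty
array are assumptions made for this illustration, not part of the
example above.

```c
double
avg_of_double_checked (int length, double input_data[])
{
  double sum = 0;
  int i;

  if (length <= 0)   /* Nothing to average; avoid dividing by zero.  */
    return 0;

  for (i = 0; i < length; i++)
    sum = sum + input_data[i];

  return sum / length;
}
```

Without the check, a length of 0 would make the function divide zero
by zero, and the result would not be a meaningful average.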
1297
1298@node Array Example Call
1299@section Calling the Array Example
1300
1301To call the function @code{avg_of_double} requires making an
1302array and then passing it as an argument. Here is an example.
1303
1304@example
@{
  /* @r{The array of values to average.}  */
  double nums_to_average[5];
  /* @r{The average, once we compute it.}  */
  double average;

  /* @r{Fill in elements of @code{nums_to_average}.}  */

  nums_to_average[0] = 58.7;
  nums_to_average[1] = 5.1;
  nums_to_average[2] = 7.7;
  nums_to_average[3] = 105.2;
  nums_to_average[4] = -3.14159;

  average = avg_of_double (5, nums_to_average);

  /* @r{@dots{}now make use of @code{average}@dots{}}  */
@}
1323@end example
1324
1325This shows an array subscripting expression again, this time
1326on the left side of an assignment, storing a value into an
1327element of an array.
1328
1329It also shows how to declare a local variable that is an array:
1330@code{double nums_to_average[5];}. Since this declaration allocates the
1331space for the array, it needs to know the array's length. You can
1332specify the length with any expression whose value is an integer, but
1333in this declaration the length is a constant, the integer 5.
1334
1335The name of the array, when used by itself as an expression, stands
1336for the address of the array's data, and that's what gets passed to
1337the function @code{avg_of_double} in @code{avg_of_double (5,
1338nums_to_average)}.
1339
1340We can make the code easier to maintain by avoiding the need to write
13415, the array length, when calling @code{avg_of_double}. That way, if
1342we change the array to include more elements, we won't have to change
1343that call. One way to do this is with the @code{sizeof} operator:
1344
1345@example
1346 average = avg_of_double ((sizeof (nums_to_average)
1347 / sizeof (nums_to_average[0])),
1348 nums_to_average);
1349@end example
1350
1351This computes the number of elements in @code{nums_to_average} by dividing
1352its total size by the size of one element. @xref{Type Size}, for more
1353details of using @code{sizeof}.
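
Programs that need this computation in many places often wrap it in a
macro.  The sketch below shows one common form; the name
@code{ARRAY_LENGTH} is a hypothetical choice, and macros themselves
are a preprocessing feature described much later.  Note that the
computation works only on a real array variable.  Inside
@code{avg_of_double}, the parameter @code{input_data} carries no
length information, so the same expression would not give the number
of elements there.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper macro: number of elements in ARRAY.
   It works only on a real array variable, not on an array
   parameter.  */
#define ARRAY_LENGTH(array) (sizeof (array) / sizeof ((array)[0]))

/* Return how many elements a local five-element array has.  */
size_t
count_local_elements (void)
{
  double nums_to_average[5];
  return ARRAY_LENGTH (nums_to_average);
}
```

The call could then be written as @code{avg_of_double (ARRAY_LENGTH
(nums_to_average), nums_to_average)}.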
1354
1355We don't show in this example what happens after storing the result of
1356@code{avg_of_double} in the variable @code{average}. Presumably
1357more code would follow that uses that result somehow. (Why compute
1358the average and not use it?) But that isn't part of this topic.
1359
1360@node Array Example Variations
1361@section Variations for Array Example
1362
1363The code to call @code{avg_of_double} has two declarations that
1364start with the same data type:
1365
1366@example
1367 /* @r{The array of values to average.} */
1368 double nums_to_average[5];
1369 /* @r{The average, once we compute it.} */
1370 double average;
1371@end example
1372
1373In C, you can combine the two, like this:
1374
1375@example
1376 double nums_to_average[5], average;
1377@end example
1378
1379This declares @code{nums_to_average} so each of its elements is a
1380@code{double}, and @code{average} so that it simply is a
1381@code{double}.
1382
1383However, while you @emph{can} combine them, that doesn't mean you
1384@emph{should}. If it is useful to write comments about the variables,
1385and usually it is, then it's clearer to keep the declarations separate
1386so you can put a comment on each one.
1387
1388We set all of the elements of the array @code{nums_to_average} with
1389assignments, but it is more convenient to use an initializer in the
1390declaration:
1391
1392@example
@{
  /* @r{The array of values to average.}  */
  double nums_to_average[]
    = @{ 58.7, 5.1, 7.7, 105.2, -3.14159 @};

  /* @r{The average, once we compute it.}  */
  double average
    = avg_of_double ((sizeof (nums_to_average)
                      / sizeof (nums_to_average[0])),
                     nums_to_average);

  /* @r{@dots{}now make use of @code{average}@dots{}}  */
@}
1405@end example
1406
1407The array initializer is a comma-separated list of values, delimited
1408by braces. @xref{Initializers}.
1409
1410Note that the declaration does not specify a size for
1411@code{nums_to_average}, so the size is determined from the
1412initializer. There are five values in the initializer, so
1413@code{nums_to_average} gets length 5. If we add another element to
1414the initializer, @code{nums_to_average} will have six elements.
1415
1416Because the code computes the number of elements from the size of
1417the array, using @code{sizeof}, the program will operate on all the
1418elements in the initializer, regardless of how many those are.
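
Here is a sketch of that last point as a function that can be
compiled and called.  The function name @code{average_of_six} and the
sixth value, 10.0, are additions made for this illustration; the call
to @code{avg_of_double} is unchanged even though the initializer has
grown.

```c
double
avg_of_double (int length, double input_data[])
{
  double sum = 0;
  int i;

  for (i = 0; i < length; i++)
    sum = sum + input_data[i];

  return sum / length;
}

double
average_of_six (void)
{
  /* The length, 6, comes from the number of values listed.  */
  double nums_to_average[]
    = { 58.7, 5.1, 7.7, 105.2, -3.14159, 10.0 };

  return avg_of_double ((sizeof (nums_to_average)
                         / sizeof (nums_to_average[0])),
                        nums_to_average);
}
```

The six values add up to about 183.55841, so the function returns
roughly 30.593.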
1419
1420@node Lexical Syntax
1421@chapter Lexical Syntax
1422@cindex lexical syntax
1423@cindex token
1424
1425To start the full description of the C language, we explain the
1426lexical syntax and lexical units of C code. The lexical units of a
1427programming language are known as @dfn{tokens}. This chapter covers
1428all the tokens of C except for constants, which are covered in a later
1429chapter (@pxref{Constants}). One vital kind of token is the
1430@dfn{identifier} (@pxref{Identifiers}), which is used for names of any
1431kind.
1432
1433@menu
1434* English:: Write programs in English!
1435* Characters:: The characters allowed in C programs.
1436* Whitespace:: The particulars of whitespace characters.
1437* Comments:: How to include comments in C code.
1438* Identifiers:: How to form identifiers (names).
1439* Operators/Punctuation:: Characters used as operators or punctuation.
1440* Line Continuation:: Splitting one line into multiple lines.
1441@end menu
1442
1443@node English
1444@section Write Programs in English!
1445
In principle, you can write the function and variable names in a
program, and the comments, in any human language. C allows any kind
of character in comments, and you can put non-ASCII characters into
identifiers with a special prefix. However, to enable programmers in
all countries to understand and develop the program, it is best, given
today's circumstances, to write identifiers and comments in
English.
1453
1454English is the one language that programmers in all countries
1455generally study. If a program's names are in English, most
1456programmers in Bangladesh, Belgium, Bolivia, Brazil, and Bulgaria can
1457understand them. Most programmers in those countries can speak
1458English, or at least read it, but they do not read each other's
1459languages at all. In India, with so many languages, two programmers
1460may have no common language other than English.
1461
1462If you don't feel confident in writing English, do the best you can,
1463and follow each English comment with a version in a language you
1464write better; add a note asking others to translate that to English.
1465Someone will eventually do that.
1466
1467The program's user interface is a different matter. We don't need to
1468choose one language for that; it is easy to support multiple languages
1469and let each user choose the language to use. This requires writing
1470the program to support localization of its interface. (The
1471@code{gettext} package exists to support this; @pxref{Message
1472Translation, The GNU C Library, , libc, The GNU C Library Reference
1473Manual}.) Then a community-based translation effort can provide
1474support for all the languages users want to use.
1475
1476@node Characters
1477@section Characters
1478@cindex character set
1479@cindex Unicode
1480
1481@c ??? How to express ¶?
1482
1483GNU C source files are usually written in the
1484@url{https://en.wikipedia.org/wiki/ASCII,,ASCII} character set, which
1485was defined in the 1960s for English. However, they can also include
1486Unicode characters represented in the
1487@url{https://en.wikipedia.org/wiki/UTF-8,,UTF-8} multibyte encoding.
1488This makes it possible to represent accented letters such as @samp{á},
1489as well as other scripts such as Arabic, Chinese, Cyrillic, Hebrew,
1490Japanese, and Korean.@footnote{On some obscure systems, GNU C uses
1491UTF-EBCDIC instead of UTF-8, but that is not worth describing in this
1492manual.}
1493
1494In C source code, non-ASCII characters are valid in comments, in wide
1495character constants (@pxref{Wide Character Constants}), and in string
1496constants (@pxref{String Constants}).
1497
1498@c ??? valid in identifiers?
1499Another way to specify non-ASCII characters in constants (character or
1500string) and identifiers is with an escape sequence starting with
1501backslash, specifying the intended Unicode character. (@xref{Unicode
1502Character Codes}.) This specifies non-ASCII characters without
1503putting a real non-ASCII character in the source file itself.
1504
1505C accepts two-character aliases called @dfn{digraphs} for certain
1506characters. @xref{Digraphs}.
1507
1508@node Whitespace
1509@section Whitespace
1510@cindex whitespace characters in source files
1511@cindex space character in source
1512@cindex tab character in source
1513@cindex formfeed in source
1514@cindex linefeed in source
1515@cindex newline in source
1516@cindex carriage return in source
1517@cindex vertical tab in source
1518
1519Whitespace means characters that exist in a file but appear blank in a
1520printed listing of a file (or traditionally did appear blank, several
1521decades ago). The C language requires whitespace in order to separate
1522two consecutive identifiers, or to separate an identifier from a
1523numeric constant. Other than that, and a few special situations
1524described later, whitespace is optional; you can put it in when you
1525wish, to make the code easier to read.
1526
1527Space and tab in C code are treated as whitespace characters. So are
1528line breaks. You can represent a line break with the newline
1529character (also called @dfn{linefeed} or LF), CR (carriage return), or
1530the CRLF sequence (two characters: carriage return followed by a
1531newline character).
1532
1533The @dfn{formfeed} character, Control-L, was traditionally used to
1534divide a file into pages. It is still used this way in source code,
1535and the tools that generate nice printouts of source code still start
1536a new page after each ``formfeed'' character. Dividing code into
1537pages separated by formfeed characters is a good way to break it up
1538into comprehensible pieces and show other programmers where they start
1539and end.
1540
1541The @dfn{vertical tab} character, Control-K, was traditionally used to
1542make printing advance down to the next section of a page. We know of
1543no particular reason to use it in source code, but it is still
1544accepted as whitespace in C.
1545
1546Comments are also syntactically equivalent to whitespace.
1547@ifinfo
1548@xref{Comments}.
1549@end ifinfo
1550
1551@node Comments
1552@section Comments
1553@cindex comments
1554
1555A comment encapsulates text that has no effect on the program's
1556execution or meaning.
1557
The purpose of comments is to explain the code to people who read it.
1559Writing good comments for your code is tremendously important---they
1560should provide background information that helps programmers
1561understand the reasons why the code is written the way it is. You,
1562returning to the code six months from now, will need the help of these
1563comments to remember why you wrote it this way.
1564
1565Outdated comments that become incorrect are counterproductive, so part
1566of the software developer's responsibility is to update comments as
1567needed to correspond with changes to the program code.
1568
1569C allows two kinds of comment syntax, the traditional style and the
1570C@t{++} style. A traditional C comment starts with @samp{/*} and ends
1571with @samp{*/}. For instance,
1572
1573@example
1574/* @r{This is a comment in traditional C syntax.} */
1575@end example
1576
1577A traditional comment can contain @samp{/*}, but these delimiters do
1578not nest as pairs. The first @samp{*/} ends the comment regardless of
1579whether it contains @samp{/*} sequences.
1580
1581@example
1582/* @r{This} /* @r{is a comment} */ But this is not! */
1583@end example
1584
1585A @dfn{line comment} starts with @samp{//} and ends at the end of the line.
1586For instance,
1587
1588@example
1589// @r{This is a comment in C@t{++} style.}
1590@end example
1591
1592Line comments do nest, in effect, because @samp{//} inside a line
1593comment is part of that comment:
1594
1595@example
1596// @r{this whole line is} // @r{one comment}
1597This is code, not comment.
1598@end example
1599
It is safe to put line comments inside traditional comments, or vice versa.
1601
1602@example
1603@group
1604/* @r{traditional comment}
1605 // @r{contains line comment}
1606 @r{more traditional comment}
1607 */ text here is not a comment
1608
1609// @r{line comment} /* @r{contains traditional comment} */
1610@end group
1611@end example
1612
1613But beware of commenting out one end of a traditional comment with a line
1614comment. The delimiter @samp{/*} doesn't start a comment if it occurs
1615inside an already-started comment.
1616
1617@example
1618@group
1619 // @r{line comment} /* @r{That would ordinarily begin a block comment.}
1620 Oops! The line comment has ended;
1621 this isn't a comment any more. */
1622@end group
1623@end example
1624
1625Comments are not recognized within string constants. @t{@w{"/* blah
1626*/"}} is the string constant @samp{@w{/* blah */}}, not an empty
1627string.
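
As a small sketch of this rule, the following function returns the
length of a string constant whose contents merely look like a comment.
The name @code{comment_like_length} is made up for this illustration.
If the compiler treated the text as a comment, the string would be
empty and the length would be zero.

```c
#include <string.h>

/* Hypothetical helper: the text between the quotes looks like a
   comment, but inside a string constant it is ordinary data.  */
size_t
comment_like_length (void)
{
  return strlen ("/* blah */");
}
```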
1628
1629In this manual we show the text in comments in a variable-width font,
1630for readability, but this font distinction does not exist in source
1631files.
1632
1633A comment is syntactically equivalent to whitespace, so it always
1634separates tokens. Thus,
1635
1636@example
1637@group
1638 int/* @r{comment} */foo;
1639@r{is equivalent to}
1640 int foo;
1641@end group
1642@end example
1643
1644@noindent
1645but clean code always uses real whitespace to separate the comment
1646visually from surrounding code.
1647
1648@node Identifiers
1649@section Identifiers
1650@cindex identifiers
1651
1652An @dfn{identifier} (name) in C is a sequence of letters and digits,
1653as well as @samp{_}, that does not start with a digit. Most compilers
1654also allow @samp{$}. An identifier can be as long as you like; for
1655example,
1656
1657@example
1658int anti_dis_establishment_arian_ism;
1659@end example
1660
1661@cindex case of letters in identifiers
1662Letters in identifiers are case-sensitive in C; thus, @code{a}
1663and @code{A} are two different identifiers.
1664
1665@cindex keyword
1666@cindex reserved words
1667Identifiers in C are used as variable names, function names, typedef
1668names, enumeration constants, type tags, field names, and labels.
1669Certain identifiers in C are @dfn{keywords}, which means they have
1670specific syntactic meanings. Keywords in C are @dfn{reserved words},
1671meaning you cannot use them in any other way. For instance, you can't
1672define a variable or function named @code{return} or @code{if}.
1673
1674You can also include other characters, even non-ASCII characters, in
1675identifiers by writing their Unicode character names, which start with
1676@samp{\u} or @samp{\U}, in the identifier name. @xref{Unicode
Character Codes}. However, it is usually a bad idea to use non-ASCII
characters in identifiers; identifiers written in English never need
them. @xref{English}.
1680
1681Whitespace is required to separate two consecutive identifiers, or to
1682separate an identifier from a preceding or following numeric
1683constant.
1684
1685@node Operators/Punctuation
1686@section Operators and Punctuation
1687@cindex operators
1688@cindex punctuation
1689
1690Here we describe the lexical syntax of operators and punctuation in C.
1691The specific operators of C and their meanings are presented in
1692subsequent chapters.
1693
1694Most operators in C consist of one or two characters that can't be
1695used in identifiers. The characters used for operators in C are
1696@samp{!~^&|*/%+-=<>,.?:}.
1697
1698Some operators are a single character. For instance, @samp{-} is the
1699operator for negation (with one operand) and the operator for
1700subtraction (with two operands).
1701
1702Some operators are two characters. For example, @samp{++} is the
1703increment operator. Recognition of multicharacter operators works by
1704grouping together as many consecutive characters as can constitute one
1705operator.
1706
1707For instance, the character sequence @samp{++} is always interpreted
1708as the increment operator; therefore, if we want to write two
1709consecutive instances of the operator @samp{+}, we must separate them
1710with a space so that they do not combine as one token. Applying the
1711same rule, @code{a+++++b} is always tokenized as @code{@w{a++ ++ +
1712b}}, not as @code{@w{a++ + ++b}}, even though the latter could be part
1713of a valid C program and the former could not (since @code{a++}
1714is not an lvalue and thus can't be the operand of @code{++}).
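
A shorter case of the same grouping rule can be tried directly. In
this sketch (the function name @code{add_then_bump} is an assumption
made for illustration), three consecutive @samp{+} characters are
grouped as @samp{++} followed by @samp{+}, so the expression is a
post-increment followed by an addition.

```c
/* Hypothetical helper: (*p)+++n is tokenized as (*p) ++ + n,
   that is, post-increment of *p followed by addition of n.  */
int
add_then_bump (int *p, int n)
{
  return (*p)+++n;
}
```

The function returns the old value of @code{*p} plus @code{n}, and
leaves @code{*p} incremented by one. Real code should write the
spaces explicitly.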
1715
1716A few C operators are keywords rather than special characters. They
1717include @code{sizeof} (@pxref{Type Size}) and @code{_Alignof}
1718(@pxref{Type Alignment}).
1719
1720The characters @samp{;@{@}[]()} are used for punctuation and grouping.
1721Semicolon (@samp{;}) ends a statement. Braces (@samp{@{} and
1722@samp{@}}) begin and end a block at the statement level
1723(@pxref{Blocks}), and surround the initializer (@pxref{Initializers})
1724for a variable with multiple elements or components (such as arrays or
1725structures).
1726
1727Square brackets (@samp{[} and @samp{]}) do array indexing, as in
1728@code{array[5]}.
1729
1730Parentheses are used in expressions for explicit nesting of
1731expressions (@pxref{Basic Arithmetic}), around the parameter
1732declarations in a function declaration or definition, and around the
1733arguments in a function call, as in @code{printf ("Foo %d\n", i)}
1734(@pxref{Function Calls}). Several kinds of statements also use
1735parentheses as part of their syntax---for instance, @code{if}
1736statements, @code{for} statements, @code{while} statements, and
1737@code{switch} statements. @xref{if Statement}, and following
1738sections.
1739
1740Parentheses are also required around the operand of the operator
1741keywords @code{sizeof} and @code{_Alignof} when the operand is a data
1742type rather than a value. @xref{Type Size}.
1743
1744@node Line Continuation
1745@section Line Continuation
1746@cindex line continuation
1747@cindex continuation of lines
1748
1749The sequence of a backslash and a newline is ignored absolutely
1750anywhere in a C program. This makes it possible to split a single
1751source line into multiple lines in the source file. GNU C tolerates
1752and ignores other whitespace between the backslash and the newline.
1753In particular, it always ignores a CR (carriage return) character
1754there, in case some text editor decided to end the line with the CRLF
1755sequence.
1756
1757The main use of line continuation in C is for macro definitions that
1758would be inconveniently long for a single line (@pxref{Macros}).
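
For instance, here is a sketch of a macro definition split across
several lines. The macro @code{CLAMP} and the function
@code{clamp_percent} are hypothetical names chosen for this example;
each backslash is immediately followed by the newline.

```c
/* Hypothetical macro, split across lines with backslash-newline.
   The preprocessor treats the definition as one long line.  */
#define CLAMP(x, lo, hi)        \
  ((x) < (lo) ? (lo)            \
   : (x) > (hi) ? (hi)          \
   : (x))

/* Limit VALUE to the range 0 through 100.  */
int
clamp_percent (int value)
{
  return CLAMP (value, 0, 100);
}
```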
1759
1760It is possible to continue a line comment onto another line with
1761backslash-newline. You can put backslash-newline in the middle of an
1762identifier, even a keyword, or an operator. You can even split
1763@samp{/*}, @samp{*/}, and @samp{//} onto multiple lines with
1764backslash-newline. Here's an ugly example:
1765
@example
@group
/\
*
*/ fo\
o +\
= 1\
0;
@end group
@end example
1776
1777@noindent
1778That's equivalent to @samp{/* */ foo += 10;}.
1779
1780Don't do those things in real programs, since they make code hard to
1781read.
1782
1783@strong{Note:} For the sake of using certain tools on the source code, it is
1784wise to end every source file with a newline character which is not
1785preceded by a backslash, so that it really ends the last line.
1786
1787@node Arithmetic
1788@chapter Arithmetic
1789@cindex arithmetic operators
1790@cindex operators, arithmetic
1791
1792@c ??? Duplication with other sections -- get rid of that?
1793
1794Arithmetic operators in C attempt to be as similar as possible to the
1795abstract arithmetic operations, but it is impossible to do this
1796perfectly. Numbers in a computer have a finite range of possible
1797values, and non-integer values have a limit on their possible
1798accuracy. Nonetheless, in most cases you will encounter no surprises
1799in using @samp{+} for addition, @samp{-} for subtraction, and @samp{*}
1800for multiplication.
1801
1802Each C operator has a @dfn{precedence}, which is its rank in the
1803grammatical order of the various operators. The operators with the
1804highest precedence grab adjoining operands first; these expressions
1805then become operands for operators of lower precedence. We give some
1806information about precedence of operators in this chapter where we
1807describe the operators; for the full explanation, see @ref{Binary
1808Operator Grammar}.
1809
1810The arithmetic operators always @dfn{promote} their operands before
1811operating on them. This means converting narrow integer data types to
1812a wider data type (@pxref{Operand Promotions}). If you are just
1813learning C, don't worry about this yet.
1814
1815Given two operands that have different types, most arithmetic
1816operations convert them both to their @dfn{common type}. For
1817instance, if one is @code{int} and the other is @code{double}, the
1818common type is @code{double}. (That's because @code{double} can
1819represent all the values that an @code{int} can hold, but not vice
1820versa.) For the full details, see @ref{Common Type}.
1821
1822@menu
1823* Basic Arithmetic:: Addition, subtraction, multiplication,
1824 and division.
1825* Integer Arithmetic:: How C performs arithmetic with integer values.
1826* Integer Overflow:: When an integer value exceeds the range
1827 of its type.
1828* Mixed Mode:: Calculating with both integer values
1829 and floating-point values.
1830* Division and Remainder:: How integer division works.
1831* Numeric Comparisons:: Comparing numeric values for equality or order.
1832* Shift Operations:: Shift integer bits left or right.
1833* Bitwise Operations:: Bitwise conjunction, disjunction, negation.
1834@end menu
1835
1836@node Basic Arithmetic
1837@section Basic Arithmetic
1838@cindex addition operator
1839@cindex subtraction operator
1840@cindex multiplication operator
1841@cindex division operator
1842@cindex negation operator
1843@cindex operator, addition
1844@cindex operator, subtraction
1845@cindex operator, multiplication
1846@cindex operator, division
1847@cindex operator, negation
1848
1849Basic arithmetic in C is done with the usual binary operators of
1850algebra: addition (@samp{+}), subtraction (@samp{-}), multiplication
1851(@samp{*}) and division (@samp{/}). The unary operator @samp{-} is
1852used to change the sign of a number. The unary @code{+} operator also
1853exists; it yields its operand unaltered.
1854
1855@samp{/} is the division operator, but dividing integers may not give
1856the result you expect. Its value is an integer, which is not equal to
1857the mathematical quotient when that is a fraction. Use @samp{%} to
1858get the corresponding integer remainder when necessary.
@xref{Division and Remainder}. Floating-point division yields a value
as close as possible to the mathematical quotient.
1861
1862These operators use algebraic syntax with the usual algebraic
1863precedence rule (@pxref{Binary Operator Grammar}) that multiplication
1864and division are done before addition and subtraction, but you can use
1865parentheses to explicitly specify how the operators nest. They are
1866left-associative (@pxref{Associativity and Ordering}). Thus,
1867
1868@example
1869-a + b - c + d * e / f
1870@end example
1871
1872@noindent
1873is equivalent to
1874
1875@example
1876(((-a) + b) - c) + ((d * e) / f)
1877@end example
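
One way to convince yourself of this is to compute both forms and
compare them. The two function names below are assumptions made for
this sketch, and the divisor @code{f} is assumed to be nonzero.

```c
/* Hypothetical helpers: the same expression written without and
   with explicit parentheses.  F must not be zero.  */
int
implicit_nesting (int a, int b, int c, int d, int e, int f)
{
  return -a + b - c + d * e / f;
}

int
explicit_nesting (int a, int b, int c, int d, int e, int f)
{
  return (((-a) + b) - c) + ((d * e) / f);
}
```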
1878
1879@node Integer Arithmetic
1880@section Integer Arithmetic
1881@cindex integer arithmetic
1882
1883Each of the basic arithmetic operations in C has two variants for
1884integers: @dfn{signed} and @dfn{unsigned}. The choice is determined
1885by the data types of their operands.
1886
1887Each integer data type in C is either @dfn{signed} or @dfn{unsigned}.
1888A signed type can hold a range of positive and negative numbers, with
1889zero near the middle of the range. An unsigned type can hold only
1890nonnegative numbers; its range starts with zero and runs upward.
1891
1892The most basic integer types are @code{int}, which normally can hold
1893numbers from @minus{}2,147,483,648 to 2,147,483,647, and @code{unsigned
int}, which normally can hold numbers from 0 to 4,294,967,295. (This
1895assumes @code{int} is 32 bits wide, always true for GNU C on real
1896computers but not always on embedded controllers.) @xref{Integer
1897Types}, for full information about integer types.
1898
1899When a basic arithmetic operation is given two signed operands, it
1900does signed arithmetic. Given two unsigned operands, it does
1901unsigned arithmetic.
1902
1903If one operand is @code{unsigned int} and the other is @code{int}, the
1904operator treats them both as unsigned. More generally, the common
1905type of the operands determines whether the operation is signed or
1906not. @xref{Common Type}.
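
The effect is easiest to see with division, where the signed and
unsigned variants give different values. This is only a sketch; the
function names are hypothetical.

```c
/* Hypothetical helpers.  Both can divide -6 by 2, but the operand
   types decide whether the division is signed or unsigned.  */
int
divide_signed (int a, int b)
{
  return a / b;                 /* int / int: signed division */
}

unsigned int
divide_mixed (int a, unsigned int b)
{
  return a / b;                 /* a is converted to unsigned int first */
}
```

Here @code{divide_signed (-6, 2)} yields @minus{}3, whereas
@code{divide_mixed (-6, 2u)} converts @minus{}6 to a large unsigned
number before dividing, so the quotient is a large positive number.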
1907
1908Printing the results of unsigned arithmetic with @code{printf} using
1909@samp{%d} can produce surprising results for values far away from
1910zero. Even though the rules above say that the computation was done
1911with unsigned arithmetic, the printed result may appear to be signed!
1912
1913The explanation is that the bit pattern resulting from addition,
1914subtraction or multiplication is actually the same for signed and
1915unsigned operations. The difference is only in the data type of the
1916result, which affects the @emph{interpretation} of the result bit pattern,
1917and whether the arithmetic operation can overflow (see the next section).
1918
1919But @samp{%d} doesn't know its argument's data type. It sees only the
1920value's bit pattern, and it is defined to interpret that as
1921@code{signed int}. To print it as unsigned requires using @samp{%u}
1922instead of @samp{%d}. @xref{Formatted Output, The GNU C Library, ,
1923libc, The GNU C Library Reference Manual}.
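
The following sketch formats one bit pattern both ways, writing into
a buffer with @code{snprintf} so the results can be inspected. The
function names are hypothetical. The cast to @code{int} makes the
signed reinterpretation explicit; the sketch relies on GNU C, which
defines that conversion as keeping the low-order bits.

```c
#include <stdio.h>

/* Hypothetical helpers: format the same bit pattern with %u and
   with %d.  */
void
format_as_unsigned (char *buf, size_t size, unsigned int value)
{
  snprintf (buf, size, "%u", value);
}

void
format_as_signed (char *buf, size_t size, unsigned int value)
{
  /* The cast reinterprets the value as signed (GNU C behavior).  */
  snprintf (buf, size, "%d", (int) value);
}
```

Given the largest @code{unsigned int} value, the first function
produces a large positive number (4294967295 if @code{int} is 32 bits
wide) and the second produces @minus{}1.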
1924
1925Arithmetic in C never operates directly on narrow integer types (those
1926with fewer bits than @code{int}; @ref{Narrow Integers}). Instead it
1927``promotes'' them to @code{int}. @xref{Operand Promotions}.
1928
1929@node Integer Overflow
1930@section Integer Overflow
1931@cindex integer overflow
1932@cindex overflow, integer
1933
1934When the mathematical value of an arithmetic operation doesn't fit in
1935the range of the data type in use, that's called @dfn{overflow}.
1936When it happens in integer arithmetic, it is @dfn{integer overflow}.
1937
1938Integer overflow happens only in arithmetic operations. Type conversion
1939operations, by definition, do not cause overflow, not even when the
1940result can't fit in its new type. @xref{Integer Conversion}.
1941
1942Signed numbers use two's-complement representation, in which the most
1943negative number lacks a positive counterpart (@pxref{Integers in
1944Depth}). Thus, the unary @samp{-} operator on a signed integer can
1945overflow.
1946
1947@menu
* Unsigned Overflow:: Overflow in unsigned integer arithmetic.
* Signed Overflow:: Overflow in signed integer arithmetic.
1950@end menu
1951
1952@node Unsigned Overflow
1953@subsection Overflow with Unsigned Integers
1954
1955Unsigned arithmetic in C ignores overflow; it produces the true result
1956modulo the @var{n}th power of 2, where @var{n} is the number of bits
1957in the data type. We say it ``truncates'' the true result to the
1958lowest @var{n} bits.
1959
1960A true result that is negative, when taken modulo the @var{n}th power
1961of 2, yields a positive number. For instance,
1962
1963@example
1964unsigned int x = 1;
1965unsigned int y;
1966
1967y = -x;
1968@end example
1969
1970@noindent
1971causes overflow because the negative number @minus{}1 can't be stored
1972in an unsigned type. The actual result, which is @minus{}1 modulo the
1973@var{n}th power of 2, is one less than the @var{n}th power of 2. That
1974is the largest value that the unsigned data type can store. For a
197532-bit @code{unsigned int}, the value is 4,294,967,295. @xref{Maximum
1976and Minimum Values}.
1977
1978Adding that number to itself, as here,
1979
1980@example
1981unsigned int z;
1982
1983z = y + y;
1984@end example
1985
1986@noindent
ought to yield 8,589,934,590; however, that is again too large to fit,
1988so overflow truncates the value to 4,294,967,294. If that were a
1989signed integer, it would mean @minus{}2, which (not by coincidence)
1990equals @minus{}1 + @minus{}1.
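
Both wraparound cases can be checked with a short sketch. The
function names are assumptions made for illustration;
@code{UINT_MAX} is the standard macro from @file{limits.h} for the
largest @code{unsigned int} value.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if negating the unsigned value 1
   yields the largest unsigned int.  */
int
negation_gives_maximum (void)
{
  unsigned int x = 1;
  unsigned int y = -x;          /* wraps around modulo 2 to the nth */
  return y == UINT_MAX;
}

/* Hypothetical helper: adds a value to itself, wrapping around on
   overflow.  */
unsigned int
double_unsigned (unsigned int y)
{
  return y + y;
}
```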
1991
1992@node Signed Overflow
1993@subsection Overflow with Signed Integers
1994@cindex compiler options for integer overflow
1995@cindex integer overflow, compiler options
1996@cindex overflow, compiler options
1997
1998For signed integers, the result of overflow in C is @emph{in
1999principle} undefined, meaning that anything whatsoever could happen.
2000Therefore, C compilers can do optimizations that treat the overflow
2001case with total unconcern. (Since the result of overflow is undefined
2002in principle, one cannot claim that these optimizations are
2003erroneous.)
2004
2005@strong{Watch out:} These optimizations can do surprising things. For
2006instance,
2007
2008@example
2009int i;
2010@r{@dots{}}
2011if (i < i + 1)
2012 x = 5;
2013@end example
2014
2015@noindent
2016could be optimized to do the assignment unconditionally, because the
2017@code{if}-condition is always true if @code{i + 1} does not overflow.
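
One way to avoid depending on what overflow does is to test for it
before adding, using only comparisons that cannot themselves
overflow. The following is a sketch of that approach, not something
prescribed here; the function name @code{add_would_overflow} is
hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: return 1 if a + b would overflow the range
   of int, 0 otherwise.  None of the subtractions can overflow.  */
int
add_would_overflow (int a, int b)
{
  if (b > 0)
    return a > INT_MAX - b;
  if (b < 0)
    return a < INT_MIN - b;
  return 0;
}
```

A caller would do the addition only when this function returns zero.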
2018
GCC offers compiler options to control the handling of signed integer
2020overflow. These options operate per module; that is, each module
2021behaves according to the options it was compiled with.
2022
2023These two options specify particular ways to handle signed integer
2024overflow, other than the default way:
2025
2026@table @option
2027@item -fwrapv
2028Make signed integer operations well-defined, like unsigned integer
2029operations: they produce the @var{n} low-order bits of the true
2030result. The highest of those @var{n} bits is the sign bit of the
2031result. With @option{-fwrapv}, these out-of-range operations are not
2032considered overflow, so (strictly speaking) integer overflow never
2033happens.
2034
2035The option @option{-fwrapv} enables some optimizations based on the
2036defined values of out-of-range results. In GCC 8, it disables
2037optimizations that are based on assuming signed integer operations
2038will not overflow.
2039
2040@item -ftrapv
2041Generate a signal @code{SIGFPE} when signed integer overflow occurs.
2042This terminates the program unless the program handles the signal.
2043@xref{Signals}.
2044@end table
2045
2046One other option is useful for finding where overflow occurs:
2047
2048@ignore
2049@item -fno-strict-overflow
2050Disable optimizations that are based on assuming signed integer
2051operations will not overflow.
2052@end ignore
2053
2054@table @option
2055@item -fsanitize=signed-integer-overflow
2056Output a warning message at run time when signed integer overflow
2057occurs. This checks the @samp{+}, @samp{*}, and @samp{-} operators.
2058This takes priority over @option{-ftrapv}.
2059@end table
2060
2061@node Mixed Mode
2062@section Mixed-Mode Arithmetic
2063
2064Mixing integers and floating-point numbers in a basic arithmetic
2065operation converts the integers automatically to floating point.
2066In most cases, this gives exactly the desired results.
2067But sometimes it matters precisely where the conversion occurs.
2068
If @code{i} and @code{j} are integers, @code{(i + j) * 2.0} adds them
as integers, then converts the sum to floating point for the
2071multiplication. If the addition gets an overflow, that is not
2072equivalent to converting both integers to floating point and then
2073adding them. You can get the latter result by explicitly converting
2074the integers, as in @code{((double) i + (double) j) * 2.0}.
2075@xref{Explicit Type Conversion}.
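
Here is a sketch of the two forms as functions; the names are
hypothetical. For small operands they agree. For operands whose sum
does not fit in @code{int}, only the second form is safe to call.

```c
/* Hypothetical helper: integer addition first, then conversion.
   The addition can overflow for large operands.  */
double
sum_then_convert (int i, int j)
{
  return (i + j) * 2.0;
}

/* Hypothetical helper: conversion first, then floating-point
   addition, which has far more range than int.  */
double
convert_then_sum (int i, int j)
{
  return ((double) i + (double) j) * 2.0;
}
```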
2076
2077@c Eggert's report
2078Adding or multiplying several values, including some integers and some
2079floating point, does the operations left to right. Thus, @code{3.0 +
2080i + j} converts @code{i} to floating point, then adds 3.0, then
2081converts @code{j} to floating point and adds that. You can specify a
2082different order using parentheses: @code{3.0 + (i + j)} adds @code{i}
2083and @code{j} first and then adds that result (converting to floating
2084point) to 3.0. In this respect, C differs from other languages, such
2085as Fortran.
2086
2087@node Division and Remainder
2088@section Division and Remainder
2089@cindex remainder operator
2090@cindex modulus
2091@cindex operator, remainder
2092
2093Division of integers in C rounds the result to an integer. The result
2094is always rounded towards zero.
2095
@example
 16 / 3 @result{} 5
-16 / 3 @result{} -5
 16 / -3 @result{} -5
-16 / -3 @result{} 5
@end example
2102
2103@noindent
2104To get the corresponding remainder, use the @samp{%} operator:
2105
@example
 16 % 3 @result{} 1
-16 % 3 @result{} -1
 16 % -3 @result{} 1
-16 % -3 @result{} -1
@end example
2112
2113@noindent
2114@samp{%} has the same operator precedence as @samp{/} and @samp{*}.
2115
2116From the rounded quotient and the remainder, you can reconstruct
2117the dividend, like this:
2118
2119@example
2120int
2121original_dividend (int divisor, int quotient, int remainder)
2122@{
2123 return divisor * quotient + remainder;
2124@}
2125@end example
2126
2127To do unrounded division, use floating point. If only one operand is
2128floating point, @samp{/} converts the other operand to floating
2129point.
2130
@example
16.0 / 3 @result{} 5.333333333333333
16 / 3.0 @result{} 5.333333333333333
16.0 / 3.0 @result{} 5.333333333333333
16 / 3 @result{} 5
@end example
2137
2138The remainder operator @samp{%} is not allowed for floating-point
2139operands, because it is not needed. The concept of remainder makes
2140sense for integers because the result of division of integers has to
2141be an integer. For floating point, the result of division is a
2142floating-point number, in other words a fraction, which will differ
2143from the exact result only by a very small amount.
2144
2145There are functions in the standard C library to calculate remainders
from integral-valued division of floating-point numbers.
2147@xref{Remainder Functions, The GNU C Library, , libc, The GNU C Library
2148Reference Manual}.
2149
Integer division overflows in one specific case: dividing the most
negative value for the data type (@pxref{Maximum and Minimum Values})
2152by @minus{}1. That's because the correct result, which is the
2153corresponding positive number, does not fit (@pxref{Integer Overflow})
2154in the same number of bits. On some computers now in use, this always
2155causes a signal @code{SIGFPE} (@pxref{Signals}), the same behavior
2156that the option @option{-ftrapv} specifies (@pxref{Signed Overflow}).
2157
2158Division by zero leads to unpredictable results---depending on the
2159type of computer, it might cause a signal @code{SIGFPE}, or it might
2160produce a numeric result.
2161
2162@cindex division by zero
2163@cindex zero, division by
2164@strong{Watch out:} Make sure the program does not divide by zero. If
2165you can't prove that the divisor is not zero, test whether it is zero,
2166and skip the division if so.
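
A sketch of such a test follows. It also guards against the one
overflow case described above. The function name
@code{divide_checked} and its calling convention are assumptions made
for this example.

```c
#include <limits.h>

/* Hypothetical helper: store a / b in *quotient and return 1, or
   return 0 without dividing when the division would be invalid.  */
int
divide_checked (int a, int b, int *quotient)
{
  if (b == 0)
    return 0;                   /* division by zero */
  if (a == INT_MIN && b == -1)
    return 0;                   /* quotient would overflow */
  *quotient = a / b;
  return 1;
}
```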
2167
2168@node Numeric Comparisons
2169@section Numeric Comparisons
2170@cindex numeric comparisons
2171@cindex comparisons
2172@cindex operators, comparison
2173@cindex equal operator
2174@cindex not-equal operator
2175@cindex less-than operator
2176@cindex greater-than operator
2177@cindex less-or-equal operator
2178@cindex greater-or-equal operator
2179@cindex operator, equal
2180@cindex operator, not-equal
2181@cindex operator, less-than
2182@cindex operator, greater-than
2183@cindex operator, less-or-equal
2184@cindex operator, greater-or-equal
2185@cindex truth value
2186
2187There are two kinds of comparison operators: @dfn{equality} and
2188@dfn{ordering}. Equality comparisons test whether two expressions
2189have the same value. The result is a @dfn{truth value}: a number that
2190is 1 for ``true'' and 0 for ``false.''
2191
2192@example
2193a == b /* @r{Test for equal.} */
2194a != b /* @r{Test for not equal.} */
2195@end example
2196
2197The equality comparison is written @code{==} because plain @code{=}
2198is the assignment operator.
2199
2200Ordering comparisons test which operand is greater or less. Their
2201results are truth values. These are the ordering comparisons of C:
2202
2203@example
2204a < b /* @r{Test for less-than.} */
2205a > b /* @r{Test for greater-than.} */
2206a <= b /* @r{Test for less-than-or-equal.} */
2207a >= b /* @r{Test for greater-than-or-equal.} */
2208@end example
2209
2210For any integers @code{a} and @code{b}, exactly one of the comparisons
2211@code{a < b}, @code{a == b} and @code{a > b} is true, just as in
2212mathematics. However, if @code{a} and @code{b} are special floating
2213point values (not ordinary numbers), all three can be false.
2214@xref{Special Float Values}, and @ref{Invalid Optimizations}.
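
The following sketch counts how many of the three comparisons are
true. The function names are hypothetical, and the example assumes
the special value used is a NaN, obtained from the @code{NAN} macro
that @file{math.h} provides on systems that support quiet NaNs.

```c
#include <math.h>

/* Hypothetical helper: count how many of the three comparisons
   hold.  Each comparison yields 1 or 0.  */
int
count_true_comparisons (double a, double b)
{
  return (a < b) + (a == b) + (a > b);
}

/* Hypothetical helper: the same count when one operand is a NaN.  */
int
count_with_nan (void)
{
  return count_true_comparisons (NAN, 1.0);
}
```

For ordinary numbers the count is always 1; with a NaN operand it
is 0.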
2215
2216@node Shift Operations
2217@section Shift Operations
2218@cindex shift operators
2219@cindex operators, shift
2221@cindex shift count
2222
2223@dfn{Shifting} an integer means moving the bit values to the left or
2224right within the bits of the data type. Shifting is defined only for
2225integers. Here's the way to write it:
2226
@example
/* @r{Left shift.} */
5 << 2 @result{} 20

/* @r{Right shift.} */
5 >> 2 @result{} 1
@end example
2234
2235@noindent
2236The left operand is the value to be shifted, and the right operand
2237says how many bits to shift it (the @dfn{shift count}). The left
2238operand is promoted (@pxref{Operand Promotions}), so shifting never
2239operates on a narrow integer type; it's always either @code{int} or
2240wider. The value of the shift operator has the same type as the
2241promoted left operand.
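
As an illustration of the promotion, consider shifting a value of
type @code{unsigned char}. In this sketch (the function name is
hypothetical), the shift is done in type @code{int}, so the bit
shifted out of the low eight bits is not lost.

```c
/* Hypothetical helper: c is promoted to int before shifting, so
   the result can exceed the range of unsigned char.  */
int
shift_narrow_left (unsigned char c)
{
  return c << 1;
}
```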
2242
2243@menu
2244* Bits Shifted In:: How shifting makes new bits to shift in.
2245* Shift Caveats:: Caveats of shift operations.
2246* Shift Hacks:: Clever tricks with shift operations.
2247@end menu
2248
2249@node Bits Shifted In
2250@subsection Shifting Makes New Bits
2251
2252A shift operation shifts towards one end of the number and has to
2253generate new bits at the other end.
2254
2255Shifting left one bit must generate a new least significant bit. It
2256always brings in zero there. It is equivalent to multiplying by the
2257appropriate power of 2. For example,
2258
@example
5 << 3 @r{is equivalent to} 5 * 2*2*2
-10 << 4 @r{is equivalent to} -10 * 2*2*2*2
@end example
2263
2264The meaning of shifting right depends on whether the data type is
2265signed or unsigned (@pxref{Signed and Unsigned Types}). For a signed
2266data type, it performs ``arithmetic shift,'' which keeps the number's
2267sign unchanged by duplicating the sign bit. For an unsigned data
2268type, it performs ``logical shift,'' which always shifts in zeros at
2269the most significant bit.
2270
2271In both cases, shifting right one bit is division by two, rounding
2272towards negative infinity. For example,
2273
@example
(unsigned) 19 >> 2 @result{} 4
(unsigned) 20 >> 2 @result{} 5
(unsigned) 21 >> 2 @result{} 5
@end example
2279
2280For negative left operand @code{a}, @code{a >> 1} is not equivalent to
2281@code{a / 2}. They both divide by 2, but @samp{/} rounds toward
2282zero.
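
The difference shows up with odd negative numbers. This sketch
assumes GNU C, where right shift of a signed value is an arithmetic
shift; the function names are hypothetical.

```c
/* Hypothetical helper: halve by shifting, which rounds toward
   negative infinity (arithmetic shift, as in GNU C).  */
int
halve_by_shift (int a)
{
  return a >> 1;
}

/* Hypothetical helper: halve by dividing, which rounds toward
   zero.  */
int
halve_by_division (int a)
{
  return a / 2;
}
```

For @minus{}7, the first function yields @minus{}4 and the second
yields @minus{}3. For nonnegative values the two agree.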
2283
2284The shift count must be zero or greater. Shifting by a negative
2285number of bits gives machine-dependent results.
2286
2287@node Shift Caveats
2288@subsection Caveats for Shift Operations
2289
2290@strong{Warning:} If the shift count is greater than or equal to the
2291width in bits of the first operand, the results are machine-dependent.
2292Logically speaking, the ``correct'' value would be either -1 (for
2293right shift of a negative number) or 0 (in all other cases), but what
2294it really generates is whatever the machine's shift instruction does in
2295that case. So unless you can prove that the second operand is not too
2296large, write code to check it at run time.
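
Such a run-time check might look like the following sketch. The
function name and calling convention are assumptions, and the width
computation assumes that @code{unsigned int} has no padding bits.

```c
#include <limits.h>

/* Hypothetical helper: store value << count in *result and return
   1, or return 0 if the shift count is out of range.  */
int
shift_left_checked (unsigned int value, int count, unsigned int *result)
{
  int width = (int) (sizeof (unsigned int) * CHAR_BIT);

  if (count < 0 || count >= width)
    return 0;
  *result = value << count;
  return 1;
}
```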
2297
2298@strong{Warning:} Never rely on how the shift operators relate in
2299precedence to other arithmetic binary operators. Programmers don't
2300remember these precedences, and won't understand the code. Always use
2301parentheses to explicitly specify the nesting, like this:
2302
2303@example
2304a + (b << 5) /* @r{Shift first, then add.} */
2305(a + b) << 5 /* @r{Add first, then shift.} */
2306@end example
2307
2308Note: according to the C standard, shifting of signed values isn't
2309guaranteed to work properly when the value shifted is negative, or
2310becomes negative during the operation of shifting left. However, only
2311pedants have a reason to be concerned about this; only computers with
2312strange shift instructions could plausibly do this wrong. In GNU C,
the operation always works as expected.
2314
2315@node Shift Hacks
2316@subsection Shift Hacks
2317
2318You can use the shift operators for various useful hacks. For
2319example, given a date specified by day of the month @code{d}, month
2320@code{m}, and year @code{y}, you can store the entire date in a single
2321integer @code{date}:
2322
@example
unsigned int d = 12;
unsigned int m = 6;
unsigned int y = 1983;
unsigned int date = (((y << 4) + m) << 5) + d;
@end example
2329
2330@noindent
2331To extract the original day, month, and year out of
2332@code{date}, use a combination of shift and remainder.
2333
2334@example
2335d = date % 32;
2336m = (date >> 5) % 16;
2337y = date >> 9;
2338@end example
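
Wrapped up as functions, the packing and extraction can be checked
against each other. This is a sketch; the function names are
hypothetical. It assumes the day fits in 5 bits and the month in
4 bits, as the shift counts above imply.

```c
/* Hypothetical helper: pack day, month and year into one integer.
   The day uses the low 5 bits and the month the next 4 bits.  */
unsigned int
pack_date (unsigned int d, unsigned int m, unsigned int y)
{
  return (((y << 4) + m) << 5) + d;
}

/* Hypothetical helpers: extract the fields again.  */
unsigned int
date_day (unsigned int date)
{
  return date % 32;
}

unsigned int
date_month (unsigned int date)
{
  return (date >> 5) % 16;
}

unsigned int
date_year (unsigned int date)
{
  return date >> 9;
}
```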
2339
2340@code{-1 << LOWBITS} is a clever way to make an integer whose
2341@code{LOWBITS} lowest bits are all 0 and the rest are all 1.
2342@code{-(1 << LOWBITS)} is equivalent to that, due to associativity of
2343multiplication, since negating a value is equivalent to multiplying it
2344by @minus{}1.
2345
2346@node Bitwise Operations
2347@section Bitwise Operations
2348@cindex bitwise operators
2349@cindex operators, bitwise
2350@cindex negation, bitwise
2351@cindex conjunction, bitwise
2352@cindex disjunction, bitwise
2353
2354Bitwise operators operate on integers, treating each bit independently.
2355They are not allowed for floating-point types.
2356
2357The examples in this section use binary constants, starting with
2358@samp{0b} (@pxref{Integer Constants}). They stand for 32-bit integers
2359of type @code{int}.
2360
2361@table @code
2362@item ~@code{a}
2363Unary operator for bitwise negation; this changes each bit of
2364@code{a} from 1 to 0 or from 0 to 1.
2365
@example
~0b10101000 @result{} 0b11111111111111111111111101010111
~0 @result{} 0b11111111111111111111111111111111
~0b11111111111111111111111111111111 @result{} 0
~ (-1) @result{} 0
@end example
2372
2373It is useful to remember that @code{~@var{x} + 1} equals
2374@code{-@var{x}}, for integers, and @code{~@var{x}} equals
2375@code{-@var{x} - 1}. The last example above shows this with @minus{}1
2376as @var{x}.
2377
2378@item @code{a} & @code{b}
2379Binary operator for bitwise ``and'' or ``conjunction.'' Each bit in
2380the result is 1 if that bit is 1 in both @code{a} and @code{b}.
2381
@example
0b10101010 & 0b11001100 @result{} 0b10001000
@end example
2385
2386@item @code{a} | @code{b}
2387Binary operator for bitwise ``or'' (``inclusive or'' or
2388``disjunction''). Each bit in the result is 1 if that bit is 1 in
2389either @code{a} or @code{b}.
2390
@example
0b10101010 | 0b11001100 @result{} 0b11101110
@end example
2394
2395@item @code{a} ^ @code{b}
2396Binary operator for bitwise ``xor'' (``exclusive or''). Each bit in
2397the result is 1 if that bit is 1 in exactly one of @code{a} and @code{b}.
2398
@example
0b10101010 ^ 0b11001100 @result{} 0b01100110
@end example
2402@end table
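
The operators in the table above can be exercised with a short
sketch. The function names are hypothetical, and the values in the
note that follows are given in hexadecimal: @code{0xAA} is
@code{0b10101010} and @code{0xCC} is @code{0b11001100}.

```c
/* Hypothetical helpers wrapping the binary bitwise operators.  */
int
bit_and (int a, int b)
{
  return a & b;
}

int
bit_or (int a, int b)
{
  return a | b;
}

int
bit_xor (int a, int b)
{
  return a ^ b;
}

/* Hypothetical helper: negate using the identity ~x + 1 == -x.  */
int
negate_by_complement (int x)
{
  return ~x + 1;
}
```

With those operands, @code{bit_and} gives @code{0x88}, @code{bit_or}
gives @code{0xEE}, and @code{bit_xor} gives @code{0x66}, matching the
binary results shown in the table.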
2403
2404To understand the effect of these operators on signed integers, keep
2405in mind that all modern computers use two's-complement representation
2406(@pxref{Integer Representations}) for negative integers. This means
that the highest bit of the number indicates the sign; it is 1 for a
negative number and 0 for zero or a positive number. In a negative number,
2409the value in the other bits @emph{increases} as the number gets closer
2410to zero, so that @code{0b111@r{@dots{}}111} is @minus{}1 and
2411@code{0b100@r{@dots{}}000} is the most negative possible integer.
2412
2413@strong{Warning:} C defines a precedence ordering for the bitwise
2414binary operators, but you should never rely on it. You should
2415never rely on how bitwise binary operators relate in precedence to the
2416arithmetic and shift binary operators. Other programmers don't
2417remember this precedence ordering, so always use parentheses to
2418explicitly specify the nesting.
2419
2420For example, suppose @code{offset} is an integer that specifies
2421the offset within shared memory of a table, except that its bottom few
2422bits (@code{LOWBITS} says how many) are special flags. Here's
2423how to get just that offset and add it to the base address.
2424
2425@example
2426shared_mem_base + (offset & (-1 << LOWBITS))
2427@end example
2428
2429Thanks to the outer set of parentheses, we don't need to know whether
2430@samp{&} has higher precedence than @samp{+}. Thanks to the inner
2431set, we don't need to know whether @samp{&} has higher precedence than
2432@samp{<<}. But we can rely on all unary operators to have higher
2433precedence than any binary operator, so we don't need parentheses
2434around the left operand of @samp{<<}.
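
Here is a small sketch of the same masking idea as a complete function.
It is a variant, not the expression above: it uses unsigned operands
and builds the mask with @code{~0UL} rather than @minus{}1. The names
@code{table_address} and the value 3 for @code{LOWBITS} are
assumptions made for the example.

@verbatim
```c
#define LOWBITS 3   /* Hypothetical: the bottom 3 bits are flags.  */

/* Clear the flag bits of OFFSET and add the result to BASE.
   Every nesting of binary operators is spelled out with
   parentheses.  */
unsigned long
table_address (unsigned long base, unsigned long offset)
{
  return base + (offset & (~0UL << LOWBITS));
}
```
@end verbatim

@noindent
For instance, @code{table_address (0x1000, 0x2F)} gives @code{0x1028}:
the low three bits of @code{0x2F} are cleared, leaving @code{0x28}.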
2435
2436@node Assignment Expressions
2437@chapter Assignment Expressions
2438@cindex assignment expressions
2439@cindex operators, assignment
2440
2441As a general concept in programming, an @dfn{assignment} is a
2442construct that stores a new value into a place where values can be
2443stored---for instance, in a variable. Such places are called
2444@dfn{lvalues} (@pxref{Lvalues}) because they are locations that hold a value.
2445
2446An assignment in C is an expression because it has a value; we call
2447it an @dfn{assignment expression}. A simple assignment looks like
2448
2449@example
2450@var{lvalue} = @var{value-to-store}
2451@end example
2452
2453@noindent
2454We say it assigns the value of the expression @var{value-to-store} to
2455the location @var{lvalue}, or that it stores @var{value-to-store}
2456there. You can think of the ``l'' in ``lvalue'' as standing for
2457``left,'' since that's what you put on the left side of the assignment
2458operator.
2459
2460However, that's not the only way to use an lvalue, and not all lvalues
can be assigned to. To appear on the left side of an assignment, an
lvalue has to be @dfn{modifiable}. In C, that means it was
not declared with the type qualifier @code{const} (@pxref{const}).
2464
2465The value of the assignment expression is that of @var{lvalue} after
2466the new value is stored in it. This means you can use an assignment
2467inside other expressions. Assignment operators are right-associative
2468so that
2469
2470@example
2471x = y = z = 0;
2472@end example
2473
2474@noindent
2475is equivalent to
2476
2477@example
2478x = (y = (z = 0));
2479@end example
2480
2481This is the only useful way for them to associate;
2482the other way,
2483
2484@example
2485((x = y) = z) = 0;
2486@end example
2487
2488@noindent
2489would be invalid since an assignment expression such as @code{x = y}
2490is not valid as an lvalue.
2491
2492@strong{Warning:} Write parentheses around an assignment if you nest
2493it inside another expression, unless that is a conditional expression,
2494or comma-separated series, or another assignment.
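
The following sketch shows the value of an assignment being used by
the surrounding expression. Both function names are invented for
illustration. In @code{store_both}, the inner assignment's value is
the value actually stored in the @code{int}, so the fraction is already
gone when the outer assignment stores it.

@verbatim
```c
/* Store V into both *PI and *PD with one chained assignment.
   The value of the inner assignment is the value stored in *PI,
   that is, V converted to int.  */
void
store_both (double v, int *pi, double *pd)
{
  *pd = *pi = v;   /* Nests as *pd = (*pi = v).  */
}

/* Count the characters in S before the terminating null.  */
int
count_chars (const char *s)
{
  int n = 0;
  char c;

  /* The assignment is nested in a comparison, so it gets
     parentheses.  */
  while ((c = *s++) != '\0')
    n++;
  return n;
}
```
@end verbatim

@noindent
After @code{store_both (3.7, &i, &d)}, @code{i} is 3 and @code{d} is
3.0, not 3.7.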
2495
2496@menu
2497* Simple Assignment:: The basics of storing a value.
2498* Lvalues:: Expressions into which a value can be stored.
2499* Modifying Assignment:: Shorthand for changing an lvalue's contents.
2500* Increment/Decrement:: Shorthand for incrementing and decrementing
2501 an lvalue's contents.
2502* Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
2503* Assignment in Subexpressions:: How to avoid ambiguity.
2504* Write Assignments Separately:: Write assignments as separate statements.
2505@end menu
2506
2507@node Simple Assignment
2508@section Simple Assignment
2509@cindex simple assignment
2510@cindex assignment, simple
2511
2512A @dfn{simple assignment expression} computes the value of the right
2513operand and stores it into the lvalue on the left. Here is a simple
2514assignment expression that stores 5 in @code{i}:
2515
2516@example
2517i = 5
2518@end example
2519
2520@noindent
2521We say that this is an @dfn{assignment to} the variable @code{i} and
2522that it @dfn{assigns} @code{i} the value 5. It has no semicolon
2523because it is an expression (so it has a value). Adding a semicolon
2524at the end would make it a statement (@pxref{Expression Statement}).
2525
2526Here is another example of a simple assignment expression. Its
2527operands are not simple, but the kind of assignment done here is
2528simple assignment.
2529
2530@example
2531x[foo ()] = y + 6
2532@end example
2533
2534A simple assignment with two different numeric data types converts the
2535right operand value to the lvalue's type, if possible. It can convert
2536any numeric type to any other numeric type.
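
As an illustration of these conversions, here is a sketch with three
small functions (the names are made up for this example). Each one
does nothing but a simple assignment between different numeric types.

@verbatim
```c
/* Assign a double to an int: the fraction is discarded.  */
int
to_int (double d)
{
  int i;
  i = d;
  return i;
}

/* Assign an int to an unsigned char: the value is reduced
   modulo UCHAR_MAX + 1 (256 on nearly all machines).  */
unsigned char
to_uchar (int n)
{
  unsigned char c;
  c = n;
  return c;
}

/* Assign an int to a double: the value is preserved.  */
double
to_double (int n)
{
  double d;
  d = n;
  return d;
}
```
@end verbatim

@noindent
So @code{to_int (3.99)} is 3, and on a machine with 8-bit characters
@code{to_uchar (300)} is 44.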
2537
2538Simple assignment is also allowed on some non-numeric types: pointers
2539(@pxref{Pointers}), structures (@pxref{Structure Assignment}), and
2540unions (@pxref{Unions}).
2541
2542@strong{Warning:} Assignment is not allowed on arrays because
2543there are no array values in C; C variables can be arrays, but these
2544arrays cannot be manipulated as wholes. @xref{Limitations of C
2545Arrays}.
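
The contrast between structures and arrays can be sketched as follows.
The type @code{struct point}, the two arrays and the function names are
all hypothetical. Copying the array with @code{memcpy} is one common
workaround, not the only one.

@verbatim
```c
#include <string.h>

struct point
{
  int x, y;
};

/* Structures can be assigned as wholes.  */
struct point
copy_point (struct point src)
{
  struct point dst;
  dst = src;
  return dst;
}

int current[4] = { 1, 2, 3, 4 };
int saved[4];

/* Arrays cannot be assigned, so copy the contents instead.  */
void
save_current (void)
{
  /* saved = current; would be rejected by the compiler.  */
  memcpy (saved, current, sizeof saved);
}
```
@end verbatim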
2546
2547@xref{Assignment Type Conversions}, for the complete rules about data
2548types used in assignments.
2549
2550@node Lvalues
2551@section Lvalues
2552@cindex lvalues
2553
2554An expression that identifies a memory space that holds a value is
2555called an @dfn{lvalue}, because it is a location that can hold a value.
2556
2557The standard kinds of lvalues are:
2558
2559@itemize @bullet
2560@item
2561A variable.
2562
2563@item
2564A pointer-dereference expression (@pxref{Pointer Dereference}) using
2565unary @samp{*}.
2566
2567@item
2568A structure field reference (@pxref{Structures}) using @samp{.}, if
2569the structure value is an lvalue.
2570
2571@item
2572A structure field reference using @samp{->}. This is always an lvalue
2573since @samp{->} implies pointer dereference.
2574
2575@item
2576A union alternative reference (@pxref{Unions}), on the same conditions
2577as for structure fields.
2578
2579@item
2580An array-element reference using @samp{[@r{@dots{}}]}, if the array
2581is an lvalue.
2582@end itemize
2583
2584If an expression's outermost operation is any other operator, that
2585expression is not an lvalue. Thus, the variable @code{x} is an
2586lvalue, but @code{x + 0} is not, even though these two expressions
2587compute the same value (assuming @code{x} is a number).
2588
2589An array can be an lvalue (the rules above determine whether it is
2590one), but using the array in an expression converts it automatically
2591to a pointer to the first element. The result of this conversion is
2592not an lvalue. Thus, if the variable @code{a} is an array, you can't
2593use @code{a} by itself as the left operand of an assignment. But you
2594can assign to an element of @code{a}, such as @code{a[0]}. That is an
2595lvalue since @code{a} is an lvalue.
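
The sketch below stores a value through each of the standard kinds of
lvalue listed above. The types @code{struct pair} and
@code{union number} and the function name are invented for this
example.

@verbatim
```c
struct pair
{
  int first;
  int second;
};

union number
{
  int i;
  float f;
};

/* Store through each standard kind of lvalue and return the sum
   of the stored values.  */
int
store_everywhere (void)
{
  int v;
  int *p = &v;
  struct pair s;
  struct pair *ps = &s;
  union number u;
  int a[3];

  v = 1;            /* A variable.  */
  *p = 2;           /* Pointer dereference; this changes v.  */
  s.first = 3;      /* Field reference with '.'.  */
  ps->second = 4;   /* Field reference with '->'.  */
  u.i = 5;          /* Union alternative.  */
  a[0] = 6;         /* Array element.  */

  return v + s.first + s.second + u.i + a[0];
}
```
@end verbatim

@noindent
The function returns 20, because the store through @code{*p} replaced
the 1 in @code{v} with 2.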
2596
2597@node Modifying Assignment
2598@section Modifying Assignment
2599@cindex modifying assignment
2600@cindex assignment, modifying
2601
2602You can abbreviate the common construct
2603
2604@example
2605@var{lvalue} = @var{lvalue} + @var{expression}
2606@end example
2607
2608@noindent
2609as
2610
2611@example
2612@var{lvalue} += @var{expression}
2613@end example
2614
2615This is known as a @dfn{modifying assignment}. For instance,
2616
2617@example
2618i = i + 5;
2619i += 5;
2620@end example
2621
2622@noindent
2623shows two statements that are equivalent. The first uses
2624simple assignment; the second uses modifying assignment.
2625
2626Modifying assignment works with any binary arithmetic operator. For
2627instance, you can subtract something from an lvalue like this,
2628
2629@example
2630@var{lvalue} -= @var{expression}
2631@end example
2632
2633@noindent
2634or multiply it by a certain amount like this,
2635
2636@example
2637@var{lvalue} *= @var{expression}
2638@end example
2639
2640@noindent
2641or shift it by a certain amount like this.
2642
2643@example
2644@var{lvalue} <<= @var{expression}
2645@var{lvalue} >>= @var{expression}
2646@end example
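
To make the abbreviations concrete, this sketch applies several
modifying assignments in a row; the comment on each line gives the
equivalent simple assignment. The function name @code{modify} is an
assumption for the example.

@verbatim
```c
/* Apply a series of modifying assignments to N and return the
   result.  */
int
modify (int n)
{
  n += 5;    /* n = n + 5  */
  n -= 2;    /* n = n - 2  */
  n *= 4;    /* n = n * 4  */
  n <<= 1;   /* n = n << 1 */
  n >>= 2;   /* n = n >> 2 */
  return n;
}
```
@end verbatim

@noindent
Starting from 1, the successive values are 6, 4, 16, 32 and 8, so
@code{modify (1)} returns 8.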
2647
2648In most cases, this feature adds no power to the language, but it
2649provides substantial convenience. Also, when @var{lvalue} contains
2650code that has side effects, the simple assignment performs those side
2651effects twice, while the modifying assignment performs them once. For
2652instance,
2653
2654@example
2655x[foo ()] = x[foo ()] + 5;
2656@end example
2657
2658@noindent
2659calls @code{foo} twice, and it could return different values each
2660time. If @code{foo ()} returns 1 the first time and 3 the second
2661time, then the effect could be to add @code{x[3]} and 5 and store the
2662result in @code{x[1]}, or to add @code{x[1]} and 5 and store the
2663result in @code{x[3]}. We don't know which of the two it will do,
2664because C does not specify which call to @code{foo} is computed first.
2665
2666Such a statement is not well defined, and shouldn't be used.
2667
2668By contrast,
2669
2670@example
2671x[foo ()] += 5;
2672@end example
2673
2674@noindent
2675is well defined: it calls @code{foo} only once to determine which
2676element of @code{x} to adjust, and it adjusts that element by adding 5
2677to it.
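
Here is a sketch that counts the calls, to confirm that the modifying
assignment computes its lvalue only once. The function
@code{next_index} stands in for @code{foo}; it and @code{bump_one} are
hypothetical names.

@verbatim
```c
static int calls;   /* How many times next_index has been called.  */

/* Hypothetical function with a side effect: each call returns
   the next index, starting from 0.  */
static int
next_index (void)
{
  return calls++;
}

/* Add 5 to one element of X, chosen by a single call to
   next_index.  Return the number of calls made so far.  */
int
bump_one (int *x)
{
  x[next_index ()] += 5;
  return calls;
}
```
@end verbatim

@noindent
The first call to @code{bump_one} adds 5 to @code{x[0]} and returns 1;
the second adds 5 to @code{x[1]} and returns 2.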
2678
2679@node Increment/Decrement
2680@section Increment and Decrement Operators
2681@cindex increment operator
2682@cindex decrement operator
2683@cindex operator, increment
2684@cindex operator, decrement
2685@cindex preincrement expression
2686@cindex predecrement expression
2687
2688The operators @samp{++} and @samp{--} are the @dfn{increment} and
2689@dfn{decrement} operators. When used on a numeric value, they add or
2690subtract 1. We don't consider them assignments, but they are
2691equivalent to assignments.
2692
2693Using @samp{++} or @samp{--} as a prefix, before an lvalue, is called
2694@dfn{preincrement} or @dfn{predecrement}. This adds or subtracts 1
2695and the result becomes the expression's value. For instance,
2696
2697@example
2698#include <stdio.h> /* @r{Declares @code{printf}.} */
2699
2700int
2701main (void)
2702@{
2703 int i = 5;
2704 printf ("%d\n", i);
2705 printf ("%d\n", ++i);
2706 printf ("%d\n", i);
2707 return 0;
2708@}
2709@end example
2710
2711@noindent
2712prints lines containing 5, 6, and 6 again. The expression @code{++i}
2713increments @code{i} from 5 to 6, and has the value 6, so the output
2714from @code{printf} on that line says @samp{6}.
2715
2716Using @samp{--} instead, for predecrement,
2717
2718@example
2719#include <stdio.h> /* @r{Declares @code{printf}.} */
2720
2721int
2722main (void)
2723@{
2724 int i = 5;
2725 printf ("%d\n", i);
2726 printf ("%d\n", --i);
2727 printf ("%d\n", i);
2728 return 0;
2729@}
2730@end example
2731
2732@noindent
2733prints three lines that contain (respectively) @samp{5}, @samp{4}, and
2734again @samp{4}.
2735
2736@node Postincrement/Postdecrement
2737@section Postincrement and Postdecrement
2738@cindex postincrement expression
2739@cindex postdecrement expression
2740@cindex operator, postincrement
2741@cindex operator, postdecrement
2742
2743Using @samp{++} or @samp{--} @emph{after} an lvalue does something
2744peculiar: it gets the value directly out of the lvalue and @emph{then}
increments or decrements it. Thus, the value of @code{i++} is the same
2746as the value of @code{i}, but @code{i++} also increments @code{i} ``a
2747little later.'' This is called @dfn{postincrement} or
2748@dfn{postdecrement}.
2749
2750For example,
2751
2752@example
2753#include <stdio.h> /* @r{Declares @code{printf}.} */
2754
2755int
2756main (void)
2757@{
2758 int i = 5;
2759 printf ("%d\n", i);
2760 printf ("%d\n", i++);
2761 printf ("%d\n", i);
2762 return 0;
2763@}
2764@end example
2765
2766@noindent
2767prints lines containing 5, again 5, and 6. The expression @code{i++}
2768has the value 5, which is the value of @code{i} at the time,
2769but it increments @code{i} from 5 to 6 just a little later.
2770
2771How much later is ``just a little later''? That is flexible. The
2772increment has to happen by the next @dfn{sequence point}. In simple cases,
2773that means by the end of the statement. @xref{Sequence Points}.
2774
If a unary operator precedes a postincrement or postdecrement expression,
the increment or decrement nests inside:
2777
2778@example
2779-a++ @r{is equivalent to} -(a++)
2780@end example
2781
2782That's the only order that makes sense; @code{-a} is not an lvalue, so
2783it can't be incremented.
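
Two typical uses of postincrement are sketched below, with function
names invented for the example. In both, the surrounding expression
sees the old value, and the variable holds the new value once the
statement is finished.

@verbatim
```c
/* Return the negation of *A, and increment *A afterward.  */
int
negate_then_bump (int *a)
{
  return -(*a)++;   /* The minus applies to the value of (*a)++.  */
}

/* Append V to the array BUF, which currently holds *N elements.
   The subscript uses the old value of *N; then *N is
   incremented.  */
void
append (int *buf, int *n, int v)
{
  buf[(*n)++] = v;
}
```
@end verbatim

@noindent
If @code{a} is 5, then @code{negate_then_bump (&a)} returns @minus{}5
and leaves 6 in @code{a}.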
2784
2785@node Assignment in Subexpressions
2786@section Pitfall: Assignment in Subexpressions
2787@cindex assignment in subexpressions
2788@cindex subexpressions, assignment in
2789
2790In C, the order of computing parts of an expression is not fixed.
2791Aside from a few special cases, the operations can be computed in any
2792order. If one part of the expression has an assignment to @code{x}
2793and another part of the expression uses @code{x}, the result is
2794unpredictable because that use might be computed before or after the
2795assignment.
2796
2797Here's an example of ambiguous code:
2798
2799@example
2800x = 20;
2801printf ("%d %d\n", x, x = 4);
2802@end example
2803
2804@noindent
2805If the second argument, @code{x}, is computed before the third argument,
2806@code{x = 4}, the second argument's value will be 20. If they are
2807computed in the other order, the second argument's value will be 4.
2808
2809Here's one way to make that code unambiguous:
2810
2811@example
2812y = 20;
2813printf ("%d %d\n", y, x = 4);
2814@end example
2815
2816Here's another way, with the other meaning:
2817
2818@example
2819x = 4;
2820printf ("%d %d\n", x, x);
2821@end example
2822
2823This issue applies to all kinds of assignments, and to the increment
2824and decrement operators, which are equivalent to assignments.
2825@xref{Order of Execution}, for more information about this.
2826
2827However, it can be useful to write assignments inside an
2828@code{if}-condition or @code{while}-test along with logical operators.
2829@xref{Logicals and Assignments}.
2830
2831@node Write Assignments Separately
2832@section Write Assignments in Separate Statements
2833
2834It is often convenient to write an assignment inside an
2835@code{if}-condition, but that can reduce the readability of the
2836program. Here's an example of what to avoid:
2837
2838@example
2839if (x = advance (x))
2840 @r{@dots{}}
2841@end example
2842
2843The idea here is to advance @code{x} and test if the value is nonzero.
2844However, readers might miss the fact that it uses @samp{=} and not
2845@samp{==}. In fact, writing @samp{=} where @samp{==} was intended
2846inside a condition is a common error, so GNU C can give warnings when
2847@samp{=} appears in a way that suggests it's an error.
2848
2849It is much clearer to write the assignment as a separate statement, like this:
2850
2851@example
2852x = advance (x);
2853if (x != 0)
2854 @r{@dots{}}
2855@end example
2856
2857@noindent
2858This makes it unmistakably clear that @code{x} is assigned a new value.
2859
2860Another method is to use the comma operator (@pxref{Comma Operator}),
2861like this:
2862
2863@example
2864if (x = advance (x), x != 0)
2865 @r{@dots{}}
2866@end example
2867
2868@noindent
2869However, putting the assignment in a separate statement is usually clearer
2870unless the assignment is very short, because it reduces nesting.
2871
2872@node Execution Control Expressions
2873@chapter Execution Control Expressions
2874@cindex execution control expressions
2875@cindex expressions, execution control
2876
2877This chapter describes the C operators that combine expressions to
2878control which of those expressions execute, or in which order.
2879
2880@menu
2881* Logical Operators:: Logical conjunction, disjunction, negation.
2882* Logicals and Comparison:: Logical operators with comparison operators.
2883* Logicals and Assignments:: Assignments with logical operators.
2884* Conditional Expression:: An if/else construct inside expressions.
2885* Comma Operator:: Build a sequence of subexpressions.
2886@end menu
2887
2888@node Logical Operators
2889@section Logical Operators
2890@cindex logical operators
2891@cindex operators, logical
2892@cindex conjunction operator
2893@cindex disjunction operator
2894@cindex negation operator, logical
2895
2896The @dfn{logical operators} combine truth values, which are normally
2897represented in C as numbers. Any expression with a numeric value is a
2898valid truth value: zero means false, and any other value means true.
2899A pointer type is also meaningful as a truth value; a null pointer
2900(which is zero) means false, and a non-null pointer means true
2901(@pxref{Pointer Types}). The value of a logical operator is always 1
2902or 0 and has type @code{int} (@pxref{Integer Types}).
2903
2904The logical operators are used mainly in the condition of an @code{if}
2905statement, or in the end test in a @code{for} statement or
2906@code{while} statement (@pxref{Statements}). However, they are valid
2907in any context where an integer-valued expression is allowed.
2908
2909@table @samp
2910@item ! @var{exp}
2911Unary operator for logical ``not.'' The value is 1 (true) if
2912@var{exp} is 0 (false), and 0 (false) if @var{exp} is nonzero (true).
2913
@strong{Warning:} if @var{exp} is anything but an lvalue or a
2915function call, you should write parentheses around it.
2916
2917@item @var{left} && @var{right}
2918The logical ``and'' binary operator computes @var{left} and, if necessary,
2919@var{right}. If both of the operands are true, the @samp{&&} expression
2920gives the value 1 (which is true). Otherwise, the @samp{&&} expression
2921gives the value 0 (false). If @var{left} yields a false value,
2922that determines the overall result, so @var{right} is not computed.
2923
2924@item @var{left} || @var{right}
2925The logical ``or'' binary operator computes @var{left} and, if necessary,
2926@var{right}. If at least one of the operands is true, the @samp{||} expression
2927gives the value 1 (which is true). Otherwise, the @samp{||} expression
2928gives the value 0 (false). If @var{left} yields a true value,
2929that determines the overall result, so @var{right} is not computed.
2930@end table
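
The following sketch checks two properties described in this table:
the result is always 0 or 1, and the right operand is skipped when the
left operand settles the result. All the names are hypothetical;
@code{right_operand} exists only to count how often it is computed.

@verbatim
```c
static int right_calls;   /* Times right_operand has run.  */

/* Hypothetical right operand that records each evaluation.  */
static int
right_operand (void)
{
  right_calls++;
  return 7;   /* Any nonzero value counts as true.  */
}

/* Value of LEFT && right_operand (); always 0 or 1.  */
int
and_result (int left)
{
  return left && right_operand ();
}

/* Value of LEFT || right_operand (); always 0 or 1.  */
int
or_result (int left)
{
  return left || right_operand ();
}

/* Value of !EXP; always 0 or 1.  */
int
not_result (int exp)
{
  return !exp;
}

/* Report how many times the right operand was computed.  */
int
right_operand_calls (void)
{
  return right_calls;
}
```
@end verbatim

@noindent
Calling @code{and_result (0)} or @code{or_result (3)} leaves the count
unchanged, since the left operand alone determines the result.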
2931
2932@strong{Warning:} never rely on the relative precedence of @samp{&&}
2933and @samp{||}. When you use them together, always use parentheses to
2934specify explicitly how they nest, as shown here:
2935
2936@example
2937if ((r != 0 && x % r == 0)
2938 ||
2939 (s != 0 && x % s == 0))
2940@end example
2941
2942@node Logicals and Comparison
2943@section Logical Operators and Comparisons
2944
2945The most common thing to use inside the logical operators is a
2946comparison. Conveniently, @samp{&&} and @samp{||} have lower
2947precedence than comparison operators and arithmetic operators, so we
2948can write expressions like this without parentheses and get the
2949nesting that is natural: two comparison operations that must both be
2950true.
2951
2952@example
2953if (r != 0 && x % r == 0)
2954@end example
2955
2956@noindent
2957This example also shows how it is useful that @samp{&&} guarantees to
2958skip the right operand if the left one turns out false. Because of
2959that, this code never tries to divide by zero.
2960
2961This is equivalent:
2962
2963@example
2964if (r && x % r == 0)
2965@end example
2966
2967@noindent
2968A truth value is simply a number, so @code{r}
2969as a truth value tests whether it is nonzero.
2970But @code{r}'s meaning is not a truth value---it is a number to divide by.
2971So it is better style to write the explicit @code{!= 0}.
2972
2973Here's another equivalent way to write it:
2974
2975@example
2976if (!(r == 0) && x % r == 0)
2977@end example
2978
2979@noindent
2980This illustrates the unary @samp{!} operator, and the need to
2981write parentheses around its operand.
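
Wrapped in a function, the first form of the test might look like this
sketch (the name @code{divides} is made up for the example):

@verbatim
```c
/* Return 1 if R is nonzero and divides X evenly, 0 otherwise.
   The right operand of && runs only when R is nonzero, so the
   remainder operation never divides by zero.  */
int
divides (int r, int x)
{
  return r != 0 && x % r == 0;
}
```
@end verbatim

@noindent
Thus @code{divides (0, 10)} is simply 0, with no attempt to compute
@code{10 % 0}.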
2982
2983@node Logicals and Assignments
2984@section Logical Operators and Assignments
2985
2986There are cases where assignments nested inside the condition can
2987actually make a program @emph{easier} to read. Here is an example
2988using a hypothetical type @code{list} which represents a list; it
2989tests whether the list has at least two links, using hypothetical
functions: @code{nonempty}, which is true if the argument is a nonempty
list, and @code{list_next}, which advances from one list link to the
2992next. We assume that a list is never a null pointer, so that the
2993assignment expressions are always ``true.''
2994
2995@example
2996if (nonempty (list)
2997 && (temp1 = list_next (list))
2998 && nonempty (temp1)
2999 && (temp2 = list_next (temp1)))
3000 @r{@dots{}} /* @r{use @code{temp1} and @code{temp2}} */
3001@end example
3002
3003@noindent
3004Here we get the benefit of the @samp{&&} operator, to avoid executing
3005the rest of the code if a call to @code{nonempty} says ``false.'' The
3006only natural place to put the assignments is among those calls.
3007
3008It would be possible to rewrite this as several statements, but that
3009could make it much more cumbersome. On the other hand, when the test
3010is even more complex than this one, splitting it into multiple
3011statements might be necessary for clarity.
3012
3013If an empty list is a null pointer, we can dispense with calling
3014@code{nonempty}:
3015
3016@example
3017if ((temp1 = list_next (list))
3018 && (temp2 = list_next (temp1)))
3019 @r{@dots{}}
3020@end example
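
To try this out, here is a self-contained sketch under the second
assumption, that an empty list is a null pointer. The type
@code{struct link} and the function @code{sum_second_and_third} are
invented for the example, and this @code{list_next} is only one
possible definition. The sketch adds an explicit null test for the
list itself, which the fragment above takes for granted.

@verbatim
```c
#include <stddef.h>

/* Hypothetical list type: a list is a pointer to its first link,
   and the empty list is a null pointer.  */
struct link
{
  int value;
  struct link *next;
};

/* Advance from one link to the next.  */
struct link *
list_next (struct link *list)
{
  return list->next;
}

/* Return the sum of the second and third values of LIST,
   or -1 if LIST has fewer than three links.  */
int
sum_second_and_third (struct link *list)
{
  struct link *temp1, *temp2;

  if (list != NULL
      && (temp1 = list_next (list))
      && (temp2 = list_next (temp1)))
    return temp1->value + temp2->value;
  return -1;
}
```
@end verbatim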
3021
3022@node Conditional Expression
3023@section Conditional Expression
3024@cindex conditional expression
3025@cindex expression, conditional
3026
3027C has a conditional expression that selects one of two expressions
3028to compute and get the value from. It looks like this:
3029
3030@example
3031@var{condition} ? @var{iftrue} : @var{iffalse}
3032@end example
3033
3034@menu
3035* Conditional Rules:: Rules for the conditional operator.
3036* Conditional Branches:: About the two branches in a conditional.
3037@end menu
3038
3039@node Conditional Rules
3040@subsection Rules for Conditional Operator
3041
3042The first operand, @var{condition}, should be a value that can be
3043compared with zero---a number or a pointer. If it is true (nonzero),
3044then the conditional expression computes @var{iftrue} and its value
3045becomes the value of the conditional expression. Otherwise the
3046conditional expression computes @var{iffalse} and its value becomes
3047the value of the conditional expression. The conditional expression
3048always computes just one of @var{iftrue} and @var{iffalse}, never both
3049of them.
3050
3051Here's an example: the absolute value of a number @code{x}
3052can be written as @code{(x >= 0 ? x : -x)}.
3053
3054@strong{Warning:} The conditional expression operators have rather low
3055syntactic precedence. Except when the conditional expression is used
3056as an argument in a function call, write parentheses around it. For
3057clarity, always write parentheses around it if it extends across more
3058than one line.
3059
3060Assignment operators and the comma operator (@pxref{Comma Operator})
3061have lower precedence than conditional expression operators, so write
3062parentheses around those when they appear inside a conditional
3063expression. @xref{Order of Execution}.
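
As a sketch of these rules, the two functions below (with made-up
names) each return a parenthesized conditional expression. The second
one relies on the fact that only one branch is computed.

@verbatim
```c
/* Absolute value of X.  The result is not meaningful for the most
   negative int, whose negation does not fit in an int.  */
int
absolute (int x)
{
  return (x >= 0 ? x : -x);
}

/* Divide X by R, or return 0 if R is zero.  Only one branch is
   computed, so the division is skipped when R is zero.  */
int
safe_quotient (int x, int r)
{
  return (r != 0 ? x / r : 0);
}
```
@end verbatim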
3064
3065@node Conditional Branches
3066@subsection Conditional Operator Branches
3067@cindex branches of conditional expression
3068
3069We call @var{iftrue} and @var{iffalse} the @dfn{branches} of the
3070conditional.
3071
3072The two branches should normally have the same type, but a few
3073exceptions are allowed. If they are both numeric types, the
3074conditional converts both to their common type (@pxref{Common Type}).
3075
3076With pointers (@pxref{Pointers}), the two values can be pointers to
3077nearly compatible types (@pxref{Compatible Types}). In this case, the
3078result type is a similar pointer whose target type combines all the
3079type qualifiers (@pxref{Type Qualifiers}) of both branches.
3080
If one branch has type @code{void *} (and is not a null pointer
constant) and the other is a pointer to an object (not to a function),
the conditional converts the other branch to @code{void *}, which is
the type of the result.
3084
3085If one branch is an integer constant with value zero and the other is
3086a pointer, the conditional converts zero to the pointer's type.
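
The effect of the numeric and zero-constant rules can be observed in a
sketch like this one; both function names are hypothetical.

@verbatim
```c
/* The branches 1 and 2.5 have types int and double, so the
   conditional converts both to double.  When FLAG is nonzero the
   value is therefore 1.0, and dividing by 2 gives 0.5, not 0.  */
double
half_of_choice (int flag)
{
  return (flag ? 1 : 2.5) / 2;
}

/* One branch is a pointer and the other is the constant zero,
   which is converted to a null pointer of the same type.  */
const char *
name_or_null (int flag, const char *name)
{
  return (flag ? name : 0);
}
```
@end verbatim

@noindent
If the first conditional had type @code{int} when @code{flag} is
nonzero, the division would be integer division and the result would
be 0.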
3087
3088In GNU C, you can omit @var{iftrue} in a conditional expression. In
3089that case, if @var{condition} is nonzero, its value becomes the value of
3090the conditional expression, after conversion to the common type.
3091Thus,
3092
3093@example
3094x ? : y
3095@end example
3096
3097@noindent
3098has the value of @code{x} if that is nonzero; otherwise, the value of
3099@code{y}.
3100
3101@cindex side effect in ?:
3102@cindex ?: side effect
3103Omitting @var{iftrue} is useful when @var{condition} has side effects.
3104In that case, writing that expression twice would carry out the side
3105effects twice, but writing it once does them just once. For example,
3106if we suppose that the function @code{next_element} advances a pointer
3107variable to point to the next element in a list and returns the new
3108pointer,
3109
3110@example
3111next_element () ? : default_pointer
3112@end example
3113
3114@noindent
3115is a way to advance the pointer and use its new value if it isn't
3116null, but use @code{default_pointer} if that is null. We must not do
3117it this way,
3118
3119@example
3120next_element () ? next_element () : default_pointer
3121@end example
3122
3123@noindent
3124because it would advance the pointer a second time.
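
The single evaluation can be checked with a sketch such as the
following. It depends on the GNU C extension just described, so a
compiler in strict ISO C mode may reject it or warn about it. The
function @code{next_value} is a stand-in for @code{next_element} that
works on integers; all the names are invented for the example.

@verbatim
```c
static int advances;   /* Times next_value has been called.  */

/* Hypothetical function with a side effect: returns 0 on the
   first call and the call number on later calls.  */
static int
next_value (void)
{
  advances++;
  return (advances == 1 ? 0 : advances);
}

/* Use the next value if it is nonzero, else FALLBACK.
   The middle operand is omitted (a GNU C extension), so
   next_value is called exactly once.  */
int
next_or (int fallback)
{
  return next_value () ? : fallback;
}

/* Report how many times next_value has been called.  */
int
advance_count (void)
{
  return advances;
}
```
@end verbatim

@noindent
The first call to @code{next_or (99)} returns 99, the second returns 2,
and @code{advance_count ()} is then 2: one call per use.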
3125
3126@node Comma Operator
3127@section Comma Operator
3128@cindex comma operator
3129@cindex operator, comma
3130
3131The comma operator stands for sequential execution of expressions.
3132The value of the comma expression comes from the last expression in
3133the sequence; the previous expressions are computed only for their
3134side effects. It looks like this:
3135
3136@example
3137@var{exp1}, @var{exp2} @r{@dots{}}
3138@end example
3139
3140@noindent
3141You can bundle any number of expressions together this way, by putting
3142commas between them.
3143
3144@menu
3145* Uses of Comma:: When to use the comma operator.
3146* Clean Comma:: Clean use of the comma operator.
3147* Avoid Comma:: When to not use the comma operator.
3148@end menu
3149
3150@node Uses of Comma
3151@subsection The Uses of the Comma Operator
3152
3153With commas, you can put several expressions into a place that
3154requires just one expression---for example, in the header of a
3155@code{for} statement. This statement
3156
3157@example
3158for (i = 0, j = 10, k = 20; i < n; i++)
3159@end example
3160
3161@noindent
3162contains three assignment expressions, to initialize @code{i}, @code{j}
3163and @code{k}. The syntax of @code{for} requires just one expression
3164for initialization; to include three assignments, we use commas to
3165bundle them into a single larger expression, @code{i = 0, j = 10, k =
316620}. This technique is also useful in the loop-advance expression,
3167the last of the three inside the @code{for} parentheses.
3168
3169In the @code{for} statement and the @code{while} statement
3170(@pxref{Loop Statements}), a comma provides a way to perform some side
3171effect before the loop-exit test. For example,
3172
3173@example
3174while (printf ("At the test, x = %d\n", x), x != 0)
3175@end example
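
A complete loop that uses commas in both the initialization and the
loop-advance expression might look like this sketch, which reverses an
array in place (the function name is an assumption for the example):

@verbatim
```c
/* Reverse the N elements of A in place.  The comma operator lets
   the loop initialize and advance two indices at once.  */
void
reverse (int *a, int n)
{
  int i, j;

  for (i = 0, j = n - 1; i < j; i++, j--)
    {
      int tmp = a[i];
      a[i] = a[j];
      a[j] = tmp;
    }
}
```
@end verbatim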
3176
3177@node Clean Comma
3178@subsection Clean Use of the Comma Operator
3179
3180Always write parentheses around a series of comma operators, except
3181when it is at top level in an expression statement, or within the
3182parentheses of an @code{if}, @code{for}, @code{while}, or @code{switch}
3183statement (@pxref{Statements}). For instance, in
3184
3185@example
3186for (i = 0, j = 10, k = 20; i < n; i++)
3187@end example
3188
3189@noindent
3190the commas between the assignments are clear because they are between
3191a parenthesis and a semicolon.
3192
3193The arguments in a function call are also separated by commas, but that is
3194not an instance of the comma operator. Note the difference between
3195
3196@example
3197foo (4, 5, 6)
3198@end example
3199
3200@noindent
3201which passes three arguments to @code{foo} and
3202
3203@example
3204foo ((4, 5, 6))
3205@end example
3206
3207@noindent
3208which uses the comma operator and passes just one argument
3209(with value 6).
3210
@strong{Warning:} don't use the comma operator inside an argument
of a function call unless it makes the code easier to understand. When
you do use it there, don't put part of another argument on the same
line. Instead, add a line break to make the parentheses around the
comma operator easier to see, like this:
3216
3217@example
3218foo ((mumble (x, y), frob (z)),
3219 *p)
3220@end example
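
Here is a runnable sketch of a comma operator inside a single
argument. Unlike @code{(4, 5, 6)}, the first operand here has a side
effect, so it is not pointless. The function names are made up for the
example.

@verbatim
```c
/* Return the argument unchanged.  */
int
identity (int value)
{
  return value;
}

/* Pass identity a single argument built with the comma operator:
   first increment *N, then use twice the new value.  */
int
bump_and_double (int *n)
{
  return identity (((*n)++, *n * 2));
}
```
@end verbatim

@noindent
If @code{n} is 3, @code{bump_and_double (&n)} sets @code{n} to 4 and
returns 8.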
3221
3222@node Avoid Comma
3223@subsection When Not to Use the Comma Operator
3224
3225You can use a comma in any subexpression, but in most cases it only
3226makes the code confusing, and it is clearer to raise all but the last
3227of the comma-separated expressions to a higher level. Thus, instead
3228of this:
3229
3230@example
3231x = (y += 4, 8);
3232@end example
3233
3234@noindent
3235it is much clearer to write this:
3236
3237@example
3238y += 4, x = 8;
3239@end example
3240
3241@noindent
3242or this:
3243
3244@example
3245y += 4;
3246x = 8;
3247@end example
3248
3249Use commas only in the cases where there is no clearer alternative
3250involving multiple statements.
3251
3252By contrast, don't hesitate to use commas in the expansion in a macro
3253definition. The trade-offs of code clarity are different in that
3254case, because the @emph{use} of the macro may improve overall clarity
3255so much that the ugliness of the macro's @emph{definition} is a small
3256price to pay. @xref{Macros}.
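
As one illustration of commas in a macro, consider this sketch. The
macro @code{SWAP_INT} and the function @code{sort_two} are
hypothetical. The commas turn three assignments into a single
expression, so the macro can be used wherever an expression is allowed.

@verbatim
```c
/* Hypothetical macro: exchange the int lvalues A and B, using the
   int lvalue TMP as scratch space.  */
#define SWAP_INT(a, b, tmp) ((tmp) = (a), (a) = (b), (b) = (tmp))

/* Put *X and *Y in ascending order.  */
void
sort_two (int *x, int *y)
{
  int tmp;

  if (*x > *y)
    SWAP_INT (*x, *y, tmp);
}
```
@end verbatim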
3257
3258@node Binary Operator Grammar
3259@chapter Binary Operator Grammar
3260@cindex binary operator grammar
3261@cindex grammar, binary operator
3262@cindex operator precedence
3263@cindex precedence, operator
3264@cindex left-associative
3265
3266@dfn{Binary operators} are those that take two operands, one
3267on the left and one on the right.
3268
Apart from the assignment operators, which are right-associative
(@pxref{Assignment Expressions}), all the binary operators in C are
syntactically left-associative. This means that
@w{@code{a @var{op} b @var{op} c}} means
@w{@code{(a @var{op} b) @var{op} c}}. However, you should only write
repeated operators without parentheses using @samp{+}, @samp{-},
@samp{*} and @samp{/}, because those cases are clear from algebra. So
it is OK to write @code{a + b + c} or @code{a - b - c}, but never
@code{a == b == c} or @code{a % b % c}.
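
The sketch below shows why repeated subtraction is harmless but
repeated @samp{==} is a trap. The function names are invented, and the
nesting that @code{a == b == c} would get is written out with
parentheses.

@verbatim
```c
/* a - b - c nests as (a - b) - c.  */
int
subtract_twice (int a, int b, int c)
{
  return a - b - c;
}

/* What a == b == c would mean: compare A with B, giving 0 or 1,
   then compare that truth value with C.  */
int
chained_equal (int a, int b, int c)
{
  return (a == b) == c;
}

/* What was probably intended: all three values are equal.  */
int
all_equal (int a, int b, int c)
{
  return (a == b) && (b == c);
}
```
@end verbatim

@noindent
Here @code{subtract_twice (5, 3, 1)} is 1, as algebra suggests, but
@code{chained_equal (2, 2, 2)} is 0 because it compares 1 with 2.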
3276
3277Each C operator has a @dfn{precedence}, which is its rank in the
3278grammatical order of the various operators. The operators with the
3279highest precedence grab adjoining operands first; these expressions
3280then become operands for operators of lower precedence.
3281
3282The precedence order of operators in C is fully specified, so any
3283combination of operations leads to a well-defined nesting. We state
3284only part of the full precedence ordering here because it is bad
3285practice for C code to depend on the other cases. For cases not
3286specified in this chapter, always use parentheses to make the nesting
3287explicit.@footnote{Personal note from Richard Stallman: I wrote GCC without
3288remembering anything about the C precedence order beyond what's stated
3289here. I studied the full precedence table to write the parser, and
3290promptly forgot it again. If you need to look up the full precedence order
3291to understand some C code, fix the code with parentheses so nobody else
3292needs to do that.}
3293
3294You can depend on this subsequence of the precedence ordering
3295(stated from highest precedence to lowest):
3296
3297@enumerate
3298@item
3299Component access (@samp{.} and @samp{->}).
3300
3301@item
Unary postfix operators.

@item
Unary prefix operators.
3306
3307@item
3308Multiplication, division, and remainder (they have the same precedence).
3309
3310@item
3311Addition and subtraction (they have the same precedence).
3312
3313@item
3314Comparisons---but watch out!
3315
3316@item
3317Logical operators @samp{&&} and @samp{||}---but watch out!
3318
3319@item
3320Conditional expression with @samp{?} and @samp{:}.
3321
3322@item
3323Assignments.
3324
3325@item
3326Sequential execution (the comma operator, @samp{,}).
3327@end enumerate
3328
Two of the lines in the above list say ``but watch out!'' That means
that each of those lines covers operators with subtly different precedence.
Never depend on the grammar of C to decide how two comparisons nest;
instead, always use parentheses to specify their nesting.
3333
3334You can let several @samp{&&} operators associate, or several
3335@samp{||} operators, but always use parentheses to show how @samp{&&}
3336and @samp{||} nest with each other. @xref{Logical Operators}.
3337
3338There is one other precedence ordering that code can depend on:
3339
3340@enumerate
3341@item
3342Unary postfix operators.
3343
3344@item
3345Bitwise and shift operators---but watch out!
3346
3347@item
3348Conditional expression with @samp{?} and @samp{:}.
3349@end enumerate
3350
3351The caveat for bitwise and shift operators is like that for logical
3352operators: you can let multiple uses of one bitwise operator
3353associate, but always use parentheses to control nesting of dissimilar
3354operators.
3355
3356These lists do not specify any precedence ordering between the bitwise
3357and shift operators of the second list and the binary operators above
3358conditional expressions in the first list. When they come together,
3359parenthesize them. @xref{Bitwise Operations}.
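
As an illustration of the advice above, here is a minimal sketch of a
bit-mask test. The names @code{MASK}, @code{has_all_bits} and
@code{unparenthesized_meaning} are made up for this example. The
second function spells out the nesting that C's grammar gives to
@code{flags & MASK == MASK}, in which the comparison is done first:

@example
#include <stdio.h>

#define MASK 0x6u   /* Hypothetical mask for this sketch.  */

/* Intended test: are all the bits of MASK set in FLAGS?  */
int
has_all_bits (unsigned int flags)
@{
  return (flags & MASK) == MASK;
@}

/* What the unparenthesized form means: the comparison yields 1,
   so this merely tests the low bit of FLAGS.  */
int
unparenthesized_meaning (unsigned int flags)
@{
  return flags & (MASK == MASK);
@}

int
main (void)
@{
  printf ("%d %d\n",
          has_all_bits (0x6u),
          unparenthesized_meaning (0x6u));
  return 0;
@}
@end example

@noindent
This prints @samp{1 0}: the two forms disagree for the same input,
which is why the parentheses are worth writing.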
3360
3361@node Order of Execution
3362@chapter Order of Execution
3363@cindex order of execution
3364
3365The order of execution of a C program is not always obvious, and not
3366necessarily predictable. This chapter describes what you can count on.
3367
3368@menu
3369* Reordering of Operands:: Operations in C are not necessarily computed
3370 in the order they are written.
3371* Associativity and Ordering:: Some associative operations are performed
3372 in a particular order; others are not.
3373* Sequence Points:: Some guarantees about the order of operations.
* Postincrement and Ordering:: Ambiguous execution order with postincrement.
3375* Ordering of Operands:: Evaluation order of operands
3376 and function arguments.
3377* Optimization and Ordering:: Compiler optimizations can reorder operations
3378 only if it has no impact on program results.
3379@end menu
3380
3381@node Reordering of Operands
3382@section Reordering of Operands
3383@cindex ordering of operands
3384@cindex reordering of operands
3385@cindex operand execution ordering
3386
3387The C language does not necessarily carry out operations within an
3388expression in the order they appear in the code. For instance, in
3389this expression,
3390
3391@example
3392foo () + bar ()
3393@end example
3394
3395@noindent
3396@code{foo} might be called first or @code{bar} might be called first.
3397If @code{foo} updates a datum and @code{bar} uses that datum, the
3398results can be unpredictable.
3399
3400The unpredictable order of computation of subexpressions also makes a
3401difference when one of them contains an assignment. We already saw
3402this example of bad code,
3403
3404@example
3405x = 20;
3406printf ("%d %d\n", x, x = 4);
3407@end example
3408
3409@noindent
3410in which the second argument, @code{x}, has a different value
3411depending on whether it is computed before or after the assignment in
3412the third argument.
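
One way to remove the dependence on evaluation order is to give each
call or assignment a statement of its own and combine plain values
afterward. Here is a minimal sketch of that approach for both
examples in this section. The variable @code{datum}, the bodies of
@code{foo} and @code{bar}, and the names @code{ordered_sum} and
@code{old} are all made up for illustration:

@example
#include <stdio.h>

static int datum;   /* Updated by foo, used by bar.  */

static int
foo (void)
@{
  datum = 10;
  return 1;
@}

static int
bar (void)
@{
  return datum;
@}

/* Call foo, then bar, in an order fixed by the statements.  */
int
ordered_sum (void)
@{
  int a = foo ();
  int b = bar ();   /* Certainly sees the value foo stored.  */
  return a + b;
@}

int
main (void)
@{
  int x = 20;
  int old = x;   /* Read x before the assignment.  */

  x = 4;
  printf ("%d %d\n", old, x);
  printf ("%d\n", ordered_sum ());
  return 0;
@}
@end example

@noindent
This prints @samp{20 4} and then @samp{11}, whatever order the
compiler would have chosen for the original expressions.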
3413
3414@node Associativity and Ordering
3415@section Associativity and Ordering
3416@cindex associativity and ordering
3417
3418An associative binary operator, such as @code{+}, when used repeatedly
3419can combine any number of operands. The operands' values may be
3420computed in any order.
3421
3422If the values are integers and overflow can be ignored, they may be
3423combined in any order. Thus, given four functions that return
3424@code{unsigned int}, calling them and adding their results as here
3425
3426@example
3427(foo () + bar ()) + (baz () + quux ())
3428@end example
3429
3430@noindent
3431may add up the results in any order.
3432
3433By contrast, arithmetic on signed integers, with overflow significant,
3434is not really associative (@pxref{Integer Overflow}). Thus, the
3435additions must be done in the order specified, obeying parentheses and
3436left-association. That means computing @code{(foo () + bar ())} and
3437@code{(baz () + quux ())} first (in either order), then adding the
3438two.
3439
3440The same applies to arithmetic on floating-point values, since that
3441too is not really associative. However, the GCC option
3442@option{-funsafe-math-optimizations} allows the compiler to change the
3443order of calculation when an associative operation (associative in
3444exact mathematics) combines several operands. The option takes effect
when compiling a module (@pxref{Compilation}). Changing the order
of association can enable the program to pipeline the floating-point
operations.
3448
3449In all these cases, the four function calls can be done in any order.
3450There is no right or wrong about that.
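
To see why floating-point addition is not really associative,
consider this minimal sketch. The function names @code{sum_left} and
@code{sum_right} and the values are chosen for illustration; it
assumes IEEE 754 @code{double} arithmetic and compilation without
@option{-funsafe-math-optimizations}:

@example
#include <stdio.h>

double
sum_left (double a, double b, double c)
@{
  return (a + b) + c;
@}

double
sum_right (double a, double b, double c)
@{
  return a + (b + c);
@}

int
main (void)
@{
  /* 1e20 is so large that adding 1.0 to it is lost in rounding.  */
  printf ("%g %g\n",
          sum_left (1e20, -1e20, 1.0),
          sum_right (1e20, -1e20, 1.0));
  return 0;
@}
@end example

@noindent
Under those assumptions this prints @samp{1 0}: the grouping changes
the sum, so the compiler must respect the grouping that was written.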
3451
3452@node Sequence Points
3453@section Sequence Points
3454@cindex sequence points
3455@cindex full expression
3456
3457There are some points in the code where C makes limited guarantees
3458about the order of operations. These are called @dfn{sequence
3459points}. Here is where they occur:
3460
3461@itemize @bullet
3462@item
3463At the end of a @dfn{full expression}; that is to say, an expression
3464that is not part of a larger expression. All side effects specified
3465by that expression are carried out before execution moves
3466on to subsequent code.
3467
3468@item
3469At the end of the first operand of certain operators: @samp{,},
3470@samp{&&}, @samp{||}, and @samp{?:}. All side effects specified by
3471that expression are carried out before any execution of the
3472next operand.
3473
3474The commas that separate arguments in a function call are @emph{not}
3475comma operators, and they do not create sequence points. The rule
3476for function arguments and the rule for operands are different
3477(@pxref{Ordering of Operands}).
3478
3479@item
3480Just before calling a function. All side effects specified by the
3481argument expressions are carried out before calling the function.
3482
3483If the function to be called is not constant---that is, if it is
3484computed by an expression---all side effects in that expression are
3485carried out before calling the function.
3486@end itemize
3487
3488The ordering imposed by a sequence point applies locally to a limited
3489range of code, as stated above in each case. For instance, the
3490ordering imposed by the comma operator does not apply to code outside
3491that comma operator. Thus, in this code,
3492
3493@example
3494(x = 5, foo (x)) + x * x
3495@end example
3496
3497@noindent
3498the sequence point of the comma operator orders @code{x = 5} before
3499@code{foo (x)}, but @code{x * x} could be computed before or after
3500them.
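
The sequence point after the first operand of @samp{&&} is what makes
a common idiom safe. In this minimal sketch (the function name
@code{first_is_positive} is hypothetical), the test @code{p != NULL}
is completely finished before @code{*p} is evaluated, and @code{*p}
is not evaluated at all when the first operand is false:

@example
#include <stddef.h>
#include <stdio.h>

int
first_is_positive (const int *p)
@{
  return p != NULL && *p > 0;
@}

int
main (void)
@{
  int n = 3;

  printf ("%d %d\n",
          first_is_positive (&n),
          first_is_positive (NULL));
  return 0;
@}
@end example

@noindent
This prints @samp{1 0}.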
3501
3502@node Postincrement and Ordering
3503@section Postincrement and Ordering
3504@cindex postincrement and ordering
3505@cindex ordering and postincrement
3506
3507Ordering requirements are loose with the postincrement and
3508postdecrement operations (@pxref{Postincrement/Postdecrement}), which
3509specify side effects to happen ``a little later.'' They must happen
3510before the next sequence point, but that still leaves room for various
3511meanings. In this expression,
3512
3513@example
3514z = x++ - foo ()
3515@end example
3516
3517@noindent
3518it's unpredictable whether @code{x} gets incremented before or after
3519calling the function @code{foo}. If @code{foo} refers to @code{x},
3520it might see the old value or it might see the incremented value.
3521
3522In this perverse expression,
3523
3524@example
3525x = x++
3526@end example
3527
3528@noindent
3529@code{x} will certainly be incremented but the incremented value may
3530not stick. If the incrementation of @code{x} happens after the
3531assignment to @code{x}, the incremented value will remain in place.
3532But if the incrementation happens first, the assignment will overwrite
3533that with the not-yet-incremented value, so the expression as a whole
3534will leave @code{x} unchanged.
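
A simple way to stay clear of these ambiguities is to put the
increment in a statement of its own, so that the sequence point at
the end of each statement fixes the order. Here is a minimal sketch;
the variable @code{x}, the body of @code{foo}, and the function name
@code{compute} are made up for illustration:

@example
#include <stdio.h>

static int x = 10;

static int
foo (void)
@{
  return x;   /* foo refers to x.  */
@}

int
compute (void)
@{
  int z = x - foo ();   /* foo certainly sees the old value.  */

  x++;                  /* The increment happens afterward.  */
  return z;
@}

int
main (void)
@{
  int z = compute ();

  printf ("%d %d\n", z, x);
  return 0;
@}
@end example

@noindent
This prints @samp{0 11}.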
3535
3536@node Ordering of Operands
3537@section Ordering of Operands
3538@cindex ordering of operands
3539@cindex operand ordering
3540
3541Operands and arguments can be computed in any order, but there are limits to
3542this intermixing in GNU C:
3543
3544@itemize @bullet
3545@item
3546The operands of a binary arithmetic operator can be computed in either
3547order, but they can't be intermixed: one of them has to come first,
3548followed by the other. Any side effects in the operand that's computed
3549first are executed before the other operand is computed.
3550
3551@item
3552That applies to assignment operators too, except that in simple assignment
3553the previous value of the left operand is unused.
3554
3555@item
3556The arguments in a function call can be computed in any order, but
3557they can't be intermixed. Thus, one argument is fully computed, then
3558another, and so on until they are all done. Any side effects in one argument
3559are executed before computation of another argument begins.
3560@end itemize
3561
3562These rules don't cover side effects caused by postincrement and
3563postdecrement operators---those can be deferred up to the next
3564sequence point.
3565
3566If you want to get pedantic, the fact is that GCC can reorder the
3567computations in many other ways provided that doesn't alter the result
3568of running the program. However, because they don't alter the result
3569of running the program, they are negligible, unless you are concerned
3570with the values in certain variables at various times as seen by other
3571processes. In those cases, you can use @code{volatile} to prevent
3572optimizations that would make them behave strangely. @xref{volatile}.
3573
3574@node Optimization and Ordering
3575@section Optimization and Ordering
3576@cindex optimization and ordering
3577@cindex ordering and optimization
3578
3579Sequence points limit the compiler's freedom to reorder operations
3580arbitrarily, but optimizations can still reorder them if the compiler
3581concludes that this won't alter the results. Thus, in this code,
3582
3583@example
3584x++;
3585y = z;
3586x++;
3587@end example
3588
3589@noindent
3590there is a sequence point after each statement, so the code is
3591supposed to increment @code{x} once before the assignment to @code{y}
3592and once after. However, incrementing @code{x} has no effect on
3593@code{y} or @code{z}, and setting @code{y} can't affect @code{x}, so
3594the code could be optimized into this:
3595
3596@example
3597y = z;
3598x += 2;
3599@end example
3600
3601Normally that has no effect except to make the program faster. But
3602there are special situations where it can cause trouble due to things
3603that the compiler cannot know about, such as shared memory. To limit
3604optimization in those places, use the @code{volatile} type qualifier
3605(@pxref{volatile}).
3606
3607@node Primitive Types
3608@chapter Primitive Data Types
3609@cindex primitive types
3610@cindex types, primitive
3611
3612This chapter describes all the primitive data types of C---that is,
3613all the data types that aren't built up from other types. They
3614include the types @code{int} and @code{double} that we've already covered.
3615
3616@menu
3617* Integer Types:: Description of integer types.
3618* Floating-Point Data Types:: Description of floating-point types.
3619* Complex Data Types:: Description of complex number types.
3620* The Void Type:: A type indicating no value at all.
3621* Other Data Types:: A brief summary of other types.
3622* Type Designators:: Referring to a data type abstractly.
3623@end menu
3624
3625These types are all made up of bytes (@pxref{Storage}).
3626
3627@node Integer Types
3628@section Integer Data Types
3629@cindex integer types
3630@cindex types, integer
3631
3632Here we describe all the integer types and their basic
3633characteristics. @xref{Integers in Depth}, for more information about
3634the bit-level integer data representations and arithmetic.
3635
3636@menu
3637* Basic Integers:: Overview of the various kinds of integers.
3638* Signed and Unsigned Types:: Integers can either hold both negative and
3639 non-negative values, or only non-negative.
3640* Narrow Integers:: When to use smaller integer types.
3641* Integer Conversion:: Casting a value from one integer type
3642 to another.
3643* Boolean Type:: An integer type for boolean values.
3644* Integer Variations:: Sizes of integer types can vary
3645 across platforms.
3646@end menu
3647
3648@node Basic Integers
3649@subsection Basic Integers
3650
3651@findex char
3652@findex int
3653@findex short int
3654@findex long int
3655@findex long long int
3656
3657Integer data types in C can be signed or unsigned. An unsigned type
3658can represent only positive numbers and zero. A signed type can
3659represent both positive and negative numbers, in a range spread almost
3660equally on both sides of zero.
3661
3662Aside from signedness, the integer data types vary in size: how many
3663bytes long they are. The size determines how many different integer
3664values the type can hold.
3665
3666Here's a list of the signed integer data types, with the sizes they
3667have on most computers. Each has a corresponding unsigned type; see
3668@ref{Signed and Unsigned Types}.
3669
3670@table @code
3671@item signed char
3672One byte (8 bits). This integer type is used mainly for integers that
3673represent characters, as part of arrays or other data structures.
3674
3675@item short
3676@itemx short int
3677Two bytes (16 bits).
3678
3679@item int
3680Four bytes (32 bits).
3681
3682@item long
3683@itemx long int
3684Four bytes (32 bits) or eight bytes (64 bits), depending on the
3685platform. Typically it is 32 bits on 32-bit computers
3686and 64 bits on 64-bit computers, but there are exceptions.
3687
3688@item long long
3689@itemx long long int
3690Eight bytes (64 bits). Supported in GNU C in the 1980s, and
3691incorporated into standard C as of ISO C99.
3692@end table
3693
3694You can omit @code{int} when you use @code{long} or @code{short}.
3695This is harmless and customary.
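
To check the sizes on a particular platform, you can print them with
@code{sizeof} (@pxref{Type Size}). This is a minimal sketch; the
output depends on the platform, so none is shown here. The macro
@code{CHAR_BIT}, defined in @file{limits.h}, gives the number of bits
in a byte:

@example
#include <limits.h>
#include <stdio.h>

int
main (void)
@{
  printf ("bits per byte: %d\n", CHAR_BIT);
  printf ("signed char:   %zu\n", sizeof (signed char));
  printf ("short:         %zu\n", sizeof (short));
  printf ("int:           %zu\n", sizeof (int));
  printf ("long:          %zu\n", sizeof (long));
  printf ("long long:     %zu\n", sizeof (long long));
  return 0;
@}
@end example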
3696
3697@node Signed and Unsigned Types
3698@subsection Signed and Unsigned Types
3699@cindex signed types
3700@cindex unsigned types
3701@cindex types, signed
3702@cindex types, unsigned
3703@findex signed
3704@findex unsigned
3705
3706An unsigned integer type can represent only positive numbers and zero.
A signed type can represent both positive and negative numbers, in a
3708range spread almost equally on both sides of zero. For instance,
3709@code{unsigned char} holds numbers from 0 to 255 (on most computers),
3710while @code{signed char} holds numbers from @minus{}128 to 127. Each of
3711these types holds 256 different possible values, since they are both 8
3712bits wide.
3713
3714Write @code{signed} or @code{unsigned} before the type keyword to
3715specify a signed or an unsigned type. However, the integer types
3716other than @code{char} are signed by default; with them, @code{signed}
3717is a no-op.
3718
3719Plain @code{char} may be signed or unsigned; this depends on the
3720compiler, the machine in use, and its operating system.
3721
3722In many programs, it makes no difference whether @code{char} is
3723signed. When it does matter, don't leave it to chance; write
3724@code{signed char} or @code{unsigned char}.@footnote{Personal note from
3725Richard Stallman: Eating with hackers at a fish restaurant, I ordered
3726Arctic Char. When my meal arrived, I noted that the chef had not
3727signed it. So I complained, ``This char is unsigned---I wanted a
3728signed char!'' Or rather, I would have said this if I had thought of
3729it fast enough.}
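
The header file @file{limits.h} defines macros that give the actual
ranges. Here is a minimal sketch that prints the ranges discussed
above and reports whether plain @code{char} is signed on the platform
in use. The helper name @code{plain_char_is_signed} is made up for
this example, and the output depends on the platform:

@example
#include <limits.h>
#include <stdio.h>

int
plain_char_is_signed (void)
@{
  return CHAR_MIN < 0;   /* CHAR_MIN is 0 if char is unsigned.  */
@}

int
main (void)
@{
  printf ("signed char:   %d to %d\n", SCHAR_MIN, SCHAR_MAX);
  printf ("unsigned char: 0 to %u\n", (unsigned int) UCHAR_MAX);
  printf ("plain char is %s\n",
          plain_char_is_signed () ? "signed" : "unsigned");
  return 0;
@}
@end example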
3730
3731@node Narrow Integers
3732@subsection Narrow Integers
3733
3734The types that are narrower than @code{int} are rarely used for
3735ordinary variables---we declare them @code{int} instead. This is
3736because C converts those narrower types to @code{int} for any
3737arithmetic. There is literally no reason to declare a local variable
3738@code{char}, for instance.
3739
3740In particular, if the value is really a character, you should declare
3741the variable @code{int}. Not @code{char}! Using that narrow type can
3742force the compiler to truncate values for conversion, which is a
3743waste. Furthermore, some functions return either a character value,
3744or @minus{}1 for ``no character.'' Using @code{int} keeps those
3745values distinct.
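
Here is a minimal sketch of that last point, using the standard
function @code{getc}, which returns either a byte value as a
nonnegative @code{int} or the negative value @code{EOF}. The helper
name @code{count_bytes} is made up for this example, and
@code{tmpfile} merely supplies a stream to read from. Declaring
@code{c} as @code{int} keeps the byte value 0xFF distinct from
@code{EOF}; a @code{char} variable could not tell them apart reliably.

@example
#include <stdio.h>

long
count_bytes (FILE *stream)
@{
  long n = 0;
  int c;   /* int, not char, so EOF stays distinct.  */

  while ((c = getc (stream)) != EOF)
    n++;
  return n;
@}

int
main (void)
@{
  FILE *f = tmpfile ();

  if (f == NULL)
    return 1;
  fputc (0xFF, f);
  fputc ('a', f);
  rewind (f);
  printf ("%ld\n", count_bytes (f));
  fclose (f);
  return 0;
@}
@end example

@noindent
When the temporary file can be created, this prints @samp{2}.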
3746
3747The narrow integer types are useful as parts of other objects, such as
3748arrays and structures. Compare these array declarations, whose sizes
3749on 32-bit processors are shown:
3750
3751@example
3752signed char ac[1000]; /* @r{1000 bytes} */
3753short as[1000]; /* @r{2000 bytes} */
3754int ai[1000]; /* @r{4000 bytes} */
3755long long all[1000]; /* @r{8000 bytes} */
3756@end example
3757
3758In addition, character strings must be made up of @code{char}s,
3759because that's what all the standard library string functions expect.
3760Thus, array @code{ac} could be used as a character string, but the
3761others could not be.
3762
3763@node Integer Conversion
3764@subsection Conversion among Integer Types
3765
3766C converts between integer types implicitly in many situations. It
3767converts the narrow integer types, @code{char} and @code{short}, to
3768@code{int} whenever they are used in arithmetic. Assigning a new
3769value to an integer variable (or other lvalue) converts the value to
3770the variable's type.
3771
3772You can also convert one integer type to another explicitly with a
3773@dfn{cast} operator. @xref{Explicit Type Conversion}.
3774
The process of conversion to a wider type is straightforward: the
value is unchanged. The only exception is when converting a negative
value (in a signed type, obviously) to a wider unsigned type. In that
case, the result is a positive value: the original value plus 2 raised
to the power of the unsigned type's width in bits
(@pxref{Integers in Depth}).
3780
3781@cindex truncation
3782Converting to a narrower type, also called @dfn{truncation}, involves
3783discarding some of the value's bits. This is not considered overflow
3784(@pxref{Integer Overflow}) because loss of significant bits is a
3785normal consequence of truncation. Likewise for conversion between
3786signed and unsigned types of the same width.
3787
3788More information about conversion for assignment is in
3789@ref{Assignment Type Conversions}. For conversion for arithmetic,
3790see @ref{Argument Promotions}.
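
Here is a minimal sketch of both directions of conversion. The
function names @code{widen} and @code{narrow} are made up for this
example, and the second result assumes 8-bit bytes. Converting
@minus{}1 to @code{unsigned int} gives the largest value of that
type, @code{UINT_MAX}, and truncation keeps only the low-order bits:

@example
#include <limits.h>
#include <stdio.h>

unsigned int
widen (signed char value)
@{
  return value;   /* Implicit conversion to the return type.  */
@}

unsigned char
narrow (unsigned int value)
@{
  return (unsigned char) value;   /* Truncation.  */
@}

int
main (void)
@{
  printf ("%d\n", widen (-1) == UINT_MAX);
  printf ("%#x\n", (unsigned int) narrow (0x1234u));
  return 0;
@}
@end example

@noindent
With 8-bit bytes, this prints @samp{1} and then @samp{0x34}.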
3791
3792@node Boolean Type
3793@subsection Boolean Type
3794@cindex boolean type
3795@cindex type, boolean
3796@findex bool
3797
3798The unsigned integer type @code{bool} holds truth values: its possible
3799values are 0 and 1. Converting any nonzero value to @code{bool}
3800results in 1. For example:
3801
3802@example
3803bool a = 0;
3804bool b = 1;
3805bool c = 4; /* @r{Stores the value 1 in @code{c}.} */
3806@end example
3807
3808Unlike @code{int}, @code{bool} is not a keyword. It is defined in
3809the header file @file{stdbool.h}.
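
Conversion to @code{bool} differs from truncation to a narrow integer
type, as this minimal sketch shows. The function names
@code{to_bool} and @code{to_uchar} are made up for this example, and
the second result assumes 8-bit bytes:

@example
#include <stdbool.h>
#include <stdio.h>

bool
to_bool (int n)
@{
  return n;   /* Any nonzero value becomes 1.  */
@}

unsigned char
to_uchar (int n)
@{
  return (unsigned char) n;   /* Keeps only the low-order bits.  */
@}

int
main (void)
@{
  printf ("%d %d\n", to_bool (256), to_uchar (256));
  return 0;
@}
@end example

@noindent
With 8-bit bytes, this prints @samp{1 0}.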
3810
3811@node Integer Variations
3812@subsection Integer Variations
3813
3814The integer types of C have standard @emph{names}, but what they
3815@emph{mean} varies depending on the kind of platform in use:
3816which kind of computer, which operating system, and which compiler.
3817It may even depend on the compiler options used.
3818
3819Plain @code{char} may be signed or unsigned; this depends on the
3820platform, too. Even for GNU C, there is no general rule.
3821
3822In theory, all of the integer types' sizes can vary. @code{char} is
always considered one ``byte'' for C, but it is not necessarily an
8-bit byte; on some platforms it may be more than 8 bits. ISO C
specifies that none of these types is narrower than the ones
above it in the list in @ref{Basic Integers}, and that @code{short}
and @code{int} have at least 16 bits, @code{long} at least 32 bits,
and @code{long long} at least 64 bits.
3828
3829It is possible that in the future GNU C will support platforms where
3830@code{int} is 64 bits long. In practice, however, on today's real
3831computers, there is little variation; you can rely on the table
3832given previously (@pxref{Basic Integers}).
3833
3834To be completely sure of the size of an integer type,
3835use the types @code{int16_t}, @code{int32_t} and @code{int64_t}.
3836Their corresponding unsigned types add @samp{u} at the front.
3837To define these, include the header file @file{stdint.h}.
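
Here is a minimal sketch that uses two of these types, together with
the limit macros @code{INT32_MAX} and @code{UINT16_MAX} that
@file{stdint.h} also defines. The variable names are made up for
this example:

@example
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

int
main (void)
@{
  int32_t max32 = INT32_MAX;
  uint16_t max16 = UINT16_MAX;

  printf ("%zu %zu\n",
          sizeof (int32_t) * CHAR_BIT,
          sizeof (uint16_t) * CHAR_BIT);
  printf ("%ld %u\n", (long) max32, (unsigned int) max16);
  return 0;
@}
@end example

@noindent
On any platform that provides these types, this prints @samp{32 16}
and then @samp{2147483647 65535}.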
3838
3839The GNU C Compiler compiles for some embedded controllers that use two
3840bytes for @code{int}. On some, @code{int} is just one ``byte,'' and
3841so is @code{short int}---but that ``byte'' may contain 16 bits or even
384232 bits. These processors can't support an ordinary operating system
3843(they may have their own specialized operating systems), and most C
3844programs do not try to support them.
3845
3846@node Floating-Point Data Types
3847@section Floating-Point Data Types
3848@cindex floating-point types
3849@cindex types, floating-point
3850@findex double
3851@findex float
3852@findex long double
3853
3854@dfn{Floating point} is the binary analogue of scientific notation:
3855internally it represents a number as a fraction and a binary exponent; the
3856value is that fraction multiplied by the specified power of 2.
3857
3858For instance, to represent 6, the fraction would be 0.75 and the
3859exponent would be 3; together they stand for the value @math{0.75 * 2@sup{3}},
3860meaning 0.75 * 8. The value 1.5 would use 0.75 as the fraction and 1
3861as the exponent. The value 0.75 would use 0.75 as the fraction and 0
3862as the exponent. The value 0.375 would use 0.75 as the fraction and
@minus{}1 as the exponent.
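
The standard library function @code{frexp}, declared in
@file{math.h}, splits a value into a fraction and a binary exponent
in just this way. Here is a minimal sketch that applies it to the
four values above; on some systems you may need to link with
@option{-lm}:

@example
#include <math.h>
#include <stdio.h>

int
main (void)
@{
  double values[] = @{ 6.0, 1.5, 0.75, 0.375 @};
  int i;

  for (i = 0; i < 4; i++)
    @{
      int exponent;
      double fraction = frexp (values[i], &exponent);

      printf ("%g = %g * 2^%d\n", values[i], fraction, exponent);
    @}
  return 0;
@}
@end example

@noindent
Each line of output shows 0.75 as the fraction, with the exponents 3,
1, 0 and @minus{}1 in turn.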
3864
3865These binary exponents are used by machine instructions. You can
3866write a floating-point constant this way if you wish, using
3867hexadecimal; but normally we write floating-point numbers in decimal.
3868@xref{Floating Constants}.
3869
3870C has three floating-point data types:
3871
3872@table @code
3873@item double
3874``Double-precision'' floating point, which uses 64 bits. This is the
3875normal floating-point type, and modern computers normally do
3876their floating-point computations in this type, or some wider type.
3877Except when there is a special reason to do otherwise, this is the
3878type to use for floating-point values.
3879
3880@item float
3881``Single-precision'' floating point, which uses 32 bits. It is useful
3882for floating-point values stored in structures and arrays, to save
3883space when the full precision of @code{double} is not needed. In
3884addition, single-precision arithmetic is faster on some computers, and
3885occasionally that is useful. But not often---most programs don't use
3886the type @code{float}.
3887
3888C would be cleaner if @code{float} were the name of the type we
3889use for most floating-point values; however, for historical reasons,
3890that's not so.
3891
3892@item long double
3893``Extended-precision'' floating point is either 80-bit or 128-bit
3894precision, depending on the machine in use. On some machines, which
3895have no floating-point format wider than @code{double}, this is
3896equivalent to @code{double}.
3897@end table
3898
3899Floating-point arithmetic raises many subtle issues. @xref{Floating
3900Point in Depth}, for more information.
3901
3902@node Complex Data Types
3903@section Complex Data Types
3904@cindex complex numbers
3905@cindex types, complex
3906@cindex @code{_Complex} keyword
3907@cindex @code{__complex__} keyword
3908@findex _Complex
3909@findex __complex__
3910
3911Complex numbers can include both a real part and an imaginary part.
3912The numeric constants covered above have real-numbered values. An
3913imaginary-valued constant is an ordinary real-valued constant followed
3914by @samp{i}.
3915
3916To declare numeric variables as complex, use the @code{_Complex}
3917keyword.@footnote{For compatibility with older versions of GNU C, the
3918keyword @code{__complex__} is also allowed. Going forward, however,
use the new @code{_Complex} keyword as defined in ISO C99.} The
3920standard C complex data types are floating point,
3921
3922@example
3923_Complex float foo;
3924_Complex double bar;
3925_Complex long double quux;
3926@end example
3927
3928@noindent
3929but GNU C supports integer complex types as well.
3930
3931Since @code{_Complex} is a keyword just like @code{float} and
3932@code{double} and @code{long}, the keywords can appear in any order,
3933but the order shown above seems most logical.
3934
3935GNU C supports constants for complex values; for instance, @code{4.0 +
39363.0i} has the value 4 + 3i as type @code{_Complex double}.
3937@xref{Imaginary Constants}.
3938
3939To pull the real and imaginary parts of the number back out, GNU C
3940provides the keywords @code{__real__} and @code{__imag__}:
3941
3942@example
3943_Complex double foo = 4.0 + 3.0i;
3944
3945double a = __real__ foo; /* @r{@code{a} is now 4.0.} */
3946double b = __imag__ foo; /* @r{@code{b} is now 3.0.} */
3947@end example
3948
3949@noindent
3950Standard C does not include these keywords, and instead relies on
functions defined in @file{complex.h} for accessing the real and
3952imaginary parts of a complex number: @code{crealf}, @code{creal}, and
3953@code{creall} extract the real part of a float, double, or long double
3954complex number, respectively; @code{cimagf}, @code{cimag}, and
3955@code{cimagl} extract the imaginary part.
3956
3957@cindex complex conjugation
3958GNU C also defines @samp{~} as an operator for complex conjugation,
3959which means negating the imaginary part of a complex number:
3960
3961@example
3962_Complex double foo = 4.0 + 3.0i;
3963_Complex double bar = ~foo; /* @r{@code{bar} is now 4 @minus{} 3i.} */
3964@end example
3965
3966@noindent
3967For standard C compatibility, you can use the appropriate library
function: @code{conjf}, @code{conj}, or @code{conjl}.
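
Here is a minimal sketch of the same operations written with only
the standard library functions. It uses the macro @code{I} from
@file{complex.h} in place of the GNU C imaginary constant; on some
systems you may need to link with @option{-lm}:

@example
#include <complex.h>
#include <stdio.h>

int
main (void)
@{
  _Complex double foo = 4.0 + 3.0 * I;
  _Complex double bar = conj (foo);

  printf ("%g %g\n", creal (foo), cimag (foo));
  printf ("%g %g\n", creal (bar), cimag (bar));
  return 0;
@}
@end example

@noindent
This prints @samp{4 3} and then @samp{4 -3}.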
3969
3970@node The Void Type
3971@section The Void Type
3972@cindex void type
3973@cindex type, void
3974@findex void
3975
3976The data type @code{void} is a dummy---it allows no operations. It
3977really means ``no value at all.'' When a function is meant to return
3978no value, we write @code{void} for its return type. Then
3979@code{return} statements in that function should not specify a value
3980(@pxref{return Statement}). Here's an example:
3981
3982@example
3983void
3984print_if_positive (double x, double y)
3985@{
3986 if (x <= 0)
3987 return;
3988 if (y <= 0)
3989 return;
3990 printf ("Next point is (%f,%f)\n", x, y);
3991@}
3992@end example
3993
3994A @code{void}-returning function is comparable to what some other languages
3995call a ``procedure'' instead of a ``function.''
3996
3997@c ??? Already presented
3998@c @samp{%f} in an output template specifies to format a @code{double} value
3999@c as a decimal number, using a decimal point if needed.
4000
4001@node Other Data Types
4002@section Other Data Types
4003
4004Beyond the primitive types, C provides several ways to construct new
4005data types. For instance, you can define @dfn{pointers}, values that
4006represent the addresses of other data (@pxref{Pointers}). You can
4007define @dfn{structures}, as in many other languages
4008(@pxref{Structures}), and @dfn{unions}, which specify multiple ways
4009to look at the same memory space (@pxref{Unions}). @dfn{Enumerations}
4010are collections of named integer codes (@pxref{Enumeration Types}).
4011
4012@dfn{Array types} in C are used for allocating space for objects,
4013but C does not permit operating on an array value as a whole. @xref{Arrays}.
4014
4015@node Type Designators
4016@section Type Designators
4017@cindex type designator
4018
4019Some C constructs require a way to designate a specific data type
4020independent of any particular variable or expression which has that
4021type. The way to do this is with a @dfn{type designator}. The
constructs that need one include casts (@pxref{Explicit Type
4023Conversion}) and @code{sizeof} (@pxref{Type Size}).
4024
4025We also use type designators to talk about the type of a value in C,
4026so you will see many type designators in this manual. When we say,
4027``The value has type @code{int},'' @code{int} is a type designator.
4028
4029To make the designator for any type, imagine a variable declaration
4030for a variable of that type and delete the variable name and the final
4031semicolon.
4032
4033For example, to designate the type of full-word integers, we start
4034with the declaration for a variable @code{foo} with that type,
4035which is this:
4036
4037@example
4038int foo;
4039@end example
4040
4041@noindent
4042Then we delete the variable name @code{foo} and the semicolon, leaving
4043@code{int}---exactly the keyword used in such a declaration.
4044Therefore, the type designator for this type is @code{int}.
4045
4046What about long unsigned integers? From the declaration
4047
4048@example
4049unsigned long int foo;
4050@end example
4051
4052@noindent
4053we determine that the designator is @code{unsigned long int}.
4054
4055Following this procedure, the designator for any primitive type is
4056simply the set of keywords which specifies that type in a declaration.
4057The same is true for compound types such as structures, unions, and
4058enumerations.
4059
4060Designators for pointer types do follow the rule of deleting the
4061variable name and semicolon, but the result is not so simple.
4062@xref{Pointer Type Designators}, as part of the chapter about
pointers. @xref{Array Type Designators}, for designators for array
4064types.
4065
4066To understand what type a designator stands for, imagine a variable
4067name inserted into the right place in the designator to make a valid
4068declaration. What type would that variable be declared as? That is the
4069type the designator designates.
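
Here is a minimal sketch that uses type designators in the two
constructs mentioned earlier, a cast and @code{sizeof}. The function
name @code{halve} is made up for this example:

@example
#include <stdio.h>

double
halve (int n)
@{
  return (double) n / 2;   /* Cast: the designator is double.  */
@}

int
main (void)
@{
  printf ("%g\n", halve (7));
  printf ("%zu\n", sizeof (unsigned long int));
  return 0;
@}
@end example

@noindent
The first line of output is @samp{3.5}; the second depends on the
platform.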
4070
4071@node Constants
4072@chapter Constants
4073@cindex constants
4074
4075A @dfn{constant} is an expression that stands for a specific value by
4076explicitly representing the desired value. C allows constants for
4077numbers, characters, and strings. We have already seen numeric and
4078string constants in the examples.
4079
4080@menu
4081* Integer Constants:: Literal integer values.
4082* Integer Const Type:: Types of literal integer values.
4083* Floating Constants:: Literal floating-point values.
4084* Imaginary Constants:: Literal imaginary number values.
4085* Invalid Numbers:: Avoiding preprocessing number misconceptions.
4086* Character Constants:: Literal character values.
4087* String Constants:: Literal string values.
4088* UTF-8 String Constants:: Literal UTF-8 string values.
4089* Unicode Character Codes:: Unicode characters represented
4090 in either UTF-16 or UTF-32.
* Wide Character Constants:: Literal character values larger than 8 bits.
4092* Wide String Constants:: Literal string values made up of
4093 16- or 32-bit characters.
4094@end menu
4095
4096@node Integer Constants
4097@section Integer Constants
4098@cindex integer constants
4099@cindex constants, integer
4100
4101An integer constant consists of a number to specify the value,
4102followed optionally by suffix letters to specify the data type.
4103
4104The simplest integer constants are numbers written in base 10
4105(decimal), such as @code{5}, @code{77}, and @code{403}. A decimal
4106constant cannot start with the character @samp{0} (zero) because
4107that makes the constant octal.
4108
4109You can get the effect of a negative integer constant by putting a
4110minus sign at the beginning. Grammatically speaking, that is an
4111arithmetic expression rather than a constant, but it behaves just like
4112a true constant.
4113
4114Integer constants can also be written in octal (base 8), hexadecimal
4115(base 16), or binary (base 2). An octal constant starts with the
4116character @samp{0} (zero), followed by any number of octal digits
4117(@samp{0} to @samp{7}):
4118
4119@example
41200 // @r{zero}
4121077 // @r{63}
41220403 // @r{259}
4123@end example
4124
4125@noindent
4126Pedantically speaking, the constant @code{0} is an octal constant, but
4127we can think of it as decimal; it has the same value either way.
4128
4129A hexadecimal constant starts with @samp{0x} (upper or lower case)
4130followed by hex digits (@samp{0} to @samp{9}, as well as @samp{a}
4131through @samp{f} in upper or lower case):
4132
4133@example
41340xff // @r{255}
41350XA0 // @r{160}
41360xffFF // @r{65535}
4137@end example
4138
4139@cindex binary integer constants
4140A binary constant starts with @samp{0b} (upper or lower case) followed
4141by bits (each represented by the characters @samp{0} or @samp{1}):
4142
4143@example
41440b101 // @r{5}
4145@end example
4146
4147Binary constants are a GNU C extension, not part of the C standard.
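
As a brief illustration, here is a sketch that writes the value 63 in
each of the four notations and checks that they agree. The function
name @code{same_value_in_all_bases} is made up for this example, and
the binary spelling assumes a compiler that accepts the GNU C
extension.

```c
/* Returns 1 if the four spellings of 63 all denote the same value.  */
int
same_value_in_all_bases (void)
{
  int dec = 63;        /* decimal */
  int oct = 077;       /* octal: 7*8 + 7 */
  int hex = 0x3F;      /* hexadecimal: 3*16 + 15 */
  int bin = 0b111111;  /* binary (GNU C extension) */

  return dec == oct && oct == hex && hex == bin;
}
```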
4148
4149Sometimes a space is needed after an integer constant to avoid
4150lexical confusion with the following tokens. @xref{Invalid Numbers}.
4151
4152@node Integer Const Type
4153@section Integer Constant Data Types
4154@cindex integer constant data types
4155@cindex constant data types, integer
4156@cindex types of integer constants
4157
4158The type of an integer constant is normally @code{int}, if the value
4159fits in that type, but here are the complete rules. The type
4160of an integer constant is the first one in this sequence that can
4161properly represent the value,
4162
4163@enumerate
4164@item
4165@code{int}
4166@item
4167@code{unsigned int}
4168@item
4169@code{long int}
4170@item
4171@code{unsigned long int}
4172@item
4173@code{long long int}
4174@item
4175@code{unsigned long long int}
4176@end enumerate
4177
4178@noindent
4179and that isn't excluded by the following rules.
4180
4181If the constant has @samp{l} or @samp{L} as a suffix, that excludes the
4182first two types (non-@code{long}).
4183
If the constant has @samp{ll} or @samp{LL} as a suffix, that excludes
the first four types (non-@code{long long}).
4186
4187If the constant has @samp{u} or @samp{U} as a suffix, that excludes
4188the signed types.
4189
4190Otherwise, if the constant is decimal, that excludes the unsigned
4191types.
4192@c ### This said @code{unsigned int} is excluded.
4193@c ### See 17 April 2016
4194
4195Here are some examples of the suffixes.
4196
4197@example
41983000000000u // @r{three billion as @code{unsigned int}.}
41990LL // @r{zero as a @code{long long int}.}
42000403l // @r{259 as a @code{long int}.}
4201@end example
4202
4203Suffixes in integer constants are rarely used. When the precise type
4204is important, it is cleaner to convert explicitly (@pxref{Explicit
4205Type Conversion}).
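
One way to observe these rules, assuming a compiler that supports the
C11 @code{_Generic} construct, is to ask for the name of a constant's
type. The macro @code{INT_TYPE_NAME} and the function
@code{suffixes_select_types} are hypothetical names invented for this
sketch.

```c
#include <string.h>

/* Hypothetical helper macro: names the type of an expression,
   for the six types that an integer constant can have.  */
#define INT_TYPE_NAME(x) _Generic ((x),               \
    int: "int",                                       \
    unsigned int: "unsigned int",                     \
    long int: "long int",                             \
    unsigned long int: "unsigned long int",           \
    long long int: "long long int",                   \
    unsigned long long int: "unsigned long long int")

/* Returns 1 if the suffixed constants have the expected types.  */
int
suffixes_select_types (void)
{
  return (strcmp (INT_TYPE_NAME (5), "int") == 0
          && strcmp (INT_TYPE_NAME (5u), "unsigned int") == 0
          && strcmp (INT_TYPE_NAME (0403l), "long int") == 0
          && strcmp (INT_TYPE_NAME (5ul), "unsigned long int") == 0
          && strcmp (INT_TYPE_NAME (0LL), "long long int") == 0);
}
```

The unsuffixed constant @code{3000000000} is deliberately left out,
since its type depends on the widths of @code{int} and @code{long int}
on the platform.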
4206
4207@xref{Integer Types}.
4208
4209@node Floating Constants
4210@section Floating-Point Constants
4211@cindex floating-point constants
4212@cindex constants, floating-point
4213
4214A floating-point constant must have either a decimal point, an
4215exponent-of-ten, or both; they distinguish it from an integer
4216constant.
4217
4218To indicate an exponent, write @samp{e} or @samp{E}. The exponent
4219value follows. It is always written as a decimal number; it can
4220optionally start with a sign. The exponent @var{n} means to multiply
4221the constant's value by ten to the @var{n}th power.
4222
4223Thus, @samp{1500.0}, @samp{15e2}, @samp{15e+2}, @samp{15.0e2},
@samp{1.5e+3}, @samp{.15e4}, and @samp{15000e-1} are seven ways of
4225writing a floating-point number whose value is 1500. They are all
4226equivalent.
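
The following sketch compares each of the spellings listed above with
@code{1500.0}. It assumes the compiler converts each decimal constant
to the nearest representable value, as GCC does; the function name is
made up for illustration.

```c
/* Returns 1 if every spelling compares equal to 1500.  */
int
all_spellings_equal_1500 (void)
{
  double v[] = { 1500.0, 15e2, 15e+2, 15.0e2, 1.5e+3, .15e4, 15000e-1 };
  int n = sizeof v / sizeof v[0];

  for (int i = 0; i < n; i++)
    if (v[i] != 1500.0)
      return 0;
  return 1;
}
```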
4227
4228Here are more examples with decimal points:
4229
4230@example
42311.0
42321000.
42333.14159
4234.05
4235.0005
4236@end example
4237
4238For each of them, here are some equivalent constants written with
4239exponents:
4240
4241@example
42421e0, 1.0000e0
4243100e1, 100e+1, 100E+1, 1e3, 10000e-1
42443.14159e0
42455e-2, .0005e+2, 5E-2, .0005E2
4246.05e-2
4247@end example
4248
4249A floating-point constant normally has type @code{double}. You can
4250force it to type @code{float} by adding @samp{f} or @samp{F}
4251at the end. For example,
4252
4253@example
42543.14159f
42553.14159e0f
42561000.f
4257100E1F
4258.0005f
4259.05e-2f
4260@end example
4261
4262Likewise, @samp{l} or @samp{L} at the end forces the constant
4263to type @code{long double}.
4264
You can also write floating constants in hexadecimal. These require
an exponent, and since @samp{e} would be interpreted as a hexadecimal
digit, the character @samp{p} or @samp{P} (for ``power'') indicates it.
4268
4269The exponent in a hexadecimal floating constant is a possibly-signed
4270decimal integer that specifies a power of 2 (@emph{not} 10 or 16) to
4271multiply into the number.
4272
4273Here are some examples:
4274
4275@example
4276@group
42770xAp2 // @r{40 in decimal}
42780xAp-1 // @r{5 in decimal}
0x2.0Bp4 // @r{32.6875 decimal}
0xE.2p3 // @r{113 decimal}
42810x123.ABCp0 // @r{291.6708984375 in decimal}
42820x123.ABCp4 // @r{4666.734375 in decimal}
42830x100p-8 // @r{1}
42840x10p-4 // @r{1}
42850x1p+4 // @r{16}
42860x1p+8 // @r{256}
4287@end group
4288@end example
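
To check values like these by hand, multiply the hexadecimal number by
the stated power of 2. Here is a small sketch that does so for a few
constants; @code{0x1.8p1} is an extra example that is not in the list
above, and the function name is invented.

```c
/* Returns 1 if each hexadecimal floating constant equals the
   decimal value computed by hand as mantissa times a power of 2.  */
int
hex_float_values_match (void)
{
  return (0xAp2 == 40.0                     /* 10 * 2^2 */
          && 0xAp-1 == 5.0                  /* 10 * 2^-1 */
          && 0x1.8p1 == 3.0                 /* (1 + 8/16) * 2^1 */
          && 0x123.ABCp0 == 291.6708984375  /* 291 + 2748/4096 */
          && 0x1p+4 == 16.0);               /* 1 * 2^4 */
}
```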
4289
4290@xref{Floating-Point Data Types}.
4291
4292@node Imaginary Constants
4293@section Imaginary Constants
4294@cindex imaginary constants
4295@cindex complex constants
4296@cindex constants, imaginary
4297
4298A complex number consists of a real part plus an imaginary part.
4299(Either or both parts may be zero.) This section explains how to
4300write numeric constants with imaginary values. By adding these to
4301ordinary real-valued numeric constants, we can make constants with
4302complex values.
4303
4304The simple way to write an imaginary-number constant is to attach the
4305suffix @samp{i} or @samp{I}, or @samp{j} or @samp{J}, to an integer or
4306floating-point constant. For example, @code{2.5fi} has type
4307@code{_Complex float} and @code{3i} has type @code{_Complex int}.
4308The four alternative suffix letters are all equivalent.
4309
4310@cindex _Complex_I
The other way to write an imaginary constant is to multiply a real
constant by @code{_Complex_I}, which represents the imaginary number
i. Standard C doesn't support suffixing with @samp{i} or @samp{j}, so
this clunkier way is needed in code that must stick to standard C.
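
Here is a minimal sketch of the standard C spelling. It assumes the
header file @file{complex.h}, which defines @code{_Complex_I} and
declares the standard functions @code{creal} and @code{cimag}; the
function name @code{standard_complex_constant} is made up for this
example. On some systems you may need to link with the math library.

```c
#include <complex.h>

/* Builds 4 + 3i the standard C way and returns 1 if its real and
   imaginary parts are as expected.  */
int
standard_complex_constant (void)
{
  double _Complex z = 4.0 + 3.0 * _Complex_I;

  return creal (z) == 4.0 && cimag (z) == 3.0;
}
```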
4315
4316To write a complex constant with a nonzero real part and a nonzero
4317imaginary part, write the two separately and add them, like this:
4318
4319@example
43204.0 + 3.0i
4321@end example
4322
4323@noindent
4324That gives the value 4 + 3i, with type @code{_Complex double}.
4325
4326Such a sum can include multiple real constants, or none. Likewise, it
4327can include multiple imaginary constants, or none. For example:
4328
4329@example
4330_Complex double foo, bar, quux;
4331
4332foo = 2.0i + 4.0 + 3.0i; /* @r{Imaginary part is 5.0.} */
4333bar = 4.0 + 12.0; /* @r{Imaginary part is 0.0.} */
4334quux = 3.0i + 15.0i; /* @r{Real part is 0.0.} */
4335@end example
4336
4337@xref{Complex Data Types}.
4338
4339@node Invalid Numbers
4340@section Invalid Numbers
4341
4342Some number-like constructs which are not really valid as numeric
4343constants are treated as numbers in preprocessing directives. If
4344these constructs appear outside of preprocessing, they are erroneous.
4345@xref{Preprocessing Tokens}.
4346
4347Sometimes we need to insert spaces to separate tokens so that they
4348won't be combined into a single number-like construct. For example,
4349@code{0xE+12} is a preprocessing number that is not a valid numeric
4350constant, so it is a syntax error. If what we want is the three
4351tokens @code{@w{0xE + 12}}, we have to use those spaces as separators.
4352
4353@node Character Constants
4354@section Character Constants
4355@cindex character constants
4356@cindex constants, character
4357@cindex escape sequence
4358
4359A @dfn{character constant} is written with single quotes, as in
4360@code{'@var{c}'}. In the simplest case, @var{c} is a single ASCII
4361character that the constant should represent. The constant has type
4362@code{int}, and its value is the character code of that character.
For instance, @code{'a'} represents the character code for the letter
@samp{a}, which is 97 in ASCII.
4365
4366To put the @samp{'} character (single quote) in the character
4367constant, @dfn{quote} it with a backslash (@samp{\}). This character
4368constant looks like @code{'\''}. This sort of sequence, starting with
4369@samp{\}, is called an @dfn{escape sequence}---the backslash character
4370here functions as a kind of @dfn{escape character}.
4371
4372To put the @samp{\} character (backslash) in the character constant,
4373quote it likewise with @samp{\} (another backslash). This character
4374constant looks like @code{'\\'}.
4375
4376@cindex bell character
4377@cindex @samp{\a}
4378@cindex backspace
4379@cindex @samp{\b}
4380@cindex tab (ASCII character)
4381@cindex @samp{\t}
4382@cindex vertical tab
4383@cindex @samp{\v}
4384@cindex formfeed
4385@cindex @samp{\f}
4386@cindex newline
4387@cindex @samp{\n}
4388@cindex return (ASCII character)
4389@cindex @samp{\r}
4390@cindex escape (ASCII character)
4391@cindex @samp{\e}
4392Here are all the escape sequences that represent specific
4393characters in a character constant. The numeric values shown are
4394the corresponding ASCII character codes, as decimal numbers.
4395
4396@example
4397'\a' @result{} 7 /* @r{alarm, @kbd{CTRL-g}} */
4398'\b' @result{} 8 /* @r{backspace, @key{BS}, @kbd{CTRL-h}} */
4399'\t' @result{} 9 /* @r{tab, @key{TAB}, @kbd{CTRL-i}} */
4400'\n' @result{} 10 /* @r{newline, @kbd{CTRL-j}} */
4401'\v' @result{} 11 /* @r{vertical tab, @kbd{CTRL-k}} */
4402'\f' @result{} 12 /* @r{formfeed, @kbd{CTRL-l}} */
4403'\r' @result{} 13 /* @r{carriage return, @key{RET}, @kbd{CTRL-m}} */
4404'\e' @result{} 27 /* @r{escape character, @key{ESC}, @kbd{CTRL-[}} */
4405'\\' @result{} 92 /* @r{backslash character, @kbd{\}} */
4406'\'' @result{} 39 /* @r{singlequote character, @kbd{'}} */
4407'\"' @result{} 34 /* @r{doublequote character, @kbd{"}} */
4408'\?' @result{} 63 /* @r{question mark, @kbd{?}} */
4409@end example
4410
4411@samp{\e} is a GNU C extension; to stick to standard C, write @samp{\33}.
4412
4413You can also write octal and hex character codes as
4414@samp{\@var{octalcode}} or @samp{\x@var{hexcode}}. Decimal is not an
4415option here, so octal codes do not need to start with @samp{0}.
4416
4417The character constant's value has type @code{int}. However, the
4418character code is treated initially as a @code{char} value, which is
4419then converted to @code{int}. If the character code is greater than
4420127 (@code{0177} in octal), the resulting @code{int} may be negative
4421on a platform where the type @code{char} is 8 bits long and signed.
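
The following sketch checks a few numeric escape sequences against
their values. It assumes an ASCII-compatible character set and an
8-bit @code{char}; the function name is hypothetical. The last test
does not assume whether @code{char} is signed; it only checks that
@code{'\377'} is negative exactly when @code{CHAR_MIN} (from
@file{limits.h}) is.

```c
#include <limits.h>

/* Returns 1 if the numeric escapes below denote the expected codes.
   Assumes an ASCII-compatible character set and 8-bit char.  */
int
numeric_escapes_agree (void)
{
  int octal_ok = '\101' == 'A' && '\33' == 27;
  int hex_ok = '\x41' == 'A' && '\x1b' == 27;

  /* '\377' is code 255 as a char; converted to int it is negative
     exactly when plain char is signed.  */
  int sign_ok = ('\377' < 0) == (CHAR_MIN < 0);

  return octal_ok && hex_ok && sign_ok;
}
```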
4422
4423@node String Constants
4424@section String Constants
4425@cindex string constants
4426@cindex constants, string
4427
4428A @dfn{string constant} represents a series of characters. It starts
4429with @samp{"} and ends with @samp{"}; in between are the contents of
4430the string. Quoting special characters such as @samp{"}, @samp{\} and
4431newline in the contents works in string constants as in character
4432constants. In a string constant, @samp{'} does not need to be quoted.
4433
4434A string constant defines an array of characters which contains the
4435specified characters followed by the null character (code 0). Using
4436the string constant is equivalent to using the name of an array with
4437those contents. In simple cases, the length in bytes of the string
4438constant is one greater than the number of characters written in it.
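
Here is a short sketch of the size rule, using @code{sizeof}
(@pxref{Type Size}) and the standard library function @code{strlen}.
The function name is made up. The last comparison shows why the rule
holds only ``in simple cases'': the escape sequence @samp{\t} is two
characters in the source but a single byte in the array.

```c
#include <string.h>

/* Returns 1 if the array for "Foo!" has 5 bytes, of which strlen
   counts the 4 before the terminating null character.  */
int
string_constant_sizes (void)
{
  return (sizeof "Foo!" == 5
          && strlen ("Foo!") == 4
          && sizeof "a\tb" == 4);   /* 'a', tab, 'b', null */
}
```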
4439
4440As with any array in C, using the string constant in an expression
4441converts the array to a pointer (@pxref{Pointers}) to the array's
4442first element (@pxref{Accessing Array Elements}). This pointer will
4443have type @code{char *} because it points to an element of type
4444@code{char}. @code{char *} is an example of a type designator for a
4445pointer type (@pxref{Pointer Type Designators}). That type is used
4446for strings generally, not just the strings expressed as constants
4447in a program.
4448
4449Thus, the string constant @code{"Foo!"} is almost
4450equivalent to declaring an array like this
4451
4452@example
4453char string_array_1[] = @{'F', 'o', 'o', '!', '\0' @};
4454@end example
4455
4456@noindent
4457and then using @code{string_array_1} in the program. There
4458are two differences, however:
4459
4460@itemize @bullet
4461@item
4462The string constant doesn't define a name for the array.
4463
4464@item
4465The string constant is probably stored in a read-only area of memory.
4466@end itemize
4467
4468Newlines are not allowed in the text of a string constant. The motive
4469for this prohibition is to catch the error of omitting the closing
4470@samp{"}. To put a newline in a constant string, write it as
4471@samp{\n} in the string constant.
4472
4473A real null character in the source code inside a string constant
4474causes a warning. To put a null character in the middle of a string
4475constant, write @samp{\0} or @samp{\000}.
4476
4477Consecutive string constants are effectively concatenated. Thus,
4478
4479@example
4480"Fo" "o!" @r{is equivalent to} "Foo!"
4481@end example
4482
4483This is useful for writing a string containing multiple lines,
4484like this:
4485
4486@example
4487"This message is so long that it needs more than\n"
4488"a single line of text. C does not allow a newline\n"
4489"to represent itself in a string constant, so we have to\n"
4490"write \\n to put it in the string. For readability of\n"
4491"the source code, it is advisable to put line breaks in\n"
4492"the source where they occur in the contents of the\n"
4493"constant.\n"
4494@end example
4495
4496The sequence of a backslash and a newline is ignored anywhere
4497in a C program, and that includes inside a string constant.
4498Thus, you can write multi-line string constants this way:
4499
4500@example
4501"This is another way to put newlines in a string constant\n\
4502and break the line after them in the source code."
4503@end example
4504
4505@noindent
4506However, concatenation is the recommended way to do this.
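
To see that the two styles yield the same contents, here is a sketch
that builds a two-line string both ways and compares the results with
@code{strcmp}. The names are invented for this example. Note that
the continuation line must start in the first column, since any
indentation would become part of the string.

```c
#include <string.h>

/* Returns 1 if adjacent string constants and a backslash-newline
   continuation both produce the same contents as a single constant.  */
int
both_forms_match (void)
{
  const char *joined = "line one\n"
                       "line two\n";
  const char *continued = "line one\n\
line two\n";

  return (strcmp (joined, "line one\nline two\n") == 0
          && strcmp (continued, joined) == 0);
}
```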
4507
4508You can also write perverse string constants like this,
4509
4510@example
4511"Fo\
4512o!"
4513@end example
4514
4515@noindent
4516but don't do that---write it like this instead:
4517
4518@example
4519"Foo!"
4520@end example
4521
4522Be careful to avoid passing a string constant to a function that
4523modifies the string it receives. The memory where the string constant
4524is stored may be read-only, which would cause a fatal @code{SIGSEGV}
4525signal that normally terminates the function (@pxref{Signals}. Even
4526worse, the memory may not be read-only. Then the function might
4527modify the string constant, thus spoiling the contents of other string
4528constants that are supposed to contain the same value and are unified
4529by the compiler.
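
If you need a modifiable string with the same contents, one safe
approach is to initialize an array from the constant and pass the
array instead. In this sketch, @code{upcase} is a hypothetical
function that modifies its argument, and @code{upcase_copy} is a
made-up name as well.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical function that modifies the string it receives.  */
static void
upcase (char *s)
{
  for (; *s; s++)
    *s = toupper ((unsigned char) *s);
}

/* Returns 1 after safely upcasing a writable copy of "Foo!".  */
int
upcase_copy (void)
{
  char buf[] = "Foo!";   /* Array initialized from the constant.  */

  upcase (buf);          /* OK: buf is ordinary writable memory.  */
  /* upcase ("Foo!") would try to modify the constant itself.  */
  return strcmp (buf, "FOO!") == 0;
}
```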
4530
4531@node UTF-8 String Constants
4532@section UTF-8 String Constants
4533@cindex UTF-8 String Constants
4534
4535Writing @samp{u8} immediately before a string constant, with no
4536intervening space, means to represent that string in UTF-8 encoding as
4537a sequence of bytes. UTF-8 represents ASCII characters with a single
4538byte, and represents non-ASCII Unicode characters (codes 128 and up)
4539as multibyte sequences. Here is an example of a UTF-8 constant:
4540
4541@example
4542u8"A cónstàñt"
4543@end example
4544
4545This constant occupies 13 bytes plus the terminating null,
4546because each of the accented letters is a two-byte sequence.
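
You can verify the byte count with @code{sizeof}. This sketch writes
the accented letters as universal character names (@pxref{Unicode
Character Codes}) so that the result does not depend on the encoding
of the source file; the variable and function names are made up.

```c
#include <stddef.h>

/* The same constant, with the accented letters written as universal
   character names so that the source file stays pure ASCII.  */
static const char utf8_text[] = u8"A c\u00F3nst\u00E0\u00F1t";

/* Returns the size of the array in bytes, including the null.  */
size_t
utf8_text_size (void)
{
  return sizeof utf8_text;
}
```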
4547
4548Concatenating an ordinary string with a UTF-8 string conceptually
4549produces another UTF-8 string. However, if the ordinary string
4550contains character codes 128 and up, the results cannot be relied on.
4551
4552@node Unicode Character Codes
4553@section Unicode Character Codes
4554@cindex Unicode character codes
4555@cindex universal character names
4556
4557You can specify Unicode characters, for individual character constants
4558or as part of string constants (@pxref{String Constants}), using
4559escape sequences. Use the @samp{\u} escape sequence with a 16-bit
4560hexadecimal Unicode character code. If the code value is too big for
456116 bits, use the @samp{\U} escape sequence with a 32-bit hexadecimal
4562Unicode character code. (These codes are called @dfn{universal
4563character names}.) For example,
4564
4565@example
4566\u6C34 /* @r{16-bit code (UTF-16)} */
4567\U0010ABCD /* @r{32-bit code (UTF-32)} */
4568@end example
4569
4570@noindent
4571One way to use these is in UTF-8 string constants (@pxref{UTF-8 String
4572Constants}). For instance,
4573
4574@example
4575u8"fóó \u6C34 \U0010ABCD"
4576@end example
4577
You can also use them in wide character constants (@pxref{Wide
4579Character Constants}), like this:
4580
4581@example
4582u'\u6C34' /* @r{16-bit code} */
4583U'\U0010ABCD' /* @r{32-bit code} */
4584@end example
4585
4586@noindent
4587and in wide string constants (@pxref{Wide String Constants}), like
4588this:
4589
4590@example
4591u"\u6C34\u6C33" /* @r{16-bit code} */
4592U"\U0010ABCD" /* @r{32-bit code} */
4593@end example
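
As a sketch of how these escapes map to values, assuming a C11
compiler, the following function checks the character codes and the
number of bytes each character occupies in UTF-8. The function name
is hypothetical.

```c
/* Returns 1 if the universal character names have the expected
   code values and UTF-8 lengths (requires C11 prefixes).  */
int
ucn_values (void)
{
  return (u'\u6C34' == 0x6C34
          && U'\U0010ABCD' == 0x10ABCD
          && U"\u6C34"[0] == 0x6C34
          && sizeof u8"\u6C34" == 4         /* 3 bytes plus null */
          && sizeof u8"\U0010ABCD" == 5);   /* 4 bytes plus null */
}
```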
4594
Codes in the range of @code{D800} through @code{DFFF} are not valid
in Unicode. Codes less than @code{00A0} are also forbidden, except for
@code{0024}, @code{0040}, and @code{0060}; the forbidden codes are
ordinary ASCII characters and control characters, which you can write
directly or specify with other escape sequences (@pxref{Character
Constants}).
4600
4601@node Wide Character Constants
4602@section Wide Character Constants
4603@cindex wide character constants
4604@cindex constants, wide character
4605
4606A @dfn{wide character constant} represents characters with more than 8
4607bits of character code. This is an obscure feature that we need to
4608document but that you probably won't ever use. If you're just
4609learning C, you may as well skip this section.
4610
4611The original C wide character constant looks like @samp{L} (upper
4612case!) followed immediately by an ordinary character constant (with no
4613intervening space). Its data type is @code{wchar_t}, which is an
4614alias defined in @file{stddef.h} for one of the standard integer
4615types. Depending on the platform, it could be 16 bits or 32 bits. If
4616it is 16 bits, these character constants use the UTF-16 form of
4617Unicode; if 32 bits, UTF-32.
4618
4619There are also Unicode wide character constants which explicitly
4620specify the width. These constants start with @samp{u} or @samp{U}
4621instead of @samp{L}. @samp{u} specifies a 16-bit Unicode wide
4622character constant, and @samp{U} a 32-bit Unicode wide character
4623constant. Their types are, respectively, @code{char16_t} and
4624@w{@code{char32_t}}; they are declared in the header file
4625@file{uchar.h}. These character constants are valid even if
4626@file{uchar.h} is not included, but some uses of them may be
4627inconvenient without including it to declare those type names.
4628
4629The character represented in a wide character constant can be an
4630ordinary ASCII character. @code{L'a'}, @code{u'a'} and @code{U'a'}
4631are all valid, and they are all equal to @code{'a'}.
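
Here is a minimal check of these equalities, assuming an
ASCII-compatible wide character set; the function name is invented for
this example.

```c
#include <stddef.h>

/* Returns 1 if all three wide spellings of the letter a equal plain
   'a', and the L form has the size of wchar_t.  */
int
wide_a_equals_a (void)
{
  return (L'a' == 'a' && u'a' == 'a' && U'a' == 'a'
          && sizeof L'a' == sizeof (wchar_t));
}
```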
4632
4633In all three kinds of wide character constants, you can write a
4634non-ASCII Unicode character in the constant itself; the constant's
4635value is the character's Unicode character code. Or you can specify
4636the Unicode character with an escape sequence (@pxref{Unicode
4637Character Codes}).
4638
4639@node Wide String Constants
4640@section Wide String Constants
4641@cindex wide string constants
4642@cindex constants, wide string
4643
A @dfn{wide string constant} stands for an array of 16-bit or 32-bit
characters. Wide string constants are rarely used; if you're just
learning C, you may as well skip this section.
4647
4648There are three kinds of wide string constants, which differ in the
4649data type used for each character in the string. Each wide string
4650constant is equivalent to an array of integers, but the data type of
4651those integers depends on the kind of wide string. Using the constant
4652in an expression will convert the array to a pointer to its first
4653element, as usual for arrays in C (@pxref{Accessing Array Elements}).
4654For each kind of wide string constant, we state here what type that
4655pointer will be.
4656
4657@table @code
4658@item char16_t
4659This is a 16-bit Unicode wide string constant: each element is a
466016-bit Unicode character code with type @code{char16_t}, so the string
4661has the pointer type @code{char16_t@ *}. (That is a type designator;
4662@pxref{Pointer Type Designators}.) The constant is written as
4663@samp{u} (which must be lower case) followed (with no intervening
4664space) by a string constant with the usual syntax.
4665
4666@item char32_t
4667This is a 32-bit Unicode wide string constant: each element is a
466832-bit Unicode character code, and the string has type @code{char32_t@ *}.
4669It's written as @samp{U} (which must be upper case) followed (with no
4670intervening space) by a string constant with the usual syntax.
4671
4672@item wchar_t
4673This is the original kind of wide string constant. It's written as
4674@samp{L} (which must be upper case) followed (with no intervening
4675space) by a string constant with the usual syntax, and the string has
4676type @code{wchar_t@ *}.
4677
4678The width of the data type @code{wchar_t} depends on the target
4679platform, which makes this kind of wide string somewhat less useful
4680than the newer kinds.
4681@end table
4682
4683@code{char16_t} and @code{char32_t} are declared in the header file
4684@file{uchar.h}. @code{wchar_t} is declared in @file{stddef.h}.
4685
4686Consecutive wide string constants of the same kind concatenate, just
4687like ordinary string constants. A wide string constant concatenated
4688with an ordinary string constant results in a wide string constant.
4689You can't concatenate two wide string constants of different kinds.
4690You also can't concatenate a wide string constant (of any kind) with a
4691UTF-8 string constant.
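
The following sketch counts the elements of a few wide string
constants, including the terminating null element. @code{NELEMS} is a
hypothetical helper macro built on @code{sizeof} (@pxref{Type Size}),
and the function name is made up.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements in an array.  */
#define NELEMS(a) (sizeof (a) / sizeof (a)[0])

/* Returns 1 if each wide string has the expected element count,
   counting the terminating null element.  */
int
wide_string_lengths (void)
{
  return (NELEMS (u"ab") == 3          /* 16-bit elements */
          && NELEMS (U"ab") == 3       /* 32-bit elements */
          && NELEMS (L"ab" "cd") == 5  /* wide + ordinary = wide */
          && sizeof (L"ab" "cd")[0] == sizeof (wchar_t));
}
```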
4692
4693@node Type Size
4694@chapter Type Size
4695@cindex type size
4696@cindex size of type
4697@findex sizeof
4698
4699Each data type has a @dfn{size}, which is the number of bytes
4700(@pxref{Storage}) that it occupies in memory. To refer to the size in
4701a C program, use @code{sizeof}. There are two ways to use it:
4702
4703@table @code
4704@item sizeof @var{expression}
4705This gives the size of @var{expression}, based on its data type. It
4706does not calculate the value of @var{expression}, only its size, so if
4707@var{expression} includes side effects or function calls, they do not
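happen. Therefore, @code{sizeof} is normally a compile-time operation
that has zero run-time cost. (The exception is an operand whose type
is a variable-length array, whose size is known only at run time.)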
4710
4711A value that is a bit field (@pxref{Bit Fields}) is not allowed as an
4712operand of @code{sizeof}.
4713
4714For example,
4715
4716@example
4717double a;
4718
4719i = sizeof a + 10;
4720@end example
4721
4722@noindent
4723sets @code{i} to 18 on most computers because @code{a} occupies 8 bytes.
4724
4725Here's how to determine the number of elements in an array
4726@code{array}:
4727
4728@example
4729(sizeof array / sizeof array[0])
4730@end example
4731
4732@noindent
4733The expression @code{sizeof array} gives the size of the array, not
4734the size of a pointer to an element. However, if @var{expression} is
4735a function parameter that was declared as an array, that
4736variable really has a pointer type (@pxref{Array Parm Pointer}), so
4737the result is the size of that pointer.
4738
4739@item sizeof (@var{type})
4740This gives the size of @var{type}.
4741For example,
4742
4743@example
4744i = sizeof (double) + 10;
4745@end example
4746
4747@noindent
4748is equivalent to the previous example.
4749
4750You can't apply @code{sizeof} to an incomplete type (@pxref{Incomplete
4751Types}), nor @code{void}. Using it on a function type gives 1 in GNU
4752C, which makes adding an integer to a function pointer work as desired
4753(@pxref{Pointer Arithmetic}).
4754@end table
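
Here is a sketch that exercises these points together: the operand is
not evaluated, the array idiom counts elements, and a pointer has the
size of a pointer. The names @code{calls}, @code{next_value} and
@code{sizeof_checks} are invented for this example.

```c
#include <stddef.h>

static int calls;  /* Counts how often next_value is called.  */

static double
next_value (void)
{
  calls++;
  return 1.0;
}

/* Returns 1 if sizeof behaves as described above.  */
int
sizeof_checks (void)
{
  double array[5];
  double *ptr = array;
  size_t n = sizeof next_value ();  /* Operand is not evaluated.  */

  return (calls == 0
          && n == sizeof (double)
          && sizeof array / sizeof array[0] == 5
          && sizeof ptr == sizeof (double *));
}
```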
4755
4756@strong{Warning}: When you use @code{sizeof} with a type
4757instead of an expression, you must write parentheses around the type.
4758
4759@strong{Warning}: When applying @code{sizeof} to the result of a cast
4760(@pxref{Explicit Type Conversion}), you must write parentheses around
4761the cast expression to avoid an ambiguity in the grammar of C@.
4762Specifically,
4763
4764@example
4765sizeof (int) -x
4766@end example
4767
4768@noindent
4769parses as
4770
4771@example
4772(sizeof (int)) - x
4773@end example
4774
4775@noindent
4776If what you want is
4777
4778@example
4779sizeof ((int) -x)
4780@end example
4781
4782@noindent
4783you must write it that way, with parentheses.
4784
4785The data type of the value of the @code{sizeof} operator is always one
4786of the unsigned integer types; which one of those types depends on the
4787machine. The header file @code{stddef.h} defines the typedef name
4788@code{size_t} as an alias for this type. @xref{Defining Typedef
4789Names}.
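
Here is a small sketch of the difference between the two parses, with
the results stored in variables of type @code{size_t}. The function
name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the two spellings give the results described above.  */
int
sizeof_cast_parses (void)
{
  int x = 1;
  size_t a = sizeof (int) -x;     /* Parses as (sizeof (int)) - x.  */
  size_t b = sizeof ((int) -x);   /* Size of the cast's result.  */

  return a == sizeof (int) - 1 && b == sizeof (int);
}
```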
4790
4791@node Pointers
4792@chapter Pointers
4793@cindex pointers
4794
4795Among high-level languages, C is rather low level, close to the
4796machine. This is mainly because it has explicit @dfn{pointers}. A
4797pointer value is the numeric address of data in memory. The type of
4798data to be found at that address is specified by the data type of the
4799pointer itself. The unary operator @samp{*} gets the data that a
4800pointer points to---this is called @dfn{dereferencing the pointer}.
4801
4802C also allows pointers to functions, but since there are some
4803differences in how they work, we treat them later. @xref{Function
4804Pointers}.
4805
4806@menu
4807* Address of Data:: Using the ``address-of'' operator.
4808* Pointer Types:: For each type, there is a pointer type.
4809* Pointer Declarations:: Declaring variables with pointer types.
4810* Pointer Type Designators:: Designators for pointer types.
4811* Pointer Dereference:: Accessing what a pointer points at.
4812* Null Pointers:: Pointers which do not point to any object.
4813* Invalid Dereference:: Dereferencing null or invalid pointers.
4814* Void Pointers:: Totally generic pointers, can cast to any.
4815* Pointer Comparison:: Comparing memory address values.
4816* Pointer Arithmetic:: Computing memory address values.
4817* Pointers and Arrays:: Using pointer syntax instead of array syntax.
4818* Pointer Arithmetic Low Level:: More about computing memory address values.
4819* Pointer Increment/Decrement:: Incrementing and decrementing pointers.
4820* Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
4821* Pointer-Integer Conversion:: Converting pointer types to integer types.
4822* Printing Pointers:: Using @code{printf} for a pointer's value.
4823@end menu
4824
4825@node Address of Data
4826@section Address of Data
4827
4828@cindex address-of operator
4829The most basic way to make a pointer is with the ``address-of''
4830operator, @samp{&}. Let's suppose we have these variables available:
4831
4832@example
4833int i;
4834double a[5];
4835@end example
4836
4837Now, @code{&i} gives the address of the variable @code{i}---a pointer
4838value that points to @code{i}'s location---and @code{&a[3]} gives the
address of element 3 of @code{a}. (It is actually the fourth
4840element in the array, since the first element has index 0.)
4841
4842The address-of operator is unusual because it operates on a place to
4843store a value (an lvalue, @pxref{Lvalues}), not on the value currently
4844stored there. (The left argument of a simple assignment is unusual in
4845the same way.) You can use it on any lvalue except a bit field
4846(@pxref{Bit Fields}) or a constructor (@pxref{Structure
4847Constructors}).
4848
4849
4850@node Pointer Types
4851@section Pointer Types
4852
4853For each data type @var{t}, there is a type for pointers to type
4854@var{t}. For these variables,
4855
4856@example
4857int i;
4858double a[5];
4859@end example
4860
4861@itemize @bullet
4862@item
4863@code{i} has type @code{int}; we say
4864@code{&i} is a ``pointer to @code{int}.''
4865
4866@item
4867@code{a} has type @code{double[5]}; we say @code{&a} is a ``pointer to
4868arrays of five @code{double}s.''
4869
4870@item
4871@code{a[3]} has type @code{double}; we say @code{&a[3]} is a ``pointer
4872to @code{double}.''
4873@end itemize
4874
4875@node Pointer Declarations
4876@section Pointer-Variable Declarations
4877
4878The way to declare that a variable @code{foo} points to type @var{t} is
4879
4880@example
4881@var{t} *foo;
4882@end example
4883
4884To remember this syntax, think ``if you dereference @code{foo}, using
4885the @samp{*} operator, what you get is type @var{t}. Thus, @code{foo}
4886points to type @var{t}.''
4887
4888Thus, we can declare variables that hold pointers to these three
4889types, like this:
4890
4891@example
4892int *ptri; /* @r{Pointer to @code{int}.} */
4893double *ptrd; /* @r{Pointer to @code{double}.} */
4894double (*ptrda)[5]; /* @r{Pointer to @code{double[5]}.} */
4895@end example
4896
4897@samp{int *ptri;} means, ``if you dereference @code{ptri}, you get an
4898@code{int}.'' @samp{double (*ptrda)[5];} means, ``if you dereference
4899@code{ptrda}, then subscript it by an integer less than 5, you get a
4900@code{double}.'' The parentheses express the point that you would
4901dereference it first, then subscript it.
4902
4903Contrast the last one with this:
4904
4905@example
4906double *aptrd[5]; /* @r{Array of five pointers to @code{double}.} */
4907@end example
4908
4909@noindent
Because @samp{*} has lower syntactic precedence than subscripting,
4911you would subscript @code{aptrd} then dereference it. Therefore, it
4912declares an array of pointers, not a pointer.
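
To make the contrast concrete, here is a sketch that declares both
kinds of variable and checks their sizes with @code{sizeof}
(@pxref{Type Size}). The array @code{a} and the function name are
assumptions added for this example.

```c
/* Returns 1 if the two declarations have the sizes their types imply.  */
int
pointer_vs_array_sizes (void)
{
  double a[5] = { 0.0, 1.0, 2.0, 3.0, 4.0 };
  double (*ptrda)[5] = &a;   /* One pointer to the whole array.  */
  double *aptrd[5];          /* Five separate pointers.  */

  for (int i = 0; i < 5; i++)
    aptrd[i] = &a[i];

  return (sizeof *ptrda == 5 * sizeof (double)
          && sizeof aptrd == 5 * sizeof (double *)
          && (*ptrda)[3] == 3.0
          && *aptrd[3] == 3.0);
}
```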
4913
4914@node Pointer Type Designators
4915@section Pointer-Type Designators
4916
4917Every type in C has a designator; you make it by deleting the variable
4918name and the semicolon from a declaration (@pxref{Type
4919Designators}). Here are the designators for the pointer
4920types of the example declarations in the previous section:
4921
4922@example
4923int * /* @r{Pointer to @code{int}.} */
4924double * /* @r{Pointer to @code{double}.} */
4925double (*)[5] /* @r{Pointer to @code{double[5]}.} */
4926@end example
4927
4928Remember, to understand what type a designator stands for, imagine the
4929variable name that would be in the declaration, and figure out what
4930type it would declare that variable with. @code{double (*)[5]} can
4931only come from @code{double (*@var{variable})[5]}, so it's a pointer
4932which, when dereferenced, gives an array of 5 @code{double}s.
4933
4934@node Pointer Dereference
4935@section Dereferencing Pointers
4936@cindex dereferencing pointers
4937@cindex pointer dereferencing
4938
4939The main use of a pointer value is to @dfn{dereference it} (access the
4940data it points at) with the unary @samp{*} operator. For instance,
4941@code{*&i} is the value at @code{i}'s address---which is just
4942@code{i}. The two expressions are equivalent, provided @code{&i} is
4943valid.
4944
4945A pointer-dereference expression whose type is data (not a function)
4946is an lvalue.
4947
4948Pointers become really useful when we store them somewhere and use
4949them later. Here's a simple example to illustrate the practice:
4950
4951@example
4952@{
4953 int i;
4954 int *ptr;
4955
4956 ptr = &i;
4957
4958 i = 5;
4959
4960 @r{@dots{}}
4961
4962 return *ptr; /* @r{Returns 5, fetched from @code{i}.} */
4963@}
4964@end example
4965
4966This shows how to declare the variable @code{ptr} as type
4967@code{int *} (pointer to @code{int}), store a pointer value into it
4968(pointing at @code{i}), and use it later to get the value of the
4969object it points at (the value in @code{i}).
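
As one possible slightly larger illustration, here is a sketch of a
function that exchanges two values through pointers. It is not taken
from any particular program; @code{swap_ints} and @code{swap_demo} are
names made up here.

```c
/* Exchanges the values of the two int objects that A and B point to.  */
static void
swap_ints (int *a, int *b)
{
  int tmp = *a;   /* Fetch through the pointer.  */
  *a = *b;        /* A dereference is an lvalue, so we can store.  */
  *b = tmp;
}

/* Returns 1 if swapping 5 and 7 through pointers worked.  */
int
swap_demo (void)
{
  int i = 5, j = 7;

  swap_ints (&i, &j);
  return i == 7 && j == 5;
}
```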
4970
4971If anyone can provide a useful example which is this basic,
4972I would be grateful.
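
One candidate, offered as a sketch rather than as a settled choice, is
a function that exchanges the values of two @code{int} variables.  A
function receives copies of its arguments, so it can change the
caller's variables only when it is given pointers to them.  The name
@code{swap_ints} is made up for this illustration.

@example
/* @r{Exchange the values of the two @code{int} objects}
   @r{that @code{a} and @code{b} point to.} */
void
swap_ints (int *a, int *b)
@{
  int tmp = *a;   /* @r{Fetch the value that @code{a} points at.} */
  *a = *b;        /* @r{@code{*a} is an lvalue, so we can store into it.} */
  *b = tmp;
@}
@end example

@noindent
After the call @code{swap_ints (&x, &y)}, @code{x} holds the old value
of @code{y}, and @code{y} holds the old value of @code{x}.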
4973
4974@node Null Pointers
4975@section Null Pointers
4976@cindex null pointers
4977@cindex pointers, null
4978
4979@c ???stdio loads sttddef
4980
4981A pointer value can be @dfn{null}, which means it does not point to
4982any object. The cleanest way to get a null pointer is by writing
4983@code{NULL}, a standard macro defined in @file{stddef.h}. You can
4984also do it by casting 0 to the desired pointer type, as in
@code{(char *) 0}.  (The cast operator performs explicit type conversion;
@pxref{Explicit Type Conversion}.)
4987
4988You can store a null pointer in any lvalue whose data type
4989is a pointer type:
4990
4991@example
4992char *foo;
4993foo = NULL;
4994@end example
4995
4996These two, if consecutive, can be combined into a declaration with
4997initializer,
4998
4999@example
5000char *foo = NULL;
5001@end example
5002
5003You can also explicitly cast @code{NULL} to the specific pointer type
5004you want---it makes no difference.
5005
5006@example
5007char *foo;
5008foo = (char *) NULL;
5009@end example
5010
5011To test whether a pointer is null, compare it with zero or
5012@code{NULL}, as shown here:
5013
5014@example
5015if (p != NULL)
5016 /* @r{@code{p} is not null.} */
5017 operate (p);
5018@end example
5019
5020Since testing a pointer for not being null is basic and frequent, all
5021but beginners in C will understand the conditional without need for
5022@code{!= NULL}:
5023
5024@example
5025if (p)
5026 /* @r{@code{p} is not null.} */
5027 operate (p);
5028@end example
5029
5030@node Invalid Dereference
5031@section Dereferencing Null or Invalid Pointers
5032
5033Trying to dereference a null pointer is an error. On most platforms,
5034it generally causes a signal, usually @code{SIGSEGV}
5035(@pxref{Signals}).
5036
5037@example
5038char *foo = NULL;
5039c = *foo; /* @r{This causes a signal and terminates.} */
5040@end example
5041
@noindent
The same thing happens when you dereference a pointer that has the
wrong alignment for the target data type (on most types of computer),
or one that points to a part of memory that has not been allocated in
the process's address space.
5046
5047The signal terminates the program, unless the program has arranged to
5048handle the signal (@pxref{Signal Handling, The GNU C Library, , libc,
5049The GNU C Library Reference Manual}).
5050
5051However, the signal might not happen if the dereference is optimized
5052away. In the example above, if you don't subsequently use the value
5053of @code{c}, GCC might optimize away the code for @code{*foo}. You
5054can prevent such optimization using the @code{volatile} qualifier, as
5055shown here:
5056
5057@example
5058volatile char *p;
5059volatile char c;
5060c = *p;
5061@end example
5062
5063You can use this to test whether @code{p} points to unallocated
5064memory. Set up a signal handler first, so the signal won't terminate
5065the program.
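
Here is a rough sketch of such a test for GNU/Linux and other POSIX
systems.  The names @code{probe_readable}, @code{probe_handler} and
@code{probe_env} are made up for this illustration.  Returning
normally from a handler for this kind of signal does not give defined
behavior, so the handler jumps back into the testing function with
@code{siglongjmp} instead.  Treat this as a debugging aid under those
assumptions, not as a robust technique; in particular, it is not
suitable for multithreaded programs.

@example
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;

/* @r{Signal handler: abandon the faulting access.} */
static void
probe_handler (int signum)
@{
  (void) signum;
  siglongjmp (probe_env, 1);
@}

/* @r{Return 1 if one byte can be read at @code{p},}
   @r{0 if reading it causes a signal.} */
int
probe_readable (const volatile char *p)
@{
  struct sigaction probe, old_segv, old_bus;
  volatile int readable = 0;

  probe.sa_handler = probe_handler;
  sigemptyset (&probe.sa_mask);
  probe.sa_flags = 0;
  sigaction (SIGSEGV, &probe, &old_segv);
  sigaction (SIGBUS, &probe, &old_bus);

  /* @r{@code{sigsetjmp} returns 0 now, and 1 if the handler jumps back.} */
  if (sigsetjmp (probe_env, 1) == 0)
    @{
      volatile char c = *p;   /* @r{This access may signal.} */
      (void) c;
      readable = 1;
    @}

  /* @r{Put back the previous signal handling.} */
  sigaction (SIGSEGV, &old_segv, NULL);
  sigaction (SIGBUS, &old_bus, NULL);
  return readable;
@}
@end example

@noindent
The @code{volatile} qualifiers keep the compiler from optimizing away
the access to @code{*p}.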
5066
5067@node Void Pointers
5068@section Void Pointers
5069@cindex void pointers
5070@cindex pointers, void
5071
5072The peculiar type @code{void *}, a pointer whose target type is
5073@code{void}, is used often in C@. It represents a pointer to
5074we-don't-say-what. Thus,
5075
5076@example
5077void *numbered_slot_pointer (int);
5078@end example
5079
5080@noindent
5081declares a function @code{numbered_slot_pointer} that takes an
5082integer parameter and returns a pointer, but we don't say what type of
5083data it points to.
5084
5085With type @code{void *}, you can pass the pointer around and test
5086whether it is null. However, dereferencing it gives a @code{void}
5087value that can't be used (@pxref{The Void Type}). To dereference the
5088pointer, first convert it to some other pointer type.
5089
5090Assignments convert @code{void *} automatically to any other pointer
5091type, if the left operand has a pointer type; for instance,
5092
5093@example
5094@{
5095 int *p;
5096 /* @r{Converts return value to @code{int *}.} */
5097 p = numbered_slot_pointer (5);
5098 @r{@dots{}}
5099@}
5100@end example
5101
5102Passing an argument of type @code{void *} for a parameter that has a
5103pointer type also converts. For example, supposing the function
5104@code{hack} is declared to require type @code{float *} for its
5105argument, this will convert the null pointer to that type.
5106
5107@example
5108/* @r{Declare @code{hack} that way.}
5109 @r{We assume it is defined somewhere else.} */
5110void hack (float *);
5111@dots{}
5112/* @r{Now call @code{hack}.} */
5113@{
5114 /* @r{Converts return value of @code{numbered_slot_pointer}}
5115 @r{to @code{float *} to pass it to @code{hack}.} */
5116 hack (numbered_slot_pointer (5));
5117 @r{@dots{}}
5118@}
5119@end example
5120
You can also convert to another pointer type with an explicit cast
(@pxref{Explicit Type Conversion}), like this:

@example
5124(int *) numbered_slot_pointer (5)
5125@end example
5126
5127Here is an example which decides at run time which pointer
5128type to convert to:
5129
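@example
#include <stdbool.h>   /* @r{Defines @code{bool}.} */

/* @r{We assume @code{handle_an_int} and @code{handle_a_double}}
   @r{are defined somewhere else.} */
void
extract_int_or_double (void *ptr, bool its_an_int)
@{
  if (its_an_int)
    handle_an_int (*(int *)ptr);
  else
    handle_a_double (*(double *)ptr);
@}
@end example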
5140
5141The expression @code{*(int *)ptr} means to convert @code{ptr}
5142to type @code{int *}, then dereference it.
5143
5144@node Pointer Comparison
5145@section Pointer Comparison
5146@cindex pointer comparison
5147@cindex comparison, pointer
5148
5149Two pointer values are equal if they point to the same location, or if
5150they are both null. You can test for this with @code{==} and
5151@code{!=}. Here's a trivial example:
5152
5153@example
5154@{
5155 int i;
5156 int *p, *q;
5157
5158 p = &i;
5159 q = &i;
5160 if (p == q)
5161 printf ("This will be printed.\n");
5162 if (p != q)
5163 printf ("This won't be printed.\n");
5164@}
5165@end example
5166
5167Ordering comparisons such as @code{>} and @code{>=} operate on
5168pointers by converting them to unsigned integers. The C standard says
5169the two pointers must point within the same object in memory, but on
5170GNU/Linux systems these operations simply compare the numeric values
5171of the pointers.
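
For instance, an ordering comparison can serve as the end test of a
loop that walks a pointer through an array.  Here is a minimal sketch;
the function name @code{count_negatives} is made up for this
illustration.  Both pointers being compared point into the same array
(or just past its end), so the comparison is valid in standard C as
well.

@example
/* @r{Count the negative elements among the first @code{len}}
   @r{elements of @code{array}.} */
int
count_negatives (int *array, int len)
@{
  int *end = array + len;   /* @r{Points just past the last element.} */
  int count = 0;

  for (int *p = array; p < end; p++)
    if (*p < 0)
      count++;

  return count;
@}
@end example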
5172
5173The pointer values to be compared should in principle have the same type, but
5174they are allowed to differ in limited cases. First of all, if the two
5175pointers' target types are nearly compatible (@pxref{Compatible
5176Types}), the comparison is allowed.
5177
If one of the operands is @code{void *} (@pxref{Void Pointers}) and
the other is another pointer type, the comparison operator converts
the other pointer to @code{void *} so as to compare them.
(In standard C, this is not allowed if the other type is a function
pointer type, but that works in GNU C@.)
5183
Comparison operators also allow comparing the integer 0 with a pointer
value.  This works by converting 0 to a null pointer of the same type
as the other operand.
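
Here is a short sketch that uses both of these cases.  The function
name @code{same_nonnull_address} is made up for this illustration.

@example
/* @r{Return 1 if @code{generic} and @code{specific} hold}
   @r{the same address and it is not null, else 0.} */
int
same_nonnull_address (void *generic, int *specific)
@{
  if (specific == 0)            /* @r{Compares a pointer with 0.} */
    return 0;
  return generic == specific;   /* @r{Compares @code{void *} with @code{int *}.} */
@}
@end example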
5187
5188@node Pointer Arithmetic
5189@section Pointer Arithmetic
5190@cindex pointer arithmetic
5191@cindex arithmetic, pointer
5192
5193Adding an integer (positive or negative) to a pointer is valid in C@.
5194It assumes that the pointer points to an element in an array, and
5195advances or retracts the pointer across as many array elements as the
5196integer specifies. Here is an example, in which adding a positive
5197integer advances the pointer to a later element in the same array.
5198
5199@example
5200void
5201incrementing_pointers ()
5202@{
5203 int array[5] = @{ 45, 29, 104, -3, 123456 @};
5204 int elt0, elt1, elt4;
5205
5206 int *p = &array[0];
5207 /* @r{Now @code{p} points at element 0. Fetch it.} */
5208 elt0 = *p;
5209
5210 ++p;
5211 /* @r{Now @code{p} points at element 1. Fetch it.} */
5212 elt1 = *p;
5213
5214 p += 3;
5215 /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
5216 elt4 = *p;
5217
5218 printf ("elt0 %d elt1 %d elt4 %d.\n",
5219 elt0, elt1, elt4);
5220 /* @r{Prints elt0 45 elt1 29 elt4 123456.} */
5221@}
5222@end example
5223
5224Here's an example where adding a negative integer retracts the pointer
5225to an earlier element in the same array.
5226
5227@example
5228void
5229decrementing_pointers ()
5230@{
5231 int array[5] = @{ 45, 29, 104, -3, 123456 @};
5232 int elt0, elt3, elt4;
5233
5234 int *p = &array[4];
5235 /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
5236 elt4 = *p;
5237
5238 --p;
5239 /* @r{Now @code{p} points at element 3. Fetch it.} */
5240 elt3 = *p;
5241
5242 p -= 3;
5243 /* @r{Now @code{p} points at element 0. Fetch it.} */
5244 elt0 = *p;
5245
5246 printf ("elt0 %d elt3 %d elt4 %d.\n",
5247 elt0, elt3, elt4);
5248 /* @r{Prints elt0 45 elt3 -3 elt4 123456.} */
5249@}
5250@end example
5251
5252If one pointer value was made by adding an integer to another
5253pointer value, it should be possible to subtract the pointer values
5254and recover that integer. That works too in C@.
5255
5256@example
5257void
5258subtract_pointers ()
5259@{
5260 int array[5] = @{ 45, 29, 104, -3, 123456 @};
5261 int *p0, *p3, *p4;
5262
5263 int *p = &array[4];
5264 /* @r{Now @code{p} points at element 4 (the last). Save the value.} */
5265 p4 = p;
5266
5267 --p;
5268 /* @r{Now @code{p} points at element 3. Save the value.} */
5269 p3 = p;
5270
5271 p -= 3;
5272 /* @r{Now @code{p} points at element 0. Save the value.} */
5273 p0 = p;
5274
  /* @r{@samp{%td} is the @code{printf} conversion for a pointer difference.} */
  printf ("%td, %td, %td, %td\n",
          p4 - p0, p0 - p0, p3 - p0, p0 - p3);
  /* @r{Prints 4, 0, 3, -3.} */
5278@}
5279@end example
5280
5281The addition operation does not know where arrays are. All it does is
5282add the integer (multiplied by object size) to the value of the
5283pointer. When the initial pointer and the result point into a single
5284array, the result is well-defined.
5285
5286@strong{Warning:} Only experts should do pointer arithmetic involving pointers
5287into different memory objects.
5288
5289The difference between two pointers has type @code{int}, or
5290@code{long} if necessary (@pxref{Integer Types}). The clean way to
5291declare it is to use the typedef name @code{ptrdiff_t} defined in the
5292file @file{stddef.h}.
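
As a brief sketch of that, here is a function that reports how far
into an array a given element is.  The name @code{element_index} is
made up for this illustration, and the function assumes that
@code{elt} really points into the array that starts at @code{base}.

@example
#include <stddef.h>   /* @r{Defines @code{ptrdiff_t}.} */

/* @r{Return the index of the element @code{elt} points to,}
   @r{within the array that starts at @code{base}.} */
ptrdiff_t
element_index (const int *base, const int *elt)
@{
  return elt - base;   /* @r{Counts elements, not bytes.} */
@}
@end example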
5293
5294This definition of pointer subtraction is consistent with
5295pointer-integer addition, in that @code{(p3 - p1) + p1} equals
5296@code{p3}, as in ordinary algebra.
5297
5298In standard C, addition and subtraction are not allowed on @code{void
5299*}, since the target type's size is not defined in that case.
5300Likewise, they are not allowed on pointers to function types.
5301However, these operations work in GNU C, and the ``size of the target
5302type'' is taken as 1.
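
The following sketch shows the GNU C behavior next to an equivalent
that also works in standard C.  Both function names are made up for
this illustration.  The first function relies on the GNU extension, so
a compiler in a strictly conforming mode may reject it.

@example
/* @r{GNU C only: arithmetic directly on @code{void *},}
   @r{in units of 1 byte.} */
void *
skip_bytes_gnu (void *p, int nbytes)
@{
  return p + nbytes;
@}

/* @r{Standard C: convert to @code{char *} first,}
   @r{since @code{char} has size 1.} */
void *
skip_bytes_std (void *p, int nbytes)
@{
  return (char *) p + nbytes;
@}
@end example

@noindent
Given the same arguments, the two functions return the same address.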
5303
5304@node Pointers and Arrays
5305@section Pointers and Arrays
5306@cindex pointers and arrays
5307@cindex arrays and pointers
5308
5309The clean way to refer to an array element is
5310@code{@var{array}[@var{index}]}. Another, complicated way to do the
5311same job is to get the address of that element as a pointer, then
5312dereference it: @code{* (&@var{array}[0] + @var{index})} (or
5313equivalently @code{* (@var{array} + @var{index})}). This first gets a
5314pointer to element zero, then increments it with @code{+} to point to
5315the desired element, then gets the value from there.
5316
5317That pointer-arithmetic construct is the @emph{definition} of square
5318brackets in C@. @code{@var{a}[@var{b}]} means, by definition,
5319@code{*(@var{a} + @var{b})}. This definition uses @var{a} and @var{b}
5320symmetrically, so one must be a pointer and the other an integer; it
5321does not matter which comes first.
5322
5323Since indexing with square brackets is defined in terms of addition
5324and dereference, that too is symmetrical. Thus, you can write
5325@code{3[array]} and it is equivalent to @code{array[3]}. However, it
5326would be foolish to write @code{3[array]}, since it has no advantage
5327and could confuse people who read the code.
5328
5329It may seem like a discrepancy that the definition @code{*(@var{a} +
5330@var{b})} requires a pointer, but @code{array[3]} uses an array value
5331instead. Why is this valid? The name of the array, when used by
5332itself as an expression (other than in @code{sizeof}), stands for a
pointer to the array's zeroth element.  Thus, @code{array + 3}
5334converts @code{array} implicitly to @code{&array[0]}, and the result
5335is a pointer to element 3, equivalent to @code{&array[3]}.
5336
5337Since square brackets are defined in terms of such addition,
5338@code{array[3]} first converts @code{array} to a pointer. That's why
5339it works to use an array directly in that construct.
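
These equivalences can be checked directly.  Here is a small sketch;
the function name @code{indexing_forms_agree} is made up for this
illustration.

@example
/* @r{Return 1 if all the ways of reaching element 3 agree, else 0.} */
int
indexing_forms_agree (void)
@{
  int array[5] = @{ 45, 29, 104, -3, 123456 @};

  return (array[3] == *(array + 3)          /* @r{Definition of brackets.} */
          && array[3] == *(&array[0] + 3)   /* @r{Explicit pointer.} */
          && array[3] == 3[array]           /* @r{Valid, but confusing.} */
          && array + 3 == &array[3]);       /* @r{Same address.} */
@}
@end example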
5340
5341@node Pointer Arithmetic Low Level
5342@section Pointer Arithmetic at Low Level
5343@cindex pointer arithmetic, low level
5344@cindex low level pointer arithmetic
5345
5346The behavior of pointer arithmetic is theoretically defined only when
5347the pointer values all point within one object allocated in memory.
5348But the addition and subtraction operators can't tell whether the
5349pointer values are all within one object. They don't know where
5350objects start and end. So what do they really do?
5351
5352Adding pointer @var{p} to integer @var{i} treats @var{p} as a memory
5353address, which is in fact an integer---call it @var{pint}. It treats
5354@var{i} as a number of elements of the type that @var{p} points to.
5355These elements' sizes add up to @code{@var{i} * sizeof (*@var{p})}.
5356So the sum, as an integer, is @code{@var{pint} + @var{i} * sizeof
5357(*@var{p})}. This value is reinterpreted as a pointer like @var{p}.
5358
5359If the starting pointer value @var{p} and the result do not point at
5360parts of the same object, the operation is not officially legitimate,
5361and C code is not ``supposed'' to do it. But you can do it anyway,
5362and it gives precisely the results described by the procedure above.
5363In some special situations it can do something useful, but non-wizards
5364should avoid it.
5365
5366Here's a function to offset a pointer value @emph{as if} it pointed to
5367an object of any given size, by explicitly performing that calculation:
5368
5369@example
5370#include <stdint.h>
5371
5372void *
5373ptr_add (void *p, int i, int objsize)
5374@{
  intptr_t p_address = (intptr_t) p;
  intptr_t totalsize = (intptr_t) i * objsize;
5377 intptr_t new_address = p_address + totalsize;
5378 return (void *) new_address;
5379@}
5380@end example
5381
5382@noindent
5383@cindex @code{intptr_t}
5384This does the same job as @code{@var{p} + @var{i}} with the proper
5385pointer type for @var{p}. It uses the type @code{intptr_t}, which is
5386defined in the header file @file{stdint.h}. (In practice, @code{long
5387long} would always work, but it is cleaner to use @code{intptr_t}.)
5388
5389@node Pointer Increment/Decrement
5390@section Pointer Increment and Decrement
5391@cindex pointer increment and decrement
5392@cindex incrementing pointers
5393@cindex decrementing pointers
5394
5395The @samp{++} operator adds 1 to a variable. We have seen it for
5396integers (@pxref{Increment/Decrement}), but it works for pointers too.
5397For instance, suppose we have a series of positive integers,
5398terminated by a zero, and we want to add them all up.
5399
5400@example
5401int
5402sum_array_till_0 (int *p)
5403@{
5404 int sum = 0;
5405
5406 for (;;)
5407 @{
5408 /* @r{Fetch the next integer.} */
5409 int next = *p++;
5410 /* @r{Exit the loop if it's 0.} */
5411 if (next == 0)
5412 break;
5413 /* @r{Add it into running total.} */
5414 sum += next;
5415 @}
5416
5417 return sum;
5418@}
5419@end example
5420
5421@noindent
5422The statement @samp{break;} will be explained further on (@pxref{break
5423Statement}). Used in this way, it immediately exits the surrounding
5424@code{for} statement.
5425
5426@code{*p++} parses as @code{*(p++)}, because a postfix operator always
5427takes precedence over a prefix operator. Therefore, it dereferences
5428@code{p}, and increments @code{p} afterwards. Incrementing a variable
5429means adding 1 to it, as in @code{p = p + 1}. Since @code{p} is a
5430pointer, adding 1 to it advances it by the width of the datum it
5431points to---in this case, one @code{int}. Therefore, each iteration
5432of the loop picks up the next integer from the series and puts it into
5433@code{next}.
5434
This @code{for}-loop has no initialization expression, since @code{p}
and @code{sum} are already initialized.  It has no end-test, since the
@samp{break;} statement will exit it.  And it needs no expression to
advance it, since that's done within the loop by incrementing
@code{p}.  Thus, those three expressions after @code{for} are left
empty.
5441
5442Another way to write this function is by keeping the parameter value unchanged
5443and using indexing to access the integers in the table.
5444
5445@example
5446int
5447sum_array_till_0_indexing (int *p)
5448@{
5449 int i;
5450 int sum = 0;
5451
5452 for (i = 0; ; i++)
5453 @{
5454 /* @r{Fetch the next integer.} */
5455 int next = p[i];
5456 /* @r{Exit the loop if it's 0.} */
5457 if (next == 0)
5458 break;
5459 /* @r{Add it into running total.} */
5460 sum += next;
5461 @}
5462
5463 return sum;
5464@}
5465@end example
5466
5467In this program, instead of advancing @code{p}, we advance @code{i}
5468and add it to @code{p}. (Recall that @code{p[i]} means @code{*(p +
5469i)}.) Either way, it uses the same address to get the next integer.
5470
5471It makes no difference in this program whether we write @code{i++} or
5472@code{++i}, because the value is not used. All that matters is the
5473effect, to increment @code{i}.
5474
5475The @samp{--} operator also works on pointers; it can be used
5476to scan backwards through an array, like this:
5477
5478@example
5479int
5480after_last_nonzero (int *p, int len)
5481@{
5482 /* @r{Set up @code{q} to point just after the last array element.} */
5483 int *q = p + len;
5484
5485 while (q != p)
5486 /* @r{Step @code{q} back until it reaches a nonzero element.} */
5487 if (*--q != 0)
5488 /* @r{Return the index of the element after that nonzero.} */
5489 return q - p + 1;
5490
5491 return 0;
5492@}
5493@end example
5494
5495That function returns the length of the nonzero part of the
5496array specified by its arguments; that is, the index of the
5497first zero of the run of zeros at the end.
5498
5499@node Pointer Arithmetic Drawbacks
5500@section Drawbacks of Pointer Arithmetic
5501@cindex drawbacks of pointer arithmetic
5502@cindex pointer arithmetic, drawbacks
5503
5504Pointer arithmetic is clean and elegant, but it is also the cause of a
5505major security flaw in the C language. Theoretically, it is only
5506valid to adjust a pointer within one object allocated as a unit in
5507memory. However, if you unintentionally adjust a pointer across the
5508bounds of the object and into some other object, the system has no way
5509to detect this error.
5510
5511A bug which does that can easily result in clobbering part of another
5512object. For example, with @code{array[-1]} you can read or write the
5513nonexistent element before the beginning of an array---probably part
5514of some other data.
5515
5516Combining pointer arithmetic with casts between pointer types, you can
5517create a pointer that fails to be properly aligned for its type. For
5518example,
5519
5520@example
5521int a[2];
5522char *pa = (char *)a;
5523int *p = (int *)(pa + 1);
5524@end example
5525
5526@noindent
5527gives @code{p} a value pointing to an ``integer'' that includes part
5528of @code{a[0]} and part of @code{a[1]}. Dereferencing that with
5529@code{*p} can cause a fatal @code{SIGSEGV} signal or it can return the
contents of that badly aligned @code{int} (@pxref{Signals}).  If it
5531``works,'' it may be quite slow. It can also cause aliasing
5532confusions (@pxref{Aliasing}).
5533
5534@strong{Warning:} Using improperly aligned pointers is risky---don't do it
5535unless it is really necessary.
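
When a program really must read or write an @code{int} at an address
that might not be aligned, one common approach is to copy the bytes
with @code{memcpy} rather than dereference a misaligned @code{int *}.
Here is a minimal sketch of that approach; the function names are made
up for this illustration.

@example
#include <string.h>   /* @r{Declares @code{memcpy}.} */

/* @r{Fetch an @code{int} whose bytes start at @code{address},}
   @r{whether or not that address is aligned.} */
int
load_int_unaligned (const void *address)
@{
  int value;
  memcpy (&value, address, sizeof value);
  return value;
@}

/* @r{Store @code{value} in the @code{sizeof (int)} bytes}
   @r{that start at @code{address}.} */
void
store_int_unaligned (void *address, int value)
@{
  memcpy (address, &value, sizeof value);
@}
@end example

@noindent
Since @code{memcpy} works byte by byte as far as the program can tell,
no misaligned @code{int *} is ever dereferenced.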
5536
5537@node Pointer-Integer Conversion
5538@section Pointer-Integer Conversion
5539@cindex pointer-integer conversion
5540@cindex conversion between pointers and integers
5541@cindex @code{uintptr_t}
5542
5543On modern computers, an address is simply a number. It occupies the
5544same space as some size of integer. In C, you can convert a pointer
5545to the appropriate integer types and vice versa, without losing
5546information. The appropriate integer types are @code{uintptr_t} (an
5547unsigned type) and @code{intptr_t} (a signed type). Both are defined
5548in @file{stdint.h}.
5549
5550For instance,
5551
5552@example
5553#include <stdint.h>
5554#include <stdio.h>
5555
5556void
5557print_pointer (void *ptr)
5558@{
5559 uintptr_t converted = (uintptr_t) ptr;
5560
  printf ("Pointer value is 0x%lx\n",
          (unsigned long) converted);
@}
@end example

@noindent
The specification @samp{%lx} in the template (the first argument) for
@code{printf} means to represent this argument, an @code{unsigned
long}, using hexadecimal notation.  (On GNU/Linux systems,
@code{unsigned long} is wide enough to hold any pointer value, whereas
@code{unsigned int} is too narrow on 64-bit machines.)  It's cleaner
to use @code{uintptr_t}, since hexadecimal printing treats the number
as unsigned, but it won't actually matter: all @code{printf} gets to
see is the series of bits in the number.
5572
5573@strong{Warning:} Converting pointers to integers is risky---don't do
5574it unless it is really necessary.
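
One case where the conversion is useful is checking the alignment of
an address, which requires treating the address as a number.  A
minimal sketch follows; the function name @code{is_aligned} is made up
for this illustration.

@example
#include <stddef.h>   /* @r{Defines @code{size_t}.} */
#include <stdint.h>   /* @r{Defines @code{uintptr_t}.} */

/* @r{Return 1 if the address in @code{ptr} is a multiple}
   @r{of @code{alignment}, else 0.} */
int
is_aligned (const void *ptr, size_t alignment)
@{
  return (uintptr_t) ptr % alignment == 0;
@}
@end example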
5575
5576@node Printing Pointers
5577@section Printing Pointers
5578
5579To print the numeric value of a pointer, use the @samp{%p} specifier.
5580For example:
5581
5582@example
5583void
5584print_pointer (void *ptr)
5585@{
5586 printf ("Pointer value is %p\n", ptr);
5587@}
5588@end example
5589
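The specification @samp{%p} works in practice with any pointer type on
GNU systems, though strictly speaking standard C specifies that its
argument have type @code{void *}.  It prints @samp{0x} followed by the
address in hexadecimal, printed as the appropriate unsigned integer
type.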
5593
5594@node Structures
5595@chapter Structures
5596@cindex structures
5597@findex struct
5598@cindex fields in structures
5599
5600A @dfn{structure} is a user-defined data type that holds various
5601@dfn{fields} of data. Each field has a name and a data type specified
5602in the structure's definition.
5603
5604Here we define a structure suitable for storing a linked list of
5605integers. Each list item will hold one integer, plus a pointer
5606to the next item.
5607
5608@example
5609struct intlistlink
5610 @{
5611 int datum;
5612 struct intlistlink *next;
5613 @};
5614@end example
5615
5616The structure definition has a @dfn{type tag} so that the code can
5617refer to this structure. The type tag here is @code{intlistlink}.
5618The definition refers recursively to the same structure through that
5619tag.
5620
5621You can define a structure without a type tag, but then you can't
5622refer to it again. That is useful only in some special contexts, such
5623as inside a @code{typedef} or a @code{union}.
5624
5625The contents of the structure are specified by the @dfn{field
5626declarations} inside the braces. Each field in the structure needs a
5627declaration there. The fields in one structure definition must have
5628distinct names, but these names do not conflict with any other names
5629in the program.
5630
5631A field declaration looks just like a variable declaration. You can
5632combine field declarations with the same beginning, just as you can
5633combine variable declarations.
5634
5635This structure has two fields. One, named @code{datum}, has type
5636@code{int} and will hold one integer in the list. The other, named
5637@code{next}, is a pointer to another @code{struct intlistlink}
5638which would be the rest of the list. In the last list item, it would
5639be @code{NULL}.
5640
5641This structure definition is recursive, since the type of the
5642@code{next} field refers to the structure type. Such recursion is not
5643a problem; in fact, you can use the type @code{struct intlistlink *}
5644before the definition of the type @code{struct intlistlink} itself.
5645That works because pointers to all kinds of structures really look the
5646same at the machine level.
5647
5648After defining the structure, you can declare a variable of type
5649@code{struct intlistlink} like this:
5650
5651@example
5652struct intlistlink foo;
5653@end example
5654
5655The structure definition itself can serve as the beginning of a
5656variable declaration, so you can declare variables immediately after,
5657like this:
5658
5659@example
5660struct intlistlink
5661 @{
5662 int datum;
5663 struct intlistlink *next;
5664 @} foo;
5665@end example
5666
5667@noindent
5668But that is ugly. It is almost always clearer to separate the
5669definition of the structure from its uses.
5670
5671Declaring a structure type inside a block (@pxref{Blocks}) limits
5672the scope of the structure type name to that block. That means the
5673structure type is recognized only within that block. Declaring it in
5674a function parameter list, as here,
5675
5676@example
int f (struct foo @{int a, b;@} parm);
5678@end example
5679
5680@noindent
5681(assuming that @code{struct foo} is not already defined) limits the
5682scope of the structure type @code{struct foo} to that parameter list;
5683that is basically useless, so it triggers a warning.
5684
5685Standard C requires at least one field in a structure.
5686GNU C does not require this.
5687
5688@menu
5689* Referencing Fields:: Accessing field values in a structure object.
5690* Dynamic Memory Allocation:: Allocating space for objects
5691 while the program is running.
5692* Field Offset:: Memory layout of fields within a structure.
5693* Structure Layout:: Planning the memory layout of fields.
5694* Packed Structures:: Packing structure fields as close as possible.
5695* Bit Fields:: Dividing integer fields
5696 into fields with fewer bits.
5697* Bit Field Packing:: How bit fields pack together in integers.
5698* const Fields:: Making structure fields immutable.
5699* Zero Length:: Zero-length array as a variable-length object.
5700* Flexible Array Fields:: Another approach to variable-length objects.
5701* Overlaying Structures:: Casting one structure type
5702 over an object of another structure type.
5703* Structure Assignment:: Assigning values to structure objects.
5704* Unions:: Viewing the same object in different types.
5705* Packing With Unions:: Using a union type to pack various types into
5706 the same memory space.
* Cast to Union::                  Casting a value of one of the union's alternative
                                     types to the type of the union itself.
5709* Structure Constructors:: Building new structure objects.
5710* Unnamed Types as Fields:: Fields' types do not always need names.
5711* Incomplete Types:: Types which have not been fully defined.
* Intertwined Incomplete Types::   Defining mutually-recursive structure types.
5713* Type Tags:: Scope of structure and union type tags.
5714@end menu
5715
5716@node Referencing Fields
5717@section Referencing Structure Fields
5718@cindex referencing structure fields
5719@cindex structure fields, referencing
5720
5721To make a structure useful, there has to be a way to examine and store
5722its fields. The @samp{.} (period) operator does that; its use looks
5723like @code{@var{object}.@var{field}}.
5724
5725Given this structure and variable,
5726
5727@example
5728struct intlistlink
5729 @{
5730 int datum;
5731 struct intlistlink *next;
5732 @};
5733
5734struct intlistlink foo;
5735@end example
5736
5737@noindent
5738you can write @code{foo.datum} and @code{foo.next} to refer to the two
5739fields in the value of @code{foo}. These fields are lvalues, so you
5740can store values into them, and read the values out again.
5741
5742Most often, structures are dynamically allocated (see the next
5743section), and we refer to the objects via pointers.
5744@code{(*p).@var{field}} is somewhat cumbersome, so there is an
5745abbreviation: @code{p->@var{field}}. For instance, assume the program
5746contains this declaration:
5747
5748@example
5749struct intlistlink *ptr;
5750@end example
5751
5752@noindent
5753You can write @code{ptr->datum} and @code{ptr->next} to refer
5754to the two fields in the object that @code{ptr} points to.
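
As a sketch of how this looks in practice, here is a function that
adds up the integers in a list of these structures by following the
@code{next} pointers.  The function name @code{sum_intlist} is made up
for this illustration.

@example
#include <stddef.h>   /* @r{Defines @code{NULL}.} */

struct intlistlink
  @{
    int datum;
    struct intlistlink *next;
  @};

/* @r{Add up the integers in the list that starts at @code{ptr}.} */
int
sum_intlist (struct intlistlink *ptr)
@{
  int sum = 0;

  while (ptr != NULL)
    @{
      sum += ptr->datum;   /* @r{Same as @code{(*ptr).datum}.} */
      ptr = ptr->next;     /* @r{Step to the next list item.} */
    @}

  return sum;
@}
@end example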
5755
5756If a unary operator precedes an expression using @samp{->},
5757the @samp{->} nests inside:
5758
5759@example
5760 -ptr->datum @r{is equivalent to} -(ptr->datum)
5761@end example
5762
5763You can intermix @samp{->} and @samp{.} without parentheses,
5764as shown here:
5765
5766@example
5767struct @{ double d; struct intlistlink l; @} foo;
5768
5769@r{@dots{}}foo.l.next->next->datum@r{@dots{}}
5770@end example
5771
5772@node Dynamic Memory Allocation
5773@section Dynamic Memory Allocation
5774@cindex dynamic memory allocation
5775@cindex memory allocation, dynamic
5776@cindex allocating memory dynamically
5777
5778To allocate an object dynamically, call the library function
5779@code{malloc} (@pxref{Basic Allocation, The GNU C Library,, libc, The GNU C Library
5780Reference Manual}). Here is how to allocate an object of type
5781@code{struct intlistlink}. To make this code work, include the file
5782@file{stdlib.h}, like this:
5783
5784@example
5785#include <stddef.h> /* @r{Defines @code{NULL}.} */
5786#include <stdlib.h> /* @r{Declares @code{malloc}.} */
5787
5788@dots{}
5789
5790struct intlistlink *
5791alloc_intlistlink ()
5792@{
5793 struct intlistlink *p;
5794
5795 p = malloc (sizeof (struct intlistlink));
5796
5797 if (p == NULL)
5798 fatal ("Ran out of storage");
5799
5800 /* @r{Initialize the contents.} */
5801 p->datum = 0;
5802 p->next = NULL;
5803
5804 return p;
5805@}
5806@end example
5807
5808@noindent
5809@code{malloc} returns @code{void *}, so the assignment to @code{p}
5810will automatically convert it to type @code{struct intlistlink *}.
5811The return value of @code{malloc} is always sufficiently aligned
5812(@pxref{Type Alignment}) that it is valid for any data type.
5813
5814The test for @code{p == NULL} is necessary because @code{malloc}
5815returns a null pointer if it cannot get any storage. We assume that
5816the program defines the function @code{fatal} to report a fatal error
5817to the user.
5818
5819Here's how to add one more integer to the front of such a list:
5820
5821@example
struct intlistlink *mylist = NULL;
5823
5824void
5825add_to_mylist (int my_int)
5826@{
5827 struct intlistlink *p = alloc_intlistlink ();
5828
5829 p->datum = my_int;
5830 p->next = mylist;
5831 mylist = p;
5832@}
5833@end example
5834
5835The way to free the objects is by calling @code{free}. Here's
5836a function to free all the links in one of these lists:
5837
5838@example
5839void
5840free_intlist (struct intlistlink *p)
5841@{
5842 while (p)
5843 @{
5844 struct intlistlink *q = p;
5845 p = p->next;
5846 free (q);
5847 @}
5848@}
5849@end example
5850
5851We must extract the @code{next} pointer from the object before freeing
5852it, because @code{free} can clobber the data that was in the object.
5853For the same reason, the program must not use the list any more after
5854freeing its elements. To make sure it won't, it is best to clear out
5855the variable where the list was stored, like this:
5856
5857@example
5858free_intlist (mylist);
5859
5860mylist = NULL;
5861@end example
5862
5863@node Field Offset
5864@section Field Offset
5865@cindex field offset
5866@cindex structure field offset
5867@cindex offset of structure fields
5868
5869To determine the offset of a given field @var{field} in a structure
5870type @var{type}, use the macro @code{offsetof}, which is defined in
5871the file @file{stddef.h}. It is used like this:
5872
5873@example
5874offsetof (@var{type}, @var{field})
5875@end example
5876
5877Here is an example:
5878
5879@example
5880struct foo
5881@{
5882 int element;
5883 struct foo *next;
5884@};
5885
5886offsetof (struct foo, next)
/* @r{On most 64-bit machines that is 8.  On most 32-bit machines it is 4.} */
5888@end example
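
To see the actual values on a particular machine, a program can print
them.  Here is a small sketch; the function name
@code{print_foo_offsets} is made up for this illustration.  The value
of @code{offsetof} has type @code{size_t}, which @code{printf} prints
with @samp{%zu}.

@example
#include <stddef.h>   /* @r{Defines @code{offsetof}.} */
#include <stdio.h>    /* @r{Declares @code{printf}.} */

struct foo
@{
  int element;
  struct foo *next;
@};

/* @r{Print the offset of each field of @code{struct foo}.} */
void
print_foo_offsets (void)
@{
  printf ("element is at offset %zu\n",
          offsetof (struct foo, element));
  printf ("next is at offset %zu\n",
          offsetof (struct foo, next));
@}
@end example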
5889
5890@node Structure Layout
5891@section Structure Layout
5892@cindex structure layout
5893@cindex layout of structures
5894
5895The rest of this chapter covers advanced topics about structures. If
5896you are just learning C, you can skip it.
5897
5898The precise layout of a @code{struct} type is crucial when using it to
5899overlay hardware registers, to access data structures in shared
5900memory, or to assemble and disassemble packets for network
5901communication. It is also important for avoiding memory waste when
5902the program makes many objects of that type. However, the layout
5903depends on the target platform. Each platform has conventions for
5904structure layout, which compilers need to follow.
5905
5906Here are the conventions used on most platforms.
5907
5908The structure's fields appear in the structure layout in the order
5909they are declared. When possible, consecutive fields occupy
5910consecutive bytes within the structure. However, if a field's type
5911demands more alignment than it would get that way, C gives it the
5912alignment it requires by leaving a gap after the previous field.
5913
5914Once all the fields have been laid out, it is possible to determine
5915the structure's alignment and size. The structure's alignment is the
5916maximum alignment of any of the fields in it. Then the structure's
5917size is rounded up to a multiple of its alignment. That may require
5918leaving a gap at the end of the structure.
5919
5920Here are some examples, where we assume that @code{char} has size and
5921alignment 1 (always true), and @code{int} has size and alignment 4
5922(true on most kinds of computers):
5923
5924@example
5925struct foo
5926@{
5927 char a, b;
5928 int c;
5929@};
5930@end example
5931
5932@noindent
5933This structure occupies 8 bytes, with an alignment of 4. @code{a} is
5934at offset 0, @code{b} is at offset 1, and @code{c} is at offset 4.
5935There is a gap of 2 bytes before @code{c}.
5936
5937Contrast that with this structure:
5938
5939@example
5940struct foo
5941@{
5942 char a;
5943 int c;
5944 char b;
5945@};
5946@end example
5947
5948This structure has size 12 and alignment 4. @code{a} is at offset 0,
5949@code{c} is at offset 4, and @code{b} is at offset 8. There are two
5950gaps: three bytes before @code{c}, and three bytes at the end.
5951
5952These two structures have the same contents at the C level, but one
5953takes 8 bytes and the other takes 12 bytes due to the ordering of the
5954fields. A reliable way to avoid this sort of wastage is to order the
5955fields by size, biggest fields first.
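
The gaps described above can be measured rather than computed by hand.
This sketch declares the two layouts under distinct, hypothetical tags
(@code{chars_first} and @code{int_in_middle}) and reports how many
padding bytes each one contains. With a 4-byte, 4-aligned @code{int}
we would expect 2 and 6, but the code does not assume that.

```c
#include <stddef.h>

struct chars_first
{
  char a, b;
  int c;
};

struct int_in_middle
{
  char a;
  int c;
  char b;
};

/* Padding is the total size minus the sizes of the fields.  */
size_t
padding_chars_first (void)
{
  return sizeof (struct chars_first) - (2 * sizeof (char) + sizeof (int));
}

size_t
padding_int_in_middle (void)
{
  return sizeof (struct int_in_middle) - (2 * sizeof (char) + sizeof (int));
}
```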
5956
5957@node Packed Structures
5958@section Packed Structures
5959@cindex packed structures
5960@cindex @code{__attribute__((packed))}
5961
5962In GNU C you can force a structure to be laid out with no gaps by
5963adding @code{__attribute__((packed))} after @code{struct} (or at the
5964end of the structure type declaration). Here's an example:
5965
5966@example
5967struct __attribute__((packed)) foo
5968@{
5969 char a;
5970 int c;
5971 char b;
5972@};
5973@end example
5974
5975Without @code{__attribute__((packed))}, this structure occupies 12
5976bytes (as described in the previous section), assuming 4-byte
5977alignment for @code{int}. With @code{__attribute__((packed))}, it is
5978only 6 bytes long---the sum of the lengths of its fields.
5979
5980Use of @code{__attribute__((packed))} often results in fields that
5981don't have the normal alignment for their types. Taking the address
5982of such a field can result in an invalid pointer because of its
improper alignment. Dereferencing such a pointer can cause a fatal
signal (typically @code{SIGBUS} or @code{SIGSEGV}) on a machine that
doesn't, in general, allow unaligned pointers.
5986
5987@xref{Attributes}.
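
Here is a sketch of one way to work with a misaligned field without
ever forming a pointer to it. It assumes a compiler that accepts
@code{__attribute__((packed))}; the tag @code{packed_foo} and the
functions @code{get_c} and @code{set_c} are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct __attribute__((packed)) packed_foo
{
  char a;
  int c;     /* At offset 1, so usually misaligned for int.  */
  char b;
};

/* Read field c by value.  The access goes through the packed
   structure type, so the compiler can allow for the misalignment.  */
int
get_c (const struct packed_foo *p)
{
  return p->c;
}

/* Store into field c by copying bytes, instead of creating
   an `int *' that points at the misaligned field.  */
void
set_c (struct packed_foo *p, int value)
{
  memcpy ((char *) p + offsetof (struct packed_foo, c),
          &value, sizeof value);
}
```

A plain assignment @code{p->c = value} would work as well; the
@code{memcpy} form is shown because the same technique applies when
only a byte address is available.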
5988
5989@node Bit Fields
5990@section Bit Fields
5991@cindex bit fields
5992
5993A structure field declaration with an integer type can specify the
5994number of bits the field should occupy. We call that a @dfn{bit
5995field}. These are useful because consecutive bit fields are packed
5996into a larger storage unit. For instance,
5997
5998@example
5999unsigned char opcode: 4;
6000@end example
6001
6002@noindent
6003specifies that this field takes just 4 bits.
6004Since it is unsigned, its possible values range
6005from 0 to 15. A signed field with 4 bits, such as this,
6006
6007@example
6008signed char small: 4;
6009@end example
6010
6011@noindent
6012can hold values from -8 to 7.
6013
6014You can subdivide a single byte into those two parts by writing
6015
6016@example
6017unsigned char opcode: 4;
6018signed char small: 4;
6019@end example
6020
6021@noindent
6022in the structure. With bit fields, these two numbers fit into
6023a single @code{char}.
6024
6025Here's how to declare a one-bit field that can hold either 0 or 1:
6026
6027@example
6028unsigned char special_flag: 1;
6029@end example
6030
6031You can also use the @code{bool} type for bit fields:
6032
6033@example
6034bool special_flag: 1;
6035@end example
6036
6037Except when using @code{bool} (which is always unsigned,
6038@pxref{Boolean Type}), always specify @code{signed} or @code{unsigned}
6039for a bit field. There is a default, if that's not specified: the bit
6040field is signed if plain @code{char} is signed, except that the option
6041@option{-funsigned-bitfields} forces unsigned as the default. But it
6042is cleaner not to depend on this default.
6043
6044Bit fields are special in that you cannot take their address with
6045@samp{&}. They are not stored with the size and alignment appropriate
6046for the specified type, so they cannot be addressed through pointers
6047to that type.
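
To see these declarations in context, here is a small sketch that puts
the fields from this section into one structure. The tag @code{insn}
and the function @code{make_insn} are hypothetical. Bit fields of
type @code{char} are accepted by GNU C, though ISO C leaves that up
to the implementation.

```c
struct insn
{
  unsigned char opcode: 4;         /* Holds 0 to 15.  */
  signed char small: 4;            /* Holds -8 to 7.  */
  unsigned char special_flag: 1;   /* Holds 0 or 1.  */
};

/* Build an instruction.  OPCODE is reduced modulo 16 when stored,
   as happens with any unsigned type.  SMALL should be in range.  */
struct insn
make_insn (unsigned int opcode, int small, int flag)
{
  struct insn i;

  i.opcode = opcode;
  i.small = small;
  i.special_flag = flag != 0;
  return i;
}
```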
6048
6049@node Bit Field Packing
6050@section Bit Field Packing
6051
6052Programs to communicate with low-level hardware interfaces need to
6053define bit fields laid out to match the hardware data. This section
6054explains how to do that.
6055
6056Consecutive bit fields are packed together, but each bit field must
6057fit within a single object of its specified type. In this example,
6058
6059@example
6060unsigned short a : 3, b : 3, c : 3, d : 3, e : 3;
6061@end example
6062
6063@noindent
6064all five fields fit consecutively into one two-byte @code{short}.
6065They need 15 bits, and one @code{short} provides 16. By contrast,
6066
6067@example
6068unsigned char a : 3, b : 3, c : 3, d : 3, e : 3;
6069@end example
6070
6071@noindent
6072needs three bytes. It fits @code{a} and @code{b} into one
6073@code{char}, but @code{c} won't fit in that @code{char} (they would
6074add up to 9 bits). So @code{c} and @code{d} go into a second
6075@code{char}, leaving a gap of two bits between @code{b} and @code{c}.
6076Then @code{e} needs a third @code{char}. By contrast,
6077
6078@example
6079unsigned char a : 3, b : 3;
6080unsigned int c : 3;
6081unsigned char d : 3, e : 3;
6082@end example
6083
6084@noindent
6085needs only two bytes: the type @code{unsigned int}
6086allows @code{c} to straddle bytes that are in the same word.
6087
6088You can leave a gap of a specified number of bits by defining a
6089nameless bit field. This looks like @code{@var{type} : @var{nbits};}.
6090It is allocated space in the structure just as a named bit field would
6091be allocated.
6092
6093You can force the following bit field to advance to the following
6094aligned memory object with @code{@var{type} : 0;}.
6095
6096Both of these constructs can syntactically share @var{type} with
6097ordinary bit fields. This example illustrates both:
6098
6099@example
6100unsigned int a : 5, : 3, b : 5, : 0, c : 5, : 3, d : 5;
6101@end example
6102
6103@noindent
6104It puts @code{a} and @code{b} into one @code{int}, with a 3-bit gap
6105between them. Then @code{: 0} advances to the next @code{int},
6106so @code{c} and @code{d} fit into that one.
6107
6108These rules for packing bit fields apply to most target platforms,
6109including all the usual real computers. A few embedded controllers
6110have special layout rules.
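
Since the layout depends on the platform, it can be worth checking it
with @code{sizeof}. The sketch below wraps three of the declarations
above in structures with hypothetical tags. On a platform that
follows the usual rules, with 2-byte @code{short} and 4-byte
@code{int}, we would expect the sizes 2, 3 and 8.

```c
#include <stddef.h>

struct five_in_short
{
  unsigned short a : 3, b : 3, c : 3, d : 3, e : 3;
};

struct five_in_chars
{
  unsigned char a : 3, b : 3, c : 3, d : 3, e : 3;
};

struct with_gaps
{
  unsigned int a : 5, : 3, b : 5, : 0, c : 5, : 3, d : 5;
};

/* Report the size the compiler chose for each layout.  */
size_t
size_five_in_short (void)
{
  return sizeof (struct five_in_short);
}

size_t
size_five_in_chars (void)
{
  return sizeof (struct five_in_chars);
}

size_t
size_with_gaps (void)
{
  return sizeof (struct with_gaps);
}
```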
6111
6112@node const Fields
6113@section @code{const} Fields
6114@cindex const fields
6115@cindex structure fields, constant
6116
6117@c ??? Is this a C standard feature?
6118
6119A structure field declared @code{const} cannot be assigned to
6120(@pxref{const}). For instance, let's define this modified version of
6121@code{struct intlistlink}:
6122
6123@example
6124struct intlistlink_ro /* @r{``ro'' for read-only.} */
6125 @{
6126 const int datum;
6127 struct intlistlink *next;
6128 @};
6129@end example
6130
6131This structure can be used to prevent part of the code from modifying
6132the @code{datum} field:
6133
6134@example
6135/* @r{@code{p} has type @code{struct intlistlink *}.}
6136 @r{Convert it to @code{struct intlistlink_ro *}.} */
6137struct intlistlink_ro *q
6138 = (struct intlistlink_ro *) p;
6139
6140q->datum = 5; /* @r{Error!} */
6141p->datum = 5; /* @r{Valid since @code{*p} is}
6142 @r{not a @code{struct intlistlink_ro}.} */
6143@end example
6144
6145A @code{const} field can get a value in two ways: by initialization of
6146the whole structure, and by making a pointer-to-structure point to an object
6147in which that field already has a value.
6148
6149Any @code{const} field in a structure type makes assignment impossible
6150for structures of that type (@pxref{Structure Assignment}). That is
6151because structure assignment works by assigning the structure's
6152fields, one by one.
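
Since assignment is ruled out, a dynamically allocated object with a
@code{const} field needs some other way to receive its contents. One
possible approach, sketched here, is to initialize a local structure
and copy its bytes into the new object. The function
@code{make_link_ro} is hypothetical, and for the sake of a
self-contained example the @code{next} field points to the read-only
type.

```c
#include <stdlib.h>
#include <string.h>

struct intlistlink_ro
{
  const int datum;
  struct intlistlink_ro *next;
};

/* Allocate a link whose datum is fixed from now on.
   Returns NULL if allocation fails.  */
struct intlistlink_ro *
make_link_ro (int datum, struct intlistlink_ro *next)
{
  /* Initializing the whole structure gives `datum' its value.  */
  struct intlistlink_ro init = { datum, next };
  struct intlistlink_ro *p = malloc (sizeof *p);

  if (p != NULL)
    memcpy (p, &init, sizeof *p);   /* `*p = init;' is not allowed.  */
  return p;
}
```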
6153
6154@node Zero Length
6155@section Arrays of Length Zero
6156@cindex array of length zero
6157@cindex zero-length arrays
6158@cindex length-zero arrays
6159
6160GNU C allows zero-length arrays. They are useful as the last element
6161of a structure that is really a header for a variable-length object.
6162Here's an example, where we construct a variable-size structure
6163to hold a line which is @code{this_length} characters long:
6164
6165@example
6166struct line @{
6167 int length;
6168 char contents[0];
6169@};
6170
6171struct line *thisline
6172 = ((struct line *)
6173 malloc (sizeof (struct line)
6174 + this_length));
6175thisline->length = this_length;
6176@end example
6177
6178In ISO C90, we would have to give @code{contents} a length of 1, which
6179means either wasting space or complicating the argument to @code{malloc}.
6180
6181@node Flexible Array Fields
6182@section Flexible Array Fields
6183@cindex flexible array fields
6184@cindex array fields, flexible
6185
6186The C99 standard adopted a more complex equivalent of zero-length
6187array fields. It's called a @dfn{flexible array}, and it's indicated
6188by omitting the length, like this:
6189
6190@example
6191struct line
6192@{
6193 int length;
6194 char contents[];
6195@};
6196@end example
6197
6198The flexible array has to be the last field in the structure, and there
6199must be other fields before it.
6200
6201Under the C standard, a structure with a flexible array can't be part
6202of another structure, and can't be an element of an array.
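
Here is a sketch of how such a structure might be allocated and
filled in. The function @code{make_line} is a hypothetical helper; it
stores exactly @code{length} bytes and does not add a terminating null
character.

```c
#include <stdlib.h>
#include <string.h>

struct line
{
  int length;
  char contents[];
};

/* Make a struct line holding a copy of the LENGTH bytes at TEXT.
   Returns NULL if LENGTH is negative or allocation fails.  */
struct line *
make_line (const char *text, int length)
{
  struct line *l;

  if (length < 0)
    return NULL;
  /* sizeof does not count the flexible array, so add its size.  */
  l = malloc (sizeof (struct line) + (size_t) length);
  if (l == NULL)
    return NULL;
  l->length = length;
  memcpy (l->contents, text, (size_t) length);
  return l;
}
```

The whole object, header and contents together, is released with a
single call to @code{free}.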
6203
6204GNU C allows static initialization of flexible array fields. The effect
6205is to ``make the array long enough'' for the initializer.
6206
6207@example
6208struct f1 @{ int x; int y[]; @} f1
6209 = @{ 1, @{ 2, 3, 4 @} @};
6210@end example
6211
6212@noindent
6213This defines a structure variable named @code{f1}
6214whose type is @code{struct f1}. In C, a variable name or function name
6215never conflicts with a structure type tag.
6216
6217Omitting the flexible array field's size lets the initializer
6218determine it. This is allowed only when the flexible array is defined
6219in the outermost structure and you declare a variable of that
6220structure type. For example:
6221
6222@example
6223struct foo @{ int x; int y[]; @};
6224struct bar @{ struct foo z; @};
6225
6226struct foo a = @{ 1, @{ 2, 3, 4 @} @}; // @r{Valid.}
6227struct bar b = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
6228struct bar c = @{ @{ 1, @{ @} @} @}; // @r{Valid.}
struct foo d[1] = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
6230@end example
6231
6232@node Overlaying Structures
6233@section Overlaying Different Structures
6234@cindex overlaying structures
6235@cindex structures, overlaying
6236
6237Be careful about using different structure types to refer to the same
6238memory within one function, because GNU C can optimize code assuming
6239it never does that. @xref{Aliasing}. Here's an example of the kind of
6240aliasing that can cause the problem:
6241
6242@example
6243struct a @{ int size; char *data; @};
6244struct b @{ int size; char *data; @};
struct a foo;
struct a *p = &foo;
struct b *q = (struct b *) &foo;
@end example

Here @code{p} and @code{q} both point to the memory that the variable
@code{foo} occupies, but through two different types. The two types
6251@code{struct a} and @code{struct b} are defined alike, but they are
6252not the same type. Interspersing references using the two types,
6253like this,
6254
6255@example
6256p->size = 0;
6257q->size = 1;
6258x = p->size;
6259@end example
6260
6261@noindent
6262allows GNU C to assume that @code{p->size} is still zero when it is
6263copied into @code{x}. The compiler ``knows'' that @code{q} points to
6264a @code{struct b} and this cannot overlap with a @code{struct a}.
6265
6266Other compilers might also do this optimization. The ISO C standard
6267considers such code erroneous, precisely so that this optimization
6268will be valid.
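
One way to stay clear of the problem, sketched below, is to convert
between the two types by copying the fields instead of by casting a
pointer. The function @code{b_from_a} is a hypothetical name; this is
only one possible approach.

```c
struct a { int size; char *data; };
struct b { int size; char *data; };

/* Make a struct b with the same contents as *SRC.  Each object is
   accessed only through its own type, so no aliasing is involved.  */
struct b
b_from_a (const struct a *src)
{
  struct b result;

  result.size = src->size;
  result.data = src->data;
  return result;
}
```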
6269
6270@node Structure Assignment
6271@section Structure Assignment
6272@cindex structure assignment
6273@cindex assigning structures
6274
6275Assignment operating on a structure type copies the structure. The
6276left and right operands must have the same type. Here is an example:
6277
6278@example
6279#include <stddef.h> /* @r{Defines @code{NULL}.} */
6280#include <stdlib.h> /* @r{Declares @code{malloc}.} */
6281@r{@dots{}}
6282
6283struct point @{ double x, y; @};
6284
6285struct point *
6286copy_point (struct point point)
6287@{
6288 struct point *p
6289 = (struct point *) malloc (sizeof (struct point));
6290 if (p == NULL)
6291 fatal ("Out of memory");
6292 *p = point;
6293 return p;
6294@}
6295@end example
6296
6297Notionally, assignment on a structure type works by copying each of
6298the fields. Thus, if any of the fields has the @code{const}
6299qualifier, that structure type does not allow assignment:
6300
6301@example
6302struct point @{ const double x, y; @};
6303
6304struct point a, b;
6305
6306a = b; /* @r{Error!} */
6307@end example
6308
6309@xref{Assignment Expressions}.
6310
6311@node Unions
6312@section Unions
6313@cindex unions
6314@findex union
6315
6316A @dfn{union type} defines alternative ways of looking at the same
6317piece of memory. Each alternative view is defined with a data type,
6318and identified by a name. A union definition looks like this:
6319
6320@example
6321union @var{name}
6322@{
6323 @var{alternative declarations}@r{@dots{}}
6324@};
6325@end example
6326
Each alternative declaration looks like a structure field declaration.
For instance,
6329
6330@example
6331union number
6332@{
6333 long int integer;
  double floating;
@};
6336@end example
6337
6338@noindent
6339lets you store either an integer (type @code{long int}) or a floating
6340point number (type @code{double}) in the same place in memory. The
6341length and alignment of the union type are the maximum of all the
alternatives---they do not have to be the same. In this union
example, @code{double} may take more space than @code{long int},
6344but that doesn't cause a problem in programs that use the union in the
6345normal way.
6346
6347The members don't have to be different in data type. Sometimes
6348each member pertains to a way the data will be used. For instance,
6349
6350@example
6351union datum
6352@{
6353 double latitude;
6354 double longitude;
6355 double height;
6356 double weight;
6357 int continent;
@};
6359@end example
6360
This union holds one of several kinds of data; most kinds are
floating-point numbers, but the value can also be an integer code for
a continent. You @emph{could} use one member of type @code{double} to
access all the values which have that type, but the different member
names will make the program clearer.
6366
6367The alignment of a union type is the maximum of the alignments of the
6368alternatives. The size of the union type is the maximum of the sizes
6369of the alternatives, rounded up to a multiple of the alignment
6370(because every type's size must be a multiple of its alignment).
6371
6372All the union alternatives start at the address of the union itself.
6373If an alternative is shorter than the union as a whole, it occupies
6374the first part of the union's storage, leaving the last part unused
6375@emph{for that alternative}.
6376
6377@strong{Warning:} if the code stores data using one union alternative
6378and accesses it with another, the results depend on the kind of
6379computer in use. Only wizards should try to do this. However, when
6380you need to do this, a union is a clean way to do it.
6381
6382Assignment works on any union type by copying the entire value.
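
A common way to use a union ``in the normal way'' is to store, next to
it, a record of which alternative is current, and to read only that
alternative. The following is a minimal sketch of that convention.
All the names in it (@code{number_kind}, @code{tagged_number} and the
three functions) are hypothetical, and the union is declared afresh so
that the example stands alone.

```c
union number
{
  long int integer;
  double floating;
};

enum number_kind { NUMBER_INTEGER, NUMBER_FLOATING };

struct tagged_number
{
  enum number_kind kind;   /* Says which alternative is in use.  */
  union number value;
};

struct tagged_number
make_integer (long int i)
{
  struct tagged_number n;

  n.kind = NUMBER_INTEGER;
  n.value.integer = i;
  return n;
}

struct tagged_number
make_floating (double d)
{
  struct tagged_number n;

  n.kind = NUMBER_FLOATING;
  n.value.floating = d;
  return n;
}

/* Convert to double, reading only the alternative that was stored.  */
double
number_as_double (struct tagged_number n)
{
  if (n.kind == NUMBER_INTEGER)
    return (double) n.value.integer;
  return n.value.floating;
}
```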
6383
6384@node Packing With Unions
6385@section Packing With Unions
6386
6387Sometimes we design a union with the intention of packing various
kinds of objects into a certain amount of memory space. For example:
6389
6390@example
6391union bytes8
6392@{
6393 long long big_int_elt;
6394 double double_elt;
6395 struct @{ int first, second; @} two_ints;
6396 struct @{ void *first, *second; @} two_ptrs;
6397@};
6398
6399union bytes8 *p;
6400@end example
6401
6402This union makes it possible to look at 8 bytes of data that @code{p}
6403points to as a single 8-byte integer (@code{p->big_int_elt}), as a
6404single floating-point number (@code{p->double_elt}), as a pair of
6405integers (@code{p->two_ints.first} and @code{p->two_ints.second}), or
6406as a pair of pointers (@code{p->two_ptrs.first} and
6407@code{p->two_ptrs.second}).
6408
6409To pack storage with such a union makes assumptions about the sizes of
6410all the types involved. This particular union was written expecting a
6411pointer to have the same size as @code{int}. On a machine where one
6412pointer takes 8 bytes, the code using this union probably won't work
6413as expected. The union, as such, will function correctly---if you
6414store two values through @code{two_ints} and extract them through
6415@code{two_ints}, you will get the same integers back---but the part of
6416the program that expects the union to be 8 bytes long could
6417malfunction, or at least use too much space.
6418
6419The above example shows one case where a @code{struct} type with no
6420tag can be useful. Another way to get effectively the same result
6421is with arrays as members of the union:
6422
6423@example
6424union eight_bytes
6425@{
6426 long long big_int_elt;
6427 double double_elt;
6428 int two_ints[2];
6429 void *two_ptrs[2];
6430@};
6431@end example
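
Because the intended size rests on assumptions about the platform, a
program can check it instead of trusting it. This sketch adds a
hypothetical function, @code{eight_bytes_fits}, that reports whether
the union really occupies 8 bytes. A C11 @code{_Static_assert} on
the same condition would turn a mismatch into a compilation error.

```c
#include <stddef.h>

union eight_bytes
{
  long long big_int_elt;
  double double_elt;
  int two_ints[2];
  void *two_ptrs[2];
};

/* Return nonzero if the union is 8 bytes long on this platform.
   With 8-byte pointers it will be larger, and this returns 0.  */
int
eight_bytes_fits (void)
{
  return sizeof (union eight_bytes) == 8;
}
```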
6432
6433@node Cast to Union
6434@section Cast to a Union Type
6435@cindex cast to a union
6436@cindex union, casting to a
6437
6438In GNU C, you can explicitly cast any of the alternative types to the
6439union type; for instance,
6440
6441@example
6442(union eight_bytes) (long long) 5
6443@end example
6444
6445@noindent
6446makes a value of type @code{union eight_bytes} which gets its contents
6447through the alternative named @code{big_int_elt}.
6448
6449The value being cast must exactly match the type of the alternative,
6450so this is not valid:
6451
6452@example
6453(union eight_bytes) 5 /* @r{Error! 5 is @code{int}.} */
6454@end example
6455
6456A cast to union type looks like any other cast, except that the type
6457specified is a union type. You can specify the type either with
6458@code{union @var{tag}} or with a typedef name (@pxref{Defining
6459Typedef Names}).
6460
6461Using the cast as the right-hand side of an assignment to a variable of
6462union type is equivalent to storing in an alternative of the union:
6463
6464@example
union foo @{ int i; double d; @};
int x;
double y;
union foo u;

u = (union foo) x  @r{means}  u.i = x

u = (union foo) y  @r{means}  u.d = y
6470@end example
6471
6472You can also use the union cast as a function argument:
6473
6474@example
6475void hack (union foo);
6476@r{@dots{}}
6477hack ((union foo) x);
6478@end example
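
As a small sketch of the cast in use, the functions below pass a value
through a union argument and read it back from the matching
alternative. This relies on the GNU C extension, so it assumes a
compiler that supports casts to union types. The function names are
hypothetical, and @code{union foo} is assumed to have an @code{int}
alternative @code{i} and a @code{double} alternative @code{d}.

```c
union foo { int i; double d; };

static int
get_i (union foo u)
{
  return u.i;
}

static double
get_d (union foo u)
{
  return u.d;
}

int
round_trip_int (int x)
{
  return get_i ((union foo) x);   /* Stored through alternative i.  */
}

double
round_trip_double (double y)
{
  return get_d ((union foo) y);   /* Stored through alternative d.  */
}
```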
6479
6480@node Structure Constructors
6481@section Structure Constructors
6482@cindex structure constructors
6483@cindex constructors, structure
6484
6485You can construct a structure value by writing its type in
6486parentheses, followed by an initializer that would be valid in a
6487declaration for that type. For instance, given this declaration,
6488
6489@example
6490struct foo @{int a; char b[2];@} structure;
6491@end example
6492
6493@noindent
6494you can create a @code{struct foo} value as follows:
6495
6496@example
6497((struct foo) @{x + y, 'a', 0@})
6498@end example
6499
6500@noindent
6501This specifies @code{x + y} for field @code{a},
6502the character @samp{a} for field @code{b}'s element 0,
6503and the null character for field @code{b}'s element 1.
6504
The parentheses around that constructor are not necessary, but we
recommend writing them to make the nesting of the containing
expression clearer.
6508
6509You can also show the nesting of the two by writing it like
6510this:
6511
6512@example
6513((struct foo) @{x + y, @{'a', 0@} @})
6514@end example
6515
6516Each of those is equivalent to writing the following statement
6517expression (@pxref{Statement Exprs}):
6518
6519@example
6520(@{
6521 struct foo temp = @{x + y, 'a', 0@};
6522 temp;
6523@})
6524@end example
6525
6526You can also create a union value this way, but it is not especially
6527useful since that is equivalent to doing a cast:
6528
6529@example
6530 ((union whosis) @{@var{value}@})
6531@r{is equivalent to}
6532 ((union whosis) (@var{value}))
6533@end example
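
A typical use of a structure constructor is to build an argument on
the spot, without declaring a temporary variable. Here is a minimal
sketch; the functions @code{foo_total} and @code{total_of_constructed}
are hypothetical.

```c
struct foo { int a; char b[2]; };

/* Add up the field a and both elements of the field b.  */
int
foo_total (struct foo f)
{
  return f.a + f.b[0] + f.b[1];
}

/* Construct a struct foo value in the argument itself.  */
int
total_of_constructed (int x, int y)
{
  return foo_total ((struct foo) { x + y, { 'a', 0 } });
}
```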
6534
6535@node Unnamed Types as Fields
6536@section Unnamed Types as Fields
6537@cindex unnamed structures
6538@cindex unnamed unions
6539@cindex structures, unnamed
6540@cindex unions, unnamed
6541
6542A structure or a union can contain, as fields,
6543unnamed structures and unions. Here's an example:
6544
6545@example
6546struct
6547@{
6548 int a;
6549 union
6550 @{
6551 int b;
6552 float c;
6553 @};
6554 int d;
6555@} foo;
6556@end example
6557
6558@noindent
6559You can access the fields of the unnamed union within @code{foo} as if they
6560were individual fields at the same level as the union definition:
6561
6562@example
6563foo.a = 42;
6564foo.b = 47;
6565foo.c = 5.25; // @r{Overwrites the value in @code{foo.b}}.
6566foo.d = 314;
6567@end example
6568
6569Avoid using field names that could cause ambiguity. For example, with
6570this definition:
6571
6572@example
6573struct
6574@{
6575 int a;
6576 struct
6577 @{
6578 int a;
6579 float b;
6580 @};
6581@} foo;
6582@end example
6583
6584@noindent
6585it is impossible to tell what @code{foo.a} refers to. GNU C reports
6586an error when a definition is ambiguous in this way.
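
Unnamed unions are convenient when one field says how to interpret
the others. The following sketch uses a hypothetical @code{struct
shape} whose unnamed union holds either one number or a pair, with no
extra level of field names in the accesses.

```c
struct shape
{
  int is_square;
  union
  {
    double side;                        /* Used if is_square is nonzero.  */
    struct { double width, height; };   /* Used otherwise.  */
  };
};

/* The fields of the unnamed union and structure are accessed
   as if they were fields of struct shape itself.  */
double
shape_area (const struct shape *s)
{
  if (s->is_square)
    return s->side * s->side;
  return s->width * s->height;
}
```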
6587
6588@node Incomplete Types
6589@section Incomplete Types
6590@cindex incomplete types
6591@cindex types, incomplete
6592
6593A type that has not been fully defined is called an @dfn{incomplete
6594type}. Structure and union types are incomplete when the code makes a
6595forward reference, such as @code{struct foo}, before defining the
6596type. An array type is incomplete when its length is unspecified.
6597
6598You can't use an incomplete type to declare a variable or field, or
6599use it for a function parameter or return type. The operators
6600@code{sizeof} and @code{_Alignof} give errors when used on an
6601incomplete type.
6602
6603However, you can define a pointer to an incomplete type, and declare a
6604variable or field with such a pointer type. In general, you can do
6605everything with such pointers except dereference them. For example:
6606
6607@example
6608extern void bar (struct mysterious_value *);
6609
6610void
6611foo (struct mysterious_value *arg)
6612@{
6613 bar (arg);
6614@}
6615
6616@r{@dots{}}
6617
6618@{
6619 struct mysterious_value *p, **q;
6620
6621 p = *q;
6622 foo (p);
6623@}
6624@end example
6625
6626@noindent
6627These examples are valid because the code doesn't try to understand
6628what @code{p} points to; it just passes the pointer around.
6629(Presumably @code{bar} is defined in some other file that really does
6630have a definition for @code{struct mysterious_value}.) However,
6631dereferencing the pointer would get an error; that requires a
6632definition for the structure type.
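
This is the basis of a common technique for hiding a data structure's
details from the code that uses it. The sketch below puts both sides
in one file for the sake of a self-contained example; in practice the
first part would usually go in a header file. All the names
(@code{struct counter} and its functions) are hypothetical.

```c
#include <stdlib.h>

/* What users of the type see: struct counter is incomplete here.  */
struct counter;
struct counter *counter_new (void);
void counter_bump (struct counter *c);
int counter_value (const struct counter *c);

/* This code only passes the pointer around,
   so it is valid without the structure's definition.  */
int
bump_twice (struct counter *c)
{
  counter_bump (c);
  counter_bump (c);
  return counter_value (c);
}

/* What the implementation sees: the type is complete from here on,
   so pointers to it can be dereferenced.  */
struct counter
{
  int count;
};

struct counter *
counter_new (void)
{
  struct counter *c = malloc (sizeof *c);

  if (c != NULL)
    c->count = 0;
  return c;
}

void
counter_bump (struct counter *c)
{
  c->count++;
}

int
counter_value (const struct counter *c)
{
  return c->count;
}
```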
6633
6634@node Intertwined Incomplete Types
6635@section Intertwined Incomplete Types
6636
6637When several structure types contain pointers to each other, you can
6638define the types in any order because pointers to types that come
later are incomplete types. Here is an example:
6641
6642@example
6643/* @r{An employee record points to a group.} */
6644struct employee
6645@{
6646 char *name;
6647 @r{@dots{}}
6648 struct group *group; /* @r{incomplete type.} */
6649 @r{@dots{}}
6650@};
6651
6652/* @r{An employee list points to employees.} */
6653struct employee_list
6654@{
6655 struct employee *this_one;
6656 struct employee_list *next; /* @r{incomplete type.} */
6657 @r{@dots{}}
6658@};
6659
6660/* @r{A group points to one employee_list.} */
6661struct group
6662@{
6663 char *name;
6664 @r{@dots{}}
6665 struct employee_list *employees;
6666 @r{@dots{}}
6667@};
6668@end example
6669
6670@node Type Tags
6671@section Type Tags
6672@cindex type tags
6673
6674The name that follows @code{struct} (@pxref{Structures}), @code{union}
(@pxref{Unions}), or @code{enum} (@pxref{Enumeration Types}) is called
6676a @dfn{type tag}. In C, a type tag never conflicts with a variable
6677name or function name; the type tags have a separate @dfn{name space}.
6678Thus, there is no name conflict in this code:
6679
6680@example
6681struct pair @{ int a, b; @};
6682int pair = 1;
6683@end example
6684
6685@noindent
6686nor in this one:
6687
6688@example
6689struct pair @{ int a, b; @} pair;
6690@end example
6691
6692@noindent
6693where @code{pair} is both a structure type tag and a variable name.
6694
6695However, @code{struct}, @code{union}, and @code{enum} share the same
6696name space of tags, so this is a conflict:
6697
6698@example
6699struct pair @{ int a, b; @};
6700enum pair @{ c, d @};
6701@end example
6702
6703@noindent
6704and so is this:
6705
6706@example
6707struct pair @{ int a, b; @};
6708struct pair @{ int c, d; @};
6709@end example
6710
6711When the code defines a type tag inside a block, the tag's scope is
6712limited to that block (as for local variables). Two definitions for
6713one type tag do not conflict if they are in different scopes; rather,
6714each is valid in its scope. For example,
6715
6716@example
6717struct pair @{ int a, b; @};
6718
6719void
6720pair_up_doubles (int len, double array[])
6721@{
6722 struct pair @{ double a, b; @};
6723 @r{@dots{}}
6724@}
6725@end example
6726
6727@noindent
6728has two definitions for @code{struct pair} which do not conflict. The
6729one inside the function applies only within the definition of
6730@code{pair_up_doubles}. Within its scope, that definition
6731@dfn{shadows} the outer definition.
6732
6733If @code{struct pair} appears inside the function body, before the
6734inner definition, it refers to the outer definition---the only one
6735that has been seen at that point. Thus, in this code,
6736
6737@example
6738struct pair @{ int a, b; @};
6739
6740void
6741pair_up_doubles (int len, double array[])
6742@{
6743 struct two_pairs @{ struct pair *p, *q; @};
6744 struct pair @{ double a, b; @};
6745 @r{@dots{}}
6746@}
6747@end example
6748
6749@noindent
6750the structure @code{two_pairs} has pointers to the outer definition of
6751@code{struct pair}, which is probably not desirable.
6752
6753To prevent that, you can write @code{struct pair;} inside the function
6754body as a variable declaration with no variables. This is a
6755@dfn{forward declaration} of the type tag @code{pair}: it makes the
6756type tag local to the current block, with the details of the type to
6757come later. Here's an example:
6758
6759@example
6760void
6761pair_up_doubles (int len, double array[])
6762@{
6763 /* @r{Forward declaration for @code{pair}.} */
6764 struct pair;
6765 struct two_pairs @{ struct pair *p, *q; @};
6766 /* @r{Give the details.} */
6767 struct pair @{ double a, b; @};
6768 @r{@dots{}}
6769@}
6770@end example
6771
6772However, the cleanest practice is to avoid shadowing type tags.
6773
6774@node Arrays
6775@chapter Arrays
6776@cindex array
6777@cindex elements of arrays
6778
6779An @dfn{array} is a data object that holds a series of @dfn{elements},
6780all of the same data type. Each element is identified by its numeric
6781@var{index} within the array.
6782
6783We presented arrays of numbers in the sample programs early in this
6784manual (@pxref{Array Example}). However, arrays can have elements of
6785any data type, including pointers, structures, unions, and other
6786arrays.
6787
6788If you know another programming language, you may suppose that you know all
6789about arrays, but C arrays have special quirks, so in this chapter we
6790collect all the information about arrays in C@.
6791
6792The elements of a C array are allocated consecutively in memory,
6793with no gaps between them. Each element is aligned as required
6794for its data type (@pxref{Type Alignment}).
6795
6796@menu
6797* Accessing Array Elements:: How to access individual elements of an array.
6798* Declaring an Array:: How to name and reserve space for a new array.
6799* Strings:: A string in C is a special case of array.
6800* Array Type Designators:: Referring to a specific array type.
6801* Incomplete Array Types:: Naming, but not allocating, a new array.
6802* Limitations of C Arrays:: Arrays are not first-class objects.
6803* Multidimensional Arrays:: Arrays of arrays.
6804* Constructing Array Values:: Assigning values to an entire array at once.
6805* Arrays of Variable Length:: Declaring arrays of non-constant size.
6806@end menu
6807
6808@node Accessing Array Elements
6809@section Accessing Array Elements
6810@cindex accessing array elements
6811@cindex array elements, accessing
6812
6813If the variable @code{a} is an array, the @var{n}th element of
6814@code{a} is @code{a[@var{n}]}. You can use that expression to access
6815an element's value or to assign to it:
6816
6817@example
6818x = a[5];
6819a[6] = 1;
6820@end example
6821
6822@noindent
6823Since the variable @code{a} is an lvalue, @code{a[@var{n}]} is also an
6824lvalue.
6825
6826The lowest valid index in an array is 0, @emph{not} 1, and the highest
6827valid index is one less than the number of elements.
6828
6829The C language does not check whether array indices are in bounds, so
6830if the code uses an out-of-range index, it will access memory outside the
6831array.
6832
6833@strong{Warning:} Using only valid index values in C is the
6834programmer's responsibility.
6835
6836Array indexing in C is not a primitive operation: it is defined in
6837terms of pointer arithmetic and dereferencing. Now that we know
6838@emph{what} @code{a[i]} does, we can ask @emph{how} @code{a[i]} does
6839its job.
6840
6841In C, @code{@var{x}[@var{y}]} is an abbreviation for
6842@code{*(@var{x}+@var{y})}. Thus, @code{a[i]} really means
6843@code{*(a+i)}. @xref{Pointers and Arrays}.
6844
6845When an expression with array type (such as @code{a}) appears as part
6846of a larger C expression, it is converted automatically to a pointer
6847to element zero of that array. For instance, @code{a} in an
6848expression is equivalent to @code{&a[0]}. Thus, @code{*(a+i)} is
6849computed as @code{*(&a[0]+i)}.
6850
6851Now we can analyze how that expression gives us the desired element of
6852the array. It makes a pointer to element 0 of @code{a}, advances it
6853by the value of @code{i}, and dereferences that pointer.
6854
6855Another equivalent way to write the expression is @code{(&a[0])[i]}.
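
The equivalences described above can be checked directly. In this
sketch, the array @code{a} and the function @code{index_forms_agree}
are hypothetical; the index passed in must be valid for the array.

```c
static int a[5] = { 10, 20, 30, 40, 50 };

/* Return nonzero if the different notations all designate
   the same element.  I must be in the range 0 to 4.  */
int
index_forms_agree (int i)
{
  return (&a[i] == a + i
          && &a[i] == &a[0] + i
          && &a[i] == &(&a[0])[i]
          && a[i] == *(a + i));
}
```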
6856
6857@node Declaring an Array
6858@section Declaring an Array
6859@cindex declaring an array
6860@cindex array, declaring
6861
6862To make an array declaration, write @code{[@var{length}]} after the
6863name being declared. This construct is valid in the declaration of a
6864variable, a function parameter, a function value type (the value can't
6865be an array, but it can be a pointer to one), a structure field, or a
6866union alternative.
6867
6868The surrounding declaration specifies the element type of the array;
6869that can be any type of data, but not @code{void} or a function type.
6870For instance,
6871
6872@example
6873double a[5];
6874@end example
6875
6876@noindent
6877declares @code{a} as an array of 5 @code{double}s.
6878
6879@example
6880struct foo bstruct[length];
6881@end example
6882
6883@noindent
6884declares @code{bstruct} as an array of @code{length} objects of type
6885@code{struct foo}. A variable array size like this is allowed when
6886the array is not file-scope.
6887
6888Other declaration constructs can nest within the array declaration
6889construct. For instance:
6890
6891@example
6892struct foo *b[length];
6893@end example
6894
6895@noindent
6896declares @code{b} as an array of @code{length} pointers to
6897@code{struct foo}. This shows that the length need not be a constant
6898(@pxref{Arrays of Variable Length}).
6899
6900@example
6901double (*c)[5];
6902@end example
6903
6904@noindent
6905declares @code{c} as a pointer to an array of 5 @code{double}s, and
6906
6907@example
6908char *(*f (int))[5];
6909@end example
6910
6911@noindent
6912declares @code{f} as a function taking an @code{int} argument and
6913returning a pointer to an array of 5 strings (pointers to
6914@code{char}s).
6915
6916@example
6917double aa[5][10];
6918@end example
6919
6920@noindent
6921declares @code{aa} as an array of 5 elements, each of which is an
6922array of 10 @code{double}s. This shows how to declare a
6923multidimensional array in C (@pxref{Multidimensional Arrays}).
6924
6925All these declarations specify the array's length, which is needed in
6926these cases in order to allocate storage for the array.
6927
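As a rough check on how these declarations nest, the sketch below
applies @code{sizeof} to two of them. The function names
@code{aa_rows} and @code{c_target_size} are hypothetical.

@example
#include <stddef.h>

double aa[5][10];     /* @r{Array of 5 arrays of 10 @code{double}s.} */
double (*c)[5];       /* @r{Pointer to an array of 5 @code{double}s.} */

/* @r{Number of elements in @code{aa}; each element is a row.} */
size_t
aa_rows (void)
@{
  return sizeof aa / sizeof aa[0];
@}

/* @r{Size of the array that @code{c} points to.  @code{sizeof}}
   @r{does not evaluate @code{*c}, so @code{c} need not be set.} */
size_t
c_target_size (void)
@{
  return sizeof *c;
@}
@end example
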
6928@node Strings
6929@section Strings
6930@cindex string
6931
6932A string in C is a sequence of elements of type @code{char},
6933terminated with the null character, the character with code zero.
6934
6935Programs often need to use strings with specific, fixed contents. To
6936write one in a C program, use a @dfn{string constant} such as
6937@code{"Take me to your leader!"}. The data type of a string constant
is @code{char *}. For the full syntactic details of writing string
constants, see @ref{String Constants}.
6940
6941To declare a place to store a non-constant string, declare an array of
6942@code{char}. Keep in mind that it must include one extra @code{char}
6943for the terminating null. For instance,
6944
6945@example
char text[] = @{ 'H', 'e', 'l', 'l', 'o', 0 @};
6947@end example
6948
6949@noindent
6950declares an array named @samp{text} with six elements---five letters
6951and the terminating null character. An equivalent way to get the same
6952result is this,
6953
6954@example
char text[] = "Hello";
6956@end example
6957
6958@noindent
6959which copies the elements of the string constant, including @emph{its}
6960terminating null character.
6961
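To see that both initializers produce the same six-element array, one
can compare the arrays' sizes and contents. This is only a sketch; the
names @code{spelled_out}, @code{from_constant} and @code{same_contents}
are invented here.

@example
#include <string.h>

char spelled_out[] = @{ 'H', 'e', 'l', 'l', 'o', 0 @};
char from_constant[] = "Hello";

/* @r{Return 1 if the two arrays have the same size and contents.} */
int
same_contents (void)
@{
  return (sizeof spelled_out == sizeof from_constant
          && memcmp (spelled_out, from_constant,
                     sizeof spelled_out) == 0);
@}
@end example

Here @code{sizeof from_constant} is 6, while
@code{strlen (from_constant)} is 5, because @code{strlen} does not
count the terminating null.
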
6962@example
6963char message[200];
6964@end example
6965
6966@noindent
6967declares an array long enough to hold a string of 199 ASCII characters
6968plus the terminating null character.
6969
When you store a string into @code{message}, be sure to check or prove
that its length, including the terminating null, does not exceed the
array's size. For example,
6972
6973@example
void
set_message (char *text)
@{
  int i;
  /* @r{Copy up to and including the terminating null.} */
  for (i = 0; i < sizeof (message); i++)
    @{
      message[i] = text[i];
      if (text[i] == 0)
        return;
    @}
  fatal_error ("Message is too long for `message'");
@}
6986@end example
6987
6988It's easy to do this with the standard library function
6989@code{strncpy}, which fills out the whole destination array (up to a
6990specified length) with null characters. Thus, if the last character
6991of the destination is not null, the string did not fit. Many system
6992libraries, including the GNU C library, hand-optimize @code{strncpy}
6993to run faster than an explicit @code{for}-loop.
6994
6995Here's what the code looks like:
6996
6997@example
void
set_message (char *text)
@{
  strncpy (message, text, sizeof (message));
  if (message[sizeof (message) - 1] != 0)
    fatal_error ("Message is too long for `message'");
@}
7005@end example
7006
7007@xref{String and Array Utilities, The GNU C Library, , libc, The GNU C
7008Library Reference Manual}, for more information about the standard
7009library functions for operating on strings.
7010
7011You can avoid putting a fixed length limit on strings you construct or
7012operate on by allocating the space for them dynamically.
7013@xref{Dynamic Memory Allocation}.
7014
7015@node Array Type Designators
7016@section Array Type Designators
7017
7018Every C type has a type designator, which you make by deleting the
7019variable name and the semicolon from a declaration (@pxref{Type
7020Designators}). The designators for array types follow this rule, but
7021they may appear surprising.
7022
7023@example
@r{type}  int a[5];            @r{designator}  int [5]
@r{type}  double a[5][3];      @r{designator}  double [5][3]
@r{type}  struct foo *a[5];    @r{designator}  struct foo *[5]
7027@end example
7028
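An array type designator can appear, for instance, as the operand of
@code{sizeof}. Here is a minimal sketch; the function name
@code{five_ints_size} is made up.

@example
#include <stddef.h>

/* @r{Size in bytes of an array of 5 @code{int}s, computed from}
   @r{the type designator alone, with no variable involved.} */
size_t
five_ints_size (void)
@{
  return sizeof (int [5]);
@}
@end example
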
7029@node Incomplete Array Types
7030@section Incomplete Array Types
7031@cindex incomplete array types
7032@cindex array types, incomplete
7033
7034An array is equivalent, for most purposes, to a pointer to its zeroth
7035element. When that is true, the length of the array is irrelevant.
7036The length needs to be known only for allocating space for the array, or
for @code{sizeof} and @code{typeof} (@pxref{Auto Type}). Thus, in some
contexts C allows you to omit the length of an array:
7039
7040@itemize @bullet
7041@item
7042An @code{extern} declaration says how to refer to a variable allocated
7043elsewhere. It does not need to allocate space for the variable,
7044so if it is an array, you can omit the length. For example,
7045
7046@example
7047extern int foo[];
7048@end example
7049
7050@item
7051When declaring a function parameter as an array, the argument value
7052passed to the function is really a pointer to the array's zeroth
element. This value does not say how long the array really is, so
there is no need to declare the length. For example,
7055
7056@example
7057int
7058func (int foo[])
7059@end example
7060@end itemize
7061
7062These declarations are examples of @dfn{incomplete} array types, types
7063that are not fully specified. The incompleteness makes no difference
7064for accessing elements of the array, but it matters for some other
7065things. For instance, @code{sizeof} is not allowed on an incomplete
7066type.
7067
7068With multidimensional arrays, only the first dimension can be omitted:
7069
7070@example
extern struct chesspiece *funnyboard[][8];
7072@end example
7073
7074In other words, the code doesn't have to say how many rows there are,
7075but it must state how big each row is.
7076
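Since a parameter declared this way carries no length, a function that
needs the length has to receive it some other way. The sketch below
passes it as a second argument; @code{sum} and @code{length} are
hypothetical names.

@example
/* @r{Add up the first @code{length} elements of @code{foo}.} */
int
sum (int foo[], int length)
@{
  int total = 0;
  for (int i = 0; i < length; i++)
    total += foo[i];
  return total;
@}
@end example

Given @code{int v[3]}, a call would look like @code{sum (v, 3)}.
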
7077@node Limitations of C Arrays
7078@section Limitations of C Arrays
7079@cindex limitations of C arrays
7080@cindex first-class object
7081
7082Arrays have quirks in C because they are not ``first-class objects'':
7083there is no way in C to operate on an array as a unit.
7084
7085The other composite objects in C, structures and unions, are
7086first-class objects: a C program can copy a structure or union value
7087in an assignment, or pass one as an argument to a function, or make a
7088function return one. You can't do those things with an array in C@.
7089That is because a value you can operate on never has an array type.
7090
7091An expression in C can have an array type, but that doesn't produce
7092the array as a value. Instead it is converted automatically to a
7093pointer to the array's element at index zero. The code can operate
7094on the pointer, and through that on individual elements of the array,
7095but it can't get and operate on the array as a unit.
7096
7097There are three exceptions to this conversion rule, but none of them
7098offers a way to operate on the array as a whole.
7099
First, @samp{&} applied to an expression with array type gives you the
address of the array, as a pointer-to-array type. However, you can't
operate on the whole array that way: if you apply @samp{*} to get the
array back, that expression converts, as usual, to a pointer to its
zeroth element.
7105
7106Second, the operators @code{sizeof}, @code{_Alignof}, and
7107@code{typeof} do not convert the array to a pointer; they leave it as
7108an array. But they don't operate on the array's data---they only give
7109information about its type.
7110
7111Third, a string constant used as an initializer for an array is not
7112converted to a pointer---rather, the declaration copies the
7113@emph{contents} of that string in that one special case.
7114
7115You @emph{can} copy the contents of an array, just not with an
7116assignment operator. You can do it by calling the library function
7117@code{memcpy} or @code{memmove} (@pxref{Copying and Concatenation, The
7118GNU C Library, , libc, The GNU C Library Reference Manual}). Also,
7119when a structure contains just an array, you can copy that structure.
7120
7121An array itself is an lvalue if it is a declared variable, or part of
7122a structure or union that is an lvalue. When you construct an array
7123from elements (@pxref{Constructing Array Values}), that array is not
7124an lvalue.
7125
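Here is a sketch of both ways to copy an array's contents. The names
@code{NELTS}, @code{copy_array}, @code{struct wrapper} and
@code{copy_wrapped} are made up for this example.

@example
#include <string.h>

#define NELTS 4

struct wrapper @{ int elts[NELTS]; @};

/* @r{Copy @code{NELTS} elements from @code{src} into @code{dest}.} */
void
copy_array (int *dest, const int *src)
@{
  memcpy (dest, src, NELTS * sizeof (int));
@}

/* @r{Structure assignment copies the array inside the structure.} */
struct wrapper
copy_wrapped (struct wrapper w)
@{
  struct wrapper copy;
  copy = w;
  return copy;
@}
@end example
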
7126@node Multidimensional Arrays
7127@section Multidimensional Arrays
7128@cindex multidimensional arrays
7129@cindex array, multidimensional
7130
7131Strictly speaking, all arrays in C are unidimensional. However, you
7132can create an array of arrays, which is more or less equivalent to a
7133multidimensional array. For example,
7134
7135@example
7136struct chesspiece *board[8][8];
7137@end example
7138
7139@noindent
7140declares an array of 8 arrays of 8 pointers to @code{struct
7141chesspiece}. This data type could represent the state of a chess
7142game. To access one square's contents requires two array index
7143operations, one for each dimension. For instance, you can write
7144@code{board[row][column]}, assuming @code{row} and @code{column}
7145are variables with integer values in the proper range.
7146
7147How does C understand @code{board[row][column]}? First of all,
7148@code{board} is converted automatically to a pointer to the zeroth
7149element (at index zero) of @code{board}. Adding @code{row} to that
7150makes it point to the desired element. Thus, @code{board[row]}'s
7151value is an element of @code{board}---an array of 8 pointers.
7152
7153However, as an expression with array type, it is converted
7154automatically to a pointer to the array's zeroth element. The second
7155array index operation, @code{[column]}, accesses the chosen element
7156from that array.
7157
7158As this shows, pointer-to-array types are meaningful in C@.
7159You can declare a variable that points to a row in a chess board
7160like this:
7161
7162@example
7163struct chesspiece *(*rowptr)[8];
7164@end example
7165
7166@noindent
7167This points to an array of 8 pointers to @code{struct chesspiece}.
7168You can assign to it as follows:
7169
7170@example
7171rowptr = &board[5];
7172@end example
7173
7174The dimensions don't have to be equal in length. Here we declare
7175@code{statepop} as an array to hold the population of each state in
7176the United States for each year since 1900:
7177
7178@example
#define NSTATES 50
@{
  int nyears = current_year - 1900 + 1;
  int statepop[NSTATES][nyears];
  @r{@dots{}}
@}
7185@end example
7186
7187The variable @code{statepop} is an array of @code{NSTATES} subarrays,
7188each indexed by the year (counting from 1900). Thus, to get the
7189element for a particular state and year, we must subscript it first
7190by the number that indicates the state, and second by the index for
7191the year:
7192
7193@example
7194statepop[state][year - 1900]
7195@end example
7196
7197@cindex array, layout in memory
7198The subarrays within the multidimensional array are allocated
7199consecutively in memory, and within each subarray, its elements are
7200allocated consecutively in memory. The most efficient way to process
7201all the elements in the array is to scan the last subscript in the
7202innermost loop. This means consecutive accesses go to consecutive
7203memory locations, which optimizes use of the processor's memory cache.
7204For example:
7205
7206@example
long long total = 0;   /* @r{The sum might not fit in an @code{int}.} */
float average;

for (int state = 0; state < NSTATES; ++state)
  @{
    for (int year = 0; year < nyears; ++year)
      @{
        total += statepop[state][year];
      @}
  @}

/* @r{Convert first, so the division is not integer division.} */
average = (float) total / nyears;
7219@end example
7220
7221C's layout for multidimensional arrays is different from Fortran's
7222layout. In Fortran, a multidimensional array is not an array of
7223arrays; rather, multidimensional arrays are a primitive feature, and
7224it is the first index that varies most rapidly between consecutive
7225memory locations. Thus, the memory layout of a 50x114 array in C
7226matches that of a 114x50 array in Fortran.
7227
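The row-by-row layout can be checked by measuring how far an element
lies from the start of the array. This sketch assumes a small array
@code{grid}; the names @code{ROWS}, @code{COLS} and
@code{element_offset} are illustrative only.

@example
#include <stddef.h>

#define ROWS 3
#define COLS 4

static int grid[ROWS][COLS];

/* @r{Byte offset of @code{grid[r][c]} from the start of @code{grid}.} */
size_t
element_offset (int r, int c)
@{
  return (size_t) ((char *) &grid[r][c] - (char *) grid);
@}
@end example

With this layout, @code{element_offset (r, c)} equals
@code{(r * COLS + c) * sizeof (int)}.
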
7228@node Constructing Array Values
7229@section Constructing Array Values
7230@cindex constructing array values
7231@cindex array values, constructing
7232
7233You can construct an array from elements by writing them inside
7234braces, and preceding all that with the array type's designator in
7235parentheses. There is no need to specify the array length, since the
7236number of elements determines that. The constructor looks like this:
7237
7238@example
7239(@var{elttype}[]) @{ @var{elements} @};
7240@end example
7241
7242Here is an example, which constructs an array of string pointers:
7243
7244@example
7245(char *[]) @{ "x", "y", "z" @};
7246@end example
7247
7248That's equivalent in effect to declaring an array with the same
7249initializer, like this:
7250
7251@example
7252char *array[] = @{ "x", "y", "z" @};
7253@end example
7254
7255and then using the array.
7256
7257If all the elements are simple constant expressions, or made up of
7258such, then the compound literal can be coerced to a pointer to its
7259zeroth element and used to initialize a file-scope variable
7260(@pxref{File-Scope Variables}), as shown here:
7261
7262@example
7263char **foo = (char *[]) @{ "x", "y", "z" @};
7264@end example
7265
7266@noindent
7267The data type of @code{foo} is @code{char **}, which is a pointer
7268type, not an array type. The declaration is equivalent to defining
7269and then using an array-type variable:
7270
7271@example
7272char *nameless_array[] = @{ "x", "y", "z" @};
7273char **foo = &nameless_array[0];
7274@end example
7275
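A constructed array can also be passed directly as a function
argument. The sketch below assumes two made-up functions,
@code{total_length} and @code{xyz_length}.

@example
#include <stddef.h>
#include <string.h>

/* @r{Add up the lengths of @code{count} strings.} */
size_t
total_length (char **strings, int count)
@{
  size_t total = 0;
  for (int i = 0; i < count; i++)
    total += strlen (strings[i]);
  return total;
@}

size_t
xyz_length (void)
@{
  /* @r{The constructed array converts to a pointer to its}
     @r{zeroth element, just as a named array would.} */
  return total_length ((char *[]) @{ "x", "y", "z" @}, 3);
@}
@end example
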
7276@node Arrays of Variable Length
7277@section Arrays of Variable Length
7278@cindex array of variable length
7279@cindex variable-length arrays
7280
7281In GNU C, you can declare variable-length arrays like any other
7282arrays, but with a length that is not a constant expression. The
7283storage is allocated at the point of declaration and deallocated when
7284the block scope containing the declaration exits. For example:
7285
7286@example
#include <stdio.h>   /* @r{Defines @code{FILE}.} */
#include <string.h>  /* @r{Declares @code{strlen}, @code{strcpy}, @code{strcat}.} */

FILE *
concat_fopen (char *s1, char *s2, char *mode)
@{
  char str[strlen (s1) + strlen (s2) + 1];
  strcpy (str, s1);
  strcat (str, s2);
  return fopen (str, mode);
@}
7298@end example
7299
7300@noindent
7301(This uses some standard library functions; see @ref{String and Array
7302Utilities, , , libc, The GNU C Library Reference Manual}.)
7303
7304The length of an array is computed once when the storage is allocated
7305and is remembered for the scope of the array in case it is used in
7306@code{sizeof}.
7307
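For instance, in the following sketch @code{sizeof} yields a value
that depends on the argument. The function name @code{vla_size} is
hypothetical.

@example
#include <stddef.h>

/* @r{Return the size in bytes of an array of @code{n} doubles.}
   @r{@code{n} must be positive.} */
size_t
vla_size (int n)
@{
  double buffer[n];
  return sizeof buffer;   /* @r{Computed at run time.} */
@}
@end example
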
7308@strong{Warning:} don't allocate a variable-length array if the size
7309might be very large (more than 100,000), or in a recursive function,
7310because that is likely to cause stack overflow. Allocate the array
7311dynamically instead (@pxref{Dynamic Memory Allocation}).
7312
7313Jumping or breaking out of the scope of the array name deallocates the
7314storage. Jumping into the scope is not allowed; that gives an error
7315message.
7316
7317You can also use variable-length arrays as arguments to functions:
7318
7319@example
7320struct entry
7321tester (int len, char data[len][len])
7322@{
7323 @r{@dots{}}
7324@}
7325@end example
7326
7327As usual, a function argument declared with an array type
7328is really a pointer to an array that already exists.
7329Calling the function does not allocate the array, so there's no
7330particular danger of stack overflow in using this construct.
7331
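Inside such a function, the array parameter can be indexed in the
usual way, because the compiler knows the row length from the earlier
parameter. A sketch, using a made-up function @code{trace}:

@example
/* @r{Sum the diagonal of a @code{len} by @code{len} matrix.} */
int
trace (int len, int data[len][len])
@{
  int sum = 0;
  for (int i = 0; i < len; i++)
    sum += data[i][i];
  return sum;
@}
@end example
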
7332To pass the array first and the length afterward, use a forward
7333declaration in the function's parameter list (another GNU extension).
7334For example,
7335
7336@example
7337struct entry
7338tester (int len; char data[len][len], int len)
7339@{
7340 @r{@dots{}}
7341@}
7342@end example
7343
7344The @code{int len} before the semicolon is a @dfn{parameter forward
7345declaration}, and it serves the purpose of making the name @code{len}
7346known when the declaration of @code{data} is parsed.
7347
7348You can write any number of such parameter forward declarations in the
7349parameter list. They can be separated by commas or semicolons, but
7350the last one must end with a semicolon, which is followed by the
7351``real'' parameter declarations. Each forward declaration must match
7352a ``real'' declaration in parameter name and data type. ISO C11 does
7353not support parameter forward declarations.
7354
7355@node Enumeration Types
7356@chapter Enumeration Types
7357@cindex enumeration types
7358@cindex types, enumeration
7359@cindex enumerator
7360
7361An @dfn{enumeration type} represents a limited set of integer values,
7362each with a name. It is effectively equivalent to a primitive integer
7363type.
7364
7365Suppose we have a list of possible emotional states to store in an
7366integer variable. We can give names to these alternative values with
7367an enumeration:
7368
7369@example
enum emotion_state @{ neutral, happy, sad, worried,
                     calm, nervous @};
7372@end example
7373
7374@noindent
7375(Never mind that this is a simplistic way to classify emotional states;
7376it's just a code example.)
7377
7378The names inside the enumeration are called @dfn{enumerators}. The
7379enumeration type defines them as constants, and their values are
7380consecutive integers; @code{neutral} is 0, @code{happy} is 1,
7381@code{sad} is 2, and so on. Alternatively, you can specify values for
7382the enumerators explicitly like this:
7383
7384@example
enum emotion_state @{ neutral = 2, happy = 5,
                     sad = 20, worried = 10,
                     calm = -5, nervous = -300 @};
7388@end example
7389
7390Each enumerator which does not specify a value gets value zero
7391(if it is at the beginning) or the next consecutive integer.
7392
7393@example
/* @r{@code{neutral} is 0 by default,}
   @r{and @code{worried} is 21 by default.} */
enum emotion_state @{ neutral,
                     happy = 5, sad = 20, worried,
                     calm = -5, nervous = -300 @};
7399@end example
7400
7401If an enumerator is obsolete, you can specify that using it should
7402cause a warning, by including an attribute in the enumerator's
7403declaration. Here is how @code{happy} would look with this
7404attribute:
7405
7406@example
happy __attribute__
  ((deprecated
    ("impossible under plutocratic rule")))
  = 5,
7411@end example
7412
7413@xref{Attributes}.
7414
7415You can declare variables with the enumeration type:
7416
7417@example
7418enum emotion_state feelings_now;
7419@end example
7420
7421In the C code itself, this is equivalent to declaring the variable
@code{int}. (If all the enumeration values are nonnegative, it is
equivalent to @code{unsigned int}.) However, declaring it with the
7424enumeration type has an advantage in debugging, because GDB knows it
7425should display the current value of the variable using the
7426corresponding name. If the variable's type is @code{int}, GDB can
7427only show the value as a number.
7428
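Because enumerators are integer constants, they can serve as
@code{case} labels in a @code{switch} statement (@pxref{switch
Statement}). Here is a sketch that maps each value back to a name;
the function @code{emotion_name} is invented for this example.

@example
enum emotion_state @{ neutral,
                     happy = 5, sad = 20, worried,
                     calm = -5, nervous = -300 @};

/* @r{Return a printable name for @code{e}.} */
const char *
emotion_name (enum emotion_state e)
@{
  switch (e)
    @{
    case neutral: return "neutral";
    case happy:   return "happy";
    case sad:     return "sad";
    case worried: return "worried";
    case calm:    return "calm";
    case nervous: return "nervous";
    @}
  return "unknown";   /* @r{The value matches no enumerator.} */
@}
@end example
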
7429The identifier that follows @code{enum} is called a @dfn{type tag}
7430since it distinguishes different enumeration types. Type tags are in
7431a separate name space and belong to scopes like most other names in C@.
7432@xref{Type Tags}, for explanation.
7433
7434You can predeclare an @code{enum} type tag like a structure or union
7435type tag, like this:
7436
7437@example
7438enum foo;
7439@end example
7440
7441@noindent
7442The @code{enum} type is incomplete until you finish defining it.
7443
7444You can optionally include a trailing comma at the end of a list of
7445enumeration values:
7446
7447@example
enum emotion_state @{ neutral, happy, sad, worried,
                     calm, nervous, @};
7450@end example
7451
7452@noindent
7453This is useful in some macro definitions, since it enables you to
7454assemble the list of enumerators without knowing which one is last.
7455The extra comma does not change the meaning of the enumeration in any
7456way.
7457
7458@node Defining Typedef Names
7459@chapter Defining Typedef Names
7460@cindex typedef names
7461@findex typedef
7462
7463You can define a data type keyword as an alias for any type, and then
7464use the alias syntactically like a built-in type keyword such as
7465@code{int}. You do this using @code{typedef}, so these aliases are
7466also called @dfn{typedef names}.
7467
7468@code{typedef} is followed by text that looks just like a variable
7469declaration, but instead of declaring variables it defines data type
7470keywords.
7471
7472Here's how to define @code{fooptr} as a typedef alias for the type
7473@code{struct foo *}, then declare @code{x} and @code{y} as variables
7474with that type:
7475
7476@example
7477typedef struct foo *fooptr;
7478
7479fooptr x, y;
7480@end example
7481
7482@noindent
7483That declaration is equivalent to the following one:
7484
7485@example
7486struct foo *x, *y;
7487@end example
7488
7489You can define a typedef alias for any type. For instance, this makes
7490@code{frobcount} an alias for type @code{int}:
7491
7492@example
7493typedef int frobcount;
7494@end example
7495
7496@noindent
7497This doesn't define a new type distinct from @code{int}. Rather,
7498@code{frobcount} is another name for the type @code{int}. Once the
7499variable is declared, it makes no difference which name the
7500declaration used.
7501
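A short sketch of that point, in which values and pointers declared
with either name mix without any conversion. The function
@code{add_frobs} is hypothetical.

@example
typedef int frobcount;

/* @r{@code{frobcount} and @code{int} are the same type here.} */
frobcount
add_frobs (frobcount a, int b)
@{
  int *p = &a;       /* @r{@code{&a} has type @code{int *}.} */
  return *p + b;
@}
@end example
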
7502There is a syntactic difference, however, between @code{frobcount} and
7503@code{int}: A typedef name cannot be used with
7504@code{signed}, @code{unsigned}, @code{long} or @code{short}. It has
7505to specify the type all by itself. So you can't write this:
7506
7507@example
7508unsigned frobcount f1; /* @r{Error!} */
7509@end example
7510
7511But you can write this:
7512
7513@example
7514typedef unsigned int unsigned_frobcount;
7515
7516unsigned_frobcount f1;
7517@end example
7518
7519In other words, a typedef name is not an alias for @emph{a keyword}
7520such as @code{int}. It stands for a @emph{type}, and that could be
7521the type @code{int}.
7522
7523Typedef names are in the same namespace as functions and variables, so
7524you can't use the same name for a typedef and a function, or a typedef
7525and a variable. When a typedef is declared inside a code block, it is
7526in scope only in that block.
7527
7528@strong{Warning:} Avoid defining typedef names that end in @samp{_t},
7529because many of these have standard meanings.
7530
7531You can redefine a typedef name to the exact same type as its first
7532definition, but you cannot redefine a typedef name to a
7533different type, even if the two types are compatible. For example, this
7534is valid:
7535
7536@example
7537typedef int frobcount;
7538typedef int frotzcount;
7539typedef frotzcount frobcount;
7540typedef frobcount frotzcount;
7541@end example
7542
7543@noindent
7544because each typedef name is always defined with the same type
7545(@code{int}), but this is not valid:
7546
7547@example
7548enum foo @{f1, f2, f3@};
7549typedef enum foo frobcount;
7550typedef int frobcount;
7551@end example
7552
7553@noindent
7554Even though the type @code{enum foo} is compatible with @code{int},
7555they are not the @emph{same} type.
7556
7557@node Statements
7558@chapter Statements
7559@cindex statements
7560
7561A @dfn{statement} specifies computations to be done for effect; it
7562does not produce a value, as an expression would. In general a
7563statement ends with a semicolon (@samp{;}), but blocks (which are
7564statements, more or less) are an exception to that rule.
7565@ifnottex
7566@xref{Blocks}.
7567@end ifnottex
7568
7569The places to use statements are inside a block, and inside a
7570complex statement. A @dfn{complex statement} contains one or two
7571components that are nested statements. Each such component must
7572consist of one and only one statement. The way to put multiple
7573statements in such a component is to group them into a @dfn{block}
7574(@pxref{Blocks}), which counts as one statement.
7575
7576The following sections describe the various kinds of statement.
7577
7578@menu
7579* Expression Statement:: Evaluate an expression, as a statement,
7580 usually done for a side effect.
7581* if Statement:: Basic conditional execution.
7582* if-else Statement:: Multiple branches for conditional execution.
7583* Blocks:: Grouping multiple statements together.
7584* return Statement:: Return a value from a function.
7585* Loop Statements:: Repeatedly executing a statement or block.
7586* switch Statement:: Multi-way conditional choices.
7587* switch Example:: A plausible example of using @code{switch}.
7588* Duffs Device:: A special way to use @code{switch}.
7589* Case Ranges:: Ranges of values for @code{switch} cases.
7590* Null Statement:: A statement that does nothing.
7591* goto Statement:: Jump to another point in the source code,
7592 identified by a label.
7593* Local Labels:: Labels with limited scope.
7594* Labels as Values:: Getting the address of a label.
7595* Statement Exprs:: A series of statements used as an expression.
7596@end menu
7597
7598@node Expression Statement
7599@section Expression Statement
7600@cindex expression statement
7601@cindex statement, expression
7602
7603The most common kind of statement in C is an @dfn{expression statement}.
7604It consists of an expression followed by a
7605semicolon. The expression's value is discarded, so the expressions
7606that are useful are those that have side effects: assignment
7607expressions, increment and decrement expressions, and function calls.
7608Here are examples of expression statements:
7609
7610@smallexample
7611x = 5; /* @r{Assignment expression.} */
7612p++; /* @r{Increment expression.} */
7613printf ("Done\n"); /* @r{Function call expression.} */
7614*p; /* @r{Cause @code{SIGSEGV} signal if @code{p} is null.} */
7615x + y; /* @r{Useless statement without effect.} */
7616@end smallexample
7617
7618In very unusual circumstances we use an expression statement
7619whose purpose is to get a fault if an address is invalid:
7620
7621@smallexample
7622volatile char *p;
7623@r{@dots{}}
7624*p; /* @r{Cause signal if @code{p} is null.} */
7625@end smallexample
7626
7627If the target of @code{p} is not declared @code{volatile}, the
7628compiler might optimize away the memory access, since it knows that
7629the value isn't really used. @xref{volatile}.
7630
7631@node if Statement
7632@section @code{if} Statement
7633@cindex @code{if} statement
7634@cindex statement, @code{if}
7635@findex if
7636
7637An @code{if} statement computes an expression to decide
7638whether to execute the following statement or not.
7639It looks like this:
7640
7641@example
7642if (@var{condition})
7643 @var{execute-if-true}
7644@end example
7645
7646The first thing this does is compute the value of @var{condition}. If
7647that is true (nonzero), then it executes the statement
7648@var{execute-if-true}. If the value of @var{condition} is false
7649(zero), it doesn't execute @var{execute-if-true}; instead, it does
7650nothing.
7651
7652This is a @dfn{complex statement} because it contains a component
@var{execute-if-true} that is a nested statement. It must be one
7654and only one statement. The way to put multiple statements there is
7655to group them into a @dfn{block} (@pxref{Blocks}).
7656
7657@node if-else Statement
7658@section @code{if-else} Statement
7659@cindex @code{if}@dots{}@code{else} statement
7660@cindex statement, @code{if}@dots{}@code{else}
7661@findex else
7662
7663An @code{if}-@code{else} statement computes an expression to decide
7664which of two nested statements to execute.
7665It looks like this:
7666
7667@example
7668if (@var{condition})
7669 @var{if-true-substatement}
7670else
7671 @var{if-false-substatement}
7672@end example
7673
7674The first thing this does is compute the value of @var{condition}. If
7675that is true (nonzero), then it executes the statement
7676@var{if-true-substatement}. If the value of @var{condition} is false
7677(zero), then it executes the statement @var{if-false-substatement} instead.
7678
7679This is a @dfn{complex statement} because it contains components
@var{if-true-substatement} and @var{if-false-substatement} that are
7681nested statements. Each must be one and only one statement. The way
7682to put multiple statements in such a component is to group them into a
7683@dfn{block} (@pxref{Blocks}).
7684
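As a small illustration, here each branch is a single expression
statement, so no block is needed. The function name @code{larger} is
made up.

@example
/* @r{Return the larger of @code{a} and @code{b}.} */
int
larger (int a, int b)
@{
  int result;
  if (a > b)
    result = a;
  else
    result = b;
  return result;
@}
@end example
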
7685@node Blocks
7686@section Blocks
7687@cindex block
7688@cindex compound statement
7689
7690A @dfn{block} is a construct that contains multiple statements of any
7691kind. It begins with @samp{@{} and ends with @samp{@}}, and has a
7692series of statements and declarations in between. Another name for
7693blocks is @dfn{compound statements}.
7694
7695Is a block a statement? Yes and no. It doesn't @emph{look} like a
7696normal statement---it does not end with a semicolon. But you can
7697@emph{use} it like a statement; anywhere that a statement is required
7698or allowed, you can write a block and consider that block a statement.
7699
7700So far it seems that a block is a kind of statement with an unusual
7701syntax. But that is not entirely true: a function body is also a
7702block, and that block is definitely not a statement. The text after a
7703function header is not treated as a statement; only a function body is
7704allowed there, and nothing else would be meaningful there.
7705
7706In a formal grammar we would have to choose---either a block is a kind
7707of statement or it is not. But this manual is meant for humans, not
7708for parser generators. The clearest answer for humans is, ``a block
7709is a statement, in some ways.''
7710
7711@cindex nested block
7712@cindex internal block
7713A block that isn't a function body is called an @dfn{internal block}
7714or a @dfn{nested block}. You can put a nested block directly inside
7715another block, but more often the nested block is inside some complex
7716statement, such as a @code{for} statement or an @code{if} statement.
7717
7718There are two uses for nested blocks in C:
7719
7720@itemize @bullet
7721@item
7722To specify the scope for local declarations. For instance, a local
7723variable's scope is the rest of the innermost containing block.
7724
7725@item
7726To write a series of statements where, syntactically, one statement is
7727called for. For instance, the @var{execute-if-true} of an @code{if}
7728statement is one statement. To put multiple statements there, they
7729have to be wrapped in a block, like this:
7730
7731@example
if (x < 0)
  @{
    printf ("x was negative\n");
    x = -x;
  @}
7737@end example
7738@end itemize
7739
7740This example (repeated from above) shows a nested block which serves
7741both purposes: it includes two statements (plus a declaration) in the
7742body of a @code{while} statement, and it provides the scope for the
7743declaration of @code{q}.
7744
7745@example
void
free_intlist (struct intlistlink *p)
@{
  while (p)
    @{
      struct intlistlink *q = p;
      p = p->next;
      free (q);
    @}
@}
7756@end example
7757
7758@node return Statement
7759@section @code{return} Statement
7760@cindex @code{return} statement
7761@cindex statement, @code{return}
7762@findex return
7763
7764The @code{return} statement makes the containing function return
7765immediately. It has two forms. This one specifies no value to
7766return:
7767
7768@example
7769return;
7770@end example
7771
7772@noindent
7773That form is meant for functions whose return type is @code{void}
7774(@pxref{The Void Type}). You can also use it in a function that
7775returns nonvoid data, but that's a bad idea, since it makes the
7776function return garbage.
7777
7778The form that specifies a value looks like this:
7779
7780@example
7781return @var{value};
7782@end example
7783
7784@noindent
7785which computes the expression @var{value} and makes the function
7786return that. If necessary, the value undergoes type conversion to
7787the function's declared return value type, which works like
7788assigning the value to a variable of that type.
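
As a minimal sketch of that conversion (the function name
@code{truncated_half} is invented for this illustration), the
expression below has type @code{double}, and returning it converts the
value to @code{int} just as an assignment to an @code{int} variable
would:

```c
/* Hypothetical example: the return expression has type double,
   but the declared return type is int.  */
int
truncated_half (int x)
{
  /* x / 2.0 is a double; returning it converts it to int,
     which discards the fractional part.  */
  return x / 2.0;
}
```

With this definition, @code{truncated_half (7)} returns 3, because the
conversion from 3.5 to @code{int} drops the fraction.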
7789
7790@node Loop Statements
7791@section Loop Statements
7792@cindex loop statements
7793@cindex statements, loop
7794@cindex iteration
7795
7796You can use a loop statement when you need to execute a series of
7797statements repeatedly, making an @dfn{iteration}. C provides several
7798different kinds of loop statements, described in the following
7799subsections.
7800
Every kind of loop statement is a complex statement because it contains
a component, here called @var{body}, which is a nested statement.
Most often the body is a block.
7804
7805@menu
7806* while Statement:: Loop as long as a test expression is true.
7807* do-while Statement:: Execute a loop once, with further looping
7808 as long as a test expression is true.
7809* break Statement:: End a loop immediately.
7810* for Statement:: Iterative looping.
7811* Example of for:: An example of iterative looping.
7812* Omitted for-Expressions:: for-loop expression options.
7813* for-Index Declarations:: for-loop declaration options.
7814* continue Statement:: Begin the next cycle of a loop.
7815@end menu
7816
7817@node while Statement
7818@subsection @code{while} Statement
7819@cindex @code{while} statement
7820@cindex statement, @code{while}
7821@findex while
7822
7823The @code{while} statement is the simplest loop construct.
7824It looks like this:
7825
7826@example
7827while (@var{test})
7828 @var{body}
7829@end example
7830
7831Here, @var{body} is a statement (often a nested block) to repeat, and
7832@var{test} is the test expression that controls whether to repeat it again.
7833Each iteration of the loop starts by computing @var{test} and, if it
7834is true (nonzero), that means the loop should execute @var{body} again
7835and then start over.
7836
7837Here's an example of advancing to the last structure in a chain of
7838structures chained through the @code{next} field:
7839
7840@example
7841#include <stddef.h> /* @r{Defines @code{NULL}.} */
7842@r{@dots{}}
7843while (chain->next != NULL)
7844 chain = chain->next;
7845@end example
7846
7847@noindent
This code assumes the chain isn't empty to start with; if the chain is
empty (that is, if @code{chain} is a null pointer), the code typically gets a
@code{SIGSEGV} signal trying to dereference that null pointer (@pxref{Signals}).
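
One way to guard against the empty chain is to test for it before the
loop.  The following is only a sketch: the structure type
@code{struct link} and the function name @code{last_link} are
assumptions made for this illustration.

```c
#include <stddef.h>   /* Defines NULL.  */

/* Hypothetical structure type, assumed for this sketch.  */
struct link
{
  struct link *next;
  int value;
};

/* Return the last structure in CHAIN, or NULL if CHAIN is empty.  */
struct link *
last_link (struct link *chain)
{
  /* Handle the empty chain first, so the loop below never
     dereferences a null pointer.  */
  if (chain == NULL)
    return NULL;
  while (chain->next != NULL)
    chain = chain->next;
  return chain;
}
```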
7851
7852@node do-while Statement
7853@subsection @code{do-while} Statement
7854@cindex @code{do}--@code{while} statement
7855@cindex statement, @code{do}--@code{while}
7856@findex do
7857
7858The @code{do}--@code{while} statement is a simple loop construct that
7859performs the test at the end of the iteration.
7860
7861@example
7862do
7863 @var{body}
7864while (@var{test});
7865@end example
7866
7867Here, @var{body} is a statement (possibly a block) to repeat, and
7868@var{test} is an expression that controls whether to repeat it again.
7869
7870Each iteration of the loop starts by executing @var{body}. Then it
7871computes @var{test} and, if it is true (nonzero), that means to go
7872back and start over with @var{body}. If @var{test} is false (zero),
7873then the loop stops repeating and execution moves on past it.
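
A @code{do}--@code{while} loop fits cases where the body must run at
least once.  Here is a small sketch (the function @code{count_digits}
is invented for this illustration) that counts decimal digits; even
zero has one digit, so testing before the first iteration would give
the wrong answer.

```c
/* Hypothetical example: count the decimal digits of N.
   The body must run at least once, because even 0 has one digit.  */
int
count_digits (unsigned int n)
{
  int digits = 0;
  do
    {
      digits++;
      n /= 10;
    }
  while (n != 0);
  return digits;
}
```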
7874
7875@node break Statement
7876@subsection @code{break} Statement
7877@cindex @code{break} statement
7878@cindex statement, @code{break}
7879@findex break
7880
7881The @code{break} statement looks like @samp{break;}. Its effect is to
7882exit immediately from the innermost loop construct or @code{switch}
7883statement (@pxref{switch Statement}).
7884
7885For example, this loop advances @code{p} until the next null
7886character or newline.
7887
7888@example
while (*p)
  @{
    /* @r{End loop if we have reached a newline.} */
    if (*p == '\n')
      break;
    p++;
  @}
7896@end example
7897
7898When there are nested loops, the @code{break} statement exits from the
7899innermost loop containing it.
7900
7901@example
struct list_if_tuples
@{
  struct list_if_tuples *next;
  int length;
  data *contents;
@};

void
process_all_elements (struct list_if_tuples *list)
@{
  int i;

  while (list)
    @{
      /* @r{Process all the elements in this node's vector,}
         @r{stopping when we reach one that is null.} */
      for (i = 0; i < list->length; i++)
        @{
          /* @r{Null element terminates this node's vector.} */
          if (list->contents[i] == NULL)
            /* @r{Exit the @code{for} loop.} */
            break;
          /* @r{Operate on the next element.} */
          process_element (list->contents[i]);
        @}

      list = list->next;
    @}
@}
7929@end example
7930
The only way in C to exit directly from an outer loop, short of
returning from the function, is with
@code{goto} (@pxref{goto Statement}).
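
Here is a sketch of leaving two nested loops at once with
@code{goto}.  The function @code{matrix_contains} and the label
@code{done} are names invented for this illustration.

```c
/* Hypothetical example: return 1 if the ROWS-by-3 matrix M contains
   TARGET, and 0 otherwise.  */
int
matrix_contains (int rows, int m[][3], int target)
{
  int found = 0;
  for (int i = 0; i < rows; i++)
    for (int j = 0; j < 3; j++)
      if (m[i][j] == target)
        {
          found = 1;
          /* A break here would exit only the inner loop;
             goto leaves both loops.  */
          goto done;
        }
 done:
  return found;
}
```

In a function this small, writing @code{return 1;} inside the loops
would do the same job; the @code{goto} form matters more when there is
further work to do after the loops.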
7933
7934@node for Statement
7935@subsection @code{for} Statement
7936@cindex @code{for} statement
7937@cindex statement, @code{for}
7938@findex for
7939
7940A @code{for} statement uses three expressions written inside a
7941parenthetical group to define the repetition of the loop. The first
7942expression says how to prepare to start the loop. The second says how
7943to test, before each iteration, whether to continue looping. The
7944third says how to advance, at the end of an iteration, for the next
7945iteration. All together, it looks like this:
7946
7947@example
7948for (@var{start}; @var{continue-test}; @var{advance})
7949 @var{body}
7950@end example
7951
7952The first thing the @code{for} statement does is compute @var{start}.
7953The next thing it does is compute the expression @var{continue-test}.
7954If that expression is false (zero), the @code{for} statement finishes
7955immediately, so @var{body} is executed zero times.
7956
7957However, if @var{continue-test} is true (nonzero), the @code{for}
7958statement executes @var{body}, then @var{advance}. Then it loops back
7959to the not-quite-top to test @var{continue-test} again. But it does
7960not compute @var{start} again.
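
The order of evaluation described above can be spelled out with a
@code{while} loop.  The two functions below are a sketch (their names
are invented for this illustration); they compute the same sum, and
the comments in the second one mark where each part of the @code{for}
statement ends up.  The correspondence holds as long as the body
contains no @code{continue} statement.

```c
/* Hypothetical example: sum the integers from 1 to N with for.  */
int
sum_with_for (int n)
{
  int sum = 0;
  int i;
  for (i = 1; i <= n; i++)
    sum += i;
  return sum;
}

/* The same computation, spelled out step by step with while.  */
int
sum_with_while (int n)
{
  int sum = 0;
  int i;
  i = 1;                /* start: computed once */
  while (i <= n)        /* continue-test: before each iteration */
    {
      sum += i;         /* body */
      i++;              /* advance: after the body */
    }
  return sum;
}
```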
7961
7962@node Example of for
7963@subsection Example of @code{for}
7964
7965Here is the @code{for} statement from the iterative Fibonacci
7966function:
7967
7968@example
7969int i;
7970for (i = 1; i < n; ++i)
7971 /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
7972 /* @r{since @code{i < n} is false the first time.} */
7973 @{
7974 /* @r{Now @var{last} is @code{fib (@var{i})}}
7975 @r{and @var{prev} is @code{fib (@var{i} @minus{} 1)}.} */
7976 /* @r{Compute @code{fib (@var{i} + 1)}.} */
7977 int next = prev + last;
7978 /* @r{Shift the values down.} */
7979 prev = last;
7980 last = next;
7981 /* @r{Now @var{last} is @code{fib (@var{i} + 1)}}
7982 @r{and @var{prev} is @code{fib (@var{i})}.}
7983 @r{But that won't stay true for long,}
7984 @r{because we are about to increment @var{i}.} */
7985 @}
7986@end example
7987
In this example, @var{start} is @code{i = 1}, meaning set @code{i} to
1.  @var{continue-test} is @code{i < n}, meaning keep repeating the
loop as long as @code{i} is less than @code{n}.  @var{advance} is
@code{++i}, meaning increment @code{i} by 1.  The body is a block
that contains a declaration and two statements.
7993
7994@node Omitted for-Expressions
7995@subsection Omitted @code{for}-Expressions
7996
7997A fully-fleshed @code{for} statement contains all these parts,
7998
7999@example
8000for (@var{start}; @var{continue-test}; @var{advance})
8001 @var{body}
8002@end example
8003
8004@noindent
8005but you can omit any of the three expressions inside the parentheses.
8006The parentheses and the two semicolons are required syntactically, but
8007the expressions between them may be missing. A missing expression
8008means this loop doesn't use that particular feature of the @code{for}
8009statement.
8010
8011Instead of using @var{start}, you can do the loop preparation
8012before the @code{for} statement: the effect is the same. So we
8013could have written the beginning of the previous example this way:
8014
8015@example
int i = 1;
for (; i < n; ++i)
@end example

@noindent
instead of this way:

@example
int i;
for (i = 1; i < n; ++i)
8026@end example
8027
8028Omitting @var{continue-test} means the loop runs forever (or until
8029something else causes exit from it). Statements inside the loop can
8030test conditions for termination and use @samp{break;} to exit. This
8031is more flexible since you can put those tests anywhere in the loop,
8032not solely at the beginning.
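
As a sketch of a loop whose only exit test sits in the middle of the
body, the function below (its name and its sentinel convention are
assumptions made for this illustration) adds up array elements until
it meets a negative one.

```c
/* Hypothetical example: add up the elements of VALUES, stopping at
   the first negative element.  The caller must make sure the array
   contains a negative element.  */
int
sum_until_negative (const int *values)
{
  int sum = 0;
  int i;
  for (i = 0;; i++)           /* no continue-test */
    {
      int v = values[i];      /* work done before the test */
      if (v < 0)
        break;                /* exit test in the middle of the body */
      sum += v;               /* work done after the test */
    }
  return sum;
}
```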
8033
Putting an expression in @var{advance} is almost equivalent to writing
it at the end of the loop body.  The only difference is for the
@code{continue} statement (@pxref{continue Statement}), which skips the
rest of the body but still executes @var{advance}.  So we could have
written this:
8038
8039@example
8040for (i = 0; i < n;)
8041 @{
8042 @r{@dots{}}
8043 ++i;
8044 @}
8045@end example
8046
8047@noindent
8048instead of this:
8049
8050@example
8051for (i = 0; i < n; ++i)
8052 @{
8053 @r{@dots{}}
8054 @}
8055@end example
8056
8057The choice is mainly a matter of what is more readable for
8058programmers. However, there is also a syntactic difference:
8059@var{advance} is an expression, not a statement. It can't include
8060loops, blocks, declarations, etc.
8061
8062@node for-Index Declarations
8063@subsection @code{for}-Index Declarations
8064
8065You can declare loop-index variables directly in the @var{start}
8066portion of the @code{for}-loop, like this:
8067
8068@example
8069for (int i = 0; i < n; ++i)
8070 @{
8071 @r{@dots{}}
8072 @}
8073@end example
8074
8075This kind of @var{start} is limited to a single declaration; it can
8076declare one or more variables, separated by commas, all of which are
8077the same @var{basetype} (@code{int}, in this example):
8078
8079@example
8080for (int i = 0, j = 1, *p = NULL; i < n; ++i, ++j, ++p)
8081 @{
8082 @r{@dots{}}
8083 @}
8084@end example
8085
8086@noindent
8087The scope of these variables is the @code{for} statement as a whole.
See @ref{Variable Declarations} for an explanation of @var{basetype}.
8089
Variables declared in @code{for} statements should have initializers.
Omitting the initializer gives the variable an unpredictable initial
value, so this code is erroneous:
8093
8094@example
8095for (int i; i < n; ++i)
8096 @{
8097 @r{@dots{}}
8098 @}
8099@end example
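
Returning to the scope rule stated earlier in this subsection, here is
a minimal sketch (the function name @code{index_scope_demo} is
invented for this illustration).  The @code{i} declared in the
@code{for} statement is a separate variable that exists only within
that statement, so the outer @code{i} keeps its value.

```c
/* Hypothetical example: the loop index declared in the for
   statement does not disturb the outer variable of the same name.  */
int
index_scope_demo (int n)
{
  int i = 100;                  /* outer variable */
  int count = 0;
  for (int i = 0; i < n; ++i)   /* a distinct i, local to the for */
    count++;
  /* Here i again means the outer variable, still 100.  */
  return i + count;
}
```

Reusing a name this way is legal but easy to misread; the sketch does
it only to make the scope boundary visible.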
8100
8101@node continue Statement
8102@subsection @code{continue} Statement
8103@cindex @code{continue} statement
8104@cindex statement, @code{continue}
8105@findex continue
8106
8107The @code{continue} statement looks like @samp{continue;}, and its
8108effect is to jump immediately to the end of the innermost loop
8109construct. If it is a @code{for}-loop, the next thing that happens
8110is to execute the loop's @var{advance} expression.
8111
8112For example, this loop increments @code{p} until the next null character
8113or newline, and operates (in some way not shown) on all the characters
8114in the line except for spaces. All it does with spaces is skip them.
8115
8116@example
8117for (;*p; ++p)
8118 @{
8119 /* @r{End loop if we have reached a newline.} */
8120 if (*p == '\n')
8121 break;
8122 /* @r{Pay no attention to spaces.} */
8123 if (*p == ' ')
8124 continue;
8125 /* @r{Operate on the next character.} */
8126 @r{@dots{}}
8127 @}
8128@end example
8129
8130@noindent
Executing @samp{continue;} skips the rest of the loop body, but it does
not skip the @var{advance} expression, @code{++p}.
8133
8134We could also write it like this:
8135
8136@example
8137for (;*p; ++p)
8138 @{
8139 /* @r{Exit if we have reached a newline.} */
8140 if (*p == '\n')
8141 break;
8142 /* @r{Pay no attention to spaces.} */
8143 if (*p != ' ')
8144 @{
8145 /* @r{Operate on the next character.} */
8146 @r{@dots{}}
8147 @}
8148 @}
8149@end example
8150
8151The advantage of using @code{continue} is that it reduces the
8152depth of nesting.
8153
8154Contrast @code{continue} with the @code{break} statement. @xref{break
8155Statement}.
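
Here is a complete function in the same spirit, offered as a sketch
(the name @code{count_nonspaces} is invented for this illustration).
It counts the characters of the first line of a string, leaving out
spaces.

```c
/* Hypothetical example: count the characters in the first line of S,
   not counting spaces.  */
int
count_nonspaces (const char *s)
{
  int count = 0;
  for (const char *p = s; *p; ++p)
    {
      /* End loop if we have reached a newline.  */
      if (*p == '\n')
        break;
      /* Skip spaces: jump to ++p, then to the test.  */
      if (*p == ' ')
        continue;
      count++;
    }
  return count;
}
```

If this were a @code{while} loop with the increment written at the end
of the body, @code{continue} would skip the increment too, and the
loop would spin forever on the first space.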
8156
8157@node switch Statement
8158@section @code{switch} Statement
8159@cindex @code{switch} statement
8160@cindex statement, @code{switch}
8161@findex switch
8162@findex case
8163@findex default
8164
8165The @code{switch} statement selects code to run according to the value
8166of an expression. The expression, in parentheses, follows the keyword
8167@code{switch}. After that come all the cases to select among,
8168inside braces. It looks like this:
8169
8170@example
8171switch (@var{selector})
8172 @{
8173 @var{cases}@r{@dots{}}
8174 @}
8175@end example
8176
8177A case can look like this:
8178
8179@example
8180case @var{value}:
8181 @var{statements}
8182 break;
8183@end example
8184
8185@noindent
8186which means ``come here if @var{selector} happens to have the value
8187@var{value},'' or like this (a GNU C extension):
8188
8189@example
8190case @var{rangestart} ... @var{rangeend}:
8191 @var{statements}
8192 break;
8193@end example
8194
8195@noindent
8196which means ``come here if @var{selector} happens to have a value
8197between @var{rangestart} and @var{rangeend} (inclusive).'' @xref{Case
8198Ranges}.
8199
8200The values in @code{case} labels must reduce to integer constants.
8201They can use arithmetic, and @code{enum} constants, but they cannot
8202refer to data in memory, because they have to be computed at compile
8203time. It is an error if two @code{case} labels specify the same
8204value, or ranges that overlap, or if one is a range and the other is a
8205value in that range.
8206
8207You can also define a default case to handle ``any other value,'' like
8208this:
8209
8210@example
8211default:
8212 @var{statements}
8213 break;
8214@end example
8215
8216If the @code{switch} statement has no @code{default:} label, then it
8217does nothing when the value matches none of the cases.
8218
8219The brace-group inside the @code{switch} statement is a block, and you
8220can declare variables with that scope just as in any other block
8221(@pxref{Blocks}). However, initializers in these declarations won't
8222necessarily be executed every time the @code{switch} statement runs,
8223so it is best to avoid giving them initializers.
8224
8225@code{break;} inside a @code{switch} statement exits immediately from
8226the @code{switch} statement. @xref{break Statement}.
8227
8228If there is no @code{break;} at the end of the code for a case,
8229execution continues into the code for the following case. This
8230happens more often by mistake than intentionally, but since this
8231feature is used in real code, we cannot eliminate it.
8232
8233@strong{Warning:} When one case is intended to fall through to the
8234next, write a comment like @samp{falls through} to say it's
8235intentional. That way, other programmers won't assume it was an error
8236and ``fix'' it erroneously.
8237
Consecutive @code{case} labels could, pedantically, be considered
an instance of falling through, but we don't consider or treat them that
way because they won't confuse anyone.
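
The pieces described in this section fit together as in the following
sketch.  The function @code{days_in_month} is invented for this
illustration; it shows single-value cases, consecutive @code{case}
labels sharing one body, and a @code{default} case.

```c
/* Hypothetical example: return the number of days in MONTH (1 to 12)
   of a non-leap year, or -1 if MONTH is out of range.  */
int
days_in_month (int month)
{
  int days;
  switch (month)
    {
    case 2:
      days = 28;
      break;

    /* Consecutive case labels share the statements that follow.  */
    case 4:
    case 6:
    case 9:
    case 11:
      days = 30;
      break;

    /* Any other value comes here.  */
    default:
      if (month < 1 || month > 12)
        days = -1;
      else
        days = 31;
      break;
    }
  return days;
}
```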
8241
8242@node switch Example
8243@section Example of @code{switch}
8244
8245Here's an example of using the @code{switch} statement
8246to distinguish among characters:
8247
8248@cindex counting vowels and punctuation
8249@example
8250struct vp @{ int vowels, punct; @};
8251
8252struct vp
8253count_vowels_and_punct (char *string)
8254@{
8255 int c;
8256 int vowels = 0;
8257 int punct = 0;
8258 /* @r{Don't change the parameter itself.} */
8259 /* @r{That helps in debugging.} */
8260 char *p = string;
8261 struct vp value;
8262
8263 while (c = *p++)
8264 switch (c)
8265 @{
8266 case 'y':
8267 case 'Y':
8268 /* @r{We assume @code{y_is_consonant} will check surrounding
8269 letters to determine whether this y is a vowel.} */
8270 if (y_is_consonant (p - 1))
8271 break;
8272
8273 /* @r{Falls through} */
8274
8275 case 'a':
8276 case 'e':
8277 case 'i':
8278 case 'o':
8279 case 'u':
8280 case 'A':
8281 case 'E':
8282 case 'I':
8283 case 'O':
8284 case 'U':
8285 vowels++;
8286 break;
8287
8288 case '.':
8289 case ',':
8290 case ':':
8291 case ';':
8292 case '?':
8293 case '!':
8294 case '\"':
8295 case '\'':
8296 punct++;
8297 break;
8298 @}
8299
8300 value.vowels = vowels;
8301 value.punct = punct;
8302
8303 return value;
8304@}
8305@end example
8306
8307@node Duffs Device
8308@section Duff's Device
8309@cindex Duff's device
8310
8311The cases in a @code{switch} statement can be inside other control
8312constructs. For instance, we can use a technique known as @dfn{Duff's
8313device} to optimize this simple function,
8314
8315@example
8316void
8317copy (char *to, char *from, int count)
8318@{
8319 while (count > 0)
8320 *to++ = *from++, count--;
8321@}
8322@end example
8323
8324@noindent
8325which copies memory starting at @var{from} to memory starting at
8326@var{to}.
8327
8328Duff's device involves unrolling the loop so that it copies
8329several characters each time around, and using a @code{switch} statement
8330to enter the loop body at the proper point:
8331
8332@example
8333void
8334copy (char *to, char *from, int count)
8335@{
8336 if (count <= 0)
8337 return;
8338 int n = (count + 7) / 8;
8339 switch (count % 8)
8340 @{
8341 do @{
8342 case 0: *to++ = *from++;
8343 case 7: *to++ = *from++;
8344 case 6: *to++ = *from++;
8345 case 5: *to++ = *from++;
8346 case 4: *to++ = *from++;
8347 case 3: *to++ = *from++;
8348 case 2: *to++ = *from++;
8349 case 1: *to++ = *from++;
8350 @} while (--n > 0);
8351 @}
8352@}
8353@end example
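
To see why this works, trace a call with @code{count} equal to 11.
Then @code{n} is (11 + 7) / 8 = 2 and @code{count % 8} is 3, so the
@code{switch} jumps to @code{case 3} and the first, partial pass
copies 3 characters.  Decrementing @code{n} leaves 1, so the loop runs
one more full pass of 8 characters, for a total of 11.

The sketch below repeats the function under the name
@code{duff_copy} and adds a checking function,
@code{duff_copy_works}; both names, and the 32-character buffer
size, are assumptions made for this illustration.

```c
#include <string.h>

/* Duff's device as shown above, under a different name.  */
void
duff_copy (char *to, const char *from, int count)
{
  if (count <= 0)
    return;
  int n = (count + 7) / 8;
  switch (count % 8)
    {
      do {
    case 0: *to++ = *from++;
    case 7: *to++ = *from++;
    case 6: *to++ = *from++;
    case 5: *to++ = *from++;
    case 4: *to++ = *from++;
    case 3: *to++ = *from++;
    case 2: *to++ = *from++;
    case 1: *to++ = *from++;
      } while (--n > 0);
    }
}

/* Hypothetical checker: return 1 if duff_copy copies exactly COUNT
   characters (0 <= COUNT <= 32) and leaves the next one untouched.  */
int
duff_copy_works (int count)
{
  char from[32], to[33];
  for (int i = 0; i < 32; i++)
    from[i] = 'a' + i % 26;
  memset (to, '#', sizeof to);
  duff_copy (to, from, count);
  return memcmp (to, from, count) == 0 && to[count] == '#';
}
```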
8354
8355@node Case Ranges
8356@section Case Ranges
8357@cindex case ranges
8358@cindex ranges in case statements
8359
8360You can specify a range of consecutive values in a single @code{case} label,
8361like this:
8362
8363@example
8364case @var{low} ... @var{high}:
8365@end example
8366
8367@noindent
8368This has the same effect as the proper number of individual @code{case}
8369labels, one for each integer value from @var{low} to @var{high}, inclusive.
8370
8371This feature is especially useful for ranges of ASCII character codes:
8372
8373@example
8374case 'A' ... 'Z':
8375@end example
8376
@strong{Be careful:} with integers, write spaces around the @code{...};
otherwise the periods are taken as part of the adjacent numbers and the
label is parsed wrong.  For example, write this:
8379
8380@example
8381case 1 ... 5:
8382@end example
8383
8384@noindent
8385rather than this:
8386
8387@example
8388case 1...5:
8389@end example
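
Here is a sketch of case ranges in a complete function.  The function
name @code{classify_char} and its one-letter result codes are
invented for this illustration, and the ranges assume a character set,
such as ASCII, in which the letters are contiguous.  Since case ranges
are a GNU C extension, other compilers may reject this code.

```c
/* Hypothetical example using the GNU C case-range extension.
   Return 'U' for an uppercase ASCII letter, 'L' for a lowercase
   one, 'D' for a digit, and '?' for anything else.  */
char
classify_char (char c)
{
  switch (c)
    {
    case 'A' ... 'Z':
      return 'U';
    case 'a' ... 'z':
      return 'L';
    case '0' ... '9':
      return 'D';
    default:
      return '?';
    }
}
```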
8390
8391@node Null Statement
8392@section Null Statement
8393@cindex null statement
8394@cindex statement, null
8395
8396A @dfn{null statement} is just a semicolon. It does nothing.
8397
8398A null statement is a placeholder for use where a statement is
8399grammatically required, but there is nothing to be done. For
8400instance, sometimes all the work of a @code{for}-loop is done in the
8401@code{for}-header itself, leaving no work for the body. Here is an
8402example that searches for the first newline in @code{array}:
8403
8404@example
8405for (p = array; *p != '\n'; p++)
8406 ;
8407@end example
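
The same pattern gives a compact way to measure a string.  This is a
sketch only; the function name @code{string_length} is invented for
this illustration, and real code would normally call the library
function @code{strlen}.

```c
/* Hypothetical example: return the length of the string S.
   All the work happens in the for-header, so the body is a
   null statement.  */
int
string_length (const char *s)
{
  const char *p;
  for (p = s; *p != '\0'; p++)
    ;
  return (int) (p - s);
}
```

Putting the semicolon on a line of its own, as here, makes it plain to
the reader that the empty body is intentional.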
8408
8409@node goto Statement
8410@section @code{goto} Statement and Labels
8411@cindex @code{goto} statement
8412@cindex statement, @code{goto}
8413@cindex label
8414@findex goto
8415
8416The @code{goto} statement looks like this:
8417
8418@example
8419goto @var{label};
8420@end example
8421
8422@noindent
8423Its effect is to transfer control immediately to another part of the
8424current function---where the label named @var{label} is defined.
8425
8426An ordinary label definition looks like this:
8427
8428@example
8429@var{label}:
8430@end example
8431
8432@noindent
8433and it can appear before any statement. You can't use @code{default}
8434as a label, since that has a special meaning for @code{switch}
8435statements.
8436
8437An ordinary label doesn't need a separate declaration; defining it is
8438enough.
8439
8440Here's an example of using @code{goto} to implement a loop
8441equivalent to @code{do}--@code{while}:
8442
8443@example
8444@{
8445 loop_restart:
8446 @var{body}
8447 if (@var{condition})
8448 goto loop_restart;
8449@}
8450@end example
8451
8452The name space of labels is separate from that of variables and functions.
8453Thus, there is no error in using a single name in both ways:
8454
8455@example
8456@{
8457 int foo; // @r{Variable @code{foo}.}
8458 foo: // @r{Label @code{foo}.}
8459 @var{body}
8460 if (foo > 0) // @r{Variable @code{foo}.}
8461 goto foo; // @r{Label @code{foo}.}
8462@}
8463@end example
8464
8465Blocks have no effect on ordinary labels; each label name is defined
8466throughout the whole of the function it appears in. It looks strange to
8467jump into a block with @code{goto}, but it works. For example,
8468
8469@example
8470if (x < 0)
8471 goto negative;
8472if (y < 0)
8473 @{
8474 negative:
8475 printf ("Negative\n");
8476 return;
8477 @}
8478@end example
8479
If the @code{goto} statement jumps into the scope of a variable, it
does not initialize the variable.  For example, if @code{x} is negative,
8482
8483@example
8484if (x < 0)
8485 goto negative;
8486if (y < 0)
8487 @{
8488 int i = 5;
8489 negative:
8490 printf ("Negative, and i is %d\n", i);
8491 return;
8492 @}
8493@end example
8494
8495@noindent
8496prints junk because @code{i} was not initialized.
8497
8498If the block declares a variable-length automatic array, jumping into
8499it gives a compilation error. However, jumping out of the scope of a
8500variable-length array works fine, and deallocates its storage.
8501
8502A label can't come directly before a declaration, so the code can't
8503jump directly to one. For example, this is not allowed:
8504
8505@example
8506@{
8507 goto foo;
8508foo:
8509 int x = 5;
8510 bar(&x);
8511@}
8512@end example
8513
8514@noindent
8515The workaround is to add a statement, even an empty statement,
8516directly after the label. For example:
8517
8518@example
8519@{
8520 goto foo;
8521foo:
8522 ;
8523 int x = 5;
8524 bar(&x);
8525@}
8526@end example
8527
Likewise, a label can't be the last thing in a block.  The workaround
is the same: add a semicolon after the label.
8530
8531These unnecessary restrictions on labels make no sense, and ought in
8532principle to be removed; but they do only a little harm since labels
8533and @code{goto} are rarely the best way to write a program.
8534
8535These examples are all artificial; it would be more natural to
8536write them in other ways, without @code{goto}. For instance,
8537the clean way to write the example that prints @samp{Negative} is this:
8538
8539@example
8540if (x < 0 || y < 0)
8541 @{
8542 printf ("Negative\n");
8543 return;
8544 @}
8545@end example
8546
8547@noindent
8548It is hard to construct simple examples where @code{goto} is actually
8549the best way to write a program. Its rare good uses tend to be in
8550complex code, thus not apt for the purpose of explaining the meaning
8551of @code{goto}.
8552
8553The only good time to use @code{goto} is when it makes the code
8554simpler than any alternative. Jumping backward is rarely desirable,
8555because usually the other looping and control constructs give simpler
8556code. Using @code{goto} to jump forward is more often desirable, for
8557instance when a function needs to do some processing in an error case
8558and errors can occur at various different places within the function.
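
The following sketch shows that forward-jumping pattern.  The function
@code{duplicate_twice} and the label @code{fail} are names invented
for this illustration; the point is that every error exit shares one
piece of cleanup code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: make two heap copies of the string S and
   store them in *A and *B.  Return 0 on success, -1 on failure.  */
int
duplicate_twice (const char *s, char **a, char **b)
{
  size_t size = strlen (s) + 1;
  char *first = NULL;
  char *second = NULL;

  first = malloc (size);
  if (first == NULL)
    goto fail;
  second = malloc (size);
  if (second == NULL)
    goto fail;

  memcpy (first, s, size);
  memcpy (second, s, size);
  *a = first;
  *b = second;
  return 0;

 fail:
  /* free (NULL) does nothing, so this cleanup is safe
     no matter which allocation failed.  */
  free (first);
  free (second);
  return -1;
}
```

Without @code{goto}, each failure test would need its own copy of the
cleanup code, and the copies would have to be kept in step as the
function grows.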
8559
8560@node Local Labels
8561@section Locally Declared Labels
8562@cindex local labels
8563@cindex macros, local labels
8564@findex __label__
8565
8566In GNU C you can declare @dfn{local labels} in any nested block
8567scope. A local label is used in a @code{goto} statement just like an
8568ordinary label, but you can only reference it within the block in
8569which it was declared.
8570
8571A local label declaration looks like this:
8572
8573@example
8574__label__ @var{label};
8575@end example
8576
8577@noindent
8578or
8579
8580@example
8581__label__ @var{label1}, @var{label2}, @r{@dots{}};
8582@end example
8583
8584Local label declarations must come at the beginning of the block,
8585before any ordinary declarations or statements.
8586
8587The label declaration declares the label @emph{name}, but does not define
8588the label itself. That's done in the usual way, with
8589@code{@var{label}:}, before one of the statements in the block.
8590
8591The local label feature is useful for complex macros. If a macro
8592contains nested loops, a @code{goto} can be useful for breaking out of
8593them. However, an ordinary label whose scope is the whole function
8594cannot be used: if the macro can be expanded several times in one
8595function, the label will be multiply defined in that function. A
8596local label avoids this problem. For example:
8597
8598@example
#define SEARCH(value, array, target)              \
do @{                                              \
  __label__ found;                                \
  __auto_type _SEARCH_target = (target);          \
  __auto_type _SEARCH_array = (array);            \
  int i, j;                                       \
  for (i = 0; i < max; i++)                       \
    for (j = 0; j < max; j++)                     \
      if (_SEARCH_array[i][j] == _SEARCH_target)  \
        @{ (value) = i; goto found; @}              \
  (value) = -1;                                   \
 found:;                                          \
@} while (0)
8613@end example
8614
8615This could also be written using a statement expression
8616(@pxref{Statement Exprs}):
8617
8618@example
8619#define SEARCH(array, target) \
8620(@{ \
8621 __label__ found; \
8622 __auto_type _SEARCH_target = (target); \
8623 __auto_type _SEARCH_array = (array); \
8624 int i, j; \
8625 int value; \
8626 for (i = 0; i < max; i++) \
8627 for (j = 0; j < max; j++) \
8628 if (_SEARCH_array[i][j] == _SEARCH_target) \
8629 @{ value = i; goto found; @} \
8630 value = -1; \
8631 found: \
8632 value; \
8633@})
8634@end example
8635
8636Ordinary labels are visible throughout the function where they are
8637defined, and only in that function. However, explicitly declared
8638local labels of a block are visible in nested functions declared
8639within that block. @xref{Nested Functions}, for details.
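
To show the point about multiple expansions, here is a simplified,
self-contained sketch.  The macro @code{FIND_ROW} and the function
@code{rows_of_both} are invented for this illustration; unlike the
macros above, this one takes the array dimensions as arguments and,
for brevity, may evaluate its operands more than once.  It relies on
the GNU C @code{__label__} extension.

```c
/* Hypothetical macro: set VALUE to the row of the first element of
   the ROWS-by-COLS array ARRAY that equals TARGET, or to -1 if there
   is none.  The local label lets the macro be expanded more than
   once in the same function.  */
#define FIND_ROW(value, array, rows, cols, target)      \
do {                                                    \
  __label__ found;                                      \
  (value) = -1;                                         \
  for (int i_ = 0; i_ < (rows); i_++)                   \
    for (int j_ = 0; j_ < (cols); j_++)                 \
      if ((array)[i_][j_] == (target))                  \
        { (value) = i_; goto found; }                   \
 found:;                                                \
} while (0)

/* Use the macro twice in one function.  With an ordinary label,
   the second expansion would define the label a second time.  */
int
rows_of_both (int a[2][2], int x, int y)
{
  int rx, ry;
  FIND_ROW (rx, a, 2, 2, x);    /* first expansion */
  FIND_ROW (ry, a, 2, 2, y);    /* second expansion, same label name */
  return rx * 10 + ry;
}
```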
8640
8641@xref{goto Statement}.
8642
8643@node Labels as Values
8644@section Labels as Values
8645@cindex labels as values
8646@cindex computed gotos
8647@cindex goto with computed label
8648@cindex address of a label
8649
8650In GNU C, you can get the address of a label defined in the current
8651function (or a local label defined in the containing function) with
8652the unary operator @samp{&&}. The value has type @code{void *}. This
8653value is a constant and can be used wherever a constant of that type
8654is valid. For example:
8655
8656@example
8657void *ptr;
8658@r{@dots{}}
8659ptr = &&foo;
8660@end example
8661
8662To use these values requires a way to jump to one. This is done
8663with the computed goto statement@footnote{The analogous feature in
8664Fortran is called an assigned goto, but that name seems inappropriate in
8665C, since you can do more with label addresses than store them in special label
8666variables.}, @code{goto *@var{exp};}. For example,
8667
8668@example
8669goto *ptr;
8670@end example
8671
8672@noindent
8673Any expression of type @code{void *} is allowed.
8674
8675@xref{goto Statement}.
8676
8677@menu
8678* Label Value Uses:: Examples of using label values.
8679* Label Value Caveats:: Limitations of label values.
8680@end menu
8681
8682@node Label Value Uses
8683@subsection Label Value Uses
8684
8685One use for label-valued constants is to initialize a static array to
8686serve as a jump table:
8687
8688@example
8689static void *array[] = @{ &&foo, &&bar, &&hack @};
8690@end example
8691
8692Then you can select a label with indexing, like this:
8693
8694@example
8695goto *array[i];
8696@end example
8697
8698@noindent
8699Note that this does not check whether the subscript is in bounds---array
8700indexing in C never checks that.
8701
8702You can make the table entries offsets instead of addresses
8703by subtracting one label from the others. Here is an example:
8704
8705@example
8706static const int array[] = @{ &&foo - &&foo, &&bar - &&foo,
8707 &&hack - &&foo @};
8708goto *(&&foo + array[i]);
8709@end example
8710
8711@noindent
8712Using offsets is preferable in shared libraries, as it avoids the need
8713for dynamic relocation of the array elements; therefore, the array can
8714be read-only.
8715
8716An array of label values or offsets serves a purpose much like that of
8717the @code{switch} statement. The @code{switch} statement is cleaner,
8718so use @code{switch} by preference when feasible.
8719
8720Another use of label values is in an interpreter for threaded code.
8721The labels within the interpreter function can be stored in the
8722threaded code for super-fast dispatching.
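
As a rough sketch of that kind of dispatching, here is a tiny
interpreter.  Everything in it is an assumption made for this
illustration: the opcode names, the function @code{run_program}, and
the rule that a program is an array of opcodes ending with
@code{OP_HALT}.  Like the indexing example above, it does not check
that each opcode is in bounds.  It relies on the GNU C extensions
described in this section.

```c
/* Hypothetical opcodes; their values index the jump table below.  */
enum { OP_HALT, OP_INC, OP_DEC, OP_DOUBLE };

/* Run PROGRAM, an array of opcodes ending with OP_HALT, and return
   the final value of the accumulator.  */
int
run_program (const unsigned char *program)
{
  /* Jump table: one label address per opcode, in opcode order.  */
  static void *dispatch[]
    = { &&op_halt, &&op_inc, &&op_dec, &&op_double };
  int acc = 0;
  const unsigned char *pc = program;

  /* Jump to the code for the first instruction.  */
  goto *dispatch[*pc++];

 op_inc:
  acc++;
  /* Each instruction ends by jumping straight to the next one.  */
  goto *dispatch[*pc++];

 op_dec:
  acc--;
  goto *dispatch[*pc++];

 op_double:
  acc *= 2;
  goto *dispatch[*pc++];

 op_halt:
  return acc;
}
```

A @code{switch} statement inside a loop would express the same
interpreter more cleanly; the computed @code{goto} form is worth
considering only when dispatch speed has been measured to matter.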
8723
8724@node Label Value Caveats
8725@subsection Label Value Caveats
8726
8727Jumping to a label defined in another function does not work.
8728It can cause unpredictable results.
8729
8730The best way to avoid this is to store label values only in
8731automatic variables, or static variables whose names are declared
8732within the function. Never pass them as arguments.
8733
8734@cindex cloning
8735An optimization known as @dfn{cloning} generates multiple simplified
8736variants of a function's code, for use with specific fixed arguments.
8737Using label values in certain ways, such as saving the address in one
8738call to the function and using it again in another call, would make cloning
8739give incorrect results. These functions must disable cloning.
8740
8741Inlining calls to the function would also result in multiple copies of
8742the code, each with its own value of the same label. Using the label
8743in a computed goto is no problem, because the computed goto inhibits
8744inlining. However, using the label value in some other way, such as
8745an indication of where an error occurred, would be optimized wrong.
8746These functions must disable inlining.
8747
8748To prevent inlining or cloning of a function, specify
8749@code{__attribute__((__noinline__,__noclone__))} in its definition.
8750@xref{Attributes}.
8751
8752When a function uses a label value in a static variable initializer,
8753that automatically prevents inlining or cloning the function.
8754
8755@node Statement Exprs
8756@section Statements and Declarations in Expressions
8757@cindex statements inside expressions
8758@cindex declarations inside expressions
8759@cindex expressions containing statements
8760
8761@c the above section title wrapped and causes an underfull hbox.. i
8762@c changed it from "within" to "in". --mew 4feb93
8763A block enclosed in parentheses can be used as an expression in GNU
8764C@. This provides a way to use local variables, loops and switches within
8765an expression. We call it a @dfn{statement expression}.
8766
8767Recall that a block is a sequence of statements
8768surrounded by braces. In this construct, parentheses go around the
8769braces. For example:
8770
8771@example
8772(@{ int y = foo (); int z;
8773 if (y > 0) z = y;
8774 else z = - y;
8775 z; @})
8776@end example
8777
8778@noindent
8779is a valid (though slightly more complex than necessary) expression
8780for the absolute value of @code{foo ()}.
8781
The last statement in the block should be an expression statement;
that is, an expression followed by a semicolon. The value of this
expression serves as the value of the statement expression. If the
last statement is anything else, the statement expression has type
@code{void} and thus has no value.
8787
8788This feature is mainly useful in making macro definitions compute each
8789operand exactly once. @xref{Macros and Auto Type}.
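
As a minimal sketch of that use, the hypothetical macro @code{SQUARE}
below copies its operand into a local variable inside a statement
expression, so an operand with side effects is evaluated only once.
The helper @code{next_value}, its counter @code{calls}, and
@code{square_next} are likewise assumptions made for illustration.

@example
static int calls = 0;

/* @r{Hypothetical helper with a visible side effect.} */
static int
next_value (void)
@{
  return ++calls + 2;
@}

/* @r{Evaluate the operand once, then multiply the saved copy.} */
#define SQUARE(x) (@{ int _x = (x); _x * _x; @})

int
square_next (void)
@{
  return SQUARE (next_value ());
@}
@end example

@noindent
A plain @code{((x) * (x))} macro would call @code{next_value} twice
in @code{square_next}.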
8790
8791Statement expressions are not allowed in expressions that must be
8792constant, such as the value for an enumerator, the width of a
8793bit-field, or the initial value of a static variable.
8794
8795Jumping into a statement expression---with @code{goto}, or using a
8796@code{switch} statement outside the statement expression---is an
8797error. With a computed @code{goto} (@pxref{Labels as Values}), the
8798compiler can't detect the error, but it still won't work.
8799
8800Jumping out of a statement expression is permitted, but since
8801subexpressions in C are not computed in a strict order, it is
8802unpredictable which other subexpressions will have been computed by
8803then. For example,
8804
8805@example
  foo (), ((@{ bar1 (); goto a; 0; @}) + bar2 ()), baz ();
8807@end example
8808
8809@noindent
8810calls @code{foo} and @code{bar1} before it jumps, and never
8811calls @code{baz}, but may or may not call @code{bar2}. If @code{bar2}
8812does get called, that occurs after @code{foo} and before @code{bar1}.
8813
8814@node Variables
8815@chapter Variables
8816@cindex variables
8817
8818Every variable used in a C program needs to be made known by a
8819@dfn{declaration}. It can be used only after it has been declared.
8820It is an error to declare a variable name more than once in the same
8821scope; an exception is that @code{extern} declarations and tentative
8822definitions can coexist with another declaration of the same
8823variable.
8824
8825Variables can be declared anywhere within a block or file. (Older
8826versions of C required that all variable declarations within a block
8827occur before any statements.)
8828
8829Variables declared within a function or block are @dfn{local} to
8830it. This means that the variable name is visible only until the end
8831of that function or block, and the memory space is allocated only
8832while control is within it.
8833
Variables declared at the top level in a file are called @dfn{file-scope} variables.
8835They are assigned fixed, distinct memory locations, so they retain
8836their values for the whole execution of the program.
8837
8838@menu
* Variable Declarations:: Name a variable and reserve space for it.
* Initializers:: Assigning initial values to variables.
8841* Designated Inits:: Assigning initial values to array elements
8842 at particular array indices.
8843* Auto Type:: Obtaining the type of a variable.
8844* Local Variables:: Variables declared in function definitions.
8845* File-Scope Variables:: Variables declared outside of
8846 function definitions.
8847* Static Local Variables:: Variables declared within functions,
8848 but with permanent storage allocation.
8849* Extern Declarations:: Declaring a variable
8850 which is allocated somewhere else.
8851* Allocating File-Scope:: When is space allocated
8852 for file-scope variables?
8853* auto and register:: Historically used storage directions.
8854* Omitting Types:: The bad practice of declaring variables
8855 with implicit type.
8856@end menu
8857
8858@node Variable Declarations
8859@section Variable Declarations
8860@cindex variable declarations
8861@cindex declaration of variables
8862
8863Here's what a variable declaration looks like:
8864
8865@example
8866@var{keywords} @var{basetype} @var{decorated-variable} @r{[}= @var{init}@r{]};
8867@end example
8868
8869The @var{keywords} specify how to handle the scope of the variable
8870name and the allocation of its storage. Most declarations have
8871no keywords because the defaults are right for them.
8872
8873C allows these keywords to come before or after @var{basetype}, or
8874even in the middle of it as in @code{unsigned static int}, but don't
8875do that---it would surprise other programmers. Always write the
8876keywords first.
8877
8878The @var{basetype} can be any of the predefined types of C, or a type
8879keyword defined with @code{typedef}. It can also be @code{struct
8880@var{tag}}, @code{union @var{tag}}, or @code{enum @var{tag}}. In
8881addition, it can include type qualifiers such as @code{const} and
8882@code{volatile} (@pxref{Type Qualifiers}).
8883
8884In the simplest case, @var{decorated-variable} is just the variable
8885name. That declares the variable with the type specified by
8886@var{basetype}. For instance,
8887
8888@example
8889int foo;
8890@end example
8891
8892@noindent
8893uses @code{int} as the @var{basetype} and @code{foo} as the
8894@var{decorated-variable}. It declares @code{foo} with type
8895@code{int}.
8896
8897@example
8898struct tree_node foo;
8899@end example
8900
8901@noindent
8902declares @code{foo} with type @code{struct tree_node}.
8903
8904@menu
8905* Declaring Arrays and Pointers:: Declaration syntax for variables of
8906 array and pointer types.
8907* Combining Variable Declarations:: More than one variable declaration
8908 in a single statement.
8909@end menu
8910
8911@node Declaring Arrays and Pointers
8912@subsection Declaring Arrays and Pointers
8913@cindex declaring arrays and pointers
8914@cindex array, declaring
8915@cindex pointers, declaring
8916
8917To declare a variable that is an array, write
8918@code{@var{variable}[@var{length}]} for @var{decorated-variable}:
8919
8920@example
8921int foo[5];
8922@end example
8923
8924To declare a variable that has a pointer type, write
8925@code{*@var{variable}} for @var{decorated-variable}:
8926
8927@example
8928struct list_elt *foo;
8929@end example
8930
8931These constructs nest. For instance,
8932
8933@example
8934int foo[3][5];
8935@end example
8936
8937@noindent
8938declares @code{foo} as an array of 3 arrays of 5 integers each,
8939
8940@example
8941struct list_elt *foo[5];
8942@end example
8943
8944@noindent
8945declares @code{foo} as an array of 5 pointers to structures, and
8946
8947@example
8948struct list_elt **foo;
8949@end example
8950
8951@noindent
8952declares @code{foo} as a pointer to a pointer to a structure.
8953
8954@example
8955int **(*foo[30])(int, double);
8956@end example
8957
8958@noindent
declares @code{foo} as an array of 30 pointers to functions
(@pxref{Function Pointers}), each of which must accept two arguments
(one @code{int} and one @code{double}) and return a value of type
@code{int **}.
8962
8963@example
8964void
8965bar (int size)
8966@{
8967 int foo[size];
8968 @r{@dots{}}
8969@}
8970@end example
8971
8972@noindent
8973declares @code{foo} as an array of integers with a size specified at
8974run time when the function @code{bar} is called.
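
One way to check how these declarators nest is to compare object sizes
with @code{sizeof}. The following sketch uses hypothetical names
(@code{grid}, @code{table}, @code{handlers}, and a one-field
@code{struct list_elt}) chosen only for illustration.

@example
#include <stddef.h>

struct list_elt @{ struct list_elt *next; @};

int grid[3][5];                       /* @r{3 arrays of 5 integers.} */
struct list_elt *table[5];            /* @r{5 pointers to structures.} */
int **(*handlers[30]) (int, double);  /* @r{30 pointers to functions.} */

size_t
grid_rows (void)
@{
  return sizeof grid / sizeof grid[0];
@}

size_t
grid_columns (void)
@{
  return sizeof grid[0] / sizeof grid[0][0];
@}

size_t
handler_count (void)
@{
  return sizeof handlers / sizeof handlers[0];
@}
@end example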
8975
8976@node Combining Variable Declarations
8977@subsection Combining Variable Declarations
8978@cindex combining variable declarations
8979@cindex variable declarations, combining
8980@cindex declarations, combining
8981
8982When multiple declarations have the same @var{keywords} and
8983@var{basetype}, you can combine them using commas. Thus,
8984
8985@example
8986@var{keywords} @var{basetype}
8987 @var{decorated-variable-1} @r{[}= @var{init1}@r{]},
8988 @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
8989@end example
8990
8991@noindent
8992is equivalent to
8993
8994@example
8995@var{keywords} @var{basetype}
8996 @var{decorated-variable-1} @r{[}= @var{init1}@r{]};
8997@var{keywords} @var{basetype}
8998 @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
8999@end example
9000
9001Here are some simple examples:
9002
9003@example
9004int a, b;
9005int a = 1, b = 2;
9006int a, *p, array[5];
9007int a = 0, *p = &a, array[5] = @{1, 2@};
9008@end example
9009
9010@noindent
9011In the last two examples, @code{a} is an @code{int}, @code{p} is a
9012pointer to @code{int}, and @code{array} is an array of 5 @code{int}s.
9013Since the initializer for @code{array} specifies only two elements,
9014the other three elements are initialized to zero.
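
A point worth keeping in mind when combining declarations is that the
@samp{*} and @samp{[]} decorations belong to each individual variable,
not to the @var{basetype}. The sketch below uses hypothetical names
(@code{value}, @code{p}, @code{q}, @code{sum_p_and_q}) to show the
effect.

@example
int value = 7;
int *p = &value, q = 3;   /* @r{Only p is a pointer; q is a plain int.} */

int
sum_p_and_q (void)
@{
  return *p + q;
@}
@end example

@noindent
To declare two pointers in one declaration, write the @samp{*} on
each of them, as in @code{int *p, *q;}.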
9015
9016@node Initializers
9017@section Initializers
9018@cindex initializers
9019
9020A variable's declaration, unless it is @code{extern}, should also
9021specify its initial value. For numeric and pointer-type variables,
9022the initializer is an expression for the value. If necessary, it is
9023converted to the variable's type, just as in an assignment.
9024
9025You can also initialize a local structure-type (@pxref{Structures}) or
9026local union-type (@pxref{Unions}) variable this way, from an
9027expression whose value has the same type. But you can't initialize an
9028array this way (@pxref{Arrays}), since arrays are not first-class
9029objects in C (@pxref{Limitations of C Arrays}) and there is no array
9030assignment.
9031
9032You can initialize arrays and structures componentwise,
9033with a list of the elements or components. You can initialize
9034a union with any one of its alternatives.
9035
9036@itemize @bullet
9037@item
9038A component-wise initializer for an array consists of element values
9039surrounded by @samp{@{@r{@dots{}}@}}. If the values in the initializer
9040don't cover all the elements in the array, the remaining elements are
9041initialized to zero.
9042
9043You can omit the size of the array when you declare it, and let
9044the initializer specify the size:
9045
9046@example
9047int array[] = @{ 3, 9, 12 @};
9048@end example
9049
9050@item
9051A component-wise initializer for a structure consists of field values
9052surrounded by @samp{@{@r{@dots{}}@}}. Write the field values in the same
9053order as the fields are declared in the structure. If the values in
9054the initializer don't cover all the fields in the structure, the
9055remaining fields are initialized to zero.
9056
9057@item
9058The initializer for a union-type variable has the form @code{@{
9059@var{value} @}}, where @var{value} initializes the @emph{first alternative}
9060in the union definition.
9061@end itemize
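
The three cases above can be sketched as follows. The type and
variable names (@code{struct point}, @code{union number},
@code{primes}, @code{partial}, @code{n}) are assumptions made for this
illustration.

@example
struct point @{ double x, y; @};
union number @{ int i; double d; @};

int primes[5] = @{ 2, 3, 5 @};     /* @r{primes[3] and primes[4] are 0.} */
struct point partial = @{ 1.5 @};  /* @r{x is 1.5; y is 0.} */
union number n = @{ 42 @};         /* @r{Initializes n.i, the first alternative.} */
@end example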
9062
9063For an array of arrays, a structure containing arrays, an array of
9064structures, etc., you can nest these constructs. For example,
9065
9066@example
9067struct point @{ double x, y; @};
9068
9069struct point series[]
9070 = @{ @{0, 0@}, @{1.5, 2.8@}, @{99, 100.0004@} @};
9071@end example
9072
9073You can omit a pair of inner braces if they contain the right
9074number of elements for the sub-value they initialize, so that
9075no elements or fields need to be filled in with zeros.
9076But don't do that very much, as it gets confusing.
9077
9078An array of @code{char} can be initialized using a string constant.
9079Recall that the string constant includes an implicit null character at
9080the end (@pxref{String Constants}). Using a string constant as
9081initializer means to use its contents as the initial values of the
9082array elements. Here are examples:
9083
9084@example
9085char text[6] = "text!"; /* @r{Includes the null.} */
9086char text[5] = "text!"; /* @r{Excludes the null.} */
9087char text[] = "text!"; /* @r{Gets length 6.} */
9088char text[]
9089 = @{ 't', 'e', 'x', 't', '!', 0 @}; /* @r{same as above.} */
9090char text[] = @{ "text!" @}; /* @r{Braces are optional.} */
9091@end example
9092
9093@noindent
9094and this kind of initializer can be nested inside braces to initialize
9095structures or arrays that contain a @code{char}-array.
9096
9097In like manner, you can use a wide string constant to initialize
9098an array of @code{wchar_t}.
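
As a small sketch of the resulting array sizes, the declarations below
use distinct hypothetical names so that they can all appear in one
file.

@example
#include <stddef.h>

char with_null[6] = "text!";     /* @r{Five characters plus the null.} */
char without_null[5] = "text!";  /* @r{Five characters, no null.} */
char counted[] = "text!";        /* @r{sizeof counted is 6.} */
wchar_t wide[] = L"text!";       /* @r{Six wchar_t elements.} */
@end example

@noindent
Since @code{without_null} has no terminating null character, it is
not a string, and must not be passed to functions such as
@code{strlen} that expect one.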
9099
9100@node Designated Inits
9101@section Designated Initializers
9102@cindex initializers with labeled elements
9103@cindex labeled elements in initializers
9104@cindex case labels in initializers
9105@cindex designated initializers
9106
9107In a complex structure or long array, it's useful to indicate
9108which field or element we are initializing.
9109
9110To designate specific array elements during initialization, include
9111the array index in brackets, and an assignment operator, for each
9112element:
9113
9114@example
9115int foo[10] = @{ [3] = 42, [7] = 58 @};
9116@end example
9117
9118@noindent
9119This does the same thing as:
9120
9121@example
9122int foo[10] = @{ 0, 0, 0, 42, 0, 0, 0, 58, 0, 0 @};
9123@end example
9124
9125The array initialization can include non-designated element values
9126alongside designated indices; these follow the expected ordering
9127of the array initialization, so that
9128
9129@example
9130int foo[10] = @{ [3] = 42, 43, 44, [7] = 58 @};
9131@end example
9132
9133@noindent
9134does the same thing as:
9135
9136@example
9137int foo[10] = @{ 0, 0, 0, 42, 43, 44, 0, 58, 0, 0 @};
9138@end example
9139
9140Note that you can only use constant expressions as array index values,
9141not variables.
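
Enumeration constants are constant expressions, so they can serve as
designated indices. The following sketch, with the hypothetical names
@code{enum color} and @code{color_name}, keeps a table of names in
step with an enumeration regardless of the order in which the entries
are written.

@example
enum color @{ RED, GREEN, BLUE, COLOR_COUNT @};

const char *color_name[COLOR_COUNT] = @{
  [BLUE]  = "blue",
  [RED]   = "red",
  [GREEN] = "green"
@};
@end example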
9142
9143If you need to initialize a subsequence of sequential array elements to
9144the same value, you can specify a range:
9145
9146@example
9147int foo[100] = @{ [0 ... 19] = 42, [20 ... 99] = 43 @};
9148@end example
9149
9150@noindent
9151Using a range this way is a GNU C extension.
9152
9153When subsequence ranges overlap, each element is initialized by the
9154last specification that applies to it. Thus, this initialization is
9155equivalent to the previous one.
9156
9157@example
9158int foo[100] = @{ [0 ... 99] = 43, [0 ... 19] = 42 @};
9159@end example
9160
9161@noindent
9162as the second overrides the first for elements 0 through 19.
9163
9164The value used to initialize a range of elements is evaluated only
9165once, for the first element in the range. So for example, this code
9166
9167@example
9168int random_values[100]
9169 = @{ [0 ... 99] = get_random_number() @};
9170@end example
9171
9172@noindent
9173would initialize all 100 elements of the array @code{random_values} to
9174the same value---probably not what is intended.
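
If each element should get its own value, a loop is a straightforward
alternative to the range initializer. In this sketch,
@code{next_number} is a hypothetical stand-in for
@code{get_random_number}, and @code{COUNT}, @code{values} and
@code{fill_values} are likewise names assumed for illustration.

@example
#define COUNT 100

static int calls = 0;

/* @r{Hypothetical stand-in: returns a new value on each call.} */
static int
next_number (void)
@{
  return ++calls;
@}

int values[COUNT];

void
fill_values (void)
@{
  for (int i = 0; i < COUNT; i++)
    values[i] = next_number ();   /* @r{Called once per element.} */
@}
@end example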
9175
9176Similarly, you can initialize specific fields of a structure variable
9177by specifying the field name prefixed with a dot:
9178
9179@example
9180struct point @{ int x; int y; @};
9181
struct point foo = @{ .y = 42 @};
9183@end example
9184
9185@noindent
9186The same syntax works for union variables as well:
9187
9188@example
9189union int_double @{ int i; double d; @};
9190
9191union int_double foo = @{ .d = 34 @};
9192@end example
9193
9194@noindent
This converts the integer value 34 to @code{double} and stores it
in the @code{d} alternative of the union variable @code{foo}.
9197
9198You can designate both array elements and structure elements in
9199the same initialization; for example, here's an array of point
9200structures:
9201
9202@example
9203struct point point_array[10] = @{ [4].y = 32, [6].y = 39 @};
9204@end example
9205
9206Along with the capability to specify particular array and structure
9207elements to initialize comes the possibility of initializing the same
9208element more than once:
9209
9210@example
9211int foo[10] = @{ [4] = 42, [4] = 98 @};
9212@end example
9213
9214@noindent
9215In such a case, the last initialization value is retained.
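
Fields and elements that no designator mentions are initialized to
zero, just as with ordinary brace initializers. The sketch below
checks this with the hypothetical variables @code{corner} and
@code{path}.

@example
struct point @{ int x; int y; @};

struct point corner = @{ .y = 42 @};   /* @r{corner.x is 0.} */

/* @r{Only path[4].y and path[6].y are nonzero.} */
struct point path[10] = @{ [4].y = 32, [6].y = 39 @};
@end example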
9216
9217@node Auto Type
9218@section Referring to a Type with @code{__auto_type}
9219@findex __auto_type
9220@findex typeof
9221@cindex macros, types of arguments
9222
You can declare a variable whose type is copied from its
initializer by using @code{__auto_type} instead of a particular type.
9225Here's an example:
9226
9227@example
9228#define max(a,b) \
9229 (@{ __auto_type _a = (a); \
9230 __auto_type _b = (b); \
9231 _a > _b ? _a : _b @})
9232@end example
9233
9234This defines @code{_a} to be of the same type as @code{a}, and
9235@code{_b} to be of the same type as @code{b}. This is a useful thing
9236to do in a macro that ought to be able to handle any type of data
9237(@pxref{Macros and Auto Type}).
9238
9239The original GNU C method for obtaining the type of a value is to use
9240@code{typeof}, which takes as an argument either a value or the name of
9241a type. The previous example could also be written as:
9242
9243@example
9244#define max(a,b) \
9245 (@{ typeof(a) _a = (a); \
9246 typeof(b) _b = (b); \
9247 _a > _b ? _a : _b @})
9248@end example
9249
9250@code{typeof} is more flexible than @code{__auto_type}; however, the
9251principal use case for @code{typeof} is in variable declarations with
9252initialization, which is exactly what @code{__auto_type} handles.
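
One case where the extra flexibility matters is a variable whose type
should come from some expression other than its initializer. In the
following sketch, the hypothetical macro @code{ARRAY_SUM} gives its
accumulator the element type of the array, even though the initializer
@code{0} is an @code{int}. The function @code{total_weight} is an
assumed caller, for illustration only.

@example
#define ARRAY_SUM(array, count)               \
  (@{ typeof ((array)[0]) _sum = 0;            \
     for (int _i = 0; _i < (count); _i++)     \
       _sum += (array)[_i];                   \
     _sum; @})

double
total_weight (void)
@{
  double weights[] = @{ 0.5, 1.25, 2.0 @};
  return ARRAY_SUM (weights, 3);
@}
@end example

@noindent
With @code{__auto_type _sum = 0;} the accumulator would have type
@code{int}, and the fractional parts would be lost.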
9253
9254@node Local Variables
9255@section Local Variables
9256@cindex local variables
9257@cindex variables, local
9258
9259Declaring a variable inside a function definition (@pxref{Function
9260Definitions}) makes the variable name @dfn{local} to the containing
9261block---that is, the containing pair of braces. More precisely, the
9262variable's name is visible starting just after where it appears in the
9263declaration, and its visibility continues until the end of the block.
9264
9265Local variables in C are generally @dfn{automatic} variables: each
9266variable's storage exists only from the declaration to the end of the
9267block. Execution of the declaration allocates the storage, computes
9268the initial value, and stores it in the variable. The end of the
9269block deallocates the storage.@footnote{Due to compiler optimizations,
9270allocation and deallocation don't necessarily really happen at
9271those times.}
9272
9273@strong{Warning:} Two declarations for the same local variable
9274in the same scope are an error.
9275
9276@strong{Warning:} Automatic variables are stored in the run-time stack.
9277The total space for the program's stack may be limited; therefore,
9278in using very large arrays, it may be necessary to allocate
9279them in some other way to stop the program from crashing.
9280
9281@strong{Warning:} If the declaration of an automatic variable does not
9282specify an initial value, the variable starts out containing garbage.
9283In this example, the value printed could be anything at all:
9284
9285@example
9286@{
9287 int i;
9288
9289 printf ("Print junk %d\n", i);
9290@}
9291@end example
9292
9293In a simple test program, that statement is likely to print 0, simply
9294because every process starts with memory zeroed. But don't rely on it
9295to be zero---that is erroneous.
9296
9297@strong{Note:} Make sure to store a value into each local variable (by
9298assignment, or by initialization) before referring to its value.
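
For the very large arrays mentioned in the stack warning above, one
common alternative is to allocate the space dynamically. This is a
minimal sketch under that assumption; the function name
@code{process_samples} and its return convention are hypothetical.

@example
#include <stdlib.h>

/* @r{Return 0 on success, or -1 if the allocation failed.} */
int
process_samples (size_t count)
@{
  /* @r{calloc checks the size computation and zeroes the memory.} */
  double *samples = calloc (count, sizeof *samples);
  if (samples == NULL)
    return -1;

  for (size_t i = 0; i < count; i++)
    samples[i] = (double) i;   /* @r{Store a value before any use.} */

  free (samples);
  return 0;
@}
@end example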
9299
9300@node File-Scope Variables
9301@section File-Scope Variables
9302@cindex file-scope variables
9303@cindex global variables
9304@cindex variables, file-scope
9305@cindex variables, global
9306
9307A variable declaration at the top level in a file (not inside a
9308function definition) declares a @dfn{file-scope variable}. Loading a
9309program allocates the storage for all the file-scope variables in it,
9310and initializes them too.
9311
9312Each file-scope variable is either @dfn{static} (limited to one
9313compilation module) or @dfn{global} (shared with all compilation
9314modules in the program). To make the variable static, write the
9315keyword @code{static} at the start of the declaration. Omitting
9316@code{static} makes the variable global.
9317
9318The initial value for a file-scope variable can't depend on the
9319contents of storage, and can't call any functions.
9320
9321@example
9322int foo = 5; /* @r{Valid.} */
9323int bar = foo; /* @r{Invalid!} */
9324int bar = sin (1.0); /* @r{Invalid!} */
9325@end example
9326
9327But it can use the address of another file-scope variable:
9328
9329@example
9330int foo;
9331int *bar = &foo; /* @r{Valid.} */
9332int arr[5];
9333int *bar3 = &arr[3]; /* @r{Valid.} */
9334int *bar4 = arr + 4; /* @r{Valid.} */
9335@end example
9336
9337It is valid for a module to have multiple declarations for a
9338file-scope variable, as long as they are all global or all static, but
9339at most one declaration can specify an initial value for it.
9340
9341@node Static Local Variables
9342@section Static Local Variables
9343@cindex static local variables
9344@cindex variables, static local
9345@findex static
9346
9347The keyword @code{static} in a local variable declaration says to
9348allocate the storage for the variable permanently, just like a
9349file-scope variable, even if the declaration is within a function.
9350
9351Here's an example:
9352
9353@example
9354int
9355increment_counter ()
9356@{
9357 static int counter = 0;
9358 return ++counter;
9359@}
9360@end example
9361
9362The scope of the name @code{counter} runs from the declaration to the
9363end of the containing block, just like an automatic local variable,
9364but its storage is permanent, so the value persists from one call to
9365the next. As a result, each call to @code{increment_counter}
9366returns a different, unique value.
9367
9368The initial value of a static local variable has the same limitations
9369as for file-scope variables: it can't depend on the contents of
9370storage or call any functions. It can use the address of a file-scope
9371variable or a static local variable, because those addresses are
9372determined before the program runs.
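
As a brief sketch of such an initializer, the hypothetical function
@code{current_limit} below initializes a static local pointer with the
address of the file-scope variable @code{default_limit} (also a name
assumed for illustration).

@example
int default_limit = 10;

int *
current_limit (void)
@{
  /* @r{Valid: the address is known before the program runs.} */
  static int *limit = &default_limit;
  return limit;
@}
@end example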
9373
9374@node Extern Declarations
9375@section @code{extern} Declarations
9376@cindex @code{extern} declarations
9377@cindex declarations, @code{extern}
9378@findex extern
9379
9380An @code{extern} declaration is used to refer to a global variable
9381whose principal declaration comes elsewhere---in the same module, or in
9382another compilation module. It looks like this:
9383
9384@example
9385extern @var{basetype} @var{decorated-variable};
9386@end example
9387
9388Its meaning is that, in the current scope, the variable name refers to
9389the file-scope variable of that name---which needs to be declared in a
9390non-@code{extern}, non-@code{static} way somewhere else.
9391
9392For instance, if one compilation module has this global variable
9393declaration
9394
9395@example
9396int error_count = 0;
9397@end example
9398
9399@noindent
9400then other compilation modules can specify this
9401
9402@example
9403extern int error_count;
9404@end example
9405
9406@noindent
9407to allow reference to the same variable.
9408
9409The usual place to write an @code{extern} declaration is at top level
9410in a source file, but you can write an @code{extern} declaration
9411inside a block to make a global or static file-scope variable
9412accessible in that block.
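
Here is a single-file sketch of an @code{extern} declaration inside a
block. The function name @code{report_error} is hypothetical; in a
real program the definition of @code{error_count} might well be in
another compilation module.

@example
int
report_error (void)
@{
  extern int error_count;   /* @r{Refers to the definition below.} */
  return ++error_count;
@}

int error_count = 0;
@end example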
9413
9414Since an @code{extern} declaration does not allocate space for the
9415variable, it can omit the size of an array:
9416
9417@example
9418extern int array[];
9419@end example
9420
9421You can use @code{array} normally in all contexts where it is
9422converted automatically to a pointer. However, to use it as the
9423operand of @code{sizeof} is an error, since the size is unknown.
9424
9425It is valid to have multiple @code{extern} declarations for the same
9426variable, even in the same scope, if they give the same type. They do
not conflict---they agree. For an array, it is legitimate for some
@code{extern} declarations to specify the size while others omit it.
9429However, if two declarations give different sizes, that is an error.
9430
9431Likewise, you can use @code{extern} declarations at file scope
9432(@pxref{File-Scope Variables}) followed by an ordinary global
9433(non-static) declaration of the same variable. They do not conflict,
9434because they say compatible things about the same meaning of the variable.
9435
9436@node Allocating File-Scope
9437@section Allocating File-Scope Variables
@cindex allocating file-scope variables
9439@cindex file-scope variables, allocating
9440
9441Some file-scope declarations allocate space for the variable, and some
9442don't.
9443
A file-scope declaration with an initial value @emph{must} allocate
space for the variable; if there are two such declarations for the
same variable, even in different compilation modules, they conflict.
9447
9448An @code{extern} declaration @emph{never} allocates space for the variable.
9449If all the top-level declarations of a certain variable are
9450@code{extern}, the variable never gets memory space. If that variable
9451is used anywhere in the program, the use will be reported as an error,
9452saying that the variable is not defined.
9453
9454@cindex tentative definition
9455A file-scope declaration without an initial value is called a
9456@dfn{tentative definition}. This is a strange hybrid: it @emph{can}
9457allocate space for the variable, but does not insist. So it causes no
9458conflict, no error, if the variable has another declaration that
9459allocates space for it, perhaps in another compilation module. But if
9460nothing else allocates space for the variable, the tentative
9461definition will do it. Any number of compilation modules can declare
9462the same variable in this way, and that is sufficient for all of them
9463to use the variable.
9464
9465@c @opindex -fno-common
9466@c @opindex --warn_common
9467In programs that are very large or have many contributors, it may be
wise to adopt the convention of never using tentative definitions.
You can use the compilation option @option{-fno-common} to make them
an error, or the linker option @option{--warn-common} to warn about them.
9471
9472If a file-scope variable gets its space through a tentative
9473definition, it starts out containing all zeros.
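
The rules above can be sketched within a single file. The variable
names @code{request_count} and @code{retry_limit} are assumptions made
for this illustration.

@example
int request_count;     /* @r{Tentative definition.} */
int request_count;     /* @r{Another one; no conflict.} */

int retry_limit;       /* @r{Tentative definition.} */
int retry_limit = 3;   /* @r{This declaration allocates the space.} */
@end example

@noindent
Here @code{request_count} gets its space through a tentative
definition, so it starts out as zero, while @code{retry_limit} starts
out as 3.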
9474
9475@node auto and register
9476@section @code{auto} and @code{register}
9477@cindex @code{auto} declarations
9478@cindex @code{register} declarations
9479@findex auto
9480@findex register
9481
9482For historical reasons, you can write @code{auto} or @code{register}
9483before a local variable declaration. @code{auto} merely emphasizes
9484that the variable isn't static; it changes nothing.
9485
@code{register} suggests that the compiler store this variable in a
register. However, GNU C ignores this suggestion, since it can
choose the best variables to store in registers without any hints.
9489
9490It is an error to take the address of a variable declared
9491@code{register}, so you cannot use the unary @samp{&} operator on it.
9492If the variable is an array, you can't use it at all (other than as
9493the operand of @code{sizeof}), which makes it rather useless.
9494
9495@node Omitting Types
9496@section Omitting Types in Declarations
9497@cindex omitting types in declarations
9498
9499The syntax of C traditionally allows omitting the data type in a
9500declaration if it specifies a storage class, a type qualifier (see the
9501next chapter), or @code{auto} or @code{register}. Then the type
9502defaults to @code{int}. For example:
9503
9504@example
9505auto foo = 42;
9506@end example
9507
9508This is bad practice; if you see it, fix it.
9509
9510@node Type Qualifiers
9511@chapter Type Qualifiers
9512
9513A declaration can include type qualifiers to advise the compiler
9514about how the variable will be used. There are three different
9515qualifiers, @code{const}, @code{volatile} and @code{restrict}. They
9516pertain to different issues, so you can use more than one together.
9517For instance, @code{const volatile} describes a value that the
9518program is not allowed to change, but might have a different value
9519each time the program examines it. (This might perhaps be a special
9520hardware register, or part of shared memory.)
9521
9522If you are just learning C, you can skip this chapter.
9523
9524@menu
9525* const:: Variables whose values don't change.
9526* volatile:: Variables whose values may be accessed
9527 or changed outside of the control of
9528 this program.
9529* restrict Pointers:: Restricted pointers for code optimization.
9530* restrict Pointer Example:: Example of how that works.
9531@end menu
9532
9533@node const
9534@section @code{const} Variables and Fields
9535@cindex @code{const} variables and fields
9536@cindex variables, @code{const}
9537@findex const
9538
9539You can mark a variable as ``constant'' by writing @code{const} in
9540front of the declaration. This says to treat any assignment to that
9541variable as an error. It may also permit some compiler
9542optimizations---for instance, to fetch the value only once to satisfy
9543multiple references to it. The construct looks like this:
9544
9545@example
9546const double pi = 3.14159;
9547@end example
9548
9549After this definition, the code can use the variable @code{pi}
9550but cannot assign a different value to it.
9551
9552@example
9553pi = 3.0; /* @r{Error!} */
9554@end example
9555
9556Simple variables that are constant can be used for the same purposes
9557as enumeration constants, and they are not limited to integers. The
9558constantness of the variable propagates into pointers, too.
9559
9560A pointer type can specify that the @emph{target} is constant. For
9561example, the pointer type @code{const double *} stands for a pointer
to a constant @code{double}. That's the type that results from taking
9563the address of @code{pi}. Such a pointer can't be dereferenced in the
9564left side of an assignment.
9565
9566@example
9567*(&pi) = 3.0; /* @r{Error!} */
9568@end example
9569
9570Nonconstant pointers can be converted automatically to constant
9571pointers, but not vice versa. For instance,
9572
9573@example
9574const double *cptr;
9575double *ptr;
9576
cptr = &pi;    /* @r{Valid.} */
cptr = ptr;    /* @r{Valid.} */
ptr = cptr;    /* @r{Error!} */
ptr = &pi;     /* @r{Error!} */
9581@end example
9582
9583This is not an ironclad protection against modifying the value. You
9584can always cast the constant pointer to a nonconstant pointer type:
9585
9586@example
ptr = (double *)cptr;    /* @r{Valid.} */
ptr = (double *)&pi;     /* @r{Valid.} */
9589@end example
9590
9591However, @code{const} provides a way to show that a certain function
9592won't modify the data structure whose address is passed to it. Here's
9593an example:
9594
9595@example
9596int
9597string_length (const char *string)
9598@{
9599 int count = 0;
9600 while (*string++)
9601 count++;
9602 return count;
9603@}
9604@end example
9605
9606@noindent
9607Using @code{const char *} for the parameter is a way of saying this
9608function never modifies the memory of the string itself.
9609
9610In calling @code{string_length}, you can specify an ordinary
9611@code{char *} since that can be converted automatically to @code{const
9612char *}.
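
A short sketch of such a call follows. It repeats the definition of
@code{string_length} so that it stands on its own; the caller
@code{length_of_buffer} and its variables are hypothetical.

@example
int
string_length (const char *string)
@{
  int count = 0;
  while (*string++)
    count++;
  return count;
@}

int
length_of_buffer (void)
@{
  char buffer[] = "hello";
  char *p = buffer;           /* @r{An ordinary, nonconstant pointer.} */
  return string_length (p);   /* @r{Converted to const char *.} */
@}
@end example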
9613
9614@node volatile
9615@section @code{volatile} Variables and Fields
9616@cindex @code{volatile} variables and fields
9617@cindex variables, @code{volatile}
9618@findex volatile
9619
9620The GNU C compiler often performs optimizations that eliminate the
9621need to write or read a variable. For instance,
9622
9623@example
9624int foo;
9625foo = 1;
9626foo++;
9627@end example
9628
9629@noindent
9630might simply store the value 2 into @code{foo}, without ever storing 1.
9631These optimizations can also apply to structure fields in some cases.
9632
9633If the memory containing @code{foo} is shared with another program,
9634or if it is examined asynchronously by hardware, such optimizations
9635could confuse the communication. Using @code{volatile} is one way
9636to prevent them.
9637
9638Writing @code{volatile} with the type in a variable or field declaration
9639says that the value may be examined or changed for reasons outside the
9640control of the program at any moment. Therefore, the program must
9641execute in a careful way to assure correct interaction with those
9642accesses, whenever they may occur.
9643
9644The simplest use looks like this:
9645
9646@example
9647volatile int lock;
9648@end example
9649
9650This directs the compiler not to do certain common optimizations on
9651use of the variable @code{lock}. All the reads and writes for a volatile
9652variable or field are really done, and done in the order specified
9653by the source code. Thus, this code:
9654
9655@example
9656lock = 1;
9657list = list->next;
9658if (lock)
9659 lock_broken (&lock);
9660lock = 0;
9661@end example
9662
9663@noindent
9664really stores the value 1 in @code{lock}, even though there is no
9665sign it is really used, and the @code{if} statement reads and
9666checks the value of @code{lock}, rather than assuming it is still 1.
9667
9668A limited amount of optimization can be done, in principle, on
9669@code{volatile} variables and fields: multiple references between two
9670sequence points (@pxref{Sequence Points}) can be simplified together.
9671
9672Use of @code{volatile} does not eliminate the flexibility in ordering
9673the computation of the operands of most operators. For instance, in
9674@code{lock + foo ()}, the order of accessing @code{lock} and calling
9675@code{foo} is not specified, so they may be done in either order; the
9676fact that @code{lock} is @code{volatile} has no effect on that.
9677
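As a sketch of a value that changes for reasons outside the normal
flow of the program, the following code uses a flag that a signal
handler sets. The names @code{got_signal}, @code{note_signal} and
@code{signal_was_seen} are invented for this illustration, and the
choice of @code{SIGINT} is arbitrary. The flag has type
@code{volatile sig_atomic_t}, which standard C provides for flags
shared with signal handlers.

```c
#include <signal.h>

/* Hypothetical flag; the handler below sets it.  */
static volatile sig_atomic_t got_signal = 0;

/* Signal handler: record that the signal arrived.  */
static void
note_signal (int signum)
{
  (void) signum;
  got_signal = 1;
}

/* Install the handler, send the signal to this process,
   and return the value of the flag afterward
   (or -1 if the handler could not be installed).  */
int
signal_was_seen (void)
{
  got_signal = 0;
  if (signal (SIGINT, note_signal) == SIG_ERR)
    return -1;
  raise (SIGINT);
  /* Since got_signal is volatile, this read is really done.  */
  return got_signal;
}
```

In a real program the signal would arrive asynchronously rather than
from a call to @code{raise}; in that situation, @code{volatile} is
what keeps the compiler from assuming the flag is unchanged.
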
9678@node restrict Pointers
9679@section @code{restrict}-Qualified Pointers
9680@cindex @code{restrict} pointers
9681@cindex pointers, @code{restrict}-qualified
9682@findex restrict
9683
9684You can declare a pointer as ``restricted'' using the @code{restrict}
9685type qualifier, like this:
9686
9687@example
9688int *restrict p = x;
9689@end example
9690
9691@noindent
9692This enables better optimization of code that uses the pointer.
9693
9694If @code{p} is declared with @code{restrict}, and then the code
9695references the object that @code{p} points to (using @code{*p} or
9696@code{p[@var{i}]}), the @code{restrict} declaration promises that the
9697code will not access that object in any other way---only through
9698@code{p}.
9699
9700For instance, it means the code must not use another pointer
9701to access the same space, as shown here:
9702
9703@example
9704int *restrict p = @var{whatever};
9705int *q = p;
9706foo (*p, *q);
9707@end example
9708
9709@noindent
9710That contradicts the @code{restrict} promise by accessing the object
9711that @code{p} points to using @code{q}, which bypasses @code{p}.
9712Likewise, it must not do this:
9713
9714@example
9715int *restrict p = @var{whatever};
9716struct @{ int *a, *b; @} s;
9717s.a = p;
9718foo (*p, *s.a);
9719@end example
9720
9721@noindent
9722This example uses a structure field instead of the variable @code{q}
9723to hold the other pointer, and that contradicts the promise just the
9724same.
9725
The @code{restrict} promise also covers objects that are automatic or
static variables: if @code{p} points to such a variable, the code must
not access the variable directly by name. So the code must not do this:
9729
9730@example
9731int a;
9732int *restrict p = &a;
9733foo (*p, a);
9734@end example
9735
9736@noindent
9737because that does direct access to the object (@code{a}) that @code{p}
9738points to, which bypasses @code{p}.
9739
9740If the code makes such promises with @code{restrict} then breaks them,
9741execution is unpredictable.
9742
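By contrast, here is a minimal sketch of code that keeps the
@code{restrict} promise. The function name @code{add_into} and its
parameters are invented for illustration. Each array is accessed only
through its own pointer, so the compiler is free to load several
elements of @code{src} before storing into @code{dst}.

```c
#include <stddef.h>

/* Add each element of SRC to the corresponding element of DST.
   The restrict qualifiers promise that the two arrays
   do not overlap.  */
void
add_into (int *restrict dst, const int *restrict src, size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    dst[i] += src[i];
}
```

A call such as @code{add_into (a, a, 3)} would break the promise,
because the elements of @code{a} would then be modified through
@code{dst} and also read through @code{src}.
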
9743@node restrict Pointer Example
9744@section @code{restrict} Pointer Example
9745
9746Here are examples where @code{restrict} enables real optimization.
9747
9748In this example, @code{restrict} assures GCC that the array @code{out}
9749points to does not overlap with the array @code{in} points to.
9750
@example
#include <stddef.h>   /* @r{Defines @code{size_t}.} */

/* @r{@code{in} must have at least @code{size + 1} elements.} */
void
process_data (const char *in,
              char * restrict out,
              size_t size)
@{
  size_t i;

  for (i = 0; i < size; i++)
    out[i] = in[i] + in[i + 1];
@}
@end example
9761
9762Here's a simple tree structure, where each tree node holds data of
9763type @code{PAYLOAD} plus two subtrees.
9764
9765@example
9766struct foo
9767 @{
9768 PAYLOAD payload;
9769 struct foo *left;
9770 struct foo *right;
9771 @};
9772@end example
9773
9774Now here's a function to null out both pointers in the @code{left}
9775subtree.
9776
9777@example
9778void
9779null_left (struct foo *a)
9780@{
9781 a->left->left = NULL;
9782 a->left->right = NULL;
9783@}
9784@end example
9785
9786Since @code{*a} and @code{*a->left} have the same data type,
9787they could legitimately alias (@pxref{Aliasing}). Therefore,
9788the compiled code for @code{null_left} must read @code{a->left}
9789again from memory when executing the second assignment statement.
9790
We can enable optimization, so that it does not need to read
@code{a->left} again, by writing @code{null_left} in a less
obvious way.
9794
9795@example
9796void
9797null_left (struct foo *a)
9798@{
9799 struct foo *b = a->left;
9800 b->left = NULL;
9801 b->right = NULL;
9802@}
9803@end example
9804
9805A more elegant way to fix this is with @code{restrict}.
9806
9807@example
9808void
9809null_left (struct foo *restrict a)
9810@{
9811 a->left->left = NULL;
9812 a->left->right = NULL;
9813@}
9814@end example
9815
9816Declaring @code{a} as @code{restrict} asserts that other pointers such
9817as @code{a->left} will not point to the same memory space as @code{a}.
9818Therefore, the memory location @code{a->left->left} cannot be the same
9819memory as @code{a->left}. Knowing this, the compiled code may avoid
9820reloading @code{a->left} for the second statement.
9821
9822@node Functions
9823@chapter Functions
9824@cindex functions
9825
9826We have already presented many examples of functions, so if you've
9827read this far, you basically understand the concept of a function. It
9828is vital, nonetheless, to have a chapter in the manual that collects
9829all the information about functions.
9830
9831@menu
9832* Function Definitions:: Writing the body of a function.
9833* Function Declarations:: Declaring the interface of a function.
9834* Function Calls:: Using functions.
9835* Function Call Semantics:: Call-by-value argument passing.
9836* Function Pointers:: Using references to functions.
9837* The main Function:: Where execution of a GNU C program begins.
9838* Advanced Definitions:: Advanced features of function definitions.
9839* Obsolete Definitions:: Obsolete features still used
9840 in function definitions in old code.
9841@end menu
9842
9843@node Function Definitions
9844@section Function Definitions
9845@cindex function definitions
9846@cindex defining functions
9847
9848We have already presented many examples of function definitions. To
9849summarize the rules, a function definition looks like this:
9850
9851@example
9852@var{returntype}
9853@var{functionname} (@var{parm_declarations}@r{@dots{}})
9854@{
9855 @var{body}
9856@}
9857@end example
9858
9859The part before the open-brace is called the @dfn{function header}.
9860
9861Write @code{void} as the @var{returntype} if the function does
9862not return a value.
9863
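As a concrete instance of this pattern, here are three small
definitions. The names @code{cube}, @code{reset_counter},
@code{get_counter} and @code{counter} are made up for this
illustration.

```c
/* Return type int, one parameter.  */
int
cube (int x)
{
  return x * x * x;
}

/* A file-scope variable for the next two functions to use.  */
static int counter = 42;

/* Return type void: the function returns no value.  */
void
reset_counter (void)
{
  counter = 0;
}

/* Return the current value of counter.  */
int
get_counter (void)
{
  return counter;
}
```

In each definition, everything before the open-brace is the function
header.
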
9864@menu
9865* Function Parameter Variables:: Syntax and semantics
9866 of function parameters.
9867* Forward Function Declarations:: Functions can only be called after
9868 they have been defined or declared.
9869* Static Functions:: Limiting visibility of a function.
9870* Arrays as Parameters:: Functions that accept array arguments.
9871* Structs as Parameters:: Functions that accept structure arguments.
9872@end menu
9873
9874@node Function Parameter Variables
9875@subsection Function Parameter Variables
9876@cindex function parameter variables
9877@cindex parameter variables in functions
9878@cindex parameter list
9879
9880A function parameter variable is a local variable (@pxref{Local
9881Variables}) used within the function to store the value passed as an
9882argument in a call to the function. Usually we say ``function
9883parameter'' or ``parameter'' for short, not mentioning the fact that
9884it's a variable.
9885
9886We declare these variables in the beginning of the function
9887definition, in the @dfn{parameter list}. For example,
9888
9889@example
9890fib (int n)
9891@end example
9892
9893@noindent
9894has a parameter list with one function parameter @code{n}, which has
9895type @code{int}.
9896
9897Function parameter declarations differ from ordinary variable
9898declarations in several ways:
9899
9900@itemize @bullet
9901@item
9902Inside the function definition header, commas separate parameter
9903declarations, and each parameter needs a complete declaration
9904including the type. For instance, if a function @code{foo} has two
9905@code{int} parameters, write this:
9906
9907@example
9908foo (int a, int b)
9909@end example
9910
9911You can't share the common @code{int} between the two declarations:
9912
9913@example
9914foo (int a, b) /* @r{Invalid!} */
9915@end example
9916
9917@item
9918A function parameter variable is initialized to whatever value is
9919passed in the function call, so its declaration cannot specify an
9920initial value.
9921
9922@item
9923Writing an array type in a function parameter declaration has the
9924effect of declaring it as a pointer. The size specified for the array
9925has no effect at all, and we normally omit the size. Thus,
9926
9927@example
9928foo (int a[5])
9929foo (int a[])
9930foo (int *a)
9931@end example
9932
9933@noindent
9934are equivalent.
9935
9936@item
9937The scope of the parameter variables is the entire function body,
9938notwithstanding the fact that they are written in the function header,
9939which is just outside the function body.
9940@end itemize
9941
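To illustrate that a parameter is a local variable initialized from
the argument, and usable throughout the function body, here is a small
sketch. The function name @code{sum_to} is invented for the purpose.

```c
/* Return the sum 1 + 2 + ... + n (0 if n is not positive).
   The parameter n starts out holding the argument value;
   after that it is an ordinary local variable, so the
   function can freely modify it.  */
int
sum_to (int n)
{
  int total = 0;

  while (n > 0)
    {
      total += n;
      n--;        /* Changes only this function's n.  */
    }
  return total;
}
```

If the caller writes @code{sum_to (k)}, the value of @code{k} is
the same after the call as before, even though the function counts
its own @code{n} down to zero.
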
9942If a function has no parameters, it would be most natural for the
9943list of parameters in its definition to be empty. But that, in C, has
9944a special meaning for historical reasons: ``Do not check that calls to
9945this function have the right number of arguments.'' Thus,
9946
9947@example
9948int
9949foo ()
9950@{
9951 return 5;
9952@}
9953
9954int
9955bar (int x)
9956@{
9957 return foo (x);
9958@}
9959@end example
9960
9961@noindent
9962would not report a compilation error in passing @code{x} as an
9963argument to @code{foo}. By contrast,
9964
9965@example
9966int
9967foo (void)
9968@{
9969 return 5;
9970@}
9971
9972int
9973bar (int x)
9974@{
9975 return foo (x);
9976@}
9977@end example
9978
9979@noindent
9980would report an error because @code{foo} is supposed to receive
9981no arguments.
9982
9983@node Forward Function Declarations
9984@subsection Forward Function Declarations
9985@cindex forward function declarations
9986@cindex function declarations, forward
9987
9988The order of the function definitions in the source code makes no
9989difference, except that each function needs to be defined or declared
9990before code uses it.
9991
9992The definition of a function also declares its name for the rest of
9993the containing scope. But what if you want to call the function
9994before its definition? To permit that, write a compatible declaration
9995of the same function, before the first call. A declaration that
9996prefigures a subsequent definition in this way is called a
9997@dfn{forward declaration}. The function declaration can be at top
9998@c ??? file scope
9999level or within a block, and it applies until the end of the containing
10000scope.
10001
10002@xref{Function Declarations}, for more information about these
10003declarations.
10004
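Mutually recursive functions are a case where a forward declaration
cannot be avoided, since whichever function is defined first must call
the other. The following sketch uses two made-up functions,
@code{is_even} and @code{is_odd}; it is meant to show the declaration,
not to be an efficient parity test.

```c
/* Forward declaration: is_odd is defined below,
   but is_even needs to call it.  */
int is_odd (unsigned int n);

/* Return 1 if n is even, 0 otherwise.  */
int
is_even (unsigned int n)
{
  return n == 0 ? 1 : is_odd (n - 1);
}

/* Return 1 if n is odd, 0 otherwise.  */
int
is_odd (unsigned int n)
{
  return n == 0 ? 0 : is_even (n - 1);
}
```
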
10005@node Static Functions
10006@subsection Static Functions
10007@cindex static functions
10008@cindex functions, static
10009@findex static
10010
10011The keyword @code{static} in a function definition limits the
10012visibility of the name to the current compilation module. (That's the
10013same thing @code{static} does in variable declarations;
10014@pxref{File-Scope Variables}.) For instance, if one compilation module
10015contains this code:
10016
10017@example
10018static int
10019foo (void)
10020@{
10021 @r{@dots{}}
10022@}
10023@end example
10024
10025@noindent
10026then the code of that compilation module can call @code{foo} anywhere
10027after the definition, but other compilation modules cannot refer to it
10028at all.
10029
10030@cindex forward declaration
10031@cindex static function, declaration
10032To call @code{foo} before its definition, it needs a forward
10033declaration, which should use @code{static} since the function
10034definition does. For this function, it looks like this:
10035
10036@example
10037static int foo (void);
10038@end example
10039
10040It is generally wise to use @code{static} on the definitions of
10041functions that won't be called from outside the same compilation
10042module. This makes sure that calls are not added in other modules.
If programmers later decide to change the function's calling
convention, or want to understand all the consequences of its use,
they will only have to check for calls in the same compilation module.
10046
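Here is a sketch of one compilation module that combines a
@code{static} helper function, its forward declaration, and a function
that other modules can call. The names @code{clamp} and
@code{percent_to_level} are hypothetical.

```c
/* Forward declaration of a helper used only in this
   compilation module.  */
static int clamp (int value, int low, int high);

/* Convert a percentage to a level from 0 to 10.
   Other modules can call this function.  */
int
percent_to_level (int percent)
{
  return clamp (percent, 0, 100) / 10;
}

/* The definition uses static, matching the
   forward declaration above.  */
static int
clamp (int value, int low, int high)
{
  if (value < low)
    return low;
  if (value > high)
    return high;
  return value;
}
```

Since @code{clamp} is @code{static}, changing its arguments later
requires checking only the calls in this one module.
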
10047@node Arrays as Parameters
10048@subsection Arrays as Parameters
@cindex arrays as parameters
10050@cindex functions with array parameters
10051
10052Arrays in C are not first-class objects: it is impossible to copy
10053them. So they cannot be passed as arguments like other values.
10054@xref{Limitations of C Arrays}. Rather, array parameters work in
10055a special way.
10056
10057@menu
10058* Array Parm Pointer::
10059* Passing Array Args::
10060* Array Parm Qualifiers::
10061@end menu
10062
10063@node Array Parm Pointer
10064@subsubsection Array parameters are pointers
10065
10066Declaring a function parameter variable as an array really gives it a
10067pointer type. C does this because an expression with array type, if
10068used as an argument in a function call, is converted automatically to
10069a pointer (to the zeroth element of the array). If you declare the
10070corresponding parameter as an ``array'', it will work correctly with
10071the pointer value that really gets passed.
10072
10073This relates to the fact that C does not check array bounds in access
10074to elements of the array (@pxref{Accessing Array Elements}).
10075
10076For example, in this function,
10077
10078@example
10079void
10080clobber4 (int array[20])
10081@{
10082 array[4] = 0;
10083@}
10084@end example
10085
10086@noindent
10087the parameter @code{array}'s real type is @code{int *}; the specified
10088length, 20, has no effect on the program. You can leave out the length
10089and write this:
10090
10091@example
10092void
10093clobber4 (int array[])
10094@{
10095 array[4] = 0;
10096@}
10097@end example
10098
10099@noindent
10100or write the parameter declaration explicitly as a pointer:
10101
10102@example
10103void
10104clobber4 (int *array)
10105@{
10106 array[4] = 0;
10107@}
10108@end example
10109
10110They are all equivalent.
10111
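One way to see that the declared length has no effect on the type is
to store the function's address in a pointer to a function that takes
@code{int *} (@pxref{Function Pointers}). This sketch reuses
@code{clobber4} as defined above; @code{clobber_ptr} and
@code{clobber_demo} are names invented for the illustration.

```c
void
clobber4 (int array[20])
{
  array[4] = 0;
}

/* A pointer to a function whose parameter is declared as int *.
   Initializing it with clobber4 is valid, with no cast, because
   the real type of clobber4's parameter is int *.  */
void (*clobber_ptr) (int *) = clobber4;

/* Call clobber4 through the pointer on a 20-element array
   and return the resulting value of element 4.  */
int
clobber_demo (void)
{
  int data[20];
  int i;

  for (i = 0; i < 20; i++)
    data[i] = i + 1;
  clobber_ptr (data);
  return data[4];
}
```
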
10112@node Passing Array Args
10113@subsubsection Passing array arguments
10114
When an array is given as an argument, the function call passes a
pointer to its zeroth element by value, like all argument values in
C@. However, the result is paradoxical in that the array itself is
passed by reference: its contents are treated as memory shared between
the caller and the called function. When @code{clobber4} assigns to
element 4 of @code{array}, the effect is to alter element 4 of the
array specified in the call.
10122
@example
#include <stdio.h>   /* @r{Declares @code{printf}.} */
#include <stdlib.h>  /* @r{Defines @code{EXIT_SUCCESS}.} */

int
main (void)
@{
  int data[] = @{1, 2, 3, 4, 5, 6@};
  int i;

  /* @r{Show the initial values of all the elements.} */
  for (i = 0; i < 6; i++)
    printf ("data[%d] = %d\n", i, data[i]);

  printf ("\n");

  clobber4 (data);

  /* @r{Show that element 4 has been changed.} */
  for (i = 0; i < 6; i++)
    printf ("data[%d] = %d\n", i, data[i]);

  printf ("\n");

  return EXIT_SUCCESS;
@}
@end example
10151
10152@noindent
10153shows that @code{data[4]} has become zero after the call to
10154@code{clobber4}.
10155
10156The array @code{data} has 6 elements, but passing it to a function
10157whose argument type is written as @code{int [20]} is not an error,
10158because that really stands for @code{int *}. The pointer that is the
10159real argument carries no indication of the length of the array it
10160points into. It is not required to point to the beginning of the
10161array, either. For instance,
10162
10163@example
10164clobber4 (data+1);
10165@end example
10166
10167@noindent
10168passes an ``array'' that starts at element 1 of @code{data}, and the
10169effect is to zero @code{data[5]} instead of @code{data[4]}.
10170
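Since the pointer carries no length, a function that needs to know how
many elements it may touch has to be told. A common convention,
sketched here with an invented function @code{fill}, is to pass the
count as a separate argument.

```c
#include <stddef.h>

/* Store value in count elements, starting with array[0].
   The caller must say how many elements there are, since
   the pointer does not carry that information.  */
void
fill (int *array, size_t count, int value)
{
  size_t i;

  for (i = 0; i < count; i++)
    array[i] = value;
}
```

With @code{data} declared as in the program above,
@code{fill (data, sizeof data / sizeof data[0], 0)} zeros the whole
array, while @code{fill (data + 1, 3, 0)} zeros only elements 1
through 3.
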
If all calls to the function will provide an array of at least a
particular size, you can specify that minimum size using
@code{static}:

@example
void
clobber4 (int array[static 20])
@r{@dots{}}
@end example

@noindent
This is a promise to the compiler that the function will always be
called with an array of at least 20 elements, so that the compiler can
optimize code accordingly. If the code breaks this promise and calls
the function with, for example, a shorter array, unpredictable things
may happen.
10186
10187@node Array Parm Qualifiers
10188@subsubsection Type qualifiers on array parameters
10189
10190You can use the type qualifiers @code{const}, @code{restrict}, and
10191@code{volatile} with array parameters; for example:
10192
10193@example
10194void
10195clobber4 (volatile int array[20])
10196@r{@dots{}}
10197@end example
10198
10199@noindent
10200denotes that @code{array} is equivalent to a pointer to a volatile
10201@code{int}. Alternatively:
10202
10203@example
10204void
10205clobber4 (int array[const 20])
10206@r{@dots{}}
10207@end example
10208
10209@noindent
10210makes the array parameter equivalent to a constant pointer to an
10211@code{int}. If we want the @code{clobber4} function to succeed, it
10212would not make sense to write
10213
10214@example
10215void
10216clobber4 (const int array[20])
10217@r{@dots{}}
10218@end example
10219
10220@noindent
10221as this would tell the compiler that the parameter should point to an
10222array of constant @code{int} values, and then we would not be able to
10223store zeros in them.
10224
10225In a function with multiple array parameters, you can use @code{restrict}
10226to tell the compiler that each array parameter passed in will be distinct:
10227
10228@example
10229void
10230foo (int array1[restrict 10], int array2[restrict 10])
10231@r{@dots{}}
10232@end example
10233
10234@noindent
10235Using @code{restrict} promises the compiler that callers will
10236not pass in the same array for more than one @code{restrict} array
10237parameter. Knowing this enables the compiler to perform better code
10238optimization. This is the same effect as using @code{restrict}
10239pointers (@pxref{restrict Pointers}), but makes it clear when reading
10240the code that an array of a specific size is expected.
10241
10242@node Structs as Parameters
10243@subsection Functions That Accept Structure Arguments
10244
10245Structures in GNU C are first-class objects, so using them as function
10246parameters and arguments works in the natural way. This function
10247@code{swapfoo} takes a @code{struct foo} with two fields as argument,
10248and returns a structure of the same type but with the fields
10249exchanged.
10250
10251@example
10252struct foo @{ int a, b; @};
10253
10254struct foo x;
10255
10256struct foo
10257swapfoo (struct foo inval)
10258@{
10259 struct foo outval;
10260 outval.a = inval.b;
10261 outval.b = inval.a;
10262 return outval;
10263@}
10264@end example
10265
This simpler definition of @code{swapfoo} avoids using a local
variable to hold the result about to be returned, by using a structure
constructor (@pxref{Structure Constructors}), like this:
10269
10270@example
10271struct foo
10272swapfoo (struct foo inval)
10273@{
10274 return (struct foo) @{ inval.b, inval.a @};
10275@}
10276@end example
10277
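Since the structure is passed by value, the function receives a copy
and the caller's structure is unaffected. The following sketch checks
that; @code{swapfoo_preserves_argument} is a name invented for the
illustration.

```c
struct foo { int a, b; };

struct foo
swapfoo (struct foo inval)
{
  return (struct foo) { inval.b, inval.a };
}

/* Return 1 if calling swapfoo yields the exchanged fields
   and leaves the caller's structure unchanged, 0 otherwise.  */
int
swapfoo_preserves_argument (void)
{
  struct foo x = { 1, 2 };
  struct foo y = swapfoo (x);

  /* y has the fields exchanged; x is untouched.  */
  return y.a == 2 && y.b == 1 && x.a == 1 && x.b == 2;
}
```
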
10278It is valid to define a structure type in a function's parameter list,
10279as in
10280
10281@example
10282int
10283frob_bar (struct bar @{ int a, b; @} inval)
10284@{
10285 @var{body}
10286@}
10287@end example
10288
10289@noindent
and @var{body} can access the fields of @code{inval} since the
10291structure type @code{struct bar} is defined for the whole function
10292body. However, there is no way to create a @code{struct bar} argument
10293to pass to @code{frob_bar}, except with kludges. As a result,
10294defining a structure type in a parameter list is useless in practice.
10295
10296@node Function Declarations
10297@section Function Declarations
10298@cindex function declarations
@cindex declaring functions
10300
10301To call a function, or use its name as a pointer, a @dfn{function
10302declaration} for the function name must be in effect at that point in
10303the code. The function's definition serves as a declaration of that
10304function for the rest of the containing scope, but to use the function
10305in code before the definition, or from another compilation module, a
10306separate function declaration must precede the use.
10307
10308A function declaration looks like the start of a function definition.
10309It begins with the return value type (@code{void} if none) and the
10310function name, followed by argument declarations in parentheses
10311(though these can sometimes be omitted). But that's as far as the
10312similarity goes: instead of the function body, the declaration uses a
10313semicolon.
10314
10315@cindex function prototype
10316@cindex prototype of a function
10317A declaration that specifies argument types is called a @dfn{function
10318prototype}. You can include the argument names or omit them. The
10319names, if included in the declaration, have no effect, but they may
10320serve as documentation.
10321
10322This form of prototype specifies fixed argument types:
10323
10324@example
10325@var{rettype} @var{function} (@var{argtypes}@r{@dots{}});
10326@end example
10327
10328@noindent
10329This form says the function takes no arguments:
10330
10331@example
10332@var{rettype} @var{function} (void);
10333@end example
10334
10335@noindent
10336This form declares types for some arguments, and allows additional
10337arguments whose types are not specified:
10338
10339@example
10340@var{rettype} @var{function} (@var{argtypes}@r{@dots{}}, ...);
10341@end example
10342
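As a sketch of this last form, here is a made-up function
@code{sum_ints} whose prototype declares one argument and allows more.
It retrieves the additional arguments with the standard macros from
@file{stdarg.h}, which this section does not describe.

```c
#include <stdarg.h>

/* Return the sum of the count int arguments that follow count.
   The prototype fixes the type of count only; the compiler
   does not check the types of the remaining arguments.  */
int
sum_ints (int count, ...)
{
  va_list ap;
  int total = 0;
  int i;

  va_start (ap, count);
  for (i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);
  return total;
}
```

The function relies on the caller to pass as many @code{int} values as
@code{count} says; if a call passes fewer, or passes values of another
type, the result is unpredictable.
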
10343For a parameter that's an array of variable length, you can write
10344its declaration with @samp{*} where the ``length'' of the array would
10345normally go; for example, these are all equivalent.
10346
10347@example
10348double maximum (int n, int m, double a[n][m]);
10349double maximum (int n, int m, double a[*][*]);
10350double maximum (int n, int m, double a[ ][*]);
10351double maximum (int n, int m, double a[ ][m]);
10352@end example
10353
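The declarations above do not show a definition, so here is a sketch
of how @code{maximum} might be defined. The body is an assumption made
for illustration; the text specifies only the prototype. In a
definition the lengths must be written out, because @samp{*} is
allowed only in declarations that are not definitions.

```c
/* Return the largest element of the n-by-m array a.
   Assumes n and m are both at least 1.  */
double
maximum (int n, int m, double a[n][m])
{
  double largest = a[0][0];
  int i, j;

  for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
      if (a[i][j] > largest)
        largest = a[i][j];
  return largest;
}
```

A caller with an array declared as @code{double grid[2][3]} would
write @code{maximum (2, 3, grid)}.
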
10354@noindent
10355The old-fashioned form of declaration, which is not a prototype, says
10356nothing about the types of arguments or how many they should be:
10357
10358@example
10359@var{rettype} @var{function} ();
10360@end example
10361
@strong{Warning:} Arguments passed to a function declared without a
prototype are converted with the default argument promotions
(@pxref{Argument Promotions}). Likewise for additional arguments whose
types are unspecified.
10366
10367Function declarations are usually written at the top level in a source file,
10368but you can also put them inside code blocks. Then the function name
10369is visible for the rest of the containing scope. For example:
10370
10371@example
10372void
10373foo (char *file_name)
10374@{
10375 void save_file (char *);
10376 save_file (file_name);
10377@}
10378@end example
10379
10380If another part of the code tries to call the function
10381@code{save_file}, this declaration won't be in effect there. So the
10382function will get an implicit declaration of the form @code{extern int
10383save_file ();}. That conflicts with the explicit declaration
10384here, and the discrepancy generates a warning.
10385
10386The syntax of C traditionally allows omitting the data type in a
10387function declaration if it specifies a storage class or a qualifier.
10388Then the type defaults to @code{int}. For example:
10389
10390@example
10391static foo (double x);
10392@end example
10393
10394@noindent
10395defaults the return type to @code{int}.
10396This is bad practice; if you see it, fix it.
10397
Calling a function that is undeclared has the effect of creating an
@dfn{implicit} declaration in the innermost containing scope,
equivalent to this:
10401
10402@example
extern int @var{function} ();
10404@end example
10405
10406@noindent
10407This declaration says that the function returns @code{int} but leaves
10408its argument types unspecified. If that does not accurately fit the
10409function, then the program @strong{needs} an explicit declaration of
10410the function with argument types in order to call it correctly.
10411
10412Implicit declarations are deprecated, and a function call that creates one
10413causes a warning.
10414
10415@node Function Calls
10416@section Function Calls
10417@cindex function calls
10418@cindex calling functions
10419
10420Starting a program automatically calls the function named @code{main}
10421(@pxref{The main Function}). Aside from that, a function does nothing
10422except when it is @dfn{called}. That occurs during the execution of a
10423function-call expression specifying that function.
10424
10425A function-call expression looks like this:
10426
10427@example
10428@var{function} (@var{arguments}@r{@dots{}})
10429@end example
10430
10431Most of the time, @var{function} is a function name. However, it can
10432also be an expression with a function pointer value; that way, the
10433program can determine at run time which function to call.
10434
10435The @var{arguments} are a series of expressions separated by commas.
10436Each expression specifies one argument to pass to the function.
10437
10438The list of arguments in a function call looks just like use of the
10439comma operator (@pxref{Comma Operator}), but the fact that it fills
10440the parentheses of a function call gives it a different meaning.
10441
10442Here's an example of a function call, taken from an example near the
10443beginning (@pxref{Complete Program}).
10444
10445@example
10446printf ("Fibonacci series item %d is %d\n",
10447 19, fib (19));
10448@end example
10449
10450The three arguments given to @code{printf} are a constant string, the
10451integer 19, and the integer returned by @code{fib (19)}.
10452
10453@node Function Call Semantics
10454@section Function Call Semantics
10455@cindex function call semantics
10456@cindex semantics of function calls
10457@cindex call-by-value
10458
10459The meaning of a function call is to compute the specified argument
10460expressions, convert their values according to the function's
10461declaration, then run the function giving it copies of the converted
10462values. (This method of argument passing is known as
10463@dfn{call-by-value}.) When the function finishes, the value it
10464returns becomes the value of the function-call expression.
10465
10466Call-by-value implies that an assignment to the function argument
10467variable has no direct effect on the caller. For instance,
10468
@example
#include <stdlib.h>  /* @r{Defines @code{EXIT_SUCCESS}.} */
#include <stdio.h>   /* @r{Declares @code{printf}.} */

void
subroutine (int x)
@{
  x = 5;
@}

int
main (void)
@{
  int y = 20;
  subroutine (y);
  printf ("y is %d\n", y);
  return EXIT_SUCCESS;
@}
@end example
10488
10489@noindent
10490prints @samp{y is 20}. Calling @code{subroutine} initializes @code{x}
10491from the value of @code{y}, but this does not establish any other
10492relationship between the two variables. Thus, the assignment to
10493@code{x}, inside @code{subroutine}, changes only @emph{that} @code{x}.
10494
10495If an argument's type is specified by the function's declaration, the
10496function call converts the argument expression to that type if
10497possible. If the conversion is impossible, that is an error.
10498
10499If the function's declaration doesn't specify the type of that
10500argument, then the @emph{default argument promotions} apply.
10501@xref{Argument Promotions}.
10502
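Here is a small sketch of the conversion that a prototype causes. The
functions @code{half} and @code{half_of_five} are invented for the
illustration.

```c
/* The prototype says the argument has type double.  */
double
half (double x)
{
  return x / 2;
}

/* Call half with an int argument and return the result.  */
double
half_of_five (void)
{
  /* The int value 5 is converted to 5.0 before the call,
     so the division is done in floating point.  */
  return half (5);
}
```

Had @code{half} been declared without a prototype, the argument 5
would have been passed as an @code{int}, which does not match the
@code{double} parameter, and the result would be unpredictable.
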
10503@node Function Pointers
10504@section Function Pointers
10505@cindex function pointers
10506@cindex pointers to functions
10507
10508A function name refers to a fixed function. Sometimes it is useful to
10509call a function to be determined at run time; to do this, you can use
10510a @dfn{function pointer value} that points to the chosen function
10511(@pxref{Pointers}).
10512
10513Pointer-to-function types can be used to declare variables and other
10514data, including array elements, structure fields, and union
10515alternatives. They can also be used for function arguments and return
10516values. These types have the peculiarity that they are never
10517converted automatically to @code{void *} or vice versa. However, you
10518can do that conversion with a cast.
10519
10520@menu
10521* Declaring Function Pointers:: How to declare a pointer to a function.
10522* Assigning Function Pointers:: How to assign values to function pointers.
10523* Calling Function Pointers:: How to call functions through pointers.
10524@end menu
10525
10526@node Declaring Function Pointers
10527@subsection Declaring Function Pointers
10528@cindex declaring function pointers
10529@cindex function pointers, declaring
10530
10531The declaration of a function pointer variable (or structure field)
10532looks almost like a function declaration, except it has an additional
10533@samp{*} just before the variable name. Proper nesting requires a
10534pair of parentheses around the two of them. For instance, @code{int
10535(*a) ();} says, ``Declare @code{a} as a pointer such that @code{*a} is
10536an @code{int}-returning function.''
10537
10538Contrast these three declarations:
10539
10540@example
10541/* @r{Declare a function returning @code{char *}.} */
10542char *a (char *);
10543/* @r{Declare a pointer to a function returning @code{char}.} */
10544char (*a) (char *);
10545/* @r{Declare a pointer to a function returning @code{char *}.} */
10546char *(*a) (char *);
10547@end example
10548
10549The possible argument types of the function pointed to are the same
10550as in a function declaration. You can write a prototype
10551that specifies all the argument types:
10552
10553@example
10554@var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}});
10555@end example
10556
10557@noindent
10558or one that specifies some and leaves the rest unspecified:
10559
10560@example
10561@var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}}, ...);
10562@end example
10563
10564@noindent
10565or one that says there are no arguments:
10566
10567@example
10568@var{rettype} (*@var{function}) (void);
10569@end example
10570
10571You can also write a non-prototype declaration that says
10572nothing about the argument types:
10573
10574@example
10575@var{rettype} (*@var{function}) ();
10576@end example
10577
10578For example, here's a declaration for a variable that should
10579point to some arithmetic function that operates on two @code{double}s:
10580
10581@example
10582double (*binary_op) (double, double);
10583@end example
10584
10585Structure fields, union alternatives, and array elements can be
10586function pointers; so can parameter variables. The function pointer
10587declaration construct can also be combined with other operators
10588allowed in declarations. For instance,
10589
10590@example
10591int **(*foo)();
10592@end example
10593
10594@noindent
10595declares @code{foo} as a pointer to a function that returns
10596type @code{int **}, and
10597
10598@example
10599int **(*foo[30])();
10600@end example
10601
10602@noindent
10603declares @code{foo} as an array of 30 pointers to functions that
10604return type @code{int **}.
10605
10606@example
10607int **(**foo)();
10608@end example
10609
10610@noindent
10611declares @code{foo} as a pointer to a pointer to a function that
10612returns type @code{int **}.
10613
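As a sketch of how such declarations are used, here is an array of
function pointers serving as a small dispatch table. All the names in
it (@code{add}, @code{subtract}, @code{operations} and
@code{apply_operation}) are made up for this illustration. Calling
through the pointers is described in a later section
(@pxref{Calling Function Pointers}).

```c
static int
add (int a, int b)
{
  return a + b;
}

static int
subtract (int a, int b)
{
  return a - b;
}

/* An array of 2 pointers to functions that take
   two int arguments and return int.  */
static int (*operations[2]) (int, int) = { add, subtract };

/* Call the function selected by which, which must be 0 or 1.  */
int
apply_operation (int which, int a, int b)
{
  return operations[which] (a, b);
}
```
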
10614@node Assigning Function Pointers
10615@subsection Assigning Function Pointers
10616@cindex assigning function pointers
10617@cindex function pointers, assigning
10618
10619Assuming we have declared the variable @code{binary_op} as in the
10620previous section, giving it a value requires a suitable function to
10621use. So let's define a function suitable for the variable to point
10622to. Here's one:
10623
10624@example
10625double
10626double_add (double a, double b)
10627@{
10628 return a+b;
10629@}
10630@end example
10631
10632Now we can give it a value:
10633
10634@example
10635binary_op = double_add;
10636@end example
10637
10638The target type of the function pointer must be upward compatible with
10639the type of the function (@pxref{Compatible Types}).
10640
10641There is no need for @samp{&} in front of @code{double_add}.
10642Using a function name such as @code{double_add} as an expression
10643automatically converts it to the function's address, with the
10644appropriate function pointer type. However, it is ok to use
10645@samp{&} if you feel that is clearer:
10646
10647@example
10648binary_op = &double_add;
10649@end example
10650
10651@node Calling Function Pointers
10652@subsection Calling Function Pointers
10653@cindex calling function pointers
10654@cindex function pointers, calling
10655
10656To call the function specified by a function pointer, just write the
10657function pointer value in a function call. For instance, here's a
10658call to the function @code{binary_op} points to:
10659
10660@example
10661binary_op (x, 5)
10662@end example
10663
10664Since the data type of @code{binary_op} explicitly specifies type
10665@code{double} for the arguments, the call converts @code{x} and 5 to
10666@code{double}.
10667
10668The call conceptually dereferences the pointer @code{binary_op} to
10669``get'' the function it points to, and calls that function. If you
wish, you can explicitly represent the dereference by writing the
10671@code{*} operator:
10672
10673@example
10674(*binary_op) (x, 5)
10675@end example
10676
10677The @samp{*} reminds people reading the code that @code{binary_op} is
10678a function pointer rather than the name of a specific function.
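
To tie declaring, assigning and calling together, here is a minimal
sketch of a function that receives a function pointer as a parameter
and calls it repeatedly. The names @code{double_max} and
@code{reduce} are hypothetical; @code{double_add} is the function
defined earlier.

```c
double
double_add (double a, double b)
{
  return a + b;
}

/* Hypothetical second function with the same type.  */
double
double_max (double a, double b)
{
  return a > b ? a : b;
}

/* Combine the COUNT elements of VALUES into one result, starting
   from START, by calling OP on the running result and each element
   in turn.  OP is a parameter variable that is a function pointer.  */
double
reduce (double (*op) (double, double),
        double start, const double *values, int count)
{
  double result = start;

  for (int i = 0; i < count; i++)
    result = op (result, values[i]);   /* Same as (*op) (...).  */
  return result;
}
```

A caller can then write either @code{reduce (double_add, 0.0, v, n)}
or @code{reduce (&double_add, 0.0, v, n)}; both pass the same function
address.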
10679
10680@node The main Function
10681@section The @code{main} Function
10682@cindex @code{main} function
10683@findex main
10684
10685Every complete executable program requires at least one function,
10686called @code{main}, which is where execution begins. You do not have
10687to explicitly declare @code{main}, though GNU C permits you to do so.
10688Conventionally, @code{main} should be defined to follow one of these
10689calling conventions:
10690
10691@example
10692int main (void) @{@r{@dots{}}@}
10693int main (int argc, char *argv[]) @{@r{@dots{}}@}
10694int main (int argc, char *argv[], char *envp[]) @{@r{@dots{}}@}
10695@end example
10696
10697@noindent
10698Using @code{void} as the parameter list means that @code{main} does
10699not use the arguments. You can write @code{char **argv} instead of
10700@code{char *argv[]}, and likewise for @code{envp}, as the two
10701constructs are equivalent.
10702
10703@ignore @c Not so at present
10704Defining @code{main} in any other way generates a warning. Your
10705program will still compile, but you may get unexpected results when
10706executing it.
10707@end ignore
10708
10709You can call @code{main} from C code, as you can call any other
10710function, though that is an unusual thing to do. When you do that,
10711you must write the call to pass arguments that match the parameters in
10712the definition of @code{main}.
10713
10714The @code{main} function is not actually the first code that runs when
10715a program starts. In fact, the first code that runs is system code
10716from the file @file{crt0.o}. In Unix, this was hand-written assembler
10717code, but in GNU we replaced it with C code. Its job is to find
10718the arguments for @code{main} and call that.
10719
10720@menu
10721* Values from main:: Returning values from the main function.
10722* Command-line Parameters:: Accessing command-line parameters
10723 provided to the program.
10724* Environment Variables:: Accessing system environment variables.
10725@end menu
10726
10727@node Values from main
10728@subsection Returning Values from @code{main}
10729@cindex returning values from @code{main}
10730@cindex success
10731@cindex failure
10732@cindex exit status
10733
10734When @code{main} returns, the process terminates. Whatever value
10735@code{main} returns becomes the exit status which is reported to the
10736parent process. While nominally the return value is of type
10737@code{int}, in fact the exit status gets truncated to eight bits; if
10738@code{main} returns the value 256, the exit status is 0.
10739
10740Normally, programs return only one of two values: 0 for success,
10741and 1 for failure. For maximum portability, use the macro
10742values @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} defined in
@file{stdlib.h}. Here's an example:
10744
10745@cindex @code{EXIT_FAILURE}
10746@cindex @code{EXIT_SUCCESS}
10747@example
10748#include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}} */
10749 /* @r{and @code{EXIT_FAILURE}.} */
10750
10751int
10752main (void)
10753@{
10754 @r{@dots{}}
10755 if (foo)
10756 return EXIT_SUCCESS;
10757 else
10758 return EXIT_FAILURE;
10759@}
10760@end example
10761
10762Some types of programs maintain special conventions for various return
10763values; for example, comparison programs including @code{cmp} and
10764@code{diff} return 1 to indicate a mismatch, and 2 to indicate that
10765the comparison couldn't be performed.
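
One way to follow the convention of @code{cmp} and @code{diff} is to
compute the status in a helper function and have @code{main} return
its value. The sketch below assumes the inputs are already in memory;
the enumeration names and the function @code{compare_status} are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Exit status values following the convention of cmp and diff.
   These names are hypothetical.  */
enum
{
  STATUS_SAME = 0,       /* The inputs match.  */
  STATUS_DIFFERENT = 1,  /* The inputs differ.  */
  STATUS_TROUBLE = 2     /* The comparison couldn't be performed.  */
};

/* Compare the A_LEN bytes at A with the B_LEN bytes at B and
   return the status that main should report.  */
int
compare_status (const char *a, size_t a_len,
                const char *b, size_t b_len)
{
  if (a == NULL || b == NULL)
    return STATUS_TROUBLE;
  if (a_len != b_len || memcmp (a, b, a_len) != 0)
    return STATUS_DIFFERENT;
  return STATUS_SAME;
}
```

In such a program, @code{main} would end with a statement like
@code{return compare_status (@dots{});}.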
10766
10767@node Command-line Parameters
10768@subsection Accessing Command-line Parameters
10769@cindex command-line parameters
10770@cindex parameters, command-line
10771
10772If the program was invoked with any command-line arguments, it can
10773access them through the arguments of @code{main}, @code{argc} and
10774@code{argv}. (You can give these arguments any names, but the names
10775@code{argc} and @code{argv} are customary.)
10776
10777The value of @code{argv} is an array containing all of the
10778command-line arguments as strings, with the name of the command
10779invoked as the first string. @code{argc} is an integer that says how
10780many strings @code{argv} contains. Here is an example of accessing
10781the command-line parameters, retrieving the program's name and
10782checking for the standard @option{--version} and @option{--help} options:
10783
10784@example
10785#include <string.h> /* @r{Declare @code{strcmp}.} */
10786
10787int
10788main (int argc, char *argv[])
10789@{
10790 char *program_name = argv[0];
10791
10792 for (int i = 1; i < argc; i++)
10793 @{
10794 if (!strcmp (argv[i], "--version"))
10795 @{
10796 /* @r{Print version information and exit.} */
10797 @r{@dots{}}
10798 @}
10799 else if (!strcmp (argv[i], "--help"))
10800 @{
10801 /* @r{Print help information and exit.} */
10802 @r{@dots{}}
10803 @}
10804 @}
10805 @r{@dots{}}
10806@}
10807@end example
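
The scanning loop can also be put in a helper function that
@code{main} calls with its own @code{argc} and @code{argv}. This is
only a sketch; the name @code{find_option} is hypothetical.

```c
#include <string.h>

/* Return the index in ARGV of the first argument equal to OPTION,
   or 0 if there is none.  Index 0 holds the name of the command,
   so the search starts at index 1.  */
int
find_option (int argc, char *argv[], const char *option)
{
  for (int i = 1; i < argc; i++)
    if (strcmp (argv[i], option) == 0)
      return i;
  return 0;
}
```

With this helper, @code{main} could test
@code{find_option (argc, argv, "--help")} instead of repeating the
loop for each option.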
10808
10809@node Environment Variables
10810@subsection Accessing Environment Variables
10811@cindex environment variables
10812
10813You can optionally include a third parameter to @code{main}, another
10814array of strings, to capture the environment variables available to
10815the program. Unlike what happens with @code{argv}, there is no
10816additional parameter for the count of environment variables; rather,
10817the array of environment variables concludes with a null pointer.
10818
10819@example
10820#include <stdio.h> /* @r{Declares @code{printf}.} */
10821
10822int
10823main (int argc, char *argv[], char *envp[])
10824@{
10825 /* @r{Print out all environment variables.} */
10826 int i = 0;
10827 while (envp[i])
10828 @{
10829 printf ("%s\n", envp[i]);
10830 i++;
10831 @}
10832@}
10833@end example
10834
10835Another method of retrieving environment variables is to use the
library function @code{getenv}, which is declared in @file{stdlib.h}.
10837Using @code{getenv} does not require defining @code{main} to accept the
10838@code{envp} pointer. For example, here is a program that fetches and prints
10839the user's home directory (if defined):
10840
10841@example
10842#include <stdlib.h> /* @r{Declares @code{getenv}.} */
10843#include <stdio.h> /* @r{Declares @code{printf}.} */
10844
10845int
10846main (void)
10847@{
10848 char *home_directory = getenv ("HOME");
10849 if (home_directory)
10850 printf ("My home directory is: %s\n", home_directory);
10851 else
10852 printf ("My home directory is not defined!\n");
10853@}
10854@end example
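
For comparison, here is a sketch of looking up one variable by
walking the @code{envp} array directly, relying on the terminating
null pointer. It assumes each string has the form
@samp{@var{name}=@var{value}}, as on POSIX-like systems; the function
name @code{lookup_env} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Search ENVP, a null-terminated array of "NAME=VALUE" strings
   (an assumption about the format), for NAME.  Return a pointer
   to the value, or NULL if NAME is not present.  */
const char *
lookup_env (char *envp[], const char *name)
{
  size_t name_len = strlen (name);

  for (int i = 0; envp[i] != NULL; i++)
    if (strncmp (envp[i], name, name_len) == 0
        && envp[i][name_len] == '=')
      return envp[i] + name_len + 1;
  return NULL;
}
```

In ordinary programs @code{getenv} is the simpler choice; the loop
above mainly shows how the array is laid out.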
10855
10856@node Advanced Definitions
10857@section Advanced Function Features
10858
10859This section describes some advanced or obscure features for GNU C
10860function definitions. If you are just learning C, you can skip the
10861rest of this chapter.
10862
10863@menu
10864* Variable-Length Array Parameters:: Functions that accept arrays
10865 of variable length.
10866* Variable Number of Arguments:: Variadic functions.
10867* Nested Functions:: Defining functions within functions.
10868* Inline Function Definitions:: A function call optimization technique.
10869@end menu
10870
10871@node Variable-Length Array Parameters
10872@subsection Variable-Length Array Parameters
10873@cindex variable-length array parameters
10874@cindex array parameters, variable-length
10875@cindex functions that accept variable-length arrays
10876
10877An array parameter can have variable length: simply declare the array
10878type with a size that isn't constant. In a nested function, the
10879length can refer to a variable defined in a containing scope. In any
10880function, it can refer to a previous parameter, like this:
10881
10882@example
10883struct entry
10884tester (int len, char data[len][len])
10885@{
10886 @r{@dots{}}
10887@}
10888@end example
10889
10890Alternatively, in function declarations (but not in function
10891definitions), you can use @code{[*]} to denote that the array
10892parameter is of a variable length, such that these two declarations
10893mean the same thing:
10894
10895@example
10896struct entry
10897tester (int len, char data[len][len]);
10898@end example
10899
10900@example
10901struct entry
10902tester (int len, char data[*][*]);
10903@end example
10904
10905@noindent
10906The two forms of input are equivalent in GNU C, but emphasizing that
10907the array parameter is variable-length may be helpful to those
10908studying the code.
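
Here is a small sketch that uses both forms together: a declaration
written with @code{[*]} followed by a definition whose array
dimensions refer to the previous parameter. The function name
@code{trace} is hypothetical.

```c
/* Declaration: [*] says only that the dimensions are variable.  */
int trace (int len, int data[*][*]);

/* Definition: the dimensions refer to the previous parameter.
   Return the sum of the diagonal of a LEN by LEN matrix.  */
int
trace (int len, int data[len][len])
{
  int total = 0;

  for (int i = 0; i < len; i++)
    total += data[i][i];
  return total;
}
```

Because the dimensions are part of the parameter's type, the body can
use ordinary two-dimensional subscripts for any value of @code{len}.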
10909
10910You can also omit the length parameter, and instead use some other
10911in-scope variable for the length in the function definition:
10912
10913@example
10914struct entry
10915tester (char data[*][*]);
10916@r{@dots{}}
10917int dataLength = 20;
10918@r{@dots{}}
10919struct entry
10920tester (char data[dataLength][dataLength])
10921@{
10922 @r{@dots{}}
10923@}
10924@end example
10925
10926@c ??? check text above
10927
10928@cindex parameter forward declaration
10929In GNU C, to pass the array first and the length afterward, you can
10930use a @dfn{parameter forward declaration}, like this:
10931
10932@example
10933struct entry
10934tester (int len; char data[len][len], int len)
10935@{
10936 @r{@dots{}}
10937@}
10938@end example
10939
10940The @samp{int len} before the semicolon is the parameter forward
10941declaration; it serves the purpose of making the name @code{len} known
10942when the declaration of @code{data} is parsed.
10943
10944You can write any number of such parameter forward declarations in the
10945parameter list. They can be separated by commas or semicolons, but
10946the last one must end with a semicolon, which is followed by the
10947``real'' parameter declarations. Each forward declaration must match
10948a subsequent ``real'' declaration in parameter name and data type.
10949
10950Standard C does not support parameter forward declarations.
10951
10952@node Variable Number of Arguments
10953@subsection Variable-Length Parameter Lists
10954@cindex variable-length parameter lists
@cindex parameter lists, variable length
10956@cindex function parameter lists, variable length
10957
10958@cindex variadic function
10959A function that takes a variable number of arguments is called a
10960@dfn{variadic function}. In C, a variadic function must specify at
10961least one fixed argument with an explicitly declared data type.
10962Additional arguments can follow, and can vary in both quantity and
10963data type.
10964
10965In the function header, declare the fixed parameters in the normal
10966way, then write a comma and an ellipsis: @samp{, ...}. Here is an
10967example of a variadic function header:
10968
10969@example
10970int add_multiple_values (int number, ...)
10971@end example
10972
10973@cindex @code{va_list}
10974@cindex @code{va_start}
10975@cindex @code{va_end}
10976The function body can refer to fixed arguments by their parameter
10977names, but the additional arguments have no names. Accessing them in
10978the function body uses certain standard macros. They are defined in
10979the library header file @file{stdarg.h}, so the code must
10980@code{#include} that file.
10981
10982In the body, write
10983
10984@example
10985va_list ap;
10986va_start (ap, @var{last_fixed_parameter});
10987@end example
10988
10989@noindent
10990This declares the variable @code{ap} (you can use any name for it)
10991and then sets it up to point before the first additional argument.
10992
10993Then, to fetch the next consecutive additional argument, write this:
10994
10995@example
10996va_arg (ap, @var{type})
10997@end example
10998
10999After fetching all the additional arguments (or as many as need to be
11000used), write this:
11001
11002@example
11003va_end (ap);
11004@end example
11005
11006Here's an example of a variadic function definition that adds any
11007number of @code{int} arguments. The first (fixed) argument says how
11008many more arguments follow.
11009
11010@example
11011#include <stdarg.h> /* @r{Defines @code{va}@r{@dots{}} macros.} */
11012@r{@dots{}}
11013
11014int
11015add_multiple_values (int argcount, ...)
11016@{
11017 int counter, total = 0;
11018
11019 /* @r{Declare a variable of type @code{va_list}.} */
11020 va_list argptr;
11021
  /* @r{Initialize that variable.} */
11023 va_start (argptr, argcount);
11024
11025 for (counter = 0; counter < argcount; counter++)
11026 @{
11027 /* @r{Get the next additional argument.} */
11028 total += va_arg (argptr, int);
11029 @}
11030
11031 /* @r{End use of the @code{argptr} variable.} */
11032 va_end (argptr);
11033
11034 return total;
11035@}
11036@end example
11037
11038With GNU C, @code{va_end} is superfluous, but some other compilers
11039might make @code{va_start} allocate memory so that calling
11040@code{va_end} is necessary to avoid a memory leak. Before doing
11041@code{va_start} again with the same variable, do @code{va_end}
11042first.
11043
11044@cindex @code{va_copy}
11045Because of this possible memory allocation, it is risky (in principle)
11046to copy one @code{va_list} variable to another with assignment.
11047Instead, use @code{va_copy}, which copies the substance but allocates
11048separate memory in the variable you copy to. The call looks like
11049@code{va_copy (@var{to}, @var{from})}, where both @var{to} and
11050@var{from} should be variables of type @code{va_list}. In principle,
11051do @code{va_end} on each of these variables before its scope ends.
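
As an illustration, here is a sketch of a function that needs to
traverse its additional arguments twice, so it saves the starting
position with @code{va_copy} before the first pass consumes them.
The function name @code{count_above_average} is hypothetical.

```c
#include <stdarg.h>

/* Return how many of the COUNT additional int arguments are
   greater than their average.  */
int
count_above_average (int count, ...)
{
  va_list first_pass, second_pass;
  long total = 0;
  int above = 0;

  if (count <= 0)
    return 0;

  va_start (first_pass, count);
  /* Save the starting position before fetching any arguments.  */
  va_copy (second_pass, first_pass);

  /* First pass: add up the values.  */
  for (int i = 0; i < count; i++)
    total += va_arg (first_pass, int);
  va_end (first_pass);

  /* Second pass: compare value * count with the total, which
     avoids dividing to get the average.  */
  for (int i = 0; i < count; i++)
    if ((long) va_arg (second_pass, int) * count > total)
      above++;
  va_end (second_pass);

  return above;
}
```

Each of the two @code{va_list} variables gets its own @code{va_end}
before the function returns.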
11052
11053Since the additional arguments' types are not specified in the
11054function's definition, the default argument promotions
11055(@pxref{Argument Promotions}) apply to them in function calls. The
11056function definition must take account of this; thus, if an argument
11057was passed as @code{short}, the function should get it as @code{int}.
11058If an argument was passed as @code{float}, the function should get it
11059as @code{double}.
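
The following sketch shows the promotion of @code{float} in action:
callers may pass @code{float} or @code{double} values, and the
function fetches every one of them as @code{double}. The function
name @code{average} is hypothetical.

```c
#include <stdarg.h>

/* Return the average of COUNT additional floating-point arguments.
   Callers may pass float or double values; either way, each one
   arrives as double.  */
double
average (int count, ...)
{
  va_list ap;
  double total = 0.0;

  if (count <= 0)
    return 0.0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, double);   /* Not float.  */
  va_end (ap);

  return total / count;
}
```

Writing @code{va_arg (ap, float)} here would be a mistake, since no
additional argument ever arrives with type @code{float}.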
11060
11061C has no mechanism to tell the variadic function how many arguments
11062were passed to it, so its calling convention must give it a way to
11063determine this. That's why @code{add_multiple_values} takes a fixed
11064argument that says how many more arguments follow. Thus, you can
11065call the function like this:
11066
11067@example
11068sum = add_multiple_values (3, 12, 34, 190);
11069/* @r{Value is 12+34+190.} */
11070@end example
11071
In GNU C, @code{va_end} does nothing, so there is no actual need to
use it. Use it anyway when compatibility with other compilers
matters.
11075
11076It is a mistake to access variables declared as @code{va_list} except
11077in the specific ways described here. Just what that type consists of
11078is an implementation detail, which could vary from one platform to
11079another.
11080
11081@node Nested Functions
11082@subsection Nested Functions
11083@cindex nested functions
11084@cindex functions, nested
11085@cindex downward funargs
11086@cindex thunks
11087
11088A @dfn{nested function} is a function defined inside another function.
11089The nested function's name is local to the block where it is defined.
11090For example, here we define a nested function named @code{square}, and
11091call it twice:
11092
11093@example
11094@group
double
foo (double a, double b)
11096@{
11097 double square (double z) @{ return z * z; @}
11098
11099 return square (a) + square (b);
11100@}
11101@end group
11102@end example
11103
11104The nested function can access all the variables of the containing
11105function that are visible at the point of its definition. This is
11106called @dfn{lexical scoping}. For example, here we show a nested
11107function that uses an inherited variable named @code{offset}:
11108
11109@example
11110@group
int
bar (int *array, int offset, int size)
11112@{
11113 int access (int *array, int index)
11114 @{ return array[index + offset]; @}
11115 int i;
11116 @r{@dots{}}
11117 for (i = 0; i < size; i++)
11118 @r{@dots{}} access (array, i) @r{@dots{}}
11119@}
11120@end group
11121@end example
11122
11123Nested function definitions can appear wherever automatic variable
11124declarations are allowed; that is, in any block, interspersed with the
11125other declarations and statements in the block.
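
Here is a complete sketch along the same lines, in which the nested
function uses two parameters of the containing function without
receiving them as arguments. It relies on the GNU C nested function
extension, so it assumes compilation with GCC; the names
@code{count_in_range} and @code{in_range} are hypothetical.

```c
/* Return the number of the SIZE elements of ARRAY that lie
   between LOW and HIGH, inclusive.  Requires GNU C.  */
int
count_in_range (const int *array, int size, int low, int high)
{
  /* The nested function uses LOW and HIGH from the containing
     function without receiving them as arguments.  */
  int in_range (int value)
  {
    return low <= value && value <= high;
  }

  int count = 0;

  for (int i = 0; i < size; i++)
    if (in_range (array[i]))
      count++;
  return count;
}
```

The nested function is only called directly here, so its address is
never taken.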
11126
11127The nested function's name is visible only within the parent block;
11128the name's scope starts from its definition and continues to the end
11129of the containing block. If the nested function's name
is the same as the parent function's name, there will be
11131no way to refer to the parent function inside the scope of the
11132name of the nested function.
11133
11134Using @code{extern} or @code{static} on a nested function definition
11135is an error.
11136
11137It is possible to call the nested function from outside the scope of its
11138name by storing its address or passing the address to another function.
11139You can do this safely, but you must be careful:
11140
11141@example
11142@group
void
hack (int *array, int size, int addition)
11144@{
11145 void store (int index, int value)
11146 @{ array[index] = value + addition; @}
11147
11148 intermediate (store, size);
11149@}
11150@end group
11151@end example
11152
11153Here, the function @code{intermediate} receives the address of
11154@code{store} as an argument. If @code{intermediate} calls @code{store},
11155the arguments given to @code{store} are used to store into @code{array}.
11156@code{store} also accesses @code{hack}'s local variable @code{addition}.
11157
11158It is safe for @code{intermediate} to call @code{store} because
11159@code{hack}'s stack frame, with its arguments and local variables,
11160continues to exist during the call to @code{intermediate}.
11161
11162Calling the nested function through its address after the containing
11163function has exited is asking for trouble. If it is called after a
11164containing scope level has exited, and if it refers to some of the
11165variables that are no longer in scope, it will refer to memory
11166containing junk or other data. It's not wise to take the risk.
11167
11168The GNU C Compiler implements taking the address of a nested function
11169using a technique called @dfn{trampolines}. This technique was
11170described in @cite{Lexical Closures for C@t{++}} (Thomas M. Breuel,
11171USENIX C@t{++} Conference Proceedings, October 17--21, 1988).
11172
11173A nested function can jump to a label inherited from a containing
11174function, provided the label was explicitly declared in the containing
11175function (@pxref{Local Labels}). Such a jump returns instantly to the
11176containing function, exiting the nested function that did the
11177@code{goto} and any intermediate function invocations as well. Here
11178is an example:
11179
11180@example
11181@group
int
bar (int *array, int offset, int size)
11183@{
11184 /* @r{Explicitly declare the label @code{failure}.} */
11185 __label__ failure;
11186 int access (int *array, int index)
11187 @{
11188 if (index > size)
11189 /* @r{Exit this function,}
11190 @r{and return to @code{bar}.} */
11191 goto failure;
11192 return array[index + offset];
11193 @}
11194@end group
11195
11196@group
11197 int i;
11198 @r{@dots{}}
11199 for (i = 0; i < size; i++)
11200 @r{@dots{}} access (array, i) @r{@dots{}}
11201 @r{@dots{}}
11202 return 0;
11203
11204 /* @r{Control comes here from @code{access}
11205 if it does the @code{goto}.} */
11206 failure:
11207 return -1;
11208@}
11209@end group
11210@end example
11211
11212To declare the nested function before its definition, use
11213@code{auto} (which is otherwise meaningless for function declarations;
11214@pxref{auto and register}). For example,
11215
11216@example
int
bar (int *array, int offset, int size)
11218@{
11219 auto int access (int *, int);
11220 @r{@dots{}}
11221 @r{@dots{}} access (array, i) @r{@dots{}}
11222 @r{@dots{}}
11223 int access (int *array, int index)
11224 @{
11225 @r{@dots{}}
11226 @}
11227 @r{@dots{}}
11228@}
11229@end example
11230
11231@node Inline Function Definitions
11232@subsection Inline Function Definitions
11233@cindex inline function definitions
11234@cindex function definitions, inline
11235@findex inline
11236
To declare a function inline, use the @code{inline} keyword in its
definition. Here are two simple functions, declared inline, that each
take a pointer to a @code{struct list} and return one of its fields.
11240
11241@example
11242struct list
11243@{
11244 struct list *first, *second;
11245@};
11246
11247inline struct list *
11248list_first (struct list *p)
11249@{
11250 return p->first;
11251@}
11252
11253inline struct list *
11254list_second (struct list *p)
11255@{
11256 return p->second;
11257@}
11258@end example
11259
Optimized compilation can substitute the inline function's body for
11261any call to it. This is called @emph{inlining} the function. It
11262makes the code that contains the call run faster, significantly so if
11263the inline function is small.
11264
Here's a function that uses @code{list_second}:

@example
int
pairlist_length (struct list *l)
@{
  int length = 0;
  while (l)
    @{
      length++;
      l = list_second (l);
    @}
  return length;
@}
@end example

Substituting the code of @code{list_second} into the definition of
@code{pairlist_length} results in this code, in effect:

@example
int
pairlist_length (struct list *l)
@{
  int length = 0;
  while (l)
    @{
      length++;
      l = l->second;
    @}
  return length;
@}
@end example

Since the definition of @code{list_second} does not say @code{extern}
11299or @code{static}, that definition is used only for inlining. It
11300doesn't generate code that can be called at run time. If not all the
11301calls to the function are inlined, there must be a definition of the
11302same function name in another module for them to call.
11303
11304@cindex inline functions, omission of
11305@c @opindex fkeep-inline-functions
11306Adding @code{static} to an inline function definition means the
11307function definition is limited to this compilation module. Also, it
11308generates run-time code if necessary for the sake of any calls that
11309were not inlined. If all calls are inlined then the function
11310definition does not generate run-time code, but you can force
11311generation of run-time code with the option
11312@option{-fkeep-inline-functions}.
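
As a minimal sketch of the @code{static inline} case, here is a small
helper function and a caller in the same compilation module. The
names @code{clamp} and @code{clamp_samples} are hypothetical.

```c
/* Return VALUE limited to the range LOW..HIGH.  */
static inline int
clamp (int value, int low, int high)
{
  if (value < low)
    return low;
  if (value > high)
    return high;
  return value;
}

/* Limit each of the COUNT elements of SAMPLES to 0..255.  */
void
clamp_samples (int *samples, int count)
{
  for (int i = 0; i < count; i++)
    samples[i] = clamp (samples[i], 0, 255);
}
```

Whether the call in the loop is actually inlined is up to the
compiler; either way, the program behaves the same.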
11313
11314@cindex extern inline function
11315Specifying @code{extern} along with @code{inline} means the function is
11316external and generates run-time code to be called from other
11317separately compiled modules, as well as inlined. You can define the
11318function as @code{inline} without @code{extern} in other modules so as
11319to inline calls to the same function in those modules.
11320
11321Why are some calls not inlined? First of all, inlining is an
11322optimization, so non-optimized compilation does not inline.
11323
11324Some calls cannot be inlined for technical reasons. Also, certain
11325usages in a function definition can make it unsuitable for inline
11326substitution. Among these usages are: variadic functions, use of
11327@code{alloca}, use of computed goto (@pxref{Labels as Values}), and
11328use of nonlocal goto. The option @option{-Winline} requests a warning
11329when a function marked @code{inline} is unsuitable to be inlined. The
11330warning explains what obstacle makes it unsuitable.
11331
11332Just because a call @emph{can} be inlined does not mean it
11333@emph{should} be inlined. The GNU C compiler weighs costs and
11334benefits to decide whether inlining a particular call is advantageous.
11335
11336You can force inlining of all calls to a given function that can be
inlined, even in a non-optimized compilation, by specifying the
11338@samp{always_inline} attribute for the function, like this:
11339
11340@example
11341/* @r{Prototype.} */
11342inline void foo (const char) __attribute__((always_inline));
11343@end example
11344
11345@noindent
11346This is a GNU C extension. @xref{Attributes}.
11347
11348A function call may be inlined even if not declared @code{inline} in
11349special cases where the compiler can determine this is correct and
11350desirable. For instance, when a static function is called only once,
11351it will very likely be inlined. With @option{-flto}, link-time
11352optimization, any function might be inlined. To absolutely prevent
11353inlining of a specific function, specify
11354@code{__attribute__((__noinline__))} in the function's definition.
11355
11356@node Obsolete Definitions
11357@section Obsolete Function Features
11358
11359These features of function definitions are still used in old
11360programs, but you shouldn't write code this way today.
11361If you are just learning C, you can skip this section.
11362
11363@menu
11364* Old GNU Inlining:: An older inlining technique.
11365* Old-Style Function Definitions:: Original K&R style functions.
11366@end menu
11367
11368@node Old GNU Inlining
11369@subsection Older GNU C Inlining
11370
11371The GNU C spec for inline functions, before GCC version 5, defined
11372@code{extern inline} on a function definition to mean to inline calls
11373to it but @emph{not} generate code for the function that could be
11374called at run time. By contrast, @code{inline} without @code{extern}
11375specified to generate run-time code for the function. In effect, ISO
11376incompatibly flipped the meanings of these two cases. We changed GCC
11377in version 5 to adopt the ISO specification.
11378
11379Many programs still use these cases with the previous GNU C meanings.
11380You can specify use of those meanings with the option
11381@option{-fgnu89-inline}. You can also specify this for a single
11382function with @code{__attribute__ ((gnu_inline))}. Here's an example:
11383
11384@example
11385inline __attribute__ ((gnu_inline))
void
11387inc (int *a)
11388@{
11389 (*a)++;
11390@}
11391@end example
11392
11393@node Old-Style Function Definitions
11394@subsection Old-Style Function Definitions
11395@cindex old-style function definitions
11396@cindex function definitions, old-style
11397@cindex K&R-style function definitions
11398
11399The syntax of C traditionally allows omitting the data type in a
11400function declaration if it specifies a storage class or a qualifier.
11401Then the type defaults to @code{int}. For example:
11402
11403@example
11404static foo (double x);
11405@end example
11406
11407@noindent
11408defaults the return type to @code{int}. This is bad practice; if you
11409see it, fix it.
11410
11411An @dfn{old-style} (or ``K&R'') function definition is the way
11412function definitions were written in the 1980s. It looks like this:
11413
11414@example
11415@var{rettype}
11416@var{function} (@var{parmnames})
11417 @var{parm_declarations}
11418@{
11419 @var{body}
11420@}
11421@end example
11422
11423In @var{parmnames}, only the parameter names are listed, separated by
11424commas. Then @var{parm_declarations} declares their data types; these
11425declarations look just like variable declarations. If a parameter is
11426listed in @var{parmnames} but has no declaration, it is implicitly
11427declared @code{int}.
11428
There is no reason to write a definition this way nowadays, but such
definitions can still be seen in older GNU programs.
11431
11432An old-style variadic function definition looks like this:
11433
11434@example
11435#include <varargs.h>
11436
11437int
11438add_multiple_values (va_alist)
11439 va_dcl
11440@{
11441 int argcount;
11442 int counter, total = 0;
11443
11444 /* @r{Declare a variable of type @code{va_list}.} */
11445 va_list argptr;
11446
11447 /* @r{Initialize that variable.} */
11448 va_start (argptr);
11449
11450 /* @r{Get the first argument (fixed).} */
  argcount = va_arg (argptr, int);
11452
11453 for (counter = 0; counter < argcount; counter++)
11454 @{
11455 /* @r{Get the next additional argument.} */
11456 total += va_arg (argptr, int);
11457 @}
11458
11459 /* @r{End use of the @code{argptr} variable.} */
11460 va_end (argptr);
11461
11462 return total;
11463@}
11464@end example
11465
11466Note that the old-style variadic function definition has no fixed
11467parameter variables; all arguments must be obtained with
11468@code{va_arg}.
11469
11470@node Compatible Types
11471@chapter Compatible Types
11472@cindex compatible types
11473@cindex types, compatible
11474
11475Declaring a function or variable twice is valid in C only if the two
11476declarations specify @dfn{compatible} types. In addition, some
11477operations on pointers require operands to have compatible target
11478types.
11479
11480In C, two different primitive types are never compatible. Likewise for
11481the defined types @code{struct}, @code{union} and @code{enum}: two
11482separately defined types are incompatible unless they are defined
11483exactly the same way.
11484
11485However, there are a few cases where different types can be
11486compatible:
11487
11488@itemize @bullet
11489@item
11490Every enumeration type is compatible with some integer type. In GNU
11491C, the choice of integer type depends on the largest enumeration
11492value.
11493
11494@c ??? Which one, in GCC?
11495@c ??? ... it varies, depending on the enum values. Testing on
11496@c ??? fencepost, it appears to use a 4-byte signed integer first,
11497@c ??? then moves on to an 8-byte signed integer. These details
11498@c ??? might be platform-dependent, as the C standard says that even
11499@c ??? char could be used as an enum type, but it's at least true
11500@c ??? that GCC chooses a type that is at least large enough to
11501@c ??? hold the largest enum value.
11502
11503@item
11504Array types are compatible if the element types are compatible
11505and the sizes (when specified) match.
11506
11507@item
11508Pointer types are compatible if the pointer target types are
11509compatible.
11510
11511@item
11512Function types that specify argument types are compatible if the
11513return types are compatible and the argument types are compatible,
11514argument by argument. In addition, they must all agree in whether
11515they use @code{...} to allow additional arguments.
11516
11517@item
11518Function types that don't specify argument types are compatible if the
11519return types are.
11520
11521@item
11522Function types that specify the argument types are compatible with
11523function types that omit them, if the return types are compatible and
11524the specified argument types are unaltered by the argument promotions
11525(@pxref{Argument Promotions}).
11526@end itemize
11527
11528In order for types to be compatible, they must agree in their type
11529qualifiers. Thus, @code{const int} and @code{int} are incompatible.
11530It follows that @code{const int *} and @code{int *} are incompatible
11531too (they are pointers to types that are not compatible).
11532
11533If two types are compatible ignoring the qualifiers, we call them
11534@dfn{nearly compatible}. (If they are array types, we ignore
11535qualifiers on the element types.@footnote{This is a GNU C extension.})
11536Comparison of pointers is valid if the pointers' target types are
11537nearly compatible. Likewise, the two branches of a conditional
11538expression may be pointers to nearly compatible target types.
11539
If two types are compatible ignoring the qualifiers, and the first
type has all the qualifiers of the second type, we say the first is
@dfn{upward compatible} with the second. Assignment of pointers
requires the target type of the left operand (the pointer being
assigned) to be upward compatible with the target type of the right
operand (the new value).
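
As a minimal sketch of these rules (the function names here are
hypothetical, chosen only for illustration), the following shows an
initialization that is valid because the left-hand target type is
upward compatible with the right-hand one, and a comparison that is
valid because the target types are nearly compatible:

@example
/* Hypothetical helper: reads through a pointer to const int.  */
int
read_through (const int *cp)
@{
  return *cp;
@}

int
qualifier_demo (void)
@{
  int value = 5;
  int *p = &value;

  /* Valid: const int has all the qualifiers of int,
     so it is upward compatible with int.  */
  const int *cp = p;

  /* The reverse, p = cp, would discard the const qualifier,
     so it is not a valid assignment.  */

  /* Valid comparison: the target types are nearly compatible.  */
  return read_through (cp) + (p == cp);
@}
@end example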
11545
11546@node Type Conversions
11547@chapter Type Conversions
11548@cindex type conversions
11549@cindex conversions, type
11550
11551C converts between data types automatically when that seems clearly
11552necessary. In addition, you can convert explicitly with a @dfn{cast}.
11553
11554@menu
11555* Explicit Type Conversion:: Casting a value from one type to another.
11556* Assignment Type Conversions:: Automatic conversion by assignment operation.
11557* Argument Promotions:: Automatic conversion of function parameters.
11558* Operand Promotions:: Automatic conversion of arithmetic operands.
11559* Common Type:: When operand types differ, which one is used?
11560@end menu
11561
11562@node Explicit Type Conversion
11563@section Explicit Type Conversion
11564@cindex cast
11565@cindex explicit type conversion
11566
11567You can do explicit conversions using the unary @dfn{cast} operator,
11568which is written as a type designator (@pxref{Type Designators}) in
11569parentheses. For example, @code{(int)} is the operator to cast to
11570type @code{int}. Here's an example of using it:
11571
11572@example
11573@{
11574 double d = 5.5;
11575
11576 printf ("Floating point value: %f\n", d);
  printf ("Truncated to integer: %d\n", (int) d);
11578@}
11579@end example
11580
11581Using @code{(int) d} passes an @code{int} value as argument to
11582@code{printf}, so you can print it with @samp{%d}. Using just
11583@code{d} without the cast would pass the value as @code{double}.
11584That won't work at all with @samp{%d}; the results would be gibberish.
11585
To divide one integer by another without discarding the fractional
part, cast either of the integers to @code{double} first:
11588
11589@example
11590(double) @var{dividend} / @var{divisor}
11591@var{dividend} / (double) @var{divisor}
11592@end example
11593
11594It is enough to cast one of them, because that forces the common type
11595to @code{double} so the other will be converted automatically.
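
Here is a small sketch of the difference, using two hypothetical
functions that are not part of any library:

@example
double
ratio (int dividend, int divisor)
@{
  /* The cast makes the common type double,
     so divisor is converted automatically.  */
  return (double) dividend / divisor;
@}

int
quotient (int dividend, int divisor)
@{
  /* Integer division discards the fractional part.  */
  return dividend / divisor;
@}
@end example

@noindent
With these definitions, @code{ratio (7, 2)} is 3.5 while
@code{quotient (7, 2)} is 3.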
11596
11597The valid cast conversions are:
11598
11599@itemize @bullet
11600@item
11601One numerical type to another.
11602
11603@item
11604One pointer type to another.
11605(Converting between pointers that point to functions
11606and pointers that point to data is not standard C.)
11607
11608@item
11609A pointer type to an integer type.
11610
11611@item
11612An integer type to a pointer type.
11613
11614@item
11615To a union type, from the type of any alternative in the union
11616(@pxref{Unions}). (This is a GNU extension.)
11617
11618@item
11619Anything, to @code{void}.
11620@end itemize
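
The following sketch exercises several of these conversions. The
function names are hypothetical, and the use of @code{uintptr_t} from
@file{stdint.h} as the integer type is an assumption made for the
example; the text above does not prescribe a particular integer type.

@example
#include <stdint.h>

/* Convert a pointer to an integer and back again.  */
int
round_trip (void)
@{
  int value = 42;
  uintptr_t bits = (uintptr_t) &value;   /* Pointer to integer.  */
  int *p = (int *) bits;                 /* Integer to pointer.  */

  (void) bits;      /* Anything, to void: the value is discarded.  */
  return *p;
@}

/* One numerical type to another.  */
int
to_int (double d)
@{
  return (int) d;   /* The fractional part is discarded.  */
@}
@end example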
11621
11622@node Assignment Type Conversions
11623@section Assignment Type Conversions
11624@cindex assignment type conversions
11625
11626Certain type conversions occur automatically in assignments
11627and certain other contexts. These are the conversions
11628assignments can do:
11629
11630@itemize @bullet
11631@item
11632Converting any numeric type to any other numeric type.
11633
11634@item
11635Converting @code{void *} to any other pointer type
11636(except pointer-to-function types).
11637
11638@item
Converting any other pointer type
(except pointer-to-function types) to @code{void *}.
11641
11642@item
11643Converting 0 (a null pointer constant) to any pointer type.
11644
11645@item
Converting any pointer type to @code{bool}. (The result is
1 if the pointer is not null, and 0 if it is null.)
11648
11649@item
11650Converting between pointer types when the left-hand target type is
11651upward compatible with the right-hand target type. @xref{Compatible
11652Types}.
11653@end itemize
11654
11655These type conversions occur automatically in certain contexts,
11656which are:
11657
11658@itemize @bullet
11659@item
11660An assignment converts the type of the right-hand expression
11661to the type wanted by the left-hand expression. For example,
11662
11663@example
11664double i;
11665i = 5;
11666@end example
11667
11668@noindent
11669converts 5 to @code{double}.
11670
11671@item
11672A function call, when the function specifies the type for that
11673argument, converts the argument value to that type. For example,
11674
11675@example
11676void foo (double);
11677foo (5);
11678@end example
11679
11680@noindent
11681converts 5 to @code{double}.
11682
11683@item
11684A @code{return} statement converts the specified value to the type
11685that the function is declared to return. For example,
11686
11687@example
11688double
11689foo ()
11690@{
11691 return 5;
11692@}
11693@end example
11694
11695@noindent
11696also converts 5 to @code{double}.
11697@end itemize
11698
11699In all three contexts, if the conversion is impossible, that
11700constitutes an error.
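
As an illustration, this sketch (with a hypothetical function name)
relies on several of the automatic conversions listed above, with no
casts:

@example
#include <stdbool.h>

int
conversion_demo (void)
@{
  int value = 7;
  void *v = &value;         /* int * to void *.  */
  int *p = v;               /* void * to int *.  */
  char *none = 0;           /* Null pointer constant to pointer.  */
  bool have_p = p;          /* 1, since p is not null.  */
  bool have_none = none;    /* 0, since none is null.  */
  double d = 5;             /* int to double.  */

  return have_p + have_none + (*p == 7) + (d == 5.0);
@}
@end example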
11701
11702@node Argument Promotions
11703@section Argument Promotions
11704@cindex argument promotions
11705@cindex promotion of arguments
11706
11707When a function's definition or declaration does not specify the type
11708of an argument, that argument is passed without conversion in whatever
11709type it has, with these exceptions:
11710
11711@itemize @bullet
11712@item
11713Some narrow numeric values are @dfn{promoted} to a wider type. If the
11714expression is a narrow integer, such as @code{char} or @code{short},
11715the call converts it automatically to @code{int} (@pxref{Integer
11716Types}).@footnote{On an embedded controller where @code{char}
11717or @code{short} is the same width as @code{int}, @code{unsigned char}
11718or @code{unsigned short} promotes to @code{unsigned int}, but that
11719never occurs in GNU C on real computers.}
11720
11721In this example, the expression @code{c} is passed as an @code{int}:
11722
11723@example
11724char c = '$';
11725
11726printf ("Character c is '%c'\n", c);
11727@end example
11728
11729@item
11730If the expression
11731has type @code{float}, the call converts it automatically to
11732@code{double}.
11733
11734@item
11735An array as argument is converted to a pointer to its zeroth element.
11736
11737@item
11738A function name as argument is converted to a pointer to that function.
11739@end itemize
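
A function declared with @code{...} does not specify the types of its
additional arguments, so the argument promotions apply to them. The
sketch below uses a hypothetical function @code{sum_values}; because
each @code{float} argument is promoted, the function fetches it as a
@code{double}:

@example
#include <stdarg.h>

/* Add up COUNT floating-point arguments.  */
double
sum_values (int count, ...)
@{
  va_list ap;
  double total = 0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    /* A float argument arrives as a double.  */
    total += va_arg (ap, double);
  va_end (ap);
  return total;
@}
@end example

@noindent
Given @code{float a = 1.5f, b = 2.25f;}, the call
@code{sum_values (2, a, b)} returns 3.75.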
11740
11741@node Operand Promotions
11742@section Operand Promotions
11743@cindex operand promotions
11744
11745The operands in arithmetic operations undergo type conversion automatically.
11746These @dfn{operand promotions} are the same as the argument promotions
11747except without converting @code{float} to @code{double}. In other words,
11748the operand promotions convert
11749
11750@itemize @bullet
11751@item
@code{char} or @code{short} (whether signed or not) to @code{int},
11753
11754@item
11755an array to a pointer to its zeroth element, and
11756
11757@item
11758a function name to a pointer to that function.
11759@end itemize
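
The effect of promoting narrow integers can be observed directly. This
sketch uses hypothetical function names and assumes an 8-bit
@code{char} and a wider @code{int}:

@example
int
promoted_sum (void)
@{
  unsigned char a = 200, b = 100;

  /* Both operands are promoted to int before the addition,
     so the result is 300 rather than wrapping to 44.  */
  return a + b;
@}

int
operand_is_int_sized (void)
@{
  char c = 'x';

  /* The expression c + c has type int, not char.  */
  return sizeof (c + c) == sizeof (int);
@}
@end example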
11760
11761@node Common Type
11762@section Common Type
11763@cindex common type
11764
11765Arithmetic binary operators (except the shift operators) convert their
11766operands to the @dfn{common type} before operating on them.
11767Conditional expressions also convert the two possible results to their
11768common type. Here are the rules for determining the common type.
11769
11770If one of the numbers has a floating-point type and the other is an
11771integer, the common type is that floating-point type. For instance,
11772
11773@example
117745.6 * 2 @result{} 11.2 /* @r{a @code{double} value} */
11775@end example
11776
11777If both are floating point, the type with the larger range is the
11778common type.
11779
11780If both are integers but of different widths, the common type
11781is the wider of the two.
11782
11783If they are integer types of the same width, the common type is
11784unsigned if either operand is unsigned, and it's @code{long} if either
11785operand is @code{long}. It's @code{long long} if either operand is
11786@code{long long}.
11787
11788These rules apply to addition, subtraction, multiplication, division,
11789remainder, comparisons, and bitwise operations. They also apply to
11790the two branches of a conditional expression, and to the arithmetic
11791done in a modifying assignment operation.
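
The rule for operands of the same width can be surprising in
comparisons. In this sketch (the function names are hypothetical),
the common type of @code{int} and @code{unsigned int} is
@code{unsigned int}:

@example
int
less_than_demo (void)
@{
  int i = -1;
  unsigned int u = 1;

  /* i is converted to unsigned int, which makes it a large
     positive value, so the comparison is false.  */
  return i < u;
@}

int
long_wins (void)
@{
  /* int and long: the common type is long.  */
  return sizeof (1 + 1L) == sizeof (long);
@}
@end example

@noindent
Here @code{less_than_demo} returns 0 and @code{long_wins} returns 1.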
11792
11793@node Scope
11794@chapter Scope
11795@cindex scope
11796@cindex block scope
11797@cindex function scope
11798@cindex function prototype scope
11799
11800Each definition or declaration of an identifier is visible
11801in certain parts of the program, which is typically less than the whole
11802of the program. The parts where it is visible are called its @dfn{scope}.
11803
Normally, declarations made at the top level in the source -- that is,
not within any block or function definition -- are visible for the
11806entire contents of the source file after that point. This is called
11807@dfn{file scope} (@pxref{File-Scope Variables}).
11808
11809Declarations made within blocks of code, including within function
11810definitions, are visible only within those blocks. This is called
11811@dfn{block scope}. Here is an example:
11812
11813@example
11814@group
11815void
11816foo (void)
11817@{
11818 int x = 42;
11819@}
11820@end group
11821@end example
11822
11823@noindent
11824In this example, the variable @code{x} has block scope; it is visible
11825only within the @code{foo} function definition block. Thus, other
11826blocks could have their own variables, also named @code{x}, without
11827any conflict between those variables.
11828
11829A variable declared inside a subblock has a scope limited to
that subblock:
11831
11832@example
11833@group
11834void
11835foo (void)
11836@{
11837 @{
11838 int x = 42;
11839 @}
11840 // @r{@code{x} is out of scope here.}
11841@}
11842@end group
11843@end example
11844
11845If a variable declared within a block has the same name as a variable
11846declared outside of that block, the definition within the block
11847takes precedence during its scope:
11848
11849@example
11850@group
11851int x = 42;
11852
11853void
11854foo (void)
11855@{
11856 int x = 17;
11857 printf ("%d\n", x);
11858@}
11859@end group
11860@end example
11861
11862@noindent
11863This prints 17, the value of the variable @code{x} declared in the
11864function body block, rather than the value of the variable @code{x} at
11865file scope. We say that the inner declaration of @code{x}
11866@dfn{shadows} the outer declaration, for the extent of the inner
11867declaration's scope.
11868
11869A declaration with block scope can be shadowed by another declaration
11870with the same name in a subblock.
11871
11872@example
11873@group
11874void
11875foo (void)
11876@{
11877 char *x = "foo";
11878 @{
11879 int x = 42;
11880 @r{@dots{}}
11881 exit (x / 6);
11882 @}
11883@}
11884@end group
11885@end example
11886
11887A function parameter's scope is the entire function body, but it can
11888be shadowed. For example:
11889
11890@example
11891@group
11892int x = 42;
11893
11894void
11895foo (int x)
11896@{
11897 printf ("%d\n", x);
11898@}
11899@end group
11900@end example
11901
11902@noindent
This prints the value of the function parameter @code{x}, rather than
the value of the file-scope variable @code{x}.
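
A parameter can in turn be shadowed by a declaration in a subblock of
the function body. A minimal sketch, using a hypothetical function
@code{shadow_demo}:

@example
int
shadow_demo (int x)
@{
  int seen;
  @{
    int x = 17;      /* Shadows the parameter in this subblock.  */
    seen = x;
  @}
  return seen + x;   /* Here x is the parameter again.  */
@}
@end example

@noindent
Thus @code{shadow_demo (1)} returns 18.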
11905
Labels (@pxref{goto Statement}) have @dfn{function scope}: each label
11907is visible for the whole of the containing function body, both before
11908and after the label declaration:
11909
11910@example
11911@group
11912void
11913foo (void)
11914@{
11915 @r{@dots{}}
11916 goto bar;
11917 @r{@dots{}}
11918 @{ // @r{Subblock does not affect labels.}
11919 bar:
11920 @r{@dots{}}
11921 @}
11922 goto bar;
11923@}
11924@end group
11925@end example
11926
11927Except for labels, a declared identifier is not
11928visible to code before its declaration. For example:
11929
11930@example
11931@group
11932int x = 5;
11933int y = x + 10;
11934@end group
11935@end example
11936
11937@noindent
11938will work, but:
11939
11940@example
11941@group
11942int x = y + 10;
11943int y = 5;
11944@end group
11945@end example
11946
11947@noindent
11948cannot refer to the variable @code{y} before its declaration.
11949
11950@include cpp.texi
11951
11952@node Integers in Depth
11953@chapter Integers in Depth
11954
11955This chapter explains the machine-level details of integer types: how
11956they are represented as bits in memory, and the range of possible
11957values for each integer type.
11958
11959@menu
11960* Integer Representations:: How integer values appear in memory.
11961* Maximum and Minimum Values:: Value ranges of integer types.
11962@end menu
11963
11964@node Integer Representations
11965@section Integer Representations
11966
11967@cindex integer representations
11968@cindex representation of integers
11969
11970Modern computers store integer values as binary (base-2) numbers that
11971occupy a single unit of storage, typically either as an 8-bit
11972@code{char}, a 16-bit @code{short int}, a 32-bit @code{int}, or
11973possibly, a 64-bit @code{long long int}. Whether a @code{long int} is
11974a 32-bit or a 64-bit value is system dependent.@footnote{In theory,
any of these types could have some other size, but it's not worth even
11976a minute to cater to that possibility. It never happens on
11977GNU/Linux.}
11978
11979@cindex @code{CHAR_BIT}
11980The macro @code{CHAR_BIT}, defined in @file{limits.h}, gives the number
11981of bits in type @code{char}. On any real operating system, the value
11982is 8.
11983
The fixed sizes of numeric types necessarily limit their @dfn{range
11985of values}, and the particular encoding of integers decides what that
11986range is.
11987
11988@cindex two's-complement representation
11989For unsigned integers, the entire space is used to represent a
11990nonnegative value. Signed integers are stored using
11991@dfn{two's-complement representation}: a signed integer with @var{n}
11992bits has a range from @math{-2@sup{(@var{n} - 1)}} to @minus{}1 to 0
11993to 1 to @math{+2@sup{(@var{n} - 1)} - 1}, inclusive. The leftmost, or
11994high-order, bit is called the @dfn{sign bit}.
11995
11996@c ??? Needs correcting
11997
There is only one value that means zero, and the most negative number
lacks a positive counterpart. For example, a two's-complement signed
8-bit integer can represent all integers from @minus{}128 to +127.
As a result, negating the most negative number causes overflow; in
practice, its result is that number back again. We will revisit that
peculiarity shortly.
12004
12005Decades ago, there were computers that didn't use two's-complement
12006representation for integers (@pxref{Integers in Depth}), but they are
12007long gone and not worth any effort to support.
12008
12009@c ??? Is this duplicate?
12010
12011When an arithmetic operation produces a value that is too big to
12012represent, the operation is said to @dfn{overflow}. In C, integer
12013overflow does not interrupt the control flow or signal an error.
12014What it does depends on signedness.
12015
12016For unsigned arithmetic, the result of an operation that overflows is
12017the @var{n} low-order bits of the correct value. If the correct value
12018is representable in @var{n} bits, that is always the result;
12019thus we often say that ``integer arithmetic is exact,'' omitting the
12020crucial qualifying phrase ``as long as the exact result is
12021representable.''
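
A short sketch of unsigned overflow, using hypothetical function names:

@example
#include <limits.h>

unsigned int
wrapping_add (unsigned int a, unsigned int b)
@{
  /* If the exact sum does not fit, only its
     low-order bits remain.  */
  return a + b;
@}

int
wraps_to_zero (void)
@{
  /* UINT_MAX + 1 needs one more bit than unsigned int has.  */
  return wrapping_add (UINT_MAX, 1) == 0;
@}
@end example

@noindent
Here @code{wrapping_add (2, 3)} is exactly 5, while
@code{wraps_to_zero} returns 1.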
12022
12023In principle, a C program should be written so that overflow never
12024occurs for signed integers, but in GNU C you can specify various ways
12025of handling such overflow (@pxref{Integer Overflow}).
12026
12027Integer representations are best understood by looking at a table for
12028a tiny integer size; here are the possible values for an integer with
12029three bits:
12030
12031@multitable @columnfractions .25 .25 .25 .25
@headitem Unsigned @tab Signed @tab Bits @tab Two's Complement
12033@item 0 @tab 0 @tab 000 @tab 000 (0)
12034@item 1 @tab 1 @tab 001 @tab 111 (-1)
12035@item 2 @tab 2 @tab 010 @tab 110 (-2)
12036@item 3 @tab 3 @tab 011 @tab 101 (-3)
12037@item 4 @tab -4 @tab 100 @tab 100 (-4)
12038@item 5 @tab -3 @tab 101 @tab 011 (3)
12039@item 6 @tab -2 @tab 110 @tab 010 (2)
12040@item 7 @tab -1 @tab 111 @tab 001 (1)
12041@end multitable
12042
12043The parenthesized decimal numbers in the last column represent the
12044signed meanings of the two's-complement of the line's value. Recall
12045that, in two's-complement encoding, the high-order bit is 0 when
12046the number is nonnegative.
12047
12048We can now understand the peculiar behavior of negation of the
12049most negative two's-complement integer: start with 0b100,
12050invert the bits to get 0b011, and add 1: we get
120510b100, the value we started with.
12052
12053We can also see overflow behavior in two's-complement:
12054
12055@example
120563 + 1 = 0b011 + 0b001 = 0b100 = (-4)
120573 + 2 = 0b011 + 0b010 = 0b101 = (-3)
120583 + 3 = 0b011 + 0b011 = 0b110 = (-2)
12059@end example
12060
12061@noindent
12062A sum of two nonnegative signed values that overflows has a 1 in the
12063sign bit, so the exact positive result is truncated to a negative
12064value.
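
The three-bit arithmetic shown above can be modeled in C with
unsigned operations and a mask. This is only a sketch: the helper
functions are hypothetical, and they keep the three bits in the
low-order end of an @code{unsigned int}.

@example
unsigned int
negate_3bit (unsigned int bits)
@{
  /* Invert the bits, add 1, and keep three bits.  */
  return (~bits + 1) & 7;
@}

unsigned int
add_3bit (unsigned int a, unsigned int b)
@{
  return (a + b) & 7;
@}

/* Interpret a three-bit pattern as a signed value.  */
int
signed_3bit (unsigned int bits)
@{
  return (bits & 4) ? (int) bits - 8 : (int) bits;
@}
@end example

@noindent
With these definitions, @code{negate_3bit (4)} is 4 again, and
@code{signed_3bit (add_3bit (3, 1))} is @minus{}4.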
12065
12066@c =====================================================================
12067
12068@node Maximum and Minimum Values
12069@section Maximum and Minimum Values
12070@cindex maximum integer values
12071@cindex minimum integer values
12072@cindex integer ranges
12073@cindex ranges of integer types
12074@findex INT_MAX
12075@findex UINT_MAX
12076@findex SHRT_MAX
12077@findex LONG_MAX
12078@findex LLONG_MAX
12079@findex USHRT_MAX
12080@findex ULONG_MAX
12081@findex ULLONG_MAX
12082@findex CHAR_MAX
12083@findex SCHAR_MAX
12084@findex UCHAR_MAX
12085
12086For each primitive integer type, there is a standard macro defined in
12087@file{limits.h} that gives the largest value that type can hold. For
12088instance, for type @code{int}, the maximum value is @code{INT_MAX}.
On a system where @code{int} is 32 bits wide, that is equal to
2,147,483,647. The maximum value for @code{unsigned int} is
@code{UINT_MAX}, which for a 32-bit @code{unsigned int} is equal to
4,294,967,295. Likewise, there are
12092@code{SHRT_MAX}, @code{LONG_MAX}, and @code{LLONG_MAX}, and
12093corresponding unsigned limits @code{USHRT_MAX}, @code{ULONG_MAX}, and
12094@code{ULLONG_MAX}.
12095
12096Since there are three ways to specify a @code{char} type, there are
12097also three limits: @code{CHAR_MAX}, @code{SCHAR_MAX}, and
12098@code{UCHAR_MAX}.
12099
12100For each type that is or might be signed, there is another symbol that
12101gives the minimum value it can hold. (Just replace @code{MAX} with
12102@code{MIN} in the names listed above.) There is no minimum limit
12103symbol for types specified with @code{unsigned} because the
12104minimum for them is universally zero.
12105
12106@code{INT_MIN} is not the negative of @code{INT_MAX}. In
12107two's-complement representation, the most negative number is 1 less
than the negative of the most positive number. Thus, @code{INT_MIN}
for a 32-bit @code{int} has the value @minus{}2,147,483,648. You can't
12110actually write the value that way in C, since it would overflow.
12111That's a good reason to use @code{INT_MIN} to specify
12112that value. Its definition is written to avoid overflow.
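
One way to write that value without overflow is to subtract 1 from
the negative of @code{INT_MAX}. The sketch below (with a hypothetical
function name) checks that relationship; it illustrates the idea and
is not a copy of any particular @file{limits.h}.

@example
#include <limits.h>

int
min_is_one_below_negated_max (void)
@{
  /* Every intermediate value in -INT_MAX - 1 fits in int.  */
  return INT_MIN == -INT_MAX - 1;
@}
@end example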
12113
12114@include fp.texi
12115
12116@node Compilation
12117@chapter Compilation
12118@cindex object file
12119@cindex compilation module
12120@cindex make rules
12121
12122Early in the manual we explained how to compile a simple C program
12123that consists of a single source file (@pxref{Compile Example}).
12124However, we handle only short programs that way. A typical C program
12125consists of many source files, each of which is a separate
12126@dfn{compilation module}---meaning that it has to be compiled
12127separately.
12128
12129The full details of how to compile with GCC are documented in xxxx.
12130@c ??? ref
12131Here we give only a simple introduction.
12132
12133These are the commands to compile two compilation modules,
12134@file{foo.c} and @file{bar.c}, with a command for each module:
12135
12136@example
12137gcc -c -O -g foo.c
12138gcc -c -O -g bar.c
12139@end example
12140
12141@noindent
12142In these commands, @option{-g} says to generate debugging information,
12143@option{-O} says to do some optimization, and @option{-c} says to put
12144the compiled code for that module into a corresponding @dfn{object
12145file} and go no further. The object file for @file{foo.c} is called
12146@file{foo.o}, and so on.
12147
If you wish, you can specify the additional options @option{-Wformat
-Wparentheses -Wstrict-prototypes}, which request additional warnings.
12150
12151One reason to divide a large program into multiple compilation modules
12152is to control how each module can access the internals of the others.
12153When a module declares a function or variable @code{extern}, other
12154modules can access it. The other functions and variables in
12155a module can't be accessed from outside that module.
12156
12157The other reason for using multiple modules is so that changing
12158one source file does not require recompiling all of them in order
12159to try the modified program. Dividing a large program into many
12160substantial modules in this way typically makes recompilation much faster.
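
Here is a sketch of two hypothetical modules, shown in one listing for
brevity. It assumes the usual C rule that a function is accessible
from other modules unless it is declared @code{static}; the variable
@code{counter} is declared @code{static} to keep it internal to its
module.

@example
/* bar.c: hypothetical module.  */

static int counter;    /* Not accessible from other modules.  */

int
bar_next (void)        /* Accessible from other modules.  */
@{
  return ++counter;
@}

/* foo.c: hypothetical module that uses bar.c.  */

extern int bar_next (void);

int
foo_second (void)
@{
  bar_next ();
  return bar_next ();
@}
@end example

@noindent
Code in @file{foo.c} can call @code{bar_next}, but it cannot refer to
@code{counter}.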
12161
12162@cindex linking object files
12163After you compile all the program's modules, in order to run the
12164program you must @dfn{link} the object files into a combined
12165executable, like this:
12166
12167@example
12168gcc -o foo foo.o bar.o
12169@end example
12170
12171@noindent
In this command, @option{-o foo} specifies the file name for the
12173executable file, and the other arguments are the object files to link.
12174Always specify the executable file name in a command that generates
12175one.
12176
12177Normally we don't run any of these commands directly. Instead we
12178write a set of @dfn{make rules} for the program, then use the
12179@command{make} program to recompile only the source files that need to
12180be recompiled.
12181
12182@c ??? ref to make manual
12183
12184@node Directing Compilation
12185@chapter Directing Compilation
12186
12187This chapter describes C constructs that don't alter the program's
12188meaning @emph{as such}, but rather direct the compiler how to treat
12189some aspects of the program.
12190
12191@menu
* Pragmas:: Controlling compilation of some constructs.
12193* Static Assertions:: Compile-time tests for conditions.
12194@end menu
12195
12196@node Pragmas
12197@section Pragmas
12198
12199A @dfn{pragma} is an annotation in a program that gives direction to
12200the compiler.
12201
12202@menu
12203* Pragma Basics:: Pragma syntax and usage.
12204* Severity Pragmas:: Settings for compile-time pragma output.
12205* Optimization Pragmas:: Controlling optimizations.
12206@end menu
12207
12208@c See also @ref{Macro Pragmas}, which save and restore macro definitions.
12209
12210@node Pragma Basics
12211@subsection Pragma Basics
12212
12213C defines two syntactical forms for pragmas, the line form and the
12214token form. You can write any pragma in either form, with the same
12215meaning.
12216
12217The line form is a line in the source code, like this:
12218
12219@example
12220#pragma @var{line}
12221@end example
12222
12223@noindent
12224The line pragma has no effect on the parsing of the lines around it.
12225This form has the drawback that it can't be generated by a macro expansion.
12226
12227The token form is a series of tokens; it can appear anywhere in the
12228program between the other tokens.
12229
12230@example
12231_Pragma (@var{stringconstant})
12232@end example
12233
12234@noindent
12235The pragma has no effect on the syntax of the tokens that surround it;
12236thus, here's a pragma in the middle of an @code{if} statement:
12237
12238@example
12239if _Pragma ("hello") (x > 1)
12240@end example
12241
12242@noindent
12243However, that's an unclear thing to do; for the sake of
12244understandability, it is better to put a pragma on a line by itself
12245and not embedded in the middle of another construct.
12246
12247Both forms of pragma have a textual argument. In a line pragma, the
12248text is the rest of the line. The textual argument to @code{_Pragma}
12249uses the same syntax as a C string constant: surround the text with
12250two @samp{"} characters, and add a backslash before each @samp{"} or
12251@samp{\} character in it.
12252
12253With either syntax, the textual argument specifies what to do.
12254It begins with one or several words that specify the operation.
12255If the compiler does not recognize them, it ignores the pragma.
12256
12257Here are the pragma operations supported in GNU C@.
12258
12259@c ??? Verify font for []
12260@table @code
12261@item #pragma GCC dependency "@var{file}" [@var{message}]
12262@itemx _Pragma ("GCC dependency \"@var{file}\" [@var{message}]")
12263Declares that the current source file depends on @var{file}, so GNU C
12264compares the file times and gives a warning if @var{file} is newer
12265than the current source file.
12266
12267This directive searches for @var{file} the way @code{#include}
12268searches for a non-system header file.
12269
12270If @var{message} is given, the warning message includes that text.
12271
12272Examples:
12273
12274@example
12275#pragma GCC dependency "parse.y"
_Pragma ("GCC dependency \"/usr/include/time.h\" \
12277rerun fixincludes")
12278@end example
12279
12280@item #pragma GCC poison @var{identifiers}
12281@itemx _Pragma ("GCC poison @var{identifiers}")
12282Poisons the identifiers listed in @var{identifiers}.
12283
12284This is useful to make sure all mention of @var{identifiers} has been
12285deleted from the program and that no reference to them creeps back in.
12286If any of those identifiers appears anywhere in the source after the
12287directive, it causes a compilation error. For example,
12288
12289@example
12290#pragma GCC poison printf sprintf fprintf
12291sprintf(some_string, "hello");
12292@end example
12293
12294@noindent
12295generates an error.
12296
12297If a poisoned identifier appears as part of the expansion of a macro
12298that was defined before the identifier was poisoned, it will @emph{not}
12299cause an error. Thus, system headers that define macros that use
12300the identifier will not cause errors.
12301
12302For example,
12303
12304@example
12305#define strrchr rindex
12306_Pragma ("GCC poison rindex")
12307strrchr(some_string, 'h');
12308@end example
12309
12310@noindent
12311does not cause a compilation error.
12312
12313@item #pragma GCC system_header
12314@itemx _Pragma ("GCC system_header")
12315Specify treating the rest of the current source file as if it came
12316from a system header file. @xref{System Headers, System Headers,
12317System Headers, gcc, Using the GNU Compiler Collection}.
12318
12319@item #pragma GCC warning @var{message}
12320@itemx _Pragma ("GCC warning @var{message}")
12321Equivalent to @code{#warning}. Its advantage is that the
12322@code{_Pragma} form can be included in a macro definition.
12323
12324@item #pragma GCC error @var{message}
12325@itemx _Pragma ("GCC error @var{message}")
12326Equivalent to @code{#error}. Its advantage is that the
12327@code{_Pragma} form can be included in a macro definition.
12328
@item #pragma message @var{message}
@itemx _Pragma ("message @var{message}")
12331Similar to @samp{GCC warning} and @samp{GCC error}, this simply prints an
12332informational message, and could be used to include additional warning
12333or error text without triggering more warnings or errors. (Note that
12334unlike @samp{warning} and @samp{error}, @samp{message} does not include
12335@samp{GCC} as part of the pragma.)
12336@end table
12337
12338@node Severity Pragmas
12339@subsection Severity Pragmas
12340
12341These pragmas control the severity of classes of diagnostics.
12342You can specify the class of diagnostic with the GCC option that causes
12343those diagnostics to be generated.
12344
12345@table @code
@item #pragma GCC diagnostic error "@var{option}"
@itemx _Pragma ("GCC diagnostic error \"@var{option}\"")
For code following this pragma, treat diagnostics of the variety
specified by @var{option} as errors. Write @var{option} as a string
constant. For example:

@example
_Pragma ("GCC diagnostic error \"-Wformat\"")
@end example

@noindent
specifies to treat diagnostics enabled by the @option{-Wformat} option
as errors rather than warnings.

@item #pragma GCC diagnostic warning "@var{option}"
@itemx _Pragma ("GCC diagnostic warning \"@var{option}\"")
For code following this pragma, treat diagnostics of the variety
specified by @var{option} as warnings. This overrides the
@option{-Werror} option which says to treat warnings as errors.

@item #pragma GCC diagnostic ignored "@var{option}"
@itemx _Pragma ("GCC diagnostic ignored \"@var{option}\"")
For code following this pragma, refrain from reporting any diagnostics
of the variety specified by @var{option}.
12369
12370@item #pragma GCC diagnostic push
12371@itemx _Pragma ("GCC diagnostic push")
12372@itemx #pragma GCC diagnostic pop
12373@itemx _Pragma ("GCC diagnostic pop")
12374These pragmas maintain a stack of states for severity settings.
12375@samp{GCC diagnostic push} saves the current settings on the stack,
12376and @samp{GCC diagnostic pop} pops the last stack item and restores
12377the current settings from that.
12378
12379@samp{GCC diagnostic pop} when the severity setting stack is empty
12380restores the settings to what they were at the start of compilation.
12381
12382Here is an example:
12383
12384@example
_Pragma ("GCC diagnostic error \"-Wformat\"")

/* @r{@option{-Wformat} messages treated as errors. } */

_Pragma ("GCC diagnostic push")
_Pragma ("GCC diagnostic warning \"-Wformat\"")

/* @r{@option{-Wformat} messages treated as warnings. } */

_Pragma ("GCC diagnostic push")
_Pragma ("GCC diagnostic ignored \"-Wformat\"")
12396
12397/* @r{@option{-Wformat} messages suppressed. } */
12398
12399_Pragma ("GCC diagnostic pop")
12400
12401/* @r{@option{-Wformat} messages treated as warnings again. } */
12402
12403_Pragma ("GCC diagnostic pop")
12404
12405/* @r{@option{-Wformat} messages treated as errors again. } */
12406
12407/* @r{This is an excess @samp{pop} that matches no @samp{push}. } */
12408_Pragma ("GCC diagnostic pop")
12409
12410/* @r{@option{-Wformat} messages treated once again}
12411 @r{as specified by the GCC command-line options.} */
12412@end example
12413@end table
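
Because the @code{_Pragma} form can be produced by macro expansion, a
push, a severity change and the matching pop can be packaged as
macros. The following is a sketch only: the macro and function names
are hypothetical, and @option{-Wunused-variable} is chosen merely as
an example of a warning option.

@example
/* The # operator turns the macro argument into the string
   constant that _Pragma requires, adding the backslashes
   before the inner quotes automatically.  */
#define DO_PRAGMA(text) _Pragma (#text)
#define IGNORE_UNUSED_BEGIN \
  DO_PRAGMA (GCC diagnostic push) \
  DO_PRAGMA (GCC diagnostic ignored "-Wunused-variable")
#define IGNORE_UNUSED_END DO_PRAGMA (GCC diagnostic pop)

IGNORE_UNUSED_BEGIN
int
answer (void)
@{
  int scratch = 0;   /* Unused; the warning is suppressed here.  */
  return 42;
@}
IGNORE_UNUSED_END
@end example

@noindent
After @code{IGNORE_UNUSED_END}, the previous severity settings are
in effect again.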
12414
12415@node Optimization Pragmas
12416@subsection Optimization Pragmas
12417
12418These pragmas enable a particular optimization for specific function
12419definitions. The settings take effect at the end of a function
12420definition, so the clean place to use these pragmas is between
12421function definitions.
12422
12423@table @code
12424@item #pragma GCC optimize @var{optimization}
12425@itemx _Pragma ("GCC optimize @var{optimization}")
12426These pragmas enable the optimization @var{optimization} for the
12427following functions. For example,
12428
12429@example
_Pragma ("GCC optimize \"-fforward-propagate\"")
12431@end example
12432
12433@noindent
12434says to apply the @samp{forward-propagate} optimization to all
12435following function definitions. Specifying optimizations for
12436individual functions, rather than for the entire program, is rare but
12437can be useful for getting around a bug in the compiler.
12438
12439If @var{optimization} does not correspond to a defined optimization
12440option, the pragma is erroneous. To turn off an optimization, use the
12441corresponding @samp{-fno-} option, such as
12442@samp{-fno-forward-propagate}.
12443
12444@item #pragma GCC target @var{optimizations}
12445@itemx _Pragma ("GCC target @var{optimizations}")
12446The pragma @samp{GCC target} is similar to @samp{GCC optimize} but is
12447used for platform-specific optimizations. Thus,
12448
12449@example
_Pragma ("GCC target \"popcnt\"")
12451@end example
12452
12453@noindent
12454activates the optimization @samp{popcnt} for all
12455following function definitions. This optimization is supported
12456on a few common targets but not on others.
12457
12458@item #pragma GCC push_options
12459@itemx _Pragma ("GCC push_options")
12460The @samp{push_options} pragma saves on a stack the current settings
12461specified with the @samp{target} and @samp{optimize} pragmas.
12462
12463@item #pragma GCC pop_options
12464@itemx _Pragma ("GCC pop_options")
12465The @samp{pop_options} pragma pops saved settings from that stack.
12466
12467Here's an example of using this stack.
12468
12469@example
12470_Pragma ("GCC push_options")
_Pragma ("GCC optimize \"forward-propagate\"")
12472
12473/* @r{Functions to compile}
12474 @r{with the @code{forward-propagate} optimization.} */
12475
12476_Pragma ("GCC pop_options")
12477/* @r{Ends enablement of @code{forward-propagate}.} */
12478@end example
12479
12480@item #pragma GCC reset_options
12481@itemx _Pragma ("GCC reset_options")
12482Clears all pragma-defined @samp{target} and @samp{optimize}
12483optimization settings.
12484@end table
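
The following is a minimal sketch, not a recommendation, of confining
an optimization setting to one function with the options stack.  The
function names are hypothetical, and the level @code{"O0"} is chosen
only for illustration; the option is given as a quoted string in
parentheses, as in the GCC manual.

```c
/* Save the current optimization settings.  */
#pragma GCC push_options
#pragma GCC optimize ("O0")

/* Hypothetical function: compiled without optimization, whatever
   the command line says.  */
int
slow_square (int x)
{
  return x * x;
}

/* Restore the settings saved by push_options.  */
#pragma GCC pop_options

/* Hypothetical function: compiled with the command-line settings.  */
int
fast_square (int x)
{
  return x * x;
}
```

Both functions compute the same result; only the way they are compiled
differs.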
12485
12486@node Static Assertions
12487@section Static Assertions
12488@cindex static assertions
12489@findex _Static_assert
12490
You can add compile-time tests for necessary conditions into your
12492code using @code{_Static_assert}. This can be useful, for example, to
12493check that the compilation target platform supports the type sizes
12494that the code expects. For example,
12495
12496@example
12497_Static_assert ((sizeof (long int) >= 8),
12498 "long int needs to be at least 8 bytes");
12499@end example
12500
12501@noindent
12502reports a compile-time error if compiled on a system with long
12503integers smaller than 8 bytes, with @samp{long int needs to be at
12504least 8 bytes} as the error message.
12505
Since calls to @code{_Static_assert} are processed at compile time, the
12507expression must be computable at compile time and the error message
12508must be a literal string. The expression can refer to the sizes of
12509variables, but can't refer to their values. For example, the
12510following static assertion is invalid for two reasons:
12511
12512@example
12513char *error_message
12514 = "long int needs to be at least 8 bytes";
12515int size_of_long_int = sizeof (long int);
12516
12517_Static_assert (size_of_long_int == 8, error_message);
12518@end example
12519
12520@noindent
12521The expression @code{size_of_long_int == 8} isn't computable at
12522compile time, and the error message isn't a literal string.
12523
12524You can, though, use preprocessor definition values with
12525@code{_Static_assert}:
12526
12527@example
12528#define LONG_INT_ERROR_MESSAGE "long int needs to be \
12529at least 8 bytes"
12530
12531_Static_assert ((sizeof (long int) == 8),
12532 LONG_INT_ERROR_MESSAGE);
12533@end example
12534
12535Static assertions are permitted wherever a statement or declaration is
12536permitted, including at top level in the file, and also inside the
12537definition of a type.
12538
12539@example
12540union y
12541@{
12542 int i;
12543 int *ptr;
12544 _Static_assert (sizeof (int *) == sizeof (int),
12545 "Pointer and int not same size");
12546@};
12547@end example
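
To illustrate checking a data layout at top level, here is a small
sketch.  The structure @code{struct packet_header} and the function
@code{header_size} are hypothetical, and the expected offsets assume
a platform where @code{unsigned short} is 2 bytes with alignment 2.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical structure whose layout the code depends on.  */
struct packet_header
{
  unsigned char kind;
  unsigned char flags;
  unsigned short length;
};

/* Each assertion stops compilation if its condition is false.  */
_Static_assert (CHAR_BIT == 8, "bytes must have 8 bits");
_Static_assert (offsetof (struct packet_header, length) == 2,
                "length must start at byte 2");
_Static_assert (sizeof (struct packet_header) == 4,
                "packet header must be 4 bytes");

/* Code after this point can rely on the layout checked above.  */
size_t
header_size (void)
{
  return sizeof (struct packet_header);
}
```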
12548
12549@node Type Alignment
12550@appendix Type Alignment
12551@cindex type alignment
12552@cindex alignment of type
12553@findex _Alignof
12554@findex __alignof__
12555
12556Code for device drivers and other communication with low-level
12557hardware sometimes needs to be concerned with the alignment of
12558data objects in memory.
12559
12560Each data type has a required @dfn{alignment}, always a power of 2,
12561that says at which memory addresses an object of that type can validly
12562start. A valid address for the type must be a multiple of its
12563alignment. If a type's alignment is 1, that means it can validly
12564start at any address. If a type's alignment is 2, that means it can
12565only start at an even address. If a type's alignment is 4, that means
12566it can only start at an address that is a multiple of 4.
12567
12568The alignment of a type (except @code{char}) can vary depending on the
12569kind of computer in use. To refer to the alignment of a type in a C
12570program, use @code{_Alignof}, whose syntax parallels that of
12571@code{sizeof}. Like @code{sizeof}, @code{_Alignof} is a compile-time
12572operation, and it doesn't compute the value of the expression used
12573as its argument.
12574
12575Nominally, each integer and floating-point type has an alignment equal to
12576the largest power of 2 that divides its size. Thus, @code{int} with
12577size 4 has a nominal alignment of 4, and @code{long long int} with
12578size 8 has a nominal alignment of 8.
12579
12580However, each kind of computer generally has a maximum alignment, and
12581no type needs more alignment than that. If the computer's maximum
12582alignment is 4 (which is common), then no type's alignment is more
12583than 4.
12584
12585The size of any type is always a multiple of its alignment; that way,
12586in an array whose elements have that type, all the elements are
12587properly aligned if the first one is.
12588
12589These rules apply to all real computers today, but some embedded
12590controllers have odd exceptions. We don't have references to cite for
12591them.
12592@c We can't cite a nonfree manual as documentation.
12593
12594Ordinary C code guarantees that every object of a given type is in
12595fact aligned as that type requires.
12596
12597If the operand of @code{_Alignof} is a structure field, the value
12598is the alignment it requires. It may have a greater alignment by
12599coincidence, due to the other fields, but @code{_Alignof} is not
12600concerned about that. @xref{Structures}.
12601
12602Older versions of GNU C used the keyword @code{__alignof__} for this,
12603but now that the feature has been standardized, it is better
12604to use the standard keyword @code{_Alignof}.
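
The rules above can be checked with a short sketch like the following.
The names @code{struct pair}, @code{align_of_int} and
@code{is_power_of_two} are hypothetical; the actual alignment values
depend on the machine, so the code tests only the properties this
appendix describes.

```c
#include <stddef.h>

/* Hypothetical structure mixing a small and a large field.  */
struct pair
{
  char c;
  double d;
};

/* char always has alignment 1.  */
_Static_assert (_Alignof (char) == 1, "char has alignment 1");

/* The size of a type is a multiple of its alignment.  */
_Static_assert (sizeof (double) % _Alignof (double) == 0,
                "size of double is a multiple of its alignment");

/* A structure is aligned at least as strictly as its fields.  */
_Static_assert (_Alignof (struct pair) >= _Alignof (double),
                "struct pair is aligned at least like double");

/* Return the alignment of int on this machine.  */
size_t
align_of_int (void)
{
  return _Alignof (int);
}

/* Return nonzero if N is a power of 2.  */
int
is_power_of_two (size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}
```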
12605
12606@findex _Alignas
12607@findex __aligned__
12608You can explicitly specify an alignment requirement for a particular
12609variable or structure field by adding @code{_Alignas
12610(@var{alignment})} to the declaration, where @var{alignment} is a
12611power of 2 or a type name. For instance:
12612
12613@example
12614char _Alignas (8) x;
12615@end example
12616
12617@noindent
12618or
12619
12620@example
12621char _Alignas (double) x;
12622@end example
12623
12624@noindent
12625specifies that @code{x} must start on an address that is a multiple of
126268. However, if @var{alignment} exceeds the maximum alignment for the
12627machine, that maximum is how much alignment @code{x} will get.
12628
The older GNU C syntax for this feature was to add
@code{__attribute__ ((__aligned__ (@var{alignment})))} to the
declaration, after the variable name.  For instance:

@example
char x __attribute__ ((__aligned__ (8)));
12635@end example
12636
12637@xref{Attributes}.
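
As a sketch of how one might confirm the effect of @code{_Alignas},
the following code converts the address of an aligned object to an
integer and tests its remainder.  The names @code{aligned_buf} and
@code{starts_on_multiple_of_8} are hypothetical, and the test assumes
that converting a pointer to @code{uintptr_t} yields the numeric
address, which holds on common platforms.

```c
#include <stdint.h>

/* Hypothetical buffer that must start on a multiple of 8.  */
static char _Alignas (8) aligned_buf[16];

/* Return nonzero if aligned_buf starts at a multiple of 8.  */
int
starts_on_multiple_of_8 (void)
{
  return (uintptr_t) (void *) aligned_buf % 8 == 0;
}
```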
12638
12639@node Aliasing
12640@appendix Aliasing
12641@cindex aliasing (of storage)
12642@cindex pointer type conversion
12643@cindex type conversion, pointer
12644
12645We have already presented examples of casting a @code{void *} pointer
12646to another pointer type, and casting another pointer type to
12647@code{void *}.
12648
12649One common kind of pointer cast is guaranteed safe: casting the value
12650returned by @code{malloc} and related functions (@pxref{Dynamic Memory
12651Allocation}). It is safe because these functions do not save the
12652pointer anywhere else; the only way the program will access the newly
12653allocated memory is via the pointer just returned.
12654
12655In fact, C allows casting any pointer type to any other pointer type.
12656Using this to access the same place in memory using two
12657different data types is called @dfn{aliasing}.
12658
12659Aliasing is necessary in some programs that do sophisticated memory
12660management, such as GNU Emacs, but most C programs don't need to do
12661aliasing. When it isn't needed, @strong{stay away from it!} To do
12662aliasing correctly requires following the rules stated below.
12663Otherwise, the aliasing may result in malfunctions when the program
12664runs.
12665
12666The rest of this appendix explains the pitfalls and rules of aliasing.
12667
12668@menu
12669* Aliasing Alignment:: Memory alignment considerations for
12670 casting between pointer types.
12671* Aliasing Length:: Type size considerations for
12672 casting between pointer types.
* Aliasing Type Rules:: Even when type alignment and size match,
12674 aliasing can still have surprising results.
12675
12676@end menu
12677
12678@node Aliasing Alignment
12679@appendixsection Aliasing and Alignment
12680
12681In order for a type-converted pointer to be valid, it must have the
12682alignment that the new pointer type requires. For instance, on most
12683computers, @code{int} has alignment 4; the address of an @code{int}
12684must be a multiple of 4. However, @code{char} has alignment 1, so the
12685address of a @code{char} is usually not a multiple of 4. Taking the
12686address of such a @code{char} and casting it to @code{int *} probably
12687results in an invalid pointer. Trying to dereference it may cause a
12688@code{SIGBUS} signal, depending on the platform in use (@pxref{Signals}).
12689
12690@example
int
foo (void)
12692@{
12693 char i[4];
12694 int *p = (int *) &i[1]; /* @r{Misaligned pointer!} */
12695 return *p; /* @r{Crash!} */
12696@}
12697@end example
12698
12699This requirement is never a problem when casting the return value
12700of @code{malloc} because that function always returns a pointer
12701with as much alignment as any type can require.
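
One common way to avoid the misaligned pointer altogether, shown here
as a minimal sketch, is to copy the bytes into a properly aligned
object with @code{memcpy} instead of casting the address.  The
function name @code{read_int_at} is hypothetical.

```c
#include <string.h>

/* Read an int stored at BYTES, which need not be aligned for int.
   The bytes are copied into a real int object, so no int * ever
   points at a misaligned address.  */
int
read_int_at (const char *bytes)
{
  int value;
  memcpy (&value, bytes, sizeof value);
  return value;
}
```

The caller must still make sure that at least @code{sizeof (int)}
bytes are available starting at @code{bytes}.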
12702
12703@node Aliasing Length
12704@appendixsection Aliasing and Length
12705
12706When converting a pointer to a different pointer type, make sure the
12707object it really points to is at least as long as the target of the
12708converted pointer. For instance, suppose @code{p} has type @code{int
12709*} and it's cast as follows:
12710
12711@example
12712int *p;
12713
struct foo
  @{
    double d, e, f;
  @};
12718
12719struct foo *q = (struct foo *)p;
12720
12721q->f = 5.14159;
12722@end example
12723
12724@noindent
then @code{q->f} lies past the end of the @code{int} that
@code{p} points to.  If @code{p} was initialized to the start of an
array of type @code{int[6]}, the object is long enough for three
@code{double}s (assuming @code{int} is 4 bytes and @code{double} is
8).  But if @code{p} points to something shorter, @code{q->f} extends
beyond the end of that object, overlaying some other data.  Storing
into @code{q->f} will garble that other data.  Or @code{q->f} could
extend past the end of memory space, so that accessing it causes a
@code{SIGSEGV} signal (@pxref{Signals}).
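
A simple precaution, sketched below under the assumption that the
storage comes from @code{malloc}, is to compare the available length
with the size of the target type before converting the pointer.  The
function name @code{alloc_as_foo} is hypothetical.

```c
#include <stdlib.h>

struct foo
{
  double d, e, f;
};

/* Allocate NBYTES of storage to be used as a struct foo.
   Return NULL if NBYTES is too short to hold one, or if the
   allocation fails.  */
struct foo *
alloc_as_foo (size_t nbytes)
{
  if (nbytes < sizeof (struct foo))
    return NULL;                /* Object would be too short.  */
  return malloc (nbytes);
}
```

With this check, a store into the last field of the result cannot run
past the end of the allocated object.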
12733
12734@node Aliasing Type Rules
12735@appendixsection Type Rules for Aliasing
12736
12737C code that converts a pointer to a different pointer type can use the
12738pointers to access the same memory locations with two different data
12739types. If the same address is accessed with different types in a
12740single control thread, optimization can make the code do surprising
12741things (in effect, make it malfunction).
12742
Here's a concrete example where aliasing can change the code's
12744behavior when it is optimized. We assume that @code{float} is 4 bytes
12745long, like @code{int}, and so is every pointer. Thus, the structures
12746@code{struct a} and @code{struct b} are both 8 bytes.
12747
12748@example
12749#include <stdio.h>
12750struct a @{ int size; char *data; @};
12751struct b @{ float size; char *data; @};
12752
12753void sub (struct a *p, struct b *q)
12754@{
12755 int x;
12756 p->size = 0;
12757 q->size = 1;
12758 x = p->size;
12759 printf("x =%d\n", x);
12760 printf("p->size =%d\n", (int)p->size);
12761 printf("q->size =%d\n", (int)q->size);
12762@}
12763
12764int main(void)
12765@{
12766 struct a foo;
12767 struct a *p = &foo;
12768 struct b *q = (struct b *) &foo;
12769
12770 sub (p, q);
12771@}
12772@end example
12773
12774This code works as intended when compiled without optimization. All
12775the operations are carried out sequentially as written. The code
12776sets @code{x} to @code{p->size}, but what it actually gets is the
12777bits of the floating point number 1, as type @code{int}.
12778
12779However, when optimizing, the compiler is allowed to assume
12780(mistakenly, here) that @code{q} does not point to the same storage as
12781@code{p}, because their data types are not allowed to alias.
12782
12783From this assumption, the compiler can deduce (falsely, here) that the
12784assignment into @code{q->size} has no effect on the value of
12785@code{p->size}, which must therefore still be 0. Thus, @code{x} will
12786be set to 0.
12787
12788GNU C, following the C standard, @emph{defines} this optimization as
12789legitimate. Code that misbehaves when optimized following these rules
12790is, by definition, incorrect C code.
12791
12792The rules for storage aliasing in C are based on the two data types:
12793the type of the object, and the type it is accessed through. The
12794rules permit accessing part of a storage object of type @var{t} using
12795only these types:
12796
12797@itemize @bullet
12798@item
12799@var{t}.
12800
12801@item
12802A type compatible with @var{t}. @xref{Compatible Types}.
12803
12804@item
12805A signed or unsigned version of one of the above.
12806
12807@item
A qualified version of one of the above.
12809@xref{Type Qualifiers}.
12810
12811@item
An array, structure (@pxref{Structures}), or union type
(@pxref{Unions}) that contains one of the above, either directly as a
12814field or through multiple levels of fields. If @var{t} is
12815@code{double}, this would include @code{struct s @{ union @{ double
12816d[2]; int i[4]; @} u; int i; @};} because there's a @code{double}
12817inside it somewhere.
12818
12819@item
12820A character type.
12821@end itemize
12822
12823What do these rules say about the example in this subsection?
12824
For @code{foo.size} (equivalently, @code{p->size}), @var{t} is
@code{int}. The type @code{float} is not allowed as an aliasing type
by those rules, so @code{q->size} is not supposed to alias with
@code{p->size}. Based on that assumption, GNU C makes a
12829permitted optimization that was not, in this case, consistent with
12830what the programmer intended the program to do.
12831
12832Whether GCC actually performs type-based aliasing analysis depends on
12833the details of the code. GCC has other ways to determine (in some cases)
12834whether objects alias, and if it gets a reliable answer that way, it won't
12835fall back on type-based heuristics.
12836
12837@c @opindex -fno-strict-aliasing
12838The importance of knowing the type-based aliasing rules is not so as
12839to ensure that the optimization is done where it would be safe, but so
12840as to ensure it is @emph{not} done in a way that would break the
12841program. You can turn off type-based aliasing analysis by giving GCC
12842the option @option{-fno-strict-aliasing}.
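
When a program really does need to look at the bits of one type as
another type, one approach that stays within the rules, sketched here,
is to copy the bytes with @code{memcpy}, which accesses storage as
characters.  The function name @code{float_bits} is hypothetical, and
the sketch assumes that @code{float} is 4 bytes long.

```c
#include <stdint.h>
#include <string.h>

/* This sketch assumes float and uint32_t have the same size.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
                "float must be 4 bytes");

/* Return the bits of F as an unsigned integer.  The copy is done
   bytewise, so no object is accessed through a forbidden type.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

On machines that use the IEEE 754 single-precision format,
@code{float_bits (1.0f)} returns @code{0x3f800000}.  Because every
access here is permitted by the aliasing rules, the result does not
depend on whether the code is optimized.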
12843
12844@node Digraphs
12845@appendix Digraphs
12846@cindex digraphs
12847
12848C accepts aliases for certain characters. Apparently in the 1990s
12849some computer systems had trouble inputting these characters, or
12850trouble displaying them. These digraphs almost never appear in C
12851programs nowadays, but we mention them for completeness.
12852
12853@table @samp
12854@item <:
12855An alias for @samp{[}.
12856@item :>
12857An alias for @samp{]}.
12858@item <%
12859An alias for @samp{@{}.
12860@item %>
12861An alias for @samp{@}}.
12862@item %:
12863An alias for @samp{#},
12864used for preprocessing directives (@pxref{Directives}) and
12865macros (@pxref{Macros}).
12866@end table
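
For illustration only, here is a small function written entirely with
digraphs in place of the brackets, braces and @samp{#}.  The function
name @code{sum_three} is hypothetical.

```c
%:include <stddef.h>

/* Same as:  int a[3] = { 1, 2, 3 };  and so on.  */
int
sum_three (void)
<%
  int a<:3:> = <% 1, 2, 3 %>;
  int total = 0;
  for (size_t i = 0; i < 3; i++)
    total += a<:i:>;
  return total;
%>
```

The compiler treats each digraph exactly like the character it stands
for, so this compiles to the same code as the conventional spelling.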
12867
12868@node Attributes
12869@appendix Attributes in Declarations
12870@cindex attributes
12871@findex __attribute__
12872
12873You can specify certain additional requirements in a declaration, to
12874get fine-grained control over code generation, and helpful
12875informational messages during compilation. We use a few attributes in
12876code examples throughout this manual, including
12877
12878@table @code
12879@item aligned
12880The @code{aligned} attribute specifies a minimum alignment for a
12881variable or structure field, measured in bytes:
12882
12883@example
12884int foo __attribute__ ((aligned (8))) = 0;
12885@end example
12886
12887@noindent
12888This directs GNU C to allocate @code{foo} at an address that is a
12889multiple of 8 bytes. However, you can't force an alignment bigger
12890than the computer's maximum meaningful alignment.
12891
12892@item packed
12893The @code{packed} attribute specifies to compact the fields of a
12894structure by not leaving gaps between fields. For example,
12895
12896@example
12897struct __attribute__ ((packed)) bar
12898@{
12899 char a;
12900 int b;
12901@};
12902@end example
12903
12904@noindent
12905allocates the integer field @code{b} at byte 1 in the structure,
12906immediately after the character field @code{a}. The packed structure
12907is just 5 bytes long (assuming @code{int} is 4 bytes) and its
12908alignment is 1, that of @code{char}.
12909
12910@item deprecated
12911Applicable to both variables and functions, the @code{deprecated}
12912attribute tells the compiler to issue a warning if the variable or
12913function is ever used in the source file.
12914
12915@example
12916int old_foo __attribute__ ((deprecated));
12917
12918int old_quux () __attribute__ ((deprecated));
12919@end example
12920
12921@item __noinline__
12922The @code{__noinline__} attribute, in a function's declaration or
12923definition, specifies never to inline calls to that function. All
12924calls to that function, in a compilation unit where it has this
12925attribute, will be compiled to invoke the separately compiled
12926function. @xref{Inline Function Definitions}.
12927
12928@item __noclone__
12929The @code{__noclone__} attribute, in a function's declaration or
12930definition, specifies never to clone that function. Thus, there will
12931be only one compiled version of the function. @xref{Label Value
12932Caveats}, for more information about cloning.
12933
12934@item always_inline
12935The @code{always_inline} attribute, in a function's declaration or
12936definition, specifies to inline all calls to that function (unless
12937something about the function makes inlining impossible). This applies
12938to all calls to that function in a compilation unit where it has this
12939attribute. @xref{Inline Function Definitions}.
12940
12941@item gnu_inline
12942The @code{gnu_inline} attribute, in a function's declaration or
definition, specifies to handle the @code{inline} keyword the way GNU
12944C originally implemented it, many years before ISO C said anything
12945about inlining. @xref{Inline Function Definitions}.
12946@end table
12947
12948For full documentation of attributes, see the GCC manual.
12949@xref{Attribute Syntax, Attribute Syntax, System Headers, gcc, Using
12950the GNU Compiler Collection}.
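
The effect of @code{packed} and @code{aligned} can be checked at
compile time, as in the following sketch.  The names
@code{struct plain} and @code{counter} are hypothetical, and the
assertions rely on the layout GNU C uses for packed structures.

```c
#include <stddef.h>

/* The packed structure from the description of 'packed'.  */
struct __attribute__ ((packed)) bar
{
  char a;
  int b;
};

/* Hypothetical unpacked counterpart, for comparison.  */
struct plain
{
  char a;
  int b;
};

/* In the packed structure, b immediately follows a.  */
_Static_assert (offsetof (struct bar, b) == 1,
                "b starts at byte 1");
_Static_assert (sizeof (struct bar) == 1 + sizeof (int),
                "no padding in packed structure");
_Static_assert (_Alignof (struct bar) == 1,
                "packed structure has alignment 1");

/* Hypothetical variable placed at a multiple of 8 bytes.  */
int counter __attribute__ ((aligned (8))) = 0;
```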
12951
12952@node Signals
12953@appendix Signals
12954@cindex signal
12955@cindex handler (for signal)
12956@cindex @code{SIGSEGV}
12957@cindex @code{SIGFPE}
12958@cindex @code{SIGBUS}
12959
12960Some program operations bring about an error condition called a
12961@dfn{signal}. These signals terminate the program, by default.
12962
12963There are various different kinds of signals, each with a name. We
have seen several such error conditions throughout this manual:
12965
12966@table @code
12967@item SIGSEGV
12968This signal is generated when a program tries to read or write outside
12969the memory that is allocated for it, or to write memory that can only
12970be read. The name is an abbreviation for ``segmentation violation''.
12971
12972@item SIGFPE
12973This signal indicates a fatal arithmetic error. The name is an
12974abbreviation for ``floating-point exception'', but covers all types of
12975arithmetic errors, including division by zero and overflow.
12976
12977@item SIGBUS
12978This signal is generated when an invalid pointer is dereferenced,
12979typically the result of dereferencing an uninintalized pointer. It is
12980similar to @code{SIGSEGV}, except that @code{SIGSEGV} indicates
12981invalid access to valid memory, while @code{SIGBUS} indicates an
12982attempt to access an invalid address.
12983@end table
12984
12985These kinds of signal allow the program to specify a function as a
12986@dfn{signal handler}. When a signal has a handler, it doesn't
12987terminate the program; instead it calls the handler.
12988
12989There are many other kinds of signal; here we list only those that
12990come from run-time errors in C operations. The rest have to do with
12991the functioning of the operating system. The GNU C Library Reference
12992Manual gives more explanation about signals (@pxref{Program Signal
12993Handling, The GNU C Library, , libc, The GNU C Library Reference
12994Manual}).
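
Here is a minimal sketch of installing a signal handler with the
standard functions @code{signal} and @code{raise} from
@file{signal.h}.  The names @code{record_signal}, @code{last_signal}
and @code{catch_sigfpe} are hypothetical.  The signal is generated
explicitly with @code{raise} rather than by a real arithmetic error,
so that returning from the handler is safe.

```c
#include <signal.h>

/* Number of the most recent signal handled, or 0 if none.  */
static volatile sig_atomic_t last_signal = 0;

/* Handler: just record which signal arrived.  */
static void
record_signal (int signum)
{
  last_signal = signum;
}

/* Install the handler for SIGFPE, generate that signal, and
   return the signal number the handler recorded.  Return -1 if
   the handler could not be installed.  */
int
catch_sigfpe (void)
{
  if (signal (SIGFPE, record_signal) == SIG_ERR)
    return -1;
  raise (SIGFPE);   /* Calls record_signal instead of terminating.  */
  return last_signal;
}
```

Because the signal has a handler, the program continues after
@code{raise} returns instead of terminating.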
12995
12996@node GNU Free Documentation License
12997@appendix GNU Free Documentation License
12998
12999@include fdl.texi
13000
13001@node Symbol Index
13002@unnumbered Index of Symbols and Keywords
13003
13004@printindex fn
13005
13006@node Concept Index
13007@unnumbered Concept Index
13008
13009@printindex cp
13010
13011@bye
13012