- a source file, together with all the headers and source files included via `#include`, is called a preprocessing translation unit. After preprocessing, it is called a translation unit
- a source file that is not empty shall end in a newline character, which shall not be immediately preceded by a backslash character
- in a freestanding environment, a C program may execute without any benefit of an operating system (only the minimal set of library facilities required by clause 4 is available), and the name and type of the function called at startup are implementation-defined
- in a hosted environment
  - the startup function is `main`, with return type `int` and either no parameters or two parameters (conventionally named `argc` and `argv`)

        int main(void) {}
        int main(int argc, char *argv[]) {}
        // argc is non-negative; if argc > 0, argv[0] represents the program name
        // argc, argv, and the strings argv points to are all modifiable

  - a return from the initial call to `main` is equivalent to calling `exit` with the returned value if the return type of `main` is compatible with `int`; reaching the closing `}` is equivalent to `return 0`
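- integer and double-precision promotion: conceptually, each operand is promoted to `int`/`double`, the operation is performed, and the result is truncated afterward; the actual execution need only produce the same result, possibly omitting the promotion

      char c1, c2;
      c1 = c1 + c2;   // operands promoted to int, sum truncated back to char
      // ----
      float f1, f2;
      f1 = f2 + 0.2;  // f2 converted to double, sum converted back to float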
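The promote-then-truncate rule above can be observed with values that do not fit in a `char`-sized type. The sketch below uses two hypothetical helpers (`add_promoted` and `add_truncated` are illustration names, not from the notes) and assumes `int` is wider than `unsigned char`, which holds on common platforms.

```c
/* Hypothetical helper: both operands are promoted to int before the
   addition, so the sum does not wrap at the unsigned char limit. */
int add_promoted(unsigned char a, unsigned char b)
{
    return a + b;               /* int + int */
}

/* Same addition, but the int result is converted back to unsigned char
   on assignment, which reduces it modulo UCHAR_MAX + 1. */
unsigned char add_truncated(unsigned char a, unsigned char b)
{
    unsigned char r = a + b;    /* truncation happens here */
    return r;
}
```

With 8-bit bytes, `add_promoted(200, 100)` yields 300, while `add_truncated(200, 100)` yields the same value as `(unsigned char)300`.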
- rearrangement of floating-point expressions is often restricted because of limited precision and range

      double x, y, z;
      x = (x * y) * z;  // not equivalent to x *= y * z
      z = (x - y) + y;  // not equivalent to z = x
      z = x + x * y;    // not equivalent to z = x * (1.0 + y)
      y = x / 5.0;      // not equivalent to y = x * 0.2
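A concrete case shows why regrouping is not allowed. The sketch below assumes IEEE 754 `double` arithmetic and no value-changing optimization flags such as `-ffast-math`; `sum_left` and `sum_right` are hypothetical helper names.

```c
/* Hypothetical helpers that group the same three operands differently. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With the arguments `1e30, -1e30, 1.0`, `sum_left` returns 1.0 because the large terms cancel first, while `sum_right` returns 0.0 because `-1e30 + 1.0` rounds back to `-1e30`.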
- on a machine where overflow produces an explicit trap, an expression cannot be regrouped even where the arithmetic is mathematically associative

      int a, b;
      a = a + 32760 + b + 5;  // cannot be rewritten as a = a + b + 32760 + 5
- the order in which operands are evaluated is unspecified

      int sum;
      int *p;
      sum = sum + 10 + (*p++ = getchar());
      // the increment of p can occur at any point between the previous and the next sequence point,
      // and the call to getchar can occur at any point before its result is needed
- trigraph sequences are replaced before any other processing (compilers usually need them enabled, e.g. via `-trigraphs`)

      ??=define arraycheck(a, b) a??(b??) ??!??! b??(a??)
      // becomes
      #define arraycheck(a, b) a[b] || b[a]
- the digraphs `<: :> <% %> %: %:%:` behave, respectively, the same as `[ ] { } # ##`, except for their spelling (visible when "stringized")
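A small sketch of both halves of the digraph rule: the digraph tokens work as brackets and braces, yet keep their own spelling under the `#` operator. The names `third_of_three`, `spelling_differs`, and `STR` are hypothetical and only serve the illustration.

```c
#include <string.h>

#define STR(x) #x   /* stringizes its argument as spelled */

/* Digraph spelling of:
   int third_of_three(void) { int a[3] = {1, 2, 3}; return a[2]; } */
int third_of_three(void)
<%
    int a<:3:> = <%1, 2, 3%>;
    return a<:2:>;
%>

/* Returns non-zero because STR(<:) produces "<:", not "[". */
int spelling_differs(void)
{
    return strcmp(STR(<:), "[") != 0;
}
```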
- identifiers can belong to one of four scopes (function, file, block, and function prototype)
- a label name is the only kind of identifier that has function scope
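- the scope of an enumeration constant begins just after its defining enumerator (and that of a struct, union, or enum tag just after the tag appears); the scope of any other identifier begins just after the completion of its declarator

      enum state {
        PENDING = 0,
        FAILED = PENDING,  // PENDING can be used right after its enumerator
      };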
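The two scope rules can be contrasted in a few lines. This is a minimal sketch: the extra enumerator `DONE` and the helper `size_of_self` are hypothetical additions for illustration.

```c
enum state {
    PENDING = 0,
    FAILED  = PENDING + 1,   /* PENDING is already in scope here */
    DONE    = FAILED + 1
};

/* An ordinary identifier is in scope once its declarator is complete,
   so it can already be named inside its own initializer. */
int size_of_self(void)
{
    int x = (int)sizeof x;   /* names the x being declared; sizeof does not evaluate it */
    return x;
}
```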
- if an identifier is declared with `extern` in a scope where a prior declaration of that identifier is visible, it keeps the linkage of the prior declaration

      static int i = 10;
      extern int i;  // linkage of i is internal
      // ----
      int i = 10;
      extern int i;  // linkage of i is external
- the lifetime of an object with automatic storage duration (not a VLA) extends from entry into its enclosing block (including a `goto` into the middle of the block) until execution of that block ends in any way (including a `goto` out of it). For a VLA, the lifetime begins only when execution passes through the declaration

      int main() {
        goto test;
        {
          int x = 1;
          test:
          printf("%d", x);  // x exists, but its initializer was skipped, so its value is indeterminate
        }
      }
- when a non-lvalue expression has struct or union type and contains a member of array type, that expression refers to an object with temporary lifetime

      struct X { char a[8]; };
      struct Y { char a; };
      struct X getX(void) { struct X result = { "world" }; return result; }
      struct Y getY(void) { struct Y result = { 1 }; return result; }
      int main() {
        char *i = getX().a;   // valid: the array member belongs to a temporary object (which ends with the full expression)
        char *j = &getY().a;  // invalid: getY().a is not an lvalue
      }
- scalar types = arithmetic types + pointer types
- arithmetic types = integer types + floating types
- integer types = char types + signed/unsigned integer types + enumerated types
- floating types = real floating types + complex types
- a trap representation does not represent a value of the object type. The value of a struct or union object is never a trap representation, even if some of its members hold one. For example, a C implementation may treat values other than {0, 1} as trap representations for `_Bool`
- most recent C implementations support only two's complement
- the alignment of a struct object is at least the strictest (largest) alignment among its members, and every valid alignment value is a non-negative integral power of two
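The alignment rule can be checked with C11's `_Alignof`. The sketch below is illustrative only: `struct mixed` and the three helper functions are hypothetical names, and a C11 (or later) compiler is assumed.

```c
#include <stddef.h>

struct mixed { char c; double d; short s; };

/* Returns 1 if n is a non-zero power of two. */
int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Alignment requirement of the whole struct. */
size_t struct_alignment(void)
{
    return _Alignof(struct mixed);
}

/* Largest alignment requirement among the member types. */
size_t strictest_member(void)
{
    size_t a = _Alignof(char);
    if (_Alignof(double) > a) a = _Alignof(double);
    if (_Alignof(short) > a)  a = _Alignof(short);
    return a;
}
```

Comparing `struct_alignment()` with `strictest_member()` should show the struct alignment to be at least as large, and `is_power_of_two(struct_alignment())` should hold.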
- bitwise operators yield values that depend on the internal representation of integers, and they have implementation-defined and undefined aspects for signed types
- the storage duration of a compound literal depends on where it occurs: static outside of a function, automatic (tied to the enclosing block) inside one

      "/tmp/fileXXXXXX"                   // string literal: always static and non-modifiable
      (char []){"/tmp/fileXXXXXX"}        // storage duration as described above, modifiable
      (const char []){"/tmp/fileXXXXXX"}  // as above, but non-modifiable; may be placed in read-only memory and even shared
      (const char []){"abc"} == "abc"     // might yield 1
- each compound literal creates only a single object in a given scope (in an iteration statement, it is initialized anew each time, since each iteration has its own scope)

      struct s { int i; };
      int main() {
        struct s q;
        int j = 0;
        again:
        q = (struct s){j++};
        if (j < 2) goto again;
        return q.i == 1;  // true
      }
- the unary operators `+`, `-`, and `~` perform the integer promotions on their operand, and the result has the promoted type

      char x;
      printf("%zu %zu", sizeof(x), sizeof(+x));  // typically 1 4
- the operand of `sizeof`/`alignof` is an expression that is not evaluated (except a VLA operand of `sizeof`)

      size_t n = sizeof(printf("%d", 4));  // prints nothing; n == sizeof(int), typically 4
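The difference between the ordinary case and the VLA case can be made visible with a side effect in the operand. This is a minimal sketch, assuming the compiler supports VLAs (optional since C11); both function names are hypothetical.

```c
#include <stddef.h>

/* Operand is not a VLA: i++ is never evaluated. */
int after_sizeof_plain(void)
{
    int i = 1;
    size_t n = sizeof(i++);
    (void)n;
    return i;                   /* i is unchanged */
}

/* Operand is a VLA type: its size expression is evaluated at run time. */
int after_sizeof_vla(void)
{
    int i = 1;
    size_t n = sizeof(int[i++]);
    (void)n;
    return i;                   /* i has been incremented */
}
```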
- in an assignment expression, the evaluations of the left and right operands are unsequenced relative to each other, but the update of the stored value of the left operand is sequenced after both
- the `&` and `*` operators cancel each other when applied together to the same operand (no dereference takes place)

      int a[10];
      int *n = NULL;
      int *p = &*n;    // p == NULL, valid: nothing is dereferenced
      int *q = &a[2];  // q == a + 2
- each enumerated type is compatible with an implementation-defined type (`char`, or a signed or unsigned integer type) that is capable of representing the values of all its members (an implementation may delay the choice of integer type until the enumeration is complete)
- qualifiers between `*` and the identifier qualify the pointer itself, not the pointed-to type

      int n = 10;
      const int *p = &n;     // pointer to const int
      *p = 10;               // invalid
      p = NULL;              // valid
      // ----
      int * const p = &n;    // const pointer to non-const int
      *p = 10;               // valid
      p = NULL;              // invalid
      // ----
      void f(int p[const]);  // parameter declarations only: same as int * const p
- an identifier with a variably modified type may only be declared at block scope or function prototype scope, with no linkage; an object with static storage duration cannot be a VLA, although it can be a pointer to a VLA

      int n = 10;
      int foo[n];            // invalid: file scope
      void func() {
        int baz[n];          // valid
        static int x[n];     // invalid: static storage duration
        extern int y[n];     // invalid: has linkage
        static int (*z)[n];  // valid: pointer to VLA
      }
- an lvalue designates an object; a struct or union with any const-qualified member (including, recursively, members of contained aggregates) is not a modifiable lvalue, even when the assignment would store the same values

      struct { int a; const int b; } s1 = { .b = 1 }, s2 = { .b = 2 };
      s1 = s2;  // invalid: s1 has a const-qualified member
- if an object (array, struct, or union) is declared with any combination of type qualifiers (`const`, `volatile`, and `restrict`), its member types acquire those qualifiers
- a struct or union does not create a new scope, so nested types and enumerations introduced inside it are visible in the surrounding scope
- a bit-field of width zero must be unnamed; it specifies that the next bit-field begins at the next allocation unit boundary

      struct foo { int x: 1; int y: 2; int z: 3; };  // sizeof(struct foo) == 4 (typical, with 4-byte int)
      struct baz { int x: 1; int : 0; int z: 3; };   // sizeof(struct baz) == 8 (typical, with 4-byte int)
- `register`, `restrict`, and `inline` (functions only) are hints; a compiler is free to ignore them
- a function declaration can appear at block scope as well as file scope (unlike a function definition; compilers often support block-scope function definitions, but that is non-standard)
- for `static inline`, if the function is never referenced, the compiler can omit it from the resulting object file (otherwise each translation unit gets its own copy of the function, which wastes space). For `extern inline`, the compiler has to emit an external definition in the resulting object file (exactly one in the program). `inline` alone can be confusing: it provides only an inline definition, so a call that the compiler chooses not to inline needs an external definition from another translation unit
- an empty list in a function declaration that is not a definition means the function takes an unspecified number of parameters, while an empty list in a function definition means the function has no parameters

      int foo();
      int main() {
        foo(1);  // compiles without a required diagnostic, but the behavior is undefined
      }
      int foo() { return 1; }
- a `typedef` name can be interpreted differently depending on context

      typedef signed int t;
      typedef int plain;
      struct tag {
        unsigned t:4;  // member named t, type unsigned int, 4-bit field
        const t:5;     // unnamed 5-bit field of type const signed int
        plain r:5;     // member named r, type int (signedness of a plain int bit-field is implementation-defined), 5-bit field
      };
      // ----
      t f(t (t));  // f is a function returning signed int, with one unnamed parameter of type
                   // pointer to function returning signed int with one unnamed parameter of type signed int
      long t;      // t is an identifier of type long
- when a `typedef` denotes a variable length array type, the length of the array is fixed at the time the `typedef` is reached

      void copy(int n) {
        typedef int B[n];  // B is n ints, n evaluated now
        n += 1;
        B a;               // a is n ints, n without += 1
        int b[n];          // a and b are different sizes
        for (int i = 1; i < n; i++)
          a[i-1] = b[i];
      }
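The effect of evaluating the length at the `typedef` can be measured with `sizeof`. The sketch below splits the `copy` example into two hypothetical helpers (`typedef_length` and `direct_length`) and assumes VLA support.

```c
#include <stddef.h>

/* Declares an array through a VLA typedef after n has changed. */
size_t typedef_length(int n)
{
    typedef int B[n];   /* n is evaluated here */
    n += 1;
    B a;                /* still uses the length fixed by the typedef */
    return sizeof a / sizeof a[0];
}

/* Declares the array directly, so the updated n is used. */
size_t direct_length(int n)
{
    n += 1;
    int b[n];
    return sizeof b / sizeof b[0];
}
```

Called with 4, `typedef_length` reports 4 elements while `direct_length` reports 5.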
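- in an initializer, providing more elements than the array can hold violates a constraint; the compiler must issue a diagnostic, and many compilers only warn and discard the excess

      int p[2] = {1, 2, 3, 4};  // the last two elements, 3 and 4, are diagnosed and typically discarded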
- if the initializer of a subobject doesn't begin with a left brace, the subobject takes only as many initializers from the list as it needs and leaves the remaining ones for the next element

      struct foo { int x, y, z; };
      int main() {
        struct foo baz[] = {1, 2, 3, 4};          // baz[0] = { .x = 1, .y = 2, .z = 3 }, baz[1] = { .x = 4 }
        struct foo bar[] = {1, 2, {3, 4, 5}, 6};  // bar[0] = { .x = 1, .y = 2, .z = 3 } (the excess 4 and 5 are diagnosed), bar[1] = { .x = 6 }
        struct foo qux[] = {{1}, 2, 3};           // qux[0] = { .x = 1 }, qux[1] = { .x = 2, .y = 3 }
      }
      // with designators: w has two elements; w[0].a[0] == 1, w[1].a[0] == 2, everything else is zero
      struct fred { int a[3], b; } w[] = {[0].a = {1}, [1].a[0] = 2};
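The `struct fred` initializer can be inspected at run time to confirm which subobjects receive a value. In this sketch the accessors `w_count` and `w_at` are hypothetical helpers added only for inspection.

```c
#include <stddef.h>

struct fred { int a[3], b; };

/* The largest designated index is 1, so w has two elements. */
static struct fred w[] = { [0].a = {1}, [1].a[0] = 2 };

size_t w_count(void)
{
    return sizeof w / sizeof w[0];
}

const struct fred *w_at(size_t i)
{
    return &w[i];
}
```

Every member not named by a designator is zero-initialized, so `w_at(0)->b` and `w_at(1)->a[1]` are both 0.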
- a designator can move the current array index backward or forward in an initializer list; the following initializers continue from that index

      int a[MAX] = {
        1, 3, 5, 7, 9, [MAX-5] = 8, 6, 4, 2, 0
        // if MAX > 10, there are zero-valued elements in the middle
        // if MAX < 10, some of the values provided by the first 5 initializers are overridden
      };
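The overriding case is easiest to follow with a concrete size. The sketch below picks `MAX` as 7, an arbitrary assumption made for illustration, so that `[MAX-5]` jumps back to index 2; `element` is a hypothetical accessor.

```c
#define MAX 7   /* hypothetical value, chosen below 10 to force an overlap */

/* Indices 0..4 first receive 1, 3, 5, 7, 9; then [2] restarts the index,
   so 8, 6, 4, 2, 0 land on indices 2..6 and override 5, 7, 9. */
static const int a[MAX] = { 1, 3, 5, 7, 9, [MAX - 5] = 8, 6, 4, 2, 0 };

int element(int i)
{
    return a[i];
}
```

The resulting array is `{1, 3, 8, 6, 4, 2, 0}`. Some compilers can warn about the overridden initializers when extra warnings are enabled.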
- declarations are not statements
- in an `if`/`else` statement, if the first substatement is reached via a label, the second substatement is not executed
- a jump into a loop body means the controlling expression of a `for` or `while` is not evaluated on the way in, nor is clause-1 of a `for` statement

      goto bump;
      // ----
      for (int i = 0; i < 10; ++i) {  // i is declared, but `i = 0` and `i < 10` are not executed after jumping
        bump:
        printf("%d", i);
      }
      // ----
      while (i < 0) {  // `i < 0` is not executed after jumping
        bump:
        printf("%d", i);
      }
- a `goto` statement is not allowed to jump past any declaration of a VLA (a jump within its scope is permitted)

      goto lab3;         // invalid: going INTO scope of VLA.
      {
        double a[n];
        a[j] = 4.4;
        lab3:
        a[j] = 3.3;
        goto lab4;       // valid: going WITHIN scope of VLA.
        a[j] = 5.5;
        lab4:
        a[j] = 6.6;
      }
      goto lab4;         // invalid: going INTO scope of VLA.
- "An identifier declared as a typedef name shall not be redeclared as a parameter"

      typedef int foo;
      int baz(foo)
      int foo;  // invalid
      {
        return 1;
      }
- in the same scope, an identifier can be declared more than once (except an identifier with no linkage), while it can be defined only once
- source file inclusion has 3 forms

      #include <h-char-sequence> new-line
      #include "h-char-sequence" new-line  // if this search is not supported or fails, reprocessed as the first form
      #include pp-tokens new-line          // pp-tokens are macro-replaced first

- if a `#` preprocessing token followed by an identifier occurs at a point where a preprocessing directive could begin, the identifier is not subject to macro replacement
- if a nested replacement encounters the name of the macro being replaced, that name is not replaced (in some nested cases, whether replacement happens is unspecified)
- line control has 2 forms

      #line digit-sequence "s-char-sequence"(opt) new-line
      #line pp-tokens new-line  // pp-tokens are macro-replaced first