Study Notes of CS:APP
Part I Program Structure and Execution
How application programs are represented and executed.
Chapter 2 Representing and Manipulating Information
[22-01]
Two-valued Signals versus Decimal Notation
• Computers store and process information represented as two-valued signals.
• Lowly binary digits, or bits
• Decimal, or base-10, representation: natural for humans.
• Two-valued signals can readily be represented, stored, and transmitted
Encodings
• When we group bits together and apply some interpretation that gives meaning to the different possible bit patterns, we can represent the elements of any finite set.
• 3 most important representations of numbers:
• Unsigned encodings
• Based on traditional binary notation
• Represent numbers ≥ 0
• Two's-complement encodings
• Represent signed integers
• Either positive or negative
• Floating-point encodings
• Base-2 version of scientific notation for representing real numbers
• Arithmetic operations implemented
• Properties of arithmetic operations
• Properties of integer computer (vs. true integer) arithmetic
• Difference: Use a limited number of bits, hence some operations can overflow when the results are too large to be represented.
• Similarity: Multiplication is associative and commutative
• Properties of floating-point (vs. integer) arithmetic
• Different:
• Product of a set of positive numbers will always be positive
• Not associative due to the finite precision
• Stem from the difference in how they handle the finiteness
• Integer representations encode a comparatively small range of values precisely
• Floating-point representations encode a wide range of values approximately
• A number of computer security vulnerabilities have arisen due to some of the subtleties of computer arithmetic.
2.1 Information Storage [22-01]
Virtual Memory
• Computers use bytes (blocks of 8 bits) as the smallest addressable unit of memory.
• A machine-level program views memory as virtual memory (a very large array of bytes).
• Every byte of memory is identified by its address (a unique number).
• Virtual address space: set of all possible addresses
Storing Program Objects
• Program objects: program data, instructions, and control information
• Value of a pointer in C is the virtual address of the first byte of some block of storage.
• C compiler also associate type information with each pointer.
• Actual machine-level program simply treats each program object as a block of bytes and the program itself as a sequence of bytes.
2.1.1 Hexadecimal Notation
A Byte in Different Notations
• Byte = 8 bits
• Binary
• 000000002 to 111111112
• Decimal
• 010 to 25510
• Hexadecimal (or "hex", base-16):
• Convenient for describing bit patterns
• Use '0' through '9' along with 'A' through 'F'
• 0016 to FF16
•
• In C, starts with 0x or 0X
• Example: write FA1D37B16 as 0xFA1D37B, as 0xfa1d37b, or even 0xFa1D37b
Converting Between Decimal, Binary, and Hexadecimal
• Converting between binary and hexadecimal
• Straightforward, by referring to chart
• Simple trick: Memorize digital equivalents of A, C, F
• Convert hexadecimal to binary
• Example:
• Hexadecimal: 0x173A4C
•
• Convert binary to hexadecimal
• Split binary into 4-bit groups and translate each
• Note: Make leftmost group with ≤4 bits
• Example:
• Binary: 1111001010110110110011
•
• When x = 2n, binary of x: 10…02 (n 0s).
• For n = i + 4j, where 0 ≤ i ≤ 3, hex of x: 0x2i0…0 (j 0s).
• Example: x = 2,048 = 211
• n = 11= 3 + 4 · 2
• Hexadecimal: 0x800
• Converting between decimal and hexadecimal: multiplication or division
• Convert decimal to hexadecimal
• Convert decimal x to hexadecimal:
• Let x = q · 16 + r
• Use r as the least significant digit and generate the remaining digits by repeating the process on q
• Example
• Decimal: 314,156
• Hexadecimal: 0x4CB2C
• Convert hexadecimal to decimal
• Multiply each of the hex digits by the appropriate power of 16
• Example
• Hexadecimal: 0x7AF
• Decimal: 7 · 162 + 10 · 16 + 15 = 7 · 256 + 10 · 16 + 15 = 1,967
2.1.2 Data Sizes
The Word Size
• Every computer has a word size (nominal size of pointer data)
• For w-bit machine, virtual addresses range from 0 to 2w – 1
• There's been a widespread shift from 32-bit machines to 64-bit ones
• 32-bit: virtual address space of 4 GB
• 64-bit: virtual address space of 16 EB
• 1 exabyte = 260
• Distinction between 32-bit programs and 64-bit programs: compilation, rather than machine type
Data Formats
• Computers and compilers support multiple data formats using different ways to encode data.
• C supports multiple data formats for both integer and floating-point data.
• Integer data:
• Signed: negative, zero, and positive values
• Unsigned: nonnegative values
• char: single byte
• Stores character / integer values
• short, int, and long: provide a range of sizes.
• Pointer (e.g., char *): full word size
• Two different floating-point formats:
• float: single precision, 4 bytes
• double: double precision, 8 bytes
• Fixed-size integer types: int32_t and int64_t
• Used to have close control over data representations
• Signed and unsigned values
• Mostly signed
• Unsigned prefixed by unsigned or uint-.
• char: mostly signed but not guaranteed
• signed char: 1-byte signed value
• In many contexts, insensitive
• C allows a variety of ways:
• To order the keywords
• To include or omit optional keywords
• One aspect of portability: to make the program insensitive to the exact sizes of the different data types.
2.1.3 Addressing and Byte Ordering
• For multi-byte objects, we must establish two conventions:
• Addressing
• Byte ordering
Addressing
• Multi-byte object is stored as a contiguous sequence of bytes
• Address: smallest address of the bytes used
• Example: int x at 0x100
• (Assuming 32-bit) 4 bytes would be stored in 0x100, 0x101, 0x102, and 0x103
Byte Ordering
• Two common conventions.
• Consider w-bit integer [xw −1, xw −2, . . . , x1, x0], where xw−1 is the most significant bit, x0 the least.
• Assuming w is multiple of 8
• Most significant byte: [xw−1, xw−2, …, xw−8]
• Least significant byte: [x7, x6, …, x0]
• Other bytes: bits from the middle
• Little endian: least significant byte comes first
• Big endian: most significant byte comes first
• Example: int x at 0x100 has value of 0x01234567
• Ordering depends on the type of machine:
• Note: high-order byte is 0x01, while low-order byte is 0x67.
• Machines:
• Little-endian: most Intel-compatible machines, Android, iOS (OSs)
• Big-endian: most machines from IBM and Oracle
• Bi-endian: ARM microprocessors (hardware)
• Cases when byte-ordering becomes an issue:
1. When binary data are communicated over a network between different machines.
2. When looking at the byte sequences representing integer data.
• Occurs often when inspecting machine-level programs
• Example: line of a disassembly an Intel x86-64 processor
4004d3: 01 05 43 0b 20 00 add %eax,0x200b43(%rip)
• Generated by a disassembler
• Disassembler: tool that determines the instruction sequence represented by an executable program file
• Bytes are in reverse order for disassembly generated for little-endian machines
3. When programs are written that circumvent the normal type system.
• In C, using a cast or a union
• C code that uses casting to print the byte representations of program objects.
• size_t: preferred data type for expressing the sizes of data structures
• sizeof(T) returns # bytes required to store an object of type T.
• Writing portable code
• Running code on different machines
• Different machine/OS configurations use different conventions for storage allocation
2.1.4 Representing Strings
• String in C
• Encoded by an array of characters
• Each character is most commonly represented by the ASCII character code.
• ASCII code for decimal digit x happens to be 0x3x.
• Terminated by terminating byte (0x00)
2.1.5 Representing Code
• Binary code is seldom portable across different combinations of machine and OS.
• Fundamental concept: A program, from the perspective of the machine, is simply a sequence of bytes.
2.1.6 Introduction to Boolean Algebra
• Boolean algebra.
• Work of George Boole
• true, false encoded as 1, 0
Boolean Operations
• Simplest Boolean algebra is defined over {0, 1}.
• 4 Boolean operations
Boolean operation |
Corresponding logical operation |
|
~ |
not |
¬ |
& |
and |
∧ |
| |
or |
∨ |
^ |
exclusive-or |
⊕ |
Boolean Operations over Bit Vectors
• Extending the 4 Boolean operations to bit vectors (strings of 0's and 1's of fixed length w).
• Defined according to applications to the matching elements
• Examples
• Application: Represent finite sets
• Encode A ⊆ {0, 1, …, w− 1} with [aw−1, …, a1, a0], ai = 1 if and only if i ∈ A.
• Example: a = [01101001] encodes A = {0, 3, 5, 6}, b = [01010101] encodes B = {0, 2, 4, 6}.
• Corresponding set operations of Boolean operations
Boolean operation |
Corresponding set operation |
| |
Set union |
& |
Set intersection |
~ |
Set complement |
• Continuing the example: a & b yields [01000001], A ∩ B = {0, 6}.
• Practical applications example:
• Signal can interrupt the execution of a program.
• Applications can explicitly block and unblock selected signals by specifying the blocked bit-vector mask
2.1.7 Bit-Level Operations in C
• C supports bitwise Boolean operations.
• Symbols
C symbol |
Corresponding logical operation |
|
~ |
not |
|
& |
and |
|
| |
or |
|
^ |
exclusive-or |
• Can be applied to any "integral" data type
• Examples: expression evaluation for char
• Evaluating bit-level expression:
(1) Expand hexadecimal to binary
(2) Perform the operations
(3) Convert back to hexadecimal
• Use: Implement masking operations
• Mask: bit pattern that indicates a selected set of bits within a word.
• Example:
• x & 0xFF: least significant byte of x and 0's (all others).
• Examples:
• x = 0x89ABCDEF: 0x000000EF
• ~0: all 1's
2.1.8 Logical Operations in C
• Logical operators provided by C
Logical operator |
Corresponding logic operation |
|| |
or |
&& |
and |
! |
not |
• Argument: nonzero is true, 0 is false
• Return either 1 (true) or 0 (false)<span style="font-variant: small-caps">
• </span>Examples of expression evaluation
•
Bitwise Operations vs. their Logical Counterparts
• Matches only when arguments are restricted to 0 or 1.
• '&&' and '||' do not evaluate 2nd argument if result of can be determined by evaluating 1st
• Examples: a && 5/a and p && *p++ will never cause an exception
2.1.9 Shift Operations in C
• Shift operations shift bit patterns to the left and to the right.
• x: [xw−1, xw−2, …, x0]
• Left shift operation: x << k
• [xw−k−1, xw−k−2, …, x0, 0, …, 0]
• x is shifted k bits to the left, dropping off k xw−1s and filling the right end with k 0's.
• 0 ≤ k ≤ w − 1
• Associate from left to right:
• x << j << k ⇔ (x << j) << k
• Right shift operation: x >> k (2 forms)
• Logical
• Left end is filled with k 0's
• [0, …, 0, xw−1, xw−2, …, xk]
• Arithmetic
• Left end is filled with k xw−1s
• [xw−1, …, xw−1, xw−1, xw−2, …, xk]
• Examples (italicized fill):
• In C, no precise definition, either
• In practice
• For signed data, arithmetic
• For unsigned data, logical (must)
• Java definition
• x >> k: arithmetic
• x >>> k: logical
2.2 Integer Representations [earlier]
2.2.1 Integral Data Types [earlier]
Integral data types represent finite ranges of integers.
• Size: char, short, long
• All nonnegative or possibly negative: unsigned or the default
• The only machine-dependent: long (8-byte with 64-bit, 4-byte with 32-bit)
• Asymmetric
• ≤ the typical
• Symmetric except the fixed-size data types
• int could be 2-byte (mostly for 16-bit), long can be 4-byte (typically for 32-bit)
• The fixed-size data types
• Ranges = those of typical numbers
• Asymmetric
2.2.2 Unsigned Encodings [earlier]
Consider an integer data type of w bits.
• A bit vector is written as:
• , to denote the entire vector
• [xw−1, xw−2, ..., x0], to denote the individual bits within the vector
• The unsigned interpretation of : treated as binary.
• xi = 0 or 1 (2i is part of the value).
principle: Definition of unsigned encoding
For vector = [xw−1, xw−2, ..., x0]:
B2Uw() ≐ xi2i
• ≐: the left-hand side is defined to be equal to the right-hand side.
• Examples:
• UMaxw ≐ 2i =2w − 1
• A mapping: B2Uw: {0, 1}w→{0, … , UMaxw}.
principle: Uniqueness of unsigned encoding
Function B2Uw is a bijection.
• Bijection: a function f that goes two ways: it maps a value x to a value y where y = f (x), but it can also operate in reverse, since for every y, there is a unique value x such that f (x) = y, which is given by the inverse function x = f−1(y).
• U2Bw: the inverse of B2Uw
2.2.3 Two's-Complement Encodings [earlier]
Two's-complement form: the most common representation of signed numbers
principle: Definition of two's-complement encoding
For vector = [xw−1, xw−2, ..., x0]:
B2Tw() ≐ −xw−12w−1 +xi2i
The sign bit: xw−1
• "Weight": −2w−1
• 1: negative; 0: nonnegative.
• TMinw ≐ −2w−1, TMaxw ≐ 2i =2w−1 − 1.
• A mapping: B2Tw: {0, 1}w →{TMinw, …, TMaxw}.
principle: Uniqueness of two's-complement encoding
Function B2Tw is a bijection.
• T2Bw: the inverse of B2Tw
Drop w: UMax, TMin, and TMax
Points worth highlighting:
• Asymmetric Range: |TMin| = |TMax| + 1
• UMax = 2TMax + 1
• −1: same representation as UMax
• 0: a string of all 0's in both
Two's-complement in languages:
• Not required by C. <limits.h> defines a set of constants delimiting the ranges of the different integer data types for the particular machine.
• Example: for a two's-complement machine, INT_MAX = TMaxw, INT_MIN = TMinw and UINT_MAX = UMaxw.
• Required by Java.
Example to get a better understanding:
2.2.4 Conversions between Signed and Unsigned [earlier]
Casting example:
• int x, unsigned u
• (unsigned) x converts x to unsigned, (int) u converts u to int.
A general rule of handling conversions between signed and unsigned numbers with the same word size—the numeric values might change, but the bit patterns do not.
U2Bw and T2Bw [earlier]
• U2Bw
• 0 ≤ x ≤ UMaxw, U2Bw(x)
• Unique unsigned
• T2Bw
• TMinw ≤ x ≤ TMaxw, T2Bw(x)
• Unique two's-complement
T2Uw and U2Tw [earlier]
• T2Uw
• TMinw ≤ x ≤ TMaxw, T2Uw(x) ≐ B2Uw(T2Bw(x))
• 0 ≤ T2Uw(x) ≤ UMaxw, same representation
principle: Conversion from two's complement to unsigned
For x such that TMinw ≤ x ≤ TMaxw:
T2Uw(x) =
derivation: Conversion from two's complement to unsigned
B2Uw(T2Bw(x)) = T2Uw(x) = x + xw−12w
In a two's-complement representation of x, bit xw−1 determines whether or not x is negative.
• Examples:
• The behavior of T2U:
• < 0: converted to large positive
• ≥ 0: unchanged
• U2Tw(u)
• U2Tw 0 ≤ x ≤ UMaxw, U2Tw(x) ≐ B2Tw (U2B w(x))
• 0 ≤ U2T w(x) ≤ UMaxw, same representation
principle: Unsigned to two's-complement conversion
For u such that 0 ≤ u ≤ UMaxw:
U2Tw(u) =
derivation: Unsigned to two's-complement conversion
U2Tw(u) = −uw−12w + u
In the unsigned representation of u, bit uw−1 determines whether or not u is greater than TMaxw
= 2w−1 − 1.
• The behavior of U2T:
• ≤ TMaxw, unchanged
• > TMaxw, converted to negative
Summary [earlier]
The effects of converting in both directions:
• 0 ≤ x ≤ TMaxw
• T2Uw(x) = x, U2Tw(x) = x, identical
• Outside
• +/− 2w
• 2 extremes:
• T2Uw (−1) = UMaxw
• T2Uw(TMinw) = TMaxw + 1
2.2.5 Signed versus Unsigned in C [earlier]
Numbers
• Signed: by default
• Example: 12345 or 0x1A2B
• Unsigned: adding 'U' or 'u' as a suffix
• Example: 12345U or 0x1A2Bu
Conversion between:
• Allowed but not specified. Mostly U2Tw and T2Uw.
• Explicit casting and implicit casting (when an expression of one type is assigned to a variable of another)
• printf does not use type information.
• Possibly nonintuitive behavior:
2.2.6 Expanding the Bit Representation of a Number [earlier]
One common operation: to convert between integers having different word sizes while retaining the same numeric value. Converting from a smaller to a larger data type should always be possible.
Converting Unsigned to Larger [earlier]
Zero extension: adding leading zeros to the representation.
principle: Expansion of an unsigned number by zero extension
Define bit vectors = [uw−1, uw−2, …, u0] of width w and ′ = [0, …, 0, uw−1, uw−2, …, u0] of width w, where w′>w. Then B2Uw′() = B2Uw(′).
Converting Two's-complement to Larger [earlier]
Sign extension: adding copies of the most significant bit to the representation.
principle: Expansion of a two's-complement number by sign extension
Define bit vectors = [xw−1, xw−2, …, x0] of width w and ′ = [xw−1, …, xw−1, xw−1, xw−2, …, x0] of width w, where w′>w. Then B2Tw (x) = B2Tw′(x).
• The value is preserved:
derivation: Expansion of a two's-complement number by sign extension
• The relative order of conversion: The program first changes the size and then the type.
2.2.7 Truncating Numbers [earlier]
Truncating = [xw−1, xw−2, …, x0] to k-bit: drop the high-order w − k bits, ′ = [xk−1, xk−2, …, x0]. Truncating a number can alter its value—a form of overflow.
Truncating Unsigned [earlier]
principle: Truncation of an unsigned number
Let be the bit vector [xw−1, xw−2, …, x0], and let ′ be the result of truncating it to k bits: ′ = [xk−1, xk−2, …, x0]. Let x = B2Uw() and x′ = B2Uk(′). Then x = x′ mod 2k.
Truncating Two's-complement [earlier]
principle: Truncation of a two's-complement number
Let be the bit vector [xw−1, xw−2, …, x0], and let ′ be the result of truncating it to k bits: ′ = [xk−1, xk−2, …, x0]. Let x = B2Uw() and x′ = B2Tk(′). Then x = U2Tk(x mod 2k).
derivation: Truncation of a two's-complement number
B2Tw ([xw−1, xw−2, …, x0]) mod 2k = B2Uk ([xk−1, xk−2, …, x0])
Summary [earlier]
The effect of truncation
• For unsigned:
B2Uk([xk−1, xk−2, …, x0]) = B2Uw([xw−1, xw−2, …, x0]) mod 2k
• For two's-complement:
B2Tk([xk−1, xk−2, …, x0]) = U2Tk(B2Uw ([xw−1, xw−2, …, x0]) mod 2k)
2.2.8 Advice on Signed versus Unsigned [earlier]
The implicit conversion can lead to errors or vulnerabilities.
• One way to avoid: to never use unsigned.
• Example: Java.
Unsigned values are useful
• When words = collections of bits. Example:
• Packing a word with flags describing various Boolean conditions
• Addresses (naturally unsigned)
• When implementing mathematical packages for modular / multiprecision arithmetic.
2.3 Integer Arithmetic [earlier]
2.3.1 Unsigned Addition [earlier]
"Word size inflation":
• Some programming languages support arbitrary size arithmetic;
More commonly, programming languages support fixed-size arithmetic.
x y for arguments x and y, where 0 ≤ x, y < 2w: the result of truncating the integer sum x + y to be w bits long, viewed as an unsigned number.
• Characterized as a form of modular arithmetic: discarding any bits with weight > 2w−1
principle: Unsigned addition
For x and y such that 0 ≤ x, y < 2w:
x y =
• Illustration:
Overflow: an arithmetic operation whose integer result cannot fit within the word size limits of the data type.
• Occurs when the two operands sum to 2w or more
• Not signaled as errors
principle: Detecting overflow of unsigned addition
For x and y in the range 0 ≤ x, y ≤ UMaxw, let s ≐ x y. Then the computation of s overflowed if and only if s < x (or equivalently, s < y).
Modular addition forms an abelian group (a mathematical structure).
• Commutative and associative
• The identity element: 0; every element has an additive inverse
• Value x for every value x: x x = 0
principle: Unsigned negation
For any number x such that 0 ≤ x < 2w, its w-bit unsigned negation x is given by the following:
x =
2.3.2 Two's-Complement Addition [earlier]
x y, given integer values x and y where −2w−1 ≤ x, y ≤ 2w−1 − 1: the result of truncating the integer sum x + y to w bits, viewed as a two's-complement number.
principle: Two's-complement addition
For integer values x and y in the range −2w−1 ≤ x, y ≤ 2w−1 − 1:
• Illustration:
• Positive overflow: x + y exceeds TMaxw (case 4)
• Negative overflow: x + y is less than TMinw (case 1)
• Has the same bit-level representation as the unsigned sum
• Examples:
principle: Detecting overflow in two's-complement addition
For x and y in the range TMinw ≤ x, y ≤ TMaxw, let s ≐ x y. Then the computation of s has had positive overflow if and only if x > 0 and y > 0 but s ≤ 0. The computation has had negative overflow if and only if x < 0 and y <0 but s ≥ 0.
• Illustrations:
2.3.3 Two's-Complement Negation [earlier]
x: the additive inverse under
principle: Two's-complement negation
For x in the range TMinw ≤ x ≤ TMaxw, its two's-complement negation x is given by the formula
x =
2.3.4 Unsigned Multiplication [earlier]
x y for integers x and y where 0 ≤ x, y ≤ 2w − 1: the result of truncating the 2w-bit product x · y to w bits, viewed as an unsigned number.
principle: Unsigned multiplication
For x and y such that 0 ≤ x, y ≤ UMaxw:
x y = (x · y) mod 2w
2.3.5 Two's-Complement Multiplication [earlier]
x y for integers x and y where −2w-1 ≤ x, y ≤ 2w-1 − 1: the result of truncating the 2w-bit product x · y to w bits, viewed as a two's-complement number.
principle: Two's-complement multiplication
For x and y such that TMinw ≤ x, y ≤ TMaxw:
x y = U2Tw((x · y) mod 2w)
principle: Bit-level equivalence of unsigned and two's-complement multiplication
Let and be bit vectors of length w. Define integers x and y as the values represented by these bits in two's-complement form: x = B2Tw(x) and y = B2Tw(y). Define nonnegative integers x′ and y′ as the values represented by these bits in unsigned form: x = B2Uw(x) and y = B2Uw(y). Then
T2Bw(x y) = U2Bw(x′ y′)
• Illustrations:
2.3.6 Multiplying by Constants [earlier]
Integer multiply is slow. Optimization: to replace multiplications by constants with combinations of shift and addition operations.
principle: Multiplication by a power of 2
Let x be the unsigned integer represented by bit pattern [xw−1, xw−2, …, x0]. Then for any k ≥ 0, the w + k-bit unsigned representation of x2k is given by [xw−1, xw−2, …, x0, 0, …, 0], where k zeros have been added to the right.
principle: Unsigned multiplication by a power of 2
For C variables x and k with unsigned values x and k, such that 0 ≤ k < w, the C expression x << k yields the value x 2k.
principle: Two's-complement multiplication by a power of 2
For C variables x and k with two's-complement value x and unsigned value k, such that 0 ≤ k < w, the C expression x << k yields the value x 2k.
The task of generating code for the expression x * K, for some constant K.
• K: [(0…0) (1…1) (0…0) . . . (1…1)].
• Compute a run of 1's from bit position n down to bit position m (n ≥ m) using either form:
• Form A: (x<<n) + (x<<(n − 1)) + . . . + (x<<m)
• Form B: (x<<(n + 1)) - (x<<m)
• Compute x * K by adding together the results for each run.
2.3.7 Dividing by Powers of 2 [earlier]
Performed using a right shift:
• Logical: unsigned
• Arithmetic: two's-complement
Integer division always rounds toward 0. Some notation for any real number a:
• ⌊a⌋: the unique integer a′ such that a′ ≤ a < a′ + 1.
• ⌈a⌉: the unique integer a′ such that a′ − 1< a′ ≤ a′.
Dividing by a Power of 2 with Unsigned Arithmetic [earlier]
principle: Unsigned division by a power of 2
For C variables x and k with unsigned values x and k, such that 0 ≤ k < w, the C expression x >> k yields the value ⌊x/2k⌋.
• Examples:
Dividing by a Power of 2 with Two's-complement Arithmetic [earlier]
Using an arithmetic right shift:
principle: Two's-complement division by a power of 2, rounding down
Let C variables x and k have two's-complement value x and unsigned value k, respectively, such that 0 ≤ k < w. The C expression x >> k, when the shift is performed arithmetically, yields the value ⌊x/2k⌋.
• Examples:
Correcting for the improper rounding that occurs when a negative number is shifted right by "biasing" the value before shifting.
principle: Two's-complement division by a power of 2, rounding up
Let C variables x and k have two's-complement value x and unsigned value k, respectively, such that 0 ≤ k < w. The C expression (x + (1 << k) - 1) >> k, when the shift is performed arithmetically, yields the value ⌈x/2k⌉.
• Demonstration:
(x<0 ? x+(1<<k)-1 : x) >> k will compute x/2k.
2.3.8 Final Thoughts on Integer Arithmetic [earlier]
2.4 Floating Point [earlier]
2.4.1 Fractional Binary Numbers [earlier]
2.4.2 IEEE Floating-Point Representation [earlier]
2.4.3 Example Numbers [earlier]
2.4.4 Rounding
2.4.5 Floating-Point Operations [earlier]
2.4.6 Floating Point in C [earlier]
2.5 Summary [earlier]
Computers encode information as bits, generally organized as sequences of bytes. Different encodings are used for representing integers, real numbers, and character strings. Different models of computers use different conventions for encoding numbers and for ordering the bytes within multi-byte data.
The C language is designed to accommodate a wide range of different implementations in terms of word sizes and numeric encodings. Machines with 64-bit word sizes have become increasingly common, replacing the 32-bit machines that dominated the market for around 30 years. Because 64-bit machines can also run programs compiled for 32-bit machines, we have focused on the distinction between 32- and 64-bit programs, rather than machines. The advantage of 64-bit programs is that they can go beyond the 4 GB address limitation of 32-bit programs.
Most machines encode signed numbers using a two's-complement representation and encode floating-point numbers using IEEE Standard 754. Understanding these encodings at the bit level, as well as understanding the mathematical characteristics of the arithmetic operations, is important for writing programs that operate correctly over the full range of numeric values.
When casting between signed and unsigned integers of the same size, most C implementations follow the convention that the underlying bit pattern does not change. On a two's-complement machine, this behavior is characterized by functions T2Uw and U2Tw, for a w-bit value. The implicit casting of C gives results that many programmers do not anticipate, often leading to program bugs.
Due to the finite lengths of the encodings, computer arithmetic has properties quite different from conventional integer and real arithmetic. The finite length can cause numbers to overflow, when they exceed the range of the representation. Floating-point values can also underflow, when they are so close to 0.0 that they are changed to zero.
The finite integer arithmetic implemented by C, as well as most other programming languages, has some peculiar properties compared to true integer arithmetic. For example, the expression x*x can evaluate to a negative number due to overflow. Nonetheless, both unsigned and two's-complement arithmetic satisfy many of the other properties of integer arithmetic, including associativity, commutativity, and distributivity. This allows compilers to do many optimizations. For example, in replacing the expression 7*x by (x<<3)-x, we make use of the associative, commutative, and distributive properties, along with the relationship between shifting and multiplying by powers of 2.
We have seen several clever ways to exploit combinations of bit-level operations and arithmetic operations. For example, we saw that with two's-complement arithmetic, ~x+1 is equivalent to -x. As another example, suppose we want a bit pattern of the form [0, …, 0, 1, …, 1], consisting of w − k zeros followed by k ones. Such bit patterns are useful for masking operations. This pattern can be generated by the C expression (1<<k)-1, exploiting the property that the desired bit pattern has numeric value 2k − 1. For example, the expression (1<<8)-1 will generate the bit pattern 0xFF.
Floating-point representations approximate real numbers by encoding numbers of the form x × 2y. IEEE Standard 754 provides for several different precisions, with the most common being single (32 bits) and double (64 bits). IEEE floating point also has representations for special values representing plus and minus infinity, as well as not-a-number.
Floating-point arithmetic must be used very carefully, because it has only limited range and precision, and because it does not obey common mathematical properties such as associativity.