Study Notes of CS:APP
Resources [21-12]
Official Material [22-01]
• Textbooks
• Randal E. Bryant and David R. O'Hallaron, Computer Systems: A Programmer's Perspective, Third Edition, Pearson, 2016
• Courses
• 15-213 Instances
• 15-213/18-213 Fall 2019 (Randy's latest)
• http://www.cs.cmu.edu/~213 (original)
• Five system courses
Readers' Notes [22-01-11]
• cnblogs posts by 嵌入式与Linux那些事 ("Embedded and Linux Matters")
Adoptions [22-01]
• Other course variations
• Comparisons of some course variations
Chapter | CSE 351: H/SI | 15-213: ICS | SE 101: ICS
------- | ------------- | ----------- | -----------
1 | 1.0-1.10 | 1 | 1.1-1.2, 1.4.1, 1.7-1.8, 1.9.2
2 | 2.0-2.4.3 | 2.1-2.3 | 2.1-2.3
3 | 3.2-3.5.3, 3.6.0-3.6.5, 3.6.7-3.7.5, 3.8-3.10 | 3.1-3.10 | 3.1, 3.4, 3.6.1-3.6.8, 3.7-3.11
4 | 4.1-4.5.9 | |
5 | 5 | 5.1-5.14 |
6 | 6.0, 6.2-6.7 | 6.1-6.7 | 6.1-6.6
7 | 7.1-7.12 | |
8 | 8.0-8.4 | 8.1-8.8 | 8.1-8.6
9 | 9.0-9.7, 9.9-9.12 | 9.1-9.12 | 9.1-9.11
10 | 10 | 10.1-10.11 |
11 | 11.1-11.6 | 11.1-11.6 |
12 | 12.1-12.8 | 12.1-12.7 |
Software Tools [21-12]
My Foreword [21-12]
Goal [21-12]
• Easy review
• Sharing
Content Covered [22-01]
• Content suited to conditions in China and to my personal needs
Chapter | Phase 1: New | Phase 2: Review | Phase …: New
------- | ------------ | --------------- | -------------
1 | 1.0-1.10 | 1.0-1.10 |
2 | 2.0-2.3, | 2.0-2.3, 2.5 | 2.4-2.4.3, 2.4.4.1,
3 | 3.0-3.10, | 3.0-3.10, 3.12 |
4 | 4.0 | 4.0 |
5 | 5.0 | 5.0 |
6 | 6.0-6.4, | |
7 | 7.0 | |
8 | 8.0-8.6, | |
9 | 9.0-9.7, | |
10 | 10.0-10.12 | |
11 | 11.0-11.7 | |
12 | 12.0-12.8 | |
How I Learn [22-01]
• Every time I learn something:
• First, read the book on MoonReader with the assistance of eudic and mark the key points.
• Second, watch the corresponding 15-213 video(s) if available and refine the notes.
• Any bug listed in the book's errata is corrected in the notes.
On the Design [22-01]
• WARNING: Earlier versions are not readable enough!
• Environment
• Microsoft Word 2019
• Organization
• Lies between that of typical articles and that of typical presentations
• Levels
• Almost identical to those of the global version
• The biggest difference
• + Heading 6
• Formatting
• Follow the style scheme of the global edition as closely as possible, to the degree that formatting does not become a burden on me.
• New/noteworthy conventions
• Heading 6: Times New Roman, bold
• Colors: #000000, #00AEEF, #767676
• Date updated: [yy-mm]
• Code: Courier New
• Terminology: bold
• Highlight (primary): italic
• Highlight (secondary): underline (deprecated)
• Screenshots
• Taken at 200% zoom
• Displayed at 55% of the 200% capture
Exporting to HTML Blog Post [22-01]
• NOTE: Not fully applied yet for the reasons explained below
• On Word
• File – Share – Post to Blog
• Run some macros that add marks for additional styles
• But Normal.dotm was permanently deleted by accident :(
• Publish
• On VS Code
• Paste the HTML source from cnblogs
• On a simple Java program
• Written by me
• A little bit unstable but pretty efficient
• Performs some replacements to
• Recover the additional styles
• Adjust text size and font for Web reading
• Remove useless attributes
To-do List [22-01]
• Read more.
• Refine my notes.
• Goal of # words: ~100 words for each noted page
• Properly decide the length of list items
• Remove all derivations
• Rewrite chapter summaries
• …
• Rewrite Word macros
Chapter 1 A Tour of Computer Systems
Computer Systems [22-01]
• A computer system consists of hardware and systems software that work together to run application programs.
1.1 Information Is Bits + Context [21-12]
Source Programs
• A program begins life as a source program (or source file).
• Source program is a sequence of bits, each with a value of 0 or 1
• Organized in bytes (8-bit chunks); each byte represents some text character in the program
• Most computer systems represent text characters using ASCII
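• The book's running example, hello.c, for concreteness (every character of this source file is stored as one byte of ASCII text):

```c
#include <stdio.h>

int main()
{
    printf("hello, world\n");
    return 0;
}
```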
Files [21-12]
• Text files: files that consist exclusively of ASCII characters
• Binary files: all other files
Information = Bits + Context [22-01]
• All information in a system is represented as a bunch of bits.
• The only thing that distinguishes different data objects is the context in which we view them.
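• A minimal sketch (not from the book) of bits + context: the same four bytes form different data objects depending on how we view them. The printed values assume a little-endian machine with 4-byte int and float.

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char bytes[4] = {0x00, 0x00, 0xc8, 0x42};
    int i;
    float f;

    /* Copy the same bit pattern into two differently typed objects. */
    memcpy(&i, bytes, sizeof(i));
    memcpy(&f, bytes, sizeof(f));

    printf("as an int:  %d\n", i);   /* 1120403456 */
    printf("as a float: %f\n", f);   /* 100.000000 */
    return 0;
}
```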
1.2 Programs Are Translated by Other Programs into Different Forms [21-12]
The Compilation System
• Compilation system: the programs that collectively perform the four phases (preprocessor, compiler, assembler, and linker).
• Preprocessing phase
• Preprocessor (cpp) modifies hello.c according to directives that begin with '#'
• Result: hello.i C program
• Compilation phase
• Compiler (cc1) translates hello.i into hello.s assembly-language program
• Assembly language provides a common output language for different compilers for different high-level languages.
• Assembly phase
• Assembler (as) translates hello.s into machine language instructions, packages them in a relocatable object program (binary)
• Result: hello.o object file
• Linking phase
• printf
• Part of the standard C library provided by every C compiler
• Resides in a separate precompiled object file printf.o
• Linker (ld) merges printf.o with hello.o
• Result: hello, an executable object file (or simply executable) ready to be loaded into memory and executed by the system
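• The four phases can also be driven one at a time with standard gcc options (a sketch; the book simply runs the complete driver: linux> gcc -o hello hello.c):

```sh
gcc -E hello.c -o hello.i   # preprocessing: hello.c -> hello.i
gcc -S hello.i -o hello.s   # compilation:   hello.i -> hello.s
gcc -c hello.s -o hello.o   # assembly:      hello.s -> hello.o
gcc hello.o -o hello        # linking:       hello.o + library code -> hello
```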
1.3 It Pays to Understand How Compilation Systems Work [21-12]
• Some important reasons:
• Optimizing program performance
• Understanding link-time errors
• Avoiding security holes
• Buffer overflow vulnerabilities
1.4 Processors Read and Interpret Instructions Stored in Memory [21-12]
The Shell
• Shell: command-line interpreter that prints a prompt, waits for you to type a command line, and then performs the command.
• If the first word of the command line does not correspond to a built-in shell command, then the shell assumes that it is the name of an executable file.
1.4.1 Hardware Organization of a System
Buses
• Buses: collection of electrical conduits running throughout the system that carry bytes of information back and forth between the components.
• Typically designed to transfer words
• Word: fixed-size chunk of bytes
• Word size: # bytes in a word
• Fundamental system parameter that varies across systems.
• Most machines: either 4 bytes (32 bits) or 8 bytes (64 bits).
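• A quick sanity check (a sketch, assuming pointers occupy exactly one word, as they do on typical systems):

```c
#include <stdio.h>

int main(void)
{
    printf("word size: %zu bytes (%zu bits)\n",
           sizeof(void *), 8 * sizeof(void *));
    return 0;
}
```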
I/O Devices
• Input/output (I/O) devices: system's connection to the external world
• Example system has 4:
• Keyboard and mouse
• Display
• Disk drive (or simply disk)
• Each is connected to the I/O bus by either a controller or an adapter.
• The distinction: packaging
• Motherboard: system's main printed circuit board
• Controllers: chip sets in the device itself or on the motherboard
• Adapter: card that plugs into a slot on the motherboard
• Purpose of each: to transfer information back and forth between the I/O bus and an I/O device.
Main Memory
• Main memory: temporary storage device that holds both a program and the data it manipulates while the processor is executing the program.
• Physically, consists of a collection of dynamic random-access memory (DRAM) chips.
• Logically, organized as a linear array of bytes, each with its own unique address (array index) starting at 0.
Processor
• Central processing unit (CPU), or simply processor: engine that executes (interprets) instructions stored in main memory.
• Program counter (PC): register at its core
• Register: word-size storage device
• At any point in time, points at (contains the address of) some machine-language instruction in main memory.
Operations
• Processor repeatedly
• Executes the instruction pointed at by the PC
• Updates the PC to point to the next instruction
• Processor appears to operate according to a very simple instruction execution model, defined by its instruction set architecture.
• Instructions execute in strict sequence, and executing a single instruction involves performing a series of steps.
• Operations revolve around:
• Main memory
• Register file: small storage device that consists of a collection of word-size registers, each with its own unique name
• Arithmetic/logic unit (ALU): Computes new data and address values
• Examples of operations:
• Load: Copy a byte or a word from main memory into a register, overwriting the previous contents of the register.
• Store: Copy a byte or a word from a register to a location in main memory, overwriting the previous contents of that location.
• Operate: Copy the contents of two registers to the ALU, perform an arithmetic operation on the two words, and store the result in a register, overwriting the previous contents of that register.
• Jump: Extract a word from the instruction itself and copy that word into the PC, overwriting the previous value of the PC.
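• A sketch of how a single C statement decomposes into these operations (the exact instruction sequence depends on the ISA and the compiler):

```c
#include <stdio.h>

int main(void)
{
    long a = 1, b = 2, c;

    /* Typical compiled code for the statement below:
       load:    copy a from main memory into a register
       load:    copy b from main memory into a register
       operate: add the two registers in the ALU into a result register
       store:   copy the result register to c's location in main memory */
    c = a + b;

    printf("%ld\n", c);
    return 0;
}
```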
The Processor's ISA versus its Microarchitecture
• ISA describes the effect of each machine-code instruction
• Microarchitecture describes how the processor is actually implemented
1.4.2 Running the hello Program
Reading the hello Command from the Keyboard
Loading the Executable from Disk into Main Memory
• Using direct memory access (DMA), the data travel directly from disk to main memory, without passing through the processor.
Writing the Output String from Memory to the Display
1.5 Caches Matter [21-12]
The Need to Make Copy Faster
• Lesson: A system spends a lot of time moving information from one place to another.
• Much of copying is overhead: slows down the "real work" of the program.
• Major goal for system designers: to make these copy operations run as fast as possible.
The Processor–Memory Gap
• Processor–memory gap: [22-01]
• Register file stores far fewer bytes of information than main memory.
• Processor can read data from register file much faster than from memory.
Dealing with the Processor–Memory Gap: Cache Memories
• Cache memories (or simply caches): smaller, faster storage devices that serve as temporary staging areas for information that the processor is likely to need in the near future.
• L1 caches
• On the processor chip
• Hold tens of thousands of bytes
• Can be accessed nearly as fast as the register file
• L2 caches
• Larger, with hundreds of thousands to millions of bytes
• Connected to the processor by a special bus
• Still much faster than accessing the main memory
• L3 caches
• Included in newer and more powerful systems
• Implemented with static random-access memory (SRAM, a hardware technology)
• Idea behind: Exploit locality to get a very large and fast memory
• Locality: tendency for programs to access data and code in localized regions
• Set up caches to hold data that are likely to be accessed often
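• A sketch (not from chapter 1) of the locality that caches reward: both functions compute the same sum, but the row-wise version accesses memory with stride 1, using every byte of each cache line it loads, and typically runs much faster.

```c
#include <stdio.h>

#define N 1000
static double a[N][N];

/* Good spatial locality: stride-1 access along each row. */
double sum_rowwise(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Poor spatial locality: each access jumps N * sizeof(double) bytes. */
double sum_colwise(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

int main(void)
{
    printf("%f %f\n", sum_rowwise(), sum_colwise());
    return 0;
}
```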
1.6 Storage Devices Form a Hierarchy [21-12]
Memory Hierarchy
• Storage devices in a system are organized as a memory hierarchy
• [Figure: the memory hierarchy]
• Main idea: Storage at one level serves as a cache for storage at the next lower level
1.7 The Operating System Manages the Hardware [22-01]
• Operating system: layer of software interposed between the application program and the hardware.
• All attempts by an application program to manipulate the hardware must go through the operating system.
• Two primary purposes:
(1) to protect the hardware from misuse by runaway applications
(2) to provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices
• Both goals are achieved via the fundamental abstractions: processes, virtual memory, and files
1.7.1 Processes
Overview
• Process: OS's abstraction for a running program
• Multiple processes can run concurrently on the same system, and each process appears to have exclusive use of the hardware.
• Concurrently: The instructions of one process are interleaved with the instructions of another process.
• Single CPU can appear to execute multiple processes concurrently by having the processor switch among them
• OS performs this interleaving with context switching (a mechanism).
• Consider only a uniprocessor system
• Uniprocessor system contains a single CPU
Context
• Context: all the state information that the process needs in order to run
• OS keeps track of the context.
• Including information such as:
• Current values of the PC
• Register file
• Contents of main memory
Context Switching
• At any point in time, uniprocessor system can only execute the code for a single process.
• When OS decides to transfer control from current process to some new process, it performs a context switch by:
• 1. saving context of current process
• 2. restoring context of new process
• 3. passing control to new process
• New process picks up exactly where it left off.
• Basic idea for hello scenario:
• [Figure: context switching between two processes]
The Operating System Kernel
• Transition from one process to another is managed by the OS kernel.
• Kernel: portion of the OS code always resident in memory.
• When application program requires some action by OS, it executes a special system call instruction, transferring control to the kernel.
• Kernel then performs the requested operation and returns back to the application program.
• Note: Kernel is:
• Not a separate process
• Instead, a collection of code and data structures that the system uses to manage all the processes.
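• A minimal sketch of the mechanism: write() (part of Unix I/O) is a system call; it transfers control to the kernel, which performs the output and then returns to the process. The printf in hello ultimately reaches the kernel the same way.

```c
#include <unistd.h>

int main(void)
{
    /* Ask the kernel to transfer 13 bytes to file descriptor 1
       (standard output). */
    write(1, "hello, world\n", 13);
    return 0;
}
```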
1.7.2 Threads
• A process can actually consist of multiple threads (execution units)
• Each running in the context of the process and sharing the same code and global data
• Increasingly important programming model:
• Requirement for concurrency in network servers
• Ease of sharing data between multiple threads (easier than between multiple processes)
• Efficiency: threads are typically more efficient than processes
• Multi-threading: one way to make programs run faster when multiple processors are available; see the sketch below
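• A minimal pthreads sketch (well ahead of chapter 12): a second thread runs in the context of the same process and shares its global data.

```c
#include <pthread.h>
#include <stdio.h>

int shared = 0;                 /* global data, visible to both threads */

void *peer(void *arg)
{
    (void)arg;
    shared++;                   /* safe here: only one peer thread runs */
    printf("hello from the peer thread\n");
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, peer, NULL);  /* create a second thread */
    pthread_join(tid, NULL);                 /* wait for it to finish  */
    printf("shared = %d\n", shared);         /* prints 1 */
    return 0;
}
```

• Compile with gcc -pthread.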
1.7.3 Virtual Memory
• Virtual memory: abstraction that provides each process with the illusion that it has exclusive use of main memory.
Virtual Address Space
• Virtual address space: the same uniform view of memory that each process has
• Topmost region is reserved for OS code and data that are common to all processes
• Lower region holds the code and data defined by the user's process.
• [Figure: the process virtual address space]
• Areas
• Program code and data
• Code begins at the same fixed address for all processes, followed by data locations that correspond to global C variables.
• Initialized directly from the contents of an executable object file.
• Linking and loading
• Heap (run-time heap)
• Expands and contracts dynamically at run time as a result of calls to C standard library routines such as malloc and free.
• Managing virtual memory
• Shared libraries
• Code and data such as the C standard library and the math library.
• Dynamic linking
• Stack (user stack)
• Used by the compiler to implement function calls.
• Expands and contracts dynamically during the execution of the program.
• In particular, each time we call a function, it grows; each time we return from a function, it contracts
• Kernel virtual memory
• Topmost region, reserved for the kernel
• Applications are not allowed to read or write this area or to directly call functions defined in the kernel code; they must invoke the kernel to perform these operations
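• A sketch that makes the areas visible (addresses vary from run to run under address-space randomization; the relative layout below is the typical Linux one):

```c
#include <stdio.h>
#include <stdlib.h>

int global = 1;                          /* program data region */

int main(void)
{
    int local = 2;                       /* user stack */
    int *dynamic = malloc(sizeof(int));  /* run-time heap */

    printf("code:  %p\n", (void *)main);
    printf("data:  %p\n", (void *)&global);
    printf("heap:  %p\n", (void *)dynamic);
    printf("stack: %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```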
Summary
• For virtual memory to work, basic idea: to store the contents of a process's virtual memory on disk and then use the main memory as a cache for the disk.
1.7.4 Files
• File: sequence of bytes
• Every I/O device is modeled as a file
• All input and output in the system is performed by reading and writing files, using Unix I/O (a small set of system calls).
• Very powerful
• Provides applications with a uniform view of all the varied I/O devices.
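• A sketch of the uniform interface (details come in chapter 10): the same open/read/write/close calls work whether the descriptor names a disk file, a terminal, or a network connection. This one copies a file to standard output.

```c
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[512];
    ssize_t n;
    int fd = open("hello.c", O_RDONLY);  /* any file-like object works */

    if (fd < 0)
        return 1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        write(1, buf, n);                /* fd 1 is standard output */
    close(fd);
    return 0;
}
```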
1.8 Systems Communicate with Other Systems Using Networks [22-01]
• From the point of view of an individual system, network can be viewed as just another I/O device.
• Network adapter
• Example: using telnet
• Client and server
• Network programming
1.9 Important Themes [22-01]
• A system is a collection of intertwined hardware and systems software that must cooperate in order to achieve the ultimate goal of running application programs.
1.9.1 Amdahl's Law
• Amdahl's law: observation about the effectiveness of improving the performance of one part of a system.
• Main idea: when we speed up one part of a system, the effect on the overall system performance depends on both how significant this part was and how much it sped up.
• Speedup
• Consider a system in which:
• Executing some application requires time $T_{\mathrm{old}}$
• Some part requires a fraction $\alpha$ of that time ($\alpha T_{\mathrm{old}}$), and its performance is improved by a factor of $k$
• Overall execution time: $T_{\mathrm{new}} = T_{\mathrm{old}}\,[(1-\alpha) + \alpha/k]$
• Speedup: $S = T_{\mathrm{old}}/T_{\mathrm{new}} = \dfrac{1}{(1-\alpha) + \alpha/k}$
• Major insight: to significantly speed up the entire system, we must improve the speed of a very large fraction of the overall system.
• Special case ($k \to \infty$): $S_\infty = \dfrac{1}{1-\alpha}$
• Describes a general principle for improving any process.
• Most meaningful in the world of computers.
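• Worked instance (the book's example): a part consuming 60% of the run time ($\alpha = 0.6$) is sped up by a factor of $k = 3$:

$$S = \frac{1}{(1 - 0.6) + 0.6/3} = \frac{1}{0.4 + 0.2} \approx 1.67$$

• Even a 3x speedup of the majority of the system yields less than a 2x overall speedup.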
1.9.2 Concurrency and Parallelism
• Concurrency: general concept of a system with multiple, simultaneous activities.
• Parallelism: use of concurrency to make a system run faster.
• Can be exploited at multiple levels of abstraction in a system, from highest to lowest:
Thread-Level Concurrency
• Concurrency: Multiple control flows execute concurrently.
Concurrent Execution on Uniprocessors
• Uniprocessor system: system consisting of a single processor
• Possible since the advent of time-sharing
• The concurrency is only simulated, by having a single computer rapidly switch among its executing processes
Concurrent Execution on Multiprocessors
• Multiprocessor system: system consisting of multiple processors all under the control of a single kernel
• Have become commonplace with the advent of multi-core processors and hyperthreading
• Multi-core processors have several "cores" (CPUs) integrated onto a single integrated-circuit chip.
• Each core with its own L1 and L2 caches
• Each L1 cache split into an i-cache (instruction cache) and a d-cache (data cache)
• Cores share higher levels of cache as well as interface to main memory
• Hyperthreading (simultaneous multi-threading): technique that allows a single CPU to execute multiple flows of control.
• It involves:
• Having multiple copies of some of the CPU hardware,
• E.g., PCs and register files
• Having only single copies of other parts of the hardware,
• E.g., FPUs.
• Hyperthreaded processor decides which of its threads to execute on a cycle-by-cycle basis.
• Use of multiprocessing can improve system performance in two ways:
• Reduces the need to simulate concurrency when multitasking
• Can run a single multi-threaded program faster
Instruction-Level Parallelism
• Instruction-level parallelism: property that, at a much lower level of abstraction, processors can execute multiple instructions at one time.
• Pipelining: actions required to execute an instruction are partitioned into different steps and the processor hardware is organized as a series of stages, each performing one of these steps.
• The stages can operate in parallel, working on different parts of different instructions.
• Superscalar processors: Processors that can sustain execution rates faster than 1 instruction per cycle.
• Most modern processors support superscalar operation.
Single-Instruction, Multiple-Data (SIMD) Parallelism
• Single-instruction, multiple-data (SIMD) parallelism: mode in which special hardware allows a single instruction to cause multiple operations to be performed in parallel.
• Write programs using special vector data types
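• A sketch using the GCC vector extension, one (gcc-specific, non-standard) way to get at such vector data types:

```c
#include <stdio.h>

/* Four packed floats; gcc compiles arithmetic on this type to SIMD
   instructions where the target supports them. */
typedef float v4f __attribute__((vector_size(16)));

int main(void)
{
    v4f a = {1.0f, 2.0f, 3.0f, 4.0f};
    v4f b = {10.0f, 20.0f, 30.0f, 40.0f};
    union { v4f v; float f[4]; } c;

    c.v = a + b;   /* all four additions may execute in parallel */
    printf("%.1f %.1f %.1f %.1f\n", c.f[0], c.f[1], c.f[2], c.f[3]);
    return 0;
}
```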
1.9.3 The Importance of Abstractions in Computer Systems
• Use of abstractions is one of the most important concepts in computer science.
• On the processor side
• Instruction set architecture: abstraction of the actual processor hardware
• A machine-code program behaves as if it were executed on a processor that performs just one instruction at a time.
• Underlying hardware executes multiple instructions in parallel
• On the OS side
• Files: abstraction of I/O devices
• Virtual memory: abstraction of program memory
• Processes: abstraction of a running program
• Virtual machine: abstraction of the entire computer
1.10 Summary [21-12]
• Computer system = hardware + systems software + application programs
• Information = bits + context
• Compilation system
• It pays to understand how it works
• Processors read and interpret instructions stored in main memory.
• Hardware organization
• Buses
• I/O devices
• Networks
• Main memory
• Processor
• Running the hello program
• Memory hierarchy: storage devices
• Regs
• Cache memories
• Caches matter
• DRAM
• Disk storage
• Abstractions provided by the OS:
• Processes
• Kernel
• Virtual memory
• Files
• Important themes
• Amdahl's Law
• Concurrency and Parallelism
• Abstractions