Computer Systems: A Programmer's Perspective (Global Edition)
A comprehensive deep-dive into how computer systems execute programs and store information. This course bridges the gap between high-level programming and the underlying hardware, covering machine-level representation, processor architecture, memory hierarchy, and concurrent programming.
Course Overview
📚 Content Summary
Master the art of systems programming by understanding the hardware-software interface.
Authors: Randal E. Bryant and David R. O'Hallaron
Acknowledgments: Supported by the students and instructors of the 15-213 course at Carnegie Mellon University, with contributions from Manasa S. and Mohit Tahiliani.
🎯 Learning Objectives
- Identify how information is represented using bits and context within a system.
- Trace the four stages of the compilation system from source code to executable.
- Describe the organizational structure of hardware and the hierarchical nature of storage devices.
- Convert between decimal, binary, and hexadecimal notations and explain machine-level addressing (Endianness).
- Perform bit-level and logical operations in C and predict the results of arithmetic shifts.
- Analyze integer encodings to identify potential overflow vulnerabilities and casting errors.
- Analyze the mapping between C constructs (loops, branches, procedures) and x86-64 assembly instructions.
- Deconstruct the run-time stack to explain how parameters are passed, local variables are stored, and recursive calls are managed.
- Evaluate memory layouts for heterogeneous data structures and apply alignment rules to calculate total storage requirements.
- Define the Y86-64 Programmer-Visible State and encode/decode instructions into byte sequences.
🔹 Lesson 1: A Tour of Computer Systems
Overview: This lesson provides a comprehensive overview of how computer systems represent information, translate programs, and execute instructions through complex hardware and software interactions. It explores the journey of a program from source code to execution, the critical role of the memory hierarchy in overcoming the processor-memory gap, the abstractions provided by the operating system, and the mathematical laws governing system performance and parallelism.
Learning Outcomes:
- Identify how information is represented using bits and context within a system.
- Trace the four stages of the compilation system from source code to executable.
- Describe the organizational structure of hardware and the hierarchical nature of storage devices.
🔹 Lesson 2: Representing and Manipulating Information
Overview: This lesson explores how digital computers represent and manipulate information at the bit level. It covers the transition from hexadecimal notation and machine-level word sizes to the complex encodings of integers (Unsigned and Two's-Complement) and floating-point numbers (IEEE 754). Students will analyze the mathematical properties of computer arithmetic, including the security implications of overflow and the nuances of rounding in finite-precision systems.
Learning Outcomes:
- Convert between decimal, binary, and hexadecimal notations and explain machine-level addressing (Endianness).
- Perform bit-level and logical operations in C and predict the results of arithmetic shifts.
- Analyze integer encodings to identify potential overflow vulnerabilities and casting errors.
🔹 Lesson 3: Machine-Level Representation of Programs
Overview: This lesson provides a comprehensive deep dive into how high-level C programs are transformed into x86-64 machine code. It covers the fundamental architecture of the processor, including registers and the stack, the implementation of control flow (conditionals, loops, and switches), the mechanics of procedure calls and recursion, and the machine-level representation of complex data structures like arrays, structs, and unions. Finally, it addresses system security via buffer overflow analysis and the specialized instructions used for floating-point arithmetic.
Learning Outcomes:
- Analyze the mapping between C constructs (loops, branches, procedures) and x86-64 assembly instructions.
- Deconstruct the run-time stack to explain how parameters are passed, local variables are stored, and recursive calls are managed.
- Evaluate memory layouts for heterogeneous data structures and apply alignment rules to calculate total storage requirements.
🔹 Lesson 4: Processor Architecture
Overview: This lesson explores the fundamental architecture of a processor, focusing on the transition from a sequential implementation (SEQ) to a high-performance pipelined implementation (PIPE) using the Y86-64 Instruction Set Architecture (ISA). Students will analyze how instructions are encoded, processed through discrete stages (Fetch, Decode, Execute, Memory, Write-Back), and how hardware hazards are managed using control logic, stalling, and forwarding to maximize throughput.
Learning Outcomes:
- Define the Y86-64 Programmer-Visible State and encode/decode instructions into byte sequences.
- Implement hardware control logic using HCL (Hardware Control Language) for combinational and sequential circuits.
- Trace instruction flow through the six stages of a sequential processor and identify the impact of clocking.
🔹 Lesson 5: Optimizing Program Performance
Overview: This lesson explores the systematic approach to improving program performance by understanding the interaction between high-level code, optimizing compilers, and modern microprocessor architectures. Students will learn to identify "optimization blockers" such as memory aliasing, apply low-level transformations like loop unrolling and reassociation, and utilize profiling tools like GPROF to target performance bottlenecks effectively.
Learning Outcomes:
- Identify and mitigate optimization blockers including memory aliasing and procedure call overhead.
- Quantify program performance using the Cycles Per Element (CPE) metric.
- Apply loop unrolling, multiple accumulators, and reassociation transformations to exploit instruction-level parallelism.
🔹 Lesson 6: The Memory Hierarchy
Overview: This lesson explores the structural and functional design of the memory hierarchy, focusing on the trade-offs between storage speed, cost, and capacity. It details the technologies powering modern systems—from SRAM and DRAM to Disks and SSDs—and explains how the Principle of Locality (Temporal and Spatial) allows small, fast cache memories to significantly enhance program performance. Students will learn to analyze cache mapping (Direct-Mapped, Set Associative, Fully Associative) and apply optimization techniques like loop reordering and blocking to write cache-friendly code.
Learning Outcomes:
- Distinguish between SRAM, DRAM, ROM, and Flash memory technologies and their roles in the hierarchy.
- Calculate disk storage capacity and total access time based on geometry and operational components.
- Analyze memory addresses to determine cache set indices, tags, and block offsets across different mapping strategies.
🔹 Lesson 7: Linking
Overview: This lesson explores the critical system-level process of linking, which aggregates code and data into a single file that can be loaded into memory and executed. Students will transition from source code to executable binaries, understanding how linkers resolve symbol references, merge sections through relocation, and manage both static and dynamic libraries. The lesson concludes with advanced techniques like library interpositioning and position-independent code (PIC) used in modern shared libraries.
Learning Outcomes:
- Trace the transformation of source files through the compiler driver to a final executable.
- Analyze ELF object files to identify symbol types and section organization.
- Apply symbol resolution rules to manage duplicate names and link-time dependencies.
🔹 Lesson 8: Exceptional Control Flow
Overview: This lesson explores Exceptional Control Flow (ECF), the mechanism by which a computer system reacts to changes in system state. We examine how ECF is implemented at all levels of the system, from hardware-triggered Exceptions and operating system-level Context Switching and Process Control (fork, wait, execve) to software-level Signals and Nonlocal Jumps. Students will learn to manage concurrency, handle system errors, and write robust, signal-safe code.
Learning Outcomes:
- Distinguish between the four classes of hardware-level exceptions (Interrupts, Traps, Faults, Aborts) and their handling mechanisms.
- Manage process lifecycles using system calls for creation (fork), reaping (waitpid), and execution (execve).
- Implement safe signal handlers that account for concurrency, non-queuing signals, and async-signal-safety.
🔹 Lesson 9: Virtual Memory
Overview: This lesson explores Virtual Memory (VM) as a fundamental abstraction that provides each process with a large, contiguous, and private address space. We cover its three primary roles: a tool for efficient caching in DRAM, a mechanism for memory management and protection, and a foundation for memory mapping. Furthermore, the lesson delves into the mechanics of address translation (TLB), dynamic memory allocation (heap management), and the principles of automated garbage collection, concluding with critical memory-related pitfalls in C programming.
Learning Outcomes:
- Distinguish between physical and virtual addressing and describe the role of the Memory Management Unit (MMU).
- Perform virtual-to-physical address translation using Page Tables and the Translation Lookaside Buffer (TLB).
- Analyze and implement dynamic memory allocation strategies, including implicit/explicit lists and coalescing.
🔹 Lesson 10: System-Level I/O
Overview: This lesson explores the fundamental interface between the Linux operating system and application programs for performing input and output. It covers the basic Unix I/O system calls, the various types of files encountered in the Linux filesystem, and the kernel-level data structures used to manage them. Furthermore, it introduces the Robust I/O (RIO) package to handle "short counts" and provides guidelines for choosing between Standard I/O and System-Level I/O in different programming contexts, such as network programming.
Learning Outcomes:
- Implement basic file operations using the Unix I/O interface (open, close, read, write).
- Differentiate between regular files, directories, and links while querying file metadata using stat.
- Utilize the RIO package to perform robust, buffered, and unbuffered I/O operations.
🔹 Lesson 11: Network Programming
Overview: This lesson explores the fundamental architecture of network-based applications, centered on the Client-Server programming model and the Global IP Internet. Students will learn to navigate the Sockets interface—the primary API for system-level network communication—and progress toward implementing a functional web server (TINY) capable of delivering both static files and dynamic content through the Common Gateway Interface (CGI).
Learning Outcomes:
- Understand the request-response cycle of the client-server model and the hardware/software hierarchy of the Global IP Internet.
- Manipulate and convert IP addresses, domain names, and socket structures using protocol-independent functions like getaddrinfo.
- Implement a robust iterative web server and CGI programs that utilize process control and I/O redirection to serve dynamic content.
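The getaddrinfo outcome can be sketched as follows (a minimal example restricted to a literal IPv4 loopback address so it needs no network; the AI_NUMERICHOST flag skips DNS entirely): the call fills in socket address structures ready for socket() and connect().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void) {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_INET;         /* IPv4 only, for a predictable result */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_NUMERICHOST;  /* host is a literal address, no lookup */

    int rc = getaddrinfo("127.0.0.1", "80", &hints, &res);
    assert(rc == 0);

    /* Convert the binary address back to presentation form. */
    char ip[INET_ADDRSTRLEN];
    struct sockaddr_in *sin = (struct sockaddr_in *)res->ai_addr;
    inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof ip);
    assert(strcmp(ip, "127.0.0.1") == 0);
    assert(ntohs(sin->sin_port) == 80);  /* port arrives in network byte order */
    printf("resolved %s:%d\n", ip, ntohs(sin->sin_port));

    freeaddrinfo(res);
    return 0;
}
```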
🔹 Lesson 12: Concurrent Programming
Overview: This lesson explores the fundamental models of concurrency: processes, I/O multiplexing, and threads. It provides a deep dive into synchronization using semaphores to resolve race conditions, common architectural patterns like producer-consumer and prethreaded servers, and the metrics used to evaluate parallel performance. Finally, it addresses critical reliability issues including thread safety, reentrancy, and deadlock prevention.
Learning Outcomes:
- Distinguish between process-based, I/O multiplexed, and thread-based concurrency models.
- Apply semaphore operations (P and V) to ensure mutual exclusion and solve synchronization patterns.
- Calculate parallel performance metrics such as speedup and efficiency under different scaling laws.