Python Data Structures and Algorithm Analysis (2nd Edition)
This book is a classic textbook on data structures and algorithms using Python. It covers Python basics review, algorithm analysis (Big O notation), fundamental data structures (stacks, queues, lists), recursion, searching and sorting, tree and graph algorithms. Through practical code listings, it helps readers understand how to efficiently implement various abstract data types using Python.
Lessons
Course Overview
📚 Content Summary
This book is a classic textbook on data structures and algorithms using Python. It covers Python fundamentals review, algorithm analysis (Big O notation), basic data structures (stacks, queues, lists), recursion, searching and sorting, tree and graph algorithms. Through practical code listings, it helps readers understand how to efficiently implement various abstract data types using Python.
Only by deeply understanding data structures and algorithms can one truly master Python.
Author: Bradley N. Miller (Professor Emeritus of Computer Science, Franklin & Marshall College, USA), David L. Ranum (Cognitive Software Engineer at IBM Watson)
Acknowledgments: Thanks to colleague Steve Hubbard for extensive feedback on the first edition and new material for this edition, as well as to peers worldwide who emailed us with corrections and suggestions. Special thanks to Mary, Bob, and others at Java John’s Café in Decatur for allowing us to become "resident authors" during our sabbaticals. Also, heartfelt gratitude to the staff at Franklin, Beedle & Associates (especially Jim Leisy and Tom Sumner) for their enjoyable collaboration. Finally, special thanks to our wives, Jane Miller and Brenda Ranum, whose love and support made this book possible.
🎯 Learning Objectives
- Understand the relationship between computer science, algorithms, and programming, and grasp the concepts of Abstract Data Types (ADT) and information hiding.
- Become proficient in Python's built-in collection data types (lists, tuples, sets, dictionaries) and control structures (loops, conditionals, exception handling).
- Master core aspects of Python object-oriented programming: including class definition, constructors, operator overloading (e.g., fraction addition), application of Euclidean algorithm, and distinction between deep and shallow equality.
- Compare multiple algorithmic approaches: explain four solutions for anagram detection (counting, sorting, brute force, counting) and their corresponding time complexities.
- Quantify Python container performance: understand the Big O efficiency of core operations in Python lists (List) and dictionaries (Dict), and distinguish performance differences between
pop()andpop(i). - Develop performance validation skills: use the
timeitmodule to design experiments verifying consistency between theoretical complexity and actual runtime. - Understand and differentiate the logical characteristics of linear data structures (stacks, queues, deques, lists).
- Implement custom stacks, queues, and deques using Python's basic collections (e.g., lists).
- Apply stacks in expression processing (infix to postfix conversion, postfix evaluation) and queues in system simulation (printer simulation).
- Master the three fundamental principles of recursion: accurately identify and write recursive functions containing base cases, state evolution, and self-calls.
🔹 Lesson 1: Python Programming Fundamentals and Object-Oriented Abstraction
Overview: This lesson module guides learners from foundational computer science theory into practical Python programming. It begins by defining key concepts in computer science, algorithms, and Abstract Data Types (ADT), then delves into Python's built-in collections, control structures, and exception handling. Finally, it demonstrates the power of Object-Oriented Programming (OOP) in procedural and data abstraction through building a Fraction class and a "logic gate" inheritance hierarchy.
Learning Outcomes:
- Understand the relationship between computer science, algorithms, and programming, and master the concepts of Abstract Data Types (ADT) and information hiding.
- Be proficient in using Python's built-in collection data types (lists, tuples, sets, dictionaries) and control structures (loops, conditionals, exception handling).
- Master core aspects of Python object-oriented programming: including class definition, constructors, operator overloading (e.g., fraction addition), application of the Euclidean algorithm, and the difference between deep and shallow equality.
🔹 Lesson 2: Algorithm Analysis and Python Container Performance
Overview: This lesson explores how to evaluate algorithm efficiency using Big O notation, illustrated through the example of anagram detection evolving from O(n!) down to O(n). It also provides quantitative analysis of common operations in Python's core data structures (lists and dictionaries), helping developers make optimal choices in container selection and algorithm design in real-world programming.
Learning Outcomes:
- Compare multiple algorithmic approaches: explain the four solutions for anagram detection (counting, sorting, brute force, counting) and their respective time complexities.
- Quantify Python container performance: understand the Big O efficiency of core operations in Python lists (List) and dictionaries (Dict), and distinguish performance differences between
pop()andpop(i). - Develop performance validation skills: use the
timeitmodule to design experiments that verify consistency between theoretical complexity and actual runtime.
🔹 Lesson 3: Fundamental Linear Structures: Stacks, Queues, and Linked Lists
Overview: This lesson dives into four fundamental linear data structures: stacks (Stack), queues (Queue), deques (Deque), and lists (List). The essence of linear structures lies in maintaining relative order among elements, with differences primarily arising from how items are added and removed. By implementing these abstract data types (ADTs) in Python, learners will master how to leverage LIFO (Last In, First Out) and FIFO (First In, First Out) principles to solve practical algorithmic problems such as expression transformation, system simulation, and linked list memory management.
Learning Outcomes:
- Understand and differentiate the logical characteristics of linear data structures (stacks, queues, deques, lists).
- Implement custom stacks, queues, and deques using Python's basic collections (e.g., lists).
- Apply stacks in expression processing (infix to postfix conversion, postfix evaluation) and queues in system simulation (printer simulation).
🔹 Lesson 4: Principles of Recursive Algorithms and Introduction to Dynamic Programming
Overview: This lesson explores the core principles of recursive algorithms, starting from the foundational three rules of recursion. Through examples like numerical conversion, fractal geometry, and classic puzzles (Tower of Hanoi, maze solving), it reveals the underlying mechanism of recursive calls—stack frames. Finally, the lesson introduces the concept of dynamic programming (Dynamic Programming), demonstrating how memoization techniques can eliminate redundant computation in recursion, leading to a qualitative leap in algorithmic efficiency.
Learning Outcomes:
- Master the three fundamental principles of recursion: accurately identify and write recursive functions containing base cases, state transitions, and self-calls.
- Understand the underlying execution mechanism: comprehend how Python's call stack (Stack Frame) manages local variables and return paths during recursion.
- Develop modeling ability for complex problems: apply recursion and backtracking to solve non-linear problems such as Tower of Hanoi and maze navigation.
🔹 Lesson 5: Search Techniques, Hashing, and Sorting Algorithms
Overview: This lesson module focuses on core data processing techniques in computer science: search, hashing (Hashing), and sorting. It covers everything from basic sequential search to efficient binary search, explores the construction principles of hash tables, collision resolution methods, and implementations of the Map abstract data type. It also provides detailed analysis of bubble sort, shell sort, and merge sort algorithms, including their logic and performance characteristics.
Learning Outcomes:
- Understand and compare the applicability and time complexity of sequential search versus binary search (O(n) vs O(\log n)).
- Master hash function design (folding method, mid-square method) and collision handling mechanisms (linear probing, quadratic probing, chaining).
- Manually simulate and implement bubble sort, shell sort, and merge sort in code, and understand how divide-and-conquer strategies are applied in sorting.
🔹 Lesson 6: Tree Structures: Binary Heaps, Search Trees, and Balanced Trees (AVL)
Overview: This lesson explores one of the most fundamental data structures in computer science—trees. We begin with basic terminology and implementation of binary trees, then progress to advanced applications such as expression parsing trees and traversal methods. Next, we study two efficient variants: binary heaps for priority queues and binary search trees (BST) for efficient searching. Finally, we examine AVL trees, which maintain balance through rotation operations to address performance degradation in BSTs under worst-case scenarios.
Learning Outcomes:
- Accurately define core tree terminology (root, edge, path, height, etc.) and describe the recursive nature of binary trees.
- Master core algorithms for implementing binary heaps, binary search trees, and AVL trees in Python (e.g., insertion, deletion, rotation, rebalancing).
- Execute and interpret three tree traversal algorithms, and use parse trees to process mathematical expressions.
🔹 Lesson 7: Graph Algorithms: Search, Shortest Path, and Minimum Spanning Tree
Overview: This lesson delves into the foundational theory of graph theory and its efficient implementation in Python. Topics include the abstract data type representation of graphs (adjacency matrix and adjacency list), fundamental search algorithms (BFS and DFS) and their classical applications. The lesson culminates with advanced algorithms for weighted graphs: shortest path (Dijkstra) and minimum spanning tree (Prim).
Learning Outcomes:
- Understand and compare the performance and use cases of the two main graph representations (adjacency matrix and adjacency list).
- Master the principles of breadth-first search (BFS) and depth-first search (DFS), and apply them to solve pathfinding and traversal problems.
- Apply graph algorithms to solve complex dependency issues (topological sorting) and network connectivity problems (strongly connected components, minimum spanning tree).
🔹 Lesson 8: Special Topic Review: RSA Encryption, Skip Lists, and Octree Quantization
Overview: This lesson explores the transition from fundamental data structures to complex algorithms in computer science. It covers the underlying implementation and amortized analysis of Python lists (ArrayList), number-theoretic recursive algorithms supporting RSA encryption, construction of skip lists to enhance dictionary search efficiency, and the principle of using octrees for color quantization in digital images.
Learning Outcomes:
- Understand the memory layout of ArrayLists and the mathematical derivation behind amortized analysis (Amortized Analysis).
- Master the core mathematical principles of RSA encryption, including modular arithmetic, power residues, and the extended Euclidean algorithm.
- Describe the hierarchical structure of skip lists, their probabilistic construction process, and their O(\log n) search efficiency.