图书简介:
Contents
Part I Preliminaries 预备知识
Chapter 1 Data Structures and Algorithms 数据结构和算法 3
1.1 A Philosophy of Data Structures 数据结构的原则 4
1.1.1 The Need for Data Structures 学习数据结构的必要性 4
1.1.2 Costs and Benefits 代价与效益 6
1.2 Abstract Data Types and Data Structures 抽象数据类型和数据结构 8
1.3 Design Patterns 设计模式 12
1.3.1 Flyweight 享元模式 13
1.3.2 Visitor 访问者模式 13
1.3.3 Composite 组合模式 14
1.3.4 Strategy 策略模式 15
1.4 Problems, Algorithms, and Programs 问题、算法和程序 16
1.5 Further Reading 深入学习导读 18
1.6 Exercises 习题 20
Chapter 2 Mathematical Preliminaries 数学预备知识 25
2.1 Sets and Relations 集合和关系 25
2.2 Miscellaneous Notation 常用数学术语 29
2.3 Logarithms 对数 31
2.4 Summations and Recurrences 级数求和与递归 32
2.5 Recursion 递归 36
2.6 Mathematical Proof Techniques 数学证明方法 38
2.6.1 Direct Proof 直接证明法 39
2.6.2 Proof by Contradiction 反证法 39
2.6.3 Proof by Mathematical Induction 数学归纳法 40
2.7 Estimation 估计 46
2.8 Further Reading 深入学习导读 47
2.9 Exercises 习题 48
Chapter 3 Algorithm Analysis 算法分析 55
3.1 Introduction 概述 55
3.2 Best, Worst, and Average Cases 最佳、最差和平均情况 61
3.3 A Faster Computer, or a Faster Algorithm 换一台更快的计算机,还是换一种更快的算法 62
3.4 Asymptotic Analysis 渐近分析 65
3.4.1 Upper Bounds 上限 65
3.4.2 Lower Bounds 下限 67
3.4.3 Θ Notation Θ表示法 68
3.4.4 Simplifying Rules 化简法则 69
3.4.5 Classifying Functions 函数分类 70
3.5 Calculating the Running Time for a Program 程序运行时间的计算 71
3.6 Analyzing Problems 问题的分析 76
3.7 Common Misunderstandings 容易混淆的概念 77
3.8 Multiple Parameters 多参数问题 79
3.9 Space Bounds 空间代价 80
3.10 Speeding Up Your Programs 加速你的程序 82
3.11 Empirical Analysis 实证分析 85
3.12 Further Reading 深入学习导读 86
3.13 Exercises 习题 86
3.14 Projects 项目设计 90
Part II Fundamental Data Structures 基本数据结构
Chapter 4 Lists, Stacks, and Queues 线性表、栈和队列 95
4.1 Lists 线性表 96
4.1.1 Array-Based List Implementation 顺序表的实现 100
4.1.2 Linked Lists 链表 103
4.1.3 Comparison of List Implementations 线性表实现方法的比较 112
4.1.4 Element Implementations 元素的表示 114
4.1.5 Doubly Linked Lists 双链表 115
4.2 Stacks 栈 120
4.2.1 Array-Based Stacks 顺序栈 121
4.2.2 Linked Stacks 链式栈 123
4.2.3 Comparison of Array-Based and Linked Stacks 顺序栈与链式栈的比较 123
4.2.4 Implementing Recursion 递归的实现 125
4.3 Queues 队列 127
4.3.1 Array-Based Queues 顺序队列 128
4.3.2 Linked Queues 链式队列 133
4.3.3 Comparison of Array-Based and Linked Queues 顺序队列与链式队列的比较 133
4.4 Dictionaries 字典 133
4.5 Further Reading 深入学习导读 145
4.6 Exercises 习题 145
4.7 Projects 项目设计 148
Chapter 5 Binary Trees 二叉树 151
5.1 Definitions and Properties 定义及主要特性 151
5.1.1 The Full Binary Tree Theorem 满二叉树定理 153
5.1.2 A Binary Tree Node ADT 二叉树的抽象数据类型 155
5.2 Binary Tree Traversals 遍历二叉树 155
5.3 Binary Tree Node Implementations 二叉树的实现 160
5.3.1 Pointer-Based Node Implementations 使用指针实现二叉树 160
5.3.2 Space Requirements 空间代价 166
5.3.3 Array Implementation for Complete Binary Trees 使用数组实现完全二叉树 168
5.4 Binary Search Trees 二叉检索树 168
5.5 Heaps and Priority Queues 堆与优先队列 178
5.6 Huffman Coding Trees Huffman编码树 185
5.6.1 Building Huffman Coding Trees 建立Huffman编码树 186
5.6.2 Assigning and Using Huffman Codes Huffman编码及其用法 192
5.6.3 Search in Huffman Trees 在Huffman树中搜索 195
5.7 Further Reading 深入学习导读 196
5.8 Exercises 习题 196
5.9 Projects 项目设计 200
Chapter 6 Non-Binary Trees 树 203
6.1 General Tree Definitions and Terminology 树的定义与术语 203
6.1.1 An ADT for General Tree Nodes 树结点的ADT 204
6.1.2 General Tree Traversals 树的遍历 205
6.2 The Parent Pointer Implementation 父指针表示法 207
6.3 General Tree Implementations 树的实现 213
6.3.1 List of Children 子结点表表示法 214
6.3.2 The Left-Child/Right-Sibling Implementation 左子结点/右兄弟结点表示法 215
6.3.3 Dynamic Node Implementations 动态结点表示法 215
6.3.4 Dynamic “Left-Child/Right-Sibling” Implementation 动态左子结点/右兄弟结点表示法 218
6.4 K-ary Trees K叉树 218
6.5 Sequential Tree Implementations 树的顺序表示法 219
6.6 Further Reading 深入学习导读 223
6.7 Exercises 习题 223
6.8 Projects 项目设计 226
Part III Sorting and Searching 排序与检索
Chapter 7 Internal Sorting 内排序 231
7.1 Sorting Terminology and Notation 排序术语 232
7.2 Three Θ(n2) Sorting Algorithms 三种代价为Θ(n2)的排序算法 233
7.2.1 Insertion Sort 插入排序 233
7.2.2 Bubble Sort 冒泡排序 235
7.2.3 Selection Sort 选择排序 237
7.2.4 The Cost of Exchange Sorting 交换排序算法的时间代价 238
7.3 Shellsort Shell排序 239
7.4 Mergesort 归并排序 241
7.5 Quicksort 快速排序 244
7.6 Heapsort 堆排序 251
7.7 Binsort and Radix Sort 分配排序和基数排序 252
7.8 An Empirical Comparison of Sorting Algorithms 对各种排序算法的实验比较 259
7.9 Lower Bounds for Sorting 排序问题的下限 261
7.10 Further Reading 深入学习导读 265
7.11 Exercises 习题 265
7.12 Projects 项目设计 269
Chapter 8 File Processing and External Sorting 文件管理和外排序 273
8.1 Primary versus Secondary Storage 主存储器和辅助存储器 273
8.2 Disk Drives 磁盘 276
8.2.1 Disk Drive Architecture 磁盘结构 276
8.2.2 Disk Access Costs 磁盘访问代价 280
8.3 Buffers and Buffer Pools 缓冲区和缓冲池 282
8.4 The Programmer’s View of Files 程序员的文件视图 290
8.5 External Sorting 外排序 291
8.5.1 Simple Approaches to External Sorting 外排序的简单方法 294
8.5.2 Replacement Selection 置换选择排序 296
8.5.3 Multiway Merging 多路归并 300
8.6 Further Reading 深入学习导读 303
8.7 Exercises 习题 304
8.8 Projects 项目设计 307
Chapter 9 Searching 检索 311
9.1 Searching Unsorted and Sorted Arrays 检索未排序和已排序的数组 312
9.2 Self-Organizing Lists 自组织线性表 317
9.3 Bit Vectors for Representing Sets 集合检索 323
9.4 Hashing 散列方法 324
9.4.1 Hash Functions 散列函数 325
9.4.2 Open Hashing 开散列方法 330
9.4.3 Closed Hashing 闭散列方法 331
9.4.4 Analysis of Closed Hashing 闭散列方法分析 339
9.4.5 Deletion 删除 344
9.5 Further Reading 深入学习导读 345
9.6 Exercises 习题 345
9.7 Projects 项目设计 348
Chapter 10 Indexing 索引技术 351
10.1 Linear Indexing 线性索引 353
10.2 ISAM 索引顺序访问方法 356
10.3 Tree-based Indexing 基于树的索引 358
10.4 2-3 Trees 2-3树 360
10.5 B-Trees B树 364
10.5.1 B+-Trees B+树 368
10.5.2 B-Tree Analysis B树分析 374
10.6 Further Reading 深入学习导读 375
10.7 Exercises 习题 375
10.8 Projects 项目设计 377
Part IV Advanced Data Structures 高级数据结构
Chapter 11 Graphs 图 381
11.1 Terminology and Representations 术语和表示法 382
11.2 Graph Implementations 图的实现 386
11.3 Graph Traversals 图的遍历 390
11.3.1 Depth-First Search 深度优先搜索 393
11.3.2 Breadth-First Search 广度优先搜索 394
11.3.3 Topological Sort 拓扑排序 394
11.4 Shortest-Paths Problems 最短路径问题 399
11.4.1 Single-Source Shortest Paths 单源最短路径 400
11.5 Minimum-Cost Spanning Trees 最小支撑树 402
11.5.1 Prim’s Algorithm Prim算法 404
11.5.2 Kruskal’s Algorithm Kruskal算法 407
11.6 Further Reading 深入学习导读 408
11.7 Exercises 习题 408
11.8 Projects 项目设计 412
Chapter 12 Lists and Arrays Revisited 线性表和数组高级技术 415
12.1 Multilists 广义表 415
12.2 Matrix Representations 矩阵的表示方法 418
12.3 Memory Management 存储管理 422
12.3.1 Dynamic Storage Allocation 动态存储分配 424
12.3.2 Failure Policies and Garbage Collection 失败处理策略和无用单元收集 431
12.4 Further Reading 深入学习导读 435
12.5 Exercises 习题 436
12.6 Projects 项目设计 437
Chapter 13 Advanced Tree Structures 高级树结构 439
13.1 Tries Tries结构 439
13.2 Balanced Trees 平衡树 444
13.2.1 The AVL Tree AVL树 445
13.2.2 The Splay Tree 伸展树 447
13.3 Spatial Data Structures 空间数据结构 450
13.3.1 The k-d Tree k-d树 452
13.3.2 The PR quadtree PR四分树 457
13.3.3 Other Point Data Structures 其他点状数据结构 461
13.3.4 Other Spatial Data Structures 其他空间数据结构 463
13.4 Further Reading 深入学习导读 463
13.5 Exercises 习题 464
13.6 Projects 项目设计 465
Part V Theory of Algorithms 算法理论
Chapter 14 Analysis Techniques 分析技术 471
14.1 Summation Techniques 求和技术 472
14.2 Recurrence Relations 递归关系 477
14.2.1 Estimating Upper and Lower Bounds 估算上下限 477
14.2.2 Expanding Recurrences 扩展递归 480
14.2.3 Divide and Conquer Recurrences 分治法递归 482
14.2.4 Average-Case Analysis of Quicksort 快速排序平均情况分析 484
14.3 Amortized Analysis 均摊分析 486
14.4 Further Reading 深入学习导读 489
14.5 Exercises 习题 489
14.6 Projects 项目设计 493
Chapter 15 Lower Bounds 下限 495
15.1 Introduction to Lower Bounds Proofs 下限证明简介 496
15.2 Lower Bounds on Searching Lists 线性表检索的下限 498
15.2.1 Searching in Unsorted Lists 无序线性表检索 498
15.2.2 Searching in Sorted Lists 有序线性表检索 500
15.3 Finding the Maximum Value 查找最大值 501
15.4 Adversarial Lower Bounds Proofs 对抗性下限证明 503
15.5 State Space Lower Bounds Proofs 状态空间下限证明 506
15.6 Finding the ith Best Element 查找第i大元素 509
15.7 Optimal Sorting 优化排序 511
15.8 Further Reading 深入学习导读 514
15.9 Exercises 习题 514
15.10 Projects 项目设计 517
Chapter 16 Patterns of Algorithms 算法模式 519
16.1 Dynamic Programming 动态规划 519
16.1.1 The Knapsack Problem 背包问题 521
16.1.2 All-Pairs Shortest Paths 全局最短路径 523
16.2 Randomized Algorithms 随机算法 525
16.2.1 Randomized algorithms for finding large values 查找最大值的随机算法 525
16.2.2 Skip Lists 跳跃表 526
16.3 Numerical Algorithms 数值算法 532
16.3.1 Exponentiation 幂运算 533
16.3.2 Largest Common Factor 最大公约数 533
16.3.3 Matrix Multiplication 矩阵相乘 534
16.3.4 Random Numbers 随机数 536
16.3.5 The Fast Fourier Transform 快速傅里叶变换 537
16.4 Further Reading 深入学习导读 542
16.5 Exercises 习题 542
16.6 Projects 项目设计 543
Chapter 17 Limits to Computation 计算的限制 545
17.1 Reductions 归约 546
17.2 Hard Problems 难解问题 551
17.2.1 The Theory of NP-Completeness NP完全性理论 553
17.2.2 NP-Completeness Proofs NP完全性证明 557
17.2.3 Coping with NP-Complete Problems 处理NP完全性问题 562
17.3 Impossible Problems 不可解问题 565
17.3.1 Uncountability 不可数性 566
17.3.2 The Halting Problem Is Unsolvable 停机问题的不可解性 569
17.4 Further Reading 深入学习导读 571
17.5 Exercises 习题 572
17.6 Projects 项目设计 574
Part VI Appendix 附录
Appendix A Utility Functions 实用函数 579
Bibliography 参考文献 581
展开
导读
本书是关于数据结构与算法的经典教材,主要论述了数据的逻辑结构、存储结构、算法设计以及算法评价,使读者能够深入了解线性表、栈、队列、树、图等数据结构以及排序和检索等基础算法,并掌握在不同的数据结构上进行有关算法设计的思想和技巧。书中包含大量代码实例和详尽的解释,非常适合作为计算机专业本科生的低年级入门教材。
全书由浅入深,分成了五部分。针对计算机专业的低年级本科生而言,可重点学习前面三部分的内容,学有余力的读者可以在此基础上继续研读后面的内容。下面为读者介绍本书的重点内容。
基础知识
本书第1章介绍数据结构的入门基础知识。其中1.1节主要介绍数据结构的定义、基本术语以及数据结构的基本类型。读者可通过该节的学习,重点理解数据结构对抽象表达、程序建模的重要作用,了解在实际应用过程中数据结构的优缺点分析方法。1.2节重点介绍基于抽象数据类型的数据结构描述方法,这是整本书中各个数据结构的基本描述方法。读者可穿插翻阅后面第4章至第6章中各种数据结构的抽象数据类型,加深对此概念的理解。此外,通过学习1.4节,读者可进一步理解数据结构、问题与算法的关系。
第2章介绍本课程涉及的一些基础数学知识。这里介绍的绝大部分数学知识,读者都在中学阶段学过。读者可简单快速地浏览一下,用时再来回顾翻查。
第3章主要介绍算法复杂度的定义以及分析方法。通过学习这一章,读者就能了解渐近算法复杂度分析方法,理解上界分析、下界分析以及确切界的具体定义和快速分析方法,并掌握针对任意给定程序片段的基本分析方法。
数据结构
本书第II部分是整本教材的重点,主要讨论了线性表(List)、二叉树(Binary Tree)、普通树(General Tree)和图(Graph)的定义与实现方法。为了能对各类数据结构融会贯通,建议读者将第4章至第6章以及第11章放在一起来进行对比学习。
第4章主要介绍线性表的定义、存储实现方法,以及插入、删除和定位等基本运算;介绍了栈和队列的基本定义、基本操作,以及灵活运用线性表解决现实工程问题的方法。本章的重点是分析线性表的数组和链表这两种实现方法的优缺点,实现线性表的插入、删除和定位等基本运算的算法,掌握栈的入栈和出栈的操作以及队列的入队和出队的操作。本章的难点是掌握单链表及循环链表的算法设计综合应用,掌握在循环队列上进行入队、出队以及求队列长度的操作。
第5章主要阐述二叉树的基本定义和存储实现方式,介绍二叉树的遍历和搜索等基本操作,并讨论堆和哈夫曼树的基本原理和实现方法。本章的重点是基于链接和数组两种存储结构的二叉树实现方法,基于递归的二叉树遍历算法,以及二叉检索树的搜索、删除结点和添加结点等操作的实现。本章的难点是堆的构建,插入删除结点的算法实现,以及哈夫曼树和哈夫曼编码的原理。
第6章主要介绍普通树的基本定义、遍历算法,以及树的基本存储结构的原理。本章的重点是左子结点/右兄弟结点的实现方法,以及普通树和森林与二叉树的相互转换方法。本章的难点是普通树的顺序存储方法。
第11章主要介绍图的定义和实现方法,图的遍历算法、拓扑排序算法、最短路径算法以及最小生成树算法。本章的重点是图的邻接表和邻接矩阵两种实现方法的原理以及优缺点对比,基于递归函数的图的遍历算法的实现,拓扑排序算法的实现方法以及应用。本章的难点是Dijkstra最短路径算法、Floyd最短路径算法,以及最小支撑树算法。
排序与检索算法
第III部分主要介绍排序与检索这两类比较常见的算法,读者在掌握了前面所学数据结构的基础上,可以较为容易地掌握这些算法。
第7章和第8章主要介绍排序算法的定义、排序算法的稳定性分析方法、排序算法的设计与实现、排序算法的复杂度对比分析,以及排序算法的应用。这两章的重点是快速排序算法、归并排序算法以及堆排序算法的实现。这两章的难点是希尔排序算法的原理和实现,以及快速排序算法的改进方法。
第9章主要介绍检索的基本原理和二分法检索,讨论了哈希表的原理、构建以及相关操作。本章的重点是哈希表的原理、哈希函数的设计方法,以及哈希表的插入和删除算法的实现。本章的难点是哈希表的冲突解决策略。
第10章主要介绍线性索引、2-3树索引、B/B+树索引的构建方法,以及算法复杂度分析方法。本章的重点是线性索引和2-3树索引的插入、删除和检索算法,以及B树和B+树的定义。本章的难点是B树和B+树的插入、删除和检索算法。
至此,通过前三部分的学习,读者已掌握了数据结构的基本知识、理论和分析方法,为从事相关计算机程序分析和设计工作以及相关专业的后续课程学习打下了扎实的基础。读者在本书各章节的学习过程中,可动手实践对应的程序代码,掌握各类数据结构与算法的实现方法。华南理工大学计算机科学与工程学院的“数据结构”课程采用了本书作为教材。我们授课团队结合本书设计了相应的在线教学视频(https://next.xuetangx.com/course/SCUT08091000960/),感兴趣的读者可参考学习,定能起到事半功倍的作用。采用本书作为教材的授课教师,也可登录华信教育资源网(www.hxedu.com.cn)注册后免费下载我们的教学用PPT等相关资料。
吕建明教授 博导
华南理工大学计算机科学与工程学院
Preface
We study data structures so that we can learn to write more efficient programs. But why must programs be efficient when new computers are faster every year? The reason is that our ambitions grow with our capabilities. Instead of rendering efficiency needs obsolete, the modern revolution in computing power and storage capability merely raises the efficiency stakes as we attempt more complex tasks.
The quest for program efficiency need not and should not conflict with sound design and clear coding. Creating efficient programs has little to do with “programming tricks” but rather is based on good organization of information and good algorithms. A programmer who has not mastered the basic principles of clear design is not likely to write efficient programs. Conversely, concerns related to development costs and maintainability should not be used as an excuse to justify inefficient performance. Generality in design can and should be achieved without sacrificing performance, but this can only be done if the designer understands how to measure performance and does so as an integral part of the design and implementation process. Most computer science curricula recognize that good programming skills begin with a strong emphasis on fundamental software engineering principles. Then, once a programmer has learned the principles of clear program design and implementation, the next step is to study the effects of data organization and algorithms on program efficiency.
Approach: This book describes many techniques for representing data. These techniques are presented within the context of the following principles:
1. Each data structure and each algorithm has costs and benefits. Practitioners need a thorough understanding of how to assess costs and benefits to be able to adapt to new design challenges. This requires an understanding of the principles of algorithm analysis, and also an appreciation for the significant effects of the physical medium employed (e.g., data stored on disk versus main memory).
2. Related to costs and benefits is the notion of tradeoffs. For example, it is quite common to reduce time requirements at the expense of an increase in space requirements, or vice versa. Programmers face tradeoff issues regularly in all phases of software design and implementation, so the concept must become deeply ingrained.
3. Programmers should know enough about common practice to avoid reinventing the wheel. Thus, programmers need to learn the commonly used data structures, their related algorithms, and the most frequently encountered design patterns found in programming.
4. Data structures follow needs. Programmers must learn to assess application needs first, then find a data structure with matching capabilities. To do this requires competence in Principles 1, 2, and 3.
As I have taught data structures through the years, I have found that design issues have played an ever greater role in my courses. This can be traced through the various editions of this textbook by the increasing coverage for design patterns and generic interfaces. The first edition had no mention of design patterns. The second edition had limited coverage of a few example patterns, and introduced the dictionary ADT and comparator classes. With the third edition, there is explicit coverage of some design patterns that are encountered when programming the basic data structures and algorithms covered in the book.
Using the Book in Class: Data structures and algorithms textbooks tend to fall into one of two categories: teaching texts or encyclopedias. Books that attempt to do both usually fail at both. This book is intended as a teaching text. I believe it is more important for a practitioner to understand the principles required to select or design the data structure that will best solve some problem than it is to memorize a lot of textbook implementations. Hence, I have designed this as a teaching text that covers most standard data structures, but not all. A few data structures that are not widely adopted are included to illustrate important principles. Some relatively new data structures that should become widely used in the future are included.
Within an undergraduate program, this textbook is designed for use in either an advanced lower division (sophomore or junior level) data structures course, or for a senior level algorithms course. New material has been added in the third edition to support its use in an algorithms course. Normally, this text would be used in a course beyond the standard freshman level “CS2” course that often serves as the initial introduction to data structures. Readers of this book should typically have two semesters of the equivalent of programming experience, including at least some exposure to C++. Readers who are already familiar with recursion will have an advantage. Students of data structures will also benefit from having first completed a good course in Discrete Mathematics. Nonetheless, Chapter 2 attempts to give a reasonably complete survey of the prerequisite mathematical topics at the level necessary to understand their use in this book. Readers may wish to refer back to the appropriate sections as needed when encountering unfamiliar mathematical material.
A sophomore-level class where students have only a little background in basic data structures or analysis (that is, background equivalent to what would be had from a traditional CS2 course) might cover Chapters 1~11 in detail, as well as selected topics from Chapter 13. That is how I use the book for my own sophomore-level class. Students with greater background might cover Chapter 1, skip most of Chapter 2 except for reference, briefly cover Chapters 3 and 4, and then cover chapters 5~12 in detail. Again, only certain topics from Chapter 13 might be covered, depending on the programming assignments selected by the instructor. A senior-level algorithms course would focus on Chapters 11 and 14~17.
Chapter 13 is intended in part as a source for larger programming exercises. I recommend that all students taking a data structures course be required to implement some advanced tree structure, or another dynamic structure of comparable difficulty such as the skip list or sparse matrix representations of Chapter 12. None of these data structures are significantly more difficult to implement than the binary search tree, and any of them should be within a student's ability after completing Chapter 5.
While I have attempted to arrange the presentation in an order that makes sense, instructors should feel free to rearrange the topics as they see fit. The book has been written so that once the reader has mastered Chapters 1-6, the remaining material has relatively few dependencies. Clearly, external sorting depends on understanding internal sorting and disk files. Section 6.2 on the UNION/FIND algorithm is used in Kruskal's Minimum-Cost Spanning Tree algorithm. Section 9.2 on self-organizing lists mentions the buffer replacement schemes covered in Section 8.3. Chapter 14 draws on examples from throughout the book. Section 17.2 relies on knowledge of graphs. Otherwise, most topics depend only on material presented earlier within the same chapter.
Most chapters end with a section entitled “Further Reading.” These sections are not comprehensive lists of references on the topics presented. Rather, I include books and articles that, in my opinion, may prove exceptionally informative or entertaining to the reader. In some cases I include references to works that should become familiar to any well-rounded computer scientist.
Use of C++: The programming examples are written in C++, but I do not wish to discourage those unfamiliar with C++ from reading this book. I have attempted to make the examples as clear as possible while maintaining the advantages of C++. C++ is used here strictly as a tool to illustrate data structures concepts. In particular, I make use of C++’s support for hiding implementation details, including features such as classes, private class members, constructors, and destructors. These features of the language support the crucial concept of separating logical design, as embodied in the abstract data type, from physical implementation as embodied in the data structure.
To keep the presentation as clear as possible, some important features of C++ are avoided here. I deliberately minimize use of certain features commonly used by experienced C++ programmers such as class hierarchy, inheritance, and virtual functions. Operator and function overloading is used sparingly. C-like initialization syntax is preferred to some of the alternatives offered by C++.
While the C++ features mentioned above have valid design rationale in real programs, they tend to obscure rather than enlighten the principles espoused in this book. For example, inheritance is an important tool that helps programmers avoid duplication, and thus minimize bugs. From a pedagogical standpoint, however, inheritance often makes code examples harder to understand since it tends to spread the description for one logical unit among several classes. Thus, my class definitions only use inheritance where inheritance is explicitly relevant to the point illustrated (e.g., Section 5.3.1). This does not mean that a programmer should do likewise. Avoiding code duplication and minimizing errors are important goals. Treat the programming examples as illustrations of data structure principles, but do not copy them directly into your own programs.
One painful decision I had to make was whether to use templates in the code examples. In the first edition of this book, the decision was to leave templates out as it was felt that their syntax obscures the meaning of the code for those not familiar with C++. In the years following, the use of C++ in computer science curricula has greatly expanded. I now assume that readers of the text will be familiar with template syntax. Thus, templates are now used extensively in the code examples.
My implementations are meant to provide concrete illustrations of data structure principles, as an aid to the textual exposition. Code examples should not be read or used in isolation from the associated text because the bulk of each example's documentation is contained in the text, not the code. The code complements the text, not the other way around. They are not meant to be a series of commercial quality class implementations. If you are looking for a complete implementation of a standard data structure for use in your own code, you would do well to do an Internet search.
For instance, the code examples provide less parameter checking than is sound programming practice, since including such checking would obscure rather than illuminate the text. Some parameter checking and testing for other constraints (e.g., whether a value is being removed from an empty container) is included in the form of a call to Assert. The inputs to Assert are a Boolean expression and a character string. If this expression evaluates to false, then a message is printed and the program terminates immediately. Terminating a program when a function receives a bad parameter is generally considered undesirable in real programs, but is quite adequate for understanding how a data structure is meant to operate. In real programming applications, C++'s exception handling features should be used to deal with input data errors. However, assertions provide a simpler mechanism for indicating required conditions in a way that is both adequate for clarifying how a data structure is meant to operate, and is easily modified into true exception handling. See the Appendix for the implementation of Assert.
I make a distinction in the text between “C++ implementations” and “pseudocode.” Code labeled as a C++ implementation has actually been compiled and tested on one or more C++ compilers. Pseudocode examples often conform closely to C++ syntax, but typically contain one or more lines of higher-level description. Pseudocode is used where I perceived a greater pedagogical advantage to a simpler, but less precise, description.
Exercises and Projects: Proper implementation and analysis of data structures cannot be learned simply by reading a book. You must practice by implementing real programs, constantly comparing different techniques to see what really works best in a given situation.
One of the most important aspects of a course in data structures is that it is where students really learn to program using pointers and dynamic memory allocation, by implementing data structures such as linked lists and trees. It is often where students truly learn recursion. In our curriculum, this is the first course where students do significant design, because it often requires real data structures to motivate significant design exercises. Finally, the fundamental differences between memory-based and disk-based data access cannot be appreciated without practical programming experience. For all of these reasons, a data structures course cannot succeed without a significant programming component. In our department, the data structures course is one of the most difficult programming courses in the curriculum.
Students should also work problems to develop their analytical abilities. I provide over 450 exercises and suggestions for programming projects. I urge readers to take advantage of them. Contacting the Author and Supplementary Materials: A book such as this is sure to contain errors and have room for improvement. I welcome bug reports and constructive criticism. I can be reached by electronic mail via the Internet at shaffer@vt.edu. Alternatively, comments can be mailed to
Cliff Shaffer
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061
The electronic posting of this book, along with a set of lecture notes for use in class can be obtained at http://www.cs.vt.edu/~shaffer/book.html
The code examples used in the book are available at the same site. Online Web pages for Virginia Tech's sophomore-level data structures class can be found at http://courses.cs.vt.edu/~cs3114
Readers of this textbook will be interested in our open-source, online eTextbook project, OpenDSA (http://algoviz.org/OpenDSA). The OpenDSA project's goal is to create a complete collection of tutorials that combine textbook quality content with algorithm visualizations for every algorithm and data structure, and a rich collection of interactive exercises. When complete, OpenDSA will replace this book.
This book was typeset by the author using LATEX. The bibliography was prepared using BIBTEX. The index was prepared using makeindex. The figures were mostly drawn with Xfig. Figures 3.1 and 9.10 were partially created using Mathematica.
Acknowledgments: It takes a lot of help from a lot of people to make a book. I wish to acknowledge a few of those who helped to make this book possible. I apologize for the inevitable omissions.
Virginia Tech helped make this whole thing possible through sabbatical research leave during Fall 1994, enabling me to get the project off the ground. My department heads during the time I have written the various editions of this book, Dennis Kafura and Jack Carroll, provided unwavering moral support for this project. Mike Keenan, Lenny Heath, and Jeff Shaffer provided valuable input on early versions of the chapters. I also wish to thank Lenny Heath for many years of stimulating discussions about algorithms and analysis (and how to teach both to students). Steve Edwards deserves special thanks for spending so much time helping me on various redesigns of the C++ and Java code versions for the second and third editions, and many hours of discussion on the principles of program design. Thanks to Layne Watson for his help with Mathematica, and to Bo Begole, Philip Isenhour, Jeff Nielsen, and Craig Struble for much technical assistance. Thanks to Bill McQuain, Mark Abrams and Dennis Kafura for answering lots of silly questions about C++ and Java.
I am truly indebted to the many reviewers of the various editions of this manuscript. For the first edition these reviewers included J. David Bezek (University of Evansville), Douglas Campbell (Brigham Young University), Karen Davis (University of Cincinnati), Vijay Kumar Garg (University of Texas, Austin), Jim Miller (University of Kansas), Bruce Maxim (University of Michigan, Dearborn), Jeff Parker (Agile Networks/Harvard), Dana Richards (George Mason University), Jack Tan (University of Houston), and Lixin Tao (Concordia University). Without their help, this book would contain many more technical errors and many fewer insights.
For the second edition, I wish to thank these reviewers: Gurdip Singh (Kansas State University), Peter Allen (Columbia University), Robin Hill (University of Wyoming), Norman Jacobson (University of California, Irvine), Ben Keller (Eastern Michigan University), and Ken Bosworth (Idaho State University). In addition, I wish to thank Neil Stewart and Frank J. Thesen for their comments and ideas for improvement.
Third edition reviewers included Randall Lechlitner (University of Houstin, Clear Lake) and Brian C. Hipp (York Technical College). I thank them for their comments.
Prentice Hall was the original print publisher for the first and second editions. Without the hard work of many people there, none of this would be possible. Authors simply do not create printer-ready books on their own. Foremost thanks go to Kate Hargett, Petra Rector, Laura Steele, and Alan Apt, my editors over the years. My production editors, Irwin Zucker for the second edition, Kathleen Caren for the original C++ version, and Ed DeFelippis for the Java version, kept everything moving smoothly during that horrible rush at the end. Thanks to Bill Zobrist and Bruce Gregory (I think) for getting me into this in the first place. Others at Prentice Hall who helped me along the way include Truly Donovan, Linda Behrens, and Phyllis Bregman. Thanks to Tracy Dunkelberger for her help in returning the copyright to me, thus enabling the electronic future of this work. I am sure I owe thanks to many others at Prentice Hall for their help in ways that I am not even aware of.
I am thankful to Shelley Kronzek at Dover publications for her faith in taking on the print publication of this third edition. Much expanded, with both Java and C++ versions, and many inconsistencies corrected, I am confident that this is the best edition yet. But none of us really knows whether students will prefer a free online textbook or a low-cost, printed bound version. In the end, we believe that the two formats will be mutually supporting by offering more choices. Production editor James Miller and design manager Marie Zaczkiewicz have worked hard to ensure that the production is of the highest quality.
I wish to express my appreciation to Hanan Samet for teaching me about data structures. I learned much of the philosophy presented here from him as well, though he is not responsible for any problems with the result. Thanks to my wife Terry, for her love and support, and to my daughters Irena and Kate for pleasant diversions from working too hard. Finally, and most importantly, to all of the data structures students over the years who have taught me what is important and what should be skipped in a data structures course, and the many new insights they have provided. This book is dedicated to them.
Cliff Shaffer
Blacksburg, Virginia
展开