Using Box<T> to Point to Data on the Heap
The most straightforward smart pointer is a box, whose type is written Box<T>. Boxes allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data. Refer to Chapter 4 to review the difference between the stack and the heap.
Boxes don’t have performance overhead, other than storing their data on the heap instead of on the stack. But they don’t have many extra capabilities either. You’ll use them most often in these situations:
- When you have a type whose size can’t be known at compile time and you want to use a value of that type in a context that requires an exact size
- When you have a large amount of data and you want to transfer ownership but ensure the data won’t be copied when you do so
- When you want to own a value and you care only that it’s a type that implements a particular trait rather than being of a specific type
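
The third situation in the list can be sketched briefly. The Shape trait and the Circle and Rect types below are hypothetical names invented for this illustration; the point is only that a Box<dyn Shape> owns a value while knowing nothing about it except that it implements the trait.

```rust
// Hypothetical trait and types, invented for this sketch.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Rect {
    width: f64,
    height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// Each element is owned through a box; the function only relies on `Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    // Values of two different concrete types live in the same vector.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Rect { width: 2.0, height: 3.0 }),
    ];
    println!("total area = {:.2}", total_area(&shapes));
}
```

Because Circle and Rect have different sizes, the vector could not hold them directly; the boxes give every element the same pointer-sized footprint.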
We’ll demonstrate the first situation in the “Enabling Recursive Types with Boxes” section. In the second case, transferring ownership of a large amount of data can take a long time because the data is copied around on the stack. To improve performance in this situation, we can store the large amount of data on the heap in a box. Then, only the small amount of pointer data is copied around on the stack, while the data it references stays in one place on the heap. The third case is known as a trait object, and Chapter 17 devotes an entire section, “Using Trait Objects That Allow for Values of Different Types,” just to that topic. So what you learn here you’ll apply again in Chapter 17!
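
To make the second case concrete, here is a minimal sketch. The Big alias, its 8 KiB size, and the pass_along function are assumptions chosen for illustration. It compares the heap address of the payload before and after ownership moves into and out of a function.

```rust
use std::mem::size_of;

// 8 KiB of payload; the size is an arbitrary choice for this sketch.
type Big = [u64; 1024];

// Hypothetical function that takes ownership of the box and hands it back.
// Only the pointer travels through the call; the payload stays on the heap.
fn pass_along(data: Box<Big>) -> Box<Big> {
    data
}

fn main() {
    let boxed: Box<Big> = Box::new([0; 1024]);
    let before = boxed.as_ptr(); // address of the payload on the heap

    let moved = pass_along(boxed); // ownership moves in and back out
    let after = moved.as_ptr();

    println!("payload size: {} bytes", size_of::<Big>());
    println!("box size:     {} bytes", size_of::<Box<Big>>());
    println!("payload stayed in place: {}", before == after);
}
```

The box itself is the size of one pointer, so each move copies only that pointer rather than the whole array.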
Using a Box<T> to Store Data on the Heap
Before we discuss the heap storage use case for Box<T>, we’ll cover the syntax and how to interact with values stored within a Box<T>.
Listing 15-1 shows how to use a box to store an i32 value on the heap:
Filename: src/main.rs
fn main() {
    let b = Box::new(5);
    println!("b = {b}");
}
We define the variable b to have the value of a Box that points to the value 5, which is allocated on the heap. This program will print b = 5; in this case, we can access the data in the box similar to how we would if this data were on the stack. Just like any owned value, when a box goes out of scope, as b does at the end of main, it will be deallocated. The deallocation happens both for the box (stored on the stack) and the data it points to (stored on the heap).
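
As a small extension of Listing 15-1, the sketch below reads the boxed value with the dereference operator and uses an inner scope to show where a box is freed. The add_one function is a hypothetical helper, not part of the original listing.

```rust
// Hypothetical helper: takes ownership of the box and reads the i32 inside
// with `*`, the same operator used on references.
fn add_one(b: Box<i32>) -> i32 {
    *b + 1
} // `b` goes out of scope here, so its heap allocation is freed

fn main() {
    let b = Box::new(5);
    println!("b = {b}"); // printing works as it would for a plain i32

    let sum = *b + 10; // explicit dereference to reach the value on the heap
    println!("sum = {sum}");

    {
        let inner = Box::new(42);
        println!("inner = {inner}");
    } // `inner` goes out of scope: both the pointer and the heap data are released

    println!("b + 1 = {}", add_one(b)); // `b` is moved into the function
}
```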
Putting a single value on the heap isn’t very useful, so you won’t use boxes by themselves in this way very often. Having values like a single i32 on the stack, where they’re stored by default, is more appropriate in the majority of situations. Let’s look at a case where boxes allow us to define types that we wouldn’t be allowed to if we didn’t have boxes.
Enabling Recursive Types with Boxes
A value of recursive type can have another value of the same type as part of itself. Recursive types pose an issue because at compile time Rust needs to know how much space a type takes up. However, the nesting of values of recursive types could theoretically continue infinitely, so Rust can’t know how much space the value needs. Because boxes have a known size, we can enable recursive types by inserting a box in the recursive type definition.
As an example of a recursive type, let’s explore the cons list. This is a data type commonly found in functional programming languages. The cons list type we’ll define is straightforward except for the recursion; therefore, the concepts in the example we’ll work with will be useful any time you get into more complex situations involving recursive types.
More Information About the Cons List
A cons list is a data structure that comes from the Lisp programming language and its dialects. It is made up of nested pairs and is the Lisp version of a linked list. Its name comes from the cons function (short for “construct function”) in Lisp that constructs a new pair from its two arguments. By calling cons on a pair consisting of a value and another pair, we can construct cons lists made up of recursive pairs.
For example, here’s a pseudocode representation of a cons list containing the list 1, 2, 3 with each pair in parentheses:
(1, (2, (3, Nil)))
Each item in a cons list contains two elements: the value of the current item and the next item. The last item in the list contains only a value called Nil without a next item. A cons list is produced by recursively calling the cons function. The canonical name to denote the base case of the recursion is Nil. Note that this is not the same as the “null” or “nil” concept in Chapter 6, which is an invalid or absent value.
The cons list isn’t a commonly used data structure in Rust. Most of the time when you have a list of items in Rust, Vec<T> is a better choice to use. Other, more complex recursive data types are useful in various situations, but by starting with the cons list in this chapter, we can explore how boxes let us define a recursive data type without much distraction.
Listing 15-2 contains an enum definition for a cons list. Note that this code won’t compile yet because the List type doesn’t have a known size, which we’ll demonstrate.
Filename: src/main.rs
enum List {
    Cons(i32, List),
    Nil,
}
Note: We’re implementing a cons list that holds only i32 values for the purposes of this example. We could have implemented it using generics, as we discussed in Chapter 10, to define a cons list type that could store values of any type.
Using the List type to store the list 1, 2, 3 would look like the code in Listing 15-3:
Filename: src/main.rs
use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Cons(2, Cons(3, Nil)));
}
The first Cons value holds 1 and another List value. This List value is another Cons value that holds 2 and another List value. This List value is one more Cons value that holds 3 and a List value, which is finally Nil, the non-recursive variant that signals the end of the list.
If we try to compile the code in Listing 15-3, we get the error shown in Listing 15-4:
$ cargo run
   Compiling cons-list v0.1.0 (file:///projects/cons-list)
error[E0072]: recursive type `List` has infinite size
 --> src/main.rs:1:1
  |
1 | enum List {
  | ^^^^^^^^^
2 |     Cons(i32, List),
  |               ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +

For more information about this error, try `rustc --explain E0072`.
error: could not compile `cons-list` (bin "cons-list") due to 1 previous error
The error shows this type “has infinite size.” The reason is that we’ve defined List with a variant that is recursive: it holds another value of itself directly. As a result, Rust can’t figure out how much space it needs to store a List value. Let’s break down why we get this error. First, we’ll look at how Rust decides how much space it needs to store a value of a non-recursive type.
Computing the Size of a Non-Recursive Type
Recall the Message enum we defined in Listing 6-2 when we discussed enum definitions in Chapter 6.
To determine how much space to allocate for a Message value, Rust goes through each of the variants to see which variant needs the most space. Rust sees that Message::Quit doesn’t need any space, Message::Move needs enough space to store two i32 values, and so forth. Because only one variant will be used, the most space a Message value will need is the space it would take to store the largest of its variants.
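
The size reasoning above can be checked with std::mem::size_of. This sketch assumes the four-variant Message definition from Listing 6-2; only Quit and Move are named in this section, so the Write and ChangeColor variants are reproduced from that listing as an assumption. Exact byte counts depend on the target platform, so none are stated here.

```rust
use std::mem::size_of;

// Assumed to match Listing 6-2; the fields are never read in this sketch.
#[allow(dead_code)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

fn main() {
    // Space needed by the payload of each variant that carries data.
    println!("Move payload:        {} bytes", size_of::<(i32, i32)>());
    println!("Write payload:       {} bytes", size_of::<String>());
    println!("ChangeColor payload: {} bytes", size_of::<(i32, i32, i32)>());

    // The enum as a whole must fit its largest variant.
    println!("Message:             {} bytes", size_of::<Message>());
}
```

Whatever the platform, the size reported for Message is at least the size of its largest payload, which is the rule the paragraph describes.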
Contrast this with what happens when Rust tries to determine how much space a recursive type like the List enum in Listing 15-2 needs. The compiler starts by looking at the Cons variant, which holds a value of type i32 and a value of type List. Therefore, Cons needs an amount of space equal to the size of an i32 plus the size of a List. To figure out how much memory the List type needs, the compiler looks at the variants, starting with the Cons variant. The Cons variant holds a value of type i32 and a value of type List, and this process continues infinitely, as shown in Figure 15-1.
Using Box<T> to Get a Recursive Type with a Known Size
Because Rust can’t figure out how much space to allocate for recursively defined types, the compiler gives an error with this helpful suggestion:
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +
In this suggestion, “indirection” means that instead of storing a value directly, we should change the data structure to store the value indirectly by storing a pointer to the value instead.
Because a Box<T> is a pointer, Rust always knows how much space a Box<T> needs: a pointer’s size doesn’t change based on the amount of data it’s pointing to. This means we can put a Box<T> inside the Cons variant instead of another List value directly. The Box<T> will point to the next List value that will be on the heap rather than inside the Cons variant. Conceptually, we still have a list, created with lists holding other lists, but this implementation is now more like placing the items next to one another rather than inside one another.
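
The claim that a pointer’s size doesn’t depend on what it points to is easy to check. In this sketch, Small and Large are hypothetical types made up for the comparison; both boxes should report the same size as a usize on the target platform.

```rust
use std::mem::size_of;

// Hypothetical payload types of very different sizes; never constructed here.
#[allow(dead_code)]
struct Small(u8);

#[allow(dead_code)]
struct Large([u8; 4096]);

fn main() {
    println!("Small:      {} bytes", size_of::<Small>());
    println!("Large:      {} bytes", size_of::<Large>());

    // Both boxes are a single pointer, whatever they point to.
    println!("Box<Small>: {} bytes", size_of::<Box<Small>>());
    println!("Box<Large>: {} bytes", size_of::<Box<Large>>());
    println!("usize:      {} bytes", size_of::<usize>());
}
```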
We can change the definition of the List enum in Listing 15-2 and the usage of the List in Listing 15-3 to the code in Listing 15-5, which will compile:
Filename: src/main.rs
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}
The Cons variant needs the size of an i32 plus the space to store the box’s pointer data. The Nil variant stores no values, so it needs less space than the Cons variant. We now know that any List value will take up the size of an i32 plus the size of a box’s pointer data (plus any padding the compiler adds for alignment). By using a box, we’ve broken the infinite, recursive chain, so the compiler can figure out the size it needs to store a List value. Figure 15-2 shows what the Cons variant looks like now.
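
As a usage sketch for the boxed List from Listing 15-5, the code below walks the list and reports the sizes involved. The sum function is a hypothetical helper added for illustration, and the printed byte counts depend on the target platform.

```rust
use std::mem::size_of;

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: adds up the values by following each box to the next node.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));

    // The compiler can now compute a finite size for `List`.
    println!("i32:       {} bytes", size_of::<i32>());
    println!("Box<List>: {} bytes", size_of::<Box<List>>());
    println!("List:      {} bytes", size_of::<List>());
}
```

Each recursive call receives a &Box<List>, which Rust converts to &List automatically, so the traversal reads as if the nodes were nested directly.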
Boxes provide only the indirection and heap allocation; they don’t have any other special capabilities, like those we’ll see with the other smart pointer types. They also don’t have the performance overhead that these special capabilities incur, so they can be useful in cases like the cons list where the indirection is the only feature we need. We’ll look at more use cases for boxes in Chapter 17, too.
The Box<T> type is a smart pointer because it implements the Deref trait, which allows Box<T> values to be treated like references. When a Box<T> value goes out of scope, the heap data that the box is pointing to is cleaned up as well because of the Drop trait implementation. These two traits will be even more important to the functionality provided by the other smart pointer types we’ll discuss in the rest of this chapter. Let’s explore these two traits in more detail.
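
Both behaviors can be observed in a few lines. In this sketch, Noisy, DROPS, and shout are hypothetical names introduced only to make the effects visible: shout accepts a plain &str, and Noisy counts how many times its Drop implementation runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter so that drops can be observed from the outside.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type whose `Drop` implementation announces itself.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Takes an ordinary reference; it knows nothing about boxes.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    // Deref: `&Box<String>` is accepted where `&str` is expected.
    let boxed = Box::new(String::from("hello"));
    println!("{}", shout(&boxed));

    // Drop: when the box leaves scope, the value inside is dropped and
    // the heap allocation is released.
    {
        let _noisy = Box::new(Noisy("boxed value"));
    }
    println!("drops so far: {}", DROPS.load(Ordering::SeqCst));
}
```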