Parsing Expressions 解析表达式
Grammar, which knows how to control even kings. — Molière
语法,它甚至知道如何支配国王。——莫里哀
This chapter marks the first major milestone of the
book. Many of us have cobbled together a mishmash of regular expressions and
substring operations to extract some sense out of a pile of text. The code was
probably riddled with bugs and a beast to maintain. Writing a real parser—one with decent error handling, a coherent internal structure, and the ability
to robustly chew through a sophisticated syntax—is considered a rare,
impressive skill. In this chapter, you will attain
it.
本章标志着本书的第一个重要里程碑。我们中的许多人曾拼凑过一堆正则表达式和子字符串操作,试图从一堆文本中提取出有意义的信息。这些代码可能漏洞百出,维护起来如同驯服野兽。编写一个真正的解析器——具备良好的错误处理、一致的内部结构,以及能够稳健处理复杂语法的能力——被视为一项罕见且令人印象深刻的技能。在本章中,你将掌握这一技能。
It’s easier than you think, partially because we front-loaded a lot of the hard
work in the last chapter. You already know your way around a formal grammar.
You’re familiar with syntax trees, and we have some Java classes to represent
them. The only remaining piece is parsing—transmogrifying a sequence of
tokens into one of those syntax trees.
这比你想象的要容易,部分原因在于我们在上一章已经完成了许多繁重的工作。你已经掌握了形式语法的基本知识,熟悉了语法树,并且我们有一些 Java 类来表示它们。剩下的唯一部分就是解析——将一系列标记转换为这些语法树之一。
Some CS textbooks make a big deal out of parsers. In the ’60s, computer
scientists—understandably tired of programming in assembly language—started designing more sophisticated, human-friendly
languages like Fortran and ALGOL. Alas, they weren’t very machine-friendly
for the primitive computers of the time.
一些计算机科学教材对解析器大书特书。在 60 年代,计算机科学家们——显然厌倦了用汇编语言编程——开始设计更复杂、更人性化的语言,如 Fortran 和 ALGOL。可惜的是,这些语言对当时原始的计算机并不十分友好。
These pioneers designed languages that they honestly weren’t even sure how to
write compilers for, and then did groundbreaking work inventing parsing and
compiling techniques that could handle these new, big languages on those old, tiny
machines.
这些先驱者设计了他们甚至不确定如何编写编译器的语言,然后进行了开创性的工作,发明了能够在那些老旧的小型机器上处理这些新的大型语言的解析和编译技术。
Classic compiler books read like fawning hagiographies of these heroes and their
tools. The cover of Compilers: Principles, Techniques, and Tools literally has
a dragon labeled “complexity of compiler design” being slain by a knight bearing
a sword and shield branded “LALR parser generator” and “syntax directed
translation”. They laid it on thick.
经典的编译器书籍读起来就像是对这些英雄及其工具的阿谀奉承的圣徒传记。《编译器:原理、技术与工具》的封面上,确实有一条被标记为“编译器设计的复杂性”的龙,正被一位手持标有“LALR 解析器生成器”和“语法导向翻译”的剑与盾的骑士所斩杀。他们描绘得相当夸张。
A little self-congratulation is well-deserved, but the truth is you don’t need
to know most of that stuff to bang out a high quality parser for a modern
machine. As always, I encourage you to broaden your education and take it in
later, but this book omits the trophy case.
适度的自我祝贺是应得的,但事实上,你并不需要了解大部分内容就能为现代机器编写出高质量的解析器。一如既往,我鼓励你拓宽知识面并在之后深入学习,但本书略去了那些荣誉展示。
6.1 Ambiguity and the Parsing Game
歧义与解析游戏
In the last chapter, I said you can “play” a context-free grammar like a game in
order to generate strings. Parsers play that game in reverse. Given a string—a series of tokens—we map those tokens to terminals in the grammar to
figure out which rules could have generated that string.
在上一章中,我说过你可以像玩游戏一样“玩”一个上下文无关文法来生成字符串。解析器则逆向进行这个游戏。给定一个字符串——一系列标记——我们将这些标记映射到文法中的终结符,以找出哪些规则可能生成了该字符串。
The “could have” part is interesting. It’s entirely possible to create a grammar
that is ambiguous, where different choices of productions can lead to the same
string. When you’re using the grammar to generate strings, that doesn’t matter
much. Once you have the string, who cares how you got to it?
“可能”这部分很有趣。完全有可能创建一个歧义语法,其中不同的产生式选择可能导致相同的字符串。当你使用语法生成字符串时,这并不重要。一旦你有了字符串,谁在乎你是怎么得到的呢?
When parsing, ambiguity means the parser may misunderstand the user’s code. As
we parse, we aren’t just determining if the string is valid Lox code, we’re
also tracking which rules match which parts of it so that we know what part of
the language each token belongs to. Here’s the Lox expression grammar we put
together in the last chapter:
在解析时,歧义意味着解析器可能会误解用户的代码。当我们进行解析时,我们不仅仅是在确定字符串是否为有效的 Lox 代码,我们还在跟踪哪些规则匹配了它的哪些部分,以便我们知道每个标记属于语言的哪一部分。这是我们上一章整理的 Lox 表达式语法:
expression → literal
           | unary
           | binary
           | grouping ;
literal    → NUMBER | STRING | "true" | "false" | "nil" ;
grouping   → "(" expression ")" ;
unary      → ( "-" | "!" ) expression ;
binary     → expression operator expression ;
operator   → "==" | "!=" | "<" | "<=" | ">" | ">=" | "+" | "-" | "*" | "/" ;
This is a valid string in that grammar:
这是一个符合该语法的有效字符串:

6 / 3 - 1
But there are two ways we could have generated it. One way is:
但我们有两种生成方式。一种是:
- Starting at expression, pick binary.
  从 expression 开始,选择 binary。
- For the left-hand expression, pick NUMBER, and use 6.
  对于左侧的 expression,选择 NUMBER,并使用 6。
- For the operator, pick "/".
  对于运算符,选择 "/"。
- For the right-hand expression, pick binary again.
  对于右侧的 expression,再次选择 binary。
- In that nested binary expression, pick 3 - 1.
  在该嵌套的 binary 表达式中,选择 3 - 1。
Another is: 另一个是:
- Starting at expression, pick binary.
  从 expression 开始,选择 binary。
- For the left-hand expression, pick binary again.
  对于左侧的 expression,再次选择 binary。
- In that nested binary expression, pick 6 / 3.
  在该嵌套的 binary 表达式中,选择 6 / 3。
- Back at the outer binary, for the operator, pick "-".
  回到外层的 binary,对于运算符,选择 "-"。
- For the right-hand expression, pick NUMBER, and use 1.
  对于右侧的 expression,选择 NUMBER,并使用 1。
Those produce the same strings, but not the same syntax trees.
它们产生相同的字符串,但语法树并不相同。
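The figure showing the two trees isn't reproduced here. As a rough stand-in, here is a sketch (a hypothetical demo class, not part of jlox) that builds both shapes using the Expr and Token classes from earlier chapters. Evaluated the obvious way, the first tree gives 1 and the second gives 3, which is why the ambiguity matters.

package com.craftinginterpreters.lox;

// Sketch only: the two syntax trees the ambiguous grammar allows for "6 / 3 - 1".
class AmbiguityDemo {
  public static void main(String[] args) {
    Token slash = new Token(TokenType.SLASH, "/", null, 1);
    Token minus = new Token(TokenType.MINUS, "-", null, 1);

    // (6 / 3) - 1, which evaluates to 1.
    Expr first = new Expr.Binary(
        new Expr.Binary(new Expr.Literal(6.0), slash, new Expr.Literal(3.0)),
        minus,
        new Expr.Literal(1.0));

    // 6 / (3 - 1), which evaluates to 3.
    Expr second = new Expr.Binary(
        new Expr.Literal(6.0),
        slash,
        new Expr.Binary(new Expr.Literal(3.0), minus, new Expr.Literal(1.0)));

    System.out.println(new AstPrinter().print(first));   // (- (/ 6.0 3.0) 1.0)
    System.out.println(new AstPrinter().print(second));  // (/ 6.0 (- 3.0 1.0))
  }
}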
In other words, the grammar allows seeing the expression as (6 / 3) - 1
or 6 / (3 - 1)
. The binary
rule lets operands nest any which way you want. That in
turn affects the result of evaluating the parsed tree. The way mathematicians
have addressed this ambiguity since blackboards were first invented is by
defining rules for precedence and associativity.
换句话说,语法允许将表达式视为 (6 / 3) - 1
或 6 / (3 - 1)
。 binary
规则允许操作数以任何方式嵌套。这反过来又影响了解析树评估的结果。自黑板发明以来,数学家们解决这种歧义的方法是通过定义优先级和结合性的规则。
- Precedence determines which operator is evaluated first in an expression containing a mixture of different operators. Precedence rules tell us that we evaluate the / before the - in the above example. Operators with higher precedence are evaluated before operators with lower precedence. Equivalently, higher precedence operators are said to “bind tighter”.
  优先级决定了在包含不同运算符的表达式中,哪个运算符首先被求值。优先级规则告诉我们,在上面的例子中,我们先对 / 求值,再对 - 求值。具有较高优先级的运算符在较低优先级的运算符之前被求值。同样地,较高优先级的运算符被称为“绑定更紧密”。

- Associativity determines which operator is evaluated first in a series of the same operator. When an operator is left-associative (think “left-to-right”), operators on the left evaluate before those on the right. Since - is left-associative, this expression:
  结合性决定了在一系列相同运算符中哪个运算符首先被求值。当运算符是左结合(即“从左到右”)时,左边的运算符先于右边的运算符求值。由于 - 是左结合的,因此这个表达式:

  5 - 3 - 1

  is equivalent to: 相当于:

  (5 - 3) - 1

  Assignment, on the other hand, is right-associative. This:
  另一方面,赋值是右结合的。如下所示:

  a = b = c

  is equivalent to: 相当于:

  a = (b = c)
Without well-defined precedence and associativity, an expression that uses
multiple operators is ambiguous—it can be parsed into different syntax trees,
which could in turn evaluate to different results. We’ll fix that in Lox by
applying the same precedence rules as C, going from lowest to highest.
如果没有明确的优先级和结合性,使用多个运算符的表达式就会变得模糊不清——它可能被解析成不同的语法树,进而可能计算出不同的结果。我们将在 Lox 中通过应用与 C 语言相同的优先级规则来解决这个问题,从最低到最高。
Name 名称       | Operators | Associates
Equality 相等   | == !=     | Left 左
Comparison 比较 | > >= < <= | Left 左
Term            | - +       | Left 左
Factor 因子     | / *       | Left 左
Unary 一元      | ! -       | Right
Right now, the grammar stuffs all expression types into a single expression
rule. That same rule is used as the non-terminal for operands, which lets the
grammar accept any kind of expression as a subexpression, regardless of whether
the precedence rules allow it.
目前,语法将所有表达式类型都塞进了一个单一的 expression
规则中。该规则同样被用作操作数的非终结符,这使得语法能够接受任何类型的表达式作为子表达式,而不管优先级规则是否允许。
We fix that by stratifying the grammar. We define a
separate rule for each precedence level.
我们通过分层语法来解决这个问题。我们为每个优先级级别定义了一个单独的规则。
expression → ...
equality   → ...
comparison → ...
term       → ...
factor     → ...
unary      → ...
primary    → ...
Each rule here only matches expressions at its precedence level or higher. For
example, unary
matches a unary expression like !negated
or a primary
expression like 1234
. And term
can match 1 + 2
but also 3 * 4 / 5
. The
final primary
rule covers the highest-precedence forms—literals and
parenthesized expressions.
这里的每条规则仅匹配其优先级或更高优先级的表达式。例如, unary
可以匹配像 !negated
这样的一元表达式,或者像 1234
这样的基本表达式。而 term
可以匹配 1 + 2
,也可以匹配 3 * 4 / 5
。最后的 primary
规则涵盖了最高优先级的表达式形式——字面量和括号内的表达式。
We just need to fill in the productions for each of those rules. We’ll do the
easy ones first. The top expression
rule matches any expression at any
precedence level. Since equality
has the lowest
precedence, if we match that, then it covers everything.
我们只需要为每个规则填写产生式。我们先从简单的开始。顶部的 expression
规则匹配任何优先级下的任何表达式。由于 equality
具有最低的优先级,如果我们匹配它,那么它就覆盖了所有情况。
expression → equality ;
Over at the other end of the precedence table, a primary expression contains
all the literals and grouping expressions.
在优先级表的另一端,主表达式包含所有字面量和分组表达式。
primary → NUMBER | STRING | "true" | "false" | "nil" | "(" expression ")" ;
A unary expression starts with a unary operator followed by the operand. Since
unary operators can nest—!!true
is a valid if weird expression—the
operand can itself be a unary operator. A recursive rule handles that nicely.
一元表达式以一元运算符开头,后跟操作数。由于一元运算符可以嵌套—— !!true
是一个有效但奇怪的表达式——操作数本身也可以是一元运算符。递归规则很好地处理了这种情况。
unary → ( "!" | "-" ) unary ;
But this rule has a problem. It never terminates.
但这条规则有一个问题。它永远不会终止。
Remember, each rule needs to match expressions at that precedence level or
higher, so we also need to let this match a primary expression.
记住,每条规则都需要匹配该优先级或更高优先级的表达式,因此我们还需要让它匹配一个基本表达式。
unary → ( "!" | "-" ) unary | primary ;
That works. 那行得通。
The remaining rules are all binary operators. We’ll start with the rule for
multiplication and division. Here’s a first try:
剩下的规则都是二元运算符。我们将从乘法和除法的规则开始。这是第一次尝试:
factor → factor ( "/" | "*" ) unary | unary ;
The rule recurses to match the left operand. That enables the rule to match a
series of multiplication and division expressions like 1 * 2 / 3
. Putting the
recursive production on the left side and unary
on the right makes the rule
left-associative and unambiguous.
该规则递归地匹配左操作数。这使得规则能够匹配一系列乘法和除法表达式,如 1 * 2 / 3
。将递归产生式放在左侧, unary
放在右侧,使规则具有左结合性且无歧义。
All of this is correct, but the fact that the first symbol in the body of the
rule is the same as the head of the rule means this production is
left-recursive. Some parsing techniques, including the one we’re going to
use, have trouble with left recursion. (Recursion elsewhere, like we have in
unary
and the indirect recursion for grouping in primary
are not a problem.)
所有这些都是正确的,但规则体中的第一个符号与规则头相同,这意味着这个产生式是左递归的。一些解析技术,包括我们将要使用的技术,在处理左递归时会遇到困难。(其他地方的递归,如我们在 unary
中的递归以及 primary
中用于分组的间接递归,则没有问题。)
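To see concretely what the trouble is, here is a sketch (illustrative only, not part of jlox, and leaning on the parser helpers defined later in this chapter) of a naive recursive descent translation of the left-recursive rule. The very first thing the method does is call itself without having consumed a single token, so it can never make progress.

  // Illustrative only: naively translating "factor → factor ( "/" | "*" ) unary"
  // gives a method that immediately recurses with no progress, eventually
  // overflowing the stack. (It also has no way to choose the "| unary" alternative.)
  private Expr factorLeftRecursive() {
    Expr expr = factorLeftRecursive();   // left recursion: nothing consumed yet
    while (match(SLASH, STAR)) {
      Token operator = previous();
      Expr right = unary();
      expr = new Expr.Binary(expr, operator, right);
    }
    return expr;
  }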
There are many grammars you can define that match the same language. The choice
for how to model a particular language is partially a matter of taste and
partially a pragmatic one. This rule is correct, but not optimal for how we
intend to parse it. Instead of a left recursive rule, we’ll use a different one.
有许多语法可以定义来匹配同一种语言。如何为特定语言建模的选择部分取决于个人偏好,部分则是出于实用考虑。这条规则是正确的,但对于我们预期的解析方式来说并非最优。我们将使用另一条规则,而非左递归规则。
factor → unary ( ( "/" | "*" ) unary )* ;
We define a factor expression as a flat sequence of multiplications
and divisions. This matches the same syntax as the previous rule, but better
mirrors the code we’ll write to parse Lox. We use the same structure for all of
the other binary operator precedence levels, giving us this complete expression
grammar:
我们将因子表达式定义为一系列连续的乘法和除法运算。这与前一条规则的语法相匹配,但更好地反映了我们将编写的用于解析 Lox 的代码。我们对所有其他二元运算符优先级级别采用相同的结构,从而得到以下完整的表达式语法:
expression → equality ; equality → comparison ( ( "!=" | "==" ) comparison )* ; comparison → term ( ( ">" | ">=" | "<" | "<=" ) term )* ; term → factor ( ( "-" | "+" ) factor )* ; factor → unary ( ( "/" | "*" ) unary )* ; unary → ( "!" | "-" ) unary | primary ; primary → NUMBER | STRING | "true" | "false" | "nil" | "(" expression ")" ;
This grammar is more complex than the one we had before, but in return we have
eliminated the previous one’s ambiguity. It’s just what we need to make a
parser.
这个语法比我们之前的更复杂,但作为回报,我们消除了前一个语法的歧义。这正是我们构建解析器所需要的。
6.2 Recursive Descent Parsing
递归下降解析
There is a whole pack of parsing techniques whose names are mostly combinations
of “L” and “R”—LL(k), LR(1), LALR—along with more exotic
beasts like parser combinators, Earley parsers, the shunting yard
algorithm, and packrat parsing. For our first interpreter, one
technique is more than sufficient: recursive descent.
有一整套解析技术,其名称大多是“L”和“R”的组合——LL(k)、LR(1)、LALR——以及更奇特的工具,如解析器组合子、Earley 解析器、调度场算法和 packrat 解析。对于我们的第一个解释器,一种技术就足够了:递归下降。
Recursive descent is the simplest way to build a parser, and doesn’t require
using complex parser generator tools like Yacc, Bison or ANTLR. All you need is
straightforward handwritten code. Don’t be fooled by its simplicity, though.
Recursive descent parsers are fast, robust, and can support sophisticated
error handling. In fact, GCC, V8 (the JavaScript VM in Chrome), Roslyn (the C#
compiler written in C#) and many other heavyweight production language
implementations use recursive descent. It rocks.
递归下降是构建解析器的最简单方法,且无需使用如 Yacc、Bison 或 ANTLR 等复杂的解析器生成工具。你所需要的只是直接手写的代码。不过,别被它的简单所迷惑。递归下降解析器速度快、健壮性强,并能支持复杂的错误处理。实际上,GCC、V8(Chrome 中的 JavaScript 虚拟机)、Roslyn(用 C#编写的 C#编译器)以及许多其他重量级生产语言实现都采用了递归下降。它真的很棒。
Recursive descent is considered a top-down parser because it starts from the
top or outermost grammar rule (here expression
) and works its way down into the nested subexpressions before finally
reaching the leaves of the syntax tree. This is in contrast with bottom-up
parsers like LR that start with primary expressions and compose them into larger
and larger chunks of syntax.
递归下降被认为是一种自顶向下的解析器,因为它从顶部或最外层的语法规则(此处为 expression
)开始,逐步深入到嵌套的子表达式中,最终到达语法树的叶子节点。这与自底向上的解析器(如 LR 解析器)形成对比,后者从基本表达式开始,逐步组合成越来越大的语法块。
A recursive descent parser is a literal translation of the grammar’s rules
straight into imperative code. Each rule becomes a function. The body of the
rule translates to code roughly like:
递归下降解析器是将语法规则直接转换为命令式代码的字面翻译。每个规则变成一个函数。规则的主体大致翻译为如下代码:
Grammar notation 语法符号 | Code representation 代码表示
Terminal                  | Code to match and consume a token 匹配并消费令牌的代码
Nonterminal               | Call to that rule's function 调用该规则的函数
|                         | if or switch statement if 或 switch 语句
* or + * 或 +             | while or for loop while 或 for 循环
?                         | if statement
The descent is described as “recursive” because when a grammar rule refers to
itself—directly or indirectly—that translates to a recursive function
call.
下降被称为“递归”是因为当语法规则直接或间接引用自身时,这转化为递归函数调用。
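The expression grammar in this chapter happens not to use ?, but that last row deserves an example too. A hypothetical rule like returnStmt → "return" expression? ";" (statements and the Stmt classes only arrive in later chapters) would translate to an if around the optional part, roughly like this sketch built on the match(), check(), and consume() helpers defined below:

  // Hypothetical sketch -- Lox statements come later in the book.
  // returnStmt → "return" expression? ";"
  private Stmt returnStatement() {
    Token keyword = previous();          // the "return" keyword we just matched
    Expr value = null;
    if (!check(SEMICOLON)) {             // the "?" in the grammar becomes an if
      value = expression();
    }
    consume(SEMICOLON, "Expect ';' after return value.");
    return new Stmt.Return(keyword, value);
  }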
6.2.1 The parser class 解析器类
Each grammar rule becomes a method inside this new class:
每个语法规则都成为这个新类中的一个方法:
create new file 创建新文件
package com.craftinginterpreters.lox;

import java.util.List;

import static com.craftinginterpreters.lox.TokenType.*;

class Parser {
  private final List<Token> tokens;
  private int current = 0;

  Parser(List<Token> tokens) {
    this.tokens = tokens;
  }
}
Like the scanner, the parser consumes a flat input sequence, only now we’re
reading tokens instead of characters. We store the list of tokens and use
current
to point to the next token eagerly waiting to be parsed.
与扫描器类似,解析器也消耗一个扁平的输入序列,只不过现在我们读取的是标记而非字符。我们存储标记列表,并使用 current
来指向急切等待被解析的下一个标记。
We’re going to run straight through the expression grammar now and translate
each rule to Java code. The first rule, expression
, simply expands to the
equality
rule, so that’s straightforward.
我们现在将直接运行表达式语法,并将每个规则翻译为 Java 代码。第一个规则 expression
简单地扩展为 equality
规则,因此这很简单。
add after Parser() 在 Parser()后添加
private Expr expression() {
  return equality();
}
Each method for parsing a grammar rule produces a syntax tree for that rule and
returns it to the caller. When the body of the rule contains a nonterminal—a
reference to another rule—we call that other rule’s
method.
每种解析语法规则的方法都会为该规则生成一个语法树,并将其返回给调用者。当规则的主体包含一个非终结符——即对另一个规则的引用时,我们会调用该规则的方法。
The rule for equality is a little more complex.
相等性的规则稍微复杂一些。
equality → comparison ( ( "!=" | "==" ) comparison )* ;
In Java, that becomes:
在 Java 中,这变为:
add after expression() 在 expression()后添加
private Expr equality() {
  Expr expr = comparison();

  while (match(BANG_EQUAL, EQUAL_EQUAL)) {
    Token operator = previous();
    Expr right = comparison();
    expr = new Expr.Binary(expr, operator, right);
  }

  return expr;
}
Let’s step through it. The first comparison
nonterminal in the body translates
to the first call to comparison()
in the method. We take that result and store
it in a local variable.
让我们逐步分析。正文中的第一个 comparison
非终结符转换为方法中对 comparison()
的第一次调用。我们获取该结果并将其存储在局部变量中。
Then, the ( ... )*
loop in the rule maps to a while
loop. We need to know
when to exit that loop. We can see that inside the rule, we must first find
either a !=
or ==
token. So, if we don’t see one of those, we must be done
with the sequence of equality operators. We express that check using a handy
match()
method.
然后,规则中的 ( ... )*
循环映射到 while
循环。我们需要知道何时退出该循环。我们可以看到,在规则内部,我们必须首先找到 !=
或 ==
标记。因此,如果我们没有看到其中一个,那么我们必须已经完成了相等运算符的序列。我们使用一个方便的 match()
方法来表示该检查。
add after equality() 在 equality()后添加
private boolean match(TokenType... types) {
  for (TokenType type : types) {
    if (check(type)) {
      advance();
      return true;
    }
  }

  return false;
}
This checks to see if the current token has any of the given types. If so, it
consumes the token and returns true
. Otherwise, it returns false
and leaves
the current token alone. The match()
method is defined in terms of two more
fundamental operations.
此操作检查当前令牌是否具有任何给定类型。如果是,则消耗该令牌并返回 true
。否则,返回 false
并保持当前令牌不变。 match()
方法基于两个更基本的操作定义。
The check()
method returns true
if the current token is of the given type.
Unlike match()
, it never consumes the token, it only looks at it.
check()
方法在当前标记为给定类型时返回 true
。与 match()
不同,它从不消耗标记,仅查看它。
add after match() 在 match() 后添加
private boolean check(TokenType type) {
  if (isAtEnd()) return false;
  return peek().type == type;
}
The advance()
method consumes the current token and returns it, similar to how
our scanner’s corresponding method crawled through characters.
advance()
方法消耗当前标记并返回它,类似于我们扫描器对应方法遍历字符的方式。
add after check() 在 check()后添加
private Token advance() {
  if (!isAtEnd()) current++;
  return previous();
}
These methods bottom out on the last handful of primitive operations.
这些方法最终归结为最后几个基本操作。
add after advance() 在 advance()后添加
private boolean isAtEnd() {
  return peek().type == EOF;
}

private Token peek() {
  return tokens.get(current);
}

private Token previous() {
  return tokens.get(current - 1);
}
isAtEnd()
checks if we’ve run out of tokens to parse. peek()
returns the
current token we have yet to consume, and previous()
returns the most recently
consumed token. The latter makes it easier to use match()
and then access the
just-matched token.
isAtEnd()
检查我们是否已经用完要解析的标记。 peek()
返回我们尚未消耗的当前标记, previous()
返回最近消耗的标记。后者使得使用 match()
后访问刚刚匹配的标记更加方便。
That’s most of the parsing infrastructure we need. Where were we? Right, so if
we are inside the while
loop in equality()
, then we know we have found a
!=
or ==
operator and must be parsing an equality expression.
这就是我们所需的大部分解析基础设施。我们说到哪儿了?对了,如果我们在 equality()
中的 while
循环内,那么我们知道已经找到了 !=
或 ==
运算符,并且必须解析一个相等表达式。
We grab the matched operator token so we can track which kind of equality
expression we have. Then we call comparison()
again to parse the right-hand
operand. We combine the operator and its two operands into a new Expr.Binary
syntax tree node, and then loop around. For each iteration, we store the
resulting expression back in the same expr
local variable. As we zip through a
sequence of equality expressions, that creates a left-associative nested tree of
binary operator nodes.
我们获取匹配的运算符标记,以便跟踪所处理的等式表达式类型。接着,再次调用 comparison()
来解析右侧操作数。将运算符及其两个操作数组合成一个新的 Expr.Binary
语法树节点,然后循环处理。每次迭代时,我们将结果表达式存储回同一个 expr
局部变量中。当快速遍历一系列等式表达式时,这便构建了一个左结合嵌套的二元运算符节点树。
The parser falls out of the loop once it hits a token that’s not an equality
operator. Finally, it returns the expression. Note that if the parser never
encounters an equality operator, then it never enters the loop. In that case,
the equality()
method effectively calls and returns comparison()
. In that
way, this method matches an equality operator or anything of higher
precedence.
解析器一旦遇到非等值运算符的标记,就会退出循环。最后,它返回表达式。请注意,如果解析器从未遇到等值运算符,则它永远不会进入循环。在这种情况下,equality() 方法实际上会调用并返回 comparison()。通过这种方式,该方法匹配等值运算符或任何更高优先级的表达式。
Moving on to the next rule . . .
继续下一个规则...
comparison → term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
Translated to Java: 翻译为 Java:
add after equality() 在 equality()后添加
private Expr comparison() {
  Expr expr = term();

  while (match(GREATER, GREATER_EQUAL, LESS, LESS_EQUAL)) {
    Token operator = previous();
    Expr right = term();
    expr = new Expr.Binary(expr, operator, right);
  }

  return expr;
}
The grammar rule is virtually identical to equality
and so is the corresponding code. The only differences are the token types for
the operators we match, and the method we call for the operands—now
term()
instead of comparison()
. The remaining two binary operator rules
follow the same pattern.
语法规则与 equality
几乎相同,对应的代码也是如此。唯一的区别在于我们匹配的操作符的标记类型,以及我们为操作数调用的方法——现在是 term()
而不是 comparison()
。剩下的两个二元操作符规则遵循相同的模式。
In order of precedence, first addition and subtraction:
按优先级顺序,先进行加减法:
add after comparison() 在 comparison()后添加
private Expr term() {
  Expr expr = factor();

  while (match(MINUS, PLUS)) {
    Token operator = previous();
    Expr right = factor();
    expr = new Expr.Binary(expr, operator, right);
  }

  return expr;
}
And finally, multiplication and division:
最后,乘法和除法:
add after term() 在 term()后添加
private Expr factor() {
  Expr expr = unary();

  while (match(SLASH, STAR)) {
    Token operator = previous();
    Expr right = unary();
    expr = new Expr.Binary(expr, operator, right);
  }

  return expr;
}
That’s all of the binary operators, parsed with the correct precedence and
associativity. We’re crawling up the precedence hierarchy and now we’ve reached
the unary operators.
这就是所有的二元运算符,按照正确的优先级和结合性进行解析。我们正在爬升优先级层次结构,现在我们已经到达了一元运算符。
unary → ( "!" | "-" ) unary | primary ;
The code for this is a little different.
此代码略有不同。
add after factor() 在 factor()后添加
private Expr unary() {
  if (match(BANG, MINUS)) {
    Token operator = previous();
    Expr right = unary();
    return new Expr.Unary(operator, right);
  }

  return primary();
}
Again, we look at the current token to see how to
parse. If it’s a !
or -
, we must have a unary expression. In that case, we
grab the token and then recursively call unary()
again to parse the operand.
Wrap that all up in a unary expression syntax tree and we’re done.
再次,我们查看当前标记以确定如何解析。如果它是 !
或 -
,我们必须有一个一元表达式。在这种情况下,我们获取该标记,然后递归调用 unary()
来解析操作数。将所有内容包装在一元表达式语法树中,我们就完成了。
Otherwise, we must have reached the highest level of precedence, primary
expressions.
否则,我们必须已经达到了最高优先级的表达式,即基本表达式。
primary → NUMBER | STRING | "true" | "false" | "nil" | "(" expression ")" ;
Most of the cases for the rule are single terminals, so parsing is
straightforward.
大多数规则的案例都是单一终端,因此解析是直接的。
add after unary() 在 unary()后添加
private Expr primary() {
  if (match(FALSE)) return new Expr.Literal(false);
  if (match(TRUE)) return new Expr.Literal(true);
  if (match(NIL)) return new Expr.Literal(null);

  if (match(NUMBER, STRING)) {
    return new Expr.Literal(previous().literal);
  }

  if (match(LEFT_PAREN)) {
    Expr expr = expression();
    consume(RIGHT_PAREN, "Expect ')' after expression.");
    return new Expr.Grouping(expr);
  }
}
The interesting branch is the one for handling parentheses. After we match an
opening (
and parse the expression inside it, we must find a )
token. If
we don’t, that’s an error.
有趣的分支是处理括号的部分。当我们匹配到一个开头的 (
并解析其中的表达式后,必须找到一个 )
标记。如果没有找到,那就是一个错误。
6.3 Syntax Errors 语法错误
A parser really has two jobs:
解析器实际上有两个任务:
- Given a valid sequence of tokens, produce a corresponding syntax tree.
  给定一个有效的标记序列,生成相应的语法树。

- Given an invalid sequence of tokens, detect any errors and tell the user about their mistakes.
  给定一个无效的令牌序列,检测任何错误并告知用户其错误。
Don’t underestimate how important the second job is! In modern IDEs and editors,
the parser is constantly reparsing code—often while the user is still editing
it—in order to syntax highlight and support things like auto-complete. That
means it will encounter code in incomplete, half-wrong states all the time.
不要低估第二项工作的重要性!在现代集成开发环境(IDE)和编辑器中,解析器会不断地重新解析代码——通常是在用户仍在编辑时——以便进行语法高亮和支持自动完成等功能。这意味着它将经常遇到不完整、半错误状态的代码。
When the user doesn’t realize the syntax is wrong, it is up to the parser to
help guide them back onto the right path. The way it reports errors is a large
part of your language’s user interface. Good syntax error handling is hard. By
definition, the code isn’t in a well-defined state, so there’s no infallible way
to know what the user meant to write. The parser can’t read your mind.
当用户未意识到语法错误时,解析器有责任引导他们回到正确的路径。错误报告的方式是您语言用户界面的重要组成部分。良好的语法错误处理是困难的。根据定义,代码未处于明确定义的状态,因此没有绝对可靠的方法来了解用户想要编写的内容。解析器无法读取您的思维。
There are a couple of hard requirements for when the parser runs into a syntax
error. A parser must:
解析器在遇到语法错误时有一些硬性要求。解析器必须:
- Detect and report the error. If it doesn't detect the error and passes the resulting malformed syntax tree on to the interpreter, all manner of horrors may be summoned.
  检测并报告错误。如果未能检测到错误并将生成的有缺陷的语法树传递给解释器,可能会引发各种可怕的问题。

- Avoid crashing or hanging. Syntax errors are a fact of life, and language tools have to be robust in the face of them. Segfaulting or getting stuck in an infinite loop isn't allowed. While the source may not be valid code, it's still a valid input to the parser because users use the parser to learn what syntax is allowed.
  避免崩溃或挂起。语法错误是不可避免的,语言工具在面对它们时必须具备鲁棒性。不允许出现段错误或陷入无限循环的情况。虽然源代码可能不是有效的代码,但它仍然是解析器的有效输入,因为用户使用解析器来了解允许的语法。
Those are the table stakes if you want to get in the parser game at all, but you
really want to raise the ante beyond that. A decent parser should:
这些是进入解析器游戏的基本要求,但你确实需要在此基础上提高赌注。一个像样的解析器应该:
- Be fast. Computers are thousands of times faster than they were when parser technology was first invented. The days of needing to optimize your parser so that it could get through an entire source file during a coffee break are over. But programmer expectations have risen as quickly, if not faster. They expect their editors to reparse files in milliseconds after every keystroke.
  要快。计算机的速度比解析器技术刚发明时快了几千倍。那种需要优化解析器以便在咖啡休息时间内完成整个源文件解析的日子已经一去不复返了。但程序员的期望也以同样快的速度,甚至更快的速度上升。他们期望编辑器在每次按键后能在几毫秒内重新解析文件。

- Report as many distinct errors as there are. Aborting after the first error is easy to implement, but it's annoying for users if every time they fix what they think is the one error in a file, a new one appears. They want to see them all.
  报告尽可能多的不同错误。在第一个错误后中止很容易实现,但如果用户每次修复他们认为文件中唯一的错误时,又出现新的错误,这会让他们感到烦恼。他们希望看到所有错误。

- Minimize cascaded errors. Once a single error is found, the parser no longer really knows what's going on. It tries to get itself back on track and keep going, but if it gets confused, it may report a slew of ghost errors that don't indicate other real problems in the code. When the first error is fixed, those phantoms disappear, because they reflect only the parser's own confusion. Cascaded errors are annoying because they can scare the user into thinking their code is in a worse state than it is.
  最小化级联错误。一旦发现单个错误,解析器实际上就不再知道发生了什么。它试图让自己回到正轨并继续运行,但如果它感到困惑,可能会报告一连串的幽灵错误,这些错误并不表示代码中的其他实际问题。当第一个错误被修复时,这些幻影就会消失,因为它们仅反映了解析器自身的困惑。级联错误令人烦恼,因为它们可能会吓到用户,让他们认为自己的代码状态比实际情况更糟。
The last two points are in tension. We want to report as many separate errors as
we can, but we don’t want to report ones that are merely side effects of an
earlier one.
最后两点存在矛盾。我们希望尽可能多地报告独立的错误,但又不希望报告那些仅仅是早期错误副作用的错误。
The way a parser responds to an error and keeps going to look for later errors
is called error recovery. This was a hot research topic in the ’60s. Back
then, you’d hand a stack of punch cards to the secretary and come back the next
day to see if the compiler succeeded. With an iteration loop that slow, you
really wanted to find every single error in your code in one pass.
解析器对错误作出响应并继续寻找后续错误的方式称为错误恢复。这是 20 世纪 60 年代的一个热门研究课题。那时候,你会把一叠穿孔卡片交给秘书,第二天再回来查看编译器是否成功。由于迭代循环如此缓慢,你真的希望在一次编译中找出代码中的所有错误。
Today, when parsers complete before you’ve even finished typing, it’s less of an
issue. Simple, fast error recovery is fine.
如今,解析器在你输入完成之前就已结束,这已不再是问题。简单快速的错误恢复就足够了。
6.3.1 Panic mode error recovery
Panic 模式错误恢复
Of all the recovery techniques devised in yesteryear, the one that best stood
the test of time is called—somewhat alarmingly—panic
mode. As soon as the parser detects an error, it enters panic mode. It
knows at least one token doesn’t make sense given its current state in the
middle of some stack of grammar productions.
在昔日设计的所有恢复技术中,最能经受时间考验的是一种听起来有些令人不安的——恐慌模式。一旦解析器检测到错误,它就会进入恐慌模式。它知道至少有一个标记在当前语法产生式堆栈的中间状态下是没有意义的。
Before it can get back to parsing, it needs to get its state and the sequence of
forthcoming tokens aligned such that the next token does match the rule being
parsed. This process is called synchronization.
在恢复解析之前,它需要将其状态与即将到来的标记序列对齐,以确保下一个标记确实匹配正在解析的规则。这一过程称为同步。
To do that, we select some rule in the grammar that will mark the
synchronization point. The parser fixes its parsing state by jumping out of any
nested productions until it gets back to that rule. Then it synchronizes the
token stream by discarding tokens until it reaches one that can appear at that
point in the rule.
为此,我们在语法中选择一些规则来标记同步点。解析器通过跳出任何嵌套的产生式来固定其解析状态,直到返回到该规则。然后,它通过丢弃标记来同步标记流,直到到达可以在该规则点出现的标记。
Any additional real syntax errors hiding in those discarded tokens aren’t
reported, but it also means that any mistaken cascaded errors that are side
effects of the initial error aren’t falsely reported either, which is a decent
trade-off.
任何隐藏在那些被丢弃的标记中的额外真实语法错误都不会被报告,但这也意味着任何由初始错误引发的级联错误也不会被错误地报告,这是一个不错的权衡。
The traditional place in the grammar to synchronize is between statements. We
don’t have those yet, so we won’t actually synchronize in this chapter, but
we’ll get the machinery in place for later.
语法中传统的同步位置是在语句之间。我们目前还没有这些内容,因此本章实际上不会进行同步操作,但我们会为后续章节搭建好机制。
6.3.2 Entering panic mode 进入恐慌模式
Back before we went on this side trip around error recovery, we were writing the
code to parse a parenthesized expression. After parsing the expression, the
parser looks for the closing )
by calling consume()
. Here, finally, is that
method:
在我们绕道讨论错误恢复之前,我们正在编写解析带括号表达式的代码。解析完表达式后,解析器通过调用 consume()
来查找结束的 )
。最后,这就是那个方法:
add after match() 在 match() 后添加
private Token consume(TokenType type, String message) {
  if (check(type)) return advance();

  throw error(peek(), message);
}
It’s similar to match()
in that it checks to see if the next token is of the
expected type. If so, it consumes the token and everything is groovy. If some
other token is there, then we’ve hit an error. We report it by calling this:
它与 match()
类似,都会检查下一个标记是否为预期类型。如果是,则消耗该标记,一切顺利。如果出现其他标记,则意味着遇到了错误。我们通过调用以下内容来报告错误:
add after previous() 在 previous() 后添加
private ParseError error(Token token, String message) {
  Lox.error(token, message);
  return new ParseError();
}
First, that shows the error to the user by calling:
首先,通过调用以下代码向用户显示错误:
add after report() 在 report()后添加
static void error(Token token, String message) {
  if (token.type == TokenType.EOF) {
    report(token.line, " at end", message);
  } else {
    report(token.line, " at '" + token.lexeme + "'", message);
  }
}
This reports an error at a given token. It shows the token’s location and the
token itself. This will come in handy later since we use tokens throughout the
interpreter to track locations in code.
此函数报告给定令牌处的错误。它显示令牌的位置和令牌本身。由于我们在整个解释器中使用令牌来跟踪代码中的位置,这在以后会派上用场。
After we report the error, the user knows about their mistake, but what does the
parser do next? Back in error()
, we create and return a ParseError, an
instance of this new class:
在我们报告错误后,用户知道了他们的错误,但解析器接下来会做什么呢?回到 error()
,我们创建并返回一个 ParseError,这是新类的一个实例:
class Parser {
nest inside class Parser 嵌套在类 Parser 内部
  private static class ParseError extends RuntimeException {}

  private final List<Token> tokens;
This is a simple sentinel class we use to unwind the parser. The error()
method returns the error instead of throwing it because we want to let the
calling method inside the parser decide whether to unwind or not. Some parse
errors occur in places where the parser isn’t likely to get into a weird state
and we don’t need to synchronize. In those
places, we simply report the error and keep on truckin’.
这是一个简单的哨兵类,我们用它来展开解析器。 error()
方法返回错误而不是抛出它,因为我们想让解析器内部的调用方法决定是否展开。有些解析错误发生在解析器不太可能进入奇怪状态的地方,我们不需要同步。在这些地方,我们只需报告错误并继续前进。
For example, Lox limits the number of arguments you can pass to a function. If
you pass too many, the parser needs to report that error, but it can and should
simply keep on parsing the extra arguments instead of freaking out and going
into panic mode.
例如,Lox 限制了可以传递给函数的参数数量。如果传递了太多参数,解析器需要报告该错误,但它可以而且应该继续解析额外的参数,而不是惊慌失措并进入恐慌模式。
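As a preview of what that looks like (this check actually appears when we parse function calls in a later chapter, so treat it as a forward reference rather than code to add now), the parser calls error() for its reporting side effect and simply ignores the returned exception instead of throwing it:

  if (arguments.size() >= 255) {
    error(peek(), "Can't have more than 255 arguments.");
  }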
In our case, though, the syntax error is nasty enough that we want to panic and
synchronize. Discarding tokens is pretty easy, but how do we synchronize the
parser’s own state?
不过,在我们的例子中,语法错误严重到足以让我们想要报错并同步。丢弃标记相当容易,但我们如何同步解析器自身的状态呢?
6.3.3 Synchronizing a recursive descent parser
同步递归下降解析器
With recursive descent, the parser’s state—which rules it is in the middle of
recognizing—is not stored explicitly in fields. Instead, we use Java’s
own call stack to track what the parser is doing. Each rule in the middle of
being parsed is a call frame on the stack. In order to reset that state, we need
to clear out those call frames.
使用递归下降法时,解析器的状态——即它正在识别哪些规则——并不显式地存储在字段中。相反,我们利用 Java 自身的调用栈来跟踪解析器的操作。每个正在解析的规则都是栈上的一个调用帧。为了重置该状态,我们需要清除这些调用帧。
The natural way to do that in Java is exceptions. When we want to synchronize,
we throw that ParseError object. Higher up in the method for the grammar rule
we are synchronizing to, we’ll catch it. Since we synchronize on statement
boundaries, we’ll catch the exception there. After the exception is caught, the
parser is in the right state. All that’s left is to synchronize the tokens.
在 Java 中实现这一点的自然方式是使用异常。当我们想要同步时,我们抛出那个 ParseError 对象。在我们同步到的语法规则的方法中更高层的位置,我们会捕获它。由于我们在语句边界上同步,我们会在那里捕获异常。捕获异常后,解析器处于正确的状态。剩下的就是同步标记了。
We want to discard tokens until we’re right at the beginning of the next
statement. That boundary is pretty easy to spot—it’s one of the main reasons
we picked it. After a semicolon, we’re probably
finished with a statement. Most statements start with a keyword—for
, if
,
return
, var
, etc. When the next token is any of those, we’re probably
about to start a statement.
我们希望丢弃标记,直到我们正好处于下一条语句的开头。这个边界很容易识别——这是我们选择它的主要原因之一。在分号之后,我们可能已经完成了一条语句。大多数语句以关键字开头—— for
、 if
、 return
、 var
等。当下一个标记是这些关键字之一时,我们可能即将开始一条语句。
This method encapsulates that logic:
该方法封装了该逻辑:
add after error() 在 error()后添加
private void synchronize() {
  advance();

  while (!isAtEnd()) {
    if (previous().type == SEMICOLON) return;

    switch (peek().type) {
      case CLASS:
      case FUN:
      case VAR:
      case FOR:
      case IF:
      case WHILE:
      case PRINT:
      case RETURN:
        return;
    }

    advance();
  }
}
It discards tokens until it thinks it has found a statement boundary. After
catching a ParseError, we’ll call this and then we are hopefully back in sync.
When it works well, we have discarded tokens that would have likely caused
cascaded errors anyway, and now we can parse the rest of the file starting at
the next statement.
它会丢弃标记,直到认为找到了语句边界。在捕获到 ParseError 后,我们将调用此方法,希望此时能重新同步。当它运作良好时,我们已经丢弃了那些可能导致级联错误的标记,现在可以从下一个语句开始解析文件的其余部分。
Alas, we don’t get to see this method in action, since we don’t have statements
yet. We’ll get to that in a couple of chapters. For now, if an
error occurs, we’ll panic and unwind all the way to the top and stop parsing.
Since we can parse only a single expression anyway, that’s no big loss.
可惜的是,由于我们还没有语句,所以无法看到这个方法的具体应用。我们将在接下来的几章中讨论这个问题。目前,如果出现错误,我们会直接 panic 并回退到最上层,停止解析。反正我们只能解析单个表达式,所以这也没什么大不了的。
6.4 Wiring up the Parser 连接解析器
We are mostly done parsing expressions now. There is one other place where we
need to add a little error handling. As the parser descends through the parsing
methods for each grammar rule, it eventually hits primary()
. If none of the
cases in there match, it means we are sitting on a token that can’t start an
expression. We need to handle that error too.
我们现在基本完成了表达式的解析。还有一个地方需要添加一些错误处理。当解析器通过每个语法规则的解析方法下降时,它最终会到达 primary()
。如果其中的所有情况都不匹配,这意味着我们遇到了一个不能作为表达式开头的标记。我们也需要处理这个错误。
if (match(LEFT_PAREN)) {
  Expr expr = expression();
  consume(RIGHT_PAREN, "Expect ')' after expression.");
  return new Expr.Grouping(expr);
}
in primary() 在 primary()中
throw error(peek(), "Expect expression.");
}
With that, all that remains in the parser is to define an initial method to kick
it off. That method is called, naturally enough, parse()
.
至此,解析器中剩下的就是定义一个初始方法来启动它。这个方法自然被称为 parse()
。
add after Parser() 在 Parser()后添加
Expr parse() {
  try {
    return expression();
  } catch (ParseError error) {
    return null;
  }
}
We’ll revisit this method later when we add statements to the language. For now,
it parses a single expression and returns it. We also have some temporary code
to exit out of panic mode. Syntax error recovery is the parser’s job, so we
don’t want the ParseError exception to escape into the rest of the interpreter.
我们稍后向语言中添加语句时会重新讨论这个方法。目前,它只解析单个表达式并返回它。我们还有一些临时代码用于退出恐慌模式。语法错误恢复是解析器的职责,因此我们不希望 ParseError 异常逃逸到解释器的其他部分。
When a syntax error does occur, this method returns null
. That’s OK. The
parser promises not to crash or hang on invalid syntax, but it doesn’t promise
to return a usable syntax tree if an error is found. As soon as the parser
reports an error, hadError
gets set, and subsequent phases are skipped.
当发生语法错误时,此方法返回 null
。这是正常的。解析器承诺不会因无效语法而崩溃或挂起,但它不承诺在发现错误时返回可用的语法树。一旦解析器报告错误, hadError
就会被设置,后续阶段将被跳过。
Finally, we can hook up our brand new parser to the main Lox class and try it
out. We still don’t have an interpreter, so for now, we’ll parse to a syntax
tree and then use the AstPrinter class from the last chapter to
display it.
最后,我们可以将全新的解析器连接到主 Lox 类并进行测试。由于我们还没有解释器,所以目前我们将解析为语法树,然后使用上一章的 AstPrinter 类来显示它。
Delete the old code to print the scanned tokens and replace it with this:
删除用于打印扫描标记的旧代码,并将其替换为以下内容:
List<Token> tokens = scanner.scanTokens();
in run() 在 run() 中
replace 5 lines 替换 5 行
Parser parser = new Parser(tokens);
Expr expression = parser.parse();

// Stop if there was a syntax error.
if (hadError) return;

System.out.println(new AstPrinter().print(expression));
}
Congratulations, you have crossed the threshold! That
really is all there is to handwriting a parser. We’ll extend the grammar in
later chapters with assignment, statements, and other stuff, but none of that is
any more complex than the binary operators we tackled here.
恭喜你,已经跨过了门槛!手写解析器其实就这么简单。在后续章节中,我们将扩展语法,加入赋值、语句等内容,但这些都不会比我们在这里处理的二元运算符更复杂。
Fire up the interpreter and type in some expressions. See how it handles
precedence and associativity correctly? Not bad for less than 200 lines of code.
启动解释器并输入一些表达式。看看它是如何正确处理优先级和结合性的?对于不到 200 行代码来说,这还不错。
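If you prefer not to type into the REPL by hand, a throwaway harness like this (a hypothetical ParseDemo class, not part of jlox, assuming the Scanner and AstPrinter classes from earlier chapters are on the classpath) shows precedence and associativity at work:

package com.craftinginterpreters.lox;

import java.util.List;

// Hypothetical throwaway harness, not part of jlox.
public class ParseDemo {
  public static void main(String[] args) {
    List<Token> tokens = new Scanner("1 + 2 * 3 - 4").scanTokens();
    Expr expression = new Parser(tokens).parse();

    // "*" binds tighter than "+", and "-" associates to the left, so this
    // should print something like: (- (+ 1.0 (* 2.0 3.0)) 4.0)
    System.out.println(new AstPrinter().print(expression));
  }
}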
Challenges 挑战
- In C, a block is a statement form that allows you to pack a series of statements where a single one is expected. The comma operator is an analogous syntax for expressions. A comma-separated series of expressions can be given where a single expression is expected (except inside a function call's argument list). At runtime, the comma operator evaluates the left operand and discards the result. Then it evaluates and returns the right operand.
  在 C 语言中,块是一种语句形式,允许你在预期单个语句的地方打包一系列语句。逗号运算符是表达式的类似语法。在预期单个表达式的地方(函数调用的参数列表内部除外),可以给出一个由逗号分隔的表达式系列。在运行时,逗号运算符会计算左操作数并丢弃结果,然后计算并返回右操作数。

  Add support for comma expressions. Give them the same precedence and associativity as in C. Write the grammar, and then implement the necessary parsing code.
  添加对逗号表达式的支持。赋予它们与 C 语言中相同的优先级和结合性。编写语法,然后实现必要的解析代码。

- Likewise, add support for the C-style conditional or "ternary" operator ?:. What precedence level is allowed between the ? and :? Is the whole operator left-associative or right-associative?
  同样地,添加对 C 风格的条件或"三元"运算符 ?: 的支持。在 ? 和 : 之间允许的优先级级别是什么?整个运算符是左结合还是右结合?

- Add error productions to handle each binary operator appearing without a left-hand operand. In other words, detect a binary operator appearing at the beginning of an expression. Report that as an error, but also parse and discard a right-hand operand with the appropriate precedence.
  添加错误产生式以处理每个没有左操作数出现的二元运算符。换句话说,检测出现在表达式开头的二元运算符。将其报告为错误,但也要解析并丢弃具有适当优先级的右操作数。
Design Note: Logic Versus History
设计说明:逻辑与历史
Let’s say we decide to add bitwise &
and |
operators to Lox. Where should we
put them in the precedence hierarchy? C—and most languages that follow in C’s
footsteps—place them below ==
. This is widely considered a mistake because
it means common operations like testing a flag require parentheses.
假设我们决定在 Lox 中添加按位 &
和 |
运算符。我们应该将它们放在优先级层次结构中的哪个位置?C 语言——以及大多数追随 C 语言脚步的语言——将它们放在 ==
之下。这被广泛认为是一个错误,因为它意味着像测试标志这样的常见操作需要括号。
if (flags & FLAG_MASK == SOME_FLAG) { ... }   // Wrong.
if ((flags & FLAG_MASK) == SOME_FLAG) { ... } // Right.
Should we fix this for Lox and put bitwise operators higher up the precedence
table than C does? There are two strategies we can take.
我们应该为 Lox 修复这个问题,并将位运算符的优先级设置得比 C 语言更高吗?我们可以采取两种策略。
You almost never want to use the result of an ==
expression as the operand to
a bitwise operator. By making bitwise bind tighter, users don’t need to
parenthesize as often. So if we do that, and users assume the precedence is
chosen logically to minimize parentheses, they’re likely to infer it correctly.
几乎从不希望将 ==
表达式的结果用作位运算符的操作数。通过使位运算符绑定更紧密,用户不需要经常使用括号。因此,如果我们这样做,并且用户假设优先级是逻辑选择的以最小化括号,他们很可能会正确推断出它。
This kind of internal consistency makes the language easier to learn because
there are fewer edge cases and exceptions users have to stumble into and then
correct. That’s good, because before users can use our language, they have to
load all of that syntax and semantics into their heads. A simpler, more rational
language makes sense.
这种内部一致性使得语言更易于学习,因为用户需要遇到并纠正的边缘情况和例外更少。这是好事,因为在用户能够使用我们的语言之前,他们必须将所有语法和语义装入脑海。一个更简单、更合理的语言是有意义的。
But, for many users there is an even faster shortcut to getting our language’s
ideas into their wetware—use concepts they already know. Many newcomers to
our language will be coming from some other language or languages. If our
language uses some of the same syntax or semantics as those, there is much less
for the user to learn (and unlearn).
但是,对于许多用户来说,将我们语言的思想融入他们的“湿件”中有一个更快捷的途径——利用他们已经熟悉的概念。许多新接触我们语言的用户可能来自其他一种或多种语言背景。如果我们的语言采用了与那些语言相似的语法或语义,用户需要学习(以及摒弃)的内容就会少得多。
This is particularly helpful with syntax. You may not remember it well today,
but way back when you learned your very first programming language, code
probably looked alien and unapproachable. Only through painstaking effort did
you learn to read and accept it. If you design a novel syntax for your new
language, you force users to start that process all over again.
这对于语法特别有帮助。你可能今天记得不太清楚,但回想当初学习第一门编程语言时,代码可能看起来陌生且难以接近。只有通过艰苦的努力,你才学会阅读并接受它。如果你为你的新语言设计了一种全新的语法,你就迫使用户重新开始这个过程。
Taking advantage of what users already know is one of the most powerful tools
you can use to ease adoption of your language. It’s almost impossible to
overestimate how valuable this is. But it faces you with a nasty problem: What
happens when the thing the users all know kind of sucks? C’s bitwise operator
precedence is a mistake that doesn’t make sense. But it’s a familiar mistake
that millions have already gotten used to and learned to live with.
利用用户已有的知识是你可以用来促进语言采用的最强大工具之一。这一点的重要性几乎无法高估。但它也带来了一个棘手的问题:当用户所熟知的东西其实并不好时,该怎么办?C 语言的位运算符优先级就是一个毫无道理的错误。然而,这是一个数百万用户已经习惯并学会与之共处的熟悉错误。
Do you stay true to your language’s own internal logic and ignore history? Do
you start from a blank slate and first principles? Or do you weave your language
into the rich tapestry of programming history and give your users a leg up by
starting from something they already know?
你是否坚持自己语言的内部逻辑而忽视历史?你是否从零开始,遵循基本原则?还是你将语言编织进编程历史的丰富挂毯中,通过从用户已知的内容出发,为他们提供助力?
There is no perfect answer here, only trade-offs. You and I are obviously biased
towards liking novel languages, so our natural inclination is to burn the
history books and start our own story.
这里没有完美的答案,只有权衡。你和我显然倾向于喜欢新颖的语言,所以我们自然的倾向是烧掉历史书,开始我们自己的故事。
In practice, it’s often better to make the most of what users already know.
Getting them to come to your language requires a big leap. The smaller you can
make that chasm, the more people will be willing to cross it. But you can’t
always stick to history, or your language won’t have anything new and
compelling to give people a reason to jump over.
在实践中,充分利用用户已有的知识往往更为可取。让他们接受你的语言需要跨越巨大的鸿沟。你能够缩小的差距越小,愿意跨越它的人就越多。但你不能总是固守历史,否则你的语言将缺乏新颖和引人入胜之处,无法给人们提供跨越的理由。