Programming is creative work, and an art. Mastering any art takes a great deal of practice and understanding, so the "wisdom" offered here is not a diet pill that promises to take off ten pounds a day, and it cannot replace your own diligence. However, because the software industry likes to be different for its own sake and to complicate simple things, I hope these words can point confused people in the right direction, spare them some detours, and help them get out roughly what they put in.
Some people like to show off how many lines of code they have written, as if line count were the measure of programming skill. However, if you always write code in a hurry and never go back to ponder, revise, and refine it, your skills cannot really improve; you will just produce more and more mediocre or even bad code. In this sense, many people's so-called "work experience" is not necessarily proportional to the quality of their code. If you have decades of experience but never reflect on and refine your code, you may not be as good as someone with only one or two years of experience who likes to think things through carefully.
A great writer once said: "The quality of a writer is not determined by how many words he publishes, but by how many he throws into his wastebasket." I think the same applies to programming. Good programmers delete more code than they keep. If you see someone write a lot of code without deleting much, that code almost certainly contains a lot of garbage.
Like literature, code cannot be written in one sitting. Inspiration seems to come in bits and pieces, and no one gets everything right in one go. Even the best programmers need time to find the simplest and most elegant way to write something. Sometimes you refine a piece of code over and over and think you have reached the peak, yet when you look back a few months later you find plenty of room for improvement and simplification. It is exactly like writing an article: when you reread something you wrote months or years ago, you can always find something to improve.
So if refining a piece of code is no longer getting you anywhere, put it aside for a while. After a few weeks or months you may come back with new ideas. After repeating this process many times, you will have accumulated enough inspiration and wisdom to head straight in the right direction, or close to it, when you meet a new problem.
People hate "spaghetti code" because it winds around like noodles and cannot be untangled. So what shape does elegant code generally have? After years of observation, I have found that elegant code has some distinctive characteristics of shape.
If we ignore the specific content and look only at the overall structure, elegant code looks like a set of neatly arranged boxes. An analogy with tidying a room makes this easy to understand. If you throw everything into one big drawer, it all gets mixed together, and it is hard to organize things or quickly find what you need. But if you put a few small boxes in the drawer and sort the items into them by category, they stay put, and you can find and manage them much more easily.
Another characteristic of elegant code is that its logic generally looks like a tree with clear branches. This is because almost everything a program does is the passing and branching of information. You can think of code as a circuit, with current flowing through wires, splitting or merging. If you think this way, your code will contain fewer if statements with only one branch, and it will look like this:
if (...) {
    if (...) {
        ...
    } else {
        ...
    }
} else if (...) {
    ...
} else {
    ...
}
Notice? In my code, if statements almost always have two branches. They may be nested, with several levels of indentation, and the else branches may contain a small amount of repeated code. However, this structure is logically rigorous and clear. I will explain later why it is best for if statements to have two branches.
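As a concrete illustration of this shape, here is a minimal Java sketch. The shipping rule, the class name Shipping, the method name shippingCents, and the thresholds are all invented for the example; the only point is that every if is paired with an else, so each path through the tree ends in an explicit result.

```java
class Shipping {
    // Hypothetical pricing rule, in cents. Every `if` has a matching `else`,
    // so each combination of inputs reaches exactly one leaf of the tree.
    static int shippingCents(boolean member, int orderCents) {
        if (member) {
            if (orderCents >= 5000) {
                return 0;
            } else {
                return 299;
            }
        } else if (orderCents >= 10000) {
            return 0;
        } else {
            return 599;
        }
    }

    public static void main(String[] args) {
        System.out.println(shippingCents(true, 6000));
        System.out.println(shippingCents(false, 6000));
    }
}
```

Because no branch is left implicit, a reader can check the four outcomes one by one without wondering what happens when a condition is false.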
Some people clamor for "modular" programs, and what they actually do is split the code into multiple files and directories and call those directories or files "modules". They even put the directories in different VCS repos. This does not lead to smooth collaboration; it leads to a lot of trouble, because they do not actually understand what a "module" is. Superficially splitting code and storing it in different places fails to achieve modularity and creates unnecessary trouble besides.
True modularity is not textual but logical. A module should be like a circuit chip, with well-defined inputs and outputs. In fact, a good modularization mechanism already exists, and its name is the "function". Each function has clear inputs (parameters) and outputs (return values). One file can contain many functions, so you do not need to separate code into multiple files or directories to make it modular. I can write all my code in a single file and it will still be very modular.
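To make the idea of "functions as modules" concrete, here is a small sketch of a single Java file. The temperature example and all names in it (TemperatureReport, toCelsius, average, averageCelsius) are hypothetical, chosen only to show three functions that each have explicit inputs and outputs and share no hidden state.

```java
class TemperatureReport {
    // Input: one Fahrenheit value. Output: the same temperature in Celsius.
    static double toCelsius(double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }

    // Input: a non-empty array of values. Output: their arithmetic mean.
    static double average(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Composes the two functions above; it depends only on its parameter.
    static double averageCelsius(double[] fahrenheitReadings) {
        double[] celsius = new double[fahrenheitReadings.length];
        for (int i = 0; i < fahrenheitReadings.length; i++) {
            celsius[i] = toCelsius(fahrenheitReadings[i]);
        }
        return average(celsius);
    }

    public static void main(String[] args) {
        double[] readings = {32.0, 212.0};
        System.out.println(averageCelsius(readings));
    }
}
```

Everything lives in one file, yet each function could be tested, replaced, or moved on its own, which is the sense of modularity described above.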
To achieve good modularity, you need to do the following:
Avoid writing functions that are too long. If you find that a function is too large, break it up into smaller ones. I usually write functions no longer than 40 lines. For comparison, a typical laptop screen holds about 50 lines of code, so I can see a 40-line function at a glance without scrolling. The reason the limit is 40 lines rather than 50 is that 40 lines is about the most I can take in without moving my eyes.
If I can read the code without moving my eyes, I can map the whole of it onto my visual system, so that even if I suddenly close my eyes I can still see it. I find that with my eyes closed my brain processes the code more effectively, and I can imagine what other shapes the code could take. Forty lines is not a severe limit, because I usually extract the more complex parts of a function into smaller functions and call them from the original.
Make small utility functions. If you look closely at your code, you will notice a lot of repetition. Commonly used code, no matter how short, can be extracted into a function. Some helper functions may be only two lines long, but they can greatly simplify the logic of the main function.
Some people avoid small functions because they want to avoid the overhead of function calls, and as a result they write functions hundreds of lines long. This is an outdated idea. Modern compilers can automatically inline small functions at their call sites, so in those cases no function call is generated and there is no extra overhead.
Similarly, some people like to use macros in place of small functions, which is also an outdated idea. In early C compilers, only macros were statically "inlined", so people used macros to get inlining. However, whether something can be inlined is not the fundamental difference between macros and functions. Macros and functions differ enormously (I will discuss this later), and macros should be avoided as much as possible. Using macros for inlining is an abuse of macros that causes all kinds of trouble: it makes programs hard to understand, hard to debug, and error-prone.
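The point about tiny helper functions can be sketched as follows. This is only an illustration: the Greeting class, the isBlank helper, and the greet function are names made up for the example. The helper is a single expression, yet it removes a repeated null-and-whitespace check from every branch of the main function.

```java
class Greeting {
    // Tiny helper: true when the string is null or contains only whitespace.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // The main function reads as plain case analysis because the messy
    // check is hidden behind a meaningful name.
    static String greet(String first, String last) {
        if (isBlank(first) && isBlank(last)) {
            return "Hello, stranger";
        } else if (isBlank(last)) {
            return "Hello, " + first.trim();
        } else if (isBlank(first)) {
            return "Hello, " + last.trim();
        } else {
            return "Hello, " + first.trim() + " " + last.trim();
        }
    }

    public static void main(String[] args) {
        System.out.println(greet(" Ada ", "Lovelace"));
        System.out.println(greet(null, "  "));
    }
}
```

Whether such a helper is inlined is left to the compiler or JIT; the source stays short either way.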
Each function does one simple thing. Some people like to write "general" functions that can do both this and that, "choosing" internally what to do based on certain variables and conditions. For example, you might write a function like this:
void foo() {
    if (getOS().equals("MacOS")) {
        a();
    } else {
        b();
    }
    c();
    if (getOS().equals("MacOS")) {
        d();
    } else {
        e();
    }
}
The person who wrote this function does different things depending on whether the system is "MacOS". You can see that in this function only c() is common to both systems, while a(), b(), d(), and e() all belong to different branches.
This kind of "reuse" is actually harmful. If a function can do two things and they have less in common than they have differences, you had better write two separate functions; otherwise the logic of the function will be unclear and errors will creep in. In fact, the function above can be rewritten as two functions:
void fooMacOS() {
    a();
    c();
    d();
}
and
void fooOther() {
    b();
    c();
    e();
}
If you find that two things are mostly the same with only a few differences, you can usually extract the common parts into a helper function. For example, if you have a function like this:
void foo() {
    a();
    b();
    c();
    if (getOS().equals("MacOS")) {
        d();
    } else {
        e();
    }
}
Here a(), b(), and c() are the same on every system, and only d() and e() differ by system. You can therefore extract a(), b(), and c():
void preFoo() {
    a();
    b();
    c();
}
Then create two functions:
void fooMacOS() {
    preFoo();
    d();
}
and
void fooOther() {
    preFoo();
    e();
}
In this way, we share the common code while making sure each function does only one simple thing, and the logic of the code is clearer.
Avoid using global variables and class members to pass information; use local variables and parameters wherever possible. Some people often use class members to pass information, like this:
class A {
    String x;
    void findX() {
        ...
        x = ...;
    }
    void foo() {
        findX();
        ...
        print(x);
    }
}
First, findX() writes a value to the member x. Then foo() uses the value of x. In this way, x becomes a data channel between findX and print. Because x belongs to class A, the program loses its modular structure. Since the two functions depend on the member x, they no longer have clear inputs and outputs, but rely on global data. findX and foo can no longer exist apart from class A, and since class members may be changed by other code, the code becomes hard to understand and hard to verify.
If you use local variables instead of class members to pass information, these two functions no longer need to depend on any class, and they are easier to understand and less error-prone:
String findX() {
    ...
    String x = ...;
    return x;
}
void foo() {
    String x = findX();
    print(x);
}
Some people think that writing a lot of comments makes code more readable, only to find that the opposite is true. Not only do the comments fail to make the code understandable, but a program crowded with comments becomes hard to read. Moreover, once the logic of the code changes, many comments become outdated and need to be updated. Updating comments is a considerable burden, so a large volume of comments becomes an obstacle to improving the code.
In fact, truly elegant and readable code hardly needs comments. If you find that you need to write a lot of comments, your code is probably obscure and its logic unclear. Programming languages are actually more powerful and rigorous than natural language, and they contain the essential elements of natural language: subject, predicate, object, noun, verb, if, then, otherwise, yes, no, ... So if you make full use of the expressive power of the programming language, the program itself can say what it is doing, with no help from natural language.
Occasionally you may have to do something counterintuitive to work around a design problem in some other code. In that case, you can use a short comment to explain why you wrote it that way. This should be rare; otherwise it means the design of the whole codebase has a problem.
If you do not make good use of what the programming language offers, you will find that the program is still hard to understand and that you do need comments. So here are some points that may greatly reduce the need to write them:
Use meaningful function and variable names. If your function and variable names effectively describe their logic, you do not need comments to explain what the code does. For example:
// put elephant1 into fridge2
put(elephant1, fridge2);
Since my function name put, together with the two meaningful variable names elephant1 and fridge2, already explains what the code does (put the elephant in the refrigerator), the comment above is completely unnecessary.
Local variables should be as close as possible to where they are used. Some people like to define many local variables at the beginning of a function and then use them far below, like this:
void foo() {
    int index = ...;
    ...
    ...
    bar(index);
    ...
}
Since index is not used in between, and the data it depends on does not change, this variable definition can actually be moved closer to where it is used:
void foo() {
    ...
    ...
    int index = ...;
    bar(index);
    ...
}
This way, a reader who sees bar(index) can see how index is computed without looking far up. The short distance also reinforces the reader's sense of the "order of computation" here. Otherwise, if index sits at the top, readers may suspect that it holds data that will change, or that it is modified somewhere later. If index is placed near the bottom, readers can see clearly that it does not hold a changing value and is not modified after it is computed.
Once you see local variables for what they are, wires in a circuit, you can better understand the benefit of proximity. The closer a variable's definition is to its use, the shorter the wire. You do not have to take hold of a wire and follow it round and round to find the port that receives it, so the circuit is easier to understand.
Local variable names should be short. This seems to conflict with the first point: how can a short variable name be meaningful? Note that I am talking about local variables here. Because they are local, and because the second point has already placed them as close as possible to where they are used, their meaning is easy to infer from context.
For example, suppose you have a local variable that indicates whether an operation succeeded:
boolean successInDeleteFile = deleteFile("foo.txt");
if (successInDeleteFile) {
    ...
} else {
    ...
}
This local variable, successInDeleteFile, does not need to be so verbose. Because it is used only once, on the very next line, the reader can easily see that it is the result returned by deleteFile. If you rename it success, the reader will still know from the context that it means "success in deleteFile". So you can change it to this:
boolean success = deleteFile("foo.txt");
if (success) {
    ...
} else {
    ...
}
This way of writing loses no useful semantic information and is easier to read. CamelCase names like successInDeleteFile become an eyesore once more than three words are strung together, so if one word expresses the same meaning, that is of course better.
Don't reuse local variables. Many people dislike defining new local variables and prefer to "reuse" the same one, assigning to it repeatedly to mean completely different things. For example:
String msg;
if (...) {
    msg = "succeed";
    log.info(msg);
} else {
    msg = "failed";
    log.info(msg);
}
Although this is logically fine, it is harder to understand and easy to get confused by. The variable msg is assigned twice, representing two completely different values. Each value is used immediately by log.info and never passed anywhere else. Writing it this way needlessly widens the scope of the local variable, making readers think it might change later or be used elsewhere. A better approach is to define two variables:
if (...) {
    String msg = "succeed";
    log.info(msg);
} else {
    String msg = "failed";
    log.info(msg);
}
Since the scope of each of these two msg variables is limited to the if branch in which it is defined, you can clearly see the range over which each msg is used, and you know that there is no relationship between them.
Extract complex logic into helper functions. Some people write functions so long that it is hard to tell what the statements inside are doing, so they mistakenly conclude that they need comments. If you look closely, you will find that the unclear code can often be extracted into a function and called from the original place. Since functions have names, a meaningful function name can take the place of the comment. Here is an example:
...
// put elephant1 into fridge2
openDoor(fridge2);
if (elephant1.alive()) {
    ...
} else {
    ...
}
closeDoor(fridge2);
...
If you extract this code and define it as a function:
void put(Elephant elephant, Fridge fridge) {
    openDoor(fridge);
    if (elephant.alive()) {
        ...
    } else {
        ...
    }
    closeDoor(fridge);
}
Then the original code can be changed to:
...
put(elephant1, fridge2);
...
This is much clearer, and the comment is no longer needed.
Extract complex expressions into intermediate variables. Some people hear that "functional programming" is a good thing without understanding what it really means, so they fill their code with deeply nested function calls, like this:
Pizza pizza = makePizza(crust(salt(), butter()),
                        topping(onion(), tomato(), sausage()));
This code is too long and nests too many function calls to read easily. Well-trained functional programmers know the benefits of intermediate variables and do not nest calls blindly. They would turn this code into something like this:
Crust crust = crust(salt(), butter());
Topping topping = topping(onion(), tomato(), sausage());
Pizza pizza = makePizza(crust, topping);
Writing it this way keeps each line short, and because the intermediate variables carry "meaning", the steps become clear and easy to understand.
Break lines where it makes sense. In most programming languages, the logic of the code is independent of whitespace, so you can break lines almost anywhere, or not at all. This language design is a good thing because it gives programmers the freedom to control the format of their own code. However, it also causes problems, because many people do not know how to break lines sensibly.
Some people rely on the automatic line wrapping in their IDE. After editing, they press a hotkey to reformat the whole file, and the IDE automatically wraps any code that exceeds the line width limit. However, this automatic wrapping is often not based on the logic of the code and does nothing to aid understanding. It may produce code like this:
if (someLongCondition1() && someLongCondition2() && someLongCondition3() &&
    someLongCondition4()) {
    ...
}
Because someLongCondition4() exceeds the line width limit, the editor automatically wraps it onto the next line. The line width limit is met, but the position of the break is quite arbitrary and does not help anyone understand the logic. These boolean expressions are all joined by &&, so they stand on an equal footing. To express this, when you need to wrap, put each expression on its own line, like this:
if (someLongCondition1() &&
    someLongCondition2() &&
    someLongCondition3() &&
    someLongCondition4()) {
    ...
}
Now each condition is aligned and the logic is clear. Here is another example:
log.info("failed to find file {} for command {}, with exception {}", file, command,
         exception);
This line is too long, so it is automatically wrapped like this. file, command, and exception are the same kind of thing, yet two of them are left on the first line and the last is wrapped onto the second. It is better to break the line manually like this:
log.info("failed to find file {} for command {}, with exception {}",
         file, command, exception);
Putting the format string on one line and its arguments on another makes the logic clearer.
To keep the IDE from messing up these manual line breaks, many IDEs (such as IntelliJ) have an option along the lines of "keep original line breaks" in their automatic formatting settings. If you find that the IDE's line breaks do not follow the logic of the code, you can change these settings and keep your own manual line breaks where they matter.
At this point I must warn you that "no comments needed, let the code explain itself" does not mean the code should read like natural language. There is a JavaScript testing tool called Chai that lets you write code like this:
expect(foo).to.be.a('string');
expect(foo).to.equal('bar');
expect(foo).to.have.length(3);
expect(tea).to.have.property('flavors').with.length(3);
This is an extremely wrong approach. Programming languages are inherently simpler and clearer than natural languages; writing code this way makes it resemble natural language and therefore makes it more complex and harder to understand.
Programming languages like to innovate and offer all sorts of "features", but not all of these features are good. Many cannot stand the test of time and end up causing more trouble than they solve. Many people blindly pursue "shortness" and "conciseness", or, to show how clever they are and how quickly they learn, use special constructs in the language to write overly "clever" code that is hard to understand.
You do not have to use everything a language provides. In fact, you need only a small part of it to write excellent code. I have always been against "making full use of" every feature of a programming language. I keep a set of the best constructs in mind, and no matter how "magical" or "new" a language's features are, I basically use only the ones that have been tried and tested and that I consider trustworthy.
Now, for some problematic language features, I will introduce coding rules that I follow myself and explain why they make code simpler.
Avoid increment and decrement expressions (i++, ++i, i--, --i). These expressions are really a design mistake left over from history. Their meaning is odd and very easy to get wrong. They entangle two completely different operations, reading and writing, and muddle the semantics. The result of an expression containing them may depend on the order of evaluation, so code may run correctly under one compiler and fail strangely under another.
In fact, each of these expressions can be decomposed into two steps that separate reading from writing: one step updates the value of i, and the other uses the value of i. For example, if you want to write foo(i++), you can split it into int t = i; i += 1; foo(t); and if you want to write foo(++i), you can split it into i += 1; foo(i); instead. The split code means the same thing but is much clearer. Whether the update happens before or after the value is read is obvious at a glance.
Some people may think that i++ or ++i is more efficient than the split version, but this is an illusion. After basic compiler optimization, the generated machine code is indistinguishable. Increment and decrement expressions can be used safely in only two cases. One is the update part of a for loop, for example for(int i = 0; i < 5; i++). The other is as a statement on a line of its own, for example i++; with nothing else around it. These two cases are completely unambiguous. Avoid all other uses, such as inside larger expressions like foo(i++) or foo(++i) + foo(i). Nobody should have to know, or work out, what these mean.
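The decomposition described above can be checked with a small Java sketch. The class IncrementSplit and its method names are hypothetical, and foo here is just a stand-in that returns its argument so we can observe which value was passed. Each method returns a pair: the value that reached foo and the final value of i.

```java
import java.util.Arrays;

class IncrementSplit {
    // Hypothetical stand-in for foo(): returns its argument so the caller
    // can see exactly which value was passed in.
    static int foo(int n) {
        return n;
    }

    // Compact form: the read and the write of i are tangled in one expression.
    static int[] postCompact(int i) {
        int passed = foo(i++);
        return new int[] {passed, i};
    }

    // Split form of foo(i++): save the old value, update i, use the old value.
    static int[] postSplit(int i) {
        int t = i;
        i += 1;
        int passed = foo(t);
        return new int[] {passed, i};
    }

    // Compact pre-increment form.
    static int[] preCompact(int i) {
        int passed = foo(++i);
        return new int[] {passed, i};
    }

    // Split form of foo(++i): update i first, then use the new value.
    static int[] preSplit(int i) {
        i += 1;
        int passed = foo(i);
        return new int[] {passed, i};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(postCompact(5)) + " " + Arrays.toString(postSplit(5)));
        System.out.println(Arrays.toString(preCompact(5)) + " " + Arrays.toString(preSplit(5)));
    }
}
```

In each pair the compact and split forms produce the same result; the split form simply makes the order of the read and the write visible in the source.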
Never omit curly braces. Many languages allow you to omit curly braces in certain situations. For example, C and Java let you omit them when the body of an if statement is a single statement:
if (...)
    action1();
At first glance you save two characters, which seems great. But this often causes strange problems. For example, suppose you later want to add a statement action2() to this if, so you change the code to:
if (...)
    action1();
    action2();
For the sake of appearance, you carefully give it the same indentation as action1(). At first glance they belong together, so you subconsciously assume both run only when the if condition is true; in fact, action2() is outside the if and runs unconditionally. I call this phenomenon an "optical illusion". In theory every programmer should spot this error, but in practice it is easily overlooked.
Then you ask: who would be so careless? Can't I just add the curly braces when I add action2()? But from a design perspective this is not a reasonable approach. First, you may want to remove action2() later, and then you would have to remove the braces again to keep the style consistent; isn't that tedious? Second, it makes the code style inconsistent: some ifs have braces and some do not. Besides, why should you have to remember this rule at all? If you simply put braces on every if-else statement, no questions asked, you never have to think about it, as if C and Java had never offered the special form. That way you stay completely consistent and avoid unnecessary thinking.
Some people may say that it is unsightly to put curly braces around everything, even just one sentence. However, after practicing this coding standard for a few years, I did not find it more unsightly. On the contrary, the existence of curly braces made the code boundaries clear and less burdensome for my eyes.
有些人可能会说,把所有东西,哪怕只是一句话,都加上大括号,有碍观瞻。然而,在实践了几年这种编码标准后,我并没有觉得它更难看。恰恰相反,大括号的存在使代码边界清晰,减轻了我的负担。
Use parentheses wisely and don't rely blindly on operator precedence. Relying on operator precedence to save parentheses is fine for common arithmetic expressions such as 1 + 2 * 3. However, some people dislike parentheses so much that they write expressions like 2 << 7 - 2 * 3 with no parentheses at all.
明智地使用括号,不要盲目依赖运算符优先级。对于 1 + 2 * 3 这样的普通算术表达式,利用运算符优先级来减少括号是没有问题的。但是,有些人非常讨厌括号,以至于写出 2 << 7 - 2 * 3 这样的表达式,完全不加括号。
The problem is that the precedence of the shift operator << is unfamiliar to many people and runs against intuition. Since x << 1 is equivalent to multiplying x by 2, many people assume this expression means (2 << 7) - (2 * 3), which equals 250. In fact << has lower precedence than + and -, so the expression actually means 2 << (7 - 2 * 3), which equals 4!
这里的问题在于,移位运算 << 的优先级是很多人不熟悉的,也是违反常理的。因为 x << 1 相当于 x 乘以 2,所以很多人误以为这个表达式相当于 (2 << 7) - (2 * 3),所以等于 250。但实际上 << 的优先级比加法 + 和减法 - 都低,所以这个表达式实际上等价于 2 << (7 - 2 * 3),所以它等于 4!
The solution is not to make everyone memorize the operator precedence table, but to add parentheses where they help. In the example above, it is better to write 2 << (7 - 2 * 3) directly. The meaning is the same as without parentheses, but the parentheses make it clearer, and readers no longer need to remember the precedence of << to understand the code.
解决这个问题的办法不是要求大家背诵运算符优先级表,而是适当添加括号。例如,在上面的例子中,最好直接写成带括号的 2 << (7 - 2 * 3)。虽然不加括号意思是一样的,但加上括号后就更清楚了,读者也不必再死记硬背 << 的优先级来理解代码了。
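The arithmetic can be checked with a few lines of Java, where << also binds more loosely than + and -. This is only a sketch; the class and method names are made up for illustration:

```java
final class ShiftPrecedence {
  // Parsed as 2 << (7 - 2 * 3), that is 2 << 1.
  static int withoutParens() {
    return 2 << 7 - 2 * 3;
  }

  // What many readers assume the expression means: 256 - 6.
  static int asManyExpect() {
    return (2 << 7) - (2 * 3);
  }

  // The recommended spelling: same value as withoutParens, nothing to recall.
  static int explicit() {
    return 2 << (7 - 2 * 3);
  }
}
```

Here withoutParens() and explicit() both yield 4, while asManyExpect() yields 250.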
Avoid using continue and break. It is fine to have return in a loop statement (for, while), but if you use continue or break, it will complicate the logic and termination conditions of the loop and make it difficult to ensure correctness.
避免使用 continue 和 break。在循环语句(for、while)中使用 return 可以,但如果使用 continue 或 break,就会使循环的逻辑和终止条件复杂化,难以确保正确性。
The reason for the appearance of continue or break is often that you have not thought through the logic of the loop. If you have thought it through, you should hardly need continue or break. If continue or break appears in your loop, you should consider rewriting the loop. There are many ways to rewrite the loop:
出现 continue 或 break 的原因往往是你没有考虑清楚循环的逻辑。如果考虑周全,就几乎不需要 continue 或 break。如果您的循环中出现 continue 或 break,您应该考虑重写循环。重写循环的方法有很多:
Let me give you some examples of these situations.
让我举例说明这些情况。
Case 1: There is a continue in the following code:
情况 1:下面的代码中有一个 continue:
List<String> goodNames = new ArrayList<>();
for (String name: names) {
  if (name.contains("bad")) {
    continue;
  }
  goodNames.add(name);
  ...
}
It says: "If name contains the word 'bad', skip the following loop code..." Note that this is a "negative" description. It is not telling you when to "do" something, but telling you when to "not do" something. In order to know what it is doing, you must figure out which statements will be skipped by continue, and then reverse the logic in your mind to know what it is trying to do. This is why loops with continue and break are not easy to understand. They rely on "control flow" to describe "what not to do" and "what to skip", but in the end you still don't know what it is going to do.
它说:"如果名称中包含'坏'字,则跳过下面的循环代码......"。请注意,这是一个 "否定 "描述。它不是告诉你何时 "做 "某事,而是告诉你何时 "不做 "某事。为了知道它在做什么,你必须弄清楚哪些语句会被 continue 跳过,然后在脑海中逆转逻辑,才能知道它想做什么。这就是带有 continue 和 break 的循环不容易理解的原因。它们依靠 "控制流 "来描述 "不做什么 "和 "跳过什么",但最终你还是不知道它要做什么。
In fact, we only need to reverse the continue condition, and this code can be easily converted into an equivalent code without continue:
事实上,我们只需要反转 continue 条件,就可以很容易地将这段代码转换成没有 continue 条件的等价代码:
List<String> goodNames = new ArrayList<>();
for (String name: names) {
  if (!name.contains("bad")) {
    goodNames.add(name);
    ...
  }
}
goodNames.add(name); and all the code after it now sit inside the if, with one extra level of indentation, but the continue is gone. Read the code again and you will find it clearer, because it is a more "positive" description. It says: "When name does not contain the word 'bad', add it to the list goodNames..."
goodNames.add(name); 及其之后的所有代码都放进了 if 内,并多了一层缩进,但 continue 消失了。如果你再读一遍这段代码,你会发现它更清晰了,因为这是一个更 "正面" 的描述。它说:"当 name 不包含'bad'这个词时,将其添加到 goodNames 列表中......"
Case 2: Both for and while have a "termination condition" in the loop header, which should be the only exit condition of the loop. If you have a break in the middle of the loop, it actually adds an exit condition to the loop. You can often just merge this condition into the loop header to remove the break.
情况 2:for 和 while 在循环头中都有一个 "终止条件",它应该是循环的唯一退出条件。如果在循环中间有一个 break,实际上就给循环增加了一个退出条件。通常只需将这个条件合并到循环头中,就可以去掉 break。
For example, the following code:
例如,以下代码
while (condition1) {
  ...
  if (condition2) {
    break;
  }
}
When condition2 is true, break exits the loop. In fact, you only need to negate condition2 and put it into the termination condition in the while header to remove the break statement. The rewritten code is as follows:
当 condition2 为真时,break 会退出循环。实际上,只需将 condition2 取反,放进 while 头部的终止条件中,就可以去掉这个 break 语句。重写后的代码如下:
while (condition1 && !condition2) {
...
}
On the surface, this situation seems to only apply when break appears at the beginning or end of the loop, but in fact, most of the time, break can be moved to the beginning or end of the loop in some way. I don't have a specific example yet, and I will add it when it appears.
从表面上看,这种情况似乎只适用于 break 出现在循环开头或末尾的时候,但实际上,大多数情况下,break 都可以通过某种方式移到循环的开头或末尾。我现在还没有具体的例子,等有了我会再补充的。
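As one small illustration of merging a break into the loop header (a sketch that is not from the original text; sumWithBreak and sumWithoutBreak are hypothetical names), consider summing array elements up to the first negative one:

```java
final class LoopHeaderDemo {
  // Version with break: the loop has two exits, one in the header
  // and one hidden in the body.
  static int sumWithBreak(int[] values) {
    int sum = 0;
    int i = 0;
    while (i < values.length) {
      if (values[i] < 0) {
        break;
      }
      sum += values[i];
      i++;
    }
    return sum;
  }

  // Same behavior with the negated break condition merged into the header,
  // so the header states the only exit condition.
  static int sumWithoutBreak(int[] values) {
    int sum = 0;
    int i = 0;
    while (i < values.length && values[i] >= 0) {
      sum += values[i];
      i++;
    }
    return sum;
  }
}
```

The rewrite is a direct equivalent here because the break sits at the very start of the loop body; a break elsewhere in the body may first need the surrounding statements rearranged.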
Case 3: Often, after a break exits the loop, the very next thing that happens is a return. A break like this can usually be replaced directly with a return. Take the following example:
情况 3:很多 break 退出循环之后,接下来其实就是一个 return。这种 break 通常可以直接用 return 代替。例如下面的例子:
public boolean hasBadName(List<String> names) {
  boolean result = false;
  for (String name: names) {
    if (name.contains("bad")) {
      result = true;
      break;
    }
  }
  return result;
}
This function checks if there is a name in the names list that contains the word "bad". Its loop contains a break statement. This function can be rewritten as:
该函数检查姓名列表中是否有包含 "bad "的姓名。其循环包含一个 break 语句。该函数可重写为
public boolean hasBadName(List<String> names) {
  for (String name: names) {
    if (name.contains("bad")) {
      return true;
    }
  }
  return false;
}
The improved code returns directly with return true when name contains "bad", instead of assigning to the result variable, breaking out, and returning at the end. If the loop finishes without returning, the function returns false, meaning no such name was found. Using return instead of break eliminates both the break statement and the result variable.
改进后的代码在 name 包含 "bad" 时直接用 return true 返回,而不是给 result 变量赋值、break 出去、最后再返回。如果循环结束时没有 return,则返回 false,表示没有找到这样的名字。使用 return 代替 break,break 语句和 result 变量就都不需要了。
I have seen many other examples of using continue and break, and almost without exception, they can be eliminated, and the transformed code becomes much clearer. My experience is that 99% of break and continue can be eliminated by replacing them with return statements or flipping the if condition. The remaining 1% contains complex logic, but it can also be eliminated by extracting a helper function. The modified code becomes easier to understand and easier to ensure correctness.
我还看过许多其他使用 continue 和 break 的示例,几乎无一例外,它们都可以被取消,转换后的代码也会变得更加清晰。我的经验是,99% 的 break 和 continue 都可以通过用返回语句或翻转 if 条件来消除。剩下的 1%包含复杂的逻辑,但也可以通过提取辅助函数来消除。修改后的代码更容易理解,也更容易确保正确性。
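For the harder cases, extracting a helper function might look like the following sketch. The grid example and the names containsNegative and describe are assumptions made for illustration. Leaving two nested loops at once would otherwise need a flag variable or a labeled break; once the search is a function of its own, a plain return does the job:

```java
final class GridSearch {
  // Helper extracted from a nested loop: return leaves both loops at once.
  static boolean containsNegative(int[][] grid) {
    for (int[] row : grid) {
      for (int cell : row) {
        if (cell < 0) {
          return true;
        }
      }
    }
    return false;
  }

  // The caller reads as a plain two-branch decision.
  static String describe(int[][] grid) {
    if (containsNegative(grid)) {
      return "has negative";
    } else {
      return "all non-negative";
    }
  }
}
```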
I have an important principle when writing code: if there is a more direct and clear way to write it, choose it, even if it looks longer and more clumsy, choose it. For example, there is a "clever" way to write the Unix command line:
在编写代码时,我有一个重要原则:如果有更直接、更清晰的编写方法,就选择它,即使它看起来更长、更笨拙,也要选择它。例如,Unix 命令行有一种 "聪明 "的写法:
command1 && command2 && command3
Because the shell's logical operator && is "short-circuiting", in a && b, if a is false (the command fails), b is never executed. This is why command2 runs only when command1 succeeds, and command3 runs only when command2 succeeds.
由于 Shell 语言的逻辑运算 a && b 具有 "短路" 特性,如果 a 等于 false,那么 b 就没有必要执行。这就是为什么只有当 command1 成功时才会执行 command2,而只有当 command2 成功时才会执行 command3 的原因。
command1 || command2 || command3
The || operator has a similar property. In the command line above, if command1 succeeds, command2 and command3 are not executed. If command1 fails and command2 succeeds, command3 is not executed.
操作符 || 也有类似的特点。在上面的命令行中,如果 command1 成功,则不会执行 command2 和 command3。如果 command1 失败而 command2 成功,则不会执行 command3。
This seems to be more clever and concise than using an if statement to determine failure, so some people borrow this approach and use it in their program code. For example, they might write code like this:
这似乎比使用 if 语句来判断失败更巧妙、更简洁,因此有些人借鉴了这种方法,并将其用于程序代码中。例如,他们可能会写这样的代码:
if (action1() || action2() && action3()) {
...
}
Can you tell what this code is trying to do? Under what conditions are action2 and action3 executed, and under what conditions are they not? If you think about it for a moment, you may work out what it does: "If action1 fails, execute action2; if action2 succeeds, execute action3". However, that meaning is not directly "mapped" onto the code. For example, which word in the code corresponds to "fails"? You can't find one, because it is buried in the semantics of ||. You need to know both the short-circuit behavior of || and the semantics of logical or to see that it says "If action1 fails...". Every time you see this line you have to work it out again, and that accumulated load is exhausting.
你能看出这段代码想要做什么吗?在什么情况下会执行 action2 和 action3,在什么情况下不会执行?也许你稍加思考,就会知道它在做什么:"如果 action1 失败,执行 action2,如果 action2 成功,执行 action3"。然而,这种语义并没有直接 "映射" 到这段代码上。例如,"失败" 这个词对应代码里的哪个字呢?你找不到它,因为它包含在 || 的语义中。你需要知道 || 的短路特性以及逻辑或的语义,才能知道它说的是 "如果 action1 失败......"。每次看到这行代码都需要思考一番,日积月累会让人非常疲惫。
In fact, this style abuses the short-circuit behavior of the logical operators && and ||. These two operators may skip the expression on their right-hand side, and the reason is machine efficiency, not to offer people this "clever" usage. They are meant purely as logical operations, not as replacements for if statements. They merely happen to achieve the effect of some if statements, and that is no reason to use them in place of if. Doing so makes the code obscure and hard to understand.
事实上,这种写法是滥用了逻辑运算 && 和 || 的短路特性。这两个运算符可能不会执行右侧的表达式,原因是为了提高机器执行的效率,而不是为人们提供这种 "巧妙" 的用法。这两个运算符的初衷只是作为逻辑运算,而不是用来代替 if 语句。换句话说,它们只是碰巧实现了某些 if 语句的效果,但你不应该因此用它们来替代 if 语句。如果这样做,会使代码变得晦涩难懂。
The above code can be written in a more clumsy way, which will be much clearer:
上述代码可以用更笨拙的方式编写,这样会更清晰:
if (!action1()) {
  if (action2()) {
    action3();
  }
}
What the code says here is obvious, without any thought: if action1() fails, execute action2(); if action2() succeeds, execute action3(). Do you see the one-to-one correspondence? if = if, ! = fails, ... You don't need to reason about logic to know what it says.
代码的意思显而易见,甚至无需思考:如果 action1() 失败,则执行 action2(),如果 action2() 成功,则执行 action3()。你看到这里的一一对应关系了吗?if = 如果,! = 失败,......你不需要进行逻辑推理就能知道它在说什么。
In the previous section, I mentioned that there are very few if statements with only one branch in the code I write. Most of the if statements I write have two branches, so a lot of my code looks like this:
在上一节中,我提到在我编写的代码中,只有一个分支的 if 语句非常少。我写的大多数 if 语句都有两个分支,所以我的很多代码看起来都是这样的:
if (...) {
  if (...) {
    ...
    return false;
  } else {
    return true;
  }
} else if (...) {
  ...
  return false;
} else {
  return true;
}
The purpose of this style is to handle every possible situation and avoid missing corner cases. The reason every if statement has two branches is this: if the condition is true, you do something; if it is not true, you should know what else to do. Whether or not your if statement has an else, you cannot escape this question and must think about it.
使用这种方法的目的是完美地处理所有可能的情况,避免遗漏边界情况(corner case)。每个 if 语句都有两个分支的原因是:如果 if 条件为真,你就做某件事;但如果 if 条件不为真,你也应该知道要做什么别的事。无论你的 if 语句是否有 else,你都无法逃避,必须思考这个问题。
Many people like to omit the else branch when writing if statements, because they feel some else branches contain repeated code. In my code, for example, both else branches are return true. To avoid the duplication, they drop the two else branches and write a single return true at the end. Control flow then automatically "falls through" any if without an else branch and reaches the final return true. Their code looks like this:
很多人在编写 if 语句时喜欢省略 else 分支,因为他们觉得有些 else 分支会重复代码。例如,在我的代码中,两个 else 分支都是 return true。为了避免重复,他们省略了这两个 else 分支,只在末尾使用一个 return true。这样,没有 else 分支的 if 语句,控制流就会自动 "掉下去",到达最后的 return true。他们的代码如下:
if (...) {
  if (...) {
    ...
    return false;
  }
} else if (...) {
  ...
  return false;
}
return true;
This style looks more concise and avoids repetition, but it invites oversights and loopholes. The nested if statements leave some else cases implicit, and it is hard to analyze those cases correctly by tracing the "control flow" of the statements. If the if conditions use logical operators such as && and ||, it is even harder to see whether all cases are covered.
这种写法看似更加简洁,避免了重复,但容易出现疏漏和漏洞。嵌套的 if 语句省略了一些 else 情况,依靠语句的 "控制流" 很难对 else 情况进行正确的分析和推理。如果 if 条件使用了 && 和 || 等逻辑运算,就更难判断是否涵盖了所有情况。
Every branch missed through negligence automatically "falls through" and returns an unexpected result. Even if you convinced yourself it was correct the first time you read it, each time you read the code again you cannot be sure it covers every situation and have to reason it through once more. This concise style brings repeated, heavy mental overhead. This is what is known as "spaghetti code": the logical branches of the program are not like a tree with distinct branches and leaves, but like noodles winding around each other.
所有因疏忽而遗漏的分支都会自动 "掉下去",并返回意想不到的结果。即使你在阅读一遍后确定它是正确的,但每次阅读这段代码时,你都无法确定它是否照顾到了所有情况,而不得不再次推理。这种简洁的写法会带来重复和沉重的精神负担。这就是所谓的 "面条代码",因为程序的逻辑分支并不像树一样有分明的枝叶,而是像面条一样绕来绕去。
Another way to omit the else branch is as follows:
省略 else 分支的另一种方法如下:
String s = "";
if (x < 5) {
s = "ok";
}
The person who wrote this code likes to think in terms of a "default value". s defaults to the empty string, and if x<5 it is changed (mutated) to "ok". The drawback is that when x<5 does not hold, you have to look upward to find out what the value of s is. And that is when you are lucky, because here s is defined just above. In much code of this kind, the initial value of s is some distance away from the condition, with other logic and assignments in between. Such code, with variables changed back and forth, is dizzying and error-prone.
写这段代码的人,脑子里喜欢使用一种 "默认值" 的做法。s 默认为空字符串,如果 x<5,则把它改变(mutate)成 "ok"。这种写法的缺点是,当 x<5 不成立时,你需要往上面看,才能知道 s 的值是什么。这还是你运气好的时候,因为 s 就在上方不远处。很多人写这种代码时,s 的初始值离判断语句有一定的距离,中间还可能插入一些其他的逻辑和赋值操作。这样的代码,把变量改来改去,让人眼花缭乱,容易出错。
Now compare it to how I write it:
现在对比一下我是怎么写的:
String s;
if (x < 5) {
s = "ok";
} else {
s = "";
}
This may look like a few extra characters, but it is much clearer, because we state explicitly what the value of s is when x<5 does not hold. It is right there: "" (the empty string). Notice that even though I use the assignment operator, I am not "changing" the value of s. s starts out with no value, and it never changes after it is assigned. This style is often called more "functional", because each variable is assigned only once.
这看似多打了一两个字,但却清晰得多。这是因为我们明确指出了 x<5 不成立时 s 的值是什么。它就摆在那里,是 ""(空字符串)。请注意,尽管我使用了赋值运算符,但我并没有 "改变" s 的值。s 一开始没有值,被赋值之后就再也没有变过。我的这种写法通常被称为更 "函数式",因为我只赋值一次。
If I forget to write the else branch, the Java compiler will not let me go. It will complain: "In a certain branch, s is not initialized." This forces me to clearly set the value of s under various conditions and not miss any case.
如果我忘了写 else 分支,Java 编译器就不会放过我。它会抱怨"在某个分支中,s 没有初始化"。这就迫使我在各种条件下清楚地设置 s 的值,而不能遗漏任何情况。
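The same compiler check extends to more than two cases. In the sketch below (the Grade class, the thresholds, and the use of final are illustrative assumptions, not part of the original text), the local variable is declared without a value, and each branch assigns it exactly once:

```java
final class Grade {
  static String label(int score) {
    final String label;  // no default value; every branch must assign it
    if (score >= 90) {
      label = "excellent";
    } else if (score >= 60) {
      label = "pass";
    } else {
      label = "fail";
    }
    // If the last else were removed, javac would reject this line because
    // label might not have been initialized on that path.
    return label;
  }
}
```

Marking the variable final is optional, but it also makes the compiler reject any later reassignment.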
Of course, since this case is relatively simple, you can also write it like this:
当然,由于这种情况相对简单,也可以这样写:
String s = x < 5 ? "ok" : "";
For more complex situations, I recommend writing it as an if statement.
对于更复杂的情况,我建议将其写成 if 语句。
Using an if statement with two branches is just one of the reasons why my code is impeccable. The idea of writing if statements in this way actually contains a general idea of making code reliable: exhaustively enumerate all situations without missing any.
使用带有两个分支的 if 语句只是我的代码无懈可击的原因之一。以这种方式编写 if 语句的想法实际上包含了使代码可靠的一般思路:详尽列举所有情况,不遗漏任何一种。
Most of the functions of a program are to process information. From a pile of complicated and ambiguous information, eliminate most of the "interference information" and find the one you need. Correctly reasoning about all "possibilities" is the core idea of writing impeccable code. In this section, I will talk about how to use this idea in error handling.
程序的大部分功能都是处理信息。从一堆复杂而模糊的信息中,排除大部分 "干扰信息",找到你需要的信息。正确推理所有 "可能性 "是编写无懈可击代码的核心思想。在本节中,我将谈谈如何在错误处理中使用这一思想。
Error handling is an old problem, but after decades, many people still don't understand it. The Unix system API manual generally tells you the possible return values and error messages. For example, the Linux read system call manual contains the following:
错误处理是一个老问题,但几十年过去了,很多人仍然不了解它。Unix 系统 API 手册一般会告诉你可能的返回值和错误信息。例如,Linux read 系统调用手册包含以下内容:
RETURN VALUE
On success, the number of bytes read is returned...
On error, -1 is returned, and errno is set appropriately.
ERRORS
EAGAIN, EBADF, EFAULT, EINTR, EINVAL, ...
Many beginners forget to check whether the return value of read is -1. They feel it is cumbersome to check the return value on every call to read, and nothing seems to go wrong when they skip it. This attitude is actually very dangerous. If the documentation says a function returns either a non-negative number giving the length of the data read, or -1, then you must handle that -1 in a corresponding, meaningful way. Never assume you can ignore this special return value just because it is only a "possibility". Code that misses any possible case can produce unexpected, catastrophic results.
很多初学者会忘记检查 read 的返回值是否为 -1,觉得每次调用 read 都要检查返回值很麻烦,而且不检查似乎也不会出什么事。这种想法其实非常危险。如果函数的返回值告诉你,它要么返回一个表示读取数据长度的非负数,要么返回 -1,那么你就必须对这个 -1 做相应的、有意义的处理。千万不要因为这只是一种 "可能性",就认为可以忽略这个特殊的返回值。如果代码漏掉了任何一种可能的情况,都可能产生意想不到的灾难性结果。
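Java has a close analogue. InputStream.read(byte[]) returns the number of bytes read, or -1 once the end of the stream is reached (I/O failures are reported through IOException instead). The following sketch handles both kinds of return value; the ReadAll class and the tiny buffer size are illustrative choices:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadAll {
  static byte[] readAll(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[4];  // deliberately tiny, to force several reads
    int n = in.read(buffer);
    while (n != -1) {             // -1 is the documented end-of-stream value
      out.write(buffer, 0, n);    // copy only the n bytes actually read
      n = in.read(buffer);
    }
    return out.toByteArray();
  }
}
```

Treating -1 as a length, for instance by passing it to write, would fail at run time, which is exactly the kind of missed case the paragraph above warns about.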
For Java, this is relatively convenient. If a Java function has a problem, it is usually indicated by an exception. You can think of the exception plus the original return value of the function as a "union type". For example:
对于 Java 来说,这相对比较方便。如果 Java 函数出现问题,通常会出现异常。你可以把异常加上函数的原始返回值看作一个 "联合类型"。例如
String foo() throws MyException {
...
}
Here MyException is an error return. You can think of this function as returning a union type: {String, MyException}. Any code that calls foo must handle MyException properly to ensure the program runs correctly. The union type is a fairly advanced kind of type that many languages do not provide directly (Typed Racket is one that does). I mention it here only to make the concept easier to explain. Once you have grasped the concept, you can in effect run a union type system in your head, and so write reliable code in ordinary languages.
这里的 MyException 是一个错误返回。你可以把这个函数看作是返回一个联合类型:{String, MyException}。任何调用 foo 的代码都必须正确处理 MyException,以确保程序的正确运行。联合类型是一种相当高级的类型,很多语言并不直接提供(Typed Racket 是提供这种类型的语言之一)。我在这里提到它,只是为了便于解释这一概念。在掌握了这个概念之后,你其实可以在头脑中实现一个联合类型系统,这样你就可以使用普通语言编写可靠的代码了。
Since Java's type system requires functions to declare the checked exceptions they may throw and forces callers to handle them, it is basically impossible to miss them through negligence. However, some Java programmers have a bad habit that makes this safety mechanism almost completely ineffective. Whenever the compiler reports an error saying "You did not catch the possible exception of this foo function", some people don't think at all and simply change the code to this:
由于 Java 的类型系统要求函数在其类型中声明可能抛出的受检异常(checked exception),并强制调用者处理它们,因此基本上不可能因疏忽而漏掉异常。然而,一些 Java 程序员有一个坏习惯,使得这种安全机制几乎完全失效。每当编译器报错说 "你没有捕获这个 foo 函数的可能异常" 时,有些人想都不想就直接把代码改成了这样:
try {
foo();
} catch (Exception e) {}
Or at most they put a log statement in it, or they simply add throws Exception to their own function's type so that the compiler stops complaining. These practices seem convenient, but they are all wrong, and you will pay for them in the end.
或者最多只在里面加一条日志,或者干脆在自己的函数类型上加上 throws Exception,这样编译器就不会再抱怨了。这些做法看似很方便,但都是错误的,最终你会为此付出代价。
If you catch the exception and ignore it, you won't know that foo actually failed. This is like driving at a junction and seeing a sign saying "Construction ahead, road closed", but still continuing to drive forward. This will of course go wrong sooner or later because you don't know what you are doing.
如果你捕捉到异常并忽略它,你就不会知道 foo 实际上失败了。这就好比在路口开车时,看到 "前方施工,道路封闭 "的标志,但仍然继续向前行驶。这当然迟早会出错,因为你不知道自己在做什么。
When catching exceptions, you should not use a broad type like Exception. You should catch exactly the type of exception A that might occur. Using a broad exception type is a big problem because it will catch other exceptions (such as B) inadvertently. Your code logic is based on judging whether A occurs, but you catch all exceptions (Exception class), so when other exception B occurs, your code will have inexplicable problems because you thought A occurred, but it didn't. This kind of bug is sometimes difficult to find even with a debugger.
捕获异常时,不应使用 Exception 这样的宽泛类型。应准确捕获可能发生的异常 A 类型。使用宽泛的异常类型是个大问题,因为它会在不经意间捕获其他异常(如 B)。你的代码逻辑是基于判断 A 是否发生,但你却捕获了所有异常(异常类),因此当其他异常 B 发生时,你的代码就会出现莫名其妙的问题,因为你以为 A 发生了,但其实没有。这种错误有时即使使用调试器也很难发现。
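A small sketch of catching exactly the expected type. The PortParser class and the fallback value 8080 are hypothetical; Integer.parseInt and NumberFormatException are the standard library's:

```java
final class PortParser {
  static final int DEFAULT_PORT = 8080;  // hypothetical fallback value

  static int parsePort(String text) {
    try {
      return Integer.parseInt(text.trim());
    } catch (NumberFormatException e) {
      // Only "the text is not a number" is handled here. Any other failure,
      // such as text being null, still surfaces as its own exception.
      return DEFAULT_PORT;
    }
  }
}
```

Had the handler been catch (Exception e), a null argument would quietly turn into port 8080, and the real bug at the call site would stay hidden.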
If you add throws Exception to the type of your function, the exception will inevitably have to be handled wherever the function is called. If the calling function also declares throws Exception, the problem spreads even further. My experience is to handle an exception as close as possible to where it occurs. Otherwise, if you pass it on to your caller, the caller may have no idea what to do with it.
如果在自己函数的类型上加上 throws Exception,那么就不可避免地需要在调用它的地方处理这个异常。如果调用它的函数也写上 throws Exception,问题就会进一步扩散。我的经验是,应该在异常出现的当下就尽量处理它。否则,如果把它返回给你的调用者,调用者可能根本不知道该怎么办。
In addition, a try { ... } catch should contain as little code as possible. For example, if foo and bar may both throw exception A, your code should be written as follows:
此外,try { ... } catch 里面应该包含尽可能少的代码。例如,如果 foo 和 bar 都可能产生异常 A,那么代码应编写如下:
try {
foo();
} catch (A e) {...}
try {
bar();
} catch (A e) {...}
Rather than 而不是
try {
foo();
bar();
} catch (A e) {...}
The first way of writing can clearly distinguish which function has a problem, while the second way of writing mixes everything together. Clearly distinguishing which function has a problem has many benefits. For example, if your catch code contains log, it can provide you with more precise error information, which will greatly speed up your debugging process.
第一种写法可以明确区分哪个功能出了问题,而第二种写法则把所有东西都混在一起。明确区分哪个函数出了问题有很多好处。例如,如果您的 catch 代码包含日志,它就能为您提供更精确的错误信息,从而大大加快您的调试过程。
The idea of exhaustive enumeration is so useful that based on this principle, we can derive some basic principles that will allow you to handle null pointers flawlessly.
穷举枚举的思想非常有用,基于这一原则,我们可以推导出一些基本原则,让你可以完美地处理空指针。
First of all, you should know that the type systems of many languages (C, C++, Java, C#, ...) are completely wrong in their handling of null. This mistake originated from Tony Hoare's earliest design. Hoare called it his "billion dollar mistake" because the financial and human losses it caused far exceeded one billion dollars.
首先,你应该知道许多语言(C、C++、Java、C#......)的类型系统在处理 null 时是完全错误的。这个错误源于 Tony Hoare 最早的设计。霍尔称这个错误为 "十亿美元的错误",因为它造成的经济和人员损失远远超过十亿美元。
The type systems of these languages allow null to appear in any place where an object (pointer) type can appear, but null is not a legal object at all. It is not a String, not an Integer, and not a custom class. The type of null should be NULL, that is, null itself. Based on this basic view, we derive the following principles:
这些语言的类型系统允许 null 出现在任何可以出现对象(指针)类型的地方,但 null 根本不是一个合法的对象。它不是字符串,不是整数,也不是自定义类。null 的类型应该是 NULL,即 null 本身。根据这一基本观点,我们得出以下原则:
Try not to generate null pointers. Try not to use null to initialize variables, and try not to return null from functions. If your function needs to return results such as "nothing" or "an error", try to use Java's exception mechanism. Although the writing method is a bit awkward, Java's exceptions, combined with the return value of the function, can basically be used as a union type. For example, if you have a function find that can help you find a String, or it may not find anything, you can write it like this:
尽量不要生成空指针。尽量不要使用 null 来初始化变量,也尽量不要从函数中返回 null。如果您的函数需要返回 "无 "或 "错误 "等结果,请尽量使用 Java 的异常机制。虽然编写方法有点笨拙,但 Java 的异常机制与函数的返回值相结合,基本上可以用作联合类型。例如,如果有一个函数 find 可以帮你找到一个字符串,也可能什么都找不到,你可以这样写:
public String find() throws NotFoundException {
  if (...) {
    return ...;
  } else {
    throw new NotFoundException();
  }
}
Java's type system forces you to catch the NotFoundException, so you can't miss this situation like you miss checking null. Java's exceptions are also something that is easy to abuse, but I've already told you how to use exceptions correctly in the previous section.
Java 的类型系统强制您捕获 NotFoundException,因此您不能错过这种情况,就像错过检查 null 一样。Java 的异常也很容易被滥用,但我在上一节已经告诉过你如何正确使用异常。
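A runnable variant of this idea might look like the sketch below. The Directory class, its phone-book map, and the message text are assumptions made for illustration; only the shape (a checked NotFoundException in place of a null return) follows the text:

```java
import java.util.Map;

final class NotFoundException extends Exception {
  NotFoundException(String key) {
    super("no entry for " + key);
  }
}

final class Directory {
  private final Map<String, String> phones;

  Directory(Map<String, String> phones) {
    this.phones = phones;
  }

  // Returns the phone number or throws; it never hands null to the caller.
  String find(String name) throws NotFoundException {
    String phone = phones.get(name);
    if (phone != null) {
      return phone;
    } else {
      throw new NotFoundException(name);
    }
  }

  // The compiler rejects this method unless NotFoundException is handled.
  String describe(String name) {
    try {
      return name + ": " + find(name);
    } catch (NotFoundException e) {
      return name + ": unknown";
    }
  }
}
```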
Java's try...catch syntax is rather cumbersome, so if you are careful enough, a function like find can also return null to mean "not found". This looks a little better, because the caller does not need a try...catch. Many people write functions that return null to mean "an error occurred", which is actually a misuse of null. "An error occurred" and "nothing" are two completely different things. "Nothing" is a common, perfectly normal situation, such as a lookup that finds no entry in a hash table. "An error occurred" is a rare situation in which a meaningful value should normally exist but something accidentally went wrong. If your function needs to signal "an error occurred", use an exception, not null.
Java 的 try...catch 语法相当繁琐和笨拙,因此如果足够小心,像 find 这样的函数也可以返回 null 表示 "未找到"。这样看起来更好一些,因为在调用时不必使用 try...catch。许多人编写返回 null 的函数来表示 "发生错误",这实际上是对 null 的误用。"发生错误" 和 "无" 实际上是完全不同的两件事。"无" 是一种非常常见和正常的情况,比如在哈希表中找不到东西,这是正常的。"发生错误" 表示一种罕见的情况,即正常情况下应该存在一个有意义的值,但意外出错了。如果你的函数想表示 "发生错误",则应使用异常而不是 null。
void foo() {
String found = find();
int len = found.length();
...
}
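Some people write functions like the foo above, which never checks whether found is null. When a call to foo throws an exception, they don't investigate; they just change the call to this:
有些人会写出像上面的 foo 这样的函数,里面没有检查 found 是否为 null。当调用 foo 出现异常时,他们也不去深究,只是把调用改成这样: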
try {
foo();
} catch (Exception e) {
...
}
This way, when found is null, the NullPointerException is caught and handled. This is actually a very wrong approach. First, as mentioned in the previous section, catch (Exception e) should be avoided because it catches every exception, including NullPointerException. You end up catching, by accident, a NullPointerException thrown inside the try statement, which muddles the logic of the code.
这样,当 found 为 null 时,NullPointerException 就会被捕获和处理。这实际上是一种非常错误的做法。首先,正如上一节所述,catch (Exception e) 这种写法应该避免,因为它会捕获包括 NullPointerException 在内的所有异常。这会导致你意外地捕获 try 语句里出现的 NullPointerException,从而扰乱代码的逻辑。
Moreover, even writing catch (NullPointerException e) is not acceptable. The NullPointerException occurs because foo does not check for null internally. If you don't fix the problem at its source and instead add a catch at every call site, your life will only become more miserable. The right fix is to change foo, not the code that calls it. foo should be changed to this:
此外,即使写成 catch (NullPointerException e) 也是不行的。出现 NullPointerException 的原因是 foo 内部没有进行 null 检查。现在你不对症下药,而是在每一个调用它的地方都加上 catch,今后你的生活会变得越来越悲惨。正确的做法应该是修改 foo,而不是修改调用它的代码。foo 应该改成这样:
void foo() {
  String found = find();
  if (found != null) {
    int len = found.length();
    ...
  } else {
    ...
  }
}
Wherever null may appear, check for it and handle it accordingly.
在 null 可能出现的当下就检查它是否为 null,然后进行相应处理。
Don't put null into "container data structures". A collection is a collection of objects that are grouped together in some way, so null should not be put into structures such as Array, List, Set, and should not appear in the key or value of a Map. Putting null into a container is the source of some inexplicable errors. Because the position of an object in a container is generally determined dynamically, once null runs into it from a certain entrance, it is difficult for you to figure out where it has gone, and you are forced to check null at all locations where values are taken from this container. It is also difficult for you to know who put it in, and too much code makes debugging extremely difficult.
不要将 null 放入 "容器数据结构"。集合是以某种方式组合在一起的对象的集合,因此不应将 null 放入 Array、List、Set 等结构中,也不应出现在 Map 的键或值中。将 null 放入容器中会导致一些莫名其妙的错误。因为对象在容器中的位置通常是动态确定的,一旦 null 从某个入口进入容器,你就很难知道它去了哪里,而且你不得不在从该容器取值的所有位置检查 null。你也很难知道是谁把它放进去的,而且过多的代码会给调试带来极大的困难。
The solution is: if you really want to represent "nothing", then you simply don't put it in (Array, List, Set has no elements, Map doesn't have that entry at all), or you can specify a special, truly legal object to represent "nothing".
解决方法是:如果你真的想表示 "无",那就干脆不放(Array、List、Set 没有元素,Map 根本没有这个条目),或者你可以指定一个特殊的、真正合法的对象来表示 "无"。
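The "simply don't put it in" option can be sketched as follows. The Nicknames class and its nicknameOf lookup are hypothetical; the point is only that the null is dealt with at the one place it enters, so the resulting list never contains it:

```java
import java.util.ArrayList;
import java.util.List;

final class Nicknames {
  // Hypothetical lookup that returns null to mean "this user has no nickname".
  static String nicknameOf(String user) {
    if (user.equals("bob")) {
      return "Bobby";
    } else {
      return null;
    }
  }

  // Collects nicknames; users without one contribute no element at all.
  static List<String> collect(List<String> users) {
    List<String> result = new ArrayList<>();
    for (String user : users) {
      String nick = nicknameOf(user);
      if (nick != null) {
        result.add(nick);
      }
      // "nothing" is represented by absence, never by a null element
    }
    return result;
  }
}
```

Code that later reads from the returned list can use every element without a null check.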
It should be pointed out that class objects do not belong to containers. So null can be used as the value of an object member when necessary to indicate that it does not exist. For example:
需要指出的是,类对象不属于容器。因此,必要时可以使用 null 作为对象成员的值,以表示该成员不存在。例如
class A {
String name = null;
...
}
This is possible because null can only appear in the name member of object A, and you don't have to suspect that other members will become null. So every time you access the name member, just check whether it is null, and you don't need to do the same check for other members.
之所以能做到这一点,是因为 null 只出现在对象 A 的 name 成员中,你不必怀疑其他成员也会变成 null。因此,每次访问名称成员时,只需检查它是否为空,而无需对其他成员进行同样的检查。
Function callers: Clearly understand the meaning of null, check and handle null return values as early as possible to reduce its spread. One annoying thing about null is that it can mean different things in different places. Sometimes it means "nothing", "not found". Sometimes it means "an error", "failure". Sometimes it can even mean "success", ... There are many misuses, but no matter what, you must understand the meaning of each null and not confuse them.
函数调用者:明确了解 null 的含义,尽早检查并处理 null 返回值,以减少其传播。null 的一个恼人之处在于,它在不同的地方有不同的含义。有时,它表示 "无"、"未找到"。有时它意味着 "错误"、"失败"。有时甚至表示 "成功",...错误的用法有很多,但无论如何,你都必须理解每个 null 的含义,不要混淆它们。
If the function you call may return null, you should handle null in a "meaningful" way as soon as possible. For example, the function find above returns null to indicate "not found", so the code that calls find should check whether the return value is null as soon as it returns, and handle the "not found" case meaningfully.
如果调用的函数可能返回 null,则应尽快对 null 进行 "有意义" 的处理。例如,上述函数 find 返回 null 表示 "未找到",因此调用 find 的代码应在返回后立即检查返回值是否为 null,并对 "未找到" 的情况做出有意义的处理。
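A small sketch of what handling null at the call site could look like. The find shown here is a hypothetical stand-in (a plain map lookup), and the fallback value "light" is an assumption made up for the example; the point is only that the caller decides what "not found" means and no null leaves the method:

```java
import java.util.HashMap;
import java.util.Map;

final class HandleNullEarly {
    // Hypothetical stand-in for find: returns null to mean "not found".
    static String find(Map<String, String> settings, String key) {
        return settings.get(key);
    }

    // The caller gives "not found" a concrete meaning right away
    // (fall back to a default), so it never returns null itself.
    static String themeOrDefault(Map<String, String> settings) {
        String found = find(settings, "theme");
        if (found == null) {
            return "light";
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        System.out.println(themeOrDefault(settings)); // prints light
        settings.put("theme", "dark");
        System.out.println(themeOrDefault(settings)); // prints dark
    }
}
```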
What does "meaningful" mean? I mean that the person using this function should know exactly what to do when he gets null and take responsibility. He should not just "report to his superiors" and pass the responsibility to his caller. If you violate this, you may adopt an irresponsible and dangerous writing style:
有意义 "是什么意思?我的意思是,使用该功能的人应该清楚地知道,当他收到空号时应该做什么,并承担责任。他不应该只是 "向上级报告",把责任推给调用者。如果你违反了这一点,你可能会采用一种不负责任和危险的写作风格:
public String foo() {
    String found = find();
    if (found == null) {
        return null;   // passes the null on to foo's caller unchanged
    }
    return found;
}
When foo sees that find() returned null, it returns null as well. In this way, null has moved from one place to another, and it now means something else. If you write code like this without thinking, the end result is that null may appear anywhere in the code at any time. To protect yourself, every function you write will end up looking like this:
当你看到 find() 返回空值时,foo 本身也返回空值。这样一来,null 就从一个地方转移到了另一个地方,而且它还意味着别的东西。如果你不假思索地编写这样的代码,最终的结果就是 null 随时都可能出现在代码的任何地方。为了保护自己,你的每个函数都要这样写:
public void foo(A a, B b, C c) {
    if (a == null) { ... }
    if (b == null) { ... }
    if (c == null) { ... }
    ...
}
Function author: explicitly declare that null parameters are not accepted, and crash immediately when the parameter is null. Do not try to "tolerate" null and do not let the program continue to execute. If the caller uses null as a parameter, then the caller (not the function author) should be fully responsible for the program crash.
函数作者:明确声明不接受空参数,并在参数为空时立即崩溃。不要试图 "容忍 "null,也不要让程序继续执行。如果调用者使用 null 作为参数,那么调用者(而不是函数编写者)应对程序崩溃负全部责任。
The reason why the above example becomes a problem is that people are "tolerant" of null. This "protective" writing style tries to be "fault-tolerant" and "handle null gracefully", but the result is that callers pass null to your function more recklessly. Later, there are a lot of nonsense in your code, null can appear anywhere, and no one knows where it comes from. No one knows what null means or what to do, and everyone kicks null to others. In the end, null spreads like a plague, everywhere, and becomes a nightmare.
上述例子之所以成为问题,是因为人们对 null 的 "容忍"。这种 "保护性 "写作风格试图做到 "容错 "和 "优雅地处理 null",但结果却是调用者更加肆无忌惮地将 null 传递给你的函数。之后,你的代码中就会出现大量废话,null 可以出现在任何地方,而且没人知道它从何而来。没有人知道 null 意味着什么或该怎么做,每个人都把 null 踢给别人。最后,null 就像瘟疫一样四处蔓延,成为一场噩梦。
The correct approach is actually to take a tough attitude. You should tell the user of the function that none of my parameters can be null. If you give me null, you will be responsible for the program crash. As for what to do if there is null in the caller's code, he should know how to deal with it (refer to the above points), and the function author should not worry about it.
正确的做法其实是采取强硬的态度。你应该告诉函数的用户,我的参数都不能为空。如果你给了我 null,你将对程序崩溃负责。至于调用者的代码中出现 null 时该怎么办,他应该知道如何处理(参考上述要点),而函数作者则不必担心。
A simple way to take this hard line is to use Objects.requireNonNull(). Its definition is simple:
采取强硬做法的一个简单方法是使用 Objects.requireNonNull(),它的定义很简单:
public static <T> T requireNonNull(T obj) {
    if (obj == null) {
        throw new NullPointerException();
    } else {
        return obj;
    }
}
You can use this function to check every parameter for which you don't want to accept null. As soon as a null is passed in, the program crashes immediately with a NullPointerException. In this way, you can effectively prevent null pointers from being passed on to other places unnoticed.
你可以用这个函数来检查每一个你不想接受 null 的参数。只要传入的参数为 null,程序就会立即因 NullPointerException 而崩溃。通过这种方式,你可以有效地防止空指针在不知不觉中被传递到其他地方。
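For instance, a constructor might guard its parameter like this. The Account class is hypothetical, and the two-argument overload of Objects.requireNonNull is used only to attach a message to the exception:

```java
import java.util.Objects;

final class Account {
    private final String owner;

    Account(String owner) {
        // Fail at the boundary: a null owner crashes here, at the caller's
        // mistake, instead of surfacing somewhere far away later on.
        this.owner = Objects.requireNonNull(owner, "owner must not be null");
    }

    String greeting() {
        return "Hello, " + owner;
    }

    public static void main(String[] args) {
        System.out.println(new Account("Ann").greeting()); // prints Hello, Ann
        try {
            new Account(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The try/catch in main exists only to show the failure in a demo; ordinary code would let the exception propagate so the crash points at the offending caller.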
Use @NotNull and @Nullable annotations. IntelliJ provides two annotations, @NotNull and @Nullable. Placing them before a type prevents null pointers in a concise and reliable way. IntelliJ performs static analysis on code that carries these annotations and points out where a NullPointerException may occur at runtime. At runtime, an IllegalArgumentException is raised wherever a null pointer appears where it should not, even if you never dereference that null pointer. In this way, you can discover and prevent null pointers as early as possible.
使用 @NotNull 和 @Nullable 注解。IntelliJ 提供了两个注解:@NotNull 和 @Nullable。将它们添加到类型之前,可以简洁可靠地防止出现空指针。IntelliJ 本身会对包含这些注解的代码进行静态分析,并指出运行时可能出现 NullPointerException 的地方。在运行时,只要 null 出现在不应出现的地方,就会抛出 IllegalArgumentException,即使你从未解引用过该空指针。这样,你就能尽早发现并防止空指针的出现。
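To show where the annotations go, here is a sketch that declares its own stand-in @NotNull and @Nullable so that it compiles without any extra library. These stand-ins are an assumption made for illustration: they have no effect by themselves, and the static analysis and runtime checks described above come from the IDE and its own annotations, not from this code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

final class NullabilityAnnotations {
    // Stand-in annotations (hypothetical, no behavior of their own); a real
    // project would import the ones shipped with its tooling instead.
    @Target({ElementType.PARAMETER, ElementType.METHOD})
    @interface NotNull {}

    @Target({ElementType.PARAMETER, ElementType.METHOD})
    @interface Nullable {}

    // May return null, so callers are expected to check the result.
    @Nullable
    static String find(@NotNull String key) {
        return key.equals("answer") ? "42" : null;
    }

    // Promises never to return null and refuses to accept null.
    @NotNull
    static String findOrEmpty(@NotNull String key) {
        String found = find(key);
        return found == null ? "" : found;
    }

    public static void main(String[] args) {
        System.out.println(findOrEmpty("answer"));            // prints 42
        System.out.println(findOrEmpty("missing").isEmpty()); // prints true
    }
}
```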
Use the Optional type. Languages like Java 8 and Swift provide a type called Optional. Using this type correctly can largely avoid the problem of null. The problem of null pointers exists because you can "access" members of an object without "checking" for null.
使用可选类型。Java 8 和 Swift 等语言提供了一种名为 Optional 的类型。正确使用这种类型可以在很大程度上避免 null 问题。存在空指针问题的原因是,您可以 "访问 "对象的成员,而无需 "检查 "是否为空。
The design principle of the Optional type is to combine the two operations of "checking" and "accessing" into one, making it an "atomic operation". In this way, you cannot just access without checking. This approach is actually a special case of pattern matching in languages such as ML and Haskell. Pattern matching combines the two operations of type judgment and member access into one, so you can't make mistakes.
可选类型的设计原则是将 "检查 "和 "访问 "两种操作合二为一,使其成为 "原子操作"。这样,就不能只进行访问而不进行检查。这种方法实际上是 ML 和 Haskell 等语言中模式匹配的一种特例。模式匹配将类型判断和成员访问这两种操作合二为一,因此不会出错。
For example, in Swift, you can write:
例如,在 Swift 中可以这样写
let found = find()
if let content = found {
    print("found: " + content)
}
You get a value of type Optional, found, from the function find(). Assuming its type is String?, the question mark means it may contain a String or nil. You can then use a special if statement to perform the nil check and access the contents at the same time. This if statement differs from an ordinary one: its condition is not a Bool but a variable binding, let content = found.
你从函数 find() 中获得了一个 Optional 类型的值 found。假设它的类型是 String?,问号表示它可能包含一个 String 或 nil。然后可以使用一种特殊的 if 语句,同时执行 nil 检查并访问其内容。这条 if 语句不同于普通的 if 语句,它的条件不是一个 Bool,而是一个变量绑定 let content = found。
I don't really like this syntax, but what the statement means is: if found is nil, the entire if statement is skipped. If it is not nil, the variable content is bound to the value inside found (the unwrap operation), and then print("found: " + content) is executed. Because this form combines the check and the access, you cannot access without checking.
我不太喜欢这种语法,但这条语句的意思是:如果 found 为 nil,则跳过整个 if 语句。如果不是 nil,那么变量 content 就会绑定到 found 中的值(解包操作),然后执行 print("found: " + content)。因为这种写法把检查和访问合在一起,所以不能只访问而不检查。
Java 8's approach is a bit clumsy. If you get a value found of type Optional<String>, you have to use the "functional programming" approach and write the following code:
Java 8 的方法有点笨拙。如果得到一个类型为 Optional<String> 的值 found,就必须使用 "函数式编程" 的方式编写以下代码:
Optional<String> found = find();
found.ifPresent(content -> System.out.println("found: " + content));
This Java code is equivalent to the Swift code above. It contains a "judgment" and a "value" operation. ifPresent first determines whether found has a value (equivalent to determining whether it is null). If so, its content is "bound" to the content parameter of the lambda expression (unwrap operation), and then the content in the lambda is executed. Otherwise, if found has no content, the lambda in ifPresent is not executed.
这段 Java 代码等同于上面的 Swift 代码。ifPresent 首先确定 found 是否有值(相当于确定它是否为空)。如果有,则将其内容 "绑定 "到 lambda 表达式的内容参数(解包操作),然后执行 lambda 中的内容。否则,如果 found 没有内容,则不执行 ifPresent 中的 lambda。
There is a problem with this design in Java. After the null check, everything in the branch must be written inside the lambda. In functional programming this lambda is called a "continuation", and Java calls it a "Consumer"; it means "if found is not null, take its value and then do something with it". Since the lambda is a function in its own right, you cannot write a return statement inside it to return from the outer function. For example, suppose you want to rewrite the following function (which uses null):
Java 的这种设计存在一个问题。在判断 null 之后,分支中的所有内容都必须写在 lambda 里。在函数式编程中,这个 lambda 被称为 "continuation",而 Java 将其称为 "Consumer",意思是 "如果 found 不是 null,取出它的值,然后做某事"。由于 lambda 本身是一个函数,因此不能在其中写 return 语句来从外层函数返回。例如,如果要重写下面的函数(包含 null):
public static String foo() {
    String found = find();
    if (found != null) {
        return found;
    } else {
        return "";
    }
}
It will be more troublesome. Because if you write like this:
这样会更麻烦。因为如果你这样写
public static String foo() {
    Optional<String> found = find();
    found.ifPresent(content -> {
        return content;    // can't return from foo here
    });
    return "";
}
The statement return content cannot return from the function foo. It can only return from the lambda, and because the lambda's return type (that of Consumer.accept) must be void, the compiler reports an error saying that you returned a String. Since local variables captured by a Java lambda must be effectively final (that is, read-only), you cannot assign to variables outside the lambda, so you cannot write it this way either:
return content 无法从函数 foo 返回。它只能从 lambda 返回,而且由于 lambda 的返回类型(即 Consumer.accept 的返回类型)必须是 void,编译器会报错,说你返回了 String。由于 Java 中 lambda 捕获的局部变量必须是事实上的 final(即只读),因此不能为 lambda 之外的变量赋值,所以也不能使用这种写法:
public static String foo() {
    Optional<String> found = find();
    String result = "";
    found.ifPresent(content -> {
        result = content;    // can't assign to result
    });
    return result;
}
So, although you get the content of found inside the lambda, how to use this value and how to return a value remain confusing. Your usual Java programming techniques are almost useless here. In fact, after the null check you have to use a series of odd functional programming operations provided by Java 8, such as map, flatMap, and orElse, and work out how to combine them to express the meaning of the original code. For example, the previous code can only be rewritten like this:
因此,虽然你在 lambda 中获得了 found 的内容,但如何使用该值以及如何返回值仍然令人困惑。你常用的 Java 编程技巧在这里几乎完全没有用武之地。事实上,在判断 null 之后,你必须使用 Java 8 提供的一系列奇怪的函数式编程操作:map、flatMap、orElse 等等,并考虑将它们组合起来,以表达原始代码的含义。例如,前面的代码只能这样重写:
public static String foo() {
    Optional<String> found = find();
    return found.orElse("");
}
This is fine for simple cases. I really don't know how to express more complex code. I doubt whether the Optional type methods in Java 8 provide enough expressiveness. The few things in it are not very expressive, but the working principle can be related to advanced theories such as functor, continuation, and even monad... It seems that after using Optional, this language is no longer Java.
这对于简单的情况来说没有问题。我真的不知道如何表达更复杂的代码。我怀疑 Java 8 中的可选类型方法是否提供了足够的表现力。其中的一些东西虽然表现力不强,但其工作原理可以与高级理论联系起来,如函数、延续甚至 monad...似乎使用了 Optional 之后,这门语言就不再是 Java 了。
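For readers who have not seen these operations combined, here is one possible sketch of a slightly longer chain. The two lookup functions and their data are hypothetical, invented for the example; whether the result reads better than plain null checks is exactly the judgment call discussed above.

```java
import java.util.Optional;

final class OptionalChain {
    // Hypothetical lookup: a user may or may not have an email address.
    static Optional<String> findEmail(String user) {
        return "ann".equals(user) ? Optional.of("ann@example.com") : Optional.empty();
    }

    // Hypothetical parser: an email may or may not contain a domain part.
    static Optional<String> domainOf(String email) {
        int at = email.indexOf('@');
        boolean hasDomain = at >= 0 && at < email.length() - 1;
        return hasDomain ? Optional.of(email.substring(at + 1)) : Optional.empty();
    }

    static String domainLabel(String user) {
        return findEmail(user)
                .flatMap(OptionalChain::domainOf) // next step may itself find nothing
                .map(String::toUpperCase)         // ordinary value-to-value step
                .orElse("UNKNOWN");               // the single place that handles "nothing"
    }

    public static void main(String[] args) {
        System.out.println(domainLabel("ann")); // prints EXAMPLE.COM
        System.out.println(domainLabel("bob")); // prints UNKNOWN
    }
}
```

flatMap is used where the next step returns an Optional itself, map where it returns a plain value, and orElse supplies the fallback once at the end.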
So although Java provides Optional, I think its usability is actually quite low and it will be hard for people to accept. In contrast, Swift's design is simpler and more intuitive, close to ordinary procedural programming. You only need to remember one special syntax, if let content = found {...}, and the code you write inside it is no different from ordinary procedural code.
因此,虽然 Java 提供了 Optional,但我认为它的可用性其实很低,很难被人们接受。相比之下,Swift 的设计更加简单直观,接近于普通的过程式编程。你只需要记住一个特殊的语法 if let content = found {...},在其中编写代码与普通的过程式语言没有什么区别。
In short, just remember that the key to using the Optional type is the "atomic operation" that combines the null check and the value retrieval into one. This requires the special forms I just introduced. If you violate this principle and split the check and the retrieval into two steps, you can still make mistakes. For example, in Java 8 you can call found.get() to access the content of found directly. In Swift, you can likewise write found! to access it directly without checking.
简而言之,只需记住使用 Optional 类型的关键是 "原子操作",它将空值检查和取值合二为一。这就要求你使用我刚才介绍的特殊写法。如果违反这一原则,将检查和取值分成两步,仍有可能犯错。例如,在 Java 8 中,你可以使用 found.get() 这种方法直接访问 found 中的内容。在 Swift 中,你也可以使用 found! 直接访问,而无需检查。
You can write Java code like this to use the Optional type:
您可以编写这样的 Java 代码来使用可选类型:
Optional<String> found = find();
if (found.isPresent()) {
    System.out.println("found: " + found.get());
}
If you write it this way, splitting the check and the retrieval into two steps, a runtime error may occur. if (found.isPresent()) is essentially the same as an ordinary null check. If you forget to test found.isPresent() and call found.get() directly, you get a NoSuchElementException, which is essentially the same thing as a NullPointerException. So this style is really no different from ordinary use of null. If you want to use the Optional type and get its benefits, be sure to follow the "atomic operation" style I introduced earlier.
如果使用这种写法,将检查和取值分为两个步骤,就可能出现运行时错误。if (found.isPresent()) 与普通的空值检查本质上相同。如果忘记判断 found.isPresent() 而直接调用 found.get(),就会出现 NoSuchElementException,它与 NullPointerException 本质上是一回事。所以这种写法其实和普通的 null 用法没有区别。如果你想使用 Optional 类型并获得它的好处,请务必遵循我前面介绍的 "原子操作" 写法。
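The difference can be seen in a few lines. In this sketch find is a hypothetical function that found nothing; calling get() without a check fails at runtime, while orElse performs the check and the access in one step:

```java
import java.util.NoSuchElementException;
import java.util.Optional;

final class UncheckedGet {
    // Hypothetical find that found nothing.
    static Optional<String> find() {
        return Optional.empty();
    }

    // Check and access are separate steps here, and the check was forgotten.
    static String unsafe() {
        return find().get();
    }

    // Check and access happen in a single call; there is no check to forget.
    static String safe() {
        return find().orElse("");
    }

    public static void main(String[] args) {
        try {
            unsafe();
        } catch (NoSuchElementException e) {
            System.out.println("unsafe() failed at runtime");
        }
        System.out.println(safe().isEmpty()); // prints true
    }
}
```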
The human brain is a wonderful thing. Although everyone knows that over-engineering is bad, in actual projects, over-engineering often occurs involuntarily. I have made this mistake many times myself, so I think it is necessary to analyze the signs and omens of over-engineering, so that it can be discovered and avoided in time at the early stage.
人脑是个神奇的东西。虽然大家都知道过度设计是不好的,但在实际项目中,过度设计往往会不由自主地发生。我自己就曾多次犯过这样的错误,所以我认为有必要分析一下过度工程的征兆和预兆,以便在早期阶段及时发现和避免。
An important sign of over-engineering is when you over-think about the "future", considering things that haven't happened yet, and needs that haven't appeared yet. For example, "If we have millions of lines of code and thousands of people in the future, such tools won't be able to support it", "I may need this function in the future, so I write the code now and put it there", "Many people will expand this code in the future, so we make it reusable now"...
过度工程化的一个重要标志就是过度考虑 "未来",考虑尚未发生的事情和尚未出现的需求。例如,"如果将来我们有几百万行代码和成千上万的人,这样的工具将无法支持它","我将来可能需要这个功能,所以我现在就写代码并把它放在那里","将来很多人会扩展这段代码,所以我们现在就让它可以重用"......
This is why many software projects are so complex. They don't actually do much, but they add a lot of unnecessary complexity for the sake of the so-called "future". The problem at hand has not been solved yet, and it is already being dragged down by the "future". People dislike short-sightedness, but in real engineering you sometimes just have to look at what is close, solve the problem at hand first, and only then talk about future expansion.
这就是许多软件项目如此复杂的原因。它们实际上并没有做什么,却为所谓的 "未来" 增加了许多不必要的复杂性。眼下的问题尚未解决,却已被 "未来" 所拖累。人们不喜欢目光短浅的人,但在真正的工程中,有时你就是得把眼光放近一点,先解决手头的问题,然后再谈未来的扩展问题。
Another source of over-engineering is excessive concern for "code reuse". Many people are concerned about "reuse" before they have written "usable" code. In order to make the code reusable, they are tied up by the various frameworks they have created, and in the end they can't even write usable code. If you can't write usable code well, how can you reuse it? Many projects that consider too much reuse at the beginning are completely abandoned and no one uses them because others find that the code is too difficult to understand. Writing one from scratch saves a lot of trouble.
过度工程化的另一个根源是对 "代码重用 "的过度关注。许多人在写出 "可用 "代码之前就已经开始关注 "重用 "问题。为了使代码可以重用,他们被自己创建的各种框架束缚住了手脚,最终甚至写不出可用的代码。如果连可用的代码都写不好,还怎么重用呢?很多项目在开始时考虑了太多的重用,但因为别人觉得代码太难理解而彻底放弃,无人使用。从头开始写,可以省去很多麻烦。
Excessive concern for "testing" can also lead to over-engineering. Some people change the originally simple code into a form that is "convenient for testing" for the sake of testing, which introduces a lot of complexity, so that the code that could have been written correctly in one go becomes extremely complex and has many bugs.
过度关注 "测试 "也会导致过度工程化。有些人为了测试,把原本简单的代码改成了 "方便测试 "的形式,这就引入了很多复杂性,使原本可以一次性写好的代码变得异常复杂,出现了很多 Bug。
There are two kinds of "bug-free" code in the world. One is "code with no obvious bugs", and the other is "code that obviously has no bugs". In the first case, the code is extremely complex, with many tests and all kinds of coverage, and because all the tests seem to pass, the code is considered correct. In the second case, the code is simple and direct, so even without many tests you can tell at a glance that it cannot have bugs. Which kind of "bug-free" code do you like?
世界上有两种 "无错误" 代码。一种是 "没有明显错误的代码",另一种是 "明显没有错误的代码"。在第一种情况下,由于代码极其复杂,测试很多,覆盖面很广,似乎所有测试都通过了,所以代码被认为是正确的。第二种情况是,因为代码简单直接,即使没有写很多测试,也能一眼看出不可能有错误。你喜欢哪种 "无错误" 代码?
Based on this, I have summarized the following principles to prevent over-engineering:
在此基础上,我总结了以下防止过度工程化的原则:
over. 完毕。