
Can AI Solve Science?

Note: Click any diagram to get Wolfram Language code to reproduce it. Wolfram Language code for training the neural nets used here is also available (requires GPU).

Won’t AI Eventually Be Able to Do Everything?

Particularly given its recent surprise successes, there’s a somewhat widespread belief that eventually AI will be able to “do everything”, or at least everything we currently do. So what about science? Over the centuries we humans have made incremental progress, gradually building up what’s now essentially the single largest intellectual edifice of our civilization. But despite all our efforts, there are still all sorts of scientific questions that remain. So can AI now come in and just solve all of them?

To this ultimate question we’re going to see that the answer is inevitably and firmly no. But that certainly doesn’t mean AI can’t importantly help the progress of science. At a very practical level, for example, LLMs provide a new kind of linguistic interface to the computational capabilities that we’ve spent so long building in the Wolfram Language. And through their knowledge of “conventional scientific wisdom” LLMs can often provide what amounts to very high-level “autocomplete” for filling in “conventional answers” or “conventional next steps” in scientific work.

But what I want to do here is to discuss what amount to deeper questions about AI in science. Three centuries ago science was transformed by the idea of representing the world using mathematics. And in our times we’re in the middle of a major transformation to a fundamentally computational representation of the world (and, yes, that’s what our Wolfram Language computational language is all about). So how does AI stack up? Should we think of it essentially as a practical tool for accessing existing methods, or does it provide something fundamentally new for science?

My goal here is to explore and assess what AI can and can’t be expected to do in science. I’m going to consider a number of specific examples, simplified to bring out the essence of what is (or isn’t) going on. I’m going to talk about intuition and expectations based on what we’ve seen so far. And I’m going to discuss some of the theoretical—and in some ways philosophical—underpinnings of what’s possible and what’s not.

So what do I actually even mean by “AI” here? In the past, anything seriously computational was often considered “AI”, in which case, for example, what we’ve done for so long with our Wolfram Language computational language would qualify—as would all my “ruliological” study of simple programs in the computational universe. But here for the most part I’m going to adopt a narrower definition—and say that AI is something based on machine learning (and usually implemented with neural networks), that’s been incrementally trained from examples it’s been given. Often I’ll add another piece as well: that those examples include either a large corpus of human-generated scientific text, etc., or a corpus of actual experience about things that happen in the world—or, in other words, that in addition to being a “raw learning machine” the AI is something that’s already learned from lots of human-aligned knowledge.

OK, so we’ve said what we mean by AI. So now what do we mean by science, and by “doing science”? Ultimately it’s all about taking things that are “out there in the world” (and usually the natural world) and having ways to connect or translate them to things we can think or reason about. But there are several, rather different, common “workflows” for actually doing science. Some center around prediction: given observed behavior, predict what will happen; find a model that we can explicitly state that says how a system will behave; given an existing theory, determine its specific implications. Other workflows are more about explanation: given a behavior, produce a human-understandable narrative for it; find analogies between different systems or models. And still other workflows are more about creating things: discover something that has particular properties; discover something “interesting”.

In what follows we’ll explore these workflows in more detail, seeing how they can (or cannot) be transformed—or informed—by AI. But before we get into this, we need to discuss something that looms over any attempt to “solve science”: the phenomenon of computational irreducibility.

The Hard Limit of Computational Irreducibility

Often in doing science there’s a big challenge in finding the underlying rules by which some system operates. But let’s say we’ve found those rules, and we’ve got some formal way to represent them, say as a program. Then there’s still a question of what those rules imply for the actual behavior of the system. Yes, we can explicitly apply the rules step by step and trace what happens. But can we—in one fell swoop—just “solve everything” and know how the system will behave?

To do that, we in a sense have to be “infinitely smarter” than the system. The system has to go through all those steps—but somehow we can “jump ahead” and immediately figure out the outcome. A key idea—ultimately supported at a foundational level by our Physics Project—is that we can think of everything that happens as a computational process. The system is doing a computation to determine its behavior. We humans—or, for that matter, any AIs we create—also have to do computations to try to predict or “solve” that behavior. But the Principle of Computational Equivalence says that these computations are all at most equivalent in their sophistication. And this means that we can’t expect to systematically “jump ahead” and predict or “solve” the system; it inevitably takes a certain irreducible amount of computational work to figure out what exactly the system will do. And so, try as we might, with AI or otherwise, we’ll ultimately be limited in our “scientific power” by the computational irreducibility of the behavior.

But given computational irreducibility, why is science actually possible at all? The key fact is that whenever there’s overall computational irreducibility, there are also an infinite number of pockets of computational reducibility. In other words, there are always certain aspects of a system about which things can be said using limited computational effort. And these are what we typically concentrate on in “doing science”.

But inevitably there are limits to this—and issues that run into computational irreducibility. Sometimes these manifest as questions we just can’t answer, and sometimes as “surprises” we couldn’t see coming. But the point is that if we want to “solve everything” we’ll inevitably be confronted with computational irreducibility, and there just won’t be any way—with AI or otherwise—to shortcut just simulating the system step by step.

There is, however, a subtlety here. What if all we ever want to know about are things that align with computational reducibility? A lot of science—and technology—has been constructed specifically around computationally reducible phenomena. And that’s for example why things like mathematical formulas have been able to be as successful in science as they have.

But we certainly know we haven’t yet solved everything we want in science. And in many cases it seems like we don’t really have a choice about what we need to study; nature, for example, forces it upon us. And the result is that we inevitably end up face-to-face with computational irreducibility.

As we’ll discuss, AI has the potential to give us streamlined ways to find certain kinds of pockets of computational reducibility. But there’ll always be computational irreducibility around, leading to unexpected “surprises” and things we just can’t quickly or “narratively” get to. Will this ever end? No. There’ll always be “more to discover”. Things that need more computation to reach. Pockets of computational reducibility that we didn’t know were there. And ultimately—AI or not—computational irreducibility is what will prevent us from ever being able to completely “solve science”.

There’s a curious historical resonance to all this. Back at the beginning of the twentieth century, there was a big question of whether all of mathematics could be “mechanically solved”. The arrival of Gödel’s theorem, however, seemed to establish that it could not. And now that we know that science also ultimately has a computational structure, the phenomenon of computational irreducibility—which is, in effect, a sharpening of Gödel’s theorem—shows that it too cannot be “mechanically solved”.

We can still ask, though, whether the mathematics—or science—that humans choose to study might manage to live solely in pockets of computational reducibility. But in a sense the ultimate reason that “math is hard” is that we’re constantly seeing evidence of computational irreducibility: we can’t get around actually having to compute things. Which is, for example, not what methods like neural net AI (at least without the help of tools like Wolfram Language) are good at.

Things That Have Worked in the Past

Before getting into the details of what modern machine-learning-based AI might be able to do in “solving science”, it seems worthwhile to recall some of what’s worked in the past—not least as a kind of baseline for what modern AI might now be able to add.

I myself have been using computers and computation to discover things in science for more than four decades now. My first big success came in 1981 when I decided to try enumerating all possible rules of a certain kind (elementary cellular automata) and then ran them on a computer to see what they did:
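
Here’s a rough Wolfram Language sketch of that kind of enumeration (not the exact code behind the original picture): run each of the 256 elementary rules from a single black cell and plot the results:

    GraphicsGrid[Partition[
      Table[ArrayPlot[CellularAutomaton[r, {{1}, 0}, 50]], {r, 0, 255}], 16]]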

I’d assumed that with simple underlying rules, the final behavior would be correspondingly simple. But in a sense the computer didn’t assume that: it just enumerated rules and computed results. And so even though I never imagined it would be there, it was able to “discover” something like rule 30.

Over and over again I have had similar experiences: I can’t see how some system can manage to do anything “interesting”. But when I systematically enumerate possibilities, there it is: something unexpected, interesting—and “clever”—effectively discovered by computer.

In the early 1990s I wondered what the simplest possible universal Turing machine might be. I would never have been able to figure it out myself. The machine that had held the record since the early 1960s had 7 states and 4 colors. But the computer let me discover just by systematic enumeration the 2-state, 3-color machine

that in 2007 was proved universal (and, yes, it’s the simplest possible universal Turing machine).

In 2000 I was interested in what the simplest possible axiom system for logic (Boolean algebra) might be. The simplest known up to that time involved 9 binary (Nand) operations. But by systematically enumerating possibilities, I ended up finding the single 6-operation axiom (which I proved correct using automated theorem proving). Once again, I had no idea this was “out there”, and certainly I would never have been able to construct it myself. But just by systematic enumeration the computer was able to find what seemed to me like a very “creative” result.

In 2019 I was doing another systematic enumeration, now of possible hypergraph rewriting rules that might correspond to the lowest-level structure of our physical universe. When I looked at the geometries that were generated I felt like as a human I could roughly classify what I saw. But were there outliers? I turned to something closer to “modern AI” to do the science—making a feature space plot of visual images:

Feature space plot of visual images

It needed me as a human to interpret it, but, yes, there were outliers that had effectively been “automatically discovered” by the neural net that was making the feature space plot.

I’ll give one more example—of a rather different kind—from my personal experience. Back in 1987—as part of building Version 1.0 of what’s now Wolfram Language—we were trying to develop algorithms to compute hundreds of mathematical special functions over very broad ranges of arguments. In the past, people had painstakingly computed series approximations for specific cases. But our approach was to use what amounts to machine learning, burning months of computer time fitting parameters in rational approximations. Nowadays we might do something similar with neural nets rather than rational approximations. But in both cases the concept is to find a general model of the “world” one’s dealing with (here, values of special functions)—and try to learn the parameters in the model from actual data. It’s not exactly “solving science”, and it wouldn’t even allow one to “discover the unexpected”. But it’s a place where “AI-like” knowledge of general expectations about smoothness or simplicity lets one construct the analog of a scientific model.
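
A toy version of that kind of fitting (with BesselJ as an assumed stand-in for the special functions involved, and an assumed low-order rational form) might look like:

    (* sample a special function, then fit a rational approximation to the samples *)
    data = Table[{x, BesselJ[0, x]}, {x, 0., 5., 0.1}];
    FindFit[data, (a + b x + c x^2)/(1 + d x + e x^2), {a, b, c, d, e}, x]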

Can AI Predict What Will Happen?

It’s not the only role of science—and in the sections that follow we’ll explore others. But historically what’s often been viewed as a defining feature of successful science is: can it predict what will happen? So now we can ask: does AI give us a dramatically better way to do this?

In the simplest case we basically want to use AI to do inductive inference. We feed in the results of a bunch of measurements, then ask the AI to predict the results of measurements we haven’t yet done. At this level, we’re treating the AI as a black box; it doesn’t matter what’s happening inside; all we care about is whether the AI gives us the right answer. We might think that somehow we can set the AI up so that it “isn’t making any assumptions”—and is just “following the data”. But it’s inevitable that there’ll be some underlying structure in the AI that makes it ultimately assume some kind of model for the data.

Yes, there can be a lot of flexibility in this model. But one can’t have a truly “model-less model”. Perhaps the AI is based on a huge neural network, with billions of numerical parameters that can get tweaked. Perhaps even the architecture of the network can change. But the whole neural net setup inevitably defines an ultimate underlying model.

Let’s look at a very simple case. Let’s imagine our “data” is the blue curve here—perhaps representing the motion of a weight suspended on a spring—and that the “physics” tells us it continues with the red curve:

Now let’s take a very simple neural net

and let’s train it using the “blue curve” data above to get a network with a certain collection of weights:

Now let’s apply this trained network to reproduce our original data and extend it:
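
A minimal sketch of the whole experiment (with an assumed sine-wave signal and assumed layer sizes, not the exact network used here) is:

    data = Table[{x} -> {Sin[x]}, {x, 0., 6 Pi, 0.05}];       (* the "observed" window of data *)
    net = NetChain[{LinearLayer[32], Ramp, LinearLayer[32], Ramp, LinearLayer[1]}, "Input" -> 1];
    trained = NetTrain[net, data];
    Plot[{Sin[x], First[trained[{x}]]}, {x, 0, 12 Pi}]        (* fits the window; extrapolation fails *)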

And what we see is that the network does a decent job of reproducing the data it was trained on, but when it comes to “predicting the future” it basically fails.

So what’s going on here? Did we just not train long enough? Here’s what happens with progressively more rounds of training:

It doesn’t seem like this helps much. So maybe the problem is that our network is too small. Here’s what happens with networks having a series of sizes:

And, yes, larger sizes help. But they don’t solve the problem of making our prediction successful. So what else can we do? Well, one feature of the network is its activation function: how we determine the output at each node from the weighted sum of inputs. Here are some results with various (popular) activation functions:

And there’s something notable here—that highlights the idea that there are “no model-less models”: different activation functions lead to different predictions, and the form of the predictions seems to be a direct reflection of the form of the activation function. And indeed there’s no magic here; it’s just that the neural net corresponds to a function whose core elements are activation functions.

So, for example, the network

corresponds to the function
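
(schematically, and with weights w1, w2, w3 and biases b1, b2, b3 whose values come from training, something of the nested form)

    w3 . ϕ[w2 . ϕ[w1 x + b1] + b2] + b3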

where ϕ represents the activation function used in this case.

Of course, the idea of approximating one function by some combination of standard functions is extremely old (think: epicycles and before). Neural nets allow one to use more complicated (and hierarchical) combinations of more complicated and nonlinear functions, and provide a more streamlined way of “fitting all the parameters” that are involved. But at a fundamental level it’s the same idea.

And for example here are some approximations to our “data” constructed in terms of more straightforward mathematical functions:
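
For example (as a sketch, fitting assumed polynomial and trigonometric bases to the same kind of sine-wave window as above):

    pts = Table[{x, Sin[x]}, {x, 0., 6 Pi, 0.05}];
    Fit[pts, {1, x, x^2, x^3, x^4, x^5}, x]
    Fit[pts, Join[{1}, Flatten[Table[{Sin[k x], Cos[k x]}, {k, 3}]]], x]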

These have the advantage that it’s quite easy to state “what each model is” just by “giving its formula”. But just as with our neural nets, there are problems in making predictions.

(By the way, there are a whole range of methods for things like time series prediction, involving ideas like “fitting to recurrence relations”—and, in modern times, using transformer neural nets. And while some of these methods happen to be able to capture a periodic signal like a sine wave well, one doesn’t expect them to be broadly successful in accurately predicting functions.)

OK, one might say, perhaps we’re trying to use—and train—our neural nets in too narrow a way. After all, it seems as if it was critical to the success of ChatGPT to have a large amount of training data about all kinds of things, not just some narrow specific area. Presumably, though, what that broad training data did was to let ChatGPT learn the “general patterns of language and common sense”, which it just wouldn’t be able to pick up from narrower training data.

So what’s the analog for us here? It might be that we’d want our neural net to have a “general idea of how functions work”—for example to know about things like continuity of functions, or, for that matter, periodicity or symmetry. So, yes, we can go ahead and train not just on a specific “window” of data like we did above, but on whole families of functions—say collections of trigonometric functions, or perhaps all the built-in mathematical functions in the Wolfram Language.

And, needless to say, if we do this, we’ll surely be able to successfully predict our sine curve above—just as we would if we were using traditional Fourier analysis with sine curves as our basis. But is this “doing science”?

In essence it’s saying, “I’ve seen something like this before, so I figure this is what’s going to happen now”. And there’s no question that this can be useful; indeed it’s an automated version of a typical thing that a human experienced in some particular area will be able to do. We’ll return to this later. But for now the main point is that at least when it comes to things like predicting functions, it doesn’t seem as if neural nets—and today’s AIs—can in any obvious way “see further” than what goes into their construction and training. There’s no “emergent science”; it’s just fairly direct “pattern matching”.

Predicting Computational Processes

Predicting a function is a particularly austere task and one might imagine that “real processes”—for example in nature—would have more “ambient structure” which an AI could use to get a “foothold” for prediction. And as an example of what we might think of as “artificial nature” we can consider computational systems like cellular automata. Here’s an example of what a particular cellular automaton rule does, with a particular initial condition:

There’s a mixture here of simplicity and complexity. And as humans we can readily predict what’s going to happen in the simple parts, but basically can’t say much about the other parts. So how would an AI do?

Clearly if our “AI” can just run the cellular automaton rule then it will be able to predict everything, though with great computational effort. But the real question is whether an AI can shortcut things to make successful predictions without doing all that computational work—or, put another way, whether the AI can successfully find and exploit pockets of computational reducibility.

So, as a specific experiment, let’s set up a neural net to try to efficiently predict the behavior of our cellular automaton. Our network is basically a straightforward—though “modern”—convolutional autoencoder, with 59 layers and a total of about 800,000 parameters:

Convolutional autoencoder

It’s trained much like an LLM. We got lots of examples of the evolution of our cellular automaton, then we showed the network the “top half” of each one, and tried to get it to successfully continue this, to predict the “bottom half”. In the specific experiment we did, we gave 32 million examples of 64-cell-wide cellular automaton evolution. (And, yes, this number of examples is tiny compared to all possible initial configurations.) Then we tried feeding in “chunks” of cellular automaton evolution 64 cells wide and 64 steps long—and looked to see what probabilities the network assigned to different possible continuations.
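
Here’s a sketch of how such training pairs can be generated (using rule 30 as a stand-in for the rule in these pictures, and far fewer examples than the 32 million actually used):

    makeExample[] := Module[{evolution},
      evolution = CellularAutomaton[30, RandomInteger[1, 64], 127];  (* 128 steps, 64 cells wide *)
      Take[evolution, 64] -> Drop[evolution, 64]];                   (* "top half" -> "bottom half" *)
    examples = Table[makeExample[], {10000}];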

Here are some results for a sequence of different initial conditions:

And what we see is what we might expect: when the behavior is simple enough, the network basically gets it right. But when the behavior is more complicated, the network usually doesn’t do so well with it. It still often gets it at least “vaguely right”—but the details aren’t there.

Perhaps, one might think, the network just wasn’t trained for long enough, or with enough examples. And to get some sense of the effect of more training, here’s how the predicted probabilities evolve with successive quarter million rounds of training:

These should be compared to the exact result:

And, yes, with more training there is improvement, but by the end it seems like it probably won’t get much better. (Though its loss curve does show some sudden downward jumps during the course of training, presumably as “discoveries” are made—and we can’t be sure there won’t be more of these.)

It’s extremely typical of machine learning that it manages to do a good job of getting things “roughly right”. But nailing the details is not what machine learning tends to be good at. So when what one’s trying to do depends on that, machine learning will be limited. And in the prediction task we’re considering here, the issue is that once things go even slightly off track, everything basically just gets worse from there on out.

Identifying Computational Reducibility

Computational reducibility is at the center of what we normally think of as “doing science”. Because it’s not only responsible for letting us make predictions, it’s also what lets us identify regularities, make models and compressed summaries of what we see—and develop understanding that we can capture in our minds.

But how can we find computational reducibility? Sometimes it’s very obvious. Like when we make a visualization of some behavior (like the cellular automaton evolution above) and immediately recognize simple features in it. But in practice computational reducibility may not be so obvious, and we may have to dig through lots of details to find it. And this is a place where AI can potentially help a lot.

At some level we can think of it as a story of “finding the right parametrization” or the “right coordinate system”. As a very straightforward example, consider the seemingly quite random cloud of points:

Just turning this particular cloud of points to the appropriate angle reveals obvious regularities:
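
(Here’s a toy stand-in for such a cloud, assuming points scattered along equally spaced lines and then rotated by an arbitrary angle; rotating back reveals the stripes:)

    stripes = Flatten[Table[{RandomReal[{0, 3}], y + 0.02 RandomReal[{-1, 1}]}, {y, 0, 2, 0.2}, {60}], 1];
    cloud = RotationTransform[0.7] /@ stripes;
    {ListPlot[cloud], ListPlot[RotationTransform[-0.7] /@ cloud]}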

But is there a general way to pick out regularities if they’re there? There’s traditional statistics (“Is there a correlation between A and B?”, etc.). There’s model fitting (“Is this a sum of Gaussians?”). There’s traditional data compression (“Is it shorter after run-length encoding?”). But all of these pick out only rather specific kinds of regularities. So can AI do more? Can it perhaps somehow provide a general way to find regularities?

To say one’s found a regularity in something is basically equivalent to saying one doesn’t need to specify all the details of the thing: that there’s a reduced representation from which one can reconstruct it. So, for example, given the “points-lie-on-lines” regularity in the picture above, one doesn’t need to separately specify the positions of all the points; one just needs to know that they form stripes with a certain separation.

OK, so let’s imagine we have an image with a certain number of pixels. We can ask whether there’s reduced representation that involves less data—from which the image can effectively be reconstructed. And with neural nets there’s what one might think of as a trick for finding such a reduced representation.

The basic idea is to set up a neural net as an autoencoder that takes inputs and reproduces them as outputs. One might think this would be a trivial task. But it’s not, because the data from the input has to flow through the innards of the neural net, effectively being “ground up” at the beginning and “reconstituted” at the end. But the point is that with enough examples of possible inputs, it’s potentially possible to train the neural net to successfully reproduce inputs, and operate as an autoencoder.

But now the idea is to look inside the autoencoder, and to pull out a reduced representation that it’s come up with. As data flows from layer to layer in the neural net, it’s always trying to preserve the information it needs to reproduce the original input. And if a layer has fewer elements, what’s present at that layer must correspond to some reduced representation of the original input.

Let’s start with a standard modern image autoencoder, that’s been trained on a few billion images typical of what’s on the web. Feed it a picture of a cat, and it’ll successfully reproduce something that looks like the original picture:

But in the middle there’ll be a reduced representation, with many fewer pixels—that somehow still captures what’s needed of the cat (here shown with its 4 color channels separated):

We can think of this as a kind of “black-box model” for the cat image. We don’t know what the elements (“features”) in the model mean, but somehow it’s successfully capturing “the essence of the picture”.

So what happens if we apply this to “scientific data”, or for example “artificial natural processes” like cellular automata? Here’s a case where we get successful compression:

In this case it’s not quite so successful:

And in these cases—where there’s underlying computational irreducibility—it has trouble:

But there’s a bit more to this story. You see, the autoencoder we’re using was trained on “everyday images”, not these kinds of “scientific images”. So in effect it’s trying to model our scientific images in terms of constructs like eyes and ears that are common in pictures of things like cats.

So what happens if—like in the case of cellular automaton prediction above—we train an autoencoder more specifically on the kinds of images we want?

Here are two very simple neural nets that we can use as an “encoder” and a “decoder” to make an autoencoder:

Now let’s take the standard MNIST image training set, and use these to train the autoencoder:

Each of these images has 28×28 pixels. But in the middle of the autoencoder we have a layer with just two elements. So this means that whatever we ask it to encode must be reduced to just two numbers:
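
A minimal sketch of such an autoencoder (with assumed layer sizes, and assuming the MNIST training images are available as image-to-label rules from the Wolfram Data Repository) is:

    mnist = ResourceData["MNIST", "TrainingData"];                  (* assumed: image -> label rules *)
    imgs = ImageData /@ Keys[RandomSample[mnist, 20000]];           (* 28×28 arrays of pixel values *)
    encoder = NetChain[{FlattenLayer[], LinearLayer[128], Ramp, LinearLayer[2]}, "Input" -> {28, 28}];
    decoder = NetChain[{LinearLayer[128], Ramp, LinearLayer[784], ReshapeLayer[{28, 28}]}, "Input" -> 2];
    trained = NetTrain[NetChain[{encoder, decoder}], Thread[imgs -> imgs]];
    Image[trained[First[imgs]]]                                     (* reconstruction from just two numbers *)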

And what we see here is that at least for images that look more or less like the ones it was trained on, the autoencoder manages to reconstruct something that looks at least roughly right, even from the radical compression. If you give it other kinds of images, however, it won’t be as successful, instead basically just insisting on reconstructing them as looking like images from its training set:

OK, so what about training it on cellular automaton images? Let’s take 10 million images generated with a particular rule:

Now we train our autoencoder on these images. Then we try feeding it similar images:

The results are at best very approximate; this small neural net didn’t manage to learn the “detailed ways of this particular cellular automaton”. If it had been successful at characterizing all the apparent complexity of the cellular automaton evolution with just two numbers, then we could have considered this an impressive piece of science. But, unsurprisingly, the neural net was effectively blocked by computational irreducibility.

But even though it can’t “seriously crack computational irreducibility” the neural net can still “make useful discoveries”, in effect by finding little pieces of computational reducibility, and little regularities. So, for example, if we take images of “noisy letters” and use a neural net to reduce them to pairs of numbers, and use these numbers to place the images, we get a “dimension-reduced feature space plot” that separates images of different letters:
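
As a sketch of that kind of dimension reduction (with synthetic noisy letters standing in for the images actually used):

    letters = Flatten[Table[
        ImageEffect[Rasterize[Style[c, 40], "Image", ImageSize -> 32], {"GaussianNoise", 0.3}],
        {c, {"A", "B", "C", "D"}}, {50}]];
    FeatureSpacePlot[letters]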

But consider, for example, a collection of cellular automata with different rules:

Here’s how a typical neural net would arrange these images in “feature space”:
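
(As a sketch, a feature space plot of this general kind, for all 256 elementary rules run from random initial conditions, can be made like this; the plot shown uses one particular network:)

    caImages = Table[Image[CellularAutomaton[r, RandomInteger[1, 100], 100]], {r, 0, 255}];
    FeatureSpacePlot[caImages]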

And, yes, this has almost managed to automatically discover the four classes of behavior that I identified in early 1983. But it’s not quite there. Though in a sense this is a difficult case, very much face-to-face with computational irreducibility. And there are plenty of cases (think: arrangement of the periodic table based on element properties; similarity of fluid flows based on Reynolds number; etc.) where one can expect a neural net to key into pockets of computational reducibility and at least successfully recapitulate existing scientific discoveries.

AI in the Non-human World

In its original concept AI was about developing artificial analogs of human intelligence. And indeed the recent great successes of AI—say in visual object recognition or language generation—are all about having artificial systems that reproduce the essence of what humans do. It’s not that there’s a precise theoretical definition of what makes an image be of a cat versus of a dog. What matters is that we can have a neural net that will come to the same conclusions as humans do.

So why does this work? Probably it’s because neural nets capture the architectural essence of actual brains. Of course the details of artificial neural networks aren’t the same as biological brains. But in a sense the big surprise of modern AI is that there seems to be enough universality to make artificial neural nets behave in ways that are functionally similar to human brains, at least when it comes to things like visual object recognition or language generation.

But what about questions in science? At one level we can ask whether neural nets can emulate what human scientists do. But there’s also another level: is it possible that neural nets can just directly work out how systems—say in nature—behave? Imagine we’re studying some physical process. Human scientists might find some human-level description of the system, say in terms of mathematical equations. But the system itself is just directly doing what it does. And the question is whether that’s something a neural net can capture.

And if neural nets “work” on “human-like tasks” only because they’re architecturally similar to brains, there’s no immediate reason to think that they should be able to capture “raw natural processes” that aren’t anything to do with brains. So what’s going on when AI does something like predicting protein folding?

One part of the story, I suspect, is that even though the physical process of protein folding has nothing to do with humans, the question of what aspects of it we consider significant does. We don’t expect that the neural net will predict the exact position of every atom (and in natural environments the atoms in a protein don’t even have precisely fixed positions). Instead, we want to know things like whether the protein has the “right general shape”, with the right “identifiable features” (like, say, alpha helices), or the right functional properties. And these are now more “human” questions—more in the “eye of the beholder”—and more like a question such as whether we humans judge an image to be of a cat versus a dog. So if we conclude that a neural net “solves the scientific problem” of how a protein folds, it might be at least in part just because the criteria of success that our brains (“subjectively”) apply is something that a neural net—with its brain-like architecture—happens to be able to deliver.

It’s a bit like producing an image with generative AI. At the level of basic human visual perception, it may look like something we recognize. But if we scrutinize it, we can see that it’s not “objectively” what we think it is:

It wasn’t ever really practical with “first-principles physics” to figure out how proteins fold. So the fact that neural nets can get even roughly correct answers is impressive. So how do they do it? A significant part of it is surely effectively just matching chunks of protein to what’s in the training set—and then finding “plausible” ways to “stitch” these chunks together. But there’s probably something else too. One’s familiar with certain “pieces of regularity” in proteins (things like alpha helices and beta sheets). But it seems likely that neural nets are effectively plugging into other kinds of regularity; they’ve somehow found pockets of reducibility that we didn’t know were there. And particularly if just a few pockets of reducibility show up over and over again, they’ll effectively represent new, general “results in science” (say, some new kind of commonly occurring “meta-motif” in protein structure).

But while it’s fundamentally inevitable that there must be an infinite number of pockets of computational reducibility in the end, it’s not clear at the outset either how significant these might be in things we care about, or how successful neural net methods might be in finding them. We might imagine that insofar as neural nets mirror the essential operation of our brains, they’d only be able to find pockets of reducibility in cases where we humans could also readily discover them, say by looking at some visualization or another.

But an important point is that our brains are normally “trained” only on data that we readily experience with our senses: we’ve seen the equivalent of billions of images, and we’ve heard zillions of sounds. But we don’t have direct experience of the microscopic motions of molecules, or of a multitude of kinds of data that scientific observations and measuring devices can deliver.

A neural net, however, can “grow up” with very different “sensory experiences”—say directly experiencing “chemical space”, or, for that matter “metamathematical space”, or the space of financial transactions, or interactions between biological organisms, or whatever. But what kinds of pockets of computational reducibility exist in such cases? Mostly we don’t know. We know the ones that correspond to “known science”. But even though we can expect others must exist, we don’t normally know what they are.

Will they be “accessible” to neural nets? Again, we don’t know. Quite likely, if they are accessible, then there’ll be some representation—or, say, visualization—in which the reducibility will be “obvious” to us. But there are plenty of ways this could fail. For example, the reducibility could be “visually obvious”, but only, say, in 3D volumes where, for example, it’s hard even to distinguish different structures of fluffy clouds. Or perhaps the reducibility could be revealed only through some computation that’s not readily handled by a neural net.

Inevitably there are many systems that show computational irreducibility, and which—at least in their full form—must be inaccessible to any “shortcut method”, based on neural nets or otherwise. But what we’re asking is whether, when there is a pocket of computational reducibility, it can be captured by a neural net.

But once again we’re confronted with the fact there are no “model-less models”. Some particular kind of neural net will readily be able to capture some particular kinds of computational reducibility; another will readily be able to capture others. And, yes, you can always construct a neural net that will approximate any given specific function. But in capturing some general kind of computational reducibility, we are asking for much more—and what we can get will inevitably depend on the underlying structure of the neural net.

But let’s say we’ve got a neural net to successfully key into computational reducibility in a particular system. Does that mean it can predict everything? Typically no. Because almost always the computational reducibility is “just a pocket”, and there’s plenty of computational irreducibility—and “surprises”—“outside”.

And indeed this seems to happen even in the case of something like protein folding. Here are some examples of proteins with what we perceive as fairly simple structures—and the neural net prediction (in yellow) agrees quite well with the results of physical experiments (gray tubes):

But for proteins with what we perceive as more complicated structures, the agreement is often not nearly as good:

These proteins are all at least similar to ones that were used to train the neural net. But how about very different proteins—say ones with random sequences of amino acids?

It’s hard to know how well the neural net does here; it seems likely that particularly if there are “surprises” it won’t successfully capture them. (Of course, it could be that all “reasonable proteins” that normally appear in biology could have certain features, and it could be “unfair” to apply the neural net to “unbiological” random ones—though for example in the adaptive immune system, biology does effectively generate at least short “random proteins”.)

Solving Equations with AI

In traditional mathematical science the typical setup is: here are some equations for a system; solve them to find out how the system behaves. And before computers, that usually meant that one had to find some “closed-form” formula for the solution. But with computers, there’s an alternative approach: make a discrete “numerical approximation”, and somehow incrementally solve the equations. To get accurate results, though, may require many steps and lots of computational effort. So then the question is: can AI speed this up? And in particular, can AI, for example, go directly from initial conditions for an equation to a whole solution?

Let’s consider as an example a classical piece of mathematical physics: the three-body problem. Given initial positions and velocities of three point masses interacting via inverse-square-law gravity, what trajectories will the masses follow? There’s a lot of diversity—and often a lot of complexity—which is why the three-body problem has been such a challenge:
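
Here’s a minimal sketch (with unit masses and random planar initial conditions, rather than the specific cases shown) using NBodySimulation:

    bodies = Table[<|"Mass" -> 1, "Position" -> RandomReal[{-1, 1}, 2],
        "Velocity" -> RandomReal[{-0.5, 0.5}, 2]|>, {3}];
    sim = NBodySimulation["InverseSquare", bodies, 10];
    ParametricPlot[Evaluate[sim[All, "Position", t]], {t, 0, 10}]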

But what if we train a neural net on lots of sample solutions? Can it then figure out the solution in any particular case? We’ll use a rather straightforward “multilayer perceptron” network:

We feed it initial conditions, then ask it to generate a solution. Here are a few examples of what it does, with the correct solutions indicated by the lighter background paths:

When the trajectories are fairly simple, the neural net does decently well. But when things get more complicated, it does decreasingly well. It’s as if the neural net has “successfully memorized” the simple cases, but doesn’t know what to do in more complicated cases. And in the end this is very similar to what we saw above in examples like predicting cellular automaton evolution (and presumably also protein folding).

And, yes, once again this is a story of computational irreducibility. To ask to just “get the solution” in one go is to effectively ask for complete computational reducibility. And insofar as one might imagine that—if only one knew how to do it—one could in principle always get a “closed-form formula” for the solution, one’s implicitly assuming computational reducibility. But for many decades I’ve thought that something like the three-body problem is actually quite full of computational irreducibility.

Of course, had a neural net been able to “crack the problem” and immediately generate solutions, that would effectively have demonstrated computational reducibility. But as it is, the apparent failure of neural nets provides another piece of evidence for computational irreducibility in the three-body problem. (It’s worth mentioning, by the way, that while the three-body problem does show sensitive dependence on initial conditions, that’s not the primary issue here; rather, it’s the actual intrinsic complexity of the trajectories.)

We already know that discrete computational systems like cellular automata are rife with computational irreducibility. And we might have imagined that continuous systems—described for example by differential equations—would have more structure that would somehow make them avoid computational irreducibility. And indeed insofar as neural nets (in their usual formulation) involve continuous numbers, we might have thought that they would be able in some way to key into the structure of continuous systems to be able to predict them. But somehow it seems as if the “force of computational irreducibility” is too strong, and will ultimately be beyond the power of neural networks.

Having said that, though, there can still be a lot of practical value to neural networks in doing things like solving equations. Traditional numerical approximation methods tend to work locally and incrementally (if often adaptively). But neural nets can more readily handle “much larger windows”, in a sense “knowing longer runs of behavior” and being able to “jump ahead” across them. In addition, when one’s dealing with very large numbers of equations (say in robotics or systems engineering), neural nets can typically just “take in all the equations and do something reasonable” whereas traditional methods effectively have to work with the equations one by one.

The three-body problem involves ordinary differential equations. But many practical problems are instead based on partial differential equations (PDEs), in which not just individual coordinates, but whole functions f[x] etc., evolve with time. And, yes, one can use neural nets here as well, often to significant practical advantage. But what about computational irreducibility? Many of the equations and situations most studied in practice (say for engineering purposes) tend to avoid it, but certainly in general it’s there (notably, say, in phenomena like fluid turbulence). And when there’s computational irreducibility, one can’t ultimately expect neural nets to do well. But when it comes to satisfying our human purposes—as in other examples we’ve discussed—things may look better.

As an example, consider predicting the weather. In the end, this is all about PDEs for fluid dynamics (and, yes, there are also other effects to do with clouds, etc.). And as one approach, one can imagine directly and computationally solving these PDEs. But another approach would be to have a neural net just “learn typical patterns of weather” (as old-time meteorologists had to), and then have the network (a bit like for protein folding) try to patch together these patterns to fit whatever situation arises.

How successful will this be? It’ll probably depend on what we’re looking at. It could be that some particular aspect of the weather shows considerable computational reducibility and is quite predictable, say by neural nets. And if this is the aspect of the weather that we care about, we might conclude that the neural net is doing well. But if something we care about (“will it rain tomorrow?”) doesn’t tap into a pocket of computational reducibility, then neural nets typically won’t be successful in predicting it—and instead there’d be no choice but to do explicit computation, and perhaps impractically much of it.

AI for Multicomputation

In what we’ve discussed so far, we’ve mostly been concerned with seeing whether AI can help us “jump ahead” and shortcut some computational process or another. But there are also lots of situations where what’s of interest is instead to shortcut what one can call a multicomputational process, in which there are many possible outcomes at each step, and the goal is for example to find a path to some final outcome.

As a simple example of a multicomputational process, let’s consider a multiway system operating on strings, where at each step we apply the rules {A → BBB, BB → A} in all possible ways:

Given this setup we can ask a question like: what’s the shortest path from A to BABA? And in the case shown here it’s easy to compute the answer, say by explicitly running a pathfinding algorithm on the graph:
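
Here’s a sketch of doing that explicitly: build out the multiway graph by repeatedly applying the rules, then use FindShortestPath:

    rules = {"A" -> "BBB", "BB" -> "A"};
    successors[s_String] := Union @ Flatten @ Table[
        StringReplacePart[s, rules[[k, 2]], pos],
        {k, Length[rules]}, {pos, StringPosition[s, rules[[k, 1]]]}];
    states = Nest[Union[Join[#, Flatten[successors /@ #]]] &, {"A"}, 10];  (* states reachable in at most 10 steps *)
    multiway = Graph[Flatten[Thread[DirectedEdge[#, successors[#]]] & /@ states]];
    FindShortestPath[multiway, "A", "BABA"]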

There are many kinds of problems that follow this same general pattern. Finding a winning sequence of plays in a game graph. Finding the solution to a puzzle as a sequence of moves through a graph of possibilities. Finding a proof of a theorem given certain axioms. Finding a chemical synthesis pathway given certain basic reactions. And in general solving a multitude of NP problems in which many “nondeterministic” paths of computation are possible.

In the very simple example above, we’re readily able to explicitly generate a whole multiway graph. But in most practical examples, the graph would be astronomically too large. So the challenge is typically to suss out what moves to make without tracing the whole graph of possibilities. One common approach is to try to find a way to assign a score to different possible states or outcomes, and to pursue only paths with (say) the highest scores. In automated theorem proving it’s also common to work “downward from initial propositions” and “upward from final theorems”, trying to see where the paths meet in the middle. And there’s also another important idea: if one has established the “lemma” that there’s a path from X to Y, one can add X → Y as a new rule in the collection of rules.

So how might AI help? As a first approach, we could consider taking something like our string multiway system above, and training what amounts to a language-model AI to generate sequences of tokens that represent paths (or what in a mathematical setting would be proofs). The idea is to feed the AI a collection of valid sequences, and then to present it with the beginning and end of a new sequence, and ask it to fill in the middle.

We’ll use a fairly basic transformer network:

Then we train it by giving lots of sequences of tokens corresponding to valid paths (with E being the “end token”)

together with “negative examples” indicating the absence of paths:

Now we “prompt” the trained network with a “prefix” of the kind that appeared in the training data, and then iteratively run “LLM style” (effectively at zero temperature, i.e. always choosing the “most probable” next token):

For a while, it does perfectly—but near the end it starts making errors, as indicated by the tokens shown in red. There’s different performance with different destinations—with some cases going off track right at the beginning:

How can we do better? One possibility is at each step to keep not just the token that’s considered most probable, but a stack of tokens—thereby in effect generating a multiway system that the “LLM controller” could potentially navigate. (One can think of this somewhat whimsically as a “quantum LLM”, that’s always exploring multiple paths of history.)

(By the way, we could also imagine training with many different rules, then doing what amounts to zero-shot learning and giving a “pre-prompt” that specifies what rule we want to use in any particular case.)

One of the issues with this LLM approach is that the sequences it generates are often even “locally wrong”: the next element can’t follow from the one before according to the rules given.

But this suggests another approach one can take. Instead of having the AI try to “immediately fill in the whole sequence”, get it instead just to pick “where to go next”, always following one of the specified rules. Then a simple goal for training is in effect to get the AI to learn the distance function for the graph, or in other words, to be able to estimate how long the shortest path is (if it exists) from any one node to any other. Given such a function, a typical strategy is to follow what amounts to a path of “steepest descent”—at each step picking the move that the AI estimates will do best in reducing the distance to the destination.

How can this actually be implemented with neural networks? One approach is to use two encoders (say constructed out of transformers)—that in effect generate two embeddings, one for source nodes, and one for destination nodes. The network then combines these embeddings and learns a “metric” that characterizes the distance between the nodes:

Training such a network on the multiway system we’ve been discussing—by giving it a few million examples of source-destination distances (plus an indicator of whether this distance is infinite)—we can use the network to predict a piece of the distance matrix for the multiway system. And what we find is that this predicted matrix is similar—but definitely not identical—to the actual matrix:

Still, we can imagine trying to build a path where at each step we compute the estimated distances-to-destination predicted by the neural net for each possible destination, then pick the one that “gets furthest”:
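
In code, that greedy strategy is very simple. Here's a sketch that assumes the successors function from the earlier multiway-graph sketch, plus a hypothetical distanceEstimate[a, b] wrapping whatever trained distance-predicting network one has:

    greedyPath[src_, dest_, maxSteps_ : 50] :=
      Module[{path = {src}, current = src, next},
        Do[
          If[current === dest || successors[current] === {}, Break[]];
          (* move to the successor the network estimates to be closest to the destination *)
          next = First[MinimalBy[successors[current], distanceEstimate[#, dest] &]];
          AppendTo[path, current = next],
          {maxSteps}];
        path];

    greedyPath["A", "BABA"]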

Each individual move here is guaranteed to be valid, and we do indeed eventually reach our destination BABA—though in slightly more steps than the true shortest path. But even though we don’t quite find the optimal path, the neural net has managed to allow us to at least somewhat prune our “search space”, by prioritizing nodes and traversing only the red edges:

(A technical point is that the particular neural net we’ve used here has the property that all paths between any given pair of nodes always have the same length—so if any path is found, it can be considered “the shortest”. A rule like {A → AAB, BBA → B} doesn’t have this property and a neural net trained for this rule can end up finding paths that reach the correct destination but aren’t as short as they could be.)

Still, as is typical with neural nets, we can’t be sure how well this will work. The neural net might make us go arbitrarily far “off track”, and it might even lead us to a node where we have no path to our destination—so that if we want to make progress we’ll have to resort to something like traditional algorithmic backtracking.

But at least in simple cases the approach can potentially work well—and the AI can successfully find a path that wins the game, proves the theorem, etc. But one can’t expect it to always work. And the reason is that it’s going to run into multicomputational irreducibility. Just as in a single “thread of computation” computational irreducibility can mean that there’s no shortcut to just “going through the steps of the computation”, so in a multiway system multicomputational irreducibility can mean that there’s no shortcut to just “following all the threads of computation”, then seeing, for example, which end up merging with which.

But even though this could happen in principle, does it in fact happen in practice in cases of interest to us humans? In something like games or puzzles, we tend to want it to be hard—but not too hard—to “win”. And when it comes to mathematics and proving theorems, we similarly want the cases we use for exercises or competitions to be hard, but not too hard. But when it comes to mathematical research, and the frontiers of mathematics, one doesn’t immediately expect any such constraint. And the result is then that one can expect to be face-to-face with multicomputational irreducibility—making it hard for AI to help too much.

There is, however, one footnote to this story, and it has to do with how we choose new directions in mathematics. We can think of a metamathematical space formed by building up theorems from other theorems in all possible ways in a giant multiway graph. But as we’ll discuss below, most of the details of this are far from what human mathematicians would think of as “doing mathematics”. Instead, mathematicians implicitly seem to do mathematics at a “higher level” in which they’ve “coarse grained” this “microscopic metamathematics”—much as we might study a physical fluid in terms of comparatively-simple-to-describe continuous dynamics even though “underneath” there are lots of complicated molecular motions.

So can AI help with mathematics at this “fluid-dynamics-style” level? Potentially so, but mainly in what amounts to providing code assistance. We have something we want to express, say, in Wolfram Language. But we need help—“LLM style”—in going from our informal conception to explicit computational language. And insofar as what we’re doing follows the structural patterns of what’s been done before, we can expect something like an LLM to help. But insofar as what we’re expressing is “truly new”, and inasmuch as our computational language doesn’t involve much “boilerplate”, it’s hard to imagine that an AI trained on what’s been done before will help much. Instead, what we in effect have to do is some multicomputationally irreducible computation that allows us to explore some fresh part of the computational universe and the ruliad.

Exploring Spaces of Systems

“Can one find a system that does X?” Say a Turing machine that runs for a very long time before halting. Or a cellular automaton that grows, but only very slowly. Or, for that matter, a chemical with some particular property.

This is a somewhat different type of question than the ones we’ve been discussing so far. It’s not about taking a particular rule and seeing what its consequences are. It’s about identifying what rule might exist that has certain consequences.

And given some space of possible rules, one approach is exhaustive search. And in a sense this is ultimately the only “truly unbiased” approach, one that will discover what’s out there to discover, even when one doesn’t expect it. Of course, even with exhaustive search, one still needs a way to determine whether a particular candidate system meets whatever criterion one has set up. But now this is the problem of predicting a computation—where the things we said above apply.

OK, but can we do better than exhaustive search? And can we, for example, find a way to figure out what rules to explore without having to look at every rule? One approach is to do something like what happens in biological evolution by natural selection: start, say, from a particular rule, and then incrementally change it (perhaps at random), at every step keeping the rule or rules that do best, and discarding the others.

This isn’t “AI” as we’ve operationally defined it here (it’s more like a “genetic algorithm”)—though it is a bit like the inner training loop of a neural net. But will it work? Well, that depends on the structure of the rule space—and, as one sees in machine learning, it tends to work better in higher-dimensional rule spaces than lower-dimensional ones. Because with more dimensions there’s less chance one will get “stuck in a local minimum”, unable to find one’s way out to a “better rule”.

And in general, if the rule space is like a complicated fractal mountainscape, it’s reasonable to expect one can make progress incrementally (and perhaps AI methods like reinforcement learning can help refine what incremental steps to take). But if instead it’s quite flat, with, say, just one “hole” somewhere (“golf-course style”), one can’t expect to “find the hole” incrementally. So what is the typical structure of rule spaces? There are certainly plenty of cases where the rule space is altogether quite large, but the number of dimensions is only modest. And in such cases (an example being finding small Turing machines with long halting times) there often seem to be “isolated solutions” that can’t be reached incrementally. But when there are more dimensions, it seems likely that what amounts to computational irreducibility will more or less guarantee that there’ll be a “random-enough landscape” in which incremental methods will be able to do well, much as we have seen in machine learning in recent years.

So what about AI? Might there be a way for AI to learn how to “pick winners directly in rule space”, without any kind of incremental process? Might we perhaps be able to find some “embedding space” in which the rules we want are laid out in a simple way—and thus effectively “pre-identified” for us? Ultimately it depends on what the rule space is like, and whether the process of exploring it is necessarily (multi)computationally irreducible, or whether at least the aspects of it that we care about can be explored by a computationally reducible process. (By the way, trying to use AI to directly find systems with particular properties is a bit like trying to use AI to directly generate neural nets from data without incremental training.)

Let’s look at a specific simple example based on cellular automata. Say we want to find a cellular automaton rule that—when evolved from a single-cell initial condition—will grow for a while, but then die out after a particular, exact number of steps. We can try to solve this with a very minimal AI-like “evolutionary” approach: start from a random rule, then at each “generation” produce a certain number of “offspring” rules, each with one element randomly changed—then keep whichever is the “best” of these rules. If we want to find a rule that “lives” for exactly 50 steps, we define “best” to be the one that minimizes a “loss function” equal to how far the number of steps the rule actually “lives” is from 50.
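
Here's a minimal sketch of that evolutionary loop in Wolfram Language. The representation (a 3-color nearest-neighbor rule given as 27 base-3 "outcome values"), the 10 offspring per generation and the 100-step cap on each evolution are illustrative choices, not necessarily those used for the pictures below:

    (* number of steps a rule "lives" starting from a single cell, capped at tmax *)
    lifetime[digits_, tmax_] := LengthWhile[
       CellularAutomaton[{FromDigits[digits, 3], 3}, {{1}, 0}, tmax],
       Max[#] > 0 &];

    loss[digits_] := Abs[lifetime[digits, 100] - 50];

    (* randomly change one outcome value *)
    mutate[digits_] := ReplacePart[digits,
       RandomInteger[{1, Length[digits]}] -> RandomInteger[{0, 2}]];

    (* one path of evolution: at each generation keep the best of the parent plus 10 offspring *)
    evolution = NestList[
       First[MinimalBy[Append[Table[mutate[#], {10}], #], loss]] &,
       RandomInteger[{0, 2}, 27], 100];

    loss /@ evolution   (* how the loss changes along this path *)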

So, for example, say we start from the randomly chosen (3-color) rule:

Our evolutionary sequence of rules (showing here only the “outcome values”) might be:

If we look at the behavior of these rules, we see that—after an inauspicious start—they manage to successfully evolve to reach a rule that meets the criterion of “living for exactly 50 steps”:

What we’ve shown here is a particular randomly chosen “path of evolution”. But what happens with other paths? Here’s how the “loss” evolves (over the course of 100 generations) for a collection of paths:

And what we see is that there’s only one “winner” here that achieves zero loss; on all the other paths, evolution “gets stuck”.

As we mentioned above, though, with more “dimensions” one’s less likely to get stuck. So, for example, if we look at 4-color cellular automaton rules, there are now 64 rather than 27 possible elements (or effectively dimensions) to change, and in this case, many paths of evolution “get further”

and there are more “winners” such as:

How could something like neural nets help us here? Insofar as we can use them to predict cellular automaton evolution, they might give us a way to speed up what amounts to the computation of the loss for each candidate rule—though from what we saw in an earlier section, computational irreducibility is likely to limit this. Another possibility is that—much as in the previous section—we could try to use neural nets to guide us in which random changes to make at each generation. But while computational irreducibility probably helps in making things “effectively random enough” that we won’t get stuck, it makes it difficult to have something like a neural net successfully tell us “which way to go”.

Science as Narrative

In many ways one can view the essence of science—at least as it’s traditionally been practiced—as being about taking what’s out there in the world and somehow casting it in a form we humans can think about. In effect, we want science to provide a human-accessible narrative for what happens, say in the natural world.

The phenomenon of computational irreducibility now shows us that this will often ultimately not be possible. But whenever there’s a pocket of computational reducibility it means that there’s some kind of reduced description of at least some part of what’s going on. But is that reduced description something that a human could reasonably be expected to understand? Can it, for example, be stated succinctly in words, formulas, or computational language? If it can, then we can think of it as representing a successful “human-level scientific explanation”.

So can AI help us automatically create such explanations? To do so it must in a sense have a model for what we humans understand—and how we express this understanding in words, etc. It doesn’t do much good to say “here are 100 computational steps that produce this result”. To get a “human-level explanation” we need to break this down into pieces that humans can assimilate.

As an example, consider a mathematical proof, generated by automated theorem proving:

Automated theorem–proving table

A computer can readily check that this is correct, in that each step follows from what comes before. But what we have here is a very “non-human thing”—about which there’s no realistic “human narrative”. So what would it take to make such a narrative? Essentially we’d need “waypoints” that are somehow familiar—perhaps famous theorems that we readily recognize. Of course there may be no such things. Because what we may have is a proof that goes through “uncharted metamathematical territory”. So—AI assisted or not—human mathematics as it exists today may just not have the raw material to let us create a human-level narrative.

In practice, when there’s a fairly “short metamathematical distance” between steps in a proof, it’s realistic to think that a human-level explanation can be given. And what’s needed is very much like what Wolfram|Alpha does when it produces step-by-step explanations of its answers. Can AI help? Potentially, using methods like our second approach to AI-assisted multicomputation above.

And, by the way, our efforts with Wolfram Language help too. Because the whole idea of our computational language is to capture “common lumps of computational work” as built-in constructs—and in a sense the process of designing the language is precisely about identifying “human-assimilable waypoints” for computations. Computational irreducibility tells us that we’ll never be able to find such waypoints for all computations. But our goal is to find waypoints that capture current paradigms and current practice, as well as to define directions and frameworks for extending these—though ultimately “what we humans know about” is something that’s determined by the state of human knowledge as it’s historically evolved.

Proofs and computational language programs are two examples of structured “scientific narratives”. A potentially simpler example—aligned with the mathematical tradition for science—is a pure formula. “It’s a power law”. “It’s a sum of exponentials”. Etc. Can AI help with this? A function like FindFormula is already using machine-learning-inspired techniques to take data and try to produce a “reasonable formula for it”.

Here’s what it does for the first 100 primes:

Going to 10,000 primes it produces a more complicated result:
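
Schematically, the calls behind these results look something like the following (FindFormula's search is heuristic, and the exact options used for the figures aren't shown, so reruns needn't reproduce the same formulas):

    FindFormula[Table[{n, Prime[n]}, {n, 100}], n]

    FindFormula[Table[{n, Prime[n]}, {n, 10000}], n]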

Or, let’s say we ask about the relation between GDP and population for countries. Then we can get formulas like:

But what (if anything) do these formulas mean? It’s a bit like with proof steps and so on. Unless we can connect what’s in the formulas with things we know about (whether in number theory or economics) it’ll usually be difficult to conclude much from them. Except perhaps in some rare cases where one can say “yes, that’s a new, useful law”—like in this “derivation” of Kepler’s third law (where 0.7 is a pretty good approximation to 2/3):

There’s an even more minimal example of this kind of thing in recognizing numbers. Type a number into Wolfram|Alpha and it’ll try to tell you what “possible closed forms” for the number might be:

Possible closed forms of 12.1234

There are all sorts of tradeoffs here, some very AI informed. What’s the relative importance of getting more digits right compared to having a simple formula? What about having simple numbers in the formula compared to having “more obscure” mathematical constants (e.g. π versus Champernowne’s number)? When we set up this system for Wolfram|Alpha 15 years ago, we used the negative log frequency of constants in the mathematical literature as a proxy for their “information content”. With modern LLM techniques it may be possible to do a more holistic job of finding what amounts to a “good scientific narrative” for a number.

But let’s return to things like predicting the outcome of processes such as cellular automaton evolution. In an earlier section we discussed getting neural nets to do this prediction. We viewed this essentially as a “black-box” approach: we wanted to see if we could get a neural net to successfully make predictions, but we weren’t asking to get a “human-level understanding” of those predictions.

It’s a ubiquitous story in machine learning. One trains a neural net to successfully predict, classify, or whatever. But if one “looks inside” it’s very hard to tell what’s going on. Here’s the final result of applying an image identification neural network:

And here are the “intermediate thoughts” generated after going through about half the layers in the network:

Maybe something here is a “definitive signature of catness”. But it’s not part of our current scientific lexicon—so we can’t usefully use it to develop a “scientific narrative” that explains how the image should be interpreted.

But what if we could reduce our images to just a few parameters—say using an autoencoder of the kind we discussed above? Conceivably we could set things up so that we’d end up with “interpretable parameters”—or, in other words, parameters where we can give a narrative explanation of what they mean. For example, we could imagine using something like an LLM to pick parameters that somehow align with words or phrases (“pointiness”, “fractal dimension”, etc.) that appear in explanatory text from around the web. And, yes, these words or phrases could be based on analogies (“cactus-shaped”, “cirrus-cloud-like”, etc.)—and something like an LLM could “creatively” come up with these names.

But in the end there’s nothing to say that a pocket of computational reducibility picked out by a certain autoencoder will have any way to be aligned with concepts (scientific or otherwise) that we humans have so far explored, or given words to. Indeed, in the ruliad at large, it is overwhelmingly likely that we’ll find ourselves in “interconcept space”—unable to create what we would consider a useful scientific narrative.

This depends a bit, however, on just how we constrain what we’re looking at. We might implicitly define science to be the study of phenomena for which we have—at some time—successfully developed a scientific narrative. And in this case it’s of course inevitable that such a narrative will exist. But even given a fixed method of observation or measurement it’s basically inevitable that as we explore, computational irreducibility will lead to “surprises” that break out of whatever scientific narrative we were using. Or in other words, if we’re really going to discover new science, then—AI or not—we can’t expect to have a scientific narrative based on preexisting concepts. And perhaps the best we can hope for is that we’ll be able to find pockets of reducibility, and that AI will “understand” enough about us and our intellectual history that it’ll be able to suggest a manageable path of new concepts that we should learn to develop a successful scientific narrative for what we discover.

Finding What’s Interesting

A central part of doing open-ended science is figuring out “what’s interesting”. Let’s say one just enumerates a collection of cellular automata:

The ones that just die out—or make uniform patterns—“don’t seem interesting”. The first time one sees a nested pattern generated by a cellular automaton, it might seem interesting (as it did to me in 1981). But pretty soon it comes to seem routine. And at least as a matter of basic ruliology, what one ends up looking for is “surprise”: qualitatively new behavior one hasn’t seen before. (If one’s concerned with specific applications, say to modeling particular systems in the world, then one might instead want to look at rules with certain structure, whether or not their behavior “abstractly seems interesting”.)

The fact that one can expect “surprises” (and indeed, be able to do useful, truly open-ended science at all) is a consequence of computational irreducibility. And whenever there’s a “lack of surprise” it’s basically a sign of computational reducibility. And this makes it plausible that AI—and neural nets—could learn to identify at least certain kinds of “anomalies” or “surprises”, and thereby discover some version of “what’s interesting”.

Usually the basic idea is to have a neural net learn the “typical distribution” of data—and then to identify outliers relative to this. So for example we might look at a large number of cellular automaton patterns to learn their “typical distribution”, then plot a projection of this onto a 2D feature space, indicating where certain specific patterns lie:
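
As a rough sketch of this kind of pipeline using built-in functions (with elementary 2-color rules and automatic feature extraction standing in for whatever patterns and trained distribution model were actually used for the picture here):

    (* images of a random sample of elementary cellular automaton patterns *)
    patterns = Table[Image[CellularAutomaton[r, {{1}, 0}, 60]],
       {r, RandomSample[Range[0, 255], 60]}];

    FeatureSpacePlot[patterns]   (* 2D projection of a learned feature space *)

    FindAnomalies[patterns]      (* patterns that land in low-probability regions *)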

Some of the patterns show up in parts of the distribution where their probabilities are high, but others show up where the probabilities are low—and these are the outliers:

Are these outliers “interesting”? Well, it depends on your definition of “interesting”. And in the end that’s “in the eye of the beholder”. Here, the “beholder” is a neural net. And, yes, these particular patterns wouldn’t be what I would have picked. But relative to the “typical patterns” they do seem at least “somewhat different”. And presumably it’s basically a story like the one with neural nets that distinguish pictures of cats and dogs: neural nets make at least somewhat similar judgements to the ones we do—perhaps because our brains are structurally like neural nets.

OK, but what does a neural net “intrinsically find interesting”? If the neural net is trained, then it’ll very much be influenced by what we can think of as the “cultural background” it gets from this training. But what if we just set up neural nets with a given architecture, and pick their weights at random? Let’s say they’re neural nets that compute functions f(x). Then here are examples of collections of functions they compute:
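
One way to sketch that experiment: set up a small fully connected network mapping one number to one number, initialize it repeatedly with random weights, and plot the functions that result. The architecture here (two 20-unit Tanh layers) is an arbitrary illustrative choice, not necessarily the one behind the plots below:

    net = NetChain[{LinearLayer[20], Tanh, LinearLayer[20], Tanh, LinearLayer[1]},
       "Input" -> 1];

    (* six functions computed by randomly initialized copies of the same architecture *)
    Table[
      With[{f = NetInitialize[net, RandomSeeding -> i]},
        Plot[First[f[{x}]], {x, -3, 3}]],
      {i, 6}]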

Not too surprisingly, the functions that come out very much reflect the underlying activation functions that appear at the nodes of our neural nets. But we can see that—a bit like in a random walk process—“more extreme” functions are less likely to be produced by neural nets with random weights, so can be thought of as “intrinsically more surprising” for neural nets.

But, OK, “surprise” is one potential criterion for “interestingness”. But there are others. And to get a sense of this we can look at various kinds of constructs that can be enumerated, and where we can ask which possible ones we consider “interesting enough” that we’ve, for example, studied them, given them specific names, or recorded them in registries.

As a first example, let’s consider a family of hydrocarbon molecules: alkanes. Any such molecule can be represented by a tree graph with nodes corresponding to carbon atoms, each with valence at most 4. There are a total of 75 alkanes with 9 or fewer carbons, and all of them typically appear in standard lists of chemicals (and in our Wolfram Knowledgebase). But with 10 carbons only some alkanes are “interesting enough” that they’re listed, for example in our knowledgebase (aggregating different registries one finds more alkanes listed, but by 11 carbons at least 42 out of 159 always seem to be “missing”—and are not highlighted here):

What makes some of these alkanes be considered “more interesting” in this sense than others? Operationally it’s a question of whether they’ve been studied, say in the academic literature. But what determines this? Partly it’s a matter of whether they “occur in nature”. Sometimes—say in petroleum or coal—alkanes form through what amount to “random reactions”, where unbranched molecules tend to be favored. But alkanes can also be produced in biological systems, through careful orchestration, say by enzymes. But wherever they come from, it’s as if the alkanes that are more familiar are the ones that seem “more interesting”. So what about “surprise”? Whether a “surprise alkane”—say made by explicit synthesis in a lab—is considered “interesting” probably depends first and foremost on whether it’s identified to have “interesting properties”. And that in turn tends to be a question of how its properties fit into the whole web of human knowledge and technology.

So can AI help in determining which alkanes we’re likely to consider interesting? Traditional computational chemistry—perhaps sped up by AI—can potentially determine the rates at which different alkanes are “randomly produced”. And in a quite different direction, analyzing the academic literature—say with an LLM—can potentially predict how much a certain alkane can be expected to be studied or talked about. Or (and this is particularly relevant for drug candidates) whether there are existing hints of “if only we could find a molecule that does ___” that one can pick up from things like academic literature.

As another example, let’s consider mathematical theorems. Much like with chemicals, one can in principle enumerate possible mathematical theorems by starting from axioms and then seeing what theorems can progressively be derived from them. Here’s what happens in just two steps starting from some typical axioms for logic:

There are a vast number of “uninteresting” (and often seemingly very pedantic) theorems here. But among all these there are two that are interesting enough that they’re typically given names (“the idempotence laws”) in textbooks of logic. Is there any way to determine whether a theorem will be given a name? One might have thought that would be a purely historical question. But at least in the case of logic there seems to be a systematic pattern. Let’s say one enumerates theorems of logic starting with the simplest, and going on in a lexicographic order. Most theorems in the list will be derivable from earlier ones. But a few will not. And these turn out to be basically exactly the ones that are typically given names (and highlighted here):

Or, in other words, at least in the rather constrained case of basic logic, the theorems considered interesting enough to be given names are the ones that “surprise us with new information”.

If we look more generally in “metamathematical space” we can get some empirical idea of where theorems that have been “considered interesting” lie:

Could an AI predict this? We could certainly create a neural net trained from the existing literature of mathematics, and its few million stated theorems. And we could then start feeding this neural net theorems found by systematic enumeration, and asking it to determine how plausible they are as things that might appear in mathematical literature. And in our systematic enumeration we could even ask the neural net to determine what “directions” are likely to be “interesting”—like in our second method for “AI-assisted traversal of multiway systems” above.

But when it comes to finding “genuinely new science” (or math) there’s a problem with this—because a neural net trained from existing literature is basically going to be looking for “more of the same”. Much like the typical operation of peer review, what it’ll “accept” is what’s “mainstream” and “not too surprising”. So what about the surprises that computational irreducibility inevitably implies will be there? By definition, they won’t be “easily reducible” to what’s been seen before.

Yes, they can provide new facts. And they may even have important applications. But there often won’t be—at least at first—a “human-accessible narrative” that “reaches” them. And what it’ll take to create that is for us humans to internalize some new concept that eventually becomes familiar. (And, yes, as we discussed above, if some particular new concept—or, say, new theorem—seems to be a “nexus” for reaching things, that becomes a target for a concept that’s worth us “adding”.)

But in the end, there’s a certain arbitrariness in which “new facts” or “new directions” we want to internalize. Yes, if we go in a particular direction it may lead us to certain ideas or technology or activities. But abstractly we don’t know which direction we might go is “right”; at least in the first instance, that seems like a quintessential matter of human choice. There’s a potential wrinkle, though. What if our AIs know enough about human psychology and society that they can predict “what we’d like”? At first it might seem that they could then successfully “pick directions”. But once again computational irreducibility blocks us—because ultimately we can’t “know what we’ll like” until we “get there”.

We can relate all this to generative AI, for example for images or text. At the outset, we might imagine enumerating images that consist of arbitrary arrays of pixels. But an absolutely overwhelming fraction of these won’t be at all “interesting” to us; they’ll just look to us like “random noise”:

By training a neural net on billions of human-selected images, we can get it to produce images that are somehow “generally like what we find interesting”. Sometimes the images produced will be recognizable to the point where we’ll be able to give a “narrative explanation” of “what they look like”:

But very often we’ll find ourselves with images “out in interconcept space”:

Are these “interesting”? It’s hard to say. Scanning the brain of a person looking at them, we might notice some particular signal—and perhaps an AI could learn to predict that. But inevitably that signal would change if some type of “interconcept image” became popular, and started, say, to be recognized as a kind of art that people are familiar with.

And in the end we’re back to the same point: things are ultimately “interesting” if our choices as a civilization make them so. There’s no abstract notion of “interestingness” that an AI or anything can “go out and discover” ahead of our choices.

And so it is with science. There’s no abstract way to know “what’s interesting” out of all the possibilities in the ruliad; that’s ultimately determined by the choices we make in “colonizing” the ruliad.

But what if—instead of going out into the “wilds of the ruliad”—we stay close to what’s already been done in science, and what’s already “deemed interesting”? Can AI help us extend what’s there? As a practical matter—at least when supplemented with our computational language as a tool—the answer is at some level surely yes. And for example LLMs should be able to produce things that follow the pattern of academic papers—with dashes of “originality” coming from whatever randomness is used in the LLM.

How far can such an approach get? The existing academic literature is certainly full of holes. Phenomenon A was investigated in system X, and B in Y, but not vice versa, etc. And we can expect that AIs—and LLMs in particular—can be useful in identifying these holes, and in effect “planning” what science is (by this criterion) interesting to do. And beyond this, we can expect that things like LLMs will be helpful in mapping out “usual and customary” paths by which the science should be done. (“When you’re analyzing data like this, one typically quotes such-and-such a metric”; “when you’re doing an experiment like this, you typically prepare a sample like this”; etc.) When it comes to actually “doing the science”, though, our actual computational language tools—together with things like computationally controlled experimental equipment—will presumably be what’s usually more central.

But let’s say we’ve defined some major objective for science (“figure out how to reverse aging”, or, a bit more modestly, “solve cryonics”). In giving such an objective, we’re specifying something we consider “interesting”. And then the problem of getting to that objective is—at least conceptually—like finding a proof of a theorem or a synthesis pathway for a chemical. There are certain “moves we can make”, and we need to find out how to “string these together” to get to the objective we want. Inevitably, though, there’s an issue with (multi)computational irreducibility: there may be an irreducible number of steps we need to take to get to the result. And even though we may consider the final objective “interesting”, there’s no guarantee that we’ll find the intermediate steps even slightly interesting. Indeed, in many proofs—as well as in many engineering systems—one may need to build on an immense number of excruciating details to get to the final “interesting result”.

But let’s talk more about the question of what to study—or, in effect, what’s “interesting to study”. “Normal science” tends to be concerned with making incremental progress, remaining within existing paradigms, but gradually filling in and extending what’s there. Usually the most fertile areas are on the interfaces between existing well-developed areas. At the outset, it’s not at all obvious that different areas of science should ultimately fit together at all. But given the concept of the ruliad as the ultimate underlying structure, this begins to seem less surprising. Still, to actually see how different areas of science can be “knitted together” one will often have to identify—perhaps initially quite surprising—analogies between very different descriptive frameworks. “A decidable theory in metamathematics is like a black hole in physics”; “concepts in language are like particles in rulial space”; etc.

And this is an area where one can expect LLMs to be helpful. Having seen the “linguistic pattern” of one area, one can expect them to be able to see its correspondence in another area—potentially with important consequences.

But what about fresh new directions in science? Historically, these have often been the result of applying some new practical methodology (say for doing a new kind of experiment or measurement)—that happens to open up some “new place to look”, where people have never looked before. But usually one of the big challenges is to recognize that something one sees is actually “interesting”. And to do this often in effect involves the creation of some new conceptual framework or paradigm.

So can AI—as we’ve been discussing it here—be expected to do this? It doesn’t seem likely. AI is typically something trained on existing human material, intended to extrapolate directly from that. It’s not something built to “go out into the wilds of the ruliad”, far from anything already connected to humans.

But in a sense that is the domain of “arbitrary computation”, and of things like the simple programs we might enumerate or pick at random in ruliology. And, yes, by going out into the “wilds of the ruliad” it’s easy enough to find fresh, new things not currently assimilated into science. The challenge, though, is to connect them to anything we humans currently “understand” or “find interesting”. And that, as we’ve said before, is something that quintessentially involves human choice, and the foibles of human history. There are an infinite collection of paths that could be taken. (And indeed, in a “society of AIs”, there could be AIs that pursue a certain collection of them.) But in the end what matters to us humans and the enterprise we normally call “science” is our internal experience. And that’s something we ultimately have to form for ourselves.

Beyond the “Exact Sciences”

In areas like the physical sciences we’re used to the idea of being able to develop broad theories that can do things like make quantitative predictions. But there are many areas—for example in the biological, human and social sciences—that have tended to operate in much less formal ways, and where things like long chains of successful theoretical inferences are largely unheard of.

So might AI change that? There seem to be some interesting possibilities, particularly around the new kinds of “measurements” that AI enables. “How similar are those artworks?” “How close are the morphologies of those organisms?” “How different are those myths?” These are questions that in the past one mostly had to address by writing an essay. But now AI potentially gives us a path to make such things more definite—and in some sense quantitative.

Typically the key idea is to figure out how to take “unstructured raw data” and extract “meaningful features” from it that can be handled in formal, structured ways. And the main thing that makes this possible is that we have AIs that have been trained on large corpora that reflect “what’s typical in our world”—and which have in effect formed definite internal representations of the world, in terms of which things can for example be described (as we did above) by lists of numbers.
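
In Wolfram Language terms, the kind of construct involved is something like FeatureExtraction, which learns an extractor from a corpus of examples and then behaves like a reusable "measuring device". Here trainingExamples, itemA and itemB are placeholders for whatever unstructured data (images, texts, ...) one is working with:

    extractor = FeatureExtraction[trainingExamples];  (* a learned FeatureExtractorFunction *)

    v1 = extractor[itemA];  (* definite, repeatable lists of numbers *)
    v2 = extractor[itemB];

    EuclideanDistance[v1, v2]  (* a numerical "similarity measurement" *)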

What do those numbers mean? At the outset we typically have no idea; they’re just the output of some neural net encoder. But what’s important is that they’re definite, and repeatable. Given the same input data, one will always get the same numbers. And, what’s more, it’s typical that when data “seems similar” to us, it’ll tend to be assigned nearby numbers.

In an area like physical science, we expect to build specific measuring devices that measure quantities we “know how to interpret”. But AI is much more of a black box: something is being measured, but at least at the outset we don’t necessarily have any interpretation of it. Sometimes we’ll be able to do training that associates some description we know, so that we’ll get at least a rough interpretation (as in a case like sentiment analysis). But often we won’t.

(And it has to be said that something similar can happen even in physical science. Let’s say we test whether one material scratches the surface of another. Presumably we can interpret that as some kind of hardness of the material, but really it’s just a measurement, that becomes significant if we can successfully associate it with other things.)

One thing that’s particularly notable about “AI measurements” is how they can potentially pick out “small signals” from large volumes of unstructured data. We’re used to having methods like statistics to do similar things on structured, numerical data. But it’s a different story to ask from billions of webpages whether, say, kids who like science typically prefer cats or dogs.
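
A hedged sketch of what such an "AI measurement" might look like, using LLMFunction. The prompt, and the idea of reducing each document to a one-word answer that can then be tallied across a large corpus, are purely illustrative:

    preference = LLMFunction[
       "Does this text suggest the writer prefers cats, dogs, or neither? Answer with one word: ``"];

    preference["I spent the weekend grinding a telescope mirror while my kitten slept on my lap."]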

But given an “AI measurement” what can we expect to do with it? None of this is very clear yet, but it seems at least possible that we can start to find formal relationships. Perhaps it will be a quantitative relationship involving numbers; perhaps it will be better represented by a program that describes a computational process by which one measurement leads to others.

It’s been common for some time in areas like quantitative finance to find relationships between what amount to simple forms of “AI measurements”—and to be concerned mainly with whether they work, rather than why they work, or how one might narratively describe them.

In a sense it seems rather unsatisfactory to try to build science on “black-box” AI measurements that one can’t interpret. But at some level this is just an accelerated version of what we often do, say with everyday language. We’re exposed to some new observation or measurement. And eventually we invent words to describe it (“it looks like a fractal”, etc.). And then we can start “reasoning in terms of it”, etc.

AI measurements, though, are potentially a much richer source of formalizable material. But how should we do that formalization? Computational language seems to be key. And indeed we already have examples in the Wolfram Language—where functions like ImageIdentify or TextCases (or, for that matter, LLMFunction) can effectively make “AI measurements”, but then we can take their results, and work symbolically with them.
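
For instance (with the specific sentence purely illustrative): an "AI measurement" made by TextCases, whose output can immediately be fed into ordinary symbolic computation:

    cities = TextCases["I flew from Boston to Helsinki, then took the train onward.", "City"]

    (* working symbolically with the result, assuming both city names get picked up *)
    GeoDistance @@ (Interpreter["City"] /@ cities)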

In physical science we often imagine that we’re working only with “objective measurements” (though my recent “observer theory” implies that actually our nature as observers is crucial even here). But AI measurements seem to have a certain immediate “subjectivity”—and indeed their details (say, associated with the particulars of a neural net encoder) will be different for every different AI we use. But what’s important is that if the AI is trained on very large amounts of human experience, there’ll be a certain robustness to it. In a sense we can view many AI measurements as being like the output of a “societal observer”—that uses something like the whole mass of human experience, and in doing so gains a certain “centrality” and “inertia”.

What kind of science can we expect to build on the basis of what a “societal observer” measures? For the most part, we don’t yet know. There’s some reason to think that (as in the case of physics and metamathematics) such measurements might tap into pockets of computational reducibility. And if that’s the case, we can expect that we’ll be able to start doing things like making predictions—albeit perhaps only for the results of “AI measurements” which we’ll find hard to interpret. But by connecting such AI measurements to computational language, there seems to be the potential to start constructing “formalized science” in places where it’s never been possible before—and in doing so, to extend the domain of what we might call “exact sciences”.

(By the way, another promising application of modern AIs is in setting up “repeatable personas”: entities that effectively behave like humans with certain characteristics, but on which large-scale repeatable experiments of the kind typical in physical science can be done.)

So… Can AI Solve Science?

At the outset, one might be surprised that science is even possible. Why is it that there is regularity that we can identify in the world that allows us to form “scientific narratives”? Indeed, we now know from things like the concept of the ruliad that computational irreducibility is inevitably ubiquitous—and with it fundamental irregularity and unpredictability. But it turns out that the very presence of computational irreducibility necessarily implies that there must be pockets of computational reducibility, where at least certain things are regular and predictable. And it is within these pockets of reducibility that science fundamentally lives—and indeed that we try to operate and engage with the world.

So how does this relate to AI? Well, the whole story of things like trained neural nets that we’ve discussed here is a story of leveraging computational reducibility, and in particular computational reducibility that’s somehow aligned with what human minds also use. In the past the main way to capture—and capitalize on—computational reducibility was to develop formal ways to describe things, typically using mathematics and mathematical formulas. AI in effect provides a new way to make use of computational reducibility. Normally there’s no human-level narrative to how it works; it’s just that somehow within a trained neural net we manage to capture certain regularities that allow us, for example, to make certain predictions.

In a sense the predictions tend to be very “human style”, often looking “roughly right” to us, even though at the level of precise formal detail they’re not quite right. And fundamentally they rely on computational reducibility—and when computational irreducibility is present they more or less inevitably fail. In a sense, the AI is doing “shallow computation”, but when there’s computational irreducibility one needs irreducible, deep computation to work out what will happen.

And there are plenty of places—even in working with traditional mathematical structures—where what AI does won’t be sufficient for what we expect to get out of science. But there are also places where “AI-style science” can make progress even when traditional methods cannot. If one’s doing something like solving a single equation (say, an ODE) precisely, AI probably won’t be the best tool. But if one’s got a big collection of equations (say, for something like robotics) AI may successfully be able to give a useful “rough estimate” of what will happen, even when traditional methods would get utterly bogged down in details.
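
As a rough illustration of that contrast (the particular equation and the surrogate setup are invented for this sketch, not taken from the experiments above), traditional methods nail a single equation precisely, while a machine-learning surrogate trades precision for cheap approximate answers:

    (* Precise, traditional route: numerically solve one ODE *)
    sol = NDSolve[{y'[t] == -y[t]^2 + Sin[t], y[0] == 1}, y, {t, 0, 10}];
    yf = y /. First[sol];

    (* Rough, "AI-style" route: train a predictor on sampled values and use it as a surrogate *)
    data = Table[{t} -> yf[t], {t, 0., 10., 0.1}];
    surrogate = Predict[data];
    {yf[3.7], surrogate[{3.7}]}  (* precise value vs. rough estimate *)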

It’s a general feature of machine learning—and AI—techniques that they can be very useful if an approximate (“80%”) answer is good enough. But they tend to fail when one needs something more “precise” and “perfect”. And there are quite a few workflows in science (and probably more that can be identified) where this is exactly what one needs. “Pick out candidate cases for something”. “Identify a feature that might be important”. “Suggest a possible question to explore”.
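
A minimal sketch of that kind of workflow (the toy task of screening for primes is chosen purely to make the point): let a statistical classifier do the cheap, approximate “pick out candidates” step, and then apply an exact check to the shortlist it produces.

    (* Cheap, approximate screening step: a classifier trained on labeled examples *)
    training = Table[n -> PrimeQ[n], {n, 2, 2000}];
    screen = Classify[training];
    candidates = Select[Range[2001, 2200], screen];

    (* Precise follow-up: verify the shortlisted candidates exactly *)
    confirmed = Select[candidates, PrimeQ]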

There are clear limitations, though, particularly whenever there’s computational irreducibility. In a sense the typical AI approach to science doesn’t involve explicitly “formalizing things”. But in many areas of science formalization is precisely what’s been most valuable, and what’s allowed towers of results to be obtained. And in recent times we have the powerful new idea of formalizing things computationally—and in particular in using computational language to do this.

And given such a computational formalization, we’re able to start doing irreducible computations that let us reach discoveries we have no way to anticipate. We can, for example, enumerate possible computational systems or processes, and see “fundamental surprises”. In typical AI there’s randomness that gives us a certain degree of “originality” in our exploration. But it’s of a fundamentally lower level than we can reach with actual irreducible computations.
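
In the Wolfram Language that kind of enumeration is a one-liner; here, for instance, is a sweep over all 256 elementary cellular automaton rules starting from a single black cell, any of which can then be inspected for unexpected behavior (the number of steps and the image size are arbitrary choices):

    (* Enumerate all elementary cellular automaton rules and look for surprises *)
    Table[ArrayPlot[CellularAutomaton[r, {{1}, 0}, 80], ImageSize -> 60], {r, 0, 255}]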

So what should we expect for AI in science going forward? We’ve got in a sense a new—and rather human-like—way of leveraging computational reducibility. It’s a new tool for doing science, destined to have many practical uses. In terms of fundamental potential for discovery, though, it pales in comparison to what we can build from the computational paradigm, and from irreducible computations that we do. But probably what will give us the greatest opportunity to move science forward is to combine the strengths of AI and of the formal computational paradigm. Which, yes, is part of what we’ve been vigorously pursuing in recent years with the Wolfram Language and its connections to machine learning and now LLMs.

Notes

My goal here has been to outline my current thinking about the fundamental potential (and limitations) of AI in science—developing my ideas by using the Wolfram Language and its AI capabilities to do various simple experiments. I view what I’ve done here as just a beginning. Essentially every experiment could, for example, be done in much more detail, and with much more analysis. (And just click any image to get the Wolfram Language code that made it, so you can repeat or extend it.)

“AI in science” is a hot topic these days in the world at large, and I am surely aware only of a small part of everything that’s been done. My own emphasis has been on trying to “do the obvious experiments” and trying to piece together for myself the “big picture” of what’s going on. I should emphasize that there’s been a regular stream of outstanding and impressive “engineering innovations” in AI in recent times, and I won’t be at all surprised if experiments that haven’t worked well for me could be dramatically improved by future such innovations, conceivably even changing my “big-picture” conclusions from them.

I must also offer an apology. While I’ve been exposed—though often basically just “through the grapevine”—to lots of things being done on “AI in science”, especially over the past year, I haven’t made any serious attempt to systematically study the literature of the field, or trace its history and the provenance of ideas in it. So I must leave it to others to make connections between what I’ve done here and what other people may (or may not) have done elsewhere. It’d be fascinating to do a serious analysis of the history of work on AI in science, but it’s not something I’ve had a chance to do.

In my efforts here I have been greatly assisted by Wolfram Institute fellows Richard Assar (“Ruliad Fellow”) and Nik Murzin (“Fourmilab Fellow”). I’m also grateful to the many people who I’ve talked to—or heard from—about AI in science (and related topics) in recent times, including Giulio Alessandrini, Mohammed AlQuraishi, Brian Frezza, Roger Germundsson, George Morgan, Michael Trott and Christopher Wolfram.

Posted in: Artificial Intelligence, Computational Science, Mathematics, Physics, Ruliology

2 comments

  1. “…But let’s say we’ve defined some major objective for science (“figure out how to reverse aging”, or, a bit more modestly, “solve cryonics”)…”
    "但是假设我们为科学设定了一些重大目标(比如“找出如何逆转衰老”,或者稍微谦虚一点的“解决低温保存”)..."

    Finding what’s interesting, and defining a major objective for science, is a critical 1st step for any new development that changes the world…especially in healthcare.

    Ty for pointing out cryonics is “more modest” than reversing aging. You’re more right than you may know.

    Maybe an organization out of Stockholm eventually raises greater awareness to seek truth and save/extend lives. The benefits certainly outweigh the risks.

  2. It has long seemed to me that the goal of an intelligent agent would not be to predict the fine-grained evolution of a system (as in the initial attempts with NN’s and the cellular automata) but to simultaneously find a coarse-grained representation of the system and estimate a model that stochastically predicts the evolution of the system in that representation. In other words, to find a self-consistent set of “things that matter” for predicting “things that matter”.

    For instance, pressure, volume, and temperature, as a representation of a gas, allow you to predict future pressure, volume and temperature. But they help little in predicting the detailed state of individual molecules. Most conserved quantities also behave like this and admit computationally reducible laws at their own level of abstraction, while leaving detailed “reality” irreducible.

    This view, I believe, aligns with Yann LeCun’s work on V-JEPA and world models, where his goal is to find the encoding that admits the best predictor for future (or censored) encodings of the inputs (rather than for raw, pixel-level prediction). The embedding vectors of this encoding then become the representations of a world model that can be useful for many tasks. His work highlights that much of the challenge pragmatically is in keeping the encoder from degenerating towards a trivial equivalence relation with a single class.

    The coarse-grained representation is ultimately merely an equivalence relation on “base reality” and, in order to reap the benefit of the computational reducibility, the price of introducing entropy must be paid, with the thermodynamic entropy varying inversely with the information entropy of the equivalence classes.

    Physics as we know it might “simply” be a particular coarse-graining that admits laws of time evolution and is in a sweet spot, both in terms of granularity of state and reducibility of computation, to allow complex structures and even life within that regime.