A Framework for Convergence

What if I told you anyone can create a new CRDT? That with no special knowledge, you can write a set of mutations that can be applied to a data structure in a way that is guaranteed to converge? That you can do this without any locks, mutexes, coordination, or other concurrency primitives? That you can do this without any special knowledge of distributed systems?


meme

Nobody can be told this truth. They must see it for themselves.

Diving In

We'll start with an initial state --

And then let you author any mutation you want to update that state. The only rules are:

  1. Your mutations must be deterministic.
  2. If the mutation is a no-op it should return false.
  3. The first argument to the mutation is always state.
  4. Additional arguments for the mutation come after state.

Some example mutations have been provided for you.
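If you're reading this outside the interactive page, a mutation that follows those rules might look like the sketch below (the increment name and the count field on state are assumptions for illustration, not necessarily one of the provided examples):

increment(state, amount) {
  if (amount === 0) {
    // Nothing to do -- signal the no-op by returning false (rule 2).
    return false;
  }
  // Deterministic (rule 1), takes `state` first (rule 3), and any extra
  // arguments such as `amount` come after `state` (rule 4).
  state.count += amount;
}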

After making any changes to the code, press shift + enter or the play button (top right of the box) to re-run the code and update the example.

Let's see this in action.

Below are three independent UIs, representing nodes or processes, that are driven by three independent copies of state. On each one you can run your mutations, updating the state for that specific node. Whenever you desire, have the nodes merge their state together by pressing the "sync nodes!" button.



Pretty cool, right? All of the nodes converge to the same state on merge and you didn't have to write any special code to make this happen or use any special data structures. You just write vanilla JavaScript and the framework takes care of the rest.

Before we get into how this works, let's talk about intent. While CRDTs and other structures may meet the mathematical definition of convergence, it is likely that they will not meet the user's expectations of convergence.

Depending on what examples you ended up writing for yourself, you may have already noticed the intent problem.

Intent

How do we preserve the user's intent and how does the mathematical definition of convergence deviate from practical expectations?

To understand this better, we'll look at a delete operation that is based on array index.

deleteItem(state, index) {
  if (index < 0 || index >= state.items.length) {
    return false;
  }
  state.items.splice(index, 1);
}

This mutation is deterministic and it will converge, but it doesn't preserve intent.

A few ways it can fail to preserve intent:

  1. Two users trying to delete the same item concurrently will result in two items being removed.
  2. Trying to remove an element by array index could end up removing a completely unrelated element after merging states.

The above cases are illustrated below.

Fig 1: Both users attempt to delete the letter b concurrently. Since the mutation uses array indices, b and c are both removed on merge rather than just b.



Fig 2: Both users concurrently create a list of items. User 1 then removes c from their list (index 2). Post merge, b ends up being removed!

The problem with both examples is that we're using array index (or alias) to indirectly reference what we are actually talking about.

Fixing Intent

While, yes, you can create a CRDT without any special knowledge, you can't create a CRDT that preserves user intent without some knowledge.

Many types of intent (delete, move to, insert at) can be preserved by knowing a few principles and following them when writing mutations.

Intent Preserving Principles

Refer to items by their identity

When you're operating on something, you need to refer to that thing exactly rather than using an indirect reference.

This is most clearly illustrated in the mutations that used array indices to talk about what to remove. If, instead, each element had a unique ID and we passed that to mutations when deleting / moving / etc., the user's intent would be preserved.

Example

addItem(state, id, item) {
  if (state.items.find((i) => i.id === id)) {
    return false;
  }
  state.items.push({
    id,
    item,
  });
},
 
deleteItem(state, id) {
  const index = state.items.findIndex((e) => e.id === id);
  if (index == -1) {
    return false;
  }
  state.items.splice(index, 1);
}

Fig 3: The updated mutations. Note that the item id must be passed in to the mutation. Generating a random uuid within the mutation would make the mutation non-deterministic.

Go ahead and try those mutations yourself or watch the example --

Fig 4: Both users concurrently remove A_2 or item b. After merging, that is still what both users see in contrast to the array index based removal.

Position is Relative

If you need to prevent interleaving of sequences, use relative positions.

This is similar to the array index idea. Our addItem mutation is implicitly using array index as the sort order for items. This is fine for many use cases, but for other use cases, such as text insertion, this is a problem. The reason is that concurrent edits in the same location will end up interleaving characters.

E.g., if Node A writes "girl" and Node B concurrently writes "boy", the result post-merge will be "gbioryl".

We can fix this by making the notion of position relative. That is, the position an item was inserted at is identified by what is to the left or right of that item.

addItem(state, id, rightOf, item) {
  if (state.items.find((i) => i.id === id)) {
    return false;
  }
 
  let index = state.items.findIndex((e) => e.id === rightOf);
  if (index < 0) {
    index = state.items.length;
  }
 
  state.items.splice(index + 1, 0, {
    id,
    item
  });
}

Give it a whirl in the live example or see the video below --

Fig 5: Using "right of" to relatively position new insertions. No interleaving!

There could still be a problem here, however. What if another user concurrently deleted something you were inserting next to?

Relative positioning would break down in this case. If you want to insert next to b1 but b1 no longer exists after a merge, then your position no longer exists.

You could fix this with invariants (do not allow deletion of these sorts of things) or soft deletion & tombstoning. Tombstoning an item would drop its content but retain its id and position so position information always exists.

In addition to the problems outlined for sequences, we can run into issues when moving structures around that have pointers to one another. E.g., moving folders inside a file tree. Concurrent edits of this tree can create cycles such as a folder that contains itself.
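To make the tombstoning idea for sequences concrete, a rough sketch might look like the following (it assumes the { id, item } shape from the earlier mutations plus a hypothetical tombstone flag, and expects the UI to skip tombstoned entries when rendering):

deleteItem(state, id) {
  const existing = state.items.find((e) => e.id === id);
  if (existing == null || existing.tombstone) {
    // Already gone (or never existed) -- a no-op.
    return false;
  }
  // Drop the content but keep the id in place so relative positions
  // ("rightOf" references) that point at this item keep working after merges.
  existing.tombstone = true;
  existing.item = null;
}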

While this framework idea is powerful (we got relatively far with 0 knowledge of CRDTs), certain cases can begin to explode in complexity.

Other Options for Intent
意图的其他选项

A different approach to preserving intent is to reach for an existing CRDT that has the intent-preserving properties you'd like. This is where we're headed with Vulcan. The end goal is a declarative extension to the SQL Schema Definition Language to allow you to easily combine the behavior you need:

CREATE TABLE post (
 id INTEGER PRIMARY KEY NOT NULL,
 views COUNTER,
 content PERITEXT,
 owner_id LWW INTEGER
) AS CausalLengthSet;

Fig 6: The north star syntax of cr-sqlite.

At some point we might ship a "convergence framework" idea to provide a simple way to author custom convergent data types when those that are available don't meet your needs.

👋

If any of this sounds interesting to you, schedule some time with me to discuss it or:

  • CRDTs
  • Contributing to cr-sqlite
  • Integration into your product, sponsorship, investment, contract work, design partnerships, etc.

Even though there are drawbacks, the framework seems pretty magical. For many cases you can just write normal business logic and the state you mutated will converge as expected. How does this work?

Framework Implementation

The framework is split into two parts:

  1. A log of mutations
  2. Machinery to merge and apply the log to the application's state

Mutation Log

Every time one of the named mutations is run, we log it. The log looks like:

type MutationLog = [EventID, ParentIDs[], MutationName, Args[]][];

In a system that isn't distributed, the log would be linear. In our case, however, we need to account for concurrent edits happening in other processes. This is where ParentIDs comes in. ParentIDs refers to all the mutations that happened immediately before the current mutation.

Thus the log, rather than being linear, is a DAG. The relationships in the DAG express time.

  • Branches represent concurrent modifications.
  • Branches that converge into a node represent events that happened before the node being converged into.
  • Linear runs represent events that happen one after the other.
  • Linear runs in separate branches are concurrent sequences of events.
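For example, a few entries of such a log might look like the following sketch (the event ids and the increment mutation are made up for illustration):

const log: MutationLog = [
  // A linear run on node A: each event's parent is the previous event.
  ["A-1", ["ROOT"], "increment", []],
  ["A-2", ["A-1"], "increment", []],
  // Concurrent with A-2: recorded on node B off the same parent, so the log branches.
  ["B-1", ["A-1"], "increment", []],
  // Recorded on node A after syncing with B, so both branch tips are its parents.
  ["A-3", ["A-2", "B-1"], "increment", []],
];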

time is a dag

Fig 7: 3 processes incrementing a distributed counter.

  • (A) Two processes increment the counter twice (concurrent sequences of events) at the same time a third process increments it once (concurrent event).
  • (B) All processes sync their changes (arrows converging).
  • (C) A single process increments the counter twice more after all concurrent events.

In this setup, each process can locally write to its log without coordinating with others. All it needs to do when writing is create a new event whose parents are the leaves of its current copy of the log. In our interactive example, each mutation you define is decorated such that it calls addEvent when being invoked.

export type Event = {
  mutationName: string;
  mutationArgs: any[];
};
 
class DAG {
  root: DAGNode;
  nodeRelation: Map<NodeID, DAGNode> = new Map();
  seq: number = 0;
 
  ...
 
  addEvent(event: Event) {
    const node: DAGNode = {
      parents: this.findLeaves(),
      id: `${this.nodeName}-${this.seq++}`,
      event,
    };
    this.nodeRelation.set(node.id, node);
 
    return node;
  }
 
  findLeaves(): Set<NodeID> {
    const leaves = new Set<NodeID>([...this.nodeRelation.keys()]);
    for (const n of this.nodeRelation.values()) {
      // if p is a parent then it is not a leaf. Remove it from the leaves set.
      for (const p of n.parents) {
        leaves.delete(p);
      }
    }
 
    return leaves;
  }
}
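The decoration mentioned above might be wired up roughly like this (a sketch only -- wrapMutations and the decision to skip logging no-ops are assumptions about the sandbox, not its actual internals):

type Mutation = (state: any, ...args: any[]) => boolean | void;

function wrapMutations(
  mutations: Record<string, Mutation>,
  dag: DAG,
  state: any
): Record<string, (...args: any[]) => boolean | void> {
  const wrapped: Record<string, (...args: any[]) => boolean | void> = {};
  for (const [name, fn] of Object.entries(mutations)) {
    wrapped[name] = (...args: any[]) => {
      // Apply the mutation to local state; a `false` return means it was a
      // no-op, so there is nothing worth logging.
      if (fn(state, ...args) === false) {
        return false;
      }
      // Record the mutation by name + args so peers can replay it later.
      dag.addEvent({ mutationName: name, mutationArgs: args });
    };
  }
  return wrapped;
}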

Syncing state between processes is a matter of merging these logs together, and merging these DAGs is trivial. Since each node in the DAG is completely unique and immutable, all we have to do is union the sets of nodes. This unioning can happen in any order and is idempotent, meaning convergence does not rely on the order the DAGs are merged and duplicate syncs are not a problem.

class DAG {
 
  ...
 
  merge(other: DAG): DAG {
    const ret = new DAG(this.nodeName);
    ret.nodeRelation = new Map([...this.nodeRelation, ...other.nodeRelation]);
    ret.seq = Math.max(this.seq, other.seq) + 1;
    return ret;
  }
}
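Hypothetically, syncing two nodes is then just exchanging logs and merging in both directions (dagA and dagB are made-up names, not part of the original example):

// Hypothetical peers, each with their own copy of the log.
const dagA = new DAG("A");
const dagB = new DAG("B");
// ...each records events locally, then they exchange and merge.
// Order doesn't matter and re-merging is harmless (idempotent):
const mergedAtA = dagA.merge(dagB); // A pulls in B's events
const mergedAtB = dagB.merge(dagA); // B pulls in A's events
// Both results contain the same set of nodes, so replaying either log
// yields the same application state on both peers.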

Since all processes share a common ancestor for their DAG (the sentinel "ROOT" element), the DAG will always be connected after a merge. This ensures no events are ever lost by being orphaned.

Based on this, the DAG:

  • Only grows
  • Can be merged in any order (it is commutative and associative)
  • Merges are idempotent

Meaning the DAG is a CRDT. Since we know the DAG will always converge across all peers, we can use it to order and drive mutations against application state.

Which brings us to the last trick -- updating the current application state based on the DAG.

Updating State

Updating application state after a merge is a matter of:

  1. Rewinding to a certain snapshot of application state (or all the way back to the initial state)
  2. Replaying the mutation log against that state

But how do you replay a log that is not linear? You make it linear. Each process does a deterministic breadth first traversal of the DAG to create a single sequence of events. This interpretation of events is what is applied to the state snapshot.

The area where this gets tricky is when handling concurrent branches. Since there is no ordering between concurrent events, we just have to pick one and stick with it. A common way to order concurrent events is to order them by process id or value.

A non-optimized version of this based on our DAG structure:

export type NodeID = "ROOT" | `${NodeName}-${number}`;
 
export type DAGNode = {
  parents: Set<NodeID>;
  id: NodeID;
  event: Event;
};
 
class DAG {
  ...
 
  getEventsInOrder(): DAGNode[] {
    const graph = new Map<NodeID, DAGNode[]>();
    for (const n of this.nodeRelation.keys()) {
      graph.set(n, []);
    }
 
    // Convert the graph so we have child pointers.
    for (const n of this.nodeRelation.values()) {
      for (const p of n.parents) {
        graph.get(p)!.push(n);
      }
    }
 
    // Now sort the children of each node by their id. IDs never collide given node name is encoded into the id.
    for (const children of graph.values()) {
      children.sort((a, b) => {
        const [aNode, aRawSeq] = a.id.split("-");
        const [bNode, bRawSeq] = b.id.split("-");
        const aSeq = parseInt(aRawSeq);
        const bSeq = parseInt(bRawSeq);
        if (aSeq === bSeq) {
          return aNode < bNode ? -1 : 1;
        }
        return aSeq - bSeq;
      });
    }
 
    const events: DAGNode[] = [];
 
    // Finally do our breadth first traversal.
    const visited = new Set<DAGNode>();
    const toVisit = [this.root];
    for (const n of toVisit) {
      if (visited.has(n)) {
        continue;
      }
      visited.add(n);
      if (n.id !== "ROOT") {
        events.push(n);
      }
      for (const child of graph.get(n.id)!) {
        toVisit.push(child);
      }
    }
 
    return events;
  }
}
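Given that ordering, the rewind-and-replay step from earlier reduces to something like the sketch below (rebuildState, the mutations record, and the use of structuredClone are assumptions about this toy, not the sandbox's exact code):

function rebuildState(
  dag: DAG,
  initialState: any,
  mutations: Record<string, (state: any, ...args: any[]) => boolean | void>
): any {
  // Rewind: start from a copy of the initial (or snapshotted) state...
  const state = structuredClone(initialState);
  // ...and replay: apply every logged mutation in the deterministic order.
  for (const node of dag.getEventsInOrder()) {
    mutations[node.event.mutationName](state, ...node.event.mutationArgs);
  }
  return state;
}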

While toy versions of all of these ideas can seemingly be implemented in 100 lines of JavaScript, a production-grade version that:

  • Scales to hundreds or thousands of transactions per second (no matter how large the transaction logs become)
  • Shares structure and compresses well

will take some doing, although you could certainly scale the toy to specific use cases with some hacks. For the general case you'll need:

  • A data structure that can rewind to arbitrary versions easily
  • A way to calculate the deltas between DAGs on different nodes so as not to require sending the entire DAG on sync (a toy version of this is sketched after the list)
  • Something to prevent mistakes such as mutations being non-deterministic
  • A way to compress the log while still supporting random access to it
  • A way to know when it is safe to prune events out of the log
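For the toy DAG above, the delta calculation could be sketched as crudely as this (the delta helper and the idea of a peer advertising the node ids it already has are assumptions; a real implementation would exchange something much smaller, such as each peer's heads):

// Send only the DAG nodes the other peer reports it is missing.
function delta(dag: DAG, peerHas: Set<NodeID>): DAGNode[] {
  return [...dag.nodeRelation.values()].filter((n) => !peerHas.has(n.id));
}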

Acknowledgements