Practical IL2CPP Reverse Engineering: Extracting Protobuf definitions from applications using protobuf-net (Case Study: Fall Guys)

August 10, 2020

DISCLAIMER: The following information and source code is provided for educational purposes only. I do not condone cheating in online multiplayer games and expressly discourage this behaviour. This tutorial is intended to demonstrate the thought processes and techniques involved in reverse engineering. It is not intended to enable cheating, the modification of gameplay or any interference or alteration of any server-side components of the analysed product in any way whatsoever. Check your local laws before using this software. At the time of writing I have never connected to a Fall Guys network endpoint or launched the client.

You can download the full source code for this tutorial from the Il2CppProtoExtractor-FallGuys GitHub repo.

[Updated 7th December 2020: added instructions for using Il2CppInspector’s NuGet package in preference to creating a git clone; added link to commit with an example showing how to find attribute values automatically with a disassembly framework]

Introduction

Il2CppInspector provides several powerful tools to interact with IL2CPP application code and data via static analysis:

  • A low-level binary representation (Il2CppInspector) which allows you to query the IL2CPP metadata in its original format
  • A .NET type model (TypeModel) which provides a Reflection-style API to all of the types in the application
  • An application model (AppModel) which provides an API to query the compiled C++ types, methods and other symbols in the binary, including those not represented by .NET types

In this article, we will leverage the .NET type model to inspect a game and derive a Google Protobuf .proto file encapsulating its network protocol.

Pre-requisites:

  • Knowledge of .NET, C# and LINQ
  • Basic awareness of what IL2CPP is and what it does (no in-depth knowledge needed)
  • Basic awareness of what Google Protobuf is
  • Basic knowledge of how to use a disassembler such as IDA and how to read basic x86-64 assembly code
  • An inquisitive mind

In this article, you will learn:

  • How to set up a new Visual Studio project which uses Il2CppInspector
  • How to load an IL2CPP application and create a type model
  • How to use LINQ to query .NET types, interfaces, fields, properties, generic type arguments, arrays and attributes in an IL2CPP application
  • How to extract constructor arguments to custom attributes not retained by IL2CPP in the metadata
  • How to transform all of the combined data into a .proto file

The game at hand today is Fall Guys published by Devolver Digital, a Battle Royale-style party game where 60 players race around in bright colorful maps vying for victory. The game requires an upfront purchase and then has microtransactions on top. Being asked to pay more for the rest of the content when I’ve already purchased a game makes me very cantankerous, and Fall Guys also happens to be compiled with IL2CPP, which makes it the perfect target for some reverse engineering fun!

Although I’m using Fall Guys for this example, many of the techniques described below are applicable to any game deployed with IL2CPP and using Protobuf.

I bought the game. What do I do now?

At this point we’re going to pretend we were in the position I was in 2 days ago: we’d like to look at the game’s network protocol but we know literally nothing about it. All we have are the game files.

Your first job is to run Il2CppInspector (at the CLI or GUI) using Fall Guys’s GameAssembly.dll and FallGuys_client_Data/il2cpp_data/Metadata/global-metadata.dat files from the game’s Steam directory as input. All Windows IL2CPP games use this same folder layout so you’ll always be able to find these files in the same relative paths for any game.

Il2CppInspector produces a lot of output files, but the ones we are interested in today are:

  • The C# types file (types.cs by default). This will let us cruise around every type and method declaration in the game looking for items of interest!
  • The IDA/Ghidra Python script (il2cpp.py by default). This is actually optional, and if you’re not using a disassembler you don’t need it at all, but if you are, running this script in your disassembler will make the game code much easier to read.

Tip: Need a disassembler? Try downloading IDA Freeware Edition.

Tip: Visual Studio Code’s folder browser provides a convenient way to browse the .cs files generated by Il2CppInspector. For easy searching in this use case, I recommend selecting File per namespace in the GUI or using the --layout=namespace option from the CLI when generating the files.

Determining which networking library is in use

When we first inspect an application, we don’t really know what we’re looking for exactly and have to use some intuition. This essentially comes from experience, and as you get more reverse engineering (RE) adventures under your belt you’ll gradually become familiar with the tools and libraries commonly used by software developers.

There are a couple of obvious starting points:

  1. Watch the network traffic with a tool like Wireshark to see if there is anything recognizable (requires knowledge of network protocols and doesn’t always work if the traffic is encrypted)
  2. Grep (search) through the .cs files produced by Il2CppInspector for words like “packet”, “client”, “server”, “network”, “message”, “send”, “receive”, “protocol” etc. You can also search for the names of common networking libraries such as Lidgren, Photon PUN or indeed Google Protobuf. This is a very crude brute-force approach, but when you’ve got nothing, you’ve got nothing to lose!
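The brute-force search in step 2 can be scripted if you would rather not grep by hand. The following is a minimal sketch, assuming the C# output was generated as a folder of .cs files. The class name KeywordGrep and the exact keyword list are illustrative and not part of Il2CppInspector.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class KeywordGrep
{
    // Search terms based on the list above; extend as needed
    public static readonly string[] Keywords = {
        "packet", "client", "server", "network", "message",
        "send", "receive", "protocol", "Lidgren", "Photon", "Protobuf"
    };

    // Count, per keyword, how many lines under 'root' mention it (case-insensitive)
    public static Dictionary<string, int> CountHits(string root)
    {
        var hits = Keywords.ToDictionary(k => k, k => 0);
        foreach (var file in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories))
            foreach (var line in File.ReadLines(file))
                foreach (var keyword in Keywords)
                    if (line.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
                        hits[keyword]++;
        return hits;
    }

    public static void Main(string[] args)
    {
        // Hypothetical default: the folder holding Il2CppInspector's C# output
        var root = args.Length > 0 ? args[0] : ".";
        foreach (var pair in CountHits(root).OrderByDescending(p => p.Value))
            Console.WriteLine($"{pair.Key,-10} {pair.Value}");
    }
}
```

Run it with the output folder as the first argument. Keywords with many hits are a reasonable place to start reading, although common words such as "message" will also match plenty of unrelated code.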

Tip: Research is 90% of hacking. If you don’t know what the popular networking libraries are, simply Google “most popular c# network libraries for unity games” and browse through a few of the results.

Searching through the generated C# code we can quickly see there is a namespace called Google.Protobuf and another one just called ProtoBuf, so it’s a really good bet that at least some of the network traffic is using Protobuf. Note I say “at least some” because you should never assume anything! Indeed, Fall Guys uses Protobuf for login and matchmaking but uses a different, UDP transport layer-based format for in-game simulation updates (we won’t be covering that here).

One thing that makes Protobuf a little bit different from the other options is that Protobuf itself isn’t actually a specific library but merely the specification of a format. While Google does have its own implementations for a variety of languages, a number of other implementations have also cropped up for .NET including SilentOrbit and protobuf-net.

Protobuf compilers all work in a similar way: they take one or more .proto files describing various network message formats, and generate C# classes which represent those messages together with methods to serialize and deserialize them. Each produces its own flavour of code so we’re going to need to know the specific implementation our game is using, but fortunately this is easy: all of the implementations are open source and we can simply examine some typical code they generate and look for similarities in our generated C# code.

This is something you will generally research yourself, but to cut a long story short:

  • Google’s implementation always has a field called _parser in every message class
  • SilentOrbit’s implementation always has a method called DeserializeLengthDelimited(Stream stream) for every message class
  • protobuf-net’s implementation always decorates each message class with a [ProtoContract] attribute (there is an exception to this where you can configure attribute-less serialization behaviour using protobuf-net’s RuntimeTypeModel object, but this is not applicable here)

We are essentially looking for the simplest possible reliable indicator that a particular library is in use. It doesn’t matter what the indicator is, as long as we can reliably find it and use it to identify which classes in the application are Protobuf message classes.
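To make the idea of an indicator concrete, here is a rough sketch that tallies the three indicators listed above in a blob of generated C# text. The class name ProtobufFlavour is made up for illustration, and the exact way each indicator is rendered in Il2CppInspector's output is an assumption. Plain substring matching is crude, so treat the counts as a hint rather than proof.

```csharp
using System;
using System.Collections.Generic;

public static class ProtobufFlavour
{
    // Indicator text per implementation, taken from the list above.
    // How each one appears in the generated C# is an assumption.
    private static readonly Dictionary<string, string> Indicators = new Dictionary<string, string>
    {
        ["Google.Protobuf"] = "_parser",
        ["SilentOrbit"] = "DeserializeLengthDelimited(",
        ["protobuf-net"] = "[ProtoContract]"
    };

    // Count non-overlapping occurrences of 'needle' in 'text'
    public static int Count(string text, string needle)
    {
        int count = 0, index = 0;
        while ((index = text.IndexOf(needle, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += needle.Length;
        }
        return count;
    }

    // Tally every indicator found in a blob of generated C# source
    public static Dictionary<string, int> Tally(string source)
    {
        var tally = new Dictionary<string, int>();
        foreach (var indicator in Indicators)
            tally[indicator.Key] = Count(source, indicator.Value);
        return tally;
    }

    public static void Main()
    {
        // Hypothetical excerpt standing in for Il2CppInspector's output
        var sample = "[ProtoContract] public class A { }\n[ProtoContract] public class B { }";
        foreach (var entry in Tally(sample))
            Console.WriteLine($"{entry.Key}: {entry.Value}");
    }
}
```

Counting rather than returning the first match is deliberate: an application can ship more than one implementation, and the relative counts show which one the message classes actually use.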

Again I want to stress that research is critical: I didn’t conjure this information up by magic; I found it by downloading each tool and experimenting with simple test cases, and by reading the documentation for each tool. This kind of work is often the most time-consuming part and typically gets overlooked in guides like this, but it’s an essential skill – take your time!

As we look through the generated C# files, we can indeed see a large number of classes annotated with the [ProtoContract] attribute, so we can safely conclude that we are looking at a usage of the protobuf-net library. We make a mental note of this as it will be important later.

Creating a project which uses Il2CppInspector

Looking through all the message classes in our favourite text editor (and by that I obviously mean Visual Studio Code) is nice, but if we’re looking to interpret the data on the wire from outside the application or write a replacement client or server, we’re going to need to reconstruct the original .proto files.

You could do this by hand by painstakingly looking through all the C# generated by Il2CppInspector, but that’s not going to be a good time. Plus, in-game network protocols frequently change with new patches, so we really want to automate the process of generating these files as much as possible.

The real power of Il2CppInspector lies not in the files it produces, but in how you can integrate it into your own RE tools. Let’s do this now by creating a new project and setting it up for use with Il2CppInspector.

  1. In Visual Studio (the full fat version, not VS Code – we’re going to need our big girl/boy pants for this), choose File -> New Project…, select Console App (.NET Core) as the project type and give it a name that makes you feel like a hacker warlord (I went with ConsoleApp1)

  2. Right-click on the project in Solution Explorer and choose Manage NuGet Packages…

  3. In the Browse tab, search for NoisyCowStudios.Il2CppInspector and install the latest version

  4. To confirm that it works, add:

    using Il2CppInspector;
    to the top of Program.cs in your console app project and compile.

Info: If you prefer to use the latest commit of Il2CppInspector directly from GitHub, replace steps 2 and 3 above as follows:

  1. Add the Il2CppInspector GitHub repo. To do this, open a command prompt, cd into your solution folder and type:

    git clone --recursive https://github.com/djkaty/Il2CppInspector
    (note: if you have created a GitHub repo for your solution, use git submodule add instead[1])

  2. Right-click on the solution in Solution Explorer and choose Add -> New Solution Folder. Call it Submodules.

  3. Right-click Submodules and choose Add -> Existing Project… then navigate through your solution folder and select Il2CppInspector/Il2CppInspector.Common/Il2CppInspector.csproj. Il2CppInspector will now appear in Solution Explorer.

  4. Repeat this process and this time select Il2CppInspector/Bin2Object/Bin2Object/Bin2Object.csproj.

  5. Right-click your new console app project in Solution Explorer and choose Add -> Project Reference… Tick the box for Il2CppInspector in the dialog that opens and click OK (you don’t need to tick the box for Bin2Object).

If all goes well, you should see compile output like this:

1>------ Rebuild All started: Project: Bin2Object, Configuration: Debug Any CPU ------
1>Bin2Object -> H:\test\ConsoleApp1\Il2CppInspector\Bin2Object\Bin2Object\bin\Debug\netstandard2.1\Bin2Object.dll
2>------ Rebuild All started: Project: Il2CppInspector, Configuration: Debug Any CPU ------
2>Il2CppInspector -> H:\test\ConsoleApp1\Il2CppInspector\Il2CppInspector.Common\bin\Debug\netstandard2.1\Il2CppInspector.Common.dll
3>------ Rebuild All started: Project: ConsoleApp1, Configuration: Debug Any CPU ------
3>ConsoleApp1 -> h:\test\ConsoleApp1\ConsoleApp1\bin\Debug\netcoreapp3.1\ConsoleApp1.dll
========== Rebuild All: 3 succeeded, 0 failed, 0 skipped ==========

(see commit for these changes)

Creating the type model

Before we can do any analysis, we need to process the binary and metadata files with Il2CppInspector. If we copy those files to our project output directory, this can be written as follows:

using System;
using Il2CppInspector.Reflection;

public static class Program
{
    // Set the path to your metadata and binary files here
    public static string MetadataFile = @"global-metadata.dat";
    public static string BinaryFile = @"GameAssembly.dll";

    public static void Main(string[] args)
    {
        // First we load the binary and metadata files into Il2CppInspector
        // There is only one image so we use [0] to select this image
        Console.WriteLine("Loading package...");
        var package = Il2CppInspector.Il2CppInspector.LoadFromFile(
            BinaryFile, MetadataFile, silent: true)[0];

        // Now we create the .NET type model from the package
        // This creates a .NET Reflection-style interface we can query with LINQ
        Console.WriteLine("Creating type model...");
        var model = new TypeModel(package);
    }
}

This code will take a few seconds (or more) to run. Once finished, Il2CppInspector gives access to the binary and metadata as a single package and allows low-level querying of all the metadata structures and tables in the application. TypeModel takes all of this information and converts it into a high-level .NET type model.

Note: Some binary files, for example Apple Universal Binaries (“Fat Mach-O”) and APK files may contain multiple IL2CPP images for different architectures. Il2CppInspector.LoadFromFile returns an array of Il2CppInspectors – one for each image. If you know your binary only contains one image, just use [0] or First() to select it.

(see commit for these changes)

The simplest possible output: all message names

Let’s start with a basic sanity check (I’m told this is politically incorrect now – please send complaints to /dev/null) by printing out the names of all of the Protobuf message classes – that is, all of the classes with a [ProtoContract] attribute.

First, we need to get a handle to ProtoContractAttribute (the naming convention of C# attributes dictates that the Attribute suffix is elided when using it in an attribute decorator, so the actual name of the type is ProtoContractAttribute).

// All protobuf messages have this class attribute
var protoContract = model.GetType("ProtoBuf.ProtoContractAttribute");

Now we can enumerate every type which has this attribute:

using System.Linq;
// Get all the messages by searching for types with [ProtoContract]
var messages = model.TypesByDefinitionIndex
    .Where(t => t.CustomAttributes.Any(a => a.AttributeType == protoContract));

TypeModel is designed to be queried with LINQ and as you can see it’s trivially easy to drill down to the types we want. Here we go through TypesByDefinitionIndex – which includes exactly the types you see in Il2CppInspector’s C# output – look at the attributes of each one and select any that have ProtoContractAttribute.
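If the query shape looks familiar, that is because it mirrors ordinary System.Reflection. As a point of comparison, here is a small sketch of the same pattern run against a normal .NET assembly instead of an IL2CPP type model. ProtoContractAttribute below is a local stand-in rather than the real protobuf-net type, and the message classes are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Stand-in for protobuf-net's attribute so the sketch needs no external package
[AttributeUsage(AttributeTargets.Class)]
public sealed class ProtoContractAttribute : Attribute { }

// [ProtoContract] is shorthand for ProtoContractAttribute: the suffix is elided
[ProtoContract] public class LoginRequest { }   // hypothetical message class
[ProtoContract] public class LoginResponse { }  // hypothetical message class
public class NotAMessage { }

public static class ReflectionQueryDemo
{
    // Names of all types in this assembly decorated with [ProtoContract], sorted
    public static string[] MessageNames()
    {
        var protoContract = typeof(ProtoContractAttribute);
        return typeof(ReflectionQueryDemo).Assembly.GetTypes()
            .Where(t => t.CustomAttributes.Any(a => a.AttributeType == protoContract))
            .Select(t => t.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        foreach (var name in MessageNames())
            Console.WriteLine(name);
    }
}
```

The Where clause has the same shape as the TypeModel query: only the source of the types differs.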

NOTE: TypeModel includes various other type enumerators including TypesByReferenceIndex, GenericParameterTypes, TypesByFullName, and Types (which includes everything), as well as the GetType() method.

Normally we would iterate over types with Types, however for Fall Guys this includes some constructed generic types decorated with [ProtoContract] that we don’t want. Whereas Types includes every compiled generic type with concrete type arguments, TypesByDefinitionIndex only includes types as they have been defined by the developer in source code.

Another way to avoid this problem in this particular application is to still iterate over Types but include the condition !t.IsGenericType in the LINQ Where query.

Now that we have a list of message types in messages, we just need to iterate over them and print their names:

// Print all of the message names
foreach (var message in messages)
    Console.WriteLine(message.Name);

And that’s it! In total, 4 lines of code to retrieve and output the name of every Protobuf message class in the binary. Not bad – now let’s turn it up a notch!

Tip: There are quite a few ways you can fetch type names, depending on the situation

  • FullName – the fully-qualified name of the type (including its namespace) with decorators applied

  • Name – the unscoped CTS name of the type with decorators applied. This is what you will use most often along with FullName

  • BaseName – the unscoped (ie. not fully-qualified with a namespace) CTS name of the type with all decorators such as nested type indicators and generic type parameter suffixes removed

  • CSharpName – the name as it would appear in C# source code – ref keywords and decorators for arrays, pointers, generic arguments and nullable types are applied. CTS primitive types are displayed as C# friendly type names.

  • UnmangledBaseName – the base name with CTS names demangled, eg. if BaseName is List`1, then UnmangledBaseName is List.

  • GetCSharpTypeDeclarationName(bool includeVariance) – the name as it would appear in a type declaration, either as the type being defined or as one of the specified derived types or interfaces. The same as CSharpName except that top-level names are not converted from CTS primitive types (nested generic type arguments are), and generic type arguments use a fully-qualified scope. If includeVariance is true, the in and out covariant and contravariant decorators will be emitted if applicable.

  • GetScopedCSharpName(Scope usingScope = null, bool omitRef = false, bool isPartOfTypeDeclaration = false) – like CSharpName except that Il2CppInspector tries really hard to emit the name using the least amount of scope qualifiers necessary to access it from the local scope specified in usingScope.

(see commit for these changes)

Houston, We Have A Problem

When we construct complex output from a complex input, we want to start simple and gradually implement all of the things we need to create the complete output. This is usually a more manageable approach than trying to do everything in one pass. We’ve found all of the messages and their names, so let’s now look at a more or less randomly chosen message from the generated C#:

[ProtoContract] // 0x0000000180048B40-0x0000000180048B70
public class ItemPaymentLineDto // TypeDefIndex: 9137
{
    // ... irrelevant stuff elided ...

    // Properties (getter and setter addresses elided)
    [ProtoMember] // 0x0000000180001E80-0x0000000180001EA0
    public string Type { get; set; }
    [ProtoMember] // 0x0000000180005500-0x0000000180005520
    public string Id { get; set; }
    [ProtoMember] // 0x0000000180005A20-0x0000000180005A40
    public int Quantity { get; set; }
}

We know from the Protocol Buffers Language Guide[3] that each message contains zero or more fields and that each field has a type, a name and a field number (the field numbers are sent over the wire before each field value to distinguish each part of the message; the names are not sent and are only for humans). We also know from protobuf-net’s documentation that each field or property to be serialized must be decorated with a [ProtoMember] attribute. So we know that the above class has three message fields which should be encoded something like this:

message ItemPaymentLineDto {
  string Type = ?;
  string Id = ?;
  int32 Quantity = ?;
}
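Producing this kind of output mechanically is mostly string formatting plus a type mapping. The sketch below shows one possible shape for an emitter. ProtoSketch, EmitMessage and the partial C#-to-proto scalar table are illustrative names and choices rather than anything provided by Il2CppInspector or protobuf-net, and a field number of 0 is used here to mean "not yet known".

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ProtoSketch
{
    // C# keyword type names mapped to proto scalar types (a deliberately partial table)
    private static readonly Dictionary<string, string> ScalarTypes = new Dictionary<string, string>
    {
        ["string"] = "string", ["int"] = "int32", ["uint"] = "uint32",
        ["long"] = "int64", ["ulong"] = "uint64", ["bool"] = "bool",
        ["float"] = "float", ["double"] = "double", ["byte[]"] = "bytes"
    };

    // Emit a message definition; unmapped type names are passed through unchanged
    public static string EmitMessage(string name,
        IEnumerable<(string Type, string Name, int Number)> fields)
    {
        var sb = new StringBuilder();
        sb.Append("message ").Append(name).Append(" {\n");
        foreach (var field in fields)
        {
            var protoType = ScalarTypes.TryGetValue(field.Type, out var mapped) ? mapped : field.Type;
            var number = field.Number > 0 ? field.Number.ToString() : "?";
            sb.Append($"  {protoType} {field.Name} = {number};\n");
        }
        return sb.Append("}\n").ToString();
    }

    public static void Main()
    {
        // The three properties of ItemPaymentLineDto, field numbers still unknown
        var fields = new (string Type, string Name, int Number)[]
        {
            ("string", "Type", 0), ("string", "Id", 0), ("int", "Quantity", 0)
        };
        Console.Write(EmitMessage("ItemPaymentLineDto", fields));
    }
}
```

Running Main reproduces the message above, question marks included. Passing unmapped type names through unchanged is a convenient default for fields whose type is another message.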

So far so good, but where are the field numbers? Let’s look at protobuf-net’s documentation to find out. From protobuf-net’s GitHub page:

The field numbers are specified as constructor arguments to ProtoMemberAttribute. As we can see, these arguments are not included in Il2CppInspector’s output, and this turns out to be the major roadblock in our plan. To find out why, we need to delve deep into the murky inner workings of IL2CPP.

Custom Attribute Generators in IL2CPP

Almost all the information you could wish for is available from the vast amount of metadata IL2CPP retains. Unfortunately, arguments to attribute constructors are handled differently.

When you decorate an item with an attribute like [Foo(123)], IL2CPP doesn’t store the argument 123 in metadata. Instead, it creates a new C++ function which is essentially a trampoline that calls Foo’s constructor with the argument 123. This function is called a custom attribute generator (I’ll call them CAGs for short but this is not an official acronym) and always has the following signature:

void xxx_CustomAttributesCacheGenerator(CustomAttributesCache *)

where xxx is the IL2CPP internal identifier for the item on which the attribute is set (not the type of the attribute).

If the exact same set of constructor arguments for the exact same attributes are used again elsewhere, the same CAG will be called when the using item is invoked.

Moreover, if multiple attributes decorate a single item, IL2CPP will generate a single CAG for the entire set of attributes. This is a big deal because it means that every unique used set of attributes with unique constructor arguments gets its own CAG.

This is best illustrated with a fabricated example:

[Foo(1)] public int field1;          // Generate CAG 'A'
[Foo(2)] public int field2;          // Generate CAG 'B'
[Foo(1)] public int field3;          // Reuse CAG 'A'
[Foo(1, 5)] public int field4;       // Generate CAG 'C'
[Bar(1)] public int field5;          // Generate CAG 'D'
[Foo(1), Bar(1)] public int field6;  // Generate CAG 'E'
[Foo(1), Bar(1)] public int field7;  // Reuse CAG 'E'
[Foo(2), Bar(1)] public int field8;  // Generate CAG 'F'

During compilation, IL2CPP will generate one CAG for field1 and another for field2.

field3 uses the same attribute with the same argument as field1, so the CAG for field1 will be re-used.

field4 uses a new 2nd argument so it gets its own CAG, as does field5 which uses a previously unseen attribute.

At first glance it seems that field6 uses the same attributes as field3 and field5 combined, but since they have never occurred in this combination, and because an item can only have one CAG, a new CAG is generated for field6.

field7 uses exactly the same set of attributes with the same arguments as field6, so the CAG for field6 will be re-used.

field8 uses a previously seen set of attributes (in field6 and field7) but one of the arguments is different, so a new CAG is generated for field8.
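The sharing rule can be modelled in a few lines. The sketch below is only a model of the behaviour described above, not IL2CPP's actual implementation: it keys each item's full attribute list (as text) into a dictionary and hands out a new generator label the first time a given set is seen. CagSharingDemo and AssignGenerators are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CagSharingDemo
{
    // Assign one generator label per unique attribute set, in first-seen order
    public static List<string> AssignGenerators(IEnumerable<string> attributeSets)
    {
        var known = new Dictionary<string, char>();
        var result = new List<string>();
        foreach (var set in attributeSets)
        {
            if (!known.TryGetValue(set, out var label))
            {
                // First time this exact set is seen: allocate the next label
                label = (char)('A' + known.Count);
                known[set] = label;
            }
            result.Add(label.ToString());
        }
        return result;
    }

    public static void Main()
    {
        // The attribute sets of field1 to field8 from the fabricated example
        var sets = new[]
        {
            "Foo(1)", "Foo(2)", "Foo(1)", "Foo(1, 5)",
            "Bar(1)", "Foo(1), Bar(1)", "Foo(1), Bar(1)", "Foo(2), Bar(1)"
        };
        var labels = AssignGenerators(sets);
        for (int i = 0; i < sets.Length; i++)
            Console.WriteLine($"field{i + 1}: [{sets[i]}] -> CAG '{labels[i]}'");
    }
}
```

The labels come out as A, B, A, C, D, E, E, F, matching the comments in the example. The practical consequence is that a CAG address identifies a set of attribute arguments, not a particular field.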

Custom Attribute Generator object code

Let’s fire up a disassembler and look at the code for a ProtoMember CAG. Il2CppInspector gives all of the CAG addresses in comments after each attribute in the C# output, and it will also demarcate and type the functions in IDA/Ghidra. The first property in ItemPaymentLineDto above has a CAG at 0x180001E80, which looks like this:

.text:0000000180001E80 ; void __stdcall ProtoMemberAttribute_CustomAttributesCacheGenerator(CustomAttributesCache *)
.text:0000000180001E80 ProtoMemberAttribute_CustomAttributesCacheGenerator proc near
.text:0000000180001E80                                         ; DATA XREF: .rdata:00000001821F3EE0↓o
.text:0000000180001E80                                         ; .rdata:00000001821F4280↓o ...
.text:0000000180001E80                 mov     rcx, [rcx+8]
.text:0000000180001E84                 xor     r8d, r8d
.text:0000000180001E87                 mov     rcx, [rcx]
.text:0000000180001E8A                 lea     edx, [r8+1]
.text:0000000180001E8E                 jmp     ProtoMemberAttribute__ctor
.text:0000000180001E8E ProtoMemberAttribute_CustomAttributesCacheGenerator endp

This doesn’t look too bad. The x64 stdcall calling convention[4] is used which means that the first four arguments (ignoring floating point numbers) are stored in RCX, RDX, R8 and R9.

Object methods always take a hidden this argument as the first parameter when compiled to executable code. Everything becomes a top-level function and objects don’t exist, so every call that corresponds to one of the object’s methods must include a pointer to the object’s data to identify which instance we are working with. Therefore in this case, RCX is this.

XORing a register with itself is a common way of setting it to zero, therefore R8 is zero. The key instruction is lea edx, [r8+1], which loads 0+1 (i.e. 1) into EDX (the bottom 32 bits of RDX). Therefore EDX is 1, and this is the argument that will be passed to ProtoMemberAttribute’s constructor.

The function ends with a jump (jmp) to ProtoMemberAttribute__ctor, and we conclude that this function’s behaviour is equivalent to calling the constructor ProtoMember(1).

Let’s have a look at the CAG for the second property in ItemPaymentLineDto:

.text:0000000180005500 ; void __stdcall ProtoMemberAttribute_CustomAttributesCacheGenerator_6442472704(CustomAttributesCache *)
.text:0000000180005500 ProtoMemberAttribute_CustomAttributesCacheGenerator_6442472704 proc near
.text:0000000180005500                                         ; DATA XREF: .rdata:00000001821F4288↓o
.text:0000000180005500                                         ; .rdata:00000001821F42D0↓o ...
.text:0000000180005500                 mov     rcx, [rcx+8]
.text:0000000180005504                 xor     r8d, r8d
.text:0000000180005507                 mov     rcx, [rcx]
.text:000000018000550A                 lea     edx, [r8+2]
.text:000000018000550E                 jmp     ProtoMemberAttribute__ctor
.text:000000018000550E ProtoMemberAttribute_CustomAttributesCacheGenerator_6442472704 endp

That looks awfully similar to the previous CAG. In fact, it’s identical, with one exception: lea edx, [r8+2]. This time we are loading the value 2 instead of 1. Therefore we can deduce that this function is equivalent to calling the constructor ProtoMember(2).

Let’s check the third and final property:

.text:0000000180005A20 ; void __stdcall ProtoMemberAttribute_CustomAttributesCacheGenerator_6442474016(CustomAttributesCache *)
.text:0000000180005A20 ProtoMemberAttribute_CustomAttributesCacheGenerator_6442474016 proc near
.text:0000000180005A20                                         ; DATA XREF: .rdata:00000001821F4290↓o
.text:0000000180005A20                                         ; .rdata:00000001821F42E8↓o ...
.text:0000000180005A20                 mov     rcx, [rcx+8]
.text:0000000180005A24                 xor     r8d, r8d
.text:0000000180005A27                 mov     rcx, [rcx]
.text:0000000180005A2A                 lea     edx, [r8+3]
.text:0000000180005A2E                 jmp     ProtoMemberAttribute__ctor
.text:0000000180005A2E ProtoMemberAttribute_CustomAttributesCacheGenerator_6442474016 endp

Well it sure looks like we found our field numbers! Here we have another identical function except that this time lea edx, [r8+3] sets the constructor argument to 3, so this function is equivalent to calling ProtoMember(3).

It’s cold and dark down here in the Mariana Trench.

Acquiring the attribute constructor arguments programmatically

Now that we know how to get the argument to any call to ProtoMemberAttribute’s constructor by hand, we need to indulge in some kind of dark magic to automate this process. A quick look at the hex view of the functions above makes the solution clear:

0000000180001E80  48 8B 49 08 45 33 C0 48  8B 09 41 8D 50 01 E9 5D
0000000180005500  48 8B 49 08 45 33 C0 48  8B 09 41 8D 50 02 E9 DD
0000000180005A20  48 8B 49 08 45 33 C0 48  8B 09 41 8D 50 03 E9 BD

The field number (which is part of the lea instruction) is always 13 (0x0D) bytes offset from the start of the function.
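Rather than indexing blindly, it can help to confirm that a function body really has this simple layout before trusting the byte at 0x0D. The following is a minimal sketch, not code from the repo: SimpleCagDecoder and TryGetFieldNumber are hypothetical names, and the expected byte pattern is copied from the three hex dumps above.

```csharp
using System;

public static class SimpleCagDecoder
{
    // Bytes of the simple single-attribute CAG shown above, up to (but not
    // including) the 8-bit displacement of the lea instruction.
    private static readonly byte[] Prologue = {
        0x48, 0x8B, 0x49, 0x08,   // mov rcx, [rcx+8]
        0x45, 0x33, 0xC0,         // xor r8d, r8d
        0x48, 0x8B, 0x09,         // mov rcx, [rcx]
        0x41, 0x8D, 0x50          // lea edx, [r8+disp8]
    };

    // Returns the field number, or null if the body does not match the simple layout
    public static int? TryGetFieldNumber(byte[] body)
    {
        if (body == null || body.Length < Prologue.Length + 2) return null;

        for (var i = 0; i < Prologue.Length; i++)
            if (body[i] != Prologue[i]) return null;

        // 0xE9 is jmp rel32: the tail call to the attribute constructor
        if (body[Prologue.Length + 1] != 0xE9) return null;

        // disp8 is a signed byte, so only 0..127 can be encoded this way
        var tag = body[Prologue.Length];
        if (tag > 0x7F) return null;
        return tag;
    }
}
```

Prologue is 13 bytes long, so the displacement it reads sits at offset 0x0D, exactly as in the hex view. A body that returns null is a signal to inspect that CAG by hand instead of accepting a garbage field number.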

TypeModel provides an API to fetch all of the CAGs that call constructors on a specific type (including those which call constructors on multiple types including the specified type):

// Get all of the CAGs for ProtoMember so we can determine field numbers
var atts = model.CustomAttributeGenerators[protoMember];

atts is now a List<CustomAttributeData>, which among other things provides the virtual address of the CAG and its function body. We can use these with LINQ projection to create a dictionary which maps ProtoMemberAttribute CAG addresses to field numbers:

// Create a mapping of CAG virtual addresses to field numbers
vaFieldMapping = atts.Select(a => new {
        VirtualAddress = a.VirtualAddress.Start,
        FieldNumber    = a.GetMethodBody()[0x0D]
    })
    .ToDictionary(kv => kv.VirtualAddress, kv => kv.FieldNumber);
// Print all attribute mappings
foreach (var item in vaFieldMapping)
    Console.WriteLine($"{item.Key.ToAddressString()} = {item.Value}");

VirtualAddress is a tuple (ulong Start, ulong End) giving the start and end virtual addresses of the function, and GetMethodBody() retrieves the actual object code as byte[]. By indexing 0x0D bytes into this, we retrieve the field number.

Tip: The class TypeModel.MethodBase also includes VirtualAddress and GetMethodBody(). You can use these to inspect regular .NET methods, constructors, property getters and setters, and events in the same way. These APIs are also available for MethodInvoker which handles Method.Invoke thunks – another IL2CPP idiosyncrasy.

Note: The end address in VirtualAddress is exclusive, ie. it points to the byte after the final byte of the function.

Tip: the ToAddressString() extension method can be used on ulongs and VirtualAddress tuples to format addresses as hex strings with a 0x prefix, padded to 8 or 16 digits depending on the address length. For VirtualAddress tuples, the start and end addresses will be output as a range like 0x12345678 - 0x12345699.

When we run the code above, we get this output:

0x0000000180001E80 = 1
0x0000000180005500 = 2
0x0000000180005A20 = 3
0x0000000180005BF0 = 4
0x0000000180005E20 = 5
0x0000000180055270 = 139
0x00000001800072C0 = 6
0x0000000180010270 = 7
0x0000000180005820 = 8
0x0000000180005660 = 8
0x000000018002FCB0 = 8
0x0000000180044480 = 9
0x0000000180044580 = 10
0x0000000180044620 = 11
0x0000000180044730 = 12
0x0000000180044C10 = 13

Success! …ish. The output seems mostly reasonable, but what is going on at 0x180055270? It seems like the Mariana Trench has more secrets left to uncover – We Need To Go Deeper.

Note: This solution to finding attribute constructor arguments from CAGs is architecture dependent, compiler dependent and build settings dependent. Therefore it is very fragile, and must be tailored to individual target applications. This is the reason why Il2CppInspector does not include a protobuf-net extractor out of the box.

(see commit for these changes)

Custom Attribute Generators with multiple attributes

We dip back into our disassembler to check out the CAG at 0x180055270:

.text:0000000180055270 ; void __stdcall ObsoleteAttribute_CustomAttributesCacheGenerator_6442799728(CustomAttributesCache *)
.text:0000000180055270 ObsoleteAttribute_CustomAttributesCacheGenerator_6442799728 proc near
.text:0000000180055270                                         ; DATA XREF: .rdata:00000001821F43A8↓o
.text:0000000180055270                                         ; .pdata:0000000182EA6868↓o
.text:0000000180055270                 push    rbx
.text:0000000180055272                 sub     rsp, 20h
.text:0000000180055276                 mov     rbx, rcx
.text:0000000180055279                 xor     r8d, r8d        ; method
.text:000000018005527C                 mov     rcx, [rcx+8]
.text:0000000180055280                 lea     edx, [r8+5]     ; tag
.text:0000000180055284                 mov     rcx, [rcx]      ; this
.text:0000000180055287                 call    ProtoMemberAttribute__ctor
.text:000000018005528C                 mov     rax, [rbx+8]
.text:0000000180055290                 lea     rcx, aFoundInsideExt ; "Found inside `ExtraIdentities`."
.text:0000000180055297                 mov     rbx, [rax+8]
.text:000000018005529B                 call    il2cpp_string_new_wrapper
.text:00000001800552A0                 mov     rdx, rax        ; message
.text:00000001800552A3                 xor     r8d, r8d        ; method
.text:00000001800552A6                 mov     rcx, rbx        ; this
.text:00000001800552A9                 add     rsp, 20h
.text:00000001800552AD                 pop     rbx
.text:00000001800552AE                 jmp     ObsoleteAttribute__ctor_1
.text:00000001800552AE ObsoleteAttribute_CustomAttributesCacheGenerator_6442799728 endp

This has similarities to the other CAGs we examined, but we can see that it constructs two attributes: ProtoMember(5) and Obsolete("Found inside `ExtraIdentities`.") (we can confirm this by checking the C# types output file). The compiler has also added some extra instructions at the start, which shift the lea instruction away from offset 0x0D and break our ugly hack.
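To see where the bogus value 139 came from, we can write out the first few instructions of this function as bytes and index into them ourselves. This is a sketch with the bytes hand-assembled from the listing above, not dumped from the binary: the instruction lengths agree with the addresses in the listing, but the exact encoding chosen for mov rbx, rcx is an assumption (it does not affect offset 0x0D). The class and member names are hypothetical.

```csharp
using System;

public static class MultiAttributeCagExample
{
    // First 20 bytes of the CAG at 0x180055270, hand-assembled from the listing
    public static readonly byte[] Body = {
        0x40, 0x53,               // push rbx
        0x48, 0x83, 0xEC, 0x20,   // sub  rsp, 20h
        0x48, 0x8B, 0xD9,         // mov  rbx, rcx   (encoding assumed)
        0x45, 0x33, 0xC0,         // xor  r8d, r8d
        0x48, 0x8B, 0x49, 0x08,   // mov  rcx, [rcx+8]
        0x41, 0x8D, 0x50, 0x05    // lea  edx, [r8+5]
    };

    // What the fixed-offset hack reads: the opcode byte of "mov rcx, [rcx+8]"
    public static int ValueAtFixedOffset
    {
        get { return Body[0x0D]; }
    }

    // Where the tag actually lives in this function: the disp8 of the lea
    public static int ActualTag
    {
        get { return Body[0x13]; }
    }
}
```

ValueAtFixedOffset is 0x8B, which is 139 in decimal, while ActualTag is 5. The extra prologue instructions pushed the lea six bytes further into the function.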

Il2CppInspector does offer some tooling in AppModel designed to be used with disassembly frameworks like Capstone.NET to help with problems like this, but for this article we’ll take the lazy approach and fix it manually:

vaFieldMapping[0x180055270] = 5;

Note that unlike all of the code up to this point, this will unfortunately break whenever the game is patched, because the addresses will change.

Let’s not be too hasty though. Are we sure all of the other field numbers are correct?

We can find out by checking every CAG by hand in a disassembler, but that would defeat the point of making an automatic mapping. Instead, let’s call on Il2CppInspector to help us once again by finding out which of the ProtoMember CAGs are also used to initialize other attribute types and therefore produce a list of addresses we need to check by hand:

// Find CAGs which are used by other attribute types and shared with ProtoMember
var sharedAtts = vaFieldMapping.Keys
    .Select(va => model.CustomAttributeGeneratorsByAddress[va]
                       .Where(a => a.AttributeType != protoMember))
    .SelectMany(l => l);
// Warn about shared mappings
foreach (var item in sharedAtts)
    Console.WriteLine($"WARNING: Attribute generator " +
        $"{item.VirtualAddress.ToAddressString()} is shared with " +
        $"{item.AttributeType.FullName} - check disassembly listing");

This LINQ voodoo may require some explaining:

First, we iterate over all of the ProtoMember CAG addresses we selected earlier and call CustomAttributeGeneratorsByAddress[ulong] with each CAG address as the index. CustomAttributeGeneratorsByAddress returns a List<CustomAttributeData> with one entry for each attribute type the CAG at the given address initializes.

We then eliminate ProtoMember from each of these lists. Since we’re only iterating over CAGs guaranteed to initialize a ProtoMember, this means that if there are no other attribute types initialized by that CAG, the list will be empty, otherwise it will contain all the other attribute types the CAG initializes.

Now we use SelectMany to merge all of the lists together and flatten them into a single list. The end result contains one entry for each non-ProtoMember attribute type initialized by a ProtoMember CAG, so it identifies exactly those CAGs from the original list which also initialize other attribute types besides ProtoMember.
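The same Select / Where / SelectMany shape can be tried out without Il2CppInspector at all. Here is a small self-contained sketch in which a plain dictionary of attribute type names stands in for CustomAttributeGeneratorsByAddress. The class, the sample data and the tuple element names are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SharedCagDemo
{
    // Hypothetical stand-in for CustomAttributeGeneratorsByAddress:
    // CAG address -> names of the attribute types that CAG initializes
    public static readonly Dictionary<ulong, List<string>> ByAddress =
        new Dictionary<ulong, List<string>> {
            [0x180001E80] = new List<string> { "ProtoMemberAttribute" },
            [0x180055270] = new List<string> { "ProtoMemberAttribute", "ObsoleteAttribute" },
        };

    // One (address, other attribute) pair per non-ProtoMember attribute
    public static List<(ulong Address, string Other)> FindShared()
    {
        return ByAddress.Keys
            // One (possibly empty) sequence per CAG address
            .Select(va => ByAddress[va]
                .Where(name => name != "ProtoMemberAttribute")
                .Select(name => (Address: va, Other: name)))
            // Flatten; empty sequences contribute nothing
            .SelectMany(l => l)
            .ToList();
    }
}
```

With this sample data, FindShared() returns a single entry for 0x180055270 paired with ObsoleteAttribute. The CAG at 0x180001E80 only initializes ProtoMember, so its filtered list is empty and disappears when flattened.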

Here is the output:

WARNING: Attribute generator 0x0000000180055270-0x00000001800552C0 is shared with System.ObsoleteAttribute - check disassembly listing
WARNING: Attribute generator 0x0000000180005660-0x00000001800056B0 is shared with System.ObsoleteAttribute - check disassembly listing
WARNING: Attribute generator 0x000000018002FCB0-0x000000018002FD00 is shared with System.ObsoleteAttribute - check disassembly listing

Besides the suspect we already looked at, two more of our CAGs have additional code. When we check them in the disassembler using the same techniques as above, we find that the field number arguments for these three CAGs are 5, 7 and 1 respectively, and we add these modifications as a fixup:

// Fixups we have determined from the disassembly
vaFieldMapping[0x180055270] = 5;
vaFieldMapping[0x180005660] = 7;
vaFieldMapping[0x18002FCB0] = 1;

What is the benefit of writing the above code snippet that checks for “shared” CAGs instead of just going through them all by hand? The answer is that when the game is patched, you will now be able to immediately generate a list of addresses you need to check in the disassembler so that you can modify the fixups quickly.

(see commit for these changes)

Info: The “correct” (or at least more robust) solution to this problem is to use a disassembly framework to scan through each CAG for a call or jmp to the ProtoMemberAttribute constructor, then step backwards one instruction at a time until you find the instruction that sets EDX and read the operand.

I’ll cover this kind of low-level work in a future tutorial – in the meantime, you can see the solution in this commit. This example uses the Capstone disassembly framework (C# bindings) to step through the binary code and determine the attribute values automatically.
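As a much cruder halfway house, we could search each function body for the byte encoding of lea edx, [r8+disp8] instead of assuming a fixed offset. To be clear, this is not the Capstone-based approach from the commit and it is not a disassembler. It is a raw byte search under the assumptions that the compiler zeroes R8 first and always loads the tag with this exact lea form. TagScanner and FindTag are hypothetical names.

```csharp
using System;

public static class TagScanner
{
    // Scans for "lea edx, [r8+disp8]" (41 8D 50 xx) and returns disp8.
    // Caveats: a raw byte search can match in the middle of an unrelated
    // instruction, and it does not verify that a call or jmp to the
    // ProtoMemberAttribute constructor follows.
    public static int? FindTag(byte[] body)
    {
        for (var i = 0; i + 3 < body.Length; i++)
        {
            if (body[i] == 0x41 && body[i + 1] == 0x8D && body[i + 2] == 0x50)
                return (sbyte)body[i + 3];   // disp8 is sign-extended by the CPU
        }
        return null;
    }
}
```

This finds the lea in both the simple CAG layout and in a body where a prologue has moved it, but a tag loaded some other way (for example with a mov edx, imm32) is missed and reported as null. That is why a real disassembly framework is the more robust choice.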

Producing the message definitions

Now we have access to everything we need: the types, the names and the field numbers. Let’s get cooking! Here is our overall message output loop:

foreach (var message in messages) {
    var name   = message.CSharpName;
    // This is any field or property with the [ProtoMember] attribute
    var fields = message.DeclaredFields.Where(f => f.CustomAttributes.Any(a => a.AttributeType == protoMember));
    var props  = message.DeclaredProperties.Where(p => p.CustomAttributes.Any(a => a.AttributeType == protoMember));
    var proto = new StringBuilder();
    proto.Append($"message {name} {{\n");
    // TODO: We are going to need to map these C# types to protobuf types!
    // Output C# fields
    foreach (var field in fields) {
        var pmAtt = field.CustomAttributes.First(a => a.AttributeType == protoMember);
        proto.Append($"  {field.FieldType.Name} {field.Name} = {vaFieldMapping[pmAtt.VirtualAddress.Start]};\n");
    }
    // Output C# properties
    foreach (var prop in props) {
        var pmAtt = prop.CustomAttributes.First(a => a.AttributeType == protoMember);
        proto.Append($"  {prop.PropertyType.Name} {prop.Name} = {vaFieldMapping[pmAtt.VirtualAddress.Start]};\n");
    }
    // Close the message definition and print it
    proto.Append("}\n");
    Console.WriteLine(proto);
}

(see commit for these changes)

This is fairly straightforward C# code which iterates over each message class and, within each class, over all of the fields and properties that have a [ProtoMember] attribute, printing out their name, type and field number.

Notice how we use our previously constructed vaFieldMapping[pmAtt.VirtualAddress.Start] to map the virtual address of the [ProtoMember] CAG on the field or property being output to its corresponding field number.

Here is a snippet of the output for one message:

message ItemPaymentLineDto {
  String Type = 1;
  String Id = 2;
  Int32 Quantity = 3;
}

An excellent start. However, Protobuf uses its own set of primitive types so we’re going to need to convert our .NET type names into Protobuf type names.

Scalar type mappings

This is where things get a bit dicey, because there are several ways to represent both integer and floating point types in Protobuf and we can’t assume how this is implemented. Remember how I said earlier that it would be important to remember that protobuf-net was the exact implementation? This is the reason why. We can look at the documentation or source code of the networking library being used to determine how it serializes specific C# types into Protobuf fields.

For protobuf-net, there is a wonderful Unofficial Manual[2] which among other things lists the complete mapping of C# types to Protobuf types. We add the mapping as a simple lookup table using a Dictionary:

// Define type map from .NET types to protobuf types
// This is specifically how protobuf-net maps types
// and is not the same for all .NET protobuf libraries
private static Dictionary<string, string> protoTypes = new Dictionary<string, string> {
    ["System.Int32"] = "int32",
    ["System.UInt32"] = "uint32",
    ["System.Byte"] = "uint32",
    ["System.SByte"] = "int32",
    ["System.UInt16"] = "uint32",
    ["System.Int16"] = "int32",
    ["System.Int64"] = "int64",
    ["System.UInt64"] = "uint64",
    ["System.Single"] = "float",
    ["System.Double"] = "double",
    ["System.Decimal"] = "bcl.Decimal",
    ["System.Boolean"] = "bool",
    ["System.String"] = "string",
    ["System.Byte[]"] = "bytes",
    ["System.Char"] = "uint32",
    ["System.DateTime"] = "google.protobuf.Timestamp"
};

Actually performing the mapping is also straightforward. We just replace the output line with:

protoTypes.TryGetValue(field.FieldType.FullName ?? string.Empty, out var protoTypeName);
proto.Append($"  {protoTypeName ?? field.FieldType.Name} {field.Name} = {vaFieldMapping[pmAtt.VirtualAddress.Start]};\n");

We use TryGetValue to attempt to map the .NET type name to the Protobuf type name. If the type name exists in the dictionary, we use it, otherwise protoTypeName will be null so we just fall back to the .NET type name.

A couple of things to note:

  • We must use the FullName for the field (or property) type we are mapping from because that is how we have stored the type names in the dictionary. We did this on purpose to avoid conflicts with same-named types in other namespaces like Foo.Double.
  • Generic type parameters like T have a populated Name but a FullName of null in accordance with how the .NET Reflection API functions. Passing null as a dictionary key would throw an ArgumentNullException, so we use the null-coalescing operator to pass an empty string as the key instead. The lookup then fails and the subsequent code falls back to the .NET type name in Name (eg. T).
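Both points can be seen in isolation with a cut-down version of the lookup. This is only a sketch: TypeNameMapper and Map are hypothetical names, the table holds just two of the entries from the full type map, and plain strings stand in for the FullName and Name properties.

```csharp
using System;
using System.Collections.Generic;

public static class TypeNameMapper
{
    private static readonly Dictionary<string, string> ProtoTypes =
        new Dictionary<string, string> {
            ["System.Int32"]  = "int32",
            ["System.String"] = "string",
        };

    // fullName may be null (generic type parameters); name is the unqualified fallback
    public static string Map(string fullName, string name)
    {
        // Dictionary keys cannot be null, so substitute a key that is guaranteed to miss
        ProtoTypes.TryGetValue(fullName ?? string.Empty, out var protoTypeName);

        // protoTypeName is null when the lookup failed
        return protoTypeName ?? name;
    }
}
```

Map("System.Int32", "Int32") gives int32, Map("Foo.Double", "Double") falls back to Double because only the fully-qualified name is consulted, and Map(null, "T") falls back to T without throwing.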

Let’s look at ItemPaymentLineDto again:

message ItemPaymentLineDto {
  string Type = 1;
  string Id = 2;
  int32 Quantity = 3;
}

Awesome, this is a valid Protobuf message definition!

(see commit for these changes)

Dealing with non-scalar types

We’re on the home stretch now, but there is still work to do. We now look over some messages to see what else needs to be implemented:

message UserSessionHeader {
  string UserId = 1;
  // unhandled array
  String[] Roles = 2;
  ContentTargeting ContentTargeting = 3;
  String[] IdentityTypes = 4;
  string AnalyticsId = 5;
  // unhandled map
  Dictionary`2[System.String,System.String] ExtraIdentities = 6;
  Dictionary`2[System.String,System.String] AdditionalProperties = 7;
}
message OperationHeader {
  Guid RequestId = 1; // see note
  OperationType Type = 2; // unhandled enum
}
message StoreBundleDto {
  string Id = 1;
  // unhandled list
  List`1[Catapult.Modules.Store.Protocol.Dtos.PaymentDto] PaymentOptions = 2;
  StoreBundleRewardDto Reward = 3;
  string CategoryName = 4;
  // unhandled nullable type
  Nullable`1[Int32] Stock = 5;
  Nullable`1[DateTime] AvailableTo = 6;
}

Note: Some classes receive special help in protobuf-net via the static methods in ProtoBuf.BclHelpers, which provide encodings for TimeSpan, DateTime (google.protobuf.Timestamp), decimal and Guid objects. Therefore we leave these type names intact when producing our .proto file even though they are non-standard types. If the output is re-compiled with protobuf-net it should work nicely. Changes will need to be made if compiling the output .proto with a different Protobuf implementation.

It doesn’t really matter what order we take these in so we’ll just go ahead and start with whatever we feel like.

Arrays

The issues with arrays are lexical:

  1. When a type has a name like System.UInt32[], the array index brackets prevent our type name mapping code from working. We’d like to use Protobuf type names where they can be mapped but leave the rest intact, as we did for primitive type names.

  2. Arrays are represented in Protobuf messages as repeated fields, so an array of string is defined as repeated string and an array of Foo is defined as repeated Foo.

In the .NET Reflection API and Il2CppInspector alike, the underlying type of an array, pointer or reference can be retrieved (via Type.GetElementType() in .NET Reflection and via the ElementType property in Il2CppInspector), so we simply use this as the input to the type map when handling an array:

// Handle arrays
var typeBaseName = type.IsArray? type.ElementType.FullName : type.FullName ?? string.Empty;
// Handle primitive value types
if (protoTypes.TryGetValue(typeBaseName, out var protoTypeName))
    typeBaseName = protoTypeName;
else
    typeBaseName = type.IsArray? type.ElementType.Name : type.Name;
// Handle arrays
var annotatedName = typeBaseName;
if (type.IsArray)
    annotatedName = "repeated " + annotatedName;
// Output field
proto.Append($"  {annotatedName} {name} = {vaFieldMapping[pmAtt.VirtualAddress.Start]};\n");

We simply replace uses of FullName with ElementType.FullName and Name with ElementType.Name when type.IsArray is true. Before outputting the field, we prepend the repeated keyword to the type name if the type is an array.
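If you want to experiment with this logic without loading an IL2CPP binary, the same idea can be sketched against ordinary System.Type objects. Note the assumptions: this uses the regular .NET Reflection API (GetElementType()) as a stand-in for Il2CppInspector's TypeInfo, the type map is cut down to two entries, and ArrayFieldDemo and Describe are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayFieldDemo
{
    private static readonly Dictionary<string, string> ProtoTypes =
        new Dictionary<string, string> {
            ["System.UInt32"] = "uint32",
            ["System.String"] = "string",
        };

    public static string Describe(Type type)
    {
        // For arrays, look up the element type rather than e.g. "System.UInt32[]"
        var baseType = type.IsArray ? type.GetElementType() : type;

        // Use the Protobuf name if there is one, otherwise the unqualified .NET name
        var baseName = ProtoTypes.TryGetValue(baseType.FullName ?? string.Empty, out var proto)
            ? proto
            : baseType.Name;

        // Arrays become repeated fields
        return type.IsArray ? "repeated " + baseName : baseName;
    }
}
```

Describe(typeof(uint[])) gives repeated uint32 and Describe(typeof(Guid[])) gives repeated Guid. One thing to watch in a sketch like this: because arrays are unwrapped before the lookup, a byte[] would be looked up as System.Byte, so a bytes special case would have to be checked before the array branch.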

(see commit for these changes)

Lists

As far as Protobuf is concerned, a .NET List or any other kind of one-dimensional sequence of elements is just an array. The only difference is which container the developer has chosen to represent the sequence in their code. On the wire, they are all the same.

First we need to identify if the field we are examining is indeed a list. There are numerous possibilities:

type.Namespace == "System.Collections.Generic" && type.UnmangledBaseName == "List"

or:

type.FullName == "System.Collections.Generic.List`1"

However there is a better way:

type.ImplementedInterfaces.Any(i => i.FullName == "System.Collections.Generic.IList`1")

This query returns true if the type implements IList<T>, which means that rather than just catching List<T>, this check will also pick up SynchronizedCollection<T>, ImmutableList<T>, ReadOnlyCollection<T> and the other one-dimensional .NET collections that implement IList<T>, as well as any custom collections defined by the developers of the application being examined that implement IList<T>. (Non-generic collections such as ArrayList implement only the non-generic IList and are not matched.) A much more robust test!

Tip: TypeInfo.ImplementedInterfaces returns all of the interfaces implemented by a type including those inherited from its base classes. To get only the interfaces implemented explicitly by a specific derived type, use TypeInfo.NonInheritedInterfaces.
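The interface test is easy to try against normal .NET types. The sketch below uses the standard Reflection API rather than Il2CppInspector, so the comparison is made with GetGenericTypeDefinition() instead of a FullName string (in regular Reflection the FullName of a constructed interface also includes its type arguments). ListDetection and ImplementsGenericIList are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListDetection
{
    // True if the type implements the generic IList<T> interface,
    // either directly or via one of its base classes
    public static bool ImplementsGenericIList(Type type)
    {
        return type.GetInterfaces().Any(i =>
            i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IList<>));
    }
}
```

This returns true for typeof(List<int>) and typeof(System.Collections.ObjectModel.ReadOnlyCollection<string>), and false for typeof(System.Collections.ArrayList) and typeof(Dictionary<string, int>).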

Here is the full solution:

var isRepeated = type.IsArray;
var typeFullName = isRepeated? type.ElementType.FullName : type.FullName ?? string.Empty;
var typeFriendlyName = isRepeated? type.ElementType.Name : type.Name;
// Handle one-dimensional collections like lists
if (type.ImplementedInterfaces.Any(i => i.FullName == "System.Collections.Generic.IList`1")) {
    typeFullName = type.GenericTypeArguments[0].FullName;
    typeFriendlyName = type.GenericTypeArguments[0].Name;
    isRepeated = true;
}
// Handle primitive value types
if (protoTypes.TryGetValue(typeFullName, out var protoTypeName))
    typeFriendlyName = protoTypeName;
// Handle repeated fields
var annotatedName = typeFriendlyName;
if (isRepeated)
    annotatedName = "repeated " + annotatedName;

We merge the handling of arrays and lists by creating a flag isRepeated which we use to track whether we’re dealing with an array/list or a regular type.

We also split up the name handling into two variables, typeFullName – which is the name we will eventually look up in the type map – and typeFriendlyName, which is the name that will be output. The typeFriendlyName is initially set to the unqualified .NET type name and overwritten if the type name can be found in the Protobuf primitive type map.

To get the underlying type of items in the list, we access the first element of the TypeInfo.GenericTypeArguments array property, which holds all of the type arguments used to instantiate the generic type. In this case, there is just one type argument: the type of items in the list.

Note: This is a naive implementation which doesn’t handle nesting of lists or arrays in lists.

(see commit for these changes)

Dictionaries

Dictionaries are .NET’s basic flavour of associative arrays where a set of unique keys each indexes one value. Protobuf 3 encodes these as a map<key, value> field. Protobuf 2 does not support this but Google offers a simple workaround which you can implement if needed.

// Handle maps (IDictionary)
if (type.ImplementedInterfaces
    .Any(i => i.FullName == "System.Collections.Generic.IDictionary`2")) {
    // This time we have two generic arguments to deal with - the key and the value
    var keyFullName = type.GenericTypeArguments[0].FullName;
    var valueFullName = type.GenericTypeArguments[1].FullName;
    // We're going to have to deal with building this proto type name
    // separately from the value types below
    // We don't set isRepeated because it's implied by using a map type
    protoTypes.TryGetValue(keyFullName, out var keyFriendlyName);
    protoTypes.TryGetValue(valueFullName, out var valueFriendlyName);
    typeFriendlyName = $"map<{keyFriendlyName ?? type.GenericTypeArguments[0].Name}" +
        $", {valueFriendlyName ?? type.GenericTypeArguments[1].Name}>";
}

Handling dictionaries is much like handling lists: we detect a dictionary by checking if the type implements IDictionary<K, V>, but this time we have two type arguments and we run them both through the Protobuf type map before building a single final type name in typeFriendlyName.

As we are outputting a single map and not multiple maps, we don’t set isRepeated to true for dictionaries.
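Here is a standalone sketch of the same map-building logic using regular .NET Reflection in place of Il2CppInspector. MapFieldDemo, Friendly and DescribeMap are hypothetical names and the type map is reduced to two entries. One deliberate difference from the code above: the key and value types are read from the IDictionary<K, V> interface itself rather than from the type's own generic arguments, so a non-generic class that derives from a dictionary would still be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapFieldDemo
{
    private static readonly Dictionary<string, string> ProtoTypes =
        new Dictionary<string, string> {
            ["System.Int32"]  = "int32",
            ["System.String"] = "string",
        };

    // Protobuf name if the type is in the map, otherwise the unqualified .NET name
    private static string Friendly(Type t)
    {
        return ProtoTypes.TryGetValue(t.FullName ?? string.Empty, out var proto) ? proto : t.Name;
    }

    // Returns "map<K, V>" for IDictionary<K, V> implementations, otherwise null
    public static string DescribeMap(Type type)
    {
        var dict = type.GetInterfaces().FirstOrDefault(i =>
            i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IDictionary<,>));
        if (dict == null) return null;

        // Two generic arguments this time: the key and the value
        var args = dict.GetGenericArguments();
        return $"map<{Friendly(args[0])}, {Friendly(args[1])}>";
    }
}
```

DescribeMap(typeof(Dictionary<string, string>)) gives map<string, string>, which is the shape we want for ExtraIdentities and AdditionalProperties in UserSessionHeader.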

(see commit for these changes)

Nullable types

In C#, types like int? are really just syntactic sugar for the generic wrapper Nullable<T> – in this case Nullable<int>. So once again we will pluck the true underlying type out from the first generic type argument.

Secondly, according to the Protobuf-net Unofficial Manual[2], nullable types with a value of null are not sent over the wire, so this is a big clue that they should use Protobuf’s optional keyword, which signifies that a particular field may or may not be present in the message.

The solution is therefore straightforward:

var isOptional = false;
// ...
// Handle nullable types
if (type.FullName == "System.Nullable`1") {
    // Once again we look at the first generic argument to get the real type
    typeFullName = type.GenericTypeArguments[0].FullName;
    typeFriendlyName = type.GenericTypeArguments[0].Name;
    isOptional = true;
}
// ...
// Handle nullable (optional) fields
if (isOptional)
    annotatedName = "optional " + annotatedName;

This follows the same principles as working with an IList<T>, except that here we check FullName for a specific single type, and add an isOptional flag so that we can output the optional keyword later.
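For comparison, this is how the same unwrapping looks with the regular .NET Reflection API, where Nullable.GetUnderlyingType does the work of checking for Nullable<T> and fetching its type argument in one call. This is a sketch for experimentation rather than part of the extractor; NullableFieldDemo and Describe are hypothetical names, and the type map lookup is left out to keep it short.

```csharp
using System;

public static class NullableFieldDemo
{
    // Returns the type name to emit, prefixed with "optional " for Nullable<T>
    public static string Describe(Type type)
    {
        // GetUnderlyingType returns null unless type is a closed Nullable<T>
        var underlying = Nullable.GetUnderlyingType(type);

        return underlying != null
            ? "optional " + underlying.Name
            : type.Name;
    }
}
```

Describe(typeof(int?)) gives optional Int32 and Describe(typeof(DateTime?)) gives optional DateTime, matching the Stock and AvailableTo fields of StoreBundleDto before the scalar type map is applied. A non-nullable type such as typeof(string) passes through as String.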

(see commit for these changes)

Enumerations

Enumerations pose a different problem to the previous issues. We don’t actually need to change the individual field output at all, but we need to ensure that any enumerations used by any messages also have their type definitions output to the .proto file.

There are various ways to tackle this. The approach I’ve chosen is to keep a running list of all the used enums as each message is constructed. Once we have processed every message, we have a complete list of required enums and can then output their type definitions.

First we build the list:

var enums = new HashSet<TypeInfo>();