
Learn Assembly Language

This project was put together to teach myself NASM x86 assembly language on Linux.

GitHub Project »

Lesson 1

The obligatory 'Hello, world!'

Introduction to the Linux System Call Table. In this lesson we use software interrupts to request system functions from the kernel in order to print out 'Hello World!' to the console.

View lesson »

Lesson 2

Proper program exit

A very brief lesson about memory addresses, sequential code execution and how to properly terminate a program without errors.

View details »

Lesson 3

Calculate string length

What if we wanted to output something that we don't know the length of? Like user input? Learn about loops, labels and pointer arithmetic.

View details »

Lesson 4

Subroutines

Introduction to the stack and how to write clean, reusable code in assembly.

View lesson »

Lesson 5

External include files

To further simplify our code we can move our subroutines into an external include file.

View details »

Lesson 6

NULL terminating bytes

A quick lesson on how memory is handled. This lesson also fixes the duplication bug we added in lesson 5.

View details »

Lesson 7

Linefeeds

How you can use the stack to print linefeeds after strings and an introduction to the Extended Stack Pointer ESP.

View lesson »

Lesson 8

Passing arguments

Passing arguments to your program from the command line.

View lesson »

Lesson 9

User input

Introduction to the BSS section and how to trigger a call to sys_read to process user input.

View lesson »

Lesson 10

Count to 10

Introduction to numbers and counting in assembly.

View lesson »

Lesson 11

Count to 10 (itoa)

Introduction to ASCII and how to convert integers to their string representations in assembly.

View lesson »

Lesson 12

Calculator - addition

Introduction to calculating numbers in assembly. This tutorial describes a simple program to add two numbers together.

View lesson »

Lesson 13

Calculator - subtraction

Introduction to calculating numbers in assembly. This tutorial describes a simple program to subtract one number from another.

View lesson »

Lesson 14

Calculator - multiplication

Introduction to calculating numbers in assembly. This tutorial describes a simple program to multiply two numbers together.

View lesson »

Lesson 15

Calculator - division

Introduction to calculating numbers in assembly. This tutorial describes a simple program to divide one number by another.

View lesson »

Lesson 16

Calculator (atoi)

This program takes a series of passed string arguments, converts them to integers and adds them all together.

View lesson »

Lesson 17

Namespace

Introduction to how NASM handles namespaces when it comes to global and local labels.

View lesson »

Lesson 18

Fizz Buzz

The Fizz Buzz programming challenge recreated in NASM.

View lesson »

Lesson 19

Execute Command

In this lesson we replace the currently running process with a new process that executes a command.

View lesson »

Lesson 20

Process Forking

In this lesson we create a new process that duplicates our current process.

View lesson »

Lesson 21

Telling the time

In this lesson we ask the kernel for the current Unix timestamp.

View lesson »

Lesson 22

File Handling - Create

In this lesson we learn how to create a new file in assembly.

View lesson »

Lesson 23

File Handling - Write

In this lesson we write content to a newly created text file.

View lesson »

Lesson 24

File Handling - Open

In this lesson we open a text file and print its file descriptor.

View lesson »

Lesson 25

File Handling - Read

In this lesson we read content from a newly created text file.

View lesson »

Lesson 26

File Handling - Close

In this lesson we close a newly created text file using its file descriptor.

View lesson »

Lesson 27

File Handling - Update

In this lesson we update the content of an included text file using seek.

View lesson »

Lesson 28

File Handling - Delete

In this lesson we learn how to delete a file.

View lesson »

Lesson 29

Sockets - Create

In this lesson we learn how to create a new socket in assembly and store its file descriptor.

View lesson »

Lesson 30

Sockets - Bind

In this lesson we learn how to bind a socket to an IP Address & Port Number.

View lesson »

Lesson 31

Sockets - Listen

In this lesson we learn how to make a socket listen for incoming connections.

View lesson »

Lesson 32

Sockets - Accept

In this lesson we learn how to make a socket accept incoming connections.

View lesson »

Lesson 33

Sockets - Read

In this lesson we learn how to read incoming requests on a socket.

View lesson »

Lesson 34

Sockets - Write

In this lesson we learn how to make a socket respond to incoming requests.

View lesson »

Lesson 35

Sockets - Close

In this lesson we learn how to shut down and close an open socket connection.

View lesson »

Lesson 36

Download a Webpage

In this lesson we're going to connect to a webserver and send an HTTP request for a webpage. We'll then print the server's response to our terminal.

View lesson »


Lesson 1

Hello, world!

First, some background

Assembly language is bare-bones. The only interface a programmer has above the actual hardware is the kernel itself. In order to build useful programs in assembly we need to use the Linux system calls provided by the kernel. These system calls are a library built into the operating system to provide functions such as reading input from a keyboard and writing output to the screen.

When you invoke a system call the kernel will immediately suspend execution of your program. It will then contact the necessary drivers needed to perform the task you requested on the hardware and then return control back to your program.

Note: Drivers are called drivers because the kernel literally uses them to drive the hardware.

We can accomplish this all in assembly by loading EAX with the function number (operation code OPCODE) we want to execute and filling the remaining registers with the arguments we want to pass to the system call. A software interrupt is requested with the INT instruction and the kernel takes over and calls the function from the library with our arguments. Simple.

For example, requesting an interrupt when EAX=1 will call sys_exit, and requesting an interrupt when EAX=4 will call sys_write instead. EBX, ECX & EDX will be passed as arguments if the function requires them. A Linux System Call Table lists each function together with its corresponding OPCODE.
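As a rough illustration of the register convention described above, the sketch below gives the two function numbers readable names using NASM's equ directive. The names SYS_EXIT, SYS_WRITE and STDOUT, and the short 'Hi' message, are assumptions made for this example and are not part of the tutorial's own code.

```nasm
; Named constants for the function numbers (names are hypothetical, chosen for this sketch)
SYS_EXIT    equ     1          ; EAX=1 requests sys_exit
SYS_WRITE   equ     4          ; EAX=4 requests sys_write
STDOUT      equ     1          ; file descriptor of standard output

SECTION .data
msg     db      'Hi', 0Ah      ; two letters plus a line feed = 3 bytes

SECTION .text
global  _start

_start:
    mov     edx, 3             ; third argument: number of bytes to write
    mov     ecx, msg           ; second argument: address of the bytes
    mov     ebx, STDOUT        ; first argument: where to write them
    mov     eax, SYS_WRITE     ; function number goes in EAX
    int     80h                ; hand control to the kernel

    mov     ebx, 0             ; first argument: exit status
    mov     eax, SYS_EXIT      ; function number goes in EAX
    int     80h
```

Assembled with nasm -f elf and linked with ld -m elf_i386, this should print Hi followed by a newline and then exit.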

Writing our program

Firstly, we create a variable 'msg' in our .data section and assign it the string we want to output, in this case 'Hello World!'. In our .text section we tell the kernel where to begin execution by providing it with a global label _start: to denote the program's entry point.

We will be using the system call sys_write to output our message to the console window. This function is assigned OPCODE 4 in the Linux System Call Table. The function also takes 3 arguments which are sequentially loaded into EDX, ECX and EBX before requesting a software interrupt which will perform the task.

The arguments passed are as follows:

  • EDX will be loaded with the length (in bytes) of the string.
  • ECX will be loaded with the address of our variable created in the .data section.
  • EBX will be loaded with the file we want to write to – in this case STDOUT.

The datatype and meaning of the arguments passed can be found in the function's definition.

We compile, link and run the program using the commands below.

helloworld.asm
; Hello World Program - asmtutor.com
; Compile with: nasm -f elf helloworld.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld.o -o helloworld
; Run with: ./helloworld
 
SECTION .data
msg     db      'Hello World!', 0Ah    ; assign msg variable with your message string
 
SECTION .text
global  _start
 
_start:
 
    mov     edx, 13    ; number of bytes to write - one for each letter plus 0Ah (line feed character)
    mov     ecx, msg   ; move the memory address of our message string into ecx
    mov     ebx, 1     ; write to the STDOUT file
    mov     eax, 4     ; invoke SYS_WRITE (kernel opcode 4)
    int     80h
~$ nasm -f elf helloworld.asm
~$ ld -m elf_i386 helloworld.o -o helloworld
~$ ./helloworld
Hello World!
Segmentation fault

Error: Segmentation fault


Lesson 2

Proper program exit

Some more background

After successfully learning how to execute a system call in Lesson 1 we now need to learn about one of the most important system calls in the kernel, sys_exit.

Notice how after our 'Hello, world!' program ran we got a Segmentation fault? Well, computer programs can be thought of as a long strip of instructions that are loaded into memory and divided up into sections (or segments). This general pool of memory is shared between all programs and can be used to store variables, instructions, other programs or anything really. Each segment is given an address so that information stored in that section can be found later.

To execute a program that is loaded in memory, we use the global label _start: to tell the operating system where in memory our program can be found and executed. Memory is then accessed sequentially following the program logic which determines the next address to be accessed. The kernel jumps to that address in memory and executes it.

It's important to tell the operating system exactly where it should begin execution and where it should stop. In Lesson 1 we didn't tell the kernel where to stop execution. So, after we called sys_write the program continued sequentially executing the next address in memory, which could have been anything. We don't know what the kernel tried to execute but it caused it to choke and terminate the process for us instead - leaving us the error message of 'Segmentation fault'. Calling sys_exit at the end of all our programs will mean the kernel knows exactly when to terminate the process and return memory back to the general pool thus avoiding an error.

Writing our program

sys_exit has a simple function definition. In the Linux System Call Table it is allocated OPCODE 1 and is passed a single argument through EBX.

In order to execute this function all we need to do is:

  • Load EBX with 0 to pass zero to the function meaning 'zero errors'.
  • Load EAX with 1 to call sys_exit.
  • Then request a software interrupt from the kernel using INT 80h.

We then compile, link and run it again.

Note: Only new code added in each lesson will be commented.

helloworld.asm
; Hello World Program - asmtutor.com
; Compile with: nasm -f elf helloworld.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld.o -o helloworld
; Run with: ./helloworld
 
SECTION .data
msg     db      'Hello World!', 0Ah
 
SECTION .text
global  _start
 
_start:
 
    mov     edx, 13
    mov     ecx, msg
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    mov     ebx, 0     ; return 0 status on exit - 'No Errors'
    mov     eax, 1     ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
~$ nasm -f elf helloworld.asm
~$ ld -m elf_i386 helloworld.o -o helloworld
~$ ./helloworld
Hello World!
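
The value loaded into EBX becomes the exit status of the process, which the shell can report afterwards. Here is a minimal sketch of that, assuming a hypothetical file name exitcode.asm and an arbitrary status of 3 chosen purely for illustration:

```nasm
; exitcode.asm - hypothetical file name used for this sketch
; Compile with: nasm -f elf exitcode.asm
; Link with: ld -m elf_i386 exitcode.o -o exitcode

SECTION .text
global  _start

_start:
    mov     ebx, 3     ; exit status handed to sys_exit (3 is an arbitrary example value)
    mov     eax, 1     ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

After running ./exitcode, typing echo $? in a POSIX shell should print 3. Changing the value in EBX back to 0 reports 'No Errors' again.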

Lesson 3

Calculate string length

Firstly, some background

Why do we need to calculate the length of a string?

Well, sys_write requires that we pass it a pointer to the string in memory that we want to output, and the length in bytes we want to print out. If we were to modify our message string we would have to update the length in bytes that we pass to sys_write as well, otherwise it will not print correctly.

You can see what I mean using the program in Lesson 2. Modify the message string to say 'Hello, brave new world!' then compile, link and run the new program. The output will be 'Hello, brave ' (the first 13 characters) because we are still only passing 13 bytes to sys_write as its length. It will be particularly necessary when we want to print out user input. As we won't know the length of the data when we compile our program, we will need a way to calculate the length at runtime in order to successfully print it out.
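
For a string whose text is fixed when the program is assembled, NASM itself can work out the byte count. The sketch below uses the equ directive with the $ symbol (the current assembly position); the name msglen is an assumption made for this example. It does not help with user input, which only exists at runtime, so the runtime calculation in this lesson is still needed.

```nasm
SECTION .data
msg     db      'Hello, brave new world!', 0Ah
msglen  equ     $ - msg        ; current position minus start of msg = length in bytes
                               ; (msglen is a hypothetical name; it must directly follow msg)

SECTION .text
global  _start

_start:
    mov     edx, msglen        ; length computed by the assembler, not hard coded
    mov     ecx, msg
    mov     ebx, 1
    mov     eax, 4
    int     80h

    mov     ebx, 0
    mov     eax, 1
    int     80h
```

With this approach the message text can be edited freely and the full string should still be printed, because the length is recalculated each time the file is assembled.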

Writing our program

To calculate the length of the string we will use a technique called pointer arithmetic. Two registers are initialised pointing to the same address in memory. One register (in this case EAX) will be incremented forward one byte for each character in the output string until we reach the end of the string. The original pointer will then be subtracted from EAX. This is effectively like subtracting one pointer from another within the same array: the result is the number of elements between the two addresses. This result is then passed to sys_write, replacing our hard coded count.

The CMP instruction compares the left hand side against the right hand side and sets a number of flags that are used for program flow. The flag we're checking is the ZF or Zero Flag. When the byte that EAX points to is equal to zero the ZF flag is set. We then use the JZ instruction to jump, if the ZF flag is set, to the point in our program labeled 'finished'. This is to break out of the nextchar loop and continue executing the rest of the program.

helloworld-len.asm
; Hello World Program (Calculating string length)
; Compile with: nasm -f elf helloworld-len.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-len.o -o helloworld-len
; Run with: ./helloworld-len
 
SECTION .data
msg     db      'Hello, brave new world!', 0Ah ; we can modify this now without having to update anywhere else in the program
 
SECTION .text
global  _start
 
_start:
 
    mov     ebx, msg       ; move the address of our message string into EBX
    mov     eax, ebx       ; move the address in EBX into EAX as well (Both now point to the same segment in memory)
 
nextchar:
    cmp     byte [eax], 0  ; compare the byte pointed to by EAX at this address against zero (Zero is an end of string delimiter)
    jz      finished       ; jump (if the zero flag has been set) to the point in the code labeled 'finished'
    inc     eax            ; increment the address in EAX by one byte (if the zero flag has NOT been set)
    jmp     nextchar       ; jump to the point in the code labeled 'nextchar'
 
finished:
    sub     eax, ebx       ; subtract the address in EBX from the address in EAX
                            ; remember both registers started pointing to the same address (see the mov eax, ebx instruction above)
                            ; but EAX has been incremented one byte for each character in the message string
                            ; when you subtract one memory address from another of the same type
                            ; the result is number of segments between them - in this case the number of bytes
 
    mov     edx, eax       ; EAX now equals the number of bytes in our string
    mov     ecx, msg       ; the rest of the code should be familiar now
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    mov     ebx, 0
    mov     eax, 1
    int     80h
~$ nasm -f elf helloworld-len.asm
~$ ld -m elf_i386 helloworld-len.o -o helloworld-len
~$ ./helloworld-len
Hello, brave new world!

Lesson 4

Subroutines

Introduction to subroutines

Subroutines are functions. They are reusable pieces of code that can be called by your program to perform various repeatable tasks. Subroutines are declared using labels just like we've used before (e.g. _start:), however we don't use the JMP instruction to get to them - instead we use a new instruction CALL. We also don't use the JMP instruction to return to our program after we have run the function. To return to our program from a subroutine we use the instruction RET instead.

Why don't we JMP to subroutines?

The great thing about writing a subroutine is that we can reuse it. If we want to be able to use the subroutine from anywhere in the code we would have to write some logic to determine where in the code we had jumped from and where we should jump back to. This would litter our code with unwanted labels. If we use CALL and RET however, assembly handles this problem for us using something called the stack.

Introduction to the stack

The stack is a special type of memory. It's the same type of memory that we've used before, however it's special in how it is used by our program. The stack is what is called Last In First Out (LIFO) memory. You can think of the stack like a stack of plates in your kitchen. The last plate you put on the stack is also the first plate you will take off the stack next time you use a plate.

The stack in assembly is not storing plates though, it's storing values. You can store a lot of things on the stack such as variables, addresses or other programs. We need to use the stack when we call subroutines to temporarily store values that will be restored later.

Any register that your function needs to use should have its current value put on the stack for safe keeping using the PUSH instruction. Then after the function has finished its logic, these registers can have their original values restored using the POP instruction. This means that any values in the registers will be the same before and after you've called your function. If we take care of this in our subroutine we can call functions without worrying about what changes they're making to our registers.

The CALL and RET instructions also use the stack. When you CALL a subroutine, the address of the instruction following the CALL (the return address) is pushed onto the stack. This address is then popped off the stack by RET and the program jumps back to that place in your code. This is why you should always JMP to labels but you should CALL functions.

helloworld-len.asm
; Hello World Program (Subroutines)
; Compile with: nasm -f elf helloworld-len.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-len.o -o helloworld-len
; Run with: ./helloworld-len
 
SECTION .data
msg     db      'Hello, brave new world!', 0Ah
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg       ; move the address of our message string into EAX
    call    strlen         ; call our function to calculate the length of the string
 
    mov     edx, eax       ; our function leaves the result in EAX
    mov     ecx, msg       ; this is all the same as before
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    mov     ebx, 0
    mov     eax, 1
    int     80h
 
strlen:                    ; this is our first function declaration
    push    ebx            ; push the value in EBX onto the stack to preserve it while we use EBX in this function
    mov     ebx, eax       ; move the address in EAX into EBX (Both point to the same segment in memory)
 
nextchar:                  ; this is the same as Lesson 3
    cmp     byte [eax], 0
    jz      finished
    inc     eax
    jmp     nextchar
 
finished:
    sub     eax, ebx
    pop     ebx            ; pop the value on the stack back into EBX
    ret                    ; return to where the function was called
~$ nasm -f elf helloworld-len.asm
~$ ld -m elf_i386 helloworld-len.o -o helloworld-len
~$ ./helloworld-len
Hello, brave new world!
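
The register-preserving behaviour of PUSH and POP described in this lesson can be checked with a very small program. This is only a sketch: the subroutine name scratch and the values 7 and 99 are assumptions chosen for illustration. The caller puts 7 in EBX, the subroutine overwrites EBX while it runs, and the program then exits using whatever EBX holds as its status.

```nasm
SECTION .text
global  _start

_start:
    mov     ebx, 7         ; a value the caller wants to keep
    call    scratch        ; pushes the return address, then jumps to 'scratch'
    mov     eax, 1         ; SYS_EXIT - EBX is passed as the exit status
    int     80h

scratch:                   ; hypothetical subroutine that needs EBX for its own work
    push    ebx            ; save the caller's EBX on the stack
    mov     ebx, 99        ; overwrite EBX inside the subroutine
    pop     ebx            ; restore the caller's EBX (last in, first out)
    ret                    ; pop the return address and jump back to the caller
```

Running the program and then echo $? should report 7. Removing the push and pop pair should make it report 99 instead, showing that the subroutine's change leaked back to the caller.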

Lesson 5

External include files

External include files allow us to move code from our program and put it into separate files. This technique is useful for writing clean, easy to maintain programs. Reusable bits of code can be written as subroutines and stored in separate files called libraries. When you need a piece of logic you can include the file in your program and use it as if it were part of the same file.

In this lesson we will move our string length calculating subroutine into an external file. We will also make our string printing logic and program exit logic into subroutines and move them into this external file. Once it's completed our actual program will be clean and easier to read.

We can then declare another message variable and call our print function twice in order to demonstrate how we can reuse code.

Note: I won't be showing the code in functions.asm after this lesson unless it changes. It will just be included if needed.

functions.asm
;------------------------------------------
; int slen(String message)
; String length calculation function
slen:
    push    ebx
    mov     ebx, eax
 
nextchar:
    cmp     byte [eax], 0
    jz      finished
    inc     eax
    jmp     nextchar
 
finished:
    sub     eax, ebx
    pop     ebx
    ret
 
 
;------------------------------------------
; void sprint(String message)
; String printing function
sprint:
    push    edx
    push    ecx
    push    ebx
    push    eax
    call    slen
 
    mov     edx, eax
    pop     eax
 
    mov     ecx, eax
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    pop     ebx
    pop     ecx
    pop     edx
    ret
 
 
;------------------------------------------
; void quit()
; Exit program and restore resources
quit:
    mov     ebx, 0
    mov     eax, 1
    int     80h
    ret
helloworld-inc.asm
; Hello World Program (External file include)
; Compile with: nasm -f elf helloworld-inc.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-inc.o -o helloworld-inc
; Run with: ./helloworld-inc
 
%include        'functions.asm'                            ; include our external file
 
SECTION .data
msg1    db      'Hello, brave new world!', 0Ah             ; our first message string
msg2    db      'This is how we recycle in NASM.', 0Ah     ; our second message string
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg1      ; move the address of our first message string into EAX
    call    sprint         ; call our string printing function
 
    mov     eax, msg2      ; move the address of our second message string into EAX
    call    sprint         ; call our string printing function
 
    call    quit           ; call our quit function
~$ nasm -f elf helloworld-inc.asm
~$ ld -m elf_i386 helloworld-inc.o -o helloworld-inc
~$ ./helloworld-inc
Hello, brave new world!
This is how we recycle in NASM.
This is how we recycle in NASM.

Error: Our second message is outputted twice. This is fixed in the next lesson.


Lesson 6

NULL terminating bytes

Ok so why did our second message print twice when we only called our sprint function on msg2 once? Well actually it did only print once. You can see what I mean if you comment out our second call to sprint. The output will be both of our message strings.

But how is this possible?

What is happening is we weren't properly terminating our strings. In assembly, variables are stored one after another in memory so the last byte of our msg1 variable is right next to the first byte of our msg2 variable. We know our string length calculation is looking for a zero byte so unless our msg2 variable starts with a zero byte it keeps counting as if it's the same string (and as far as assembly is concerned it is the same string). So we need to put a zero byte or 0h after our strings to let assembly know where to stop counting.

Note: In programming 0h denotes a null byte and a null byte after a string tells assembly where it ends in memory.

helloworld-inc.asm
; Hello World Program (NULL terminating bytes)
; Compile with: nasm -f elf helloworld-inc.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-inc.o -o helloworld-inc
; Run with: ./helloworld-inc
 
%include        'functions.asm'
 
SECTION .data
msg1    db      'Hello, brave new world!', 0Ah, 0h         ; NOTE the null terminating byte
msg2    db      'This is how we recycle in NASM.', 0Ah, 0h ; NOTE the null terminating byte
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg1
    call    sprint
 
    mov     eax, msg2
    call    sprint
 
    call    quit
~$ nasm -f elf helloworld-inc.asm
~$ ld -m elf_i386 helloworld-inc.o -o helloworld-inc
~$ ./helloworld-inc
Hello, brave new world!
This is how we recycle in NASM.
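
The effect of a missing terminator can also be observed directly by returning the calculated length as the program's exit status. The sketch below is an assumption-laden illustration rather than part of the lesson's files: the variable names first and second and their two-letter contents are hypothetical, and the counting loop is copied inline so the example stands alone.

```nasm
SECTION .data
first   db      'AB'           ; hypothetical string with NO terminating zero byte
second  db      'CD', 0h       ; stored directly after 'first' in memory, properly terminated

SECTION .text
global  _start

_start:
    mov     eax, first         ; start counting at the first byte of 'first'
    mov     ebx, eax           ; remember the starting address

nextchar:
    cmp     byte [eax], 0      ; stop only when a zero byte is found
    jz      finished
    inc     eax
    jmp     nextchar

finished:
    sub     eax, ebx           ; bytes counted before the zero byte
    mov     ebx, eax           ; use the count as the exit status
    mov     eax, 1
    int     80h
```

Running this and then echo $? should report 4 rather than 2, because the count runs straight through 'AB' into 'CD' before reaching a zero byte. Adding 0h after 'AB' should bring the result down to 2.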

Lesson 7

Linefeeds

Linefeeds are essential to console programs like our 'hello world' program. They become even more important once we start building programs that require user input. But linefeeds can be a pain to maintain. Sometimes you will want to include them in your strings and sometimes you will want to remove them. If we continue to hard code them in our variables by adding 0Ah after our declared message text, it will become a problem. If there's a place in the code where we don't want to print out the linefeed for that variable, we will need to write some extra logic to remove it from the string at runtime.

It would be better for the maintainability of our program if we write a subroutine that will print out our message and then print a linefeed afterwards. That way we can just call this subroutine when we need the linefeed and call our current sprint subroutine when we don't.

A call to sys_write requires that we pass a pointer to the address in memory of the string we want to print, so we can't just pass a linefeed character (0Ah) to our print function. We also don't want to create another variable just to hold a linefeed character, so we will use the stack instead.

It works by moving a linefeed character into EAX. We then push EAX onto the stack and take the address held in the Extended Stack Pointer (ESP), which is another register. When you push an item onto the stack, ESP is decremented so that it points to the address in memory of that item, so it can be used to access the item directly on the stack. Since ESP now holds the address in memory of a character, sys_write can use it to print.
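
The stack trick can be tried on its own before it is wrapped in a subroutine. The following is a minimal sketch, not part of the lesson's files, that pushes a linefeed and hands ESP straight to sys_write with a length of one byte. The file name lf-stack.asm is only an assumption for illustration.

```nasm
; Sketch: print a single linefeed straight from the stack (32-bit Linux)
; Compile with: nasm -f elf lf-stack.asm
; Link with: ld -m elf_i386 lf-stack.o -o lf-stack

SECTION .text
global  _start

_start:
    mov     eax, 0Ah    ; linefeed character, eax = 0000000Ah
    push    eax         ; esp is decremented by 4 and now points at the 0Ah byte
    mov     edx, 1      ; write exactly one byte
    mov     ecx, esp    ; address of the linefeed on the stack
    mov     ebx, 1      ; write to the STDOUT file
    mov     eax, 4      ; invoke SYS_WRITE (kernel opcode 4)
    int     80h
    pop     eax         ; remove the linefeed from the stack

    mov     ebx, 0      ; exit status 0
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Because the length is passed explicitly here, the sketch does not rely on the zero bytes that follow 0Ah on the stack. The sprintLF function in this lesson does rely on them, since sprint measures the string with slen.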

Note: The new code in functions.asm below is the sprintLF function.

functions.asm
;------------------------------------------
; int slen(String message)
; String length calculation function
slen:
    push    ebx
    mov     ebx, eax
 
nextchar:
    cmp     byte [eax], 0
    jz      finished
    inc     eax
    jmp     nextchar
 
finished:
    sub     eax, ebx
    pop     ebx
    ret
 
 
;------------------------------------------
; void sprint(String message)
; String printing function
sprint:
    push    edx
    push    ecx
    push    ebx
    push    eax
    call    slen
 
    mov     edx, eax
    pop     eax
 
    mov     ecx, eax
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    pop     ebx
    pop     ecx
    pop     edx
    ret
 
 
;------------------------------------------
; void sprintLF(String message)
; String printing with line feed function
sprintLF:
    call    sprint
 
    push    eax        ; push eax onto the stack to preserve it while we use the eax register in this function
    mov     eax, 0Ah   ; move 0Ah into eax - 0Ah is the ascii character for a linefeed
                        ; as eax is 4 bytes wide, it now contains 0000000Ah
    push    eax        ; push the linefeed onto the stack so we can get the address
                        ; given that we have a little-endian architecture, eax register bytes are stored in reverse order,
                        ; this corresponds to stack memory contents of 0Ah, 0h, 0h, 0h,
                        ; giving us a linefeed followed by a NULL terminating byte
    mov     eax, esp   ; move the address of the current stack pointer into eax for sprint
    call    sprint     ; call our sprint function
    pop     eax        ; remove our linefeed character from the stack
    pop     eax        ; restore the original value of eax before our function was called
    ret                ; return to our program
 
 
;------------------------------------------
; void exit()
; Exit program and restore resources
quit:
    mov     ebx, 0
    mov     eax, 1
    int     80h
    ret
helloworld-lf.asm
; Hello World Program (Print with line feed)
; Compile with: nasm -f elf helloworld-lf.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-lf.o -o helloworld-lf
; Run with: ./helloworld-lf
 
%include        'functions.asm'
 
SECTION .data
msg1    db      'Hello, brave new world!', 0h         ; NOTE we have removed the line feed character 0Ah
msg2    db      'This is how we recycle in NASM.', 0h ; NOTE we have removed the line feed character 0Ah
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg1
    call    sprintLF   ; NOTE we are calling our new print with linefeed function
 
    mov     eax, msg2
    call    sprintLF   ; NOTE we are calling our new print with linefeed function
 
    call    quit
~$ nasm -f elf helloworld-lf.asm
~$ ld -m elf_i386 helloworld-lf.o -o helloworld-lf
~$ ./helloworld-lf
Hello, brave new world!
This is how we recycle in NASM.

Lesson 8

Passing arguments

Passing arguments to your program from the command line is as easy as popping them off the stack in NASM. When we run our program, Linux loads the address of each passed argument onto the stack in reverse order. The address of the program name is then loaded onto the stack and lastly the total number of arguments is loaded onto the stack. So when our program starts, the top two stack items are always the number of arguments and the name of the program. Note that this count includes the program name itself.
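
To make the layout concrete, here is a small sketch that reads the same values relative to ESP instead of popping them. It assumes the functions.asm from Lesson 7 (sprintLF and quit) is in the same directory. The label done is a name chosen for this illustration.

```nasm
; Sketch: inspect the initial stack without popping it (32-bit Linux)
; Assumes functions.asm from Lesson 7 provides sprintLF and quit

%include        'functions.asm'

SECTION .text
global  _start

_start:
    mov     ecx, [esp]      ; number of arguments, including the program name
    mov     eax, [esp+4]    ; address of the program name
    call    sprintLF        ; print the program name

    cmp     ecx, 2          ; was at least one argument passed after the name?
    jl      done            ; if not, there is nothing else to print
    mov     eax, [esp+8]    ; address of the first passed argument
    call    sprintLF        ; print it

done:
    call    quit
```

Popping, as the lesson does, is the simpler choice for a loop because each POP moves ESP on to the next argument. Reading relative to ESP leaves the stack untouched, which can be handy when the arguments are needed more than once.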

So all we have to do to use them is pop the number of arguments off the stack first, then iterate once for each argument and perform our logic. In our program that means calling our print function.

Note: We are using the ECX register as our counter for the loop. Although it's a general-purpose register, its original intention was to be used as a counter.

helloworld-args.asm
; Hello World Program (Passing arguments from the command line)
; Compile with: nasm -f elf helloworld-args.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-args.o -o helloworld-args
; Run with: ./helloworld-args
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    pop     ecx            ; first value on the stack is the number of arguments
 
nextArg:
    cmp     ecx, 0h        ; check to see if we have any arguments left
    jz      noMoreArgs     ; if zero flag is set jump to noMoreArgs label (jumping over the end of the loop)
    pop     eax            ; pop the next argument off the stack
    call    sprintLF       ; call our print with linefeed function
    dec     ecx            ; decrease ecx (number of arguments left) by 1
    jmp     nextArg        ; jump to nextArg label
 
noMoreArgs:
    call    quit
~$ nasm -f elf helloworld-args.asm
~$ ld -m elf_i386 helloworld-args.o -o helloworld-args
~$ ./helloworld-args "This is one argument" "This is another" 101
./helloworld-args
This is one argument
This is another
101

Lesson 9

User input

Introduction to the .bss section

So far we've used the .text and .data sections, so now it's time to introduce the .bss section. BSS stands for Block Started by Symbol. It is an area in our program that is used to reserve space in memory for uninitialised variables. We will use it to reserve some space in memory to hold our user input, since we don't know in advance how many bytes we'll need to store.

The syntax to declare variables is as follows:

.bss section example
SECTION .bss
variableName1:      RESB    1      ; reserve space for 1 byte
variableName2:      RESW    1      ; reserve space for 1 word
variableName3:      RESD    1      ; reserve space for 1 double word
variableName4:      RESQ    1      ; reserve space for 1 double precision float (quad word)
variableName5:      REST    1      ; reserve space for 1 extended precision float

Writing our program

We will be using the system call sys_read to receive and process input from the user. This function is assigned OPCODE 3 in the Linux System Call Table. Just like sys_write, this function takes 3 arguments, which are loaded into EDX, ECX and EBX before requesting a software interrupt that will call the function.

The arguments passed are as follows:

  • EDX will be loaded with the maximum length (in bytes) of the space in memory.
  • ECX will be loaded with the address of our variable created in the .bss section.
  • EBX will be loaded with the file we want to read from, in this case STDIN.

As always, the datatype and meaning of the arguments passed can be found in the function's definition.

When reading from a terminal, sys_read returns control to the program once the user presses Enter. The user's input, including the trailing linefeed, is located at the memory address you passed in ECX, and the number of bytes read is returned in EAX.
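
The byte count returned in EAX can be put to use straight away. The sketch below is one possible variation on this lesson's program, not the author's version: it uses the count to replace the trailing linefeed with a NULL byte. It assumes the functions.asm from Lesson 7, and the names prompt, greeting, greet and finish are chosen for this illustration.

```nasm
; Sketch: use the byte count returned by SYS_READ to strip the trailing linefeed
; Assumes functions.asm from Lesson 7 provides sprint, sprintLF and quit

%include        'functions.asm'

SECTION .data
prompt      db      'Please enter your name: ', 0h
greeting    db      'Hello, ', 0h

SECTION .bss
sinput:     resb    256                 ; 255 bytes of input plus room for a NULL byte

SECTION .text
global  _start

_start:
    mov     eax, prompt
    call    sprint

    mov     edx, 255                    ; read at most 255 bytes
    mov     ecx, sinput                 ; buffer to store the input
    mov     ebx, 0                      ; read from the STDIN file
    mov     eax, 3                      ; invoke SYS_READ (kernel opcode 3)
    int     80h                         ; eax = bytes read, 0 at end of file, negative on error

    cmp     eax, 0
    jle     finish                      ; nothing was read, so there is nothing to greet
    cmp     byte [sinput+eax-1], 0Ah    ; is the last byte read a linefeed?
    jne     greet                       ; if not, leave the input as it is
    mov     byte [sinput+eax-1], 0h     ; overwrite the linefeed with a NULL byte

greet:
    mov     eax, greeting
    call    sprint
    mov     eax, sinput
    call    sprintLF                    ; print the name and add our own linefeed

finish:
    call    quit
```

Reserving one byte more than the maximum read length means the buffer is still NULL terminated when the user fills all 255 bytes, because the .bss section starts out zeroed.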

helloworld-input.asm
; Hello World Program (Getting input)
; Compile with: nasm -f elf helloworld-input.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-input.o -o helloworld-input
; Run with: ./helloworld-input
 
%include        'functions.asm'
 
SECTION .data
msg1        db      'Please enter your name: ', 0h     ; message string asking user for input
msg2        db      'Hello, ', 0h                      ; message string to use after user has entered their name
 
SECTION .bss
sinput:     resb    255                                ; reserve a 255 byte space in memory for the users input string
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg1
    call    sprint
 
    mov     edx, 255       ; number of bytes to read
    mov     ecx, sinput    ; reserved space to store our input (known as a buffer)
    mov     ebx, 0         ; read from the STDIN file
    mov     eax, 3         ; invoke SYS_READ (kernel opcode 3)
    int     80h
 
    mov     eax, msg2
    call    sprint
 
    mov     eax, sinput    ; move our buffer into eax (Note: input contains a linefeed)
    call    sprint         ; call our print function
 
    call    quit
~$ nasm -f elf helloworld-input.asm
~$ ld -m elf_i386 helloworld-input.o -o helloworld-input
~$ ./helloworld-input
Please enter your name: Daniel Givney
Hello, Daniel Givney

Lesson 10

Count to 10

Firstly, some background

Counting by numbers is not as straightforward as you would think in assembly. Firstly, we need to pass sys_write an address in memory, so we can't just load our register with a number and call our print function. Secondly, numbers and strings are very different things in assembly. Strings are represented by what are called ASCII values. ASCII stands for American Standard Code for Information Interchange. Any ASCII table is a good reference for these values. ASCII was created as a way to standardise the representation of characters across all computers.

Remember, we can't print a number - we have to print a string. In order to count to 10 we will need to convert our numbers from standard integers to their ASCII string representations. Have a look at an ASCII table and notice that the character '1' actually has the ASCII value 49. In fact, adding 48 to a single-digit number is all we have to do to convert it from an integer to its ASCII representation.

Writing our program

What we will do with our program is count from 1 to 10 using the ECX register. We will then add 48 to our counter to convert it from a number to its ASCII string representation. We will then push this value to the stack and call our print function, passing ESP as the memory address to print from. Once we have finished counting to 10 we will exit our counting loop and call our quit function.

helloworld-10.asm
; Hello World Program (Count to 10)
; Compile with: nasm -f elf helloworld-10.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-10.o -o helloworld-10
; Run with: ./helloworld-10
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0         ; ecx is initialised to zero.
 
nextNumber:
    inc     ecx            ; increment ecx
 
    mov     eax, ecx       ; copy the value of our counter into eax
    add     eax, 48        ; add 48 to our number to convert from integer to ascii for printing
    push    eax            ; push eax to the stack
    mov     eax, esp       ; get the address of the character on the stack
    call    sprintLF       ; call our print function
 
    pop     eax            ; clean up the stack so we don't have unneeded bytes taking up space
    cmp     ecx, 10        ; have we reached 10 yet? compare our counter with decimal 10
    jne     nextNumber     ; jump if not equal and keep counting
 
    call    quit
~$ nasm -f elf helloworld-10.asm
~$ ld -m elf_i386 helloworld-10.o -o helloworld-10
~$ ./helloworld-10
1
2
3
4
5
6
7
8
9
:

Error: Our number 10 prints a colon (:) character instead. What's going on?


Lesson 11

Count to 10 (itoa)

So why did our program in Lesson 10 print out a colon character instead of the number 10? Well, let's have a look at our ASCII table. We can see that the colon character has an ASCII value of 58. We were adding 48 to our integers to convert them to their ASCII string representations, so for ten we passed sys_write the value 58. To print ten we actually need to pass the ASCII value for the digit 1 followed by the ASCII value for the digit 0. The two bytes 49 and 48 are the correct string representation of the number 10. So we can't simply add 48 to our numbers to convert them. We first have to divide them by 10, because each place value needs to be converted individually.

We will write 2 new subroutines in this lesson, 'iprint' and 'iprintLF'. These functions will be used when we want to print ASCII string representations of numbers. We achieve this by passing the number in EAX. We then initialise a counter in ECX. We will repeatedly divide the number by 10 and each time convert the remainder to a character by adding 48. We will then push this onto the stack for later use. Once the quotient reaches zero and we can no longer divide the number by 10, we will enter our second loop. In this print loop we will print the converted characters from the stack and pop them off. Popping them off the stack moves ESP forward to the next item on the stack. Each time we print a value we will decrease our counter ECX. Once all digits have been converted and printed we will return to our program.

How does the divide instruction work?

The DIV and IDIV instructions work by dividing a dividend by the value passed to the instruction. With a 32-bit operand the dividend is the 64-bit value held in the register pair EDX:EAX, which is why EDX is set to zero before dividing a number that fits in EAX. The quotient part of the result is left in EAX and the remainder part is put into EDX (originally called the data register).

For example.

IDIV instruction example
mov     eax, 10        ; move 10 into eax
mov     edx, 0         ; clear edx, the upper half of the dividend
mov     esi, 10        ; move 10 into esi
idiv    esi            ; divide edx:eax by esi (eax will equal 1 and edx will equal 0)
idiv    esi            ; divide edx:eax by esi again (eax will equal 0 and edx will equal 1)
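
Setting EDX to zero is only right for non-negative dividends. For a negative number in EAX, IDIV needs EDX to hold the sign extension of EAX, which the CDQ instruction provides. The following sketch is an illustration added here, not part of the lesson's code, and the file name idiv-signed.asm is an assumption.

```nasm
; Sketch: signed division needs edx to hold the sign extension of eax
; Compile with: nasm -f elf idiv-signed.asm
; Link with: ld -m elf_i386 idiv-signed.o -o idiv-signed

SECTION .text
global  _start

_start:
    mov     eax, -7     ; dividend
    cdq                 ; sign-extend eax into edx (edx = FFFFFFFFh here)
    mov     esi, 2      ; divisor
    idiv    esi         ; eax = -3 (quotient), edx = -1 (remainder)

    neg     eax         ; eax = 3, so the result fits in an exit status
    mov     ebx, eax    ; exit status = 3
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Running the program and then `echo $?` in the shell shows the exit status. The iprint function in this lesson only handles non-negative numbers, so clearing EDX is sufficient there.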

If we are only storing the remainder won't we have problems?

No, because these are integers, when you divide a number by a bigger number the quotient in EAX is 0 and the remainder is the number itself. This is because the divisor goes into the number zero times, leaving the original value as the remainder in EDX. How good is that?
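
A worked trace may help here. The sketch below follows the number 123 through the same divide-and-push steps and exits with the digit count as its status. It is an illustration only, and the label nextDigit is a name chosen for this example.

```nasm
; Sketch: trace 123 through repeated division by 10 (32-bit Linux)

SECTION .text
global  _start

_start:
    mov     eax, 123    ; number to convert
    mov     esi, 10     ; divisor
    mov     ecx, 0      ; digit counter

nextDigit:
    inc     ecx         ; one more digit
    mov     edx, 0      ; clear the upper half of the dividend
    div     esi         ; pass 1: eax = 12, edx = 3
                        ; pass 2: eax = 1,  edx = 2
                        ; pass 3: eax = 0,  edx = 1 (1 / 10 = 0 remainder 1)
    add     edx, 48     ; the remainders become '3', '2' and '1'
    push    edx         ; the most significant digit ends up on top of the stack
    cmp     eax, 0      ; is there anything left to divide?
    jnz     nextDigit

    mov     ebx, ecx    ; exit status = number of digits (3)
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

The last pass is the case the question asks about: the quotient is 0, yet the leading digit is not lost because it comes back as the remainder.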

Note: Only the new functions iprint and iprintLF have comments.

functions.asm
;------------------------------------------
; void iprint(Integer number)
; Integer printing function (itoa)
iprint:
    push    eax            ; preserve eax on the stack to be restored after function runs
    push    ecx            ; preserve ecx on the stack to be restored after function runs
    push    edx            ; preserve edx on the stack to be restored after function runs
    push    esi            ; preserve esi on the stack to be restored after function runs
    mov     ecx, 0         ; counter of how many bytes we need to print in the end
 
divideLoop:
    inc     ecx            ; count each byte to print - number of characters
    mov     edx, 0         ; empty edx
    mov     esi, 10        ; mov 10 into esi
    idiv    esi            ; divide eax by esi
    add     edx, 48        ; convert edx to its ascii representation - edx holds the remainder after a divide instruction
    push    edx            ; push edx (string representation of an integer) onto the stack
    cmp     eax, 0         ; can the integer be divided anymore?
    jnz     divideLoop     ; jump if not zero to the label divideLoop
 
printLoop:
    dec     ecx            ; count down each byte that we put on the stack
    mov     eax, esp       ; mov the stack pointer into eax for printing
    call    sprint         ; call our string print function
    pop     eax            ; remove last character from the stack to move esp forward
    cmp     ecx, 0         ; have we printed all bytes we pushed onto the stack?
    jnz     printLoop      ; jump if not zero to the label printLoop
 
    pop     esi            ; restore esi from the value we pushed onto the stack at the start
    pop     edx            ; restore edx from the value we pushed onto the stack at the start
    pop     ecx            ; restore ecx from the value we pushed onto the stack at the start
    pop     eax            ; restore eax from the value we pushed onto the stack at the start
    ret
 
 
;------------------------------------------
; void iprintLF(Integer number)
; Integer printing function with linefeed (itoa)
iprintLF:
    call    iprint         ; call our integer printing function
 
    push    eax            ; push eax onto the stack to preserve it while we use the eax register in this function
    mov     eax, 0Ah       ; move 0Ah into eax - 0Ah is the ascii character for a linefeed
    push    eax            ; push the linefeed onto the stack so we can get the address
    mov     eax, esp       ; move the address of the current stack pointer into eax for sprint
    call    sprint         ; call our sprint function
    pop     eax            ; remove our linefeed character from the stack
    pop     eax            ; restore the original value of eax before our function was called
    ret
 
 
;------------------------------------------
; int slen(String message)
; String length calculation function
slen:
    push    ebx
    mov     ebx, eax
 
nextchar:
    cmp     byte [eax], 0
    jz      finished
    inc     eax
    jmp     nextchar
 
finished:
    sub     eax, ebx
    pop     ebx
    ret
 
 
;------------------------------------------
; void sprint(String message)
; String printing function
sprint:
    push    edx
    push    ecx
    push    ebx
    push    eax
    call    slen
 
    mov     edx, eax
    pop     eax
 
    mov     ecx, eax
    mov     ebx, 1
    mov     eax, 4
    int     80h
 
    pop     ebx
    pop     ecx
    pop     edx
    ret
 
 
;------------------------------------------
; void sprintLF(String message)
; String printing with line feed function
sprintLF:
    call    sprint
 
    push    eax
    mov     eax, 0AH
    push    eax
    mov     eax, esp
    call    sprint
    pop     eax
    pop     eax
    ret
 
 
;------------------------------------------
; void exit()
; Exit program and restore resources
quit:
    mov     ebx, 0
    mov     eax, 1
    int     80h
    ret
helloworld-itoa.asm
; Hello World Program (Count to 10 itoa)
; Compile with: nasm -f elf helloworld-itoa.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 helloworld-itoa.o -o helloworld-itoa
; Run with: ./helloworld-itoa
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0
 
nextNumber:
    inc     ecx
    mov     eax, ecx
    call    iprintLF       ; NOTE call our new integer printing function (itoa)
    cmp     ecx, 10
    jne     nextNumber
 
    call    quit
~$ nasm -f elf helloworld-itoa.asm
~$ ld -m elf_i386 helloworld-itoa.o -o helloworld-itoa
~$ ./helloworld-itoa
1
2
3
4
5
6
7
8
9
10

Lesson 12

Calculator - addition

In this program we will be adding the registers EAX and EBX together and we'll leave our answer in EAX. Firstly we use the MOV instruction to load EAX with an integer (in this case 90). We then MOV an integer into EBX (in this case 9). Now all we need to do is use the ADD instruction to perform our addition. EBX and EAX will be added together, leaving our answer in the leftmost register of the instruction (in our case EAX). Then all we need to do is call our integer printing function to complete the program.

calculator-addition.asm
; Calculator (Addition)
; Compile with: nasm -f elf calculator-addition.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 calculator-addition.o -o calculator-addition
; Run with: ./calculator-addition
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, 90    ; move our first number into eax
    mov     ebx, 9     ; move our second number into ebx
    add     eax, ebx   ; add ebx to eax
    call    iprintLF   ; call our integer print with linefeed function
 
    call    quit
~$ nasm -f elf calculator-addition.asm
~$ ld -m elf_i386 calculator-addition.o -o calculator-addition
~$ ./calculator-addition
99

Lesson 13

Calculator - subtraction

In this program we will be subtracting the value in the register EBX from the value in the register EAX. Firstly we load EAX and EBX with integers in the same way as Lesson 12. The only difference is that we will be using the SUB instruction to perform our subtraction logic, leaving our answer in the leftmost register of the instruction (in our case EAX). Then all we need to do is call our integer printing function to complete the program.

calculator-subtraction.asm
; Calculator (Subtraction)
; Compile with: nasm -f elf calculator-subtraction.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 calculator-subtraction.o -o calculator-subtraction
; Run with: ./calculator-subtraction
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, 90    ; move our first number into eax
    mov     ebx, 9     ; move our second number into ebx
    sub     eax, ebx   ; subtract ebx from eax
    call    iprintLF   ; call our integer print with linefeed function
 
    call    quit
~$ nasm -f elf calculator-subtraction.asm
~$ ld -m elf_i386 calculator-subtraction.o -o calculator-subtraction
~$ ./calculator-subtraction
81

Lesson 14

Calculator - multiplication

In this program we will be multiplying the value in EAX by the value in EBX. Firstly we load EAX and EBX with integers in the same way as Lesson 12. This time though we will be calling the MUL instruction to perform our multiplication logic. The MUL instruction is different from many instructions in NASM, in that it only accepts one argument. The MUL instruction always multiplies EAX by whatever value is passed after it. The lower 32 bits of the answer are left in EAX and the upper 32 bits are placed in EDX, so for any result that fits in 32 bits the answer is simply in EAX.
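
To see where the upper half of a product goes, here is a minimal sketch, separate from the lesson's program, that multiplies two values whose product does not fit in 32 bits and exits with the contents of EDX as its status.

```nasm
; Sketch: a 32-bit MUL leaves its 64-bit product in edx:eax (32-bit Linux)

SECTION .text
global  _start

_start:
    mov     eax, 80000000h  ; 2147483648
    mov     ebx, 2
    mul     ebx             ; edx:eax = 00000001:00000000h (4294967296)
                            ; eax alone is now 0 and the carry flag is set
    mov     ebx, edx        ; exit status = upper half of the product (1)
    mov     eax, 1          ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

This also means MUL overwrites EDX even when the product is small, so any value kept in EDX needs to be saved before multiplying.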

calculator-multiplication.asm
; Calculator (Multiplication)
; Compile with: nasm -f elf calculator-multiplication.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 calculator-multiplication.o -o calculator-multiplication
; Run with: ./calculator-multiplication
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, 90    ; move our first number into eax
    mov     ebx, 9     ; move our second number into ebx
    mul     ebx        ; multiply eax by ebx
    call    iprintLF   ; call our integer print with linefeed function
 
    call    quit
~$ nasm -f elf calculator-multiplication.asm
~$ ld -m elf_i386 calculator-multiplication.o -o calculator-multiplication
~$ ./calculator-multiplication
810

Lesson 15

Calculator - division

In this program we will be dividing the value in EAX by the value in EBX. We've used division before in our integer print subroutine. Our program requires an extra string in order to print out the answer clearly but otherwise there's nothing complicated going on.

Firstly we load EAX and EBX with integers in the same way as Lesson 12. Division logic is performed using the DIV instruction. The DIV instruction always divides the value in EDX:EAX by the value passed after it, so EDX should be zero when the number we are dividing fits in EAX. It will leave the quotient part of the answer in EAX and put the remainder part in EDX (the original data register). We then MOV our strings and integers into EAX and call our print functions to print out the answer.

calculator-division.asm
; Calculator (Division)
; Compile with: nasm -f elf calculator-division.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 calculator-division.o -o calculator-division
; Run with: ./calculator-division
 
%include        'functions.asm'
 
SECTION .data
msg1        db      ' remainder ', 0h ; a message string to correctly output result (NULL terminated for sprint)
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, 90    ; move our first number into eax
    mov     edx, 0     ; clear edx, the upper half of the dividend
    mov     ebx, 9     ; move our second number into ebx
    div     ebx        ; divide edx:eax by ebx
    call    iprint     ; call our integer print function on the quotient
    mov     eax, msg1  ; move our message string into eax
    call    sprint     ; call our string print function
    mov     eax, edx   ; move our remainder into eax
    call    iprintLF   ; call our integer printing with linefeed function
 
    call    quit
~$ nasm -f elf calculator-division.asm
~$ ld -m elf_i386 calculator-division.o -o calculator-division
~$ ./calculator-division
10 remainder 0

Lesson 16

Calculator (atoi)

Our program will take several command line arguments and add them together, printing out the result in the terminal.

Writing our program

Our program begins by using the POP instruction to get the number of passed arguments off the stack. This value is stored in ECX (originally known as the counter register). It will then POP the next value off the stack, which is the program name, and subtract one from the number of arguments stored in ECX. It will then loop through the rest of the arguments, popping each one off the stack and performing our addition logic. As we know, arguments passed via the command line are received by our program as strings. Before we can add the arguments together we will need to convert them to integers, otherwise our result will not be correct. We do this by calling our Ascii to Integer function (atoi). This function will convert the ascii value into an integer and place the result in EAX. We can then add this value to EDX (originally known as the data register) where we will store the result of our additions. If the value passed to atoi is not an ascii representation of an integer, our function will return zero instead. When all arguments have been converted and added together we will print out the result and call our quit function.

How does the atoi function work

Converting an ascii string into an integer value is not a trivial task. We know how to convert an integer to an ascii string so the process should essentially work in reverse. Firstly we take the address of the string and move it into ESI (originally known as the source register). We will then move along the string byte by byte (think of each byte as being a single digit or decimal place). For each byte we will check if its value is between 48 and 57 (the ascii values for the digits 0-9).

Once we have performed this check and determined that the byte can be converted to an integer we will perform the following logic. We will subtract 48 from the value, converting the ascii value to its decimal equivalent. We will then add this value to EAX (the general purpose register that will store our result). We will then multiply EAX by 10, as each byte represents a decimal place, and continue the loop.

When all bytes have been converted we need to do one last thing before we return the result. The last digit of any number represents a single unit (not a multiple of 10) so we have multiplied our result one too many times. We simply divide it by 10 once to correct this and then return. If no digits were converted, however, we skip this divide instruction.
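
As a rough sketch of the same idea, the loop can also multiply the running total by 10 before adding each new digit, which removes the need for the final divide. This is only an illustration and not the atoi function from functions.asm: the file name, the sample string '42' and the use of the exit status to show the result are all assumptions made for the example.

```nasm
; atoi-sketch.asm (hypothetical file name, not part of the tutorial files)
; Compile with: nasm -f elf atoi-sketch.asm
; Link with:    ld -m elf_i386 atoi-sketch.o -o atoi-sketch
; Run with:     ./atoi-sketch ; echo $?

SECTION .data
number      db      '42', 0h             ; sample string to convert (assumption)

SECTION .text
global  _start

_start:
    mov     esi, number                  ; esi points at the first character
    xor     eax, eax                     ; eax holds the running total
    xor     ecx, ecx                     ; ecx is the byte index into the string

.nextDigit:
    movzx   ebx, byte [esi+ecx]          ; load one byte and zero the rest of ebx
    cmp     bl, 48                       ; below ascii '0'?
    jb      .done                        ; not a digit (this also catches the null terminator)
    cmp     bl, 57                       ; above ascii '9'?
    ja      .done                        ; not a digit
    sub     ebx, 48                      ; ascii value -> digit value 0-9
    imul    eax, eax, 10                 ; move the digits we already have one decimal place left
    add     eax, ebx                     ; add the new digit as the units
    inc     ecx                          ; advance to the next byte
    jmp     .nextDigit

.done:
    mov     ebx, eax                     ; use the converted value as the exit status
    mov     eax, 1                       ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Running it and then typing echo $? should show 42. An exit status only holds values 0-255, so this trick is only useful for small test numbers.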

What is the BL register

You may have noticed that the atoi function references the BL register. So far in these tutorials we have been exclusively using 32bit registers. Parts of these 32bit general purpose registers can also be referenced on their own, as 16bit and 8bit registers. We wanted a single byte (8bits) because a byte is the amount of memory required to store a single ascii character. If we had loaded a full 32bits from the string we would have copied four bytes into EBX, leaving us with 'rubbish' bits, because only the lowest 8bits would be meaningful for our calculation.

The EBX register is 32bits. The lower 16bits of EBX are referenced as BX. BX in turn contains the 8bit registers BL and BH (its lower and higher bytes). We wanted the lowest 8bits of EBX and so we referenced that part of the register using BL.
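
The relationship between EBX and BL can be seen in a small experiment. This is a minimal sketch rather than part of the tutorial files; the file name and the sample values are made up for the example, and the result is shown through the exit status.

```nasm
; registers-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf registers-sketch.asm
; Link with:    ld -m elf_i386 registers-sketch.o -o registers-sketch
; Run with:     ./registers-sketch ; echo $?

SECTION .text
global  _start

_start:
    mov     ebx, 12345678h     ; fill all 32 bits of ebx
    mov     bl, 2Ah            ; overwrite only the lowest byte: ebx is now 1234562Ah
    movzx   ebx, bl            ; keep bl and clear the upper 24 bits: ebx is now 0000002Ah
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h                ; exit status is the value left in ebx
```

The exit status should be 42 (2Ah). Writing to BL leaves the other 24 bits of EBX untouched, which is why atoi clears EBX before loading a byte into BL.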

Almost every assembly language tutorial begins with a history of the registers, their names and their sizes. These tutorials, however, were written to provide a foundation in NASM by first writing code and then understanding the theory. The full story about the size of registers, their history and importance is beyond the scope of this tutorial but we will return to that story in later tutorials.

Note: Only the new function in this file 'atoi' is shown below.

functions.asm
;------------------------------------------
; int atoi(Integer number)
; Ascii to integer function (atoi)
atoi:
    push    ebx            ; preserve ebx on the stack to be restored after function runs
    push    ecx            ; preserve ecx on the stack to be restored after function runs
    push    edx            ; preserve edx on the stack to be restored after function runs
    push    esi            ; preserve esi on the stack to be restored after function runs
    mov     esi, eax       ; move pointer in eax into esi (our number to convert)
    mov     eax, 0         ; initialise eax with decimal value 0
    mov     ecx, 0         ; initialise ecx with decimal value 0
 
.multiplyLoop:
    xor     ebx, ebx       ; resets both lower and upper bytes of ebx to be 0
    mov     bl, [esi+ecx]  ; move a single byte into ebx register's lower half
    cmp     bl, 48         ; compare ebx register's lower half value against ascii value 48 (char value 0)
    jl      .finished      ; jump if less than to label finished
    cmp     bl, 57         ; compare ebx register's lower half value against ascii value 57 (char value 9)
    jg      .finished      ; jump if greater than to label finished
 
    sub     bl, 48         ; convert ebx register's lower half to decimal representation of ascii value
    add     eax, ebx       ; add ebx to our integer value in eax
    mov     ebx, 10        ; move decimal value 10 into ebx
    mul     ebx            ; multiply eax by ebx to get place value
    inc     ecx            ; increment ecx (our counter register)
    jmp     .multiplyLoop  ; continue multiply loop
 
.finished:
    cmp     ecx, 0         ; compare ecx register's value against decimal 0 (our counter register)
    je      .restore       ; jump if equal to 0 (no integer arguments were passed to atoi)
    mov     ebx, 10        ; move decimal value 10 into ebx
    div     ebx            ; divide eax by value in ebx (in this case 10)
 
.restore:
    pop     esi            ; restore esi from the value we pushed onto the stack at the start
    pop     edx            ; restore edx from the value we pushed onto the stack at the start
    pop     ecx            ; restore ecx from the value we pushed onto the stack at the start
    pop     ebx            ; restore ebx from the value we pushed onto the stack at the start
    ret
calculator-atoi.asm
; Calculator (ATOI)
; Compile with: nasm -f elf calculator-atoi.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 calculator-atoi.o -o calculator-atoi
; Run with: ./calculator-atoi 20 1000 317
 
%include        'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    pop     ecx            ; first value on the stack is the number of arguments
    pop     edx            ; second value on the stack is the program name (discarded when we initialise edx)
    sub     ecx, 1         ; decrease ecx by 1 (number of arguments without program name)
    mov     edx, 0         ; initialise our data register to store additions
 
nextArg:
    cmp     ecx, 0h        ; check to see if we have any arguments left
    jz      noMoreArgs     ; if zero flag is set jump to noMoreArgs label (jumping over the end of the loop)
    pop     eax            ; pop the next argument off the stack
    call    atoi           ; convert our ascii string to decimal integer
    add     edx, eax       ; perform our addition logic
    dec     ecx            ; decrease ecx (number of arguments left) by 1
    jmp     nextArg        ; jump to nextArg label
 
noMoreArgs:
    mov     eax, edx       ; move our data result into eax for printing
    call    iprintLF       ; call our integer printing with linefeed function
    call    quit           ; call our quit function
~$ nasm -f elf calculator-atoi.asm
~$ ld -m elf_i386 calculator-atoi.o -o calculator-atoi
~$ ./calculator-atoi 20 1000 317
1337

Lesson 17

Namespace

Namespace is a necessary construct in any software project that involves a codebase that is larger than a few simple functions. Namespace provides scope to your identifiers and allows you to reuse naming conventions to make your code more readable and maintainable. In assembly language where subroutines are identified by global labels, namespace can be achieved by using local labels.

Up until the last few tutorials we have been using global labels exclusively. This means that blocks of logic that essentially perform the same task each needed a label with a unique identifier. A good example would be our "finished" labels. These were global in scope, meaning that when we needed to break out of a loop in one function we could jump to a "finished" label. But if we needed to break out of a loop in a different function we would need to name this same task something else, maybe calling it "done" or "continue". Being able to reuse the label "finished" would mean that someone reading the code would know that these blocks of logic perform almost the same task.

Local labels are prefixed with a "." at the beginning of their name, for example ".finished". You may have noticed them appearing as our code base in functions.asm grew. A local label is given the namespace of the first global label above it. You can jump to a local label by using the JMP instruction and the assembler will work out which local label you are referencing from the scope (based on the global labels above it) in which the instruction appears.
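
NASM also lets you reach a local label from outside its scope by writing its full name, the global label followed by the local one. The sketch below illustrates this; the label names first and second and the file name are hypothetical, and the exit status is used only to show which blocks ran.

```nasm
; local-labels-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf local-labels-sketch.asm
; Link with:    ld -m elf_i386 local-labels-sketch.o -o local-labels-sketch
; Run with:     ./local-labels-sketch ; echo $?

SECTION .text
global  _start

_start:
    jmp     first.finished     ; full name: the .finished that belongs to first

first:
.finished:
    mov     ebx, 1             ; record that first.finished ran
    jmp     second.finished    ; full name: the .finished that belongs to second

second:
.finished:
    add     ebx, 2             ; record that second.finished ran as well
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h                ; exit status is 1 + 2
```

The exit status should be 3, showing that both ".finished" blocks ran even though they share the same local name.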

Note: The file functions.asm was modified adding namespaces in all the subroutines. This is particularly important in the "slen" subroutine which contains a "finished" global label.

namespace.asm
; Namespace
; Compile with: nasm -f elf namespace.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 namespace.o -o namespace
; Run with: ./namespace
 
%include        'functions.asm'
 
SECTION .data
msg1        db      'Jumping to finished label.', 0h       ; a message string
msg2        db      'Inside subroutine number: ', 0h       ; a message string
msg3        db      'Inside subroutine "finished".', 0h    ; a message string
 
SECTION .text
global  _start
 
_start:
 
subrountineOne:
    mov     eax, msg1      ; move the address of msg1 into eax
    call    sprintLF       ; call our string printing with linefeed function
    jmp     .finished      ; jump to the local label under the subrountineOne scope
 
.finished:
    mov     eax, msg2      ; move the address of msg2 into eax
    call    sprint         ; call our string printing function
    mov     eax, 1         ; move the value one into eax (for subroutine number one)
    call    iprintLF       ; call our integer printing function with linefeed function
 
subrountineTwo:
    mov     eax, msg1      ; move the address of msg1 into eax
    call    sprintLF       ; call our string print with linefeed function
    jmp     .finished      ; jump to the local label under the subrountineTwo scope
 
.finished:
    mov     eax, msg2      ; move the address of msg2 into eax
    call    sprint         ; call our string printing function
    mov     eax, 2         ; move the value two into eax (for subroutine number two)
    call    iprintLF       ; call our integer printing function with linefeed function
 
    mov     eax, msg1      ; move the address of msg1 into eax
    call    sprintLF       ; call our string printing with linefeed function
    jmp     finished       ; jump to the global label finished
 
finished:
    mov     eax, msg3      ; move the address of msg3 into eax
    call    sprintLF       ; call our string printing with linefeed function
    call    quit           ; call our quit function
~$ nasm -f elf namespace.asm
~$ ld -m elf_i386 namespace.o -o namespace
~$ ./namespace
Jumping to finished label.
Inside subroutine number: 1
Jumping to finished label.
Inside subroutine number: 2
Jumping to finished label.
Inside subroutine "finished".

Lesson 18

Fizz Buzz

Firstly, some background

FizzBuzz is a group word game played in schools to teach children division. Players take turns to count aloud integers from 1 to 100, replacing any number divisible by 3 with the word "fizz" and any number divisible by 5 with the word "buzz". Numbers that are divisible by both 3 and 5 are replaced by the word "fizzbuzz". This children's game has also become a de facto interview screening question for computer programming jobs as it's thought to easily discover candidates that can't construct a simple piece of conditional logic.

Writing our program

There are a number of code solutions to this simple game and some languages offer very trivial and elegant solutions. Depending on how you choose to solve it, the solution almost always involves an if statement and possibly an else statement, depending on whether you choose to exploit the mathematical property that anything divisible by both 5 and 3 is also divisible by 3 * 5. As this is an assembly language tutorial, we will provide a solution that involves a structure of two cascading if statements to print the words "fizz" and/or "buzz" and an else statement, in case these fail, to print the integer as an ascii value. Each iteration of our loop will then print a line feed. Once we reach 100 we call our program exit function.
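
The 3 * 5 shortcut mentioned above can be tried on its own before reading the full program. The following is a minimal sketch, not the approach fizzbuzz.asm takes: it tests a single sample number (45, chosen arbitrarily) for divisibility by 15 and reports the answer through the exit status. The file name is hypothetical.

```nasm
; divisible-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf divisible-sketch.asm
; Link with:    ld -m elf_i386 divisible-sketch.o -o divisible-sketch
; Run with:     ./divisible-sketch ; echo $?

SECTION .text
global  _start

_start:
    mov     eax, 45            ; sample number to test (assumption)
    xor     edx, edx           ; clear edx, div uses edx:eax as the dividend
    mov     ebx, 15            ; 3 * 5
    div     ebx                ; quotient in eax, remainder in edx
    xor     ebx, ebx           ; assume "not divisible": exit status 0
    cmp     edx, 0             ; is the remainder zero?
    jne     .exit              ; no - leave the status at 0
    mov     ebx, 1             ; yes - divisible by both 3 and 5

.exit:
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

For 45 the exit status should be 1. Changing the sample number to one that is not a multiple of 15 should give 0.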

fizzbuzz.asm
; Fizzbuzz
; Compile with: nasm -f elf fizzbuzz.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 fizzbuzz.o -o fizzbuzz
; Run with: ./fizzbuzz
 
%include        'functions.asm'
 
SECTION .data
fizz        db      'Fizz', 0h    ; a message string
buzz        db      'Buzz', 0h    ; a message string
 
SECTION .text
global  _start
 
_start:
 
    mov     esi, 0         ; initialise our checkBuzz boolean variable
    mov     edi, 0         ; initialise our checkFizz boolean variable
    mov     ecx, 0         ; initialise our counter variable
 
nextNumber:
    inc     ecx            ; increment our counter variable
 
.checkFizz:
    mov     edx, 0         ; clear the edx register - this will hold our remainder after division
    mov     eax, ecx       ; move the value of our counter into eax for division
    mov     ebx, 3         ; move our number to divide by into ebx (in this case the value is 3)
    div     ebx            ; divide eax by ebx
    mov     edi, edx       ; move our remainder into edi (our checkFizz boolean variable)
    cmp     edi, 0         ; compare if the remainder is zero (meaning the counter divides by 3)
    jne     .checkBuzz     ; if the remainder is not equal to zero jump to local label checkBuzz
    mov     eax, fizz      ; else move the address of our fizz string into eax for printing
    call    sprint         ; call our string printing function
 
.checkBuzz:
    mov     edx, 0         ; clear the edx register - this will hold our remainder after division
    mov     eax, ecx       ; move the value of our counter into eax for division
    mov     ebx, 5         ; move our number to divide by into ebx (in this case the value is 5)
    div     ebx            ; divide eax by ebx
    mov     esi, edx       ; move our remainder into esi (our checkBuzz boolean variable)
    cmp     esi, 0         ; compare if the remainder is zero (meaning the counter divides by 5)
    jne     .checkInt      ; if the remainder is not equal to zero jump to local label checkInt
    mov     eax, buzz      ; else move the address of our buzz string into eax for printing
    call    sprint         ; call our string printing function
 
.checkInt:
    cmp     edi, 0         ; edi contains the remainder after the division in checkFizz
    je     .continue       ; if equal (counter divides by 3) skip printing the integer
    cmp     esi, 0         ; esi contains the remainder after the division in checkBuzz
    je     .continue       ; if equal (counter divides by 5) skip printing the integer
    mov     eax, ecx       ; else move the value in ecx (our counter) into eax for printing
    call    iprint         ; call our integer printing function
 
.continue:
    mov     eax, 0Ah       ; move an ascii linefeed character into eax
    push    eax            ; push the linefeed character onto the stack so it has a memory address
    mov     eax, esp       ; get the stack pointer (address on the stack of our linefeed char)
    call    sprint         ; call our string printing function to print a line feed
    pop     eax            ; pop the stack so we don't waste resources
    cmp     ecx, 100       ; compare if our counter is equal to 100
    jne     nextNumber     ; if not equal jump to the start of the loop
 
    call    quit           ; else call our quit function
~$ nasm -f elf fizzbuzz.asm
~$ ld -m elf_i386 fizzbuzz.o -o fizzbuzz
~$ ./fizzbuzz
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
...

Lesson 19

Execute Command

Firstly, some background

The EXEC family of functions replaces the currently running process with a new process that executes the command you specified when calling it. We will be using the SYS_EXECVE function in this lesson to replace our program's running process with a new process that will execute the linux program /bin/echo to print out "Hello World!".

Naming convention

The naming convention used for this family of functions is exec (execute) followed by one or more of the following letters.

  • e - An array of pointers to environment variables is explicitly passed to the new process image.
  • l - Command-line arguments are passed individually to the function.
  • p - Uses the PATH environment variable to find the file named in the path argument to be executed.
  • v - Command-line arguments are passed to the function as an array of pointers.

Writing our program

The V & E at the end of our function name mean we will need to pass our arguments in the following format: the first argument is a string containing the command to execute, then an array of arguments to pass to that command and then another array of environment variables that the new process will use. As we are calling a simple command we won't pass any special environment variables to the new process and instead will pass an array that contains only 0h (null).

Both the command arguments and the environment arguments need to be passed as an array of pointers (addresses to memory). That's why we define our strings first and then define a simple null-terminated struct (array) of the variable names. This is then passed to SYS_EXECVE. We call the function, the process is replaced by our command and its output is printed to the terminal.

execute.asm
; Execute
; Compile with: nasm -f elf execute.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 execute.o -o execute
; Run with: ./execute
 
%include        'functions.asm'
 
SECTION .data
command         db      '/bin/echo', 0h    ; command to execute
arg1            db      'Hello World!', 0h
arguments       dd      command
                dd      arg1               ; arguments to pass to commandline (in this case just one)
                dd      0h                 ; end the struct
environment     dd      0h                 ; arguments to pass as environment variables (in this case none) end the struct
 
SECTION .text
global  _start
 
_start:
 
    mov     edx, environment   ; address of environment variables
    mov     ecx, arguments     ; address of the arguments to pass to the commandline
    mov     ebx, command       ; address of the file to execute
    mov     eax, 11            ; invoke SYS_EXECVE (kernel opcode 11)
    int     80h
 
    call    quit               ; call our quit function
~$ nasm -f elf execute.asm
~$ ld -m elf_i386 execute.o -o execute
~$ ./execute
Hello World!

Note: Here are a couple of other commands to try.

execute.asm
SECTION .data
command         db      '/bin/ls', 0h      ; command to execute
arg1            db      '-l', 0h

execute.asm
SECTION .data
command         db      '/bin/sleep', 0h   ; command to execute
arg1            db      '5', 0h
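
Another variation worth trying is to pass an environment variable to the new process. The sketch below assumes the env program is installed at /usr/bin/env (true on most linux systems, but check yours) and invents a variable called GREETING purely for illustration. The file name is hypothetical, and the sketch exits with its own SYS_EXIT call instead of using functions.asm.

```nasm
; execute-env-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf execute-env-sketch.asm
; Link with:    ld -m elf_i386 execute-env-sketch.o -o execute-env-sketch
; Run with:     ./execute-env-sketch

SECTION .data
command         db      '/usr/bin/env', 0h      ; assumed location of env, which prints its environment
envVar          db      'GREETING=hello', 0h    ; made-up variable in NAME=value form
arguments       dd      command                 ; argument 0 is the program name
                dd      0h                      ; end the struct
environment     dd      envVar                  ; one environment variable
                dd      0h                      ; end the struct

SECTION .text
global  _start

_start:
    mov     edx, environment   ; address of environment variables
    mov     ecx, arguments     ; address of the arguments to pass to the commandline
    mov     ebx, command       ; address of the file to execute
    mov     eax, 11            ; invoke SYS_EXECVE (kernel opcode 11)
    int     80h

    mov     ebx, 1             ; we only get here if SYS_EXECVE failed
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

If env is found at that path it should print the single line GREETING=hello, because the new process receives only the environment we built for it.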

Lesson 20

Process Forking

Firstly, some background

In this lesson we will use SYS_FORK to create a new process that duplicates our current process. SYS_FORK takes no arguments - you just call fork and the new process is created. Both processes run concurrently. We can check the return value (in EAX) to determine whether we are currently in the parent or child process. In the parent process EAX holds the process ID of the new child, which is a positive integer. In the child process EAX is zero. This can be used to branch your logic between the parent and child.

In our program we exploit this fact to print out different messages in each process.

Note: Each process is responsible for safely exiting.

fork.asm
; Fork
; Compile with: nasm -f elf fork.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 fork.o -o fork
; Run with: ./fork
 
%include        'functions.asm'
 
SECTION .data
childMsg        db      'This is the child process', 0h    ; a message string
parentMsg       db      'This is the parent process', 0h   ; a message string
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, 2             ; invoke SYS_FORK (kernel opcode 2)
    int     80h
 
    cmp     eax, 0             ; if eax is zero we are in the child process
    jz      child              ; jump if eax is zero to child label
 
parent:
    mov     eax, parentMsg     ; inside our parent process move parentMsg into eax
    call    sprintLF           ; call our string printing with linefeed function
 
    call    quit               ; quit the parent process
 
child:
    mov     eax, childMsg      ; inside our child process move childMsg into eax
    call    sprintLF           ; call our string printing with linefeed function
 
    call    quit               ; quit the child process
~$ nasm -f elf fork.asm
~$ ld -m elf_i386 fork.o -o fork
~$ ./fork
This is the parent process
This is the child process
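
Because both processes run concurrently, the order of the two messages is not guaranteed. One way to make the parent wait for the child is SYS_WAITPID (kernel opcode 7 on 32bit linux), which is not covered in this lesson. The following is only a sketch under that assumption: the file name, the status variable and the child's exit code of 7 are made up for the example.

```nasm
; fork-wait-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf fork-wait-sketch.asm
; Link with:    ld -m elf_i386 fork-wait-sketch.o -o fork-wait-sketch
; Run with:     ./fork-wait-sketch ; echo $?

SECTION .bss
status      resd    1          ; SYS_WAITPID stores the child's status word here

SECTION .text
global  _start

_start:
    mov     eax, 2             ; invoke SYS_FORK (kernel opcode 2)
    int     80h

    cmp     eax, 0             ; zero: child, positive: parent, negative: fork failed
    jz      .child
    jl      .error

    mov     ebx, eax           ; parent: eax holds the child's process ID
    mov     ecx, status        ; address to receive the status word
    mov     edx, 0             ; no options - block until the child exits
    mov     eax, 7             ; invoke SYS_WAITPID (kernel opcode 7)
    int     80h

    mov     ebx, [status]      ; a normal exit code is stored in bits 8-15
    shr     ebx, 8
    and     ebx, 0FFh          ; ebx now holds the child's exit code
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h

.child:
    mov     ebx, 7             ; arbitrary exit code for the child
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h

.error:
    mov     ebx, 1             ; report the failed fork
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

The parent exits with the child's exit code, so echo $? should print 7 once the child has finished.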

Lesson 21

Telling the time

Generating a unix timestamp in NASM is easy with the SYS_TIME function of the linux kernel. Simply pass OPCODE 13 to the kernel with no arguments and the current Unix time is returned in the EAX register.

That is the number of seconds that have elapsed since January 1st 1970 UTC.

time.asm
; Time
; Compile with: nasm -f elf time.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 time.o -o time
; Run with: ./time
 
%include        'functions.asm'
 
SECTION .data
msg        db      'Seconds since Jan 01 1970: ', 0h    ; a message string
 
SECTION .text
global  _start
 
_start:
 
    mov     eax, msg       ; move our message string into eax for printing
    call    sprint         ; call our string printing function
 
    mov     eax, 13        ; invoke SYS_TIME (kernel opcode 13)
    int     80h            ; call the kernel
 
    call    iprintLF       ; call our integer printing function with linefeed
    call    quit           ; call our quit function
~$ nasm -f elf time.asm
~$ ld -m elf_i386 time.o -o time
~$ ./time
Seconds since Jan 01 1970: 1374995660
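
Once we have the number of seconds, ordinary division turns it into something more readable. As a small sketch (not part of the tutorial files; the file name is hypothetical), the program below works out the current hour of the day in UTC and returns it as the exit status. It also clears EBX first, since the time system call treats EBX as an optional pointer to a place to store the result and we do not want to use one here.

```nasm
; hour-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf hour-sketch.asm
; Link with:    ld -m elf_i386 hour-sketch.o -o hour-sketch
; Run with:     ./hour-sketch ; echo $?

SECTION .text
global  _start

_start:
    xor     ebx, ebx           ; null pointer - we only want the value returned in eax
    mov     eax, 13            ; invoke SYS_TIME (kernel opcode 13)
    int     80h                ; eax = seconds since Jan 01 1970 UTC

    xor     edx, edx           ; clear edx before dividing
    mov     ecx, 86400         ; seconds in one day
    div     ecx                ; edx = seconds since midnight UTC
    mov     eax, edx           ; divide that remainder again
    xor     edx, edx
    mov     ecx, 3600          ; seconds in one hour
    div     ecx                ; eax = current hour (0-23)

    mov     ebx, eax           ; use the hour as the exit status
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Typing echo $? after running it should print a number between 0 and 23.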

Lesson 22

File Handling - Create

Firstly, some background

File Handling in Linux is achieved through a small number of system calls related to creating, updating and deleting files. Most of these functions work with a file descriptor, which is a non-negative integer that identifies an open file within the current process.

Writing our program

We begin the tutorial by creating a file using sys_creat. We will then build upon our program in each of the following file handling lessons, adding code as we go. Eventually we will have a full program that can create, update, open, close and delete files.

sys_creat expects 2 arguments - the file permissions in ECX and the filename in EBX. The sys_creat opcode is then loaded into EAX and the kernel is called to create the file. The file descriptor of the created file is returned in EAX. This file descriptor can then be used for all other file handling functions.

create.asm
; Create
; Compile with: nasm -f elf create.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 create.o -o create
; Run with: ./create
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to create
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0777o         ; set all permissions to read, write, execute
    mov     ebx, filename      ; filename we will create
    mov     eax, 8             ; invoke SYS_CREAT (kernel opcode 8)
    int     80h                ; call the kernel
 
    call    quit               ; call our quit function
~$ nasm -f elf create.asm
~$ ld -m elf_i386 create.o -o create
~$ ./create

Note: The file 'readme.txt' will now have been created in the folder.
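
The program above assumes the file was created. When a linux system call fails, EAX holds a negative error code instead of a file descriptor, so a simple check is possible. The following is a minimal sketch of such a check; the file name create-check-sketch.asm and the choice of exit statuses are assumptions made for the example.

```nasm
; create-check-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf create-check-sketch.asm
; Link with:    ld -m elf_i386 create-check-sketch.o -o create-check-sketch
; Run with:     ./create-check-sketch ; echo $?

SECTION .data
filename db 'readme.txt', 0h   ; the filename to create

SECTION .text
global  _start

_start:
    mov     ecx, 0777o         ; set all permissions to read, write, execute
    mov     ebx, filename      ; filename we will create
    mov     eax, 8             ; invoke SYS_CREAT (kernel opcode 8)
    int     80h                ; eax = file descriptor, or a negative error code

    cmp     eax, 0             ; did the kernel return an error?
    jl      .failed            ; negative value - the file was not created
    mov     ebx, 0             ; exit status 0 for success
    jmp     .exit

.failed:
    mov     ebx, 1             ; exit status 1 for failure

.exit:
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Running it in a folder you can write to should give an exit status of 0. In a folder where you are not allowed to create files it should give 1.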


Lesson 23

File Handling - Write

Building upon the previous lesson we will now use sys_write to write content to a newly created file.

sys_write expects 3 arguments - the number of bytes to write in EDX, the contents string to write in ECX and the file descriptor in EBX. The sys_write opcode is then loaded into EAX and the kernel is called to write the content to the file. In this lesson we will first call sys_creat to get a file descriptor which we will then load into EBX.

write.asm
; Write
; Compile with: nasm -f elf write.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 write.o -o write
; Run with: ./write
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to create
contents db 'Hello world!', 0h ; the contents to write
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0777o         ; code continues from lesson 22
    mov     ebx, filename
    mov     eax, 8
    int     80h
 
    mov     edx, 12            ; number of bytes to write - one for each letter of our contents string
    mov     ecx, contents      ; move the memory address of our contents string into ecx
    mov     ebx, eax           ; move the file descriptor of the file we created into ebx
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel
 
    call    quit               ; call our quit function
~$ nasm -f elf write.asm
~$ ld -m elf_i386 write.o -o write
~$ ./write

Note: Open the newly created file 'readme.txt' in this folder and you will see the content 'Hello world!'.
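
Counting the letters by hand works for a short string, but NASM can also calculate the length for us with the equ directive and the $ symbol (the current position in the section). This is a sketch of that idea rather than a change to write.asm; the name contentsLen and the file name are hypothetical, and the number of bytes written is returned as the exit status so we can see the result.

```nasm
; write-length-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf write-length-sketch.asm
; Link with:    ld -m elf_i386 write-length-sketch.o -o write-length-sketch
; Run with:     ./write-length-sketch ; echo $?

SECTION .data
filename    db      'readme.txt', 0h     ; the filename to create
contents    db      'Hello world!'       ; the contents to write
contentsLen equ     $ - contents         ; length worked out by the assembler (12 here)

SECTION .text
global  _start

_start:
    mov     ecx, 0777o         ; create the file as in lesson 22
    mov     ebx, filename
    mov     eax, 8             ; invoke SYS_CREAT (kernel opcode 8)
    int     80h

    mov     edx, contentsLen   ; number of bytes to write
    mov     ecx, contents      ; memory address of our contents string
    mov     ebx, eax           ; file descriptor returned by SYS_CREAT
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; eax = number of bytes written, or a negative error code

    mov     ebx, eax           ; use the byte count as the exit status
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

If the file could be created and written, the exit status should be 12. No error handling is included, so a failed system call shows up as a large, unexpected status instead.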


Lesson 24

File Handling - Open

Building upon the previous lesson we will now use sys_open to obtain the file descriptor of the newly created file. This file descriptor can then be used for all other file handling functions.

sys_open expects 2 arguments - the access mode (table below) in ECX and the filename in EBX. The sys_open opcode is then loaded into EAX and the kernel is called to open the file and return the file descriptor.

sys_open additionally accepts zero or more file creation flags and file status flags, which are combined (bitwise OR) with the access mode in ECX. When the call may create a file, the permissions for the new file are passed in EDX. See the open(2) manual page for more information about the access mode, file creation flags and file status flags.

Flag        Description                          Value
O_RDONLY    open file in read only mode          0
O_WRONLY    open file in write only mode         1
O_RDWR      open file in read and write mode     2

Note: sys_open returns the file descriptor in EAX. On linux this will be a unique, non-negative integer which we will print using our integer printing function.

open.asm
; Open
; Compile with: nasm -f elf open.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 open.o -o open
; Run with: ./open
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to create
contents db 'Hello world!', 0h ; the contents to write
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0777o         ; Create file from lesson 22
    mov     ebx, filename
    mov     eax, 8
    int     80h
 
    mov     edx, 12            ; Write contents to file from lesson 23
    mov     ecx, contents
    mov     ebx, eax
    mov     eax, 4
    int     80h
 
    mov     ecx, 0             ; flag for readonly access mode (O_RDONLY)
    mov     ebx, filename      ; filename we created above
    mov     eax, 5             ; invoke SYS_OPEN (kernel opcode 5)
    int     80h                ; call the kernel
 
    call    iprintLF           ; call our integer printing function
    call    quit               ; call our quit function
~$ nasm -f elf open.asm
~$ ld -m elf_i386 open.o -o open
~$ ./open
4
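
To show how extra flags are combined with the access mode, here is a sketch that opens readme.txt for appending, creating it first if it does not exist. The flags are ORed together in ECX and the permissions for a newly created file go in EDX. The numeric values for O_CREAT and O_APPEND are the ones used by 32bit x86 linux and should be treated as assumptions to check against your own system's headers. The file name, the appended text and the names line and lineLen are made up for the example.

```nasm
; open-append-sketch.asm (hypothetical file name)
; Compile with: nasm -f elf open-append-sketch.asm
; Link with:    ld -m elf_i386 open-append-sketch.o -o open-append-sketch
; Run with:     ./open-append-sketch

O_WRONLY    equ     1          ; open file in write only mode
O_CREAT     equ     100o       ; create the file if it does not exist (x86 linux value)
O_APPEND    equ     2000o      ; always write at the end of the file (x86 linux value)

SECTION .data
filename    db      'readme.txt', 0h     ; the file to open
line        db      'appended', 0Ah      ; text to append, followed by a linefeed
lineLen     equ     $ - line             ; length worked out by the assembler

SECTION .text
global  _start

_start:
    mov     edx, 0644o                           ; permissions, only used if the file is created
    mov     ecx, O_WRONLY | O_CREAT | O_APPEND   ; access mode combined with two extra flags
    mov     ebx, filename                        ; filename to open
    mov     eax, 5                               ; invoke SYS_OPEN (kernel opcode 5)
    int     80h                                  ; eax = file descriptor (no error check in this sketch)

    mov     edx, lineLen       ; number of bytes to write
    mov     ecx, line          ; memory address of the text
    mov     ebx, eax           ; file descriptor returned by SYS_OPEN
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h

    mov     ebx, 0             ; exit status 0
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
```

Each run should add one more line containing 'appended' to the end of readme.txt.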

Lesson 25

File Handling - Read

Building upon the previous lesson we will now use sys_read to read the content of a newly created and opened file. We will store this string in a variable.

sys_read expects 3 arguments - the number of bytes to read in EDX, the memory address of our variable in ECX and the file descriptor in EBX. We will use the previous lesson's sys_open code to obtain the file descriptor which we will then load into EBX. The sys_read opcode is then loaded into EAX and the kernel is called to read the file contents into our variable, which is then printed to the screen.

Note: We will reserve 255 bytes in the .bss section to store the contents of the file. See Lesson 9 for more information on the .bss section.

read.asm
; Read
; Compile with: nasm -f elf read.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 read.o -o read
; Run with: ./read
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to create
contents db 'Hello world!', 0h ; the contents to write
 
SECTION .bss
fileContents resb 255          ; variable to store file contents
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0777o         ; Create file from lesson 22
    mov     ebx, filename
    mov     eax, 8
    int     80h
 
    mov     edx, 12            ; Write contents to file from lesson 23
    mov     ecx, contents
    mov     ebx, eax
    mov     eax, 4
    int     80h
 
    mov     ecx, 0             ; Open file from lesson 24
    mov     ebx, filename
    mov     eax, 5
    int     80h
 
    mov     edx, 12            ; number of bytes to read - one for each letter of the file contents
    mov     ecx, fileContents  ; move the memory address of our file contents variable into ecx
    mov     ebx, eax           ; move the opened file descriptor into EBX
    mov     eax, 3             ; invoke SYS_READ (kernel opcode 3)
    int     80h                ; call the kernel
 
    mov     eax, fileContents  ; move the memory address of our file contents variable into eax for printing
    call    sprintLF           ; call our string printing function
 
    call    quit               ; call our quit function
~$ nasm -f elf read.asm
~$ ld -m elf_i386 read.o -o read
~$ ./read
Hello world!
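
SYS_READ returns the number of bytes it actually read in EAX (or a negative error code), which the listing above does not inspect. The following is a minimal sketch, not part of the original lesson, that checks the return values and writes exactly as many bytes as were read. It assumes readme.txt already exists (for example after running read.asm). The file name readcheck.asm and the labels .fail and .exit are made up for illustration, and the program avoids functions.asm so it can be assembled on its own.

```nasm
; readcheck.asm (hypothetical file name)
; Compile with: nasm -f elf readcheck.asm
; Link with: ld -m elf_i386 readcheck.o -o readcheck

SECTION .data
filename db 'readme.txt', 0h   ; assumed to exist already

SECTION .bss
fileContents resb 255          ; buffer for the bytes we read

SECTION .text
global  _start

_start:

    mov     ecx, 0             ; O_RDONLY
    mov     ebx, filename
    mov     eax, 5             ; SYS_OPEN
    int     80h
    cmp     eax, 0
    jl      .fail              ; a negative return value means the open failed

    mov     edx, 255           ; read at most 255 bytes
    mov     ecx, fileContents
    mov     ebx, eax           ; file descriptor returned by SYS_OPEN
    mov     eax, 3             ; SYS_READ
    int     80h
    cmp     eax, 0
    jle     .fail              ; 0 = nothing to read, negative = error

    mov     edx, eax           ; write exactly the number of bytes read
    mov     ecx, fileContents
    mov     ebx, 1             ; STDOUT
    mov     eax, 4             ; SYS_WRITE
    int     80h

    mov     ebx, 0             ; exit status 0 (success)
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1 (something went wrong)

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

Running echo $? after the program shows the exit status, which makes it easy to see whether the open or the read failed.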

Lesson 26

File Handling - Close

Building upon the previous lesson we will now use sys_close to properly close an open file.

sys_close expects 1 argument - the file descriptor in EBX. We will use the previous lesson's code to obtain the file descriptor which we will then load into EBX. The sys_close opcode is then loaded into EAX and the kernel is called to close the file and remove the active file descriptor.

close.asm
; Close
; Compile with: nasm -f elf close.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 close.o -o close
; Run with: ./close
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to create
contents db 'Hello world!', 0h ; the contents to write
 
SECTION .bss
fileContents resb 255          ; variable to store file contents
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 0777o         ; Create file from lesson 22
    mov     ebx, filename
    mov     eax, 8
    int     80h
 
    mov     edx, 12            ; Write contents to file from lesson 23
    mov     ecx, contents
    mov     ebx, eax
    mov     eax, 4
    int     80h
 
    mov     ecx, 0             ; Open file from lesson 24
    mov     ebx, filename
    mov     eax, 5
    int     80h
 
    mov     edx, 12            ; Read file from lesson 25
    mov     ecx, fileContents
    mov     ebx, eax
    mov     eax, 3
    int     80h
 
    mov     eax, fileContents
    call    sprintLF
 
    mov     ebx, ebx           ; not needed but used to demonstrate that SYS_CLOSE takes a file descriptor from EBX
    mov     eax, 6             ; invoke SYS_CLOSE (kernel opcode 6)
    int     80h                ; call the kernel
 
    call    quit               ; call our quit function
~$ nasm -f elf close.asm
~$ ld -m elf_i386 close.o -o close
~$ ./close
Hello world!

Note: We have properly closed the file and removed the active file descriptor.
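
One way to see that the descriptor really is removed is to close it twice. The sketch below is an illustration rather than part of the lesson: it assumes readme.txt exists, keeps the descriptor in ESI (an arbitrary choice), and uses made-up labels .fail and .exit. SYS_CLOSE returns 0 in EAX on success, and the second call is expected to return a negative error code (EBADF) because the descriptor is no longer open.

```nasm
; closetwice.asm (hypothetical file name)
; Compile with: nasm -f elf closetwice.asm
; Link with: ld -m elf_i386 closetwice.o -o closetwice

SECTION .data
filename db 'readme.txt', 0h   ; assumed to exist already

SECTION .text
global  _start

_start:

    mov     ecx, 0             ; O_RDONLY
    mov     ebx, filename
    mov     eax, 5             ; SYS_OPEN
    int     80h
    cmp     eax, 0
    jl      .fail              ; open failed
    mov     esi, eax           ; keep the file descriptor in ESI

    mov     ebx, esi
    mov     eax, 6             ; SYS_CLOSE
    int     80h
    cmp     eax, 0
    jne     .fail              ; the first close should return 0

    mov     ebx, esi           ; same descriptor number, now closed
    mov     eax, 6             ; SYS_CLOSE again
    int     80h
    cmp     eax, 0
    jge     .fail              ; the second close should fail with a negative value

    mov     ebx, 0             ; exit status 0: behaviour was as expected
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```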


Lesson 27

File Handling - Seek

In this lesson we will open a file and update the file contents at the end of the file using sys_lseek.

Using sys_lseek you can move the cursor within the file by an offset in bytes. The example below moves the cursor to the end of the file by passing SEEK_END as the whence argument and 0 bytes as the offset (so we append at the end of the file and not beyond it), then writes a string at that position. Try different values in ECX and EDX to write the content to different positions within the opened file.

sys_lseek expects 3 arguments - the whence argument (table below) in EDX, the offset in bytes in ECX, and the file descriptor in EBX. The sys_lseek opcode is then loaded into EAX and we call the kernel to move the file pointer to the correct offset. We then use sys_write to update the content at that position.

            Description              Value
SEEK_SET    beginning of the file    0
SEEK_CUR    current file offset      1
SEEK_END    end of the file          2

Note: A file 'readme.txt' has been included in the code folder for this lesson. This file will be updated after running the program.

seek.asm
; Seek
; Compile with: nasm -f elf seek.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 seek.o -o seek
; Run with: ./seek
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to open
contents db '-updated-', 0h    ; the contents to write at the end of the file
 
SECTION .text
global  _start
 
_start:
 
    mov     ecx, 1             ; flag for writeonly access mode (O_WRONLY)
    mov     ebx, filename      ; filename of the file to open
    mov     eax, 5             ; invoke SYS_OPEN (kernel opcode 5)
    int     80h                ; call the kernel
 
    mov     edx, 2             ; whence argument (SEEK_END)
    mov     ecx, 0             ; move the cursor 0 bytes
    mov     ebx, eax           ; move the opened file descriptor into EBX
    mov     eax, 19            ; invoke SYS_LSEEK (kernel opcode 19)
    int     80h                ; call the kernel
 
    mov     edx, 9             ; number of bytes to write - one for each letter of our contents string
    mov     ecx, contents      ; move the memory address of our contents string into ecx
    mov     ebx, ebx           ; move the opened file descriptor into EBX (not required as EBX already has the FD)
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel
 
    call    quit               ; call our quit function
~$ nasm -f elf seek.asm
~$ ld -m elf_i386 seek.o -o seek
~$ ./seek
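
As a variation on the suggestion to try other values in ECX and EDX, the sketch below seeks relative to the beginning of the file (SEEK_SET) and overwrites five bytes starting at offset 6. It is an illustration only: the names patch and patchLen and the labels .fail and .exit are made up, and it assumes readme.txt exists. SYS_LSEEK returns the resulting offset in EAX, so the program checks that it landed where it asked to.

```nasm
; seekset.asm (hypothetical file name)
; Compile with: nasm -f elf seekset.asm
; Link with: ld -m elf_i386 seekset.o -o seekset

SECTION .data
filename db 'readme.txt', 0h   ; assumed to exist already
patch    db 'WORLD'            ; bytes to write over the existing content
patchLen equ $ - patch         ; length computed by the assembler (5)

SECTION .text
global  _start

_start:

    mov     ecx, 1             ; O_WRONLY
    mov     ebx, filename
    mov     eax, 5             ; SYS_OPEN
    int     80h
    cmp     eax, 0
    jl      .fail              ; open failed

    mov     edx, 0             ; whence argument (SEEK_SET)
    mov     ecx, 6             ; offset: 6 bytes from the start of the file
    mov     ebx, eax           ; file descriptor returned by SYS_OPEN
    mov     eax, 19            ; SYS_LSEEK
    int     80h
    cmp     eax, 6             ; the new offset is returned in EAX
    jne     .fail

    mov     edx, patchLen      ; number of bytes to write
    mov     ecx, patch
    mov     eax, 4             ; SYS_WRITE (EBX still holds the descriptor)
    int     80h
    cmp     eax, patchLen      ; SYS_WRITE returns the number of bytes written
    jne     .fail

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

If readme.txt contained 'Hello world!' beforehand, the five bytes at offsets 6 to 10 are replaced and the file should then read 'Hello WORLD!'.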

Lesson 28

File Handling - Delete

Deleting a file on Linux is achieved by calling sys_unlink.

sys_unlink expects 1 argument - the filename in EBX. The sys_unlink opcode is then loaded into EAX and the kernel is called to delete the file.

Note: A file 'readme.txt' has been included in the code folder for this lesson. This file will be deleted after running the program.

unlink.asm
; Unlink
; Compile with: nasm -f elf unlink.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 unlink.o -o unlink
; Run with: ./unlink
 
%include    'functions.asm'
 
SECTION .data
filename db 'readme.txt', 0h   ; the filename to delete
 
SECTION .text
global  _start
 
_start:
 
    mov     ebx, filename      ; filename we will delete
    mov     eax, 10            ; invoke SYS_UNLINK (kernel opcode 10)
    int     80h                ; call the kernel
 
    call    quit               ; call our quit function
~$ nasm -f elf unlink.asm
~$ ld -m elf_i386 unlink.o -o unlink
~$ ./unlink
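
The program above prints nothing, so it is hard to tell whether the delete worked. The sketch below, which is not part of the lesson, creates its own scratch file, deletes it, and then tries to delete it again. SYS_UNLINK returns 0 in EAX on success and a negative error code on failure, so the second call is expected to fail because the file no longer exists. The file name scratch.txt and the labels .fail and .exit are made up for illustration.

```nasm
; unlinkcheck.asm (hypothetical file name)
; Compile with: nasm -f elf unlinkcheck.asm
; Link with: ld -m elf_i386 unlinkcheck.o -o unlinkcheck

SECTION .data
filename db 'scratch.txt', 0h  ; hypothetical scratch file, created and deleted here

SECTION .text
global  _start

_start:

    mov     ecx, 0644o         ; permissions for the new file
    mov     ebx, filename
    mov     eax, 8             ; SYS_CREAT
    int     80h
    cmp     eax, 0
    jl      .fail              ; could not create the file

    mov     ebx, eax           ; close the descriptor returned by SYS_CREAT
    mov     eax, 6             ; SYS_CLOSE
    int     80h

    mov     ebx, filename
    mov     eax, 10            ; SYS_UNLINK
    int     80h
    cmp     eax, 0
    jne     .fail              ; the first delete should return 0

    mov     ebx, filename
    mov     eax, 10            ; SYS_UNLINK again
    int     80h
    cmp     eax, 0
    jge     .fail              ; the second delete should fail (file is gone)

    mov     ebx, 0             ; exit status 0: behaviour was as expected
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```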

Lesson 29

Sockets - Create

Firstly, some background

Socket programming in Linux is achieved through the use of the SYS_SOCKETCALL kernel function. The SYS_SOCKETCALL function is somewhat unique in that it encapsulates a number of different subroutines, all related to socket operations, within the one function. By passing different integer values in EBX we can change the behaviour of this function to create, listen, send, receive, close and more.

Writing our program

We begin the tutorial by first initializing some of our registers which we will use later to store important values. We will then create a socket using SYS_SOCKETCALL's first subroutine which is called 'socket'. We will then build upon our program in each of the following socket programming lessons, adding code as we go. Eventually we will have a full program that can create, bind, listen, accept, read, write and close sockets.

SYS_SOCKETCALL's subroutine 'socket' expects 2 arguments - a pointer to an array of arguments in ECX and the integer value 1 in EBX. The SYS_SOCKETCALL opcode is then loaded into EAX and the kernel is called to create the socket. Because everything in Linux is a file, we receive back the file descriptor of the created socket in EAX. This file descriptor can then be used for performing other socket programming functions.

Note: XORing a register with itself is an efficient way of ensuring the register is initialized with the integer value zero and doesn't contain an unexpected value that could corrupt your program.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; init eax 0
    xor     ebx, ebx           ; init ebx 0
    xor     edi, edi           ; init edi 0
    xor     esi, esi           ; init esi 0
 
_socket:
 
    push    byte 6             ; push 6 onto the stack (IPPROTO_TCP)
    push    byte 1             ; push 1 onto the stack (SOCK_STREAM)
    push    byte 2             ; push 2 onto the stack (PF_INET)
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 1             ; invoke subroutine SOCKET (1)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
    call    iprintLF           ; call our integer printing function (print the file descriptor returned in EAX; the kernel returns a negative error code on failure)
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket
3
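
The three pushes in the listing build the argument array on the stack. Because the stack grows downwards, pushing 6, then 1, then 2 leaves the values 2, 1, 6 at ascending addresses starting at ESP. The sketch below, an illustration rather than the lesson's method, lays out the same array in the .data section so the order is visible at a glance. The name socketArgs and the labels .fail and .exit are made up, and the program closes the socket again before exiting.

```nasm
; socketargs.asm (hypothetical file name)
; Compile with: nasm -f elf socketargs.asm
; Link with: ld -m elf_i386 socketargs.o -o socketargs

SECTION .data
socketArgs dd 2, 1, 6          ; PF_INET, SOCK_STREAM, IPPROTO_TCP (same order as in memory after the pushes)

SECTION .text
global  _start

_start:

    mov     ecx, socketArgs    ; pointer to the array of arguments
    mov     ebx, 1             ; subroutine SOCKET (1)
    mov     eax, 102           ; SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jl      .fail              ; a negative return value means no socket was created

    mov     ebx, eax           ; file descriptor of the new socket
    mov     eax, 6             ; SYS_CLOSE
    int     80h

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```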

Lesson 30

Sockets - Bind

Building on the previous lesson we will now associate the created socket with a local IP address and port which will allow us to connect to it. We do this by calling the second subroutine of SYS_SOCKETCALL which is called 'bind'.

We begin by storing the file descriptor we received in lesson 29 into EDI. EDI was originally called the Destination Index and is traditionally used in copy routines to hold the address of the destination.

SYS_SOCKETCALL's subroutine 'bind' expects 2 arguments - a pointer to an array of arguments in ECX and the integer value 2 in EBX. The SYS_SOCKETCALL opcode is then loaded into EAX and the kernel is called to bind the socket.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; move return value of SYS_SOCKETCALL into edi (file descriptor for new socket, or a negative error code on failure)
    push    dword 0x00000000   ; push 0 dec onto the stack IP ADDRESS (0.0.0.0)
    push    word 0x2923        ; push 9001 dec onto stack PORT (reverse byte order)
    push    word 2             ; push 2 dec onto stack AF_INET
    mov     ecx, esp           ; move address of stack pointer into ecx
    push    byte 16            ; push 16 dec onto stack (arguments length)
    push    ecx                ; push the address of arguments onto stack
    push    edi                ; push the file descriptor onto stack
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 2             ; invoke subroutine BIND (2)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket
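
The value 0x2923 in the listing is port 9001 (0x2329) with its two bytes swapped, because the port must be stored in network byte order (most significant byte first) while x86 stores words least significant byte first. The sketch below lets the assembler do the swap from a readable constant and lays the address structure out in the .data section. It is an illustration under assumptions: the names PORT, sockaddr, sockaddrLen, socketArgs and bindArgs and the labels .fail and .exit are made up, and the 8 padding bytes are added so the structure is a full 16 bytes.

```nasm
; bindport.asm (hypothetical file name)
; Compile with: nasm -f elf bindport.asm
; Link with: ld -m elf_i386 bindport.o -o bindport

PORT equ 9001                  ; port number in ordinary decimal

SECTION .data
sockaddr:
    dw 2                                   ; AF_INET
    dw ((PORT & 0xFF) << 8) | (PORT >> 8)  ; port with bytes swapped (0x2923 for 9001)
    dd 0                                   ; IP ADDRESS (0.0.0.0)
    dq 0                                   ; padding up to 16 bytes
sockaddrLen equ $ - sockaddr               ; 16, computed by the assembler
socketArgs  dd 2, 1, 6                     ; PF_INET, SOCK_STREAM, IPPROTO_TCP

SECTION .bss
bindArgs resd 3                ; file descriptor, address pointer, address length

SECTION .text
global  _start

_start:

    mov     ecx, socketArgs
    mov     ebx, 1             ; subroutine SOCKET (1)
    mov     eax, 102           ; SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jl      .fail              ; no socket was created

    mov     [bindArgs], eax                 ; first argument: the socket descriptor
    mov     dword [bindArgs + 4], sockaddr  ; second argument: pointer to the address
    mov     dword [bindArgs + 8], sockaddrLen ; third argument: length of the address
    mov     ecx, bindArgs
    mov     ebx, 2             ; subroutine BIND (2)
    mov     eax, 102           ; SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jne     .fail              ; bind returns 0 on success

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1 (for example, the port is already in use)

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

Changing PORT to another value is then a one-line edit, with no need to work out the swapped hexadecimal form by hand.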

Lesson 31

Sockets - Listen

In the previous lessons we created a socket and used the 'bind' subroutine to associate it with a local IP address and port. In this lesson we will use the 'listen' subroutine of SYS_SOCKETCALL to tell our socket to listen for incoming TCP requests. This will allow us to read and write to anyone who connects to our socket.

SYS_SOCKETCALL's subroutine 'listen' expects 2 arguments - a pointer to an array of arguments in ECX and the integer value 4 in EBX. The SYS_SOCKETCALL opcode is then loaded into EAX and the kernel is called. If successful the socket will begin listening for incoming requests.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; bind socket from lesson 30
    push    dword 0x00000000
    push    word 0x2923
    push    word 2
    mov     ecx, esp
    push    byte 16
    push    ecx
    push    edi
    mov     ecx, esp
    mov     ebx, 2
    mov     eax, 102
    int     80h
 
_listen:
 
    push    byte 1             ; push 1 onto stack (max queue length argument)
    push    edi                ; push the file descriptor onto stack
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 4             ; invoke subroutine LISTEN (4)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket
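
As the program grows, the bare numbers 1, 2, 4 and 102 become harder to tell apart. The sketch below gives them names with equ and checks the result of each call. It is one possible arrangement, not the lesson's code: every constant name, the reusable callArgs array, the queue length of 5 and the labels .fail and .exit are assumptions made for illustration.

```nasm
; listennamed.asm (hypothetical file name)
; Compile with: nasm -f elf listennamed.asm
; Link with: ld -m elf_i386 listennamed.o -o listennamed

SYS_EXIT        equ 1
SYS_CLOSE       equ 6
SYS_SOCKETCALL  equ 102
CALL_SOCKET     equ 1          ; SYS_SOCKETCALL subroutine numbers
CALL_BIND       equ 2
CALL_LISTEN     equ 4
BACKLOG         equ 5          ; hypothetical max queue length (the lesson uses 1)

SECTION .data
socketArgs dd 2, 1, 6          ; PF_INET, SOCK_STREAM, IPPROTO_TCP
sockaddr   dw 2                ; AF_INET
           dw 0x2923           ; port 9001 (reverse byte order)
           dd 0                ; IP ADDRESS (0.0.0.0)
           dq 0                ; padding up to 16 bytes

SECTION .bss
callArgs resd 3                ; argument array reused for bind and listen

SECTION .text
global  _start

_start:

    mov     ecx, socketArgs
    mov     ebx, CALL_SOCKET
    mov     eax, SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jl      .fail
    mov     edi, eax           ; keep the socket descriptor in EDI

    mov     [callArgs], edi
    mov     dword [callArgs + 4], sockaddr
    mov     dword [callArgs + 8], 16
    mov     ecx, callArgs
    mov     ebx, CALL_BIND
    mov     eax, SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jne     .fail

    mov     dword [callArgs + 4], BACKLOG ; callArgs is now: descriptor, queue length
    mov     ecx, callArgs
    mov     ebx, CALL_LISTEN
    mov     eax, SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jne     .fail              ; listen returns 0 on success

    mov     ebx, edi           ; close the listening socket again
    mov     eax, SYS_CLOSE
    int     80h

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, SYS_EXIT
    int     80h
```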

Lesson 32

Sockets - Accept

In the previous lessons we created a socket and used the 'bind' subroutine to associate it with a local IP address and port. We then used the 'listen' subroutine of SYS_SOCKETCALL to tell our socket to listen for incoming TCP requests. Now we will use the 'accept' subroutine of SYS_SOCKETCALL to tell our socket to accept those incoming requests. Our socket will then be ready to read and write to remote connections.

SYS_SOCKETCALL's subroutine 'accept' expects 2 arguments - a pointer to an array of arguments in ECX and the integer value 5 in EBX. The SYS_SOCKETCALL opcode is then loaded into EAX and the kernel is called. The 'accept' subroutine will create another file descriptor, this time identifying the incoming socket connection. We will use this file descriptor to read and write to the incoming connection in later lessons.

Note: Run the program and use the command sudo netstat -plnt in another terminal to view the socket listening on port 9001.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; bind socket from lesson 30
    push    dword 0x00000000
    push    word 0x2923
    push    word 2
    mov     ecx, esp
    push    byte 16
    push    ecx
    push    edi
    mov     ecx, esp
    mov     ebx, 2
    mov     eax, 102
    int     80h
 
_listen:
 
    push    byte 1             ; listen socket from lesson 31
    push    edi
    mov     ecx, esp
    mov     ebx, 4
    mov     eax, 102
    int     80h
 
_accept:
 
    push    byte 0             ; push 0 dec onto stack (address length argument)
    push    byte 0             ; push 0 dec onto stack (address argument)
    push    edi                ; push the file descriptor onto stack
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 5             ; invoke subroutine ACCEPT (5)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket
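
The listing passes 0 for both the address and address length arguments, which tells the kernel we are not interested in who connected. The sketch below is a variation, not the lesson's code, that passes real pointers so the kernel fills in the address of the connecting client. The names peerAddr, peerLen and callArgs and the labels .fail and .exit are made up for illustration. The program blocks in 'accept' until a client connects (for example curl http://localhost:9001 in another terminal) and then exits without answering, so the client will report an empty reply or a reset connection.

```nasm
; acceptpeer.asm (hypothetical file name)
; Compile with: nasm -f elf acceptpeer.asm
; Link with: ld -m elf_i386 acceptpeer.o -o acceptpeer

SECTION .data
socketArgs dd 2, 1, 6          ; PF_INET, SOCK_STREAM, IPPROTO_TCP
sockaddr   dw 2                ; AF_INET
           dw 0x2923           ; port 9001 (reverse byte order)
           dd 0                ; IP ADDRESS (0.0.0.0)
           dq 0                ; padding up to 16 bytes

SECTION .bss
callArgs resd 3                ; argument array reused for each call
peerAddr resb 16               ; filled in by the kernel: family, port, IP address
peerLen  resd 1                ; in: size of peerAddr, out: bytes the kernel wrote

SECTION .text
global  _start

_start:

    mov     ecx, socketArgs
    mov     ebx, 1             ; subroutine SOCKET (1)
    mov     eax, 102           ; SYS_SOCKETCALL
    int     80h
    cmp     eax, 0
    jl      .fail
    mov     edi, eax           ; keep the listening socket in EDI

    mov     [callArgs], edi
    mov     dword [callArgs + 4], sockaddr
    mov     dword [callArgs + 8], 16
    mov     ecx, callArgs
    mov     ebx, 2             ; subroutine BIND (2)
    mov     eax, 102
    int     80h
    cmp     eax, 0
    jne     .fail

    mov     dword [callArgs + 4], 1 ; max queue length
    mov     ecx, callArgs
    mov     ebx, 4             ; subroutine LISTEN (4)
    mov     eax, 102
    int     80h
    cmp     eax, 0
    jne     .fail

    mov     dword [peerLen], 16           ; tell the kernel how big peerAddr is
    mov     dword [callArgs + 4], peerAddr ; address argument
    mov     dword [callArgs + 8], peerLen  ; address length argument (a pointer)
    mov     ecx, callArgs
    mov     ebx, 5             ; subroutine ACCEPT (5)
    mov     eax, 102
    int     80h                ; blocks until a client connects
    cmp     eax, 0
    jl      .fail              ; a negative value means accept failed

    cmp     word [peerAddr], 2 ; the family reported for the client should be AF_INET
    jne     .fail

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

After a successful call, bytes 2 and 3 of peerAddr hold the client's port in network byte order and bytes 4 to 7 hold its IPv4 address.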

Lesson 33

Sockets - Read

When an incoming connection is accepted by our socket, a new file descriptor identifying the incoming socket connection is returned in EAX. In this lesson we will use this file descriptor to read the incoming request headers from the connection.

We begin by storing the file descriptor we received in lesson 32 into ESI. ESI was originally called the Source Index and is traditionally used in copy routines to hold the address of the source.

We will use the kernel function sys_read to read from the incoming socket connection. As we have done in previous lessons, we will create a variable to store the contents being read from the file descriptor. Our socket will be using the HTTP protocol to communicate. Parsing HTTP request headers to determine the length of the incoming message and accepted response formats is beyond the scope of this tutorial. We will instead just read up to the first 255 bytes and print them to standard output.

Once the incoming connection has been accepted, it is very common for webservers to spawn a child process to manage the read/write communication. The parent process is then free to return to the listening/accept state and accept any new incoming requests in parallel. We will implement this design pattern below using SYS_FORK and the JMP instruction prior to reading the request headers in the child process.

To generate valid request headers we will use the command-line tool curl to connect to our listening socket. But you can also use a standard web browser to connect in the same way.

sys_read expects 3 arguments - the number of bytes to read in EDX, the memory address of our variable in ECX and the file descriptor in EBX. The sys_read opcode is then loaded into EAX and the kernel is called to read the contents into our variable which is then printed to the screen.

Note: We will reserve 255 bytes in the .bss section to store the contents being read from the file descriptor. See Lesson 9 for more information on the .bss section.

Note: Run the program and use the command curl http://localhost:9001 in another terminal to view the request headers being read by our program.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .bss
buffer resb 255                ; variable to store request headers
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; bind socket from lesson 30
    push    dword 0x00000000
    push    word 0x2923
    push    word 2
    mov     ecx, esp
    push    byte 16
    push    ecx
    push    edi
    mov     ecx, esp
    mov     ebx, 2
    mov     eax, 102
    int     80h
 
_listen:
 
    push    byte 1             ; listen socket from lesson 31
    push    edi
    mov     ecx, esp
    mov     ebx, 4
    mov     eax, 102
    int     80h
 
_accept:
 
    push    byte 0             ; accept socket from lesson 32
    push    byte 0
    push    edi
    mov     ecx, esp
    mov     ebx, 5
    mov     eax, 102
    int     80h
 
_fork:
 
    mov     esi, eax           ; move return value of SYS_SOCKETCALL into esi (file descriptor for accepted socket, or a negative error code on failure)
    mov     eax, 2             ; invoke SYS_FORK (kernel opcode 2)
    int     80h                ; call the kernel
 
    cmp     eax, 0             ; if return value of SYS_FORK in eax is zero we are in the child process
    jz      _read              ; jmp in child process to _read
 
    jmp     _accept            ; jmp in parent process to _accept
 
_read:
 
    mov     edx, 255           ; number of bytes to read (we will only read the first 255 bytes for simplicity)
    mov     ecx, buffer        ; move the memory address of our buffer variable into ecx
    mov     ebx, esi           ; move esi into ebx (accepted socket file descriptor)
    mov     eax, 3             ; invoke SYS_READ (kernel opcode 3)
    int     80h                ; call the kernel
 
    mov     eax, buffer        ; move the memory address of our buffer variable into eax for printing
    call    sprintLF           ; call our string printing function
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket
GET / HTTP/1.1
Host: localhost:9001
User-Agent: curl/x.xx.x
Accept: */*
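
The fork step can be tried on its own, without any sockets. SYS_FORK returns three kinds of value in EAX: zero in the child process, the child's process ID (a positive number) in the parent, and a negative error code if no child could be created. The sketch below is an illustration, not part of the lesson, in which each process prints a different word. The message names and the labels .child, .print, .fail and .exit are made up.

```nasm
; forkdemo.asm (hypothetical file name)
; Compile with: nasm -f elf forkdemo.asm
; Link with: ld -m elf_i386 forkdemo.o -o forkdemo

SECTION .data
childMsg  db 'child', 0Ah      ; printed by the child process
childLen  equ $ - childMsg
parentMsg db 'parent', 0Ah     ; printed by the parent process
parentLen equ $ - parentMsg

SECTION .text
global  _start

_start:

    mov     eax, 2             ; SYS_FORK
    int     80h

    cmp     eax, 0
    jl      .fail              ; negative: the fork failed, there is no child
    jz      .child             ; zero: we are in the child process

    mov     edx, parentLen     ; positive: we are in the parent (EAX holds the child's PID)
    mov     ecx, parentMsg
    jmp     .print

.child:
    mov     edx, childLen
    mov     ecx, childMsg

.print:
    mov     ebx, 1             ; STDOUT
    mov     eax, 4             ; SYS_WRITE
    int     80h

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

Both words are printed, but the order is up to the scheduler and can differ between runs.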

Lesson 34

Sockets - Write

When an incoming connection is accepted by our socket, a new file descriptor identifying the incoming socket connection is returned in EAX. In this lesson we will use this file descriptor to send our response to the connection.

We will use the kernel function sys_write to write to the incoming socket connection. As our socket will be communicating using the HTTP protocol, we will need to send some compulsory headers in order to allow HTTP-speaking clients to connect. We will send these following the formatting rules set out in the RFC standard.

sys_write expects 3 arguments - the number of bytes to write in EDX, the response string to write in ECX and the file descriptor in EBX. The sys_write opcode is then loaded into EAX and the kernel is called to send our response back through our socket to the incoming connection.

Note: We will create a variable in the .data section to store the response we will write to the file descriptor. See Lesson 1 for more information on the .data section.

Note: Run the program and use the command curl http://localhost:9001 in another terminal to view the response sent via our socket. Or connect to the same address using any standard web browser.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .data
; our response string
response db 'HTTP/1.1 200 OK', 0Dh, 0Ah, 'Content-Type: text/html', 0Dh, 0Ah, 'Content-Length: 14', 0Dh, 0Ah, 0Dh, 0Ah, 'Hello World!', 0Dh, 0Ah, 0h
 
SECTION .bss
buffer resb 255                ; variable to store request headers
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; bind socket from lesson 30
    push    dword 0x00000000
    push    word 0x2923
    push    word 2
    mov     ecx, esp
    push    byte 16
    push    ecx
    push    edi
    mov     ecx, esp
    mov     ebx, 2
    mov     eax, 102
    int     80h
 
_listen:
 
    push    byte 1             ; listen socket from lesson 31
    push    edi
    mov     ecx, esp
    mov     ebx, 4
    mov     eax, 102
    int     80h
 
_accept:
 
    push    byte 0             ; accept socket from lesson 32
    push    byte 0
    push    edi
    mov     ecx, esp
    mov     ebx, 5
    mov     eax, 102
    int     80h
 
_fork:
 
    mov     esi, eax           ; fork socket from lesson 33
    mov     eax, 2
    int     80h
 
    cmp     eax, 0
    jz      _read
 
    jmp     _accept
 
_read:
 
    mov     edx, 255           ; read socket from lesson 33
    mov     ecx, buffer
    mov     ebx, esi
    mov     eax, 3
    int     80h
 
    mov     eax, buffer
    call    sprintLF
 
_write:
 
    mov     edx, 78            ; move 78 dec into edx (length in bytes to write)
    mov     ecx, response      ; move address of our response variable into ecx
    mov     ebx, esi           ; move file descriptor into ebx (accepted socket id)
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket

New terminal window
~$ curl http://localhost:9001
Hello World!
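
The listing hard-codes two lengths that have to agree with the text: 78 bytes for the whole response in EDX, and 14 in the Content-Length header for the body ('Hello World!' plus the carriage return and linefeed). The trailing 0h byte is not sent because it lies beyond the 78 bytes. The sketch below, an illustration rather than the lesson's code, lets the assembler count the bytes with equ and $ so the totals can be checked. To stay self-contained it writes the response to standard output instead of a socket. The names body, bodyLen and responseLen and the labels .fail and .exit are made up.

```nasm
; responselen.asm (hypothetical file name)
; Compile with: nasm -f elf responselen.asm
; Link with: ld -m elf_i386 responselen.o -o responselen

SECTION .data
response db 'HTTP/1.1 200 OK', 0Dh, 0Ah
         db 'Content-Type: text/html', 0Dh, 0Ah
         db 'Content-Length: 14', 0Dh, 0Ah
         db 0Dh, 0Ah                       ; empty line that ends the headers
body     db 'Hello World!', 0Dh, 0Ah
bodyLen     equ $ - body                   ; 14, matches the Content-Length header
responseLen equ $ - response               ; 78, the value loaded into EDX in the lesson

SECTION .text
global  _start

_start:

    mov     edx, responseLen   ; number of bytes to write, counted by the assembler
    mov     ecx, response
    mov     ebx, 1             ; STDOUT (the lesson writes to the accepted socket in ESI)
    mov     eax, 4             ; SYS_WRITE
    int     80h
    cmp     eax, responseLen   ; SYS_WRITE returns the number of bytes written
    jne     .fail

    mov     ebx, 0             ; exit status 0
    jmp     .exit

.fail:
    mov     ebx, 1             ; exit status 1

.exit:
    mov     eax, 1             ; SYS_EXIT
    int     80h
```

If the body text is changed, responseLen follows automatically, but the number inside the Content-Length header is part of the string and still has to be updated by hand to match bodyLen.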

Lesson 35

Sockets - Close

In this lesson we will use sys_close to properly close the active socket connection in the child process after our response has been sent. This will free up some resources that can be used to accept new incoming connections.

sys_close expects 1 argument - the file descriptor in EBX. The sys_close opcode is then loaded into EAX and the kernel is called to close the socket and remove the active file descriptor.

Note: Run the program and use the command curl http://localhost:9001 in another terminal or connect to the same address using any standard web browser.

socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
 
%include    'functions.asm'
 
SECTION .data
; our response string
response db 'HTTP/1.1 200 OK', 0Dh, 0Ah, 'Content-Type: text/html', 0Dh, 0Ah, 'Content-Length: 14', 0Dh, 0Ah, 0Dh, 0Ah, 'Hello World!', 0Dh, 0Ah, 0h
 
SECTION .bss
buffer resb 255                ; variable to store request headers
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; initialize some registers
    xor     ebx, ebx
    xor     edi, edi
    xor     esi, esi
 
_socket:
 
    push    byte 6             ; create socket from lesson 29
    push    byte 1
    push    byte 2
    mov     ecx, esp
    mov     ebx, 1
    mov     eax, 102
    int     80h
 
_bind:
 
    mov     edi, eax           ; bind socket from lesson 30
    push    dword 0x00000000
    push    word 0x2923
    push    word 2
    mov     ecx, esp
    push    byte 16
    push    ecx
    push    edi
    mov     ecx, esp
    mov     ebx, 2
    mov     eax, 102
    int     80h
 
_listen:
 
    push    byte 1             ; listen socket from lesson 31
    push    edi
    mov     ecx, esp
    mov     ebx, 4
    mov     eax, 102
    int     80h
 
_accept:
 
    push    byte 0             ; accept socket from lesson 32
    push    byte 0
    push    edi
    mov     ecx, esp
    mov     ebx, 5
    mov     eax, 102
    int     80h
 
_fork:
 
    mov     esi, eax           ; fork socket from lesson 33
    mov     eax, 2
    int     80h
 
    cmp     eax, 0
    jz      _read
 
    jmp     _accept
 
_read:
 
    mov     edx, 255           ; read socket from lesson 33
    mov     ecx, buffer
    mov     ebx, esi
    mov     eax, 3
    int     80h
 
    mov     eax, buffer
    call    sprintLF
 
_write:
 
    mov     edx, 78            ; write socket from lesson 34
    mov     ecx, response
    mov     ebx, esi
    mov     eax, 4
    int     80h
 
_close:
 
    mov     ebx, esi           ; move esi into ebx (accepted socket file descriptor)
    mov     eax, 6             ; invoke SYS_CLOSE (kernel opcode 6)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf socket.asm
~$ ld -m elf_i386 socket.o -o socket
~$ ./socket

New terminal window
~$ curl http://localhost:9001
Hello World!

Note: We have properly closed the socket connections and removed their active file descriptors.

Lesson 36

Download a Webpage

In the previous lessons we have been learning how to use the many subroutines of the SYS_SOCKETCALL kernel function to create, manage and transfer data through Linux sockets. We will continue that theme in this lesson by using the 'connect' subroutine of SYS_SOCKETCALL to connect to a remote webserver and download a webpage.

These are the steps we need to follow to connect a socket to a remote server:

  • Call SYS_SOCKETCALL's subroutine 'socket' to create an active socket that we will use to send outbound requests.
  • Call SYS_SOCKETCALL's subroutine 'connect' to connect our socket with a socket on the remote webserver.
  • Use SYS_WRITE to send a HTTP formatted request through our socket to the remote webserver.
  • Use SYS_READ to receive the HTTP formatted response from the webserver.

We will then use our string printing function to print the response to our terminal.

What is a HTTP Request

The HTTP specification has evolved through a number of standard versions including 1.0 in RFC1945, 1.1 in RFC2068 and 2.0 in RFC7540. Version 1.1 is still the most common today.

A HTTP/1.1 request is comprised of 3 sections:

  1. A line containing the request method, request URL, and HTTP version
  2. A section of request headers (optional in general, although HTTP/1.1 requires at least a Host header)
  3. An empty line that tells the remote server you have finished sending the request and you will begin waiting for the response.

A typical HTTP request for the root document on this server would look like this:

GET / HTTP/1.1                 ; A line containing the request method, url and version
Host: asmtutor.com             ; A section of request headers
                                ; A required empty line
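
Each line of the request ends with a carriage return and linefeed (0Dh, 0Ah), and the empty line is just that pair on its own. The following standalone sketch shows how the three sections could be laid out as bytes in NASM and written to the terminal. The file name request_demo.asm and the requestLen constant are hypothetical names used for illustration only. Letting the assembler calculate the length with equ $ - request is one possible way to avoid counting the bytes by hand.

```nasm
; request_demo.asm (hypothetical file name, illustration only)
; Compile with: nasm -f elf request_demo.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 request_demo.o -o request_demo
; Run with: ./request_demo

SECTION .data
request     db      'GET / HTTP/1.1', 0Dh, 0Ah         ; 1. request method, url and version
            db      'Host: asmtutor.com', 0Dh, 0Ah     ; 2. request headers
            db      0Dh, 0Ah                           ; 3. the empty line that ends the request
requestLen  equ     $ - request                        ; length in bytes, calculated by the assembler

SECTION .text
global  _start

_start:

    mov     edx, requestLen    ; number of bytes to write
    mov     ecx, request       ; move the memory address of our request into ecx
    mov     ebx, 1             ; write to the STDOUT file
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel

    mov     ebx, 0             ; return 0 status on exit
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h                ; call the kernel
```

Writing to a connected socket instead of STDOUT only changes the file descriptor placed in EBX.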
Writing our program

This tutorial starts out like the previous ones by calling SYS_SOCKETCALL's subroutine 'socket' to initially create our socket. However, instead of calling 'bind' on this socket we will call 'connect' with an IP Address and Port Number to connect our socket to a remote webserver. We will then use the SYS_WRITE and SYS_READ kernel methods to transfer data between the two sockets by sending a HTTP request and reading the HTTP response.

SYS_SOCKETCALL's subroutine 'connect' expects 2 arguments - a pointer to an array of arguments in ECX and the integer value 3 in EBX. The SYS_SOCKETCALL opcode is then loaded into EAX and the kernel is called to connect to the socket.
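
The array of arguments for 'connect' holds the socket file descriptor, a pointer to an address structure and the length of that structure (16 bytes). As a sketch of that layout, the standalone program below keeps the address structure and the argument array in named memory rather than building them on the stack. The names connect_demo.asm, sockaddr, sockaddrLen and args are hypothetical and chosen for illustration. Spelling the port and IP address out byte by byte shows why the values appear in reverse byte order when they are pushed as a word and a dword.

```nasm
; connect_demo.asm (hypothetical file name, illustration only)
; Compile with: nasm -f elf connect_demo.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 connect_demo.o -o connect_demo
; Run with: ./connect_demo ; echo $?

SECTION .data
sockaddr:
            dw      2                  ; address family AF_INET (2)
            db      0, 80              ; port 80, high byte first (network byte order)
            db      139, 162, 39, 66   ; IP address 139.162.39.66, one byte per octet
            dd      0, 0               ; 8 bytes of padding, 16 bytes in total
sockaddrLen equ     $ - sockaddr

SECTION .bss
args        resd    3                  ; array of 3 arguments for SYS_SOCKETCALL

SECTION .text
global  _start

_start:

    mov     dword [args], 2            ; PF_INET
    mov     dword [args+4], 1          ; SOCK_STREAM
    mov     dword [args+8], 6          ; IPPROTO_TCP
    mov     ecx, args                  ; move address of arguments into ecx
    mov     ebx, 1                     ; invoke subroutine SOCKET (1)
    mov     eax, 102                   ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                        ; eax = socket file descriptor, or a negative error code

    mov     dword [args], eax          ; argument 1: the socket file descriptor
    mov     dword [args+4], sockaddr   ; argument 2: address of the address structure
    mov     dword [args+8], sockaddrLen ; argument 3: length of the address structure (16)
    mov     ecx, args                  ; move address of arguments into ecx
    mov     ebx, 3                     ; invoke subroutine CONNECT (3)
    mov     eax, 102                   ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                        ; eax = 0 on success, or a negative error code

    xor     ebx, ebx                   ; assume exit status 0 (connected)
    test    eax, eax                   ; did connect return zero?
    jz      .done
    mov     ebx, 1                     ; exit status 1 if the connection failed

.done:
    mov     eax, 1                     ; invoke SYS_EXIT (kernel opcode 1)
    int     80h                        ; call the kernel
```

An exit status of 0 would mean the connection was made; the sketch needs network access to the remote server to succeed.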

Note: In Linux we can use the following command ./crawler > index.html to save the output of our program to a file instead.

crawler.asm
; Crawler
; Compile with: nasm -f elf crawler.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 crawler.o -o crawler
; Run with: ./crawler
 
%include    'functions.asm'
 
SECTION .data
; our request string
request db 'GET / HTTP/1.1', 0Dh, 0Ah, 'Host: 139.162.39.66:80', 0Dh, 0Ah, 0Dh, 0Ah, 0h
 
SECTION .bss
buffer resb 2                  ; variable to store response (1 byte read at a time, plus a zero byte so sprint sees a terminated string)
 
SECTION .text
global  _start
 
_start:
 
    xor     eax, eax           ; init eax 0
    xor     ebx, ebx           ; init ebx 0
    xor     edi, edi           ; init edi 0
 
_socket:
 
    push    byte 6             ; push 6 onto the stack (IPPROTO_TCP)
    push    byte 1             ; push 1 onto the stack (SOCK_STREAM)
    push    byte 2             ; push 2 onto the stack (PF_INET)
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 1             ; invoke subroutine SOCKET (1)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
_connect:
 
    mov     edi, eax           ; move return value of SYS_SOCKETCALL into edi (file descriptor for new socket, or -1 on error)
    push    dword 0x4227a28b   ; push 139.162.39.66 onto the stack IP ADDRESS (reverse byte order)
    push    word 0x5000        ; push 80 onto stack PORT (reverse byte order)
    push    word 2             ; push 2 dec onto stack AF_INET
    mov     ecx, esp           ; move address of stack pointer into ecx
    push    byte 16            ; push 16 dec onto stack (arguments length)
    push    ecx                ; push the address of arguments onto stack
    push    edi                ; push the file descriptor onto stack
    mov     ecx, esp           ; move address of arguments into ecx
    mov     ebx, 3             ; invoke subroutine CONNECT (3)
    mov     eax, 102           ; invoke SYS_SOCKETCALL (kernel opcode 102)
    int     80h                ; call the kernel
 
_write:
 
    mov     edx, 42            ; move 42 dec into edx (length in bytes to write, excluding the terminating 0h)
    mov     ecx, request       ; move address of our request variable into ecx
    mov     ebx, edi           ; move file descriptor into ebx (created socket file descriptor)
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel
 
_read:
 
    mov     edx, 1             ; number of bytes to read (we will read 1 byte at a time)
    mov     ecx, buffer        ; move the memory address of our buffer variable into ecx
    mov     ebx, edi           ; move edi into ebx (created socket file descriptor)
    mov     eax, 3             ; invoke SYS_READ (kernel opcode 3)
    int     80h                ; call the kernel
 
    cmp     eax, 0             ; if return value of SYS_READ in eax is zero, we have reached the end of the file
    jz      _close             ; jmp to _close if we have reached the end of the file (zero flag set)
 
    mov     eax, buffer        ; move the memory address of our buffer variable into eax for printing
    call    sprint             ; call our string printing function
    jmp     _read              ; jmp to _read
 
_close:
 
    mov     ebx, edi           ; move edi into ebx (connected socket file descriptor)
    mov     eax, 6             ; invoke SYS_CLOSE (kernel opcode 6)
    int     80h                ; call the kernel
 
_exit:
 
    call    quit               ; call our quit function
~$ nasm -f elf crawler.asm
~$ ld -m elf_i386 crawler.o -o crawler
~$ ./crawler
HTTP/1.1 200 OK
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
...
</html>
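
The read loop above asks the kernel for 1 byte per call. As a sketch of an alternative, the standalone program below uses the same loop shape with a larger buffer and writes exactly the number of bytes SYS_READ returned, so no terminating zero byte is needed. It reads from STDIN instead of a socket so that it can be run without a network connection. The file name readloop_demo.asm and the buffer size of 255 are assumptions made for illustration.

```nasm
; readloop_demo.asm (hypothetical file name, illustration only)
; Compile with: nasm -f elf readloop_demo.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 readloop_demo.o -o readloop_demo
; Run with: echo hello | ./readloop_demo

SECTION .bss
buffer      resb    255        ; variable to store each chunk of input

SECTION .text
global  _start

_start:

_read:

    mov     edx, 255           ; number of bytes to read (up to 255 at a time)
    mov     ecx, buffer        ; move the memory address of our buffer variable into ecx
    mov     ebx, 0             ; read from the STDIN file (a socket file descriptor would go here)
    mov     eax, 3             ; invoke SYS_READ (kernel opcode 3)
    int     80h                ; eax = number of bytes read, 0 at the end of input, negative on error

    cmp     eax, 0             ; did we read anything?
    jle     _exit              ; jmp to _exit at the end of input or on error

    mov     edx, eax           ; write exactly the number of bytes we read
    mov     ecx, buffer        ; move the memory address of our buffer variable into ecx
    mov     ebx, 1             ; write to the STDOUT file
    mov     eax, 4             ; invoke SYS_WRITE (kernel opcode 4)
    int     80h                ; call the kernel
    jmp     _read              ; jmp to _read

_exit:

    mov     ebx, 0             ; return 0 status on exit
    mov     eax, 1             ; invoke SYS_EXIT (kernel opcode 1)
    int     80h                ; call the kernel
```

Reading in larger chunks reduces the number of kernel calls; the trade-off is that the program must track the byte count itself instead of relying on a zero terminator.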