Hand-written Linux assembly: reading directories and files with system calls

TOC

  1. Introduction to the system calls
    1.1. sys_read
      1.1.1. NAME
      1.1.2. SYNOPSIS
      1.1.3. DESCRIPTION
      1.1.4. RETURN VALUE
      1.1.5. Summary
    1.2. sys_write
      1.2.1. NAME
      1.2.2. SYNOPSIS
      1.2.3. DESCRIPTION
      1.2.4. RETURN VALUE
      1.2.5. Summary
    1.3. sys_open
      1.3.1. SYNOPSIS
      1.3.2. DESCRIPTION
    1.4. sys_getdents
      1.4.1. SYNOPSIS
      1.4.2. DESCRIPTION
      1.4.3. Summary
  2. Reading a directory
    2.1. Source: read_dir.s
    2.2. Parser: resolve.c
    2.3. Result
  3. Reading a file
    3.1. Source: read_file.s
    3.2. Result
  4. Conclusion

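Since most people are not very familiar with AT&T syntax, I will not use the GNU assembler's native AT&T syntax here; all examples are written in Intel syntax.

When working on pwn challenges you will sometimes find that SYS_execve cannot be called, and sometimes there is no libc available either. In those situations, knowing how to issue system calls by hand is exactly what you need.

All files used in this experiment can be downloaded as one archive: read_file_by_syscall.zip

Introduction to the system calls

These are the system calls we will use: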

rax | System call  | rdi                  | rsi                         | rdx
0   | sys_read     | unsigned int fd      | char *buf                   | size_t count
1   | sys_write    | unsigned int fd      | const char *buf             | size_t count
2   | sys_open     | const char *filename | int flags                   | int mode
78  | sys_getdents | unsigned int fd      | struct linux_dirent *dirent | unsigned int count
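To make the register convention in the table concrete, here is a minimal C sketch that issues sys_write through glibc's generic syscall() wrapper, which takes the system call number first and then the arguments in the same order as rdi, rsi, rdx. The helper name raw_write is made up for illustration. One difference is worth knowing before moving to assembly: the wrapper reports failure as -1 plus errno, while the bare syscall instruction leaves a negative errno value (for example -EBADF) in rax.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: write `len` bytes of `msg` to `fd` by system
 * call number instead of going through the write() wrapper. */
long raw_write(int fd, const char *msg, unsigned long len)
{
    assert(msg != NULL);
    /* number -> rax, then the arguments in rdi, rsi, rdx order */
    return syscall(SYS_write, fd, msg, len);
}
```

Calling raw_write(1, "hello\n", 6) prints hello and returns 6. With an invalid descriptor such as -1 it returns -1 and sets errno to EBADF.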

sys_read

Official documentation: man2/read.2.html

NAME

read - read from a file descriptor

SYNOPSIS

#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count);

DESCRIPTION

read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.

On files that support seeking, the read operation commences at the file offset, and the file offset is incremented by the number of bytes read. If the file offset is at or past the end of file, no bytes are read, and read() returns zero.

If count is zero, read() may detect the errors described below. In the absence of any errors, or if read() does not check for errors, a read() with a count of 0 returns zero and has no other effects.

According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.

RETURN VALUE

On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.

On error, -1 is returned, and errno is set appropriately. In this case, it is left unspecified whether the file position (if any) changes.

Summary

fd is the file descriptor, buf is the address of the buffer to read into, and count is the maximum number of bytes to read.
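Because read() may return fewer bytes than requested, code that needs a fixed amount of data usually calls it in a loop. The following is a sketch of that pattern, not part of the original programs; the helper name read_full is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `count` bytes have
 * arrived, end of file is reached, or a real error occurs.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL);
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)
            break;                  /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

The assembly programs later in this article call sys_read only once, which is enough for short inputs but would truncate anything that arrives in several pieces.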

sys_write

Official documentation: man2/write.2.html

NAME

write - write to a file descriptor

SYNOPSIS

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t count);

DESCRIPTION

write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.

The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)

For a seekable file (i.e., one to which lseek(2) may be applied, for example, a regular file) writing takes place at the file offset, and the file offset is incremented by the number of bytes actually written. If the file was open(2)ed with O_APPEND, the file offset is first set to the end of the file before writing. The adjustment of the file offset and the write operation are performed as an atomic step.

POSIX requires that a read(2) that can be proved to occur after a write() has returned will return the new data. Note that not all filesystems are POSIX conforming.

According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper limit on Linux.

RETURN VALUE

On success, the number of bytes written is returned. On error, -1 is returned, and errno is set to indicate the cause of the error.

Note that a successful write() may transfer fewer than count bytes. Such partial writes can occur for various reasons; for example, because there was insufficient space on the disk device to write all of the requested bytes, or because a blocked write() to a socket, pipe, or similar was interrupted by a signal handler after it had transferred some, but before it had transferred all of the requested bytes. In the event of a partial write, the caller can make another write() call to transfer the remaining bytes. The subsequent call will either transfer further bytes or may result in an error (e.g., if the disk is now full).

If count is zero and fd refers to a regular file, then write() may return a failure status if one of the errors below is detected. If no errors are detected, or error detection is not performed, 0 will be returned without causing any other effect. If count is zero and fd refers to a file other than a regular file, the results are not specified.

Summary

fd is the file descriptor, buf is the address of the data to be written to fd, and count is the number of bytes to write.
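The man page text above points out that a write may be partial and that the caller can issue another write() for the rest. A minimal sketch of that retry loop follows; write_all is a hypothetical name chosen for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call write() repeatedly until all `count`
 * bytes are written. Returns count on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);
    size_t done = 0;
    while (done < count) {
        ssize_t n = write(fd, (const char *)buf + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;          /* partial write: continue after it */
    }
    return (ssize_t)done;
}
```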

sys_open

Official documentation: man2/open.2.html

SYNOPSIS

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

DESCRIPTION

The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only, write-only, or read/write, respectively.

In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC. The file status flags are all of the remaining flags listed below. The distinction between these two groups of flags is that the file creation flags affect the semantics of the open operation itself, while the file status flags affect the semantics of subsequent I/O operations. The file status flags can be retrieved and (in some cases) modified; see fcntl(2) for details.

Two values deserve special emphasis because we will hard-code them in assembly: on x86-64 Linux, O_RDONLY is 0 and O_DIRECTORY is 0x10000.
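Since these constants get hard-coded into assembly, it can help to check them against the system headers first. The sketch below does that at compile time on x86-64 Linux (some other architectures define O_DIRECTORY differently) and wraps the open call in a helper; open_dir is a hypothetical name used only here.

```c
#define _GNU_SOURCE         /* exposes O_DIRECTORY in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#if defined(__x86_64__) && defined(__linux__)
/* Compile-time check of the two values quoted in the text. */
_Static_assert(O_RDONLY == 0, "O_RDONLY is 0");
_Static_assert(O_DIRECTORY == 0x10000, "O_DIRECTORY is 0x10000");
#endif

/* Hypothetical helper: open `path` only if it is a directory.
 * Returns a file descriptor, or -1 with errno set (ENOTDIR when
 * the path exists but is not a directory). */
int open_dir(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_DIRECTORY);
}
```

Because O_RDONLY is 0, passing 0x10000 alone as the flags argument is the same as O_RDONLY | O_DIRECTORY.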

sys_getdents

Official documentation: http://man7.org/linux/man-pages/man2/getdents.2.html

SYNOPSIS

int getdents(unsigned int fd, struct linux_dirent *dirp,
unsigned int count);

DESCRIPTION

These are not the interfaces you are interested in. Look at readdir(3) for the POSIX-conforming C library interface. This page documents the bare kernel system call interfaces.

The system call getdents() reads several linux_dirent structures from the directory referred to by the open file descriptor fd into the buffer pointed to by dirp. The argument count specifies the size of that buffer.

The linux_dirent structure is declared as follows:

struct linux_dirent {
    unsigned long  d_ino;     /* Inode number */
    unsigned long  d_off;     /* Offset to next linux_dirent */
    unsigned short d_reclen;  /* Length of this linux_dirent */
    char           d_name[];  /* Filename (null-terminated) */
                      /* length is actually (d_reclen - 2 -
                         offsetof(struct linux_dirent, d_name)) */
    /*
    char           pad;       // Zero padding byte
    char           d_type;    // File type (only since Linux
                              // 2.6.4); offset is (d_reclen - 1)
    */
}

d_ino is an inode number. d_off is the distance from the start of the directory to the start of the next linux_dirent. d_reclen is the size of this entire linux_dirent. d_name is a null-terminated filename.

d_type is a byte at the end of the structure that indicates the file type. It contains one of the following values (defined in <dirent.h>):

  1. DT_BLK This is a block device.
  2. DT_CHR This is a character device.
  3. DT_DIR This is a directory.
  4. DT_FIFO This is a named pipe (FIFO).
  5. DT_LNK This is a symbolic link.
  6. DT_REG This is a regular file.
  7. DT_SOCK This is a UNIX domain socket.
  8. DT_UNKNOWN The file type is unknown.

The d_type field is implemented since Linux 2.6.4. It occupies a space that was previously a zero-filled padding byte in the linux_dirent structure. Thus, on kernels up to and including 2.6.3, attempting to access this field always provides the value 0 (DT_UNKNOWN).

Currently, only some filesystems (among them: Btrfs, ext2, ext3, and ext4) have full support for returning the file type in d_type. All applications must properly handle a return of DT_UNKNOWN.

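Summary

fd is the descriptor of the directory to parse, buf is the address of the buffer that receives the entries, and count is the size of buf. Note that you have to allocate buf yourself.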
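The variable-length layout described above is easier to see with a few accessor functions. The sketch below reads the fields of one record out of a raw byte buffer; the function names rec_len, rec_name, rec_type and rec_is_dir are made up for this example, and the field offsets come from the struct itself rather than from hard-coded numbers.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>   /* DT_* constants */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

/* Same layout as struct linux_dirent in the man page. */
struct linux_dirent {
    unsigned long  d_ino;
    unsigned long  d_off;
    unsigned short d_reclen;
    char           d_name[];
};

/* Length of the record at `rec`; adding it to the current offset
 * gives the next record. memcpy sidesteps alignment concerns. */
unsigned short rec_len(const char *rec)
{
    unsigned short len;
    assert(rec != NULL);
    memcpy(&len, rec + offsetof(struct linux_dirent, d_reclen), sizeof len);
    return len;
}

/* Null-terminated file name stored inside the record. */
const char *rec_name(const char *rec)
{
    return rec + offsetof(struct linux_dirent, d_name);
}

/* d_type is the last byte of the record, at offset d_reclen - 1. */
unsigned char rec_type(const char *rec)
{
    return (unsigned char)rec[rec_len(rec) - 1];
}

int rec_is_dir(const char *rec)
{
    return rec_type(rec) == DT_DIR;
}
```

Walking a whole getdents buffer then amounts to starting at offset 0 and repeatedly adding rec_len until the number of bytes returned by the system call is reached.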

Reading a directory

Here is an example that reads and parses a directory.

Source: read_dir.s

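;// gcc -c read_dir.s -o read_dir.o
;// ld -e read_dir -z noexecstack read_dir.o -o read_dir

;// Use Intel syntax
.intel_syntax noprefix
.text
.globl read_dir
.type read_dir, @function
read_dir:
    ;// Reserve a scratch buffer so the reads below do not
    ;// overwrite argc/argv/envp, which sit at and above rsp on entry
    sub rsp, 0x1000

    ;// SYS_read
    ;// Read our input (the directory name) from standard input
    mov rdi, 0              ;// fd
    mov rsi, rsp            ;// read onto the stack
    mov rdx, 256            ;// nbytes
    mov rax, 0              ;// SYS_read
    syscall

    ;// Replace the trailing newline with a NUL byte
    ;// (assumes at least one byte was read and the last one is '\n')
    mov rbx, rsp
    add rbx, rax
    dec rbx
    mov byte ptr [rbx], 0

    ;// SYS_open
    ;// Open the directory
    mov rdi, rsp            ;// our input
    mov rsi, 0x10000        ;// O_DIRECTORY (O_RDONLY is 0)
    xor rdx, rdx            ;// mode: 0 is fine
    mov rax, 2              ;// SYS_open
    syscall

    ;// SYS_getdents
    ;// Read the directory entries
    mov rdi, rax            ;// fd returned by SYS_open
    mov rsi, rsp            ;// buf
    mov edx, 1024           ;// count
    mov rax, 78             ;// SYS_getdents
    syscall

    ;// SYS_write
    ;// Output the raw entries we just read
    mov rdi, 1              ;// fd: standard output
    mov rsi, rsp            ;// buf
    mov rdx, rax            ;// count: the value returned by SYS_getdents
    mov rax, 1              ;// SYS_write
    syscall

    ;// SYS_exit
    mov rdi, 0              ;// error_code
    mov rax, 60             ;// SYS_exit
    syscall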

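The comments cover the details: the program reads a directory name from standard input, opens that directory, calls getdents on it, and writes the raw result to standard output.

Note that the open call uses O_DIRECTORY, so it returns an error if the path is a file rather than a directory.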

Next we write a small program that parses this raw output.

Parser: resolve.c

// compiled: gcc -g resolve.c -o resolve

#include <dirent.h> /* Defines DT_* constants */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>

struct linux_dirent {
    unsigned long d_ino;
    unsigned long d_off;
    unsigned short d_reclen;
    char d_name[1];
};

int main(void)
{
    ssize_t nread, bpos;
    char buf[0x1000];
    struct linux_dirent *d;
    char d_type;

    /* The raw getdents output of read_dir arrives on standard input */
    nread = read(0, buf, sizeof buf);
    if (nread < 0)
    {
        perror("read");
        return 1;
    }
    printf("--------------- nread=%lld ---------------\n", (long long)nread);
    printf("inode# file type d_reclen d_off d_name\n");
    for (bpos = 0; bpos < nread;)
    {
        d = (struct linux_dirent *)(buf + bpos);
        if (d->d_reclen == 0)
            break; /* malformed input: avoid looping forever */
        printf("%8lu ", d->d_ino);
        /* d_type is the last byte of each record */
        d_type = *(buf + bpos + d->d_reclen - 1);
        printf("%-10s ", (d_type == DT_REG) ? "regular"
                       : (d_type == DT_DIR) ? "directory"
                       : (d_type == DT_FIFO) ? "FIFO"
                       : (d_type == DT_SOCK) ? "socket"
                       : (d_type == DT_LNK) ? "symlink"
                       : (d_type == DT_BLK) ? "block dev"
                       : (d_type == DT_CHR) ? "char dev"
                       : "???");
        printf("%4d %10lld %s\n", d->d_reclen,
               (long long)d->d_off, d->d_name);
        bpos += d->d_reclen;
    }
    return 0;
}

Result

Now let's look at the result.

ex@Ex:~/read_file_by_syscall$ ll test/
total 4
drwxrwxr-x 2 ex ex 4096 May 19 12:46 dir
-rw-rw-r-- 1 ex ex 0 May 19 12:47 file.txt
ex@Ex:~/read_file_by_syscall$ echo test | ./read_dir | ./resolve
--------------- nread=104 ---------------
inode# file type d_reclen d_off d_name
8257658 regular 32 7574096023840097761 file.txt
8257545 directory 24 8832451974199512907 ..
8257656 directory 24 9134306701537199169 .
8257657 directory 24 9223372036854775807 dir

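As you can see, the directory entries are parsed correctly, and read_dir itself does not use the glibc library.

Reading a file

Here is an example that reads a file.

Source: read_file.s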

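;// gcc -c read_file.s -o read_file.o
;// ld -e read_file -z noexecstack read_file.o -o read_file

;// Use Intel syntax
.intel_syntax noprefix
.text
.globl read_file
.type read_file, @function
read_file:
    ;// Reserve a scratch buffer so the reads below do not
    ;// overwrite argc/argv/envp, which sit at and above rsp on entry
    sub rsp, 0x1000

    ;// SYS_read
    ;// Read our input (the file name) from standard input
    mov rdi, 0              ;// fd
    mov rsi, rsp            ;// read onto the stack
    mov rdx, 256            ;// nbytes
    mov rax, 0              ;// SYS_read
    syscall

    ;// Replace the trailing newline with a NUL byte
    ;// (assumes at least one byte was read and the last one is '\n')
    mov rbx, rsp
    add rbx, rax
    dec rbx
    mov byte ptr [rbx], 0

    ;// SYS_open
    ;// Open the file
    mov rdi, rsp            ;// our input
    mov rsi, 0              ;// O_RDONLY
    xor rdx, rdx            ;// mode: 0 is fine
    mov rax, 2              ;// SYS_open
    syscall

    ;// SYS_read
    ;// Read the file's contents onto the stack
    mov rdi, rax            ;// fd: the descriptor just returned by SYS_open
    mov rsi, rsp            ;// read onto the stack
    mov rdx, 1024           ;// nbytes
    mov rax, 0              ;// SYS_read
    syscall

    ;// SYS_write
    ;// Output the contents we just read
    mov rdi, 1              ;// fd: standard output
    mov rsi, rsp            ;// buf
    mov rdx, rax            ;// count: the value returned by SYS_read
    mov rax, 1              ;// SYS_write
    syscall

    ;// SYS_exit
    mov rdi, 0              ;// error_code
    mov rax, 60             ;// SYS_exit
    syscall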

The comments cover the details: the program opens the file, reads its contents onto the stack, and writes them to standard output.
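For comparison, the same open, read, write sequence can be sketched in C with the libc wrappers. Unlike read_file.s, which issues a single 1024-byte read and so only shows the first part of a larger file, this version loops until end of file. The function name copy_file_to_fd is hypothetical and the code is an illustration, not a translation of the author's program.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy the whole file at `path` to `out_fd`.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file_to_fd(const char *path, int out_fd)
{
    assert(path != NULL);
    char buf[1024];
    long total = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: retry the read */
            close(fd);
            return -1;
        }
        /* write() may be partial, so loop until the chunk is out */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(fd);
    return total;
}
```

Calling copy_file_to_fd("./test/file.txt", 1) would print the file to standard output, much like the assembly program does for short files.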

Result

ex@Ex:~/read_file_by_syscall$ cat test/file.txt 
this is a test.
ex@Ex:~/read_file_by_syscall$ ./read_file
./test/file.txt
this is a test.
ex@Ex:~/read_file_by_syscall$

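The file's contents are read out correctly.

Conclusion

Programming directly against system calls is much like ordinary programming, and it is not as hard as it may seem.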