How NEUZZ Generates Its Bitmaps

Author: ZERO-A-ONE
Date: 2021-06-03

Opening the files in 010 Editor, or printing them with cat on Linux, shows that the bitmap NEUZZ generates is not the same as the one produced by stock AFL.

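First, AFL's afl-showmap can show which edges a particular input covers in the target program and how often each one is hit. For example, the following command writes the edge coverage of the input ./in/1 on the program ./readelf to the file ./tmp: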

$ afl-showmap -m none -o ./tmp -- ./readelf ./in/1

The output of the run looks like this:
[Figure: screenshot of the afl-showmap output]

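"Captured 29 tuples" means that our input covered 29 edges in total; each tuple is the coverage record for one edge. We can see them by printing the contents of the tmp file: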

ubuntu@VM-0-16-ubuntu:~/afl$ cat tmp
000313:1
001403:1
004051:1
004462:1
005940:1
006453:1
011049:1
011284:1
013517:1
014108:1
014687:1
017741:1
019786:1
024812:1
027974:1
029242:1
030868:1
031080:1
037812:1
037816:1
038877:1
039910:1
041315:1
048444:1
051839:1
053182:1
060373:1
063557:1
063862:1
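
The tmp file is plain text with one edge_id:count pair per line, so it is easy to load programmatically. A minimal sketch, assuming the format shown above; parse_showmap is a hypothetical helper name, not part of AFL or NEUZZ:

```python
def parse_showmap(text):
    """Parse afl-showmap text output into {edge_id: count}.

    parse_showmap is a hypothetical helper used for illustration only.
    """
    coverage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        edge, count = line.split(":")
        coverage[int(edge)] = int(count)
    return coverage


# The first three lines of the tmp file shown above.
sample = "000313:1\n001403:1\n004051:1\n"
coverage = parse_showmap(sample)
print(len(coverage), sorted(coverage))
```

The number of keys in the resulting dictionary corresponds to the "Captured N tuples" figure for that input.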

Now let's look at how NEUZZ organizes its own bitmap. It starts by reading the test seeds collected so far:

# shuffle training samples
seed_list = glob.glob('./seeds/*')
seed_list.sort()
SPLIT_RATIO = len(seed_list)
rand_index = np.arange(SPLIT_RATIO)
np.random.shuffle(seed_list)
new_seeds = glob.glob('./seeds/id_*')

call = subprocess.check_output

# get MAX_FILE_SIZE
cwd = os.getcwd()
max_file_name = call(['ls', '-S', cwd + '/seeds/']).decode('utf8').split('\n')[0].rstrip('\n')
MAX_FILE_SIZE = os.path.getsize(cwd + '/seeds/' + max_file_name)

# create directories to save label, spliced seeds, variant length seeds, crashes and mutated seeds.
os.path.isdir("./bitmaps/") or os.makedirs("./bitmaps")
os.path.isdir("./splice_seeds/") or os.makedirs("./splice_seeds")
os.path.isdir("./vari_seeds/") or os.makedirs("./vari_seeds")
os.path.isdir("./crashes/") or os.makedirs("./crashes")
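
The same setup can be done without shelling out to ls. The following is a sketch of an alternative, not NEUZZ's code; largest_file_size and ensure_dirs are hypothetical helper names, and the demo uses a temporary directory in place of the real ./seeds folder:

```python
import os
import tempfile


def largest_file_size(directory):
    """Return the size in bytes of the largest regular file in directory."""
    sizes = [
        os.path.getsize(os.path.join(directory, name))
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    return max(sizes) if sizes else 0


def ensure_dirs(base, names):
    """Create each directory under base if it does not exist yet."""
    for name in names:
        os.makedirs(os.path.join(base, name), exist_ok=True)


# Demo in a throwaway directory standing in for the fuzzing workspace.
with tempfile.TemporaryDirectory() as work:
    seeds = os.path.join(work, "seeds")
    os.makedirs(seeds)
    for name, payload in [("a", b"x" * 10), ("b", b"x" * 300)]:
        with open(os.path.join(seeds, name), "wb") as fh:
            fh.write(payload)

    max_file_size = largest_file_size(seeds)
    ensure_dirs(work, ["bitmaps", "splice_seeds", "vari_seeds", "crashes"])
    created = sorted(os.listdir(work))

print(max_file_size, created)
```

Using os.makedirs with exist_ok=True replaces the "isdir or makedirs" idiom, and taking the maximum of os.path.getsize avoids depending on the output format of ls.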

Next, NEUZZ iterates over every input seed and creates a temporary list for each one:

for f in seed_list:
    tmp_list = []

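NEUZZ then runs afl-showmap on the seed and stores its output in out. Passing -o /dev/stdout makes afl-showmap write the coverage map to standard output, where check_output can capture it: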

try:
    # append "-o tmp_file" to strip's arguments to avoid tampering tested binary.
    if argvv[0] == './strip':
        out = call(['./afl-showmap', '-q', '-e', '-o', '/dev/stdout', '-m', '512', '-t', '500'] + argvv + [f] + ['-o', 'tmp_file'])
    else:
        out = call(['./afl-showmap', '-q', '-e', '-o', '/dev/stdout', '-m', '512', '-t', '500'] + argvv + [f])

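If afl-showmap exits with a non-zero status, for example because the input crashed the target, check_output raises an exception and NEUZZ prints a message: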

except subprocess.CalledProcessError:
	print("find a crash")
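
subprocess.check_output raises CalledProcessError whenever the child exits with a non-zero status, and the exception still carries whatever the child wrote to stdout. The sketch below illustrates that behaviour with a stand-in child process (a small Python one-liner, assumed here purely for demonstration) instead of afl-showmap; run_and_capture is a hypothetical helper name:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return (stdout_bytes, failed)."""
    try:
        return subprocess.check_output(cmd), False
    except subprocess.CalledProcessError as err:
        # err.output holds the stdout produced before the non-zero exit.
        return err.output, True


# Stand-in child: prints one coverage-like line, then exits with status 2.
child = [sys.executable, "-c", "import sys; print('000313:1'); sys.exit(2)"]
out, failed = run_and_capture(child)
print(failed, out.splitlines())
```

In the NEUZZ snippet the exception is only reported, so out keeps whatever value it had before the failed call.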

The program then splits out into lines and processes each one:

for line in out.splitlines():
	edge = line.split(b':')[0]
	tmp_cnt.append(edge)
	tmp_list.append(edge)

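In short, this extracts the edge IDs, such as 000313 and 001403, and appends each one to tmp_cnt (used for counting) and to tmp_list (the edges of the current seed). The result is the list of edges covered by the current input, which is kept in raw_bitmap under the seed's file name.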
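
Put together, the loop builds two structures: a per-seed edge list and a flat list of every edge occurrence. A minimal sketch with hard-coded afl-showmap outputs in place of real runs; the seed names and the fake_outputs dictionary are made up for illustration:

```python
# Hypothetical showmap outputs, keyed by made-up seed names.
fake_outputs = {
    "./seeds/a": b"000313:1\n001403:1\n",
    "./seeds/b": b"000313:1\n004051:1\n",
}

raw_bitmap = {}  # seed name -> edges covered by that seed
tmp_cnt = []     # every edge occurrence across all seeds

for seed, out in fake_outputs.items():
    tmp_list = []
    for line in out.splitlines():
        edge = line.split(b":")[0]  # keep the part before the colon
        tmp_cnt.append(edge)
        tmp_list.append(edge)
    raw_bitmap[seed] = tmp_list

print(raw_bitmap)
print(tmp_cnt)
```

Note that the edge IDs stay as bytes objects at this stage because check_output returns bytes; they are only converted to integers later.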

Next, it counts how many times each edge appears across all input samples:

counter = Counter(tmp_cnt).most_common()

Then it extracts every edge ID to use as the labels:

label = [int(f[0]) for f in counter]
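
Counter.most_common() returns (element, count) pairs sorted by descending count, so the labels end up ordered from the most frequently covered edge to the least. A small sketch with made-up edge IDs:

```python
from collections import Counter

# Made-up edge occurrences collected from three seeds.
tmp_cnt = [b"000313", b"001403", b"000313", b"004051", b"000313", b"001403"]

counter = Counter(tmp_cnt).most_common()
label = [int(edge) for edge, _count in counter]

print(counter)
print(label)
```

int() accepts the bytes values directly, which is why the original code can convert them without decoding first.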

Initialize the bitmap array, with one row per seed and one column per label:

bitmap = np.zeros((len(seed_list), len(label)))

enumerate() turns an iterable (such as a list, tuple, or string) into a sequence of (index, item) pairs. Here it walks the seed list: idx is the index and i is the seed's file name:

for idx, i in enumerate(seed_list):

tmp is then the list of edges covered by that seed, looked up in raw_bitmap:

tmp = raw_bitmap[i]

Then iterate over the edge IDs covered by the seed and check whether each one is present in label:

for j in tmp:
	if int(j) in label:

For each such edge, the entry in the current seed's row at the edge's position in label is set to 1, marking the edge as covered:

bitmap[idx][label.index((int(j)))] = 1
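
The loops above fill one row per seed. Because label.index() scans the list on every call, a dictionary from edge ID to column can do the same job with constant-time lookups. This is a sketch of that variation, not NEUZZ's code; the seeds and edge IDs are made up:

```python
import numpy as np

# Made-up inputs in the same shape as the structures described above.
seed_list = ["./seeds/a", "./seeds/b", "./seeds/c"]
raw_bitmap = {
    "./seeds/a": [b"000313", b"001403"],
    "./seeds/b": [b"000313", b"004051"],
    "./seeds/c": [b"000313"],
}
label = [313, 1403, 4051]

# Map each edge ID to its column once instead of calling label.index().
column_of = {edge: col for col, edge in enumerate(label)}

bitmap = np.zeros((len(seed_list), len(label)))
for idx, seed in enumerate(seed_list):
    for edge in raw_bitmap[seed]:
        col = column_of.get(int(edge))
        if col is not None:
            bitmap[idx][col] = 1

print(bitmap)
```

Each row is now a 0/1 vector over the label columns, which is the per-seed coverage vector described in the text.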

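Finally, remove duplicate columns from bitmap. With axis=1, np.unique compares columns rather than rows, so it merges edges that are covered by exactly the same set of seeds (it does not remove duplicate seeds) and returns the remaining columns in sorted order: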

fit_bitmap = np.unique(bitmap, axis=1)
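
A small sketch with a made-up bitmap shows the effect of axis=1 on a matrix whose rows are seeds and whose columns are edges. Columns 0 and 2 are identical here, so they collapse into one:

```python
import numpy as np

# Rows are seeds, columns are edges; columns 0 and 2 are identical.
bitmap = np.array([
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])

fit_bitmap = np.unique(bitmap, axis=1)
print(fit_bitmap.shape)
print(fit_bitmap)
```

The number of rows is unchanged, so every seed still has a label vector; only the label dimension shrinks.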
