【翻译】ILR-我的Gadgets去哪里了——ILR: Where’d My Gadgets Go?

ILR-我的Gadgets去哪里了

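[Translator's note: the Chinese text is a direct Google Translate rendering, and there has been no time recently to revise it. It contains many obvious errors; where something looks wrong, please read it carefully against the English.]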

Abstract—Through randomization of the memory space and the confinement of code to non-data pages, computer security researchers have made a wide range of attacks against program binaries more difficult. However, attacks have evolved to exploit weaknesses in these defenses.

摘要-通过内存空间随机化以及将代码限制在非数据页上,计算机安全研究人员已经使针对程序二进制文件的广泛攻击变得更困难。但是,不断演变的攻击手段正在利用这些防御措施中的弱点来发动新型的攻击。

To thwart these attacks, we introduce a novel technique called Instruction Location Randomization (ILR). Conceptually, ILR randomizes the location of every instruction in a program, thwarting an attacker’s ability to re-use program functionality (e.g., arc-injection attacks and return-oriented programming attacks).

为了防御这些攻击,我们提出一种叫做指令位置随机(ILR)的新技术。从概念上讲,ILR将程序中每个指令的位置随机化,从而阻止攻击者重用程序功能(例如:注入攻击和面向返回编程(ROP)攻击)。

ILR operates on arbitrary executable programs, requires no compiler support, and requires no user interaction. Thus, it can be automatically applied post-deployment, allowing easy and frequent re-randomization.

ILR 可以在任意可执行二进制程序上应用,不需要编译器支持,不需要用户交互。因此,它可以在后期部署中自动运行,可以简单的频繁的进行ILR 随机化。

Our preliminary prototype, working on 32-bit x86 Linux ELF binaries, provides a high degree of entropy. Individual instructions are randomly placed within a 31-bit address space. Thus, attacks that rely on a priori knowledge of the location of code or derandomization are not feasible. We demonstrated ILR’s defensive capabilities by defeating attacks against programs with vulnerabilities, including Adobe’s PDF viewer, acroread, which had an in-the-wild vulnerability. Additionally, using an industry-standard CPU performance benchmark suite, we compared the run time of prototype ILR-protected executables to that of native executables. The average run-time overhead of ILR was 13% with more than half the programs having effectively no overhead (15 out of 29), indicating that ILR is a realistic and cost-effective mitigation technique.

我们初步原型在32位x86linux ELF二进制文件上运行,提供了高度的熵。各个指令被随机放置在31位地址空间内。因此,依赖于代码位置的先验知识或非随机化的攻击是不可行的。我们通过克服对带有漏洞的程序(包括Adobe PDF 查看器acroread)的攻击,证明了ILR的防御能力。此外,使用行业标准的CPU 性能基准套件,我们将受ILR保护的原型可执行文件的运行时间与本机可执行文件的运行时间进行了比较。ILR的平均运行时开销为13%,超过一半的程序实际上没有开销(29个中的15个),这表明ILR是一种现实且具有成本效益的缓解技术。

Keywords: Randomization; Exploit prevention; Diversity; ASLR; Return-oriented programming; Arc-injection

关键字: 随机,漏洞利用保护, 多样性, ASLR ,面向返回的编程(ROP),注入

1. INTRODUCTION

1. 介绍

Computer software controls many major aspects of modern life, including air travel, power distribution, banking, medical treatment, traffic control, and a myriad of other essential infrastructures. Unfortunately, weaknesses in software code (such as memory corruption, fixed-width integer computation errors, input validation oversights, and format string vulnerabilities) remain common. Via these weaknesses, attackers are able to hijack an application’s intended control flow to violate security policies (exfiltrating secret data, allowing remote access, bypassing authentication, or eliminating services) [1–4].

计算机软件控制着现代生活的许多主要方面,包括航空旅行,配电,银行业务,医疗,交通控制以及无数其他重要基础设施。 不幸的是,软件代码中的弱点(例如内存损坏,定宽整数计算错误,输入验证疏忽和格式字符串漏洞)仍然很常见。 通过这些弱点,攻击者能够劫持应用程序的预期控制流,从而违反安全策略(泄露机密数据,允许远程访问,绕过身份验证或消除服务)[1-4]。

Unfortunately, modern deployed defenses fail to thoroughly mitigate these threats, even when composed. Perhaps the most commonly deployed defenses are Address Space Layout Randomization (ASLR) [5] and W⊕X [5, 6]. In theory, ASLR randomizes the addresses used in a program. Unfortunately, only some addresses are randomized in modern implementations. For example, the main program text is not randomized on Linux implementations since programs do not have enough information to safely relocate this portion of code. Further, ASLR only randomizes the base address of loaded modules, not each address within the module. Thus, ASLR is vulnerable to information leakage and entropy-exhausting attacks [7, 8]. W⊕X seeks to delineate code from data to prevent code-injection attacks. However, arc-injection attacks and various forms of return-oriented programming (ROP) attacks bypass W⊕X through reuse of code already embedded in the program [2, 8–10].

不幸的是,即使部署了现代防御系统,防御也无法完全缓解这些威胁。 也许最常用的防御措施是地址空间布局随机化(ASLR)[5]和W⊕X[5,6]。 理论上,ASLR将程序中使用的地址随机化。 不幸的是,在现代实现中只有一些地址是随机的。 例如,由于程序没有足够的信息来安全地重新定位这部分代码,因此在Linux实现中,主程序文本不是随机的。 此外,ASLR仅随机化已加载模块的基地址,而不是模块内的每个地址。 因此,ASLR容易受到信息泄漏和熵耗尽攻击的攻击[7,8]。 W⊕X试图从数据中区分代码,以防止代码注入攻击。 但是,电弧注入攻击和各种形式的面向返回的编程(ROP)攻击通过重用已经嵌入程序中的代码来绕过W⊕X[2,8-10]。

In this paper we describe a novel technique, called Instruction Location Randomization (ILR), that conceptually randomizes the location of every instruction in a program. ILR can use the full address space of the process (e.g., 32 bits on 32-bit processors such as the x86). Information leakage attacks that discover information about the location of a code block (e.g., the randomized base address of a dynamically loaded module or the start of a function) are infeasible for two reasons: 1) the randomized code addresses are protected from leakage and 2) a leak provides no information about the location of other code blocks.

在本文中,我们描述了一种称为指令位置随机化(ILR)的新技术,该技术从概念上将程序中每条指令的位置随机化。 ILR可以使用进程的完整地址空间(例如,在32位处理器(例如x86)上为32位)。 信息泄漏攻击无法发现有关代码块位置的信息(例如,动态加载的模块的随机基址或函数的开始),原因有两个:1)保护随机代码地址不泄漏,以及2 )泄漏不提供有关其他代码块位置的信息。

ILR changes a fundamental characteristic typically used by attackers: predictable code layout. For example, programs are arranged sequentially in memory starting at a base address, as shown on the left of Figure 1. [Footnote 1: For simplicity, the figure and discussion assume all instructions are one byte. Our general approach, prototype implementation, and security discussion do not rely on this fact.]

ILR改变了攻击者通常使用的基本特征-可预测的代码布局。 例如,程序从基地址开始依次排列在内存中,如图1左侧所示.[1. 为了简单起见,该图和讨论假定所有指令均为一个字节。 我们的一般方法,原型实现和安全性讨论均不依赖此事实。]

Figure 1. Traditional program creation versus an ILR-protected program. In a traditional program, instructions are arranged sequentially and predictably, allowing an attack. With an ILR-protected program, instructions are distributed across memory randomly, preventing attack.

图1.传统程序创建与ILR保护的程序。 在传统程序中,指令是按顺序和可预测地排列的,从而允许攻击。 使用受ILR保护的程序,指令可在内存中随机分配,从而防止攻击。

In this example, the address used to return from function foo (7003) might be leaked if there is a vulnerability in the function. An attacker that learns this information can easily determine the location of all other instructions. Attackers routinely rely on the fundamental assumption of predictable code layout to craft attacks such as arc-injection and the various forms of return-oriented programming. In the example, an attacker might use the address of the add instruction to mount an ROP attack using add eax, #1; ret as an ROP gadget. [Footnote 2: ROP gadgets are short sequences of code, typically ending in a return instruction, that perform some small portion of the attack.] For a detailed explanation of ROP gadgets and how they are combined to form an attack, please see Shacham’s prior work [2].

在此示例中,如果函数中存在漏洞,则用于从函数foo返回的地址(7003)可能会泄漏。 获知此信息的攻击者可以轻松确定所有其他指令的位置。 攻击者通常依靠可预测代码布局这一基本假设来构造攻击,例如弧注入(arc-injection)和各种形式的面向返回的编程。 在此示例中,攻击者可能会使用add指令的地址,将add eax, #1; ret作为ROP小工具来发起ROP攻击。[2. ROP小工具是短代码序列,通常以返回指令结尾,用于完成攻击的一小部分。] 有关ROP小工具及其组合方式的详细说明,请参阅Shacham的先前工作[2]。

ILR adopts an execution model where each instruction has an explicitly specified successor. Thus, each instruction’s successor is independent of its location. This model of execution allows instructions to be randomly scattered throughout the memory space. Hiding the explicit successor information prevents an attacker from predicting the location of an instruction based on the location of another instruction.

ILR采用一种执行模型,其中每条指令都有一个明确指定的后继指令。 因此,每条指令的后继与其位置无关。 这种执行模型允许指令随机分布在整个内存空间中。 隐藏显式后继信息可防止攻击者根据一条指令的位置来预测另一条指令的位置。

ILR’s “non-sequential” execution model is provided through the use of a process-level virtual machine (PVM) based on highly efficient software dynamic translation technology [11–13]. The PVM handles executing the non-sequential, randomized code on the host machine.

通过使用基于高效软件动态转换技术的流程级虚拟机(PVM),可以提供ILR的“非顺序”执行模型[11-13]。 PVM处理在主机上执行非顺序的随机代码。

We have implemented an ILR prototype for Linux on the x86; Section III provides complete implementation details. In short, ILR operates on arbitrary executables and requires no compiler support and no user interaction. Using a set of vulnerable programs (including a binary distributed by Adobe to read PDF files) and exploits that defeat ASLR and W⊕X, we demonstrate that ILR detects and thwarts these attacks. An important consideration for any mitigation technique is its run-time overhead. Many proposed mitigation techniques incur high overheads, as much as 90% to 2000% [14, 15]. Using a large industry-standard CPU performance benchmark suite [16], we compared the run time of ILR-protected executables to that of native executables. The average run-time overhead of ILR was 13%, with over half of all programs having effectively no overhead (less than 3%), indicating that ILR is a realistic and cost-effective mitigation technique.

我们已经在x86上为Linux实现了ILR实现的原型,第三节提供了完整的实现细节。 简而言之,ILR可在任意可执行文件上运行,不需要编译器支持,也不需要用户交互。 通过使用一组易受攻击的程序(包括Adobe分发的二进制文件来读取PDF文件)以及ASLR和W⊕X攻击者的漏洞,我们证明了ILR检测到并阻止了这些攻击。 任何缓解技术的重要考虑因素是运行时开销。 许多提议的缓解技术会产生高昂的开销,高达90%至2000%[14,15]。 使用大型的行业标准CPU性能基准套件[16],我们将受ILR保护的可执行文件的运行时间与本机可执行文件的运行时间进行了比较。 ILR的平均运行时开销为13%,而所有程序中有一半以上实际上没有开销(少于3%),这表明ILR是一种现实且具有成本效益的缓解技术。

This paper makes several contributions. It:

  • presents Instruction Location Randomization (ILR), a technique that provides high-entropy diversity for relocating instructions with low run-time overhead,
  • demonstrates that ILR defeats arc-injection and ROP attacks on arbitrary binaries without need for compiler, linker, operating system or hypervisor support,
  • provides a complete description of how ILR can achieve its goals despite inherent uncertainty about a program’s structure, such as where code and data reside, and
  • thoroughly analyzes the security, effectiveness, and performance of ILR in a prototype system on large, real-world benchmarks.

本文做出了一些贡献。 它包括:

  • 介绍了指令位置随机化(ILR),这是一种提供高熵分集的技术,用于以较低的运行时开销重新分配指令。
  • 证明ILR在不需要编译器,链接器,操作系统或虚拟机管理程序支持的情况下,可以克服任意二进制文件上的电弧注入和ROP攻击。
  • 完整描述了ILR如何实现其目标,尽管程序结构存在固有的不确定性,例如代码和数据的位置,以及
  • 在大型,真实的基准测试中,在原型系统中彻底分析ILR的安全性,有效性和性能。

The remainder of the paper is organized as follows: Section II first discusses the threat model within which ILR operates. Section III describes the details of ILR. Sections IV and V provide an evaluation and security discussion of the proposed techniques. Section VI compares our work to related work in the field. Finally, Section VII summarizes our findings.

本文的其余部分安排如下:第二部分首先讨论了ILR运行的威胁模型。 第三节详细介绍了ILR。 第四节和第五节提供了对所提议技术的评估和安全性讨论。 第六节将我们的工作与该领域的相关工作进行了比较。 最后,第七节总结了我们的发现。

2. THREAT MODEL

2. 威胁模型

We assume that the unprotected program is created and distributed to an end user (and possibly the attacker) in binary form. The program has been tested, but is not guaranteed to be free from programmatic errors that might allow malicious exploitation, such as memory errors. The program is assumed to be free from intentionally planted back doors, trojans, etc. Furthermore, the program is to be protected and deployed in a setting where the other software on the system is believed to be operating correctly, and the system administrator is trusted. An attacker does not have direct access to the system or the protected program. However, the attacker understands the protection methodology and may have access to tools for applying ILR protections. The attacker also has access to the unprotected version of the program, and can specify malicious input to the protected program.

我们假设未受保护的程序已创建并以二进制形式分发给最终用户(可能还有攻击者)。 该程序已经过测试,但不能保证没有任何可能导致恶意利用的程序错误,例如内存错误。 假定该程序没有故意植入的后门,特洛伊木马等。此外,应在认为系统上其他软件可以正常运行且值得系统管理员信任的环境中保护和部署该程序。 。 攻击者无法直接访问系统或受保护程序。 但是,攻击者了解保护方法,并且可以访问用于应用ILR保护的工具。 攻击者还可以访问该程序的不受保护版本,并可以向受保护程序指定恶意输入。

In particular, ILR focuses on preventing attacks which rely on code being located predictably. This threat model includes a large range of possible attacks against a program. For example, many attacks against client and server software fit this model. Document viewers/editors (Adobe PDF viewer, Microsoft Word), e-mail clients (Microsoft Outlook, Mozilla Thunderbird), and web browsers (Mozilla Firefox, Microsoft Internet Explorer, Google Chrome) need to be protected from these types of threats anytime a user requests the program to examine data from an untrusted source.

特别是,ILR致力于防止依赖可预测地定位代码的攻击。 此威胁模型包括对程序的各种可能的攻击。 例如,许多针对客户端和服务器软件的攻击都适合此模型。 需要随时保护文档查看器/编辑器(Adobe PDF查看器,Microsoft Word),电子邮件客户端(Microsoft Outlook,Mozilla Thunderbird)和网络浏览器(Mozilla Firefox,Microsoft Internet Explorer,Google Chrome)免受这些威胁的影响。 用户请求程序检查来自不受信任来源的数据。

3. INSTRUCTION LOCATION RANDOMIZATION

3. 指令位置随机化

ILR’s goals are to achieve high randomization and low run-time overhead. Figure 1 conceptually illustrates the effect of ILR and how it mitigates malicious attacks. The top left of the figure shows the control-flow graph of a particular program segment. The compiler and the linker collaborate to produce an executable file where instructions are laid out so they can be loaded into memory when the program is executed. A typical layout of code is shown at the bottom left of the figure.

ILR的目标是实现高随机化和低运行时间开销。 图1从概念上说明了ILR的作用以及它如何减轻恶意攻击。 图的左上方显示了特定程序段的控制流程图。 编译器和链接器协作生成可执行文件,在其中放置指令,以便在执行程序时将它们加载到内存中。 图的左下方显示了典型的代码布局。

An attacker, through knowledge of the instruction-set architecture and the executable format, can easily locate portions of code that may be useful in crafting an attack. For example, the attacker may identify the instruction sequence at locations 7004 and 7005 as being a gadget useful in crafting an ROP attack. This particular gadget adds one to register eax. By identifying a set of gadgets and exploiting a vulnerability, an attacker can cause a set of gadgets to be executed that effect the attack.

攻击者通过了解指令集体系结构和可执行格式,可以轻松找到可能对进行攻击有用的代码部分。 例如,攻击者可以将位置7004和7005处的指令序列识别为是在进行ROP攻击时有用的小工具。 这个特定的小工具会添加一个以注册eax。 通过识别一组小工具并利用漏洞,攻击者可以导致执行一系列会影响攻击的小工具。

The right side of the figure shows the layout of the code when ILR is applied. The program instructions are randomly scattered through memory. With an address space of 32 bits, it is infeasible for an attacker to locate a set of gadgets that could be used to craft an attack.

图的右侧显示了应用ILR时的代码布局。 程序指令通过内存随机散布。 凭借32位的地址空间,攻击者无法找到可以用来发起攻击的一组小工具。
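To make the infeasibility claim concrete, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration, not an analysis from the paper: the function name `guess_probability` is hypothetical, and the model assumes each needed gadget sits at one specific address chosen uniformly and independently, which is a simplification.

```python
from fractions import Fraction


def guess_probability(address_bits: int, gadgets_needed: int) -> Fraction:
    """Chance that a single attempt guesses every gadget address correctly.

    Assumption (not from the paper): each gadget lives at exactly one
    address, placed uniformly and independently in a 2**address_bits space.
    """
    per_gadget = Fraction(1, 2 ** address_bits)
    return per_gadget ** gadgets_needed


# One gadget in the prototype's 31-bit space, and a three-gadget chain.
one_gadget = guess_probability(31, 1)
three_gadgets = guess_probability(31, 3)
```

Under these assumptions a single gadget is hit with probability 2^-31 per attempt, and the probability shrinks exponentially with the length of the chain.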

To execute the randomized program, we employ a highly efficient PVM that fetches and executes the instructions in the proper order even though they are randomly scattered throughout memory. This process is accomplished via a specification that describes the execution successor of each instruction in the program. This specification, called a fallthrough map, is shown at the top right of Figure 1. The PVM interprets the fallthrough map to fetch and execute instructions on the host hardware. The following subsections describe the process of automatically producing an ILR protected executable and its execution.

为了执行随机化后的程序,我们采用了高效的PVM,即使指令随机分散在整个内存中,它也能以正确的顺序获取并执行这些指令。 此过程是通过一个描述程序中每条指令执行后继的规范来完成的。 该规范称为fallthrough映射(fallthrough map),显示在图1的右上角。PVM解释fallthrough映射以在主机硬件上获取并执行指令。 以下各小节描述了自动生成受ILR保护的可执行文件的过程及其执行。

A. ILR Architecture

A. ILR 架构

Figure 2 shows the high-level architecture of the ILR process. ILR has an offline analysis phase that relocates instructions in the binary and generates a set of rewriting rules. These rules describe how and where the newly located instructions are to be executed and how control should flow between them (shown as the fallthrough map in Figure 1). The randomized program is executed on the native hardware by a PVM that uses the fallthrough map to guide execution.

图2显示了ILR流程的高级体系结构。 ILR具有离线分析阶段,可以重新定位二进制文件中的指令并生成一组重写规则,这些规则描述了如何执行新位置的指令,在何处执行以及控件在它们之间的流动方式(如图1所示) )。 随机程序由PVM在本地硬件上执行,该PVM使用穿透映射来指导执行。

Figure 2. High-level overview of ILR architecture.

图2. ILR体系结构的高级概述。

The rewriting rules come in two forms. The first form, the instruction definition form, indicates that there is an instruction at a particular location. The first line of Figure 3 gives an example. In this example, address 0x39bc has the instruction cmp eax, #24. Note that the rule indicates that an instruction fetched from address 0x39bc should be the cmp instruction. However, data fetches from address 0x39bc are unaffected. This distinction allows ILR to relocate instructions even if instructions and data are overlapped.

重写规则有两种形式。 第一种形式,指令定义形式,指示在特定位置有一条指令。 图3的第一行给出了一个示例。 在此示例中,地址0x39bc具有指令cmp eax#24。 请注意,该规则指示从地址0x39bc提取的指令应为cmp指令。 但是,从地址0x39bc提取的数据不受影响。 这种区别使ILR即使指令和数据重叠也可以重定位指令。

An example of the second form of an ILR rewrite rule, the redirect form, is shown in the second line of Figure 3. This line specifies the fallthrough instruction for the cmp at location 0x39bc. A normal processor would immediately fetch from the location 0x39bd after fetching the cmp instruction. Instead, ILR execution checks for a redirection of the fallthrough. In this case, the fallthrough instruction is at 0xd27e. The remaining lines show the full set of rewrite rules for the example in Figure 1.

图3的第二行显示了ILR重写规则的第二种形式的示例,即重定向形式。该行指定位置0x39bc处cmp的fallthrough指令。 普通处理器将在提取cmp指令后立即从0x39bd位置获取。 相反,ILR执行将检查过渡的重定向。 在这种情况下,fallthrough指令位于0xd27e。 其余各行显示了图1中示例的完整重写规则集。

Figure 3. ILR rewrite rules corresponding to the example in Figure 1.

图3.与图1中的示例相对应的ILR重写规则。

The ILR architecture fetches, decodes and executes instructions in the traditional style, but checks for rewriting rules before fetching an instruction or calculating an instruction’s fallthrough address.

ILR体系结构以传统方式获取,解码和执行指令,但是在获取指令或计算指令的落入地址之前会检查重写规则。
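The two rule forms and the modified fetch/next-PC steps can be sketched with two dictionaries. The following Python toy is an illustration only, not the paper's PVM: the three-operation instruction set, the `run` function, and all addresses are made up for the example.

```python
# Instruction-definition rules: randomized address -> (operation, operand).
# All addresses and operations here are hypothetical.
INSTRUCTIONS = {
    0x5A10: ("mov", 24),   # eax = 24
    0x1F3C: ("add", 1),    # eax += 1
    0x9E02: ("ret", None),
}

# Redirect rules (the fallthrough map): address -> address of the successor.
FALLTHROUGH = {
    0x5A10: 0x1F3C,
    0x1F3C: 0x9E02,
}


def run(instructions, fallthrough, entry, max_steps=100):
    """Execute scattered instructions by following explicit successors."""
    eax = 0
    pc = entry
    trace = []
    for _ in range(max_steps):
        # Fetch: consult the instruction-definition rules first.
        if pc not in instructions:
            raise RuntimeError(f"no instruction rule for {pc:#x}")
        op, arg = instructions[pc]
        trace.append(pc)
        if op == "mov":
            eax = arg
        elif op == "add":
            eax += arg
        elif op == "ret":
            return eax, trace
        else:
            raise ValueError(f"unknown operation {op!r}")
        # Next PC: a native CPU would use pc + length; here the map decides.
        pc = fallthrough[pc]
    raise RuntimeError("step limit exceeded")


result, visited = run(INSTRUCTIONS, FALLTHROUGH, entry=0x5A10)
```

In this sketch the numeric order of the addresses plays no role in the execution order, which is the property the rewrite rules are meant to provide.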

B. Offline Analysis

B. 离线分析

The static analysis phase creates an ILR-protected program with random placement of every instruction in the program. For such randomization, the static analysis locates instructions, indirect branch targets, and identifies call sites for additional analysis. Figure 4 shows the organization of the static analysis used for ILR.

静态分析阶段创建一个受ILR保护的程序,并在程序中随机放置每条指令。 对于这种随机化,静态分析将定位指令,间接分支目标,并确定调用位置以进行其他分析。 图4显示了用于ILR的静态分析的组织。

Figure 4. High-level overview of the static analysis engine used in ILR.

图4. ILR中使用的静态分析引擎的高级概述。

  1. Disassembly Engine: The goal of the ILR disassembly engine is to locate any byte that might be the start of an instruction. We use a recursive descent disassembler (IDA Pro) and a linear scan disassembler (objdump) [17]. To ensure that all instructions are identified, we added the disassembly validator module. The disassembly validator iterates over every instruction found by either IDA Pro or objdump, and verifies that both the fallthrough and (direct) target instructions are inserted into the instruction database.

1)反汇编引擎:ILR反汇编引擎的目标是找到可能是指令开始的任何字节。 我们使用递归下降反汇编器(IDA Pro)和线性扫描反汇编器(objdump)[17]。 为了确保识别所有指令,我们添加了反汇编验证器模块。 反汇编验证器遍历IDA Pro和objdump找到的每条指令,并验证是否将fallthrough和(直接)目标指令插入到指令数据库中。

Since exact instruction start locations in the executable segment are not known, some of the instructions in the instruction database may not represent instructions that were intended by the program’s original assembly code. We make no attempt to determine which are the intended instructions, and which are not. We simply choose to relocate all of them. Any data address that is mis-identified as a code address will not be executed, therefore the corresponding rewrite rules will simply never be accessed.

由于不知道可执行段中确切的指令开始位置,因此指令数据库中的某些指令可能不代表程序原始汇编代码想要的指令。 我们不会尝试确定哪些是预期的说明,哪些不是。 我们只是选择将它们全部迁移。 任何被误识别为代码地址的数据地址都不会被执行,因此相应的重写规则将永远不会被访问。

One last responsibility of the Disassembly Engine is to record the functions that IDA Pro detects. We record each function as a set of instructions.

拆卸引擎的最后一项职责是记录IDA Pro检测到的功能。 我们将每个功能记录为一组指令。
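The validator's check described in this subsection can be sketched as a pass over the instruction database. This is a minimal sketch under stated assumptions: the `Insn` record and the `missing_successors` function are hypothetical, and the database is modeled as a plain dictionary rather than the Postgres store the prototype uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Insn:
    """Hypothetical record for one disassembled instruction."""
    addr: int
    length: int
    falls_through: bool = True           # False for ret or unconditional jmp
    direct_target: Optional[int] = None  # set for direct jmp/call/jcc


def missing_successors(db: Dict[int, Insn]) -> Set[int]:
    """Addresses named as a fallthrough or direct target but absent from db."""
    missing: Set[int] = set()
    for insn in db.values():
        fallthrough = insn.addr + insn.length
        if insn.falls_through and fallthrough not in db:
            missing.add(fallthrough)
        if insn.direct_target is not None and insn.direct_target not in db:
            missing.add(insn.direct_target)
    return missing


# Two one-byte instructions; the branch target 0x7010 was never disassembled.
db = {
    0x7000: Insn(0x7000, 1, direct_target=0x7010),
    0x7001: Insn(0x7001, 1, falls_through=False),
}
gaps = missing_successors(db)
```

A validator along these lines would presumably disassemble at each reported address, insert the result, and repeat until no gaps remain.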

  2. Indirect Branch Target Analysis: The goal of the indirect branch target analysis phase is to detect any location in the program that might be the target of an Indirect Branch (IB). IBs create a distinct problem for ILR. Indirect Branch Targets (IBTs) may be encoded in the instructions or data of a program, and it is challenging to determine which program bytes represent an IBT and which do not. Since we wish to randomize any arbitrary binary, our technique must tolerate imprecision in detecting which constants are an IBT in the program and which are not. Our solution is to perform a byte-by-byte scan of the program’s data, and further scan the disassembled code to determine any pointer-sized constant which could feasibly be an indirect branch target.

2)间接分支目标分析:间接分支目标分析阶段的目标是检测程序中可能成为间接分支(IB)目标的任何位置。 IB为ILR带来了一个明显的问题。 间接分支目标(IBT)可能被编码在程序的指令或数据中,要确定哪些程序字节代表IBT,哪些字节不代表IBT是一个挑战。 由于我们希望随机化任何二进制文件,因此我们的技术必须容忍不精确,才能检测程序中哪些常量是IBT,哪些不是。 我们的解决方案是对程序数据进行逐字节扫描,然后进一步扫描反汇编的代码,以确定可能是间接分支目标的任何指针大小的常量。

We find that in most programs, this simple heuristic is sufficient (see Section IV-D3 for details). However, when C++ programs use exception handling (try/catch blocks), the compiler uses location-relative addressing to encode IBTs for properly unwinding the stack, and invoking exception handlers. Our technique parses the portions of the ELF file that contain the tables used to drive the unwinding and exception throwing process, and records IBTs appropriately.

我们发现在大多数程序中,这种简单的启发式就足够了(有关详细信息,请参阅第IV-D3节)。 但是,当C ++程序使用异常处理(try / catch块)时,编译器将使用相对位置寻址对IBT进行编码,以正确展开堆栈并调用异常处理程序。 我们的技术分析ELF文件中包含用于驱动展开和异常引发过程的表的部分,并适当记录IBT。

Rewriting the bytes in the program that encode an IBT might induce an error in the program if those bytes are used for something besides jumping to an instruction. To avoid breaking the program when the analysis is wrong, we choose to leave those program bytes unmodified. Unfortunately, not rewriting the IBTs encoded in the program means that the program might jump to the address of an original program (and hence unrandomized) instruction.

如果将这些字节用于跳转指令以外的其他用途,则重写对IBT进行编码的程序中的字节可能会导致程序出错。 为了避免在分析错误时中断程序,我们选择不修改程序字节。 不幸的是,不重写程序中编码的IBT意味着程序可能会跳转到原始程序的地址(因此是非随机的)。

To accommodate indirect branches jumping to unrandomized addresses, each instruction that might be an IBT generates an additional ILR rule in the program. The additional rule uses the redirect form to map the unrandomized address to the new, randomized address. Thus, any indirect branch that targets an unrandomized address, correctly continues execution at the randomized address.

为了适应跳转到非随机地址的间接分支,每个可能是IBT的指令都会在程序中生成一个附加的ILR规则。 附加规则使用重定向形式将非随机地址映射到新的随机地址。 因此,以非随机地址为目标的任何间接分支都可以正确地在随机地址处继续执行。

Unfortunately, attackers may know the unrandomized addresses in a program, and if they can inject a control transfer to one of these addresses, they might be able to successfully perform an attack. The evaluation in Section IV-D3 shows the number of IBTs detected in most programs is very limited, and restricting attacks to only these targets significantly reduces the attack surface.

不幸的是,攻击者可能知道程序中的非随机地址,如果他们可以向这些地址之一注入控制权转移,他们就能够成功进行攻击。 IV-D3节中的评估显示,在大多数程序中检测到的IBT数量非常有限,并且仅将攻击限制在这些目标上会大大减少攻击面。
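The byte-by-byte scan and the extra redirect rules from this subsection can be sketched as follows. The function names and the sample data are hypothetical, and the sketch assumes 32-bit little-endian pointers, as on the x86 prototype; the paper's own matching criteria may be more refined.

```python
import struct


def scan_for_ibts(data: bytes, insn_starts: set) -> set:
    """Scan every byte offset for a 32-bit constant equal to an instruction start."""
    found = set()
    for offset in range(len(data) - 3):
        (value,) = struct.unpack_from("<I", data, offset)
        if value in insn_starts:
            found.add(value)
    return found


def ibt_redirect_rules(ibts: set, new_addr: dict) -> dict:
    """Redirect-form rules: unrandomized IBT address -> randomized address."""
    return {old: new_addr[old] for old in sorted(ibts)}


# A data blob holding an unaligned pointer to 0x7004.
blob = b"\x00" + struct.pack("<I", 0x7004) + b"\xff\xff"
ibts = scan_for_ibts(blob, insn_starts={0x7000, 0x7004})
rules = ibt_redirect_rules(ibts, new_addr={0x7000: 0x5A10, 0x7004: 0x1F3C})
```

Because the original bytes are left unmodified, the redirect rule is what keeps an indirect branch to 0x7004 working after randomization.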

  3. Call Site Analysis: Since unrandomized instructions may allow attacks, we wish to randomize the return address for function calls. The call site analysis phase analyzes the call instructions in a program to determine if the return address can be randomized. Typically, a call instruction stores a return address, and when execution of the function completes, a ret instruction jumps to the address that was stored. Most functions obey these semantics. Unfortunately, call instructions can be used for other purposes, such as obtaining the current program counter when position independent code or data is found in a library. Such a call instruction is often called a thunk. Numerous other uses of return addresses are possible.

3)呼叫站点分析:由于非随机指令可能会发动攻击,因此我们希望将函数调用的返回地址随机化。 呼叫站点分析阶段分析程序中的呼叫指令,以确定返回地址是否可以随机化。 通常,一个调用指令存储一个返回地址,并且在函数执行完成时,一个ret指令跳转到所存储的地址。 大多数功能都遵循这些语义。 不幸的是,调用指令可以用于其他目的,例如在库中找到与位置无关的代码或数据时获取当前程序计数器。 这样的呼叫指令通常被称为thunk。 返回地址还有许多其他用途。

The analysis proceeds as follows. If the call instruction is to a known location that starts a function, we analyze the function further. If the function can be analyzed as having only standard function exits (using the return instruction), having only entrances via the function’s entry instruction, and having no direct accesses to the return value (such as with a mov eax, [ebp+4] instruction), then ILR declares that it is safe to rewrite the call instruction to store a randomized return address.

分析进行如下。 如果调用指令是到启动功能的已知位置的,则我们将进一步分析该功能。 如果可以将函数分析为只有标准函数退出(使用return指令),仅通过函数的entry指令进入而对返回值没有直接访问(例如,使用mov eax,[ebp + 4] 指令),然后ILR声明重写调用指令以存储随机返回地址是安全的。

Our heuristic makes the assumption that indirect memory accesses should not access the return address. While not strictly true for all programs, we find that the heuristic generally holds for programs compiled from high-level languages. One exception to our heuristic is again the C++ exception handling routines that “walk the stack.” The routines use the return address to locate the appropriate unwinding, cleanup, and exception handling codes to invoke. Like with the IBT analysis, we adjust the call site analysis to take into account the exception handling tables, so that call sites with exception handling cannot push a randomized return address.

我们的启发式方法假设间接内存访问不应访问返回地址。 尽管并非严格适用于所有程序,但我们发现启发式方法通常适用于从高级语言编译的程序。 我们的启发式方法之一是C ++异常处理例程,它“遍历了栈”。 例程使用返回地址来定位要调用的适当展开,清除和异常处理代码。 与IBT分析一样,我们调整呼叫站点分析以考虑到异常处理表,以使具有异常处理的呼叫站点无法推送随机返回地址。

Once the analysis is complete, the ILR rules for calls are emitted. If the call site analysis determines that the call can randomize the return address, no additional rules are required, and the call instruction’s location is randomized by simply emitting the standard rewrite rules. If, however, the non-randomized return address must be stored, we have two choices: 1) we could choose to pin the call instruction to its original location, so that the non-randomized return address is stored, or 2) rewrite the call (using ILR rewrite rules) into a sequence of instructions that stores the unrandomized return address and transfers control appropriately. Since pinning instructions leads to a decrease in randomization, we choose the second option. Most machines can efficiently store the return address and perform the control transfer necessary to mimic a call instruction, typically using only 2-3 instructions. For example, on the IA32 instruction set architecture, a call foo instruction can be replaced with two instructions, push ret_addr; jmp foo (where ret_addr stands for the unrandomized return address), resulting in only one extra instruction. This transformation is exactly what is performed by our call site analysis when we detect that a call instruction cannot push a randomized return address. Furthermore, the unrandomized return address is marked as a possible indirect branch target, since we are not sure how the return address will be used.

分析完成后,将发出ILR呼叫规则。如果呼叫站点分析确定呼叫可以随机化返回地址,则不需要其他规则,并且只需发出标准重写规则即可随机化呼叫指令的位置。但是,如果必须存储非随机返回地址,则有两种选择:1)我们可以选择将调用指令固定到其原始位置,以便存储非随机返回地址,或者2)重写调用(使用ILR重写规则)到一系列指令中,这些指令存储了非随机返回地址并适当地转移了控制权。由于固定指令会导致随机性降低,因此我们选择第二个选项。大多数机器可以有效地存储返回地址,并执行模仿呼叫指令所需的控制转移,通常仅使用2-3条指令即可。例如,在IA32指令集体系结构上,调用foo指令可以替换为两个指令,即push ; jmp foo,仅产生一条额外的指令。当我们检测到呼叫指令无法推送随机返回地址时,此转换正是我们的呼叫站点分析所执行的。此外,由于我们不确定如何使用返回地址,因此将未随机化的返回地址标记为可能的间接分支目标。
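The call rewriting decision can be sketched as a small function. The name `rewrite_call`, its parameters, and the textual instruction format are hypothetical; the sketch only shows the shape of the transformation and also returns the addresses that would need to be marked as possible IBTs.

```python
def rewrite_call(call_addr: int, call_len: int, target: str,
                 can_randomize: bool):
    """Return (replacement instructions, addresses to mark as possible IBTs)."""
    if can_randomize:
        # The call stays a call; only its location is randomized.
        return [f"call {target}"], set()
    # Otherwise store the original (unrandomized) return address explicitly.
    ret_addr = call_addr + call_len
    return [f"push {ret_addr:#x}", f"jmp {target}"], {ret_addr}


safe_seq, safe_ibts = rewrite_call(0x7000, 5, "foo", can_randomize=True)
unsafe_seq, unsafe_ibts = rewrite_call(0x7000, 5, "foo", can_randomize=False)
```

Here 0x7005 stands in for the address just past a five-byte call at 0x7000; both numbers are made up for the example.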

  4. Reassembly Engine: After completely analyzing the program’s instructions, IBTs, and call sites, the reassembly engine gets invoked. The reassembly engine’s purpose is to create the rewrite rules necessary to create the randomized program. For each instruction in the database, the engine emits a set of rewrite rules. First, it emits the rules necessary to relocate the instruction. Note that if the instruction has a direct branch target encoded in it (such as a jmp L1), that branch target is rewritten to the randomized address of the branch target. Then, the reassembly engine emits the rule to map the instruction’s fallthrough address to the randomized location for the fallthrough instruction.

4)重组引擎:完全分析了程序的指令,IBT和调用站点后,就会调用重组引擎。 重新组装引擎的目的是创建创建随机程序所必需的重写规则。 对于数据库中的每条指令,引擎都会发出一组重写规则。 首先,它发出重新定位指令所需的规则。 请注意,如果指令中已编码直接分支目标(例如jmp L1),则该分支目标将被重写为分支目标的随机地址。 然后,重组引擎会发出规则,将指令的fallthrough地址映射到fallthrough指令的随机位置。

As a post-processing step, each byte of the original executable text gets an additional rule. If the address of the program text is marked as a possible IBT, the reassembly engine adds a rule to redirect that address to the randomized address for that instruction, effectively pinning the instruction. Any other byte of the executable code segment gets a rule to map its address to a handler that prints an error message and exits in a controlled manner. Thus, any possible arc-injection or ROP attacks must jump to the start of an instruction, and not bytes located within an instruction.

作为后处理步骤,原始可执行文本的每个字节都有一个附加规则。 如果程序文本的地址被标记为可能的IBT,则重组引擎会添加一条规则,以将该地址重定向到该指令的随机地址,从而有效地固定该指令。 可执行代码段的任何其他字节都会获得一条规则,以将其地址映射到处理程序,该处理程序将显示错误消息并以受控方式退出。 因此,任何可能的电弧注入或ROP攻击都必须跳到指令的开头,而不是跳转到指令内的字节。
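The post-processing step can be sketched as one pass over the original text range. The names `text_byte_rules` and `ERROR_HANDLER` are hypothetical, and the handler is represented by a label rather than a real address.

```python
ERROR_HANDLER = "error_handler"  # hypothetical label for the controlled-exit handler


def text_byte_rules(text_start: int, text_len: int, ibts: set, new_addr: dict) -> dict:
    """Give every byte of the original text segment exactly one rule."""
    rules = {}
    for addr in range(text_start, text_start + text_len):
        if addr in ibts:
            # Possible IBT: redirect to the instruction's randomized address.
            rules[addr] = new_addr[addr]
        else:
            # Anything else, including bytes inside an instruction, is trapped.
            rules[addr] = ERROR_HANDLER
    return rules


rules = text_byte_rules(0x7000, 4, ibts={0x7002}, new_addr={0x7002: 0xD27E})
```

With rules like these, a hijacked control transfer into the original text either lands on a recorded IBT or ends in the error handler.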

C. Running an ILR-protected Program

C. 运行一个受ILR保护的程序

To apply the rewrite rules generated by the static analysis steps, ILR uses a specific ILR VM. We believe that a per-process virtual machine (PVM) is the best choice for the ILR VM since it can be easily deployed and has low performance and runtime overheads [11, 18, 19]. Figure 5 shows a typical PVM augmented with ILR extensions. The following paragraphs provide a brief introduction to typical PVM operation, and describe those extensions.

为了应用由静态分析步骤生成的重写规则,ILR使用了特定的ILR VM。 我们认为,对于ILR VM,按进程虚拟机(PVM)是最佳选择,因为它易于部署且性能和运行时开销较低[11,18,19]。 图5显示了具有ILR扩展的典型PVM。 以下各段简要介绍了典型的PVM操作,并描述了这些扩展。

Figure 5. Details of the ILR Virtual Machine.

图5 ILR虚拟机的细节

PVMs dynamically load an application and mediate application execution by examining and translating an application’s instructions before they execute on the host CPU. Most PVMs operate as co-routines with the application that they are protecting. Translated application instructions are held in a PVM-managed cache called a fragment cache. The PVM is first entered by capturing and saving the application context (e.g., program counter (PC), condition codes, registers, etc.). Following context capture, the PVM processes the next application instruction. If a translation for this instruction has been previously cached, the PVM transfers control to the cached translated instructions.

PVM通过在应用程序在主机CPU上执行之前检查和翻译应用程序的指令来动态加载应用程序并介导应用程序的执行。 大多数PVM与它们所保护的应用程序一起作为例程运行。 转换后的应用程序指令保存在称为片段缓存的PVM管理的缓存中。 首先通过捕获和保存应用程序上下文(例如,程序计数器(PC),条件代码,寄存器等)进入PVM。在上下文捕获之后,PVM处理下一条应用程序指令。 如果此指令的翻译先前已被缓存,则PVM会将控制权转移到缓存的翻译指令。

If there is no cached translation for the next application instruction, the PVM allocates storage in the fragment cache for a new fragment of translated instructions. The PVM then populates the fragment by fetching, decoding, and translating application instructions one-by-one until an end-of-fragment condition is met. As the application executes under the PVM’s control, more and more of the application’s working set of instructions materialize in the fragment cache.

如果下一条应用程序指令没有缓存的翻译,则PVM在片段缓存中为新的翻译指令片段分配存储空间。 然后,PVM通过逐个获取,解码和翻译应用程序指令填充片段,直到满足片段结束条件。 随着应用程序在PVM的控制下执行,越来越多的应用程序工作指令集在片段缓存中得以实现。

Implementation of ILR within a PVM requires several simple extensions to a typical PVM. First, we must modify the PVM startup code to read the ILR rewrite rules (not pictured). Next, we need to override the PVM’s instruction fetching mechanism to first check, then read from ILR rewrite rules as appropriate. Lastly, we need to modify the next-PC operation to obey the fallthrough map that ILR provides in the rewrite rules.

在PVM中实现ILR需要对典型PVM进行几个简单扩展。 首先,我们必须修改PVM启动代码以读取ILR重写规则(未显示)。 接下来,我们需要重写PVM的指令获取机制以进行首先检查,然后根据需要从ILR重写规则中读取。 最后,我们需要修改next-PC操作以遵守ILR在重写规则中提供的穿透映射。
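The fragment cache and the two ILR hooks (fetch and next-PC) can be sketched together. `ToyPVM` is a hypothetical class, not the PVM used in the paper: instructions are plain strings, "translation" is just copying them into a list, and the end-of-fragment condition (a `ret` or a missing successor) is an assumption made for the example.

```python
class ToyPVM:
    """Toy fragment-cache model with ILR-aware fetch and next-PC steps."""

    def __init__(self, instructions: dict, fallthrough: dict):
        self.instructions = instructions  # instruction-definition rules
        self.fallthrough = fallthrough    # redirect rules
        self.fragment_cache = {}
        self.translations = 0

    def _build_fragment(self, pc: int) -> list:
        fragment = []
        while True:
            insn = self.instructions[pc]      # fetch via the rewrite rules
            fragment.append(insn)
            if insn == "ret" or pc not in self.fallthrough:
                break                         # assumed end-of-fragment condition
            pc = self.fallthrough[pc]         # next PC via the fallthrough map
        self.translations += 1
        return fragment

    def fragment_for(self, pc: int) -> list:
        """Return the cached fragment for pc, translating it on a miss."""
        if pc not in self.fragment_cache:
            self.fragment_cache[pc] = self._build_fragment(pc)
        return self.fragment_cache[pc]


pvm = ToyPVM(
    {0x5A10: "cmp eax, #24", 0x1F3C: "add eax, #1", 0x9E02: "ret"},
    {0x5A10: 0x1F3C, 0x1F3C: 0x9E02},
)
first = pvm.fragment_for(0x5A10)
second = pvm.fragment_for(0x5A10)  # served from the cache
```

The second lookup does not translate again, which is where a PVM recovers most of the cost of consulting the rules.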

One further extension is necessary for security. The PVM must take steps to protect itself and its code cache from being compromised by a program that an attacker is attempting to control. Since the PVM typically shares an address space with the program, the PVM must take care not to allow the program to attempt to jump into the PVM’s code. Further, the PVM should prevent the randomized instruction addresses from being leaked to the user. Such protections can be accomplished by making the PVM’s code and data inaccessible via standard memory protection mechanisms whenever the untrusted application code is executing. We discuss the technical details of one mechanism in Section V-A.

为了安全性,还需要进一步扩展。 PVM必须采取措施保护自己及其代码缓存,使其免受攻击者试图控制的程序的破坏。 由于PVM通常与程序共享地址空间,因此PVM必须注意不要让程序尝试跳入PVM的代码。 此外,PVM应防止将随机指令地址泄漏给用户。 只要执行不受信任的应用程序代码,就可以通过标准的内存保护机制使PVM的代码和数据不可访问,从而实现这种保护。 我们将在第V-A节中讨论一种机制的技术细节。

4. EVALUATION

4. 评估

A. Prototype Implementation

A.原型实现

Our development and evaluation system was based on Linux kernel version 2.6.32-34-generic, as part of an Ubuntu 10.04.03 LTS release configured with gcc 4.4.3. We used IDA Pro version 6.1 and objdump version 2.20.1-system.20100303 [17].

我们的开发和评估系统基于Linux内核版本2.6.32-34-generic,它是Ubuntu 10.04.03 LTS发行版的一部分,该发行版配置了gcc 4.4.3。 我们使用了IDA Pro 6.1版和objdump 2.20.1-system.20100303版[17]。

As the static analyzer components need to store instructions, functions, and indirect branch targets, we used a Postgres database. This choice turned out to be wise considering some programs we evaluated contained almost half a million instructions. Each instruction is marked as being part of a function, and with whether it has been detected as a possible indirect branch target.

由于静态分析器组件需要存储指令、函数和间接分支目标,因此我们使用了Postgres数据库。 考虑到我们评估的某些程序包含将近五十万条指令,这种选择是明智的。 每条指令都被标记为属于某个函数,并标记其是否已被检测为可能的间接分支目标。

We implemented the disassembly validator, call site analysis, indirect branch target analysis and reassembly engine to access the database and deposit their information back to the database. This modular design turned out to be useful for implementing, debugging, and deploying the system.

我们实现了反汇编验证器、调用点分析、间接分支目标分析和重汇编引擎,它们访问数据库并将各自的信息存回数据库。 事实证明,这种模块化设计对于实现、调试和部署系统很有用。

Our implementation of the reassembly engine is split into two phases. The first phase reads the database and emits a symbolic, relocatable, assembly version of the ILR rewrite rules to a file on the file system. The second phase performs the randomization, and binds the assembly version of instructions to a machine code form. Splitting the tool into two portions aids in re-randomization (as the full database of instructions is no longer necessary) and run time (as the database need not be accessed at runtime).

我们的重汇编引擎的实现分为两个阶段。 第一阶段读取数据库,并将ILR重写规则的符号化、可重定位的汇编版本输出到文件系统上的一个文件中。 第二阶段执行随机化,并将指令的汇编版本绑定为机器码形式。 将工具分为两部分有助于重新随机化(因为不再需要完整的指令数据库),也有利于运行时(因为运行时无需访问数据库)。
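
The second phase can be pictured with a short sketch. Everything here is hypothetical: the `SymbolicRule` tuple, the label names, and the use of Python's `random` module (a real tool would want a cryptographically strong source such as `random.SystemRandom`). The sketch also ignores instruction lengths, so it only guarantees distinct start addresses.

```python
import random
from typing import Dict, List, Optional, Tuple

# Assumed phase-one output: (label, assembly text, label of fallthrough successor).
SymbolicRule = Tuple[str, str, Optional[str]]


def randomize(rules: List[SymbolicRule], seed: Optional[int] = None,
              address_bits: int = 31):
    """Bind every symbolic label to a distinct random address."""
    rng = random.Random(seed)
    addresses = rng.sample(range(1 << address_bits), len(rules))
    placement: Dict[str, int] = {
        label: addr for (label, _, _), addr in zip(rules, addresses)
    }
    # Address -> instruction text, standing in for the assembled machine code.
    insn_map = {placement[label]: asm for label, asm, _ in rules}
    # Address -> address of the logical successor.
    fallthrough = {
        placement[label]: placement[nxt]
        for label, _, nxt in rules if nxt is not None
    }
    return placement, insn_map, fallthrough
```

Because only this step draws random addresses, re-randomizing amounts to calling `randomize` again on the same symbolic rules with fresh randomness.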

Our ILR VM is based on Strata [11]. The modifications for ILR required only about 1K lines of code. While our prototype implementation is based on the tools and operating system above, we believe our techniques are general, and can be easily applied to any hardware, operating system, PVM, or executable format.

我们的ILR VM基于Strata [11]。 对ILR的修改仅需要大约1K行代码。 尽管我们的原型实现基于上述工具和操作系统,但我们认为我们的技术是通用的,可以轻松应用于任何硬件,操作系统,PVM或可执行格式。

B. Experimental Setup

B.实验设置

We evaluated the effectiveness and performance of the ILR prototype using the SPEC CPU2006 benchmark suite [16]. These benchmarks are state-of-the-art, industry-standardized benchmarks designed to stress a system’s processor, memory, and compiler. The benchmarks are provided as source, and we compiled them with gcc, g++, or gfortran (as dictated by the program’s source code) version 4.4.3 before applying our ILR technique. The benchmarks are compiled at optimization level -O2, and use static linking. We used static linking to thoroughly demonstrate the effectiveness of our system at randomizing large bodies of code, and to fully test the system using all the odd, compiler-specific, language-specific, hand-coded, or otherwise abnormal code that is often found in libraries. Furthermore, having all the code packaged into one executable increases the attack surface, making it easier to locate an ROP gadget. Thus, we believe our evaluation is a worst-case analysis for these benchmarks.

我们使用SPEC CPU2006基准套件[16]评估了ILR原型的有效性和性能。这些基准是最新的行业标准基准,旨在对系统的处理器、内存和编译器施加压力。这些基准以源代码形式提供,在应用ILR技术之前,我们使用4.4.3版的gcc、g++或gfortran(取决于程序的源代码)对其进行了编译。这些基准在优化级别-O2下编译,并使用静态链接。我们使用静态链接,是为了充分证明我们的系统在随机化大量代码时的有效性,并利用库中常见的各种古怪的、特定于编译器的、特定于语言的、手工编写的或其他不寻常的代码来全面测试系统。此外,将所有代码打包到一个可执行文件中会增大攻击面,使攻击者更容易找到ROP小工具。因此,我们认为我们的评估是针对这些基准的最坏情况分析。

We ran our experiments on a system with a 4-core AMD Phenom II B55 processor running at 3.2 GHz. The machine has 512KB of L1 cache, 2MB of L2 cache, 6MB of L3 cache, and 4GB of main memory. Performance numbers are gathered by averaging 3 runs of each benchmark. Unless otherwise noted, the performance of a protected binary is reported by normalizing its run time to the run time of the corresponding original binary produced by the compiler.

我们在运行于3.2 GHz的4核AMD Phenom II B55处理器的系统上运行实验。 该机器具有512KB的L1缓存,2MB的L2缓存,6MB的L3缓存和4GB的主内存。 通过平均每个基准测试3次运行来收集性能数字。 除非另有说明,否则通过将受保护二进制文件的运行时间标准化为编译器生成的相应原始二进制文件的运行时间来报告其性能。

C. Security-Related Experiments

C.与安全性相关的实验

To verify that our technique stops attacks that are successful against ASLR and W⊕X protected systems, we performed a number of tests on vulnerable programs. For each test, ASLR and W⊕X were enabled.

为了验证我们的技术是否能够阻止针对ASLR和W⊕X受保护系统的攻击,我们对易受攻击的程序进行了许多测试。 对于每个测试,都启用了ASLR和W⊕X。

In the first test, we used a small program (44 lines of code) that had a simple stack-based buffer overflow. The program assigns grades to students based on the program’s input, the student’s name. A malicious input can cause a buffer overflow enabling an attack.

在第一个测试中,我们使用了一个小程序(44行代码),该程序具有基于堆栈的简单缓冲区溢出。 程序会根据程序的输入(学生的姓名)为学生分配成绩。 恶意输入可能导致缓冲区溢出,从而引发攻击。

We created a simple arc-injection attack which causes the program to print out a grade of B when the student should receive a D. It was trivial to perform the arc-injection. ASLR was ineffective because no randomized addresses were used—only the unrandomized addresses in the main program. Similarly, W⊕X was ineffective because the attack only relied on instructions that were already part of the program. We also used a tool called ROPgadget [20] to craft an ROP attack that causes the program to start a shell which can execute an arbitrary command. Again, ASLR and W⊕X were ineffective. ILR, however, thwarted the attack.

我们构造了一个简单的arc注入(arc-injection)攻击,使程序在学生本应得到D时打印出成绩B。 实施这种arc注入非常容易。 ASLR之所以无效,是因为攻击没有用到任何随机化的地址,只用到了主程序中未随机化的地址。 同样,W⊕X也无效,因为攻击仅依赖程序中已有的指令。 我们还使用名为ROPgadget [20]的工具构造了一个ROP攻击,使程序启动一个可以执行任意命令的shell。 同样,ASLR和W⊕X都无效。 但是,ILR阻止了这次攻击。

We next verified our technique against a vulnerability in a realistic program: a Linux PDF viewer, xpdf. We seeded a vulnerability in the input processing routines. An appropriately long input can trigger a stack overflow. In this case, we were able to use ROPgadget to craft an attack to create a shell. ILR was again able to prevent the attack.

接下来,我们针对一个实际程序中的漏洞验证了我们的技术:Linux PDF查看器xpdf。 我们在其输入处理例程中植入了一个漏洞。 足够长的输入会触发栈溢出。 在这种情况下,我们能够使用ROPgadget构造攻击来创建一个shell。 ILR再次阻止了攻击。

Lastly, we used version 9.3.0 of Adobe’s PDF viewer, acroread, that we downloaded from Adobe’s website in binary form. The program has a well-documented vulnerability when parsing image files (see CVE-2006-3459) that allows arc-injection and ROP attacks [21]. Again, we used ROPgadget to craft an ROP attack payload for this vulnerability to start a shell program. Because exploiting the vulnerability is more complicated, it took additional effort to adapt the attack. Using information from Security Focus’s website, we were able to create a malicious PDF file that effected the ROP attack [21]. ILR successfully processes and randomizes the 24MB executable, and thwarts the attack.

最后,我们使用了Adobe PDF查看器acroread的9.3.0版,该程序是从Adobe网站以二进制形式下载的。 该程序在解析图像文件时存在一个有详细记录的漏洞(见CVE-2006-3459),该漏洞允许arc注入和ROP攻击[21]。 我们再次使用ROPgadget为该漏洞构造ROP攻击载荷,以启动一个shell程序。 由于利用此漏洞更为复杂,因此调整攻击花费了额外的精力。 利用Security Focus网站上的信息,我们得以创建一个实现该ROP攻击的恶意PDF文件[21]。 ILR成功处理并随机化了这个24MB的可执行文件,并阻止了攻击。

Section IV-E discusses ILR’s effect on the use of such tools as ROPgadget, and Section V-B describes how randomized addresses needed for the attack are protected from exfiltration by the ILR VM. Consequently, we believe attacks using programs such as ROPgadget are not possible with ILR.

第IV-E节讨论了ILR对ROPgadget这类工具的使用所产生的影响,第V-B节描述了ILR VM如何保护攻击所需的随机化地址不被泄露。 因此,我们认为在ILR下无法使用ROPgadget之类的程序实施攻击。

D. Effectiveness of ILR Components

D. ILR组件的有效性

  1. Disassembly Engine: The goal of the Disassembly Engine is to locate any instruction which might be executed, so that the instruction can be relocated later. For our benchmarks, we found that the disassembly engine successfully located 100% of the executed instructions for all benchmarks. The Disassembly Engine has met its first goal. We omit further discussion on disassembly as such techniques are well studied [22–24].

1)反汇编引擎:反汇编引擎的目标是找到任何可能被执行的指令,以便之后可以重定位该指令。 对于我们的基准,我们发现反汇编引擎成功定位了所有基准中100%的已执行指令。 反汇编引擎达到了它的第一个目标。 由于此类技术已有深入研究,我们不再进一步讨论反汇编[22-24]。

A secondary goal of the Disassembly Engine is to introduce few conflicting facts about instruction locations into the database. We measured the fraction of bytes in the executable segments that belonged to more than one instruction. On average, only 0.005% of bytes were represented as part of more than one instruction with the worst-case having only 0.012% of bytes in conflict. Thus, we believe that the disassembly engine has met its second goal.

反汇编引擎的第二个目标是尽量少向数据库引入关于指令位置的相互冲突的事实。 我们测量了可执行段中属于多条指令的字节所占的比例。 平均而言,只有0.005%的字节被表示为多条指令的一部分,最坏情况下也只有0.012%的字节存在冲突。 因此,我们认为反汇编引擎已达到其第二个目标。
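
One plausible way to compute this conflict metric is sketched below. The input format, `(start_offset, length)` pairs gathered from all disassemblers, is an assumption; the paper does not describe how the measurement was implemented.

```python
from collections import Counter
from typing import Iterable, Tuple


def conflict_fraction(instructions: Iterable[Tuple[int, int]],
                      segment_size: int) -> float:
    """Fraction of segment bytes claimed by more than one distinct instruction.

    `instructions` holds hypothetical (start_offset, length) records; identical
    records reported by several disassemblers are counted once.
    """
    coverage = Counter()
    for start, length in set(instructions):
        for offset in range(start, start + length):
            coverage[offset] += 1
    conflicted = sum(1 for count in coverage.values() if count > 1)
    return conflicted / segment_size
```

For instance, if one disassembler reports a 2-byte instruction at offset 3 and another reports a 2-byte instruction at offset 4, the byte at offset 4 belongs to both and counts as conflicted.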

  2. Call Site Analysis: Figure 6 shows the percentage of call sites marked as safe to randomize their return addresses. The first bar shows that our technique works well for some benchmarks. 403.gcc, for example, has 91% of the return addresses randomized while 416.gamess reaches 97%. Other benchmarks do not perform as well; 447.dealII and 483.xalancbmk only manage to identify 5% and 3% of return addresses as randomizable. The C++ benchmarks (447.dealII, 450.soplex, 453.povray, 470.lbm, and 471.omnetpp) do especially poorly. Only 10% of calls can use a randomized return address.

2)调用点分析:图6显示了被标记为可以安全随机化其返回地址的调用点的百分比。 第一个柱显示我们的技术在某些基准上效果很好。 例如,403.gcc有91%的返回地址被随机化,而416.gamess达到了97%。 其他基准的效果则没有这么好:447.dealII和483.xalancbmk分别只有5%和3%的返回地址被识别为可随机化。 C++基准(447.dealII,450.soplex,453.povray,470.lbm和471.omnetpp)的表现尤其差,只有10%的调用可以使用随机化返回地址。

Figure 6. Percent of call instructions which ILR and ILR+’s call site analysis deemed safe for using a randomized return address. On average, only 58% of call instructions were identified as safe to use a randomized return address.

图6. ILR和ILR+的调用点分析认为可以安全使用随机化返回地址的调用指令的百分比。 平均而言,只有58%的调用指令被认定可以安全使用随机化返回地址。

To understand why the call site analysis phase was less effective on some benchmarks, we examined the reasons that the call site analysis indicated that a randomized return address could not be used. Figure 7 shows the results as a fraction of all call instructions. We find that indirect calls (which cannot use a randomized return address because our analysis does not attempt to determine possible targets) account for a small fraction of unrandomized return addresses, 5% of calls on average. Possible non-standard uses of the return address, such as thunks, account for only 7.6% of return addresses. Interestingly, we find that direct call instructions to targets that we were not able to include in our disassembly make up 1.2% of the total call instructions. Closer inspection indicates that the compiler is actually emitting a call 0x0 instruction in many library functions. If this type of call instruction were to ever execute, it would cause a fault in the program, but the call instruction is (dynamically) unreachable code. The compiler cannot detect this fact, and so cannot eliminate the call. A minor improvement would randomize the return address for this type of call, knowing that the return address cannot be used if the call instruction causes a fault. Together, these causes represent only 21% of all unrandomized call instructions.

为了了解调用点分析阶段在某些基准上效果较差的原因,我们研究了调用点分析判定不能使用随机化返回地址的各种原因。图7以占所有调用指令的比例给出了结果。我们发现,间接调用(由于我们的分析不尝试确定其可能的目标,因此不能使用随机化返回地址)只造成了一小部分未随机化的返回地址,平均占调用的5%。对返回地址可能的非标准用法(例如thunk)仅占返回地址的7.6%。有趣的是,我们发现,目标无法纳入反汇编结果的直接调用指令占全部调用指令的1.2%。进一步检查表明,编译器实际上在许多库函数中生成了call 0x0指令。这种调用指令一旦执行就会导致程序出错,但它是(动态)不可达的代码。编译器无法检测到这一事实,因此无法消除该调用。一个小的改进是对这类调用的返回地址进行随机化,因为如果调用指令引发故障,返回地址就不会被使用。这些原因加在一起仅占所有未随机化调用指令的21%。

Figure 7. Breakdown of call instructions marked as unsafe for using a randomized return address. C++’s exception handling mechanism results in a severe reduction in return address randomization.

图7. 被标记为不能安全使用随机化返回地址的调用指令的细分。 C++的异常处理机制导致返回地址随机化的程度大幅下降。

The top bar in the figure shows the real cause of the poor performance, especially in C++ programs. More than 32% of call instructions are marked as not being able to randomize the return address because of the exception handling tables used in the ELF file. In the C++ programs, this number jumps up to an average of 79%! In C++ programs, the compiler typically cannot calculate, when a function f makes a call, whether the called function will throw an exception and need to clean up f’s stack. Consequently, the C++ compiler emits cleanup code into f, and adds to the .eh_frame and .gcc_except_table ELF sections to drive the exception handling routines. Because most functions with a call site fit this form, most call instructions cannot have a randomized return address.

图中最上面的柱段显示了效果不佳的真正原因,尤其是在C++程序中。 由于ELF文件中使用了异常处理表,超过32%的调用指令被标记为无法随机化返回地址。 在C++程序中,这个数字跃升至平均79%! 在C++程序中,当函数f进行调用时,编译器通常无法判断被调用函数是否会抛出异常并需要清理f的栈。 因此,C++编译器会在f中生成清理代码,并向.eh_frame和.gcc_except_table这两个ELF节添加内容,以驱动异常处理例程。 因为大多数含有调用点的函数都符合这种形式,所以大多数调用指令不能使用随机化的返回地址。

It is interesting that even the C and Fortran benchmarks use the exception handling table. The C/Fortran benchmarks application code does not seem to directly add to these tables. Instead, the table entries come from library routines that are compiled to work with C++ source.

有趣的是,即使C和Fortran基准测试也使用异常处理表。 C / Fortran基准测试应用程序代码似乎没有直接添加到这些表中。 相反,表条目来自库例程,该例程已编译为与C ++源一起使用。

We believe that modifying the ILR toolchain to edit the exception handling tables to reflect the randomization would be feasible. The tables are in a fixed, known format and can easily be rewritten with randomized addresses. Other solutions are possible as well. For example, detecting if C++ exception handling is actually used in the program or a portion of the program would allow return address randomization to be selectively applied. While fully exploring this idea is beyond the scope of this paper, we were able to modify our ILR toolchain to ignore the exception handling tables when calculating safe calls. We term the ILR toolchain with this modification ILR+. ILR+ represents a very close approximation to a system that could easily be achieved by rewriting the exception handling tables in a binary.

我们认为,修改ILR工具链来编辑异常处理表以反映随机化是可行的。 这些表采用固定的已知格式,可以很容易地用随机化地址重写。 也可以采用其他解决方案。 例如,检测程序或程序的一部分是否实际使用了C++异常处理,就可以有选择地应用返回地址随机化。 尽管全面探讨这一想法超出了本文的范围,但我们修改了ILR工具链,使其在计算安全调用时忽略异常处理表。 我们将经过这一修改的ILR工具链称为ILR+。 ILR+非常接近于一个通过重写二进制文件中的异常处理表即可轻松实现的系统。

With ILR+, the call site analysis performs well across all benchmarks. As Figure 6 shows, 93% of all calls are marked as using a randomized return address.

使用ILR+,调用点分析在所有基准测试中均表现良好。 如图6所示,93%的调用被标记为使用随机化返回地址。

  3. Indirect Branch Target Analysis: We continue our evaluation of ILR by measuring the effectiveness of the analysis of indirect branch targets (including return addresses). Figure 8 shows the fraction of instructions detected as possible indirect branch targets. On average, only 2.2% and 0.60% of the instructions are marked as indirect branch targets for ILR and ILR+, respectively. Consequently, we believe our scheme for detecting possible IBTs is not too aggressive in marking instructions as possible indirect branch targets.

3)间接分支目标分析:我们通过衡量间接分支目标(包括返回地址)分析的有效性来继续评估ILR。 图8显示了被检测为可能的间接分支目标的指令所占的比例。 平均而言,对于ILR和ILR+,分别只有2.2%和0.60%的指令被标记为间接分支目标。 因此,我们认为我们检测可能的IBT的方案在将指令标记为可能的间接分支目标方面并不过于激进。

  4. Moved Instructions: Because we emit rewrites for every byte of the executable segment, technically all instructions are moved. However, IBTs get a rule that maps the unrandomized address to the relocated instruction. Despite technically being moved, we consider this an unmoved (or pinned) instruction because if an attacker were to inject an arc or locate an ROP gadget at the unrandomized address, they could still exploit that information in the randomized program.

4)移动的指令:因为我们为可执行段的每个字节都生成了重写规则,所以从技术上讲所有指令都被移动了。 但是,IBT会得到一条将未随机化地址映射到重定位后指令的规则。 尽管从技术上讲它已被移动,但我们仍将其视为未移动的(或固定的)指令,因为如果攻击者在该未随机化地址处注入一条arc或定位到一个ROP小工具,他们仍然可以在随机化后的程序中利用这一信息。

Figure 8. Percent of instructions marked as possible indirect branch targets. Only 2.2% and 0.60% of instructions are marked on average for the two techniques, indicating that ILR’s IBT analysis is effective.

图8.标记为可能的间接分支目标的指令百分比。 两种技术平均仅标记2.2%和0.60%的指令,这表明ILR的IBT分析是有效的。

Figure 9 shows the percentage of instructions moved for our benchmarks. The first bar shows the effectiveness of ILR without call site analysis; approximately 95.0% of instructions were successfully and safely located at randomized addresses. The second bar shows call site analysis for standard ILR; 97.4% of instructions are moved. The last bar shows the results for ILR+, almost all instructions (99.1%) are assigned to a randomized location in memory. This randomization represents a two order of magnitude reduction in the attack surface for arc-injection and ROP attacks.

图9显示了我们的基准中被移动的指令的百分比。 第一个柱显示了不使用调用点分析时ILR的效果:大约95.0%的指令被成功且安全地放置在随机化地址上。 第二个柱显示了使用调用点分析的标准ILR:97.4%的指令被移动。 最后一个柱显示了ILR+的结果,几乎所有指令(99.1%)都被分配到内存中的随机位置。 这种随机化使arc注入和ROP攻击的攻击面减少了两个数量级。

Figure 9. Percent of instructions moved using ILR. Results demonstrate that ILR can randomize the location of almost all instructions within an arbitrary binary program.

图9.使用ILR移动的指令百分比。 结果表明,ILR可以随机化任意二进制程序中几乎所有指令的位置。

E. ILR Security

E. ILR安全

To assess the security of ILR, we first note that up to 99.7% of the instructions can be randomized. Furthermore, all of the executable bytes of a program that do not make up a compiler-intended instruction sequence are marked as invalid for execution. These features of ILR reduce the attack surface for arc-injection by over two orders of magnitude. We believe it would be very difficult for an attacker to inject even one control-flow arc that achieves a meaningful result.

为了评估ILR的安全性,我们首先注意到最多可以将99.7%的指令随机化。 此外,程序中所有不构成编译器预期指令序列的可执行字节都被标记为不可执行。 ILR的这些特性使arc注入的攻击面减少了两个数量级以上。 我们认为,攻击者即使只想注入一条能达到有意义结果的控制流arc,也将非常困难。

However, it has recently been shown that even small programs (with at least 20KB of program text) contain enough executable bytes to successfully produce an ROP attack [25]. The basic ILR algorithm reduces the unrandomized program text to less than 20KB for 26 of the 29 SPEC2006 benchmarks, while ILR+ reduces the attack surface to below 20KB for 28 of 29 benchmarks. On average, ILR+ reduces the attack surface to just 3KB! Thus, even state-of-the-art gadget compilers likely cannot detect enough gadgets to mount an ROP attack in an ILR+-protected program.

但是,最近发现,即使是小型程序(具有至少20KB的程序文本)也包含足够的可执行字节,可以成功地产生ROP攻击[25]。 对于29个SPEC2006基准测试中的26个基准,基本的ILR算法将未随机化的程序文本减小到小于20KB,而对于29个基准中的28个基准测试,ILR +将攻击面减小到20KB以下。 平均而言,ILR +将攻击面减少到只有3KB! 因此,即使是最先进的小工具编译器也可能无法检测到足够的小工具来在受ILR +保护的程序中发起ROP攻击。

To more directly validate that ILR successfully randomizes enough gadget locations to make ROP attacks infeasible, we further examine the SPEC benchmarks. While we know of no vulnerabilities in these benchmarks, they, like all large pieces of software, may in fact have an error that might allow an ROP attack. We study the feasibility of such an attack on these large applications if an appropriate vulnerability were to be found or seeded.

为了更直接地验证ILR成功地随机化了足够的小工具位置以使ROP攻击不可行,我们进一步检查了SPEC基准。 尽管我们不知道这些基准测试中存在漏洞,但它们像所有大型软件一样,实际上可能存在可能导致ROP攻击的错误。 如果发现或植入适当的漏洞,我们将研究对这些大型应用程序进行此类攻击的可行性。

To search for gadgets in these benchmarks, we use a tool available online, ROPgadget [20]. The tool contains a database of gadget patterns and scans binary programs to identify specific gadgets within an executable. For example, one of the gadget patterns is mov e?x, e?x;ret, which identifies gadgets that move one register to another. We experiment with two versions of the tool, version 2.3 and 3.1. Version 2.3’s database contains 60 gadget patterns, while version 3.1 has significantly more: 185 gadget patterns. Version 3.1 also contains a simple gadget compiler that matches gadgets with an attack template to form a complete attack payload. While these payloads do not automatically exploit a vulnerability in a program, they represent a significant portion of the attack. Converting an attack payload into an actual attack is dependent on the exact vulnerability, and is not automated. However, if ROPgadget cannot assemble the attack payload from the attack template, this failure indicates that the templated ROP attack could not proceed, even with a suitable vulnerability. ROPgadget 3.1 comes with two simple attack templates.

要在这些基准测试中搜索小工具,我们使用在线提供的工具ROPgadget [20]。该工具包含一个小工具模式数据库,并扫描二进制程序以识别可执行文件中的特定小工具。例如,小工具模式之一是mov e?x,e?x; ret,它标识将一个寄存器移动到另一个寄存器的小工具。我们尝试使用该工具的两个版本,版本2.3和3.1。版本2.3的数据库包含60个小工具模式,而版本3.1则具有更多:185个小工具模式。 3.1版还包含一个简单的小工具编译器,该小工具编译器将小工具与攻击模板进行匹配,以形成完整的攻击有效负载。尽管这些有效负载不会自动利用程序中的漏洞,但它们代表了攻击的重要部分。将攻击有效载荷转换为实际攻击取决于确切的漏洞,并且不是自动的。但是,如果ROPgadget无法从攻击模板中组合攻击有效载荷,则此失败表明即使有适当的漏洞,模板化的ROP攻击也无法进行。 ROPgadget 3.1带有两个简单的攻击模板。

For the experiment, we modified both versions of ROPgadget to ignore randomized addresses, so that the tool can only locate gadgets at the unrandomized code addresses. This modification mimics an attacker’s abilities via a remote attack. Figure 10 shows how ILR affects an attacker’s ability to mount an ROP attack. The first bar shows the percentage of unique gadgets that have been moved by ILR. We count unique gadgets because typically an attacker could re-use a gadget if needed, and any particular instance of a gadget is likely sufficient to mount an attack which used that gadget. Over 94% are moved on average, with 483.xalancbmk being the worst performing at only 87%. The second bar shows the results for ROPgadget version 3.1. Even more of the gadgets appear to be hidden; over 90% in all cases, and 96% on average. What the figure does not show, however, is that version 3.1 located slightly more gadgets in the ILR-protected version, but found many more gadgets in the unprotected version, thus the overall ratio has improved, indicating that ILR is effective at hiding most gadgets in a program, even in the face of a better gadget identification framework. This result is quantified in the last bar of the figure where we count not unique gadgets, but all gadgets (including duplicates). On average, 99.96% of the total gadgets have had their location randomized.

在实验中,我们修改了这两个版本的ROPgadget,使其忽略随机化地址,这样该工具只能在未随机化的代码地址处找到小工具。 这一修改模拟了攻击者在远程攻击中的能力。 图10显示了ILR如何影响攻击者发起ROP攻击的能力。 第一个柱显示了被ILR移动的唯一小工具的百分比。 我们统计唯一小工具,是因为攻击者通常可以在需要时重复使用某个小工具,而且小工具的任意一个实例通常就足以发起用到该小工具的攻击。 平均有超过94%的小工具被移动,其中表现最差的483.xalancbmk仅为87%。 第二个柱显示了ROPgadget 3.1版的结果。 被隐藏的小工具似乎更多:在所有情况下都超过90%,平均为96%。 不过,图中没有显示的是,3.1版在受ILR保护的程序中找到的小工具略多一些,但在未受保护的程序中找到的小工具多得多,因此总体比例有所提高,这表明即使面对更好的小工具识别框架,ILR也能有效隐藏程序中的大多数小工具。 图中的最后一个柱量化了这一结果,其中我们统计的不是唯一小工具,而是所有小工具(包括重复项)。 平均而言,全部小工具中有99.96%的位置被随机化。
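
The two counting methods (unique gadgets versus all gadgets) can be illustrated with a small sketch. The `(address, text)` gadget records and the set of pinned addresses are hypothetical inputs; this is not how the modified ROPgadget tool is implemented.

```python
from typing import Iterable, Set, Tuple

# Hypothetical gadget record: (start address, instruction text).
Gadget = Tuple[int, str]


def hidden_percentages(gadgets: Iterable[Gadget],
                       pinned_addresses: Set[int]) -> Tuple[float, float]:
    """Return (percent of unique gadgets hidden, percent of all gadgets hidden).

    A gadget stays visible to a remote attacker only if it starts at an
    unrandomized (pinned) address.
    """
    gadgets = list(gadgets)
    remaining = [g for g in gadgets if g[0] in pinned_addresses]
    unique_all = {text for _, text in gadgets}
    unique_remaining = {text for _, text in remaining}
    pct_unique = 100.0 * (1 - len(unique_remaining) / len(unique_all))
    pct_total = 100.0 * (1 - len(remaining) / len(gadgets))
    return pct_unique, pct_total
```

Counting by unique text is the stricter view for the defender: a gadget type counts as hidden only when every one of its instances has been moved.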

Figure 10. Reduction of number of gadgets found using ILR. Almost all gadgets are successfully randomized, and consequently unavailable for use in an attack.

图10.减少使用ILR发现的小工具的数量。 几乎所有小工具均已成功随机化,因此无法用于攻击。

On average, only 2.48 gadgets remain in the program. The worst performing benchmark, 483.xalancbmk, has 6 unique gadgets, versus 67 for the unprotected program. Six gadgets is not enough to mount an attack in most cases. Even the two very simple attack templates included with ROPgadget require 8 and 9 gadgets. We note that on an unprotected application, the gadget compiler can successfully generate an attack payload for every program. In fact, both attacks are automatically detected as possible on 9 of the benchmarks. On the protected program, no attack payloads are ever successfully generated.

平均而言,该程序中仅保留2.48个小工具。 性能最差的基准483.xalancbmk有6个独特的小工具,而不受保护的程序有67个。 在大多数情况下,六个小工具不足以发起攻击。 甚至ROPgadget附带的两个非常简单的攻击模板也需要8和9个小工具。 我们注意到,在不受保护的应用程序上,小工具编译器可以为每个程序成功生成攻击有效载荷。 实际上,会在9个基准上自动检测到这两种攻击。 在受保护的程序上,没有成功生成攻击有效载荷。

With ILR+ (results not shown) the probability of mounting an attack is further reduced. Most ILR+ protected applications have only one gadget (21 of 29 benchmarks). In every case, this lone gadget is an int 0x80 sequence. Used alone, this gadget cannot mount an attack. On average, only 1.5 gadgets remain available with ILR+.

使用ILR +(结果未显示),发动攻击的可能性进一步降低。 大多数受ILR +保护的应用程序只有一个小工具(29个基准测试中的21个)。 在任何情况下,这个单独的小工具都是一个int 0x80序列。 单独使用时,此小工具无法发起攻击。 平均而言,ILR +仅可使用1.5个小工具。

F. Performance Metrics

F.性能指标

  1. Run-time Overhead: Figure 11 shows the performance overhead of the base VM (Strata), as well as the overhead of ILR and ILR+. We see that Strata adds much of the overhead for the applications, and applying the randomization costs little additional overhead. On average, Strata adds only 8% overhead, with an additional 8% used for ILR. This extra overhead occurs in the short-running but large-code-size benchmarks (for example, 400.perlbench, 403.gcc, and 416.gamess). The overhead added is mostly due to the startup overhead of reading the rewrite rules. In the 481.wrf benchmark, for example, we note that reading the rewrite rules takes about 45 seconds, and that the 7% overhead difference between basic virtualization and ILR also corresponds to about 46 seconds. We believe that this startup overhead could be greatly reduced by a better rewrite rule format than ASCII. Section IV-F2 discusses optimizing the rewrite rules in more detail.

1)运行时开销:图11显示了基础VM(Strata)的性能开销,以及ILR和ILR+的开销。 我们看到,应用程序的大部分开销来自Strata,而应用随机化只带来很少的额外开销。 平均而言,Strata仅增加8%的开销,ILR另外增加8%。 这部分额外开销出现在运行时间短但代码量大的基准中(例如400.perlbench,403.gcc和416.gamess)。 增加的开销主要来自读取重写规则的启动开销。 例如,在481.wrf基准中,我们注意到读取重写规则大约需要45秒,而基本虚拟化与ILR之间7%的开销差异也大约对应46秒。 我们相信,采用比ASCII更好的重写规则格式可以大大减少这一启动开销。 第IV-F2节更详细地讨论了重写规则的优化。

Figure 11. Performance overhead of ILR and ILR+, along with the average overhead and the average without the 453.povray and 471.omnetpp benchmarks. With an average overhead of only 16% and 13%, most applications could be reasonably protected by our ILR or ILR+ prototypes. Further, ILR overhead could be reduced to that of basic virtualization, at only 8%.

图11. ILR和ILR+的性能开销,以及平均开销和不含453.povray和471.omnetpp基准的平均值。 由于平均开销仅为16%和13%,我们的ILR或ILR+原型可以合理地保护大多数应用程序。 此外,ILR的开销可以降低到基本虚拟化的水平,仅为8%。

ILR+ actually reduces the overhead (by 3% to only 13%) compared to ILR. This reduction is due to more call sites being randomized. As mentioned in Section III-B3, storing an unrandomized return address takes one extra instruction. With more return addresses randomized, the instruction count is reduced. Because ILR+ has the largest effects on the C++ benchmarks, we see this difference most in the C++ benchmarks that are ILR+ compatible (447.dealII, 450.soplex, and 483.xalancbmk).

与ILR相比,ILR+实际上降低了开销(降低3个百分点,降至仅13%)。 这种降低是由于更多的调用点被随机化。 如第III-B3节所述,存储未随机化的返回地址需要多一条指令。 随着更多的返回地址被随机化,指令数量随之减少。 因为ILR+对C++基准的影响最大,所以这一差异在与ILR+兼容的C++基准(447.dealII,450.soplex和483.xalancbmk)中最为明显。

Taken together, we believe there is strong evidence that ILR can be implemented efficiently, perhaps as low as the basic virtualization overhead of only 8%. Even our prototype implementation, which has overheads of 13%-16% on average could be used to protect many applications.

综上所述,我们相信有充分的证据表明可以有效地实施ILR,这可能只需要8%的基本虚拟化开销即可。 甚至我们的原型实现(平均开销为13%-16%)也可以用于保护许多应用程序。

  2. Space Overhead: Our prototype implementation has memory overhead from two sources. The first is from the PVM we used to implement the ILR VM. Such overheads are well studied, and not particularly significant for modern systems [26, 27].

2)空间开销:我们的原型实现有两个方面的内存开销。 首先是用于实现ILR VM的PVM。 这种开销是经过充分研究的,对于现代系统而言并不是特别重要[26,27]。

The second source of overhead is the handling of the ILR rewrite rules. In our prototype implementation, we made the design choice to use ASCII for the ILR rewrite rules. Our choice makes sense for an evaluation prototype: we favored human readability and ease of debugging over raw performance or storage efficiency. Consequently, we note that the on-disk size of the rewrite rules can be quite large. For example, the largest benchmark, 481.wrf, has 264MB of rewrite rules. The in-memory size is even worse, 345MB. This overhead is largely due to our hashtable implementation that stores each byte of an instruction in a separate hash bucket, which allocates many words of data for each byte stored in an ILR rewrite rule. However, 481.wrf is clearly a worst-case for our benchmarks. The average size of the rewrite rules (104MB) is less than half that for 481.wrf.

开销的第二个来源是ILR重写规则的处理。 在我们的原型实现中,我们做出了将ASCII用于ILR重写规则的设计选择。 我们的选择对于评估原型很有意义:我们更喜欢人类可读性和易于调试性,而不是原始性能或存储效率。 因此,我们注意到重写规则在磁盘上的大小可能会很大。 例如,最大的基准481.wrf具有264MB的重写规则。 内存大小更糟,为345MB。 这种开销主要归因于我们的哈希表实现,该实现将指令的每个字节存储在单独的哈希桶中,该哈希桶为存储在ILR重写规则中的每个字节分配许多数据字。 但是,对于我们的基准来说,481.wrf显然是最坏的情况。 重写规则的平均大小(104MB)小于481.wrf的一半。

While our prototype implementation is currently inefficient, we do not believe the rewrite rules are an inherent limitation of ILR. Many techniques exist for minimizing this overhead. For example, we used the gzip compression utility to compress the rewrite rules, and obtained an average size of 14MB. We believe that a binary encoding of the rewrite rules and an efficient memory storage technique could easily reduce the memory used to well under 14MB. On today’s systems with multiple gigabytes of main memory, such space overhead should be easily tolerated.

尽管我们的原型实现当前效率低下,但我们认为重写规则不是ILR的固有限制。 存在许多用于最小化该开销的技术。 例如,我们使用gzip压缩实用程序来压缩重写规则,并获得14MB的平均大小。 我们认为,重写规则的二进制编码和有效的内存存储技术可以轻松地将使用的内存减少到14MB以下。 在当今具有多个千兆字节主内存的系统上,这样的空间开销应该容易容忍。
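
To give a feel for why a binary encoding would be smaller, the sketch below compares one ASCII rule line with a packed binary record. Both formats are invented for illustration; the prototype's actual ASCII rule syntax is not shown in this text.

```python
import struct
from typing import Tuple

# Assumed binary layout: 32-bit address, 32-bit fallthrough, 1-byte length, raw bytes.
_HEADER = "<IIB"


def encode_ascii(addr: int, insn: bytes, fallthrough: int) -> bytes:
    """One human-readable rule per line (hypothetical format)."""
    return f"{addr:#010x} {insn.hex()} {fallthrough:#010x}\n".encode("ascii")


def encode_binary(addr: int, insn: bytes, fallthrough: int) -> bytes:
    """Fixed-size header followed by the raw instruction bytes."""
    return struct.pack(_HEADER, addr, fallthrough, len(insn)) + insn


def decode_binary(blob: bytes, offset: int = 0) -> Tuple[Tuple[int, bytes, int], int]:
    """Decode one record; return ((addr, insn, fallthrough), next offset)."""
    addr, fallthrough, length = struct.unpack_from(_HEADER, blob, offset)
    start = offset + struct.calcsize(_HEADER)
    return (addr, blob[start:start + length], fallthrough), start + length
```

For a 3-byte instruction, the ASCII line in this sketch occupies 29 bytes while the binary record occupies 12; a binary form can also be read without text parsing.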

  3. Analysis Time: We measured the analysis time of the ILR technique. We were able to process the SPEC benchmarks in an average of 23 minutes each. Only the last step of the process creates any randomization, so most of that processing time can be re-used if one wanted to re-randomize. The randomization step itself took only 36 seconds, indicating that re-randomization once analysis is complete could proceed very quickly.

3)分析时间:我们测量了ILR技术的分析时间。 我们处理SPEC基准平均每个需要23分钟。 只有流程的最后一步才进行随机化,因此如果要重新随机化,大部分处理结果都可以重用。 随机化步骤本身仅花费36秒,这表明分析完成后,重新随机化可以很快完成。

5. SECURITY DISCUSSION

5. 安全讨论

A. Protecting the ILR VM

A.保护ILR VM

This section discusses several issues related to the security of the VM used to implement ILR.

本节讨论与用于实现ILR的VM的安全性相关的几个问题。

The first issue that arises is the VM’s potential for being vulnerable to an ROP or arc-injection attack. First, we note that the input to the VM is actually the program’s instructions and the ILR rewrite rules, which we assume to be benign. Malicious programs or malicious rewrite rules are beyond the scope of our remote-attacker threat model. Benign programs and rewrite rules help, as they are the majority of input for the VM, but do not absolutely preclude an attacker from providing input to the program that somehow exercises a vulnerability in the VM. Still, we feel this threat is minimal, and could be addressed via a variety of techniques. We believe that formal verification should be possible since the VM’s code is typically quite small. (Strata’s fully-featured IA32 implementation is only 18K lines of code.) Much of the code is related to the decoder for the machine’s ISA, which might be automatically verified or generated from an ISA description. Even without formal verification, bugs within a VM can largely be addressed via iterative refinement, code-review, static analysis, and compiler-based protection techniques. The last item has significant potential for protecting the VM in this case. If randomization (stack, heap, instruction-location, etc.) could be used on the VM at the deployed location, most attacks directly on the VM could be mitigated.

出现的第一个问题是VM本身可能容易受到ROP或arc注入攻击。首先,我们注意到VM的输入实际上是程序的指令和ILR重写规则,我们假定它们是良性的。恶意程序或恶意重写规则不在我们的远程攻击者威胁模型的范围之内。良性的程序和重写规则有所帮助,因为它们构成了VM的大部分输入,但这并不能完全排除攻击者向程序提供输入、从而以某种方式触发VM中漏洞的可能。不过,我们认为这种威胁很小,并且可以通过多种技术加以解决。我们认为形式化验证应该是可行的,因为VM的代码通常很小。 (Strata功能完备的IA32实现只有18K行代码。)其中许多代码与机器ISA的解码器有关,这部分代码可以自动验证,或根据ISA描述自动生成。即使没有形式化验证,VM中的错误在很大程度上也可以通过迭代改进、代码审查、静态分析和基于编译器的保护技术来解决。在这种情况下,最后一项对保护VM有很大的潜力。如果可以在部署位置对VM使用随机化(栈、堆、指令位置等),则大多数直接针对VM的攻击都可以得到缓解。

The more significant threat to the VM is that a vulnerability in the application allows the application’s code to overwrite some portion of the VM, or to have the VM start interpreting some portion of itself. Since a process-level VM typically resides in the process’ address space, we need to guard against these threats directly.

对VM的更大威胁是应用程序中的漏洞允许应用程序的代码覆盖VM的某些部分,或者使VM开始解释其自身的某些部分。 由于进程级虚拟机通常位于进程的地址空间中,因此我们需要
直接防范这些威胁。

We do so by augmenting the VM to verify any instruction before it is fetched for analysis. The VM ensures that the instruction originates from allowable portions of the application text (for pinned instructions) or an ILR rewrite rule. The VM is prohibited from translating itself or its generated code, and consequently the VM’s code cannot be used for arc-injection or ROP attacks. Our prototype implementation includes these protections.

为此,我们通过扩展VM以在提取指令进行分析之前对其进行验证。 VM确保指令源自应用程序文本(用于固定指令)或ILR重写规则的允许部分。 禁止VM转换自身或其生成的代码,因此VM的代码不能用于电弧注入或ROP攻击。 我们的原型实现包括这些保护。
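The fetch check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Strata's actual code: the class `FetchGuard`, its fields `pinned_ranges` and `rewrite_rules`, and the sample addresses are all hypothetical, and addresses are modeled as plain integers.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class FetchGuard:
    """Hypothetical model of the check made before the VM fetches an instruction."""

    # Half-open [start, end) ranges of application text holding pinned instructions.
    pinned_ranges: List[Tuple[int, int]] = field(default_factory=list)
    # Randomized addresses for which an ILR rewrite rule exists.
    rewrite_rules: Set[int] = field(default_factory=set)

    def may_fetch(self, addr: int) -> bool:
        """Allow a fetch only from a rewrite rule or from a pinned text range."""
        if addr in self.rewrite_rules:
            return True
        return any(start <= addr < end for start, end in self.pinned_ranges)


# Illustrative values only; the VM's own memory is simply absent from both sets.
guard = FetchGuard(
    pinned_ranges=[(0x8048000, 0x8048100)],
    rewrite_rules={0x39BC, 0xD27E},
)
```

Because the VM's own code and its code cache never appear in either collection, a request to translate them is rejected by construction in this model.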

To prevent a compromised application from overwriting the VM’s code or data, we use standard hardware memory protection mechanisms. When executing the untrusted application code, the VM turns off read, write, and execute permission on memory used by the VM, leaving only execute (but not write) permission on the code cache. The VM also watches for attempts by the application to change these permissions. Previous work shows this technique to be effective and cost very little [28].

为了防止受感染的应用程序覆盖VM的代码或数据,我们使用标准的硬件内存保护机制。 当执行不受信任的应用程序代码时,VM关闭对VM所使用的内存的读取,写入和执行权限,仅在代码缓存中保留执行(但不写入)权限。 VM还监视应用程序尝试更改这些权限的尝试。 先前的工作表明该技术有效且成本极低[28]。
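As a rough model of the permission switch described above, the sketch below tracks page permissions for two regions as the VM hands control to the application and takes it back. It is a simulation only and makes no real system calls; the region names, the `ProtectedVm` class, and the simplification that the VM holds full permissions on its own memory while it runs are assumptions made for illustration.

```python
from enum import Flag


class Perm(Flag):
    NONE = 0
    R = 1
    W = 2
    X = 4


class ProtectedVm:
    """Hypothetical model of how the VM guards its memory around application code."""

    def __init__(self) -> None:
        full = Perm.R | Perm.W | Perm.X
        # Simplification: while the VM itself runs, it may touch everything it owns.
        self.perms = {"vm_memory": full, "code_cache": full}

    def enter_application(self) -> None:
        """Before running untrusted code: hide VM memory, leave the cache execute-only."""
        self.perms["vm_memory"] = Perm.NONE
        self.perms["code_cache"] = Perm.X

    def enter_vm(self) -> None:
        """On return to the VM: restore access so translation can continue."""
        full = Perm.R | Perm.W | Perm.X
        self.perms["vm_memory"] = full
        self.perms["code_cache"] = full

    def app_may_change_permissions(self, region: str) -> bool:
        """The VM watches permission changes and refuses those on regions it owns."""
        return region not in self.perms


vm = ProtectedVm()
vm.enter_application()
```

On a real system the switch would be done with the operating system's page-protection interface; the model only captures which permissions are in force at each point.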

Together, we believe that good coding practices, verification, randomization, and actively protecting the VM from a compromised application can result in a safe VM.

我们共同认为,良好的编码做法,验证,随机化以及积极保护VM免受入侵的应用程序的影响,可以确保VM的安全。

B. Entropy Exhausting Attacks

B. 熵耗尽攻击

The entropy of the ILR technique can be quite high. Since the ILR technique separates data and instruction memory, randomized instructions can be located anywhere in memory, even at the same addresses as program data, VM code or data. Many operating systems reserve some pages of memory specifically for code to interface with the operating system, so those pages could not be used for randomized addresses. Further, any unrandomized instructions restrict the entropy of the remaining instructions. Since there are very few unmoved instructions, and almost all other addresses are available for randomization, we believe that it would be easy to produce a system that has at least 31 bits of entropy on a 32-bit address system and at least 63 bits of entropy on a 64-bit system. Thus, randomly attempting to guess gadget addresses is completely infeasible, and ILR can evade attacks that attempt to reduce the entropy of a system.

ILR技术的熵可能很高。 由于ILR技术将数据和指令存储器分开,因此随机指令可以位于存储器中的任何位置,甚至与程序数据,VM代码或数据的地址相同。 许多操作系统会保留一些内存页面,专门用于与操作系统接口的代码,因此这些页面不能用于随机地址。 此外,任何未随机化的指令都会限制其余指令的熵。 由于几乎没有固定的指令,并且几乎所有其他地址都可用于随机化,因此我们相信,在32位地址系统上产生至少具有31位熵且至少63位具有熵的系统将很容易 64位系统上的熵位。 因此,随机尝试猜测小工具地址是完全不可行的,并且ILR可以规避试图降低系统熵的攻击。
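To put the 31-bit figure in perspective, the following back-of-the-envelope sketch estimates how likely address guessing is to succeed. It assumes, for illustration only, that each instruction is placed uniformly and independently among 2^n locations and that the attacker never repeats a guess; the function names are hypothetical.

```python
def guess_success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` distinct guesses hits one
    specific address placed uniformly among 2**entropy_bits locations."""
    space = 2 ** entropy_bits
    return min(attempts, space) / space


def expected_attempts(entropy_bits: int) -> float:
    """Expected number of distinct guesses until one specific address is found."""
    return (2 ** entropy_bits + 1) / 2


def chain_guess_probability(entropy_bits: int, gadgets: int) -> float:
    """Chance of guessing every address of a `gadgets`-long chain in a single try,
    assuming the gadget locations are independent of one another."""
    return 2.0 ** (-entropy_bits * gadgets)
```

Under these assumptions a single gadget takes about 2^30 guesses on average at 31 bits, and a three-gadget chain succeeds on a given try with probability 2^-93.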

C. Information Leakage Attacks

C.信息泄漏攻击

A more likely attack scenario is that an attacker is able to leak information about randomized addresses. Fortunately, the memory-page protection techniques mentioned in Section V-A prevent leaking of information about most randomized addresses. The only randomized addresses that might be leaked are those that potentially end up in the application’s visible data. For ILR, these are the randomized return addresses that might be stored on the application’s stack. For a complete ILR+ implementation, they also include any randomized addresses that are written into the application’s exception handling tables.

更可能的攻击情形是攻击者能够泄漏有关随机地址的信息。 幸运的是,第V-A节中提到的内存页面保护技术可以防止有关大多数随机地址的信息泄漏。 唯一可能泄漏的随机地址是那些可能最终出现在应用程序可见数据中的地址。 对于ILR,这是可能存储在应用程序堆栈中的随机返回地址。 对于完整的ILR +实现,它还包括写入应用程序的异常处理表中的所有随机地址。

In theory, all of these addresses might be leaked to an attacker. However, revisiting Figure 9, we see that on average only 5% of addresses in the total program could be known by the user. In practice, only a few randomized return addresses are available in the application at any instant, and most return addresses could not actually be leaked. If it were possible for the entire exception handling table to be leaked, the number of available addresses would likely be very close to the ILR results, and no ROP attacks are available against ILR in our benchmarks, as seen in Section IV-E.

从理论上讲,所有这些地址都可能泄露给攻击者。 但是,再看一下图9,我们看到用户平均只能知道整个程序中平均5%的地址。 实际上,在任何情况下,应用程序中只有很少的随机返回地址可用,并且大多数返回地址实际上都不会泄漏。 如果可能泄漏整个异常处理表,则可用地址的数量可能会非常接近ILR结果,并且在我们的基准测试中没有针对ILR的ROP攻击可用,如第IV-E节所示。

Furthermore, since our ILR technique is designed to be applied to arbitrary executables, re-randomization could occur regularly with little overhead. Regular re-randomization of high-entropy systems has been shown to be effective in the context of information leakage [29].

此外,由于我们的ILR技术旨在应用于任意可执行文件,因此可以以很少的开销定期进行重新随机化。 在信息泄漏的情况下,高熵系统的定期重新随机化已被证明是有效的[29]。

Thus, information leakage is not a problem for ILR.

因此,信息泄漏对于ILR而言不是问题。

D. False Detections

D.错误检测

A false detection occurs when the program performs an operation that is detected as illegal when there is no attack underway. On our benchmark suite, we found that there were no false detections with ILR. Since our implementation of ILR+ is incomplete, we did observe two false detections. Both 453.povray and 471.omnetpp resulted in incorrect output (from faulting the program) when attempting to throw an exception. A complete implementation of ILR+ would not exhibit this problem. We believe this indicates that false detections would be rare in real programs. Nonetheless, we discuss some possible mechanisms by which false detections might occur.

当程序执行检测为非法的操作且没有正在进行的攻击时,就会发生错误检测。 在我们的基准套件上,我们发现ILR没有发现错误。 由于我们对ILR +的执行不完整,因此我们确实观察到两个错误的检测结果。 尝试引发异常时,453.povray和471.omnetpp均导致错误输出(由于程序出错)。 完整实现ILR +不会证明此问题。 我们认为,这表明在实际程序中很少会出现错误检测。 尽管如此,我们讨论了可能发生错误检测的一些可能机制。

False detections might occur if a program calculates an indirect branch target, instead of simply storing the target in data memory as is most common. We found one example of this type of code in gcc’s library for doing arbitrary precision arithmetic. The example, shown in Figure 12 and originally written in assembly, is used to dispatch into a switch-style table of code blocks. Each block in the table is 9 bytes long. The assembly multiplies register eax by 9 (eax+eax*8), then adds the base of the first code block before finally jumping to that address. A similar construct might be generated by a compiler, but we know of no compilers which generate this type of code for a switch statement. Other constructs exist that might hide code addresses. For example, a function pointer might be calculated for some reason, such as for obfuscation techniques.

如果程序计算间接分支目标,而不是像通常那样简单地将目标存储在数据存储器中,则可能会发生错误检测。 我们在gcc的库中找到了一种用于执行任意精度算术的此类代码示例。 该示例如图12所示,最初是用汇编语言编写的,用于将其分配到开关式代码块表中。 表中的每个块长9个字节。 程序集将寄存器eax乘以9(eax + eax * 8),然后将第一个代码块的基数相加,然后最终跳转到该地址。 编译器可能会生成类似的构造,但是我们知道没有编译器会为switch语句生成这种类型的代码。 存在其他可能隐藏代码地址的构造。 例如,出于某种原因(例如,混淆技术)可能会计算函数指针。

Figure 12. An example of a calculated branch target from gcc’s library for arbitrary precision arithmetic.

图12.从gcc库计算出的分支目标的示例,用于任意精度算术。

A more common compiler construct that might calculate an indirect branch target is position independent code (PIC). In PIC mode, the compiler will often generate a code address by emitting a sequence of instructions that adds the current PC and a constant offset, knowing that the desired code address is a fixed distance from the current PC. PIC code is not standard due to its performance overhead.

可能计算间接分支目标的更常见的编译器构造是位置无关代码(PIC)。 在PIC模式下,编译器通常会通过发出一系列指令来生成代码地址,这些指令将当前PC加一个恒定的偏移量,同时知道所需的代码地址与当前PC的距离是固定的。 由于其性能开销,PIC代码不是标准的。

In most of these cases, we believe that a more advanced indirect branch analysis would solve the problem. For example, the code in Figure 12 is prefixed by code to verify that register eax is in proper bounds. A simple range analysis on the values that can reach the jmp instruction would reveal the possible indirect branch targets.

在大多数情况下,我们认为更高级的间接分支分析将解决该问题。 例如,图12中的代码以代码为前缀,以验证寄存器eax处于适当范围内。 对可以达到jmp指令的值进行简单的范围分析,将揭示可能的间接分支目标。
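The calculated dispatch and the range analysis suggested above can be illustrated with a short sketch. It mirrors the arithmetic described for Figure 12 (target = base + eax + eax*8) and enumerates the targets reachable once eax is known to lie within bounds. The base address and the bounds used in the usage note are hypothetical, since the figure itself is not reproduced here.

```python
from typing import List

BLOCK_SIZE = 9  # each code block in the table is 9 bytes long (from the text)


def computed_target(base: int, eax: int) -> int:
    """Mirror the assembly: (eax + eax*8) plus the base of the first code block."""
    return base + (eax + eax * 8)


def possible_targets(base: int, eax_min: int, eax_max: int) -> List[int]:
    """Enumerate every indirect branch target reachable when range analysis
    proves eax_min <= eax <= eax_max at the jmp instruction."""
    return [computed_target(base, eax) for eax in range(eax_min, eax_max + 1)]
```

For example, with an assumed base of 0x1000 and eax bounded to 0..3, `possible_targets(0x1000, 0, 3)` yields the four block entry points 0x1000, 0x1009, 0x1012, and 0x101B, each of which could then be treated as an indirect branch target.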

Furthermore, our experience indicates that the ILR technique can easily print the address of an indirect branch target if a false detection is encountered. A profile-based or feedback-based mechanism that incorporates newly discovered IBTs would be easy to implement to reduce false detections over time if the IBT can be detected as derived only from safe sources.

此外,我们的经验表明,如果遇到错误检测,则ILR技术可以轻松打印间接分支目标的地址。 如果可以将IBT检测为仅来自安全来源,那么结合新发现的IBT的基于概要文件或基于反馈的机制将易于实施,以减少随时间的错误检测。
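One way such a feedback mechanism might look is sketched below. The class `IbtProfile` and its methods are hypothetical, and how a target is judged to come from a safe source is left as a boolean supplied by the caller, because the text does not specify that test.

```python
from typing import Iterable, Set


class IbtProfile:
    """Hypothetical feedback loop: collect targets reported by false detections
    and fold them into the known IBT set before the next randomization."""

    def __init__(self, known_ibts: Iterable[int]) -> None:
        self.known_ibts: Set[int] = set(known_ibts)
        self.pending: Set[int] = set()

    def report_detection(self, target: int, from_safe_source: bool) -> None:
        """Record a target that failed the lookup; ignore untrusted ones."""
        if from_safe_source and target not in self.known_ibts:
            self.pending.add(target)

    def commit(self) -> Set[int]:
        """Merge pending targets into the known set and return what was added."""
        added, self.pending = self.pending, set()
        self.known_ibts |= added
        return added


profile = IbtProfile({0x1000})
profile.report_detection(0x1009, from_safe_source=True)
profile.report_detection(0x2000, from_safe_source=False)  # dropped: untrusted
```

Keeping reported targets in a pending set until `commit` means an attacker-influenced report cannot widen the set of allowed targets in the middle of a run.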

E. Shared Libraries

E.共享库

Modern computer systems are built using libraries that are loaded on demand, and possibly shared among many processes. Linux uses the .so (Shared Object) format, while Windows uses the .dll (Dynamic-Link Library) model. Our system is capable of processing and randomizing a program that uses dynamic linking. Generally, analysis of these types of programs is easier for our system. Since the code is divided into libraries, we know that if a library contains a constant, the constant can only be an IBT in the library being considered, not in other libraries. Thus, this separation dramatically reduces the number of potential IBTs for a library. Furthermore, externally visible functions and symbols need to be referenced by a handle that is given in the library’s headers. Extracting these types of indirect branch targets is trivial.

现代计算机系统是使用按需加载的库构建的,并可能在许多进程之间共享。 Linux使用.so(共享对象)格式,而Windows使用.dll(动态链接库)模型。 我们的系统能够处理和随机化使用动态链接的程序。 通常,对于我们的系统而言,分析这些类型的程序更容易。 由于代码被分为库,因此我们知道,如果一个库包含一个常量,则该常量只能是所考虑的库中的IBT,而不能视为其他库。 因此,这种分离极大地减少了库的潜在IBT数量。 此外,库标头中提供的句柄需要引用外部可见的函数和符号。 提取这些类型的间接分支目标很简单。

While our prototype can process and effectively randomize programs that require shared libraries, it does not actually randomize the libraries. Both Linux and Windows support some form of ASLR which provides coarse-granularity randomization of shared libraries. We believe our technique could easily be extended to include full randomization of shared libraries, but it is not clear that doing so would always be the best solution. When feasible, it seems better to provide randomization within the library itself. On Linux, this randomization could be accomplished by using a randomizing compiler to generate a per-system version of the libraries. When library source code is not available, such as on Windows-based systems, ILR-based randomization would be important. To achieve this, ILR-rewrite rules for a library would have to be loaded and symbolic addresses resolved whenever a new library entered the system. Such a mechanism could be easily included in a dynamic loader, or by having the ILR VM watch for library loading events.

虽然我们的原型可以处理并有效地随机化需要共享库的程序,但实际上并不能随机化库。 Linux和Windows都支持某种形式的ASLR,该ASLR提供共享库的粗粒度随机化。我们认为我们的技术可以轻松扩展为包括共享库的完全随机化,但是尚不清楚这样做是否总是最佳解决方案。在可行时,最好在库本身内提供随机化。在Linux上,可以通过使用随机化编译器生成每个系统版本的库来实现这种随机化。当库源代码不可用时(例如在基于Windows的系统上),基于ILR的随机化将很重要。为此,每当新的库进入系统时,都必须加载库的ILR重写规则并解析符号地址。这种机制可以轻松地包含在动态加载器中,也可以让ILR VM监视库加载事件。

F. Self-modifying Code

F.自修改代码

Our ILR implementation does not currently support self-modifying, dynamically generated, or just-in-time compiled (JITted) code because our underlying VM does not support such constructs. However, the ILR mechanism itself should operate properly with dynamically generated and JITted code, which is significantly more common than self-modifying code. ILR would not randomize the generated code, but we believe that to be an easy task for the JITter. A security-minded JITter would perform this simple operation.

我们的ILR实现当前不支持自修改,动态生成或即时编译(JITted)代码,因为我们的基础VM不支持此类构造。 但是,ILR机制本身应该与动态生成的JITted代码一起正常运行,这比自修改代码更为常见。 ILR不会随机化生成的代码,但是对于JITter来说,这是一件容易的事。 具有安全意识的JITter将执行此简单操作。

6. RELATED WORK

6.相关工作

A. ROP Defenses

A. ROP防御

The original authors of ROP have described ROP’s salient feature as “Turing completeness without code injection” [9]. ROP invalidates the assumption that attack payloads are intrinsically external by nature as ROP re-uses code fragments already present in a target program. Defensive techniques such as various forms of instruction-set randomization that target code injection attacks directly are completely circumvented by arc-injection attacks in general [28, 30, 31], of which ROP, return-to-libc [32, 33], and partial overwriting attacks of return addresses [10] are special cases. W⊕X is also circumvented as it implicitly assumes that external code will be executed from data pages [5].

ROP的原始作者将ROP的显着特征描述为“无需代码注入即可实现转换完整性” [9]。 由于ROP重用了目标程序中已经存在的代码片段,因此ROP使攻击有效负载本质上是外部的假设无效。 防御技术,例如直接针对目标代码注入攻击的各种形式的指令集随机化,一般都被弧注入攻击完全规避[28,30,31],其中ROP,返回libc [32、33], 返回地址的部分重写攻击[10]是特殊情况。 W⊕X也被规避了,因为它隐含地假设将从数据页执行外部代码[5]。

Since the original seminal work on ROP [2], several defensive techniques have been proposed. Early defenses targeted what would emerge to be non-essential features of ROP attacks. For example, DROP [14] instruments binaries searching for short consecutive sequences of instructions ending in a return instruction. Li et al. and Onarlioglu et al. avoid gadget-like instruction sequences altogether when generating code [34, 35]. Kil et al. permute function locations to randomize gadget locations, but require additional compile-time information [36]. ROPDefender [15] and TRUSS [37] look for mismatched calls and returns essentially using a shadow stack.

自从最初关于ROP的开创性工作[2]以来,已经提出了几种防御技术。 早期防御的目标是ROP攻击的非必需特征。 例如,DROP [14]会检测二进制文件以查找以返回指令结尾的短连续指令序列。 Li等。 和Onarlioglu等。 生成代码时完全避免使用类似小工具的指令序列[34,35]。 基尔等。 置换函数位置以使小工具位置随机化,但需要其他编译时信息[36]。 ROPDefender [15]和TRUSS [37]查找不匹配的调用,并实质上使用影子堆栈进行返回。
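The shadow-stack idea mentioned above can be sketched in a few lines. This is a generic illustration of the technique, not the implementation used by ROPDefender or TRUSS; the class and method names are hypothetical, and real tools must also handle constructs such as non-local jumps that this model ignores.

```python
from typing import List


class ShadowStack:
    """Minimal model of a shadow stack: a protected copy of return addresses."""

    def __init__(self) -> None:
        self._stack: List[int] = []

    def on_call(self, return_addr: int) -> None:
        """At every call, save the return address the call pushes."""
        self._stack.append(return_addr)

    def on_return(self, return_addr: int) -> bool:
        """At every return, report whether it matches the most recent call."""
        if not self._stack:
            return False
        return self._stack.pop() == return_addr


shadow = ShadowStack()
shadow.on_call(0x10)
shadow.on_call(0x20)
```

A return to an address that was never pushed by a matching call, as happens when a ROP chain overwrites the stack, shows up as a mismatch.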

Checkoway et al. showed that the use of the return instruction is not a necessary condition in building ROP gadgets, thereby bypassing such ad-hoc defenses [9]. The balance against ad hoc defenses is further tilted by recent works that have automated the process of gadget discovery [20, 38, 39] and ROP exploit compilation and hardening [25].

Checkoway等。 结果表明,在构建ROP小工具时,使用返回指令不是必要条件,从而绕过了此类临时防御措施[9]。 最近的工作使小工具发现过程[20,38,39]和ROP漏洞利用程序的编译和强化自动化[25],进一步打破了针对临时防御的平衡。

TRUSS [37], ROPDefender [15], DROP [14], and TaintCheck [40] use software dynamic translation frameworks for instrumenting code and implementing their respective defenses. TaintCheck uses dynamic taint analysis and provides a comprehensive approach to thwarting ROP attacks by detecting attempts at control-flow hijacking, though it suffers from high overhead (over 20X). Performance overhead for ROPDefender is approximately 2X on the SPEC2006 benchmarks, while preliminary performance measurements for DROP range from 1.9X to 21X. While not directly comparable, ILR achieves an average performance overhead of only 13-16%, which makes it practical for deployment.

TRUSS [37],ROPDefender [15],DROP [14]和TaintCheck [40]使用软件动态翻译框架来检测代码并实现各自的防御。 TaintCheck使用动态污点分析,并通过检测对控制流劫持的尝试来提供全面的方法来阻止ROP攻击,尽管它遭受高额开销(超过20倍)。 在SPEC2006基准测试中,ROPDefender的性能开销约为2倍,而DROP的初步性能度量范围则为1.9倍至21倍。 尽管不能直接比较,但ILR的平均性能开销仅为13-16%,这使其易于部署。

B. Defenses based on randomization

B.基于随机的防御

In contrast to approaches that look for specific ROP patterns, ILR provides a comprehensive defense based on high-entropy diversification to thwart attacks. ILR provides 31 bits of entropy (out of a maximum of 32 for our experimental prototype) which makes derandomizing attacks impractical. ASLR on a 32-bit architecture only provides 16 bits of entropy and is susceptible to brute-force attacks [7]. Even on 64-bit architectures, there would be two potential problems. The first is that ASLR is not applied universally throughout the address space. Even when using dynamically linked libraries, it is common for the main program text to start at a known fixed location. Red Hat developed Position Independent Executable (PIE) to remedy this situation [41]. However, PIE requires recompilation. The second problem is that ASLR and other coarse-grained technology such as PIE do not perform intra-library randomization. Any information leaked as to the location of one function, or even one address, could be used to infer the complete layout of a library. Roglia et al. demonstrated a single-shot return-to-libc attack that used ROP gadgets to leak information about the base address of libc, and bootstrapped this information into all other libc functions [8]. Their proposed remedy of encrypting the Global Offset Table was specific to their attacks and leaves open the possibility of other leakage attacks.

与寻找特定ROP模式的方法相比,ILR提供了基于高熵多样化的全面防御,以阻止攻击。 ILR提供了31位的熵(在我们的实验原型中,最大为32位),这使得去随机化攻击变得不切实际。在32位架构上的ASLR仅提供16位熵,并且容易受到蛮力攻击[7]。即使在64位体系结构上,也将存在两个潜在问题。首先是ASLR并非在整个地址空间中通用。即使使用动态链接的库,主程序文本也通常从已知的固定位置开始。红帽开发了位置独立可执行程序来纠正这种情况[41]。但是,PIE需要重新编译。第二个问题是ASLR和其他粗粒度技术(例如PIE)不执行库内随机化。关于一个函数甚至一个地址的位置泄漏的任何信息都可以用来推断库的完整布局。 Roglia等。演示了一次使用lib-return-tolibc的单次攻击,该攻击使用ROP小工具泄漏有关libc基址的信息,并将该信息引导到所有其他libc函数中[8]。他们提议的对全局偏移表进行加密的补救措施是针对他们的攻击的,并保留了其他泄漏攻击的可能性。
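The second problem, that one leaked address exposes a whole coarsely randomized library, comes down to simple arithmetic, sketched below. The symbol names and offsets are made-up illustrative values and are not taken from any real libc build.

```python
from typing import Dict


def infer_layout(leaked_name: str, leaked_addr: int,
                 static_offsets: Dict[str, int]) -> Dict[str, int]:
    """Given one leaked runtime address and the library's fixed internal offsets,
    recover the runtime address of every other symbol in that library."""
    base = leaked_addr - static_offsets[leaked_name]
    return {name: base + offset for name, offset in static_offsets.items()}


# Hypothetical offsets of two functions inside one shared library.
offsets = {"printf": 0x4A000, "system": 0x3A000}
layout = infer_layout("printf", 0xB7E4A000, offsets)
```

Here the leak of `printf` at 0xB7E4A000 fixes the library base at 0xB7E00000, so `system` must sit at 0xB7E3A000. When every instruction is placed independently, as in ILR, no such fixed offset table exists for the attacker to exploit.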

Bhatkar et al. use source-to-source transformation techniques to produce self-randomizing programs (SRP) to combat memory error exploits [42]. Unlike other compiler-based randomization techniques [43], SRP produces a single program image, which makes it more practical for deployment. SRP randomizes code at the granularity of individual functions and therefore retains a larger attack surface than the ILR approach of randomizing at the instruction level. Instruction Set Randomization (ISR) helps defeat code-injection attacks, but provides no protection against arc-injection and ROP attacks [28].

Bhatkar等。 使用源到源转换技术来产生自随机化程序(SRP),以对抗内存错误利用[42]。 与其他基于编译器的随机化技术[43]不同,SRP生成单个程序映像,这使其更易于部署。 SRP在各个功能的粒度上将代码随机化,因此比在指令级进行ILR随机化的方法所保留的攻击面更大。 指令集随机化(ISR)有助于克服代码注入攻击,但不能提供防止电弧注入和ROP攻击的保护措施[28]。

C. Control Flow Integrity

C.控制流完整性

Control flow integrity (CFI) is designed to ensure the control flow of a program is not hijacked [44]. CFI relies on the Vulcan instrumentation system. The Vulcan system allows instruction discovery, static analysis, and binary rewriting.

控制流完整性(CFI)旨在确保程序的控制流不会被劫持[44]。 CFI依靠Vulcan仪器系统。 Vulcan系统允许指令发现,静态分析和二进制重写。

Figure 13 shows an example program. In the figure, CFI enforces that the return instruction (in function log) can only jump to the instruction after a call to the log function. In this case, this policy allows an arc-injection attack if the log function is vulnerable. An attacker might be able to overwrite the return address to erroneously jump to L2, thereby granting additional access. Even the best static analysis cannot mitigate these threats using CFI.

图13显示了一个示例程序。 在图中,CFI强制执行(在函数日志中)返回指令只能在调用log函数后跳转到该指令。 在这种情况下,如果日志功能易受攻击,则此策略会允许电弧注入攻击。 攻击者可能能够覆盖返回地址以错误地跳转到L2,从而授予其他访问权限。 即使是最好的静态分析,也无法使用CFI缓解这些威胁。

Figure 13. Example demonstrating CFI’s weakness. The return instruction can jump to either L1 or L2, possibly allowing additional access if the log function is vulnerable.

图13.展示CFI弱点的示例。 该返回指令会跳转L1和L2,如果日志功能易受攻击,则可能允许其他访问。
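Since only the caption of Figure 13 appears here, the policy it illustrates can be modeled with a small sketch. The labels L1 and L2 come from the text; treating them as strings, and the names `CFI_ALLOWED_RETURNS` and `cfi_allows_return`, are assumptions made for illustration.

```python
from typing import Dict, Set

# Per-function set of legal return sites: one entry for each call to log().
CFI_ALLOWED_RETURNS: Dict[str, Set[str]] = {"log": {"L1", "L2"}}


def cfi_allows_return(function: str, target: str) -> bool:
    """A static CFI policy accepts any return site that follows a call to the function."""
    return target in CFI_ALLOWED_RETURNS.get(function, set())
```

A call made from the site before L1 should return to L1, yet `cfi_allows_return("log", "L2")` is also true, so an overwritten return address that points at L2 passes the check.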

Further, a partial overwrite attack might defeat ASLR in this example, since the distance between the two return sites is fixed. Since ILR randomizes this distance, ILR can defeat partial-overwrite attacks.

此外,由于两个返回站点之间的距离是固定的,因此在此示例中,部分重写攻击可能会使ASLR失败。 由于ILR随机分配了此距离,因此ILR可以击败部分覆盖攻击。
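The partial-overwrite argument can be made concrete with a short sketch. It assumes, for illustration, that coarse-grained randomization shifts the whole image by a page-aligned base, so the low byte of each return site stays constant; the offsets 0x510 and 0x53C for the two return sites are hypothetical.

```python
def partial_overwrite(return_addr: int, low_bytes: int, nbytes: int = 1) -> int:
    """Replace only the lowest `nbytes` bytes of a stored return address."""
    mask = (1 << (8 * nbytes)) - 1
    return (return_addr & ~mask) | (low_bytes & mask)


L1_OFFSET = 0x510  # hypothetical offset of the first return site in the image
L2_OFFSET = 0x53C  # hypothetical offset of the second return site


def redirect_l1_to_l2(image_base: int) -> int:
    """Overwrite one byte of a saved return to L1 so that it points at L2."""
    return partial_overwrite(image_base + L1_OFFSET, L2_OFFSET & 0xFF)
```

Whatever page-aligned base is chosen, the same one-byte write turns a return to L1 into a return to L2, because the two sites keep a fixed distance. With ILR the two sites are placed independently, so no constant low byte works.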

7. CONCLUSIONS

7.结论

This paper presents instruction location randomization (ILR), a high-entropy technique for relocating instructions within an arbitrary binary. ILR is shown to effectively hide 99.96% of ROP gadgets from an attacker, a 3.5 order of magnitude reduction in attack surface.

本文介绍了指令位置随机化(ILR),这是一种用于在任意二进制数内重定位指令的高熵技术。 事实证明,ILR有效地向攻击者隐藏了99.96%的ROP小工具,使攻击面减少了3.5个数量级。

This work describes the general technique, as well as evaluates two versions of an ILR prototype. It further discusses the security implications of ILR. We find that ILR can be applied to a wide range of binary programs compiled from C, Fortran, and C++. Performance overhead is shown to be as low as 13% across the 29 SPEC CPU2006 industry-standard benchmarks [16].

这项工作描述了通用技术,并评估了ILR原型的两个版本。 它进一步讨论了ILR的安全隐患。 我们发现ILR可以应用于从C,Fortran和C ++编译的各种二进制程序。 在29个SPEC CPU2006行业标准基准测试中,性能开销低至13%[16]。

This work surpasses state-of-the-art techniques for defeating attacks in a variety of ways. In particular, the technique:

这项工作超越了以各种方式击败攻击的最新技术。 特别是,该技术:

  • can be easily and efficiently applied to binary programs,

  • provides up to 31 bits of entropy for instruction locations on 32-bit systems,

  • can regularly re-randomize a program to thwart entropy-exhausting or information-leakage attacks,

  • provides low execution overhead,

  • randomizes statically and dynamically linked programs, and

  • defeats attacks against large, real-world programs including the Linux PDF viewer, xpdf, and Adobe’s PDF viewer, acroread.

  • 可以轻松有效地应用于二进制程序,

  • 为32位系统上的指令位置提供最多31位的熵,

  • 可以定期将程序重新随机化,以阻止熵耗尽或信息泄漏攻击,

  • 提供较低的执行开销,

  • 随机化静态和动态链接的程序,以及

  • 抵御针对大型现实世界程序的攻击,包括Linux PDF查看器xpdf和Adobe PDF查看器acroread。

Taken together, these results demonstrate that ILR can be used in a wide variety of real-world situations to provide strong protection against attacks.

综上所述,这些结果表明,ILR可用于各种现实环境中,以提供强大的防御攻击能力。

ACKNOWLEDGMENT

致谢

This research is supported by National Science Foundation (NSF) grant CNS-0716446, the Army Research Office (ARO) grant W911-10-0131, the Air Force Research Laboratory (AFRL) contract FA8650-10-C-7025, and DoD AFOSR MURI grant FA9550-07-1-0532. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the NSF, AFRL, ARO, DoD, or the U.S. Government.

美国国家科学基金会(NSF)授予CNS-0716446,美国陆军研究办公室(ARO)授予W911-10-0131,空军研究实验室(AFRL)合同FA8650-10-C-7025和DoD AFOSR支持这项研究 MURI授予FA9550-07-1-0532。 本文包含的观点和结论是作者的观点和结论,不应解释为必然代表NSF,AFRL,ARO,DoD或美国政府的官方政策或认可,无论是明示或暗示。

REFERENCES

参考资料

[1] J. Pincus and B. Baker, “Beyond stack smashing: Recent advances in exploiting buffer overruns,” IEEE Security & Privacy, vol. 2, no. 4, pp. 20–27, Jul/Aug 2004.

[2] H. Shacham, “The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86),” in Proceedings of the 14th ACM Conference on Computer and Communications Security. ACM, 2007, pp. 552–561.

[3] E. Buchanan, R. Roemer, H. Shacham, and S. Savage, “When good instructions go bad: Generalizing return-oriented programming to RISC,” in Proceedings of the 15th ACM Conference on Computer and Communications Security. ACM, 2008, pp. 27–38.

[4] D. Dai Zovi, “Practical return-oriented programming,” SOURCE Boston, 2010.

[5] The PAX Team, http://pax.grsecurity.net.

[6] M. Howard and M. Thomlinson, “Windows Vista ISV security,” Microsoft Corporation, April, vol. 6, 2007.

[7] H. Shacham, M. Page, B. Pfaff, E. Goh, N. Modadugu, and D. Boneh, “On the effectiveness of address-space randomization,” in Proceedings of the 11th ACM Conference on Computer and Communications Security. ACM, 2004, pp. 298–307.

[8] G. Roglia, L. Martignoni, R. Paleari, and D. Bruschi, “Surgically returning to randomized lib(c),” in 2009 Annual Computer Security Applications Conference. IEEE, 2009, pp. 60–69.

[9] S. Checkoway, L. Davi, A. Dmitrienko, A. Sadeghi, H. Shacham, and M. Winandy, “Return-oriented programming without returns,” in Proceedings of the 17th ACM Conference on Computer and Communications Security. ACM, 2010, pp. 559–572.

[10] T. Durden, “Bypassing PaX ASLR protection,” Phrack Magazine, vol. 0x0b, no. 0x3b, 2002. [Online]. Available: http://www.phrack.org/issues.html?issue=59&id=9

[11] K. Scott, N. Kumar, S. Velusamy, B. R. Childers, J. W. Davidson, and M. L. Soffa, “Retargetable and reconfigurable software dynamic translation,” in International Symposium on Code Generation and Optimization. San Francisco, CA:IEEE Computer Society, Mar. 2003, pp. 36–47.

[12] V. Bala, E. Duesterwald, and S. Banerjia, “Dynamo: A transparent dynamic optimization system,” in SIGPLAN ’00 Conference on Programming Language Design and Implementation, 2000, pp. 1–12.

[13] M. Payer and T. Gross, “Fine-grained user-space security through virtualization,” in Proceedings of the 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments. ACM, 2011, pp. 157–168.

[14] P. Chen, H. Xiao, X. Shen, X. Yin, B. Mao, and L. Xie, “DROP: Detecting return-oriented programming malicious code,” Information Systems Security, pp. 163–177, 2009.

[15] L. Davi, A. Sadeghi, and M. Winandy, “ROPdefender: A detection tool to defend against return-oriented programming attacks,” in Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security. ACM, 2011, pp. 40–51.

[16] Standard Performance Evaluation Corporation, “SPEC CPU2006 Benchmarks,” http://www.spec.org/osg/cpu2006.

[17] (2011, November) Hex-rays website. [Online]. Available: http://www.hex-rays.com/products/ida/index.shtml

[18] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood, “Pin: Building customized program analysis tools with dynamic instrumentation,” in PLDI ’05: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation. New York, NY, USA: ACM Press, 2005, pp. 190–200.

[19] M. Voss and R. Eigenmann, “A framework for remote dynamic program optimization,” in Proceedings of the ACM Workshop on Dynamic Optimization Dynamo ’00, 2000.

[20] “Shell storm website,” http://www.shell-storm.org/project/ROPgadget/.

[21] (2008) Libtiff tifffetchshortpair remote buffer overflow vulnerability. [Online]. Available: http://www.securityfocus.com/bid/19283

[22] A. Kapoor, “An approach towards disassembly of malicious binary executables,” Ph.D. dissertation, University of Louisiana, 2004.

[23] C. Kruegel, W. Robertson, F. Valeur, and G. Vigna, “Static disassembly of obfuscated binaries,” in Proceedings of the 13th USENIX Security Symposium, 2004, pp. 255–270.

[24] B. Schwarz, S. Debray, and G. Andrews, “Disassembly of executable code revisited,” in Proceedings of the 9th Working Conference on Reverse Engineering. IEEE, 2002, pp. 45–54.

[25] E. J. Schwartz, T. Avgerinos, and D. Brumley, “Q: Exploit hardening made easy,” in Proceedings of the USENIX Security Symposium, 2011.

[26] J. Hiser, D. Williams, W. Hu, J. Davidson, J. Mars, and B. Childers, “Evaluating indirect branch handling mechanisms in software dynamic translation systems,” in Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, 2007, pp. 61–73.

[27] A. Guha, K. Hazelwood, and M. Soffa, “Reducing exit stub memory consumption in code caches,” High Performance Embedded Architectures and Compilers, pp. 87–101, 2007.

[28] W. Hu, J. Hiser, D. Williams, A. Filipi, J. Davidson, D. Evans, J. Knight, A. Nguyen-Tuong, and J. Rowanhill, “Secure and practical defense against code-injection attacks using software dynamic translation,” in Proceedings of the 2nd International Conference on Virtual Execution Environments. ACM, 2006, pp. 2–12.

[29] A. Nguyen-Tuong, A. Wang, J. Hiser, J. Knight, and J. Davidson, “On the effectiveness of the metamorphic shield,” in Proceedings of the Fourth European Conference on Software Architecture: Companion Volume. ACM, 2010, pp. 170–174.

[30] E. G. Barrantes, D. H. Ackley, S. Forrest, and D. Stefanovic, “Randomized instruction set emulation,” ACM Transactions on Information and System Security, vol. 8, no. 1, pp. 3–40, 2005.

[31] G. S. Kc, A. D. Keromytis, and V. Prevelakis, “Countering code-injection attacks with instruction-set randomization,” in CCS ’03: Proceedings of the 10th ACM Conference on Computer and Communications Security. New York, NY, USA: ACM Press, 2003, pp. 272–280.

[32] S. Designer, ““return-to-libc” attack,” Bugtraq, Aug, 1997.

[33] Nergal, “The advanced return-into-lib(c) exploits (PaX case study),” Phrack Magazine, 58(4), December 2001.

[34] J. Li, Z. Wang, X. Jiang, M. Grace, and S. Bahram, “Defeating return-oriented rootkits with “return-less” kernels,” in Proceedings of the 5th European Conference on Computer Systems, ser. EuroSys ’10. New York, NY, USA: ACM, 2010, pp. 195–208.

[35] K. Onarlioglu, L. Bilge, A. Lanzi, D. Balzarotti, and E. Kirda, “G-Free: defeating return-oriented programming through gadget-less binaries,” in Proceedings of the 26th Annual Computer Security Applications Conference. ACM, 2010, pp. 49–58.

[36] C. Kil, J. Jun, C. Bookholt, J. Xu, and P. Ning, “Address space layout permutation (ASLP): Towards fine-grained randomization of commodity software,” in 22nd Annual Computer Security Applications Conference (ACSAC ’06). IEEE, 2006, pp. 339–348.

[37] S. Sinnadurai, Q. Zhao, and W. fai Wong, “Transparent runtime shadow stack: Protection against malicious return address modifications,” 2008.

[38] T. Dullien and T. Kornau, “A framework for automated architecture-independent gadget search,” in 4th USENIX Workshop on Offensive Technologies, 2010.

[39] R. G. Roemer, “Finding the bad in good code: Automated return-oriented programming exploit discovery,” 2009.

[40] J. Newsome and D. Song, “Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software,” in Proceedings of the Network and Distributed System Security Symposium, 2005.

[41] A. van de Ven, “New security enhancements in Red Hat Enterprise Linux v.3, update 3,” Red Hat, Inc., 2004.

[42] S. Bhatkar, R. Sekar, and D. C. DuVarney, “Efficient techniques for comprehensive protection from memory exploits,” in Proceedings of the 14th Conference on USENIX Security Symposium. USENIX Association, 2005.

[43] T. Jackson, B. Salamat, A. Homescu, K. Manivannan, G. Warner, A. Gal, S. Brunthaler, C. Wimmer, and M. Franz, “Compiler-generated software diversity,” 2011.

[44] M. Abadi, M. Budiu, Ú. Erlingsson, and J. Ligatti, “Control-flow integrity,” in Proceedings of the 12th ACM Conference on Computer and Communications Security. ACM, 2005, pp. 340–353.


Reposted from blog.csdn.net/weixin_46222091/article/details/106322966