Background and Contribution

Background

  1. 智能合约盛行,然而其漏洞很多,财产损失严重
  2. The prevalence of integer bugs in smart contracts. 2018 年位居 top3

Contribution

Osiris

  1. a symbolic execution tool / a framework combines symbolic execution and taint analysis
  2. auto-matically detects integer bugs in EVM bytecode
  3. The tool currently covers three different types of integer bugs: arithmetic bugs, truncation bugs and signedness bugs.

Advantages:

  1. Compared with Zeus, Osiris detects more vulnerabilities, with more confidence, as the false positives rate is considerably lower with our tool.

  2. Osiris detects a greater range of bugs than existing tools,even discovered unknown vulnerabilities.

  3. 论文还对 EVM 和 solidity 编译器提出修改建议,来避免 integer bugs.

integer bugs introduction

  1. arithmetic bugs
    1. integer overflows
    2. underflows
    3. division by zero
    4. modulo zero,
  2. truncation bugs 大范围 int -> 小范围 int 的转换 Converting a value of one integral type to a narrower integral type which has a shorter range of values may introduce so-called truncation bugs.
  3. signedness bugs 有符号 <-> 无符号 可能会导致 负值 <-> 大的正值.

how to detect integer bugs contained in these contracts

见论文 p3:2.3 Integer Bugs in Ethereum Smart Contracts

Methodology

方法论,如何检测 integer bugs

Detect integer bugs at the bytecode level

  1. inferring integer types
  2. detecting integer bugs
  3. applying taint analysis to reduce false positives.

other challenges such as identifying intended checks for integer bugs.

  1. Type Inference

    Type information is usually only available at the source code level and not at the bytecode level.

    Due to certain code optimisations introduced by the Solidity compiler during compile time, it is actually possible to infer the size and the sign of integers at the bytecode level.

    然后讲了根据那些特定的代码等来推测 size 和 sign

  2. Finding Integer Bugs 分别从三种类型来讲述如何寻找。 见论文 p4 3.2 Finding Integer Bugs

  3. Taint Analysis 降低假阳率 Taint analysis can help to distinguish between benign(良性的) overflows introduced by the developer or compiler, and malicious overflows that are exploitable by an attacker.

    taint analysis

    Therefore, by deliberately ignoring integer bugs that do not originate from a source and do not flow into a sink, we can focus exclusively on actual exploitable integer bugs and gracefully reduce the number of false positives

    然后讲了常见可以作为 sources 或 sinks 对应的指令们,并收集适当选取他们,以便后面判断使用

  4. Identifying Benign Integer Bugs 尽管污点分析已经减低了一些假阳率。 there are still some cases where an integer bug might originate from an untrusted source and "ow into a sensitive sink, while being a benign integer bug.

    In order to cope with such cases, we came up with some heuristic rules. 并讲述了启发式规则。

OSIRIS

Architecture of ORISIS

We implemented Osiris on top of Oyente’s symbolic execution engine.

Design Overview

Input: the bytecode or Solidity source code of a smart contract. The latter gets internally compiled to EVM bytecode Output: whether a contract contains any integer bug (e.g. overflow, underflow, truncation, etc.)

Three main components: symbolic analysis, taint analysis and integer error detection

  1. Symbolic Analysis The symbolic analysis component constructs a Control Flow Graph (CFG) and symbolically executes the different paths of the contract. The symbolic analysis component passes the result of every executed instruction to the taint analysis component as well as to the integer error detection component.

    1. The symbolic analysis component starts by constructing a CFG from the bytecode. Osiris can output a visual representation of the CFG depicting the individual path conditions and highlighting the basic blocks that include integer bugs.
    2. After constructing the CFG, the symbolic execution engine starts by executing the entry node of the CFG.
  2. Taint Analysis The taint analysis component introduces, propagates and checks for taint across stack, memory and storage.

    1. The symbolic execution engine forwards every executed instruction to the taint analysis component.

    2. The taint analysis component checks wether the executed instruction is part of the list of defined sources

      1. If that is the case, the taint analysis component introduces taint by tagging the affected stack, memory or storage location

      2. Identifies the operands of each EVM bytecode instruction and propagates the taint according to the semantics of each instruction as defined in [38]

      3. The following principle: if an instruction uses a tainted value to derive another value, then the derived value becomes tainted as well 通过遵循此污点传播原则,该工具实现了比 Mythrill 更精确的传播 Mythril propagates taint across the stack, but for certain instructions it does not propagate taint across memory or storage.

      4. The taint analysis component verifies if a taint flow occurred, by

        1. checking whether the executed instruction is part of the list of defined sinks
        2. if any of the values it used has been tainted by an integer bug.
  3. Integer Error Detection The integer error detection component checks whether an integer bug is possible within the executed instruction.

    The integer error detection component is only called at instructions that may result in integer bugs, such as arithmetic instructions. Calls to the integer error detection component are only performed if at least one of the operands of the executed instruction is tainted. If these criteria are met, then the symbolic execution engine eventually forwards the executed instruction along with the current path conditions to the integer error detection component.

    the component follows the different techniques as described in Section 3.2 in order to detect the specific integer bugs.

    见论文 p7 Integer Error Detection 的 For Example

    1. 对于可能存在某种 integer bugs 的指令,Symbolic Analysis component 调用 Integer Error Detection 中对应的方法进行检测
    2. creates a formula with a constraint that is only feasible if an integer overflow is possible under the current path conditions.
    3. pass the formula to the Z3 solver which checks for its feasibility
    4. If the solver finds a solution to the formula, then the integer error detection component knows that an integer overflow is possible and returns an error back to the symbolic analysis component
    5. The symbolic analysis component calls the taint analysis component, which then taints the result of the AND instruction where its source represents the discovered integer bug.

EVALUATION

  1. Via the qualitative analysis we aim to determine the reliability of our tool by comparing our results with Zeus
  2. Via the quantitative analysis we intend to demonstrate the scalability of Osiris

Qalitative Analysis

重用了 Zeus 的数据集,并与 Zeus 对比。 结论: 1. Zeus 缺点 1. Zeus 并不是像它论文上声称的那样 0 假阴率(有 bug 的报告为 无 bug 的) 2. Zeus 并不能检测那些良性的 integer bugs (were induced by the developer or by the Solidity compiler as part of handling data structures of dynamic size such as arrays, strings or bytes.),即: 错误地将其识别为 bugs 3. Zeus does not check for division by zero or modulo zero bugs 2. 也有一些 Zeus 检测出来,而 Osiris 没有检测出来,但是那只是因为 Osiris 的 sources 检测列表里面没有,加上就行了。 3. 在本次测试中,Orisi 的 false negative rate 比 ZEUS 低 88%(28/32)

Qantitative Analysis

从公开的以太坊区块链上收集了 1,207,335, 其中 only 50,535 are unique in terms of their bytecode.

20200507183547

Detection of Real-World Vulnerabilities

Detecting Known Vulnerabilities

CVE 们,都成功检测出来了

Detecting Unknown Vulnerabilities

DataSet: Etherscan provides a list of top tokens ranked by their market capitalisation We downloaded the bytecode as well as the source code for these 495 smart contracts and analysed them using Osiris.

We found two overflows to be false positives

不足与改进

见论文 p12

  1. plan to extend Osiris’s taint analysis to also track taint across multiple contracts (inter-contract analysis) and across different method invocations (trace analysis).

  2. aim to switch to concolic execution using concrete values from the blockchain in order to validate and generate direct exploits. This may help us make Osiris’s detection mechanism even more precise.

  3. want to augment our evaluation on the security of Ethereum tokens.

Z3: An Efficient SMT Solver