Arvin: Greybox Fuzzing Using Approximate Dynamic CFG Analysis

Sirus Shahini, Mu Zhang, Mathias Payer, and Robert Ricci

Proceedings of the 18th ACM ASIA Conference on Computer and Communications Security (AsiaCCS) 2023.

abstract

Fuzzing has emerged as the most broadly used testing technique to discover bugs. Effective fuzzers rely on coverage to prioritize inputs that exercise new program areas. Edge-based code coverage of the Program Under Test (PUT) is the most commonly used coverage today. It is cheap to collect—a simple counter per basic block edge suffcies. Unfortunately, edge coverage lacks context information: it exclusively records how many times each edge was executed but lacks the information necessary to trace actual paths of execution.

Our new fuzzer Arvin gathers probabilistic full traces of PUT executions to construct Dynamic Control Flow Graphs (DCFGs). These DCFGs observe a richer set of program behaviors, such as the "depth" of execution, different paths to reach the same basic block, and targeting specific functions and paths. Prioritizing the most promising inputs based on these behaviors improves fuzzing efefctiveness by increasing the diversity of explored basic blocks.

Designing a DCFG-aware fuzzer raises a key challenge: collecting the required information needs complex instrumentation which results in performance overheads. Our prototype approximates DCFG and enables lightweight, asynchronous coordination between fuzzing processes, making DCFG-based fuzzing practical.

By approximating DCFGs, Arvin is fast, resulting in at least an eight-fold increase in fuzzing speed. Because it effectively prioritizes inputs using methods like depth comparison and directed exclusion, which are unavailable to other fuzzers, it finds bugs missed by others. We compare its ability to find bugs using various Linux programs and discover 50 bugs, 23 of which are uniquely found by Arvin.