author: Folmer Heikamp
title: Gray-box Network Fuzzing using Genetic Algorithms and Code Coverage
company: Thales, Hengelo
keywords: Fuzzing, security testing, protocols, networked systems
topics: Case studies and Applications , Dependability, security and performance
committee: Jaco van de Pol ,
Andreas Peter ,
Rene van Hees
end: January 2018
type: 4TU.cybersecurity master project


There are many recent news articles on the topic of cyber-security. This includes articles about companies being hacked, theft of digital information and the spread of ransomware. A big component in all these incidents is software. Defects or vulnerabilities in software might allow malicious entities to gain access to computer systems. These vulnerabilities are usually easy to remove if they are known. Finding them is however difficult since software is large, complex and connected.

Fuzzing is a method to automatically find vulnerabilities for a given target. The basic idea of fuzzing is to generate lots of data, send it to the target and see if the application handled it correctly. Random fuzzing has shown to be effective however is suffers from a serious drawback. It generally has a poor code coverage because of its random nature, which in turn means that a relatively large part of the target remains untested.

To solve this problem a new fuzzer, called MyFuzzer, was developed. MyFuzzer contains a genetic algorithm which tries to maximize code coverage. Each input gets scored on eleven metrics. Most of these metrics quantify how much new code has been executed. The two main metrics used in MyFuzzer are the number of new basic blocks found by an input and the number of new transitions between basic blocks found by an input. A basic block is a list of instructions without jumps. The genetic algorithm calculates a score for an input based on its metric values. Inputs which find a lot of new code get a higher score and have more probability of being reused. MyFuzzer has two main design restrictions. First of all the set of targets is limited to network applications and secondly the target source-code cannot be used. This means that code coverage has to determined on instruction level.

After the development of MyFuzzer research is done by executing several instances of MyFuzzer on a set of targets. The first goal of the research was to show that a genetic algorithm improved fuzzing both in terms of finding vulnerabilities and code coverage. The second goal of the research was to determine which metric or combination of metrics best describe the fitness of an input. 

The results for this project suggest that a genetic algorithm does improve fuzzing because MyFuzzer found vulnerabilities more frequently and obtained a better code coverage than fuzzers without a genetic algorithm. The results also indicate that the number of new basic block found by an input best describes the fitness of an input because focusing on new basic blocks delivered the best code coverage results.