Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

Mitchell Olsthoorn, Arie van Deursen, Annibale Panichella

December 2020

Abstract

Software testing is an important and time-consuming task that is often done manually. In recent decades, researchers have developed techniques to generate input data (e.g., fuzzing) and to automate the generation of test cases (e.g., search-based testing). However, each technique has its own limitation: search-based testing does not generate highly-structured data, and grammar-based fuzzing does not generate test case structures. To address these limitations, we combine the two techniques. By applying grammar-based mutations to the input data gathered by the search-based testing algorithm, we can co-evolve both aspects of test case generation: the test case structure and its input data. We evaluate our approach, called G-EvoSuite, in an empirical study on 20 Java classes from the three most popular JSON parsers across multiple search budgets. Our results show that the proposed approach improves branch coverage for JSON-related classes by 15% on average (with a maximum increase of 50%) without negatively impacting other classes.
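To make the idea of a grammar-based mutation concrete, the following is a minimal illustrative sketch in Python. It is not taken from G-EvoSuite, which targets Java, and every name in it (`random_value`, `subtree_paths`, `replace_at`, `grammar_mutate`) is hypothetical. It assumes a simplified JSON grammar and a single mutation operator: replace one randomly chosen subtree of a parsed JSON string with a freshly derived one, so the mutated string remains well-formed.

```python
import json
import random


def random_value(rng, depth=0):
    """Derive a random JSON value from a simplified JSON grammar (assumption)."""
    # Beyond depth 2 only terminals are produced, so the derivation always ends.
    kinds = ["null", "bool", "number", "string"]
    if depth < 2:
        kinds += ["array", "object"]
    kind = rng.choice(kinds)
    if kind == "null":
        return None
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "number":
        return rng.randint(-100, 100)
    if kind == "string":
        return "".join(rng.choice("abc") for _ in range(rng.randint(0, 3)))
    if kind == "array":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 2))]
    return {f"k{i}": random_value(rng, depth + 1) for i in range(rng.randint(0, 2))}


def subtree_paths(value, path=()):
    """List the path to every node of the parsed JSON tree, root included."""
    paths = [path]
    if isinstance(value, list):
        for index, item in enumerate(value):
            paths += subtree_paths(item, path + (index,))
    elif isinstance(value, dict):
        for key, item in value.items():
            paths += subtree_paths(item, path + (key,))
    return paths


def replace_at(value, path, new):
    """Return a copy of value with the node at path replaced by new."""
    if not path:
        return new
    head, rest = path[0], path[1:]
    if isinstance(value, list):
        return [replace_at(v, rest, new) if i == head else v
                for i, v in enumerate(value)]
    return {k: (replace_at(v, rest, new) if k == head else v)
            for k, v in value.items()}


def grammar_mutate(text, rng):
    """Mutate a JSON string by regenerating one randomly chosen subtree."""
    try:
        tree = json.loads(text)
    except json.JSONDecodeError:
        # Input is not valid JSON: fall back to a fresh derivation.
        return json.dumps(random_value(rng))
    path = rng.choice(subtree_paths(tree))
    return json.dumps(replace_at(tree, path, random_value(rng)))
```

In a combined setup of the kind the abstract describes, an operator like this would be applied to string inputs inside the test cases that the search algorithm evolves, for example `grammar_mutate('{"a": [1, 2]}', random.Random(0))`, while the search continues to mutate the sequence of method calls around them.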

Type

Conference paper

Publication

Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
