[Apr 07, 2026] Professor Sooyoung Cha’s Software Analysis Laboratory (SAL) Has a Paper Accepted to FSE 2026
- SKKU National Program of Excellence in Software
- 2026-04-14

A paper by Jaehan Yoon (Ph.D. student) of the Software Analysis Laboratory (Advisor: Sooyoung Cha) has been accepted to FSE 2026 (ACM International Conference on the Foundations of Software Engineering), the top conference in the field of software engineering. The research was conducted in collaboration with Yunji Seo (Korea University) and Professor Hakjoo Oh (Korea University), and will be presented in Montreal, Canada, in July 2026.
The paper, “Reducing Coverage-Equivalent Inputs in Grammar-based Fuzzing by Avoiding Recurrent Rule Sequences,” proposes a new technique to improve the performance of grammar-based fuzzing, a representative software testing methodology. The work is motivated by the observation that recent grammar-based fuzzing tools repeatedly generate coverage-equivalent inputs, that is, inputs that execute the same code regions during testing. To address this issue, the paper introduces a technique that automatically identifies the causes of such coverage-equivalent inputs and generates “production rules” to prevent their regeneration. The proposed approach, RSFuzz, was integrated into three state-of-the-art fuzzing tools and evaluated on 12 real-world programs with various input formats. Experimental results show that integrating RSFuzz into the three tools detects 121, 46, and 17 additional bugs, improves line coverage by 6.0%, 4.8%, and 3.0%, and reduces the generation of coverage-equivalent inputs by 23.3%, 28.7%, and 14.9%, respectively.
[Paper Information]
- Title: Reducing Coverage-Equivalent Inputs in Grammar-based Fuzzing by Avoiding Recurrent Rule Sequences
- Authors: Jaehan Yoon, Yunji Seo, Hakjoo Oh, Sooyoung Cha
- Conference: ACM International Conference on the Foundations of Software Engineering (FSE 2026)
Abstract:
We present RSFuzz, a new technique to enhance grammar-based fuzzing by reducing the generation of coverage-equivalent inputs during testing. Grammar-based fuzzers apply production rules from a given grammar (e.g., forming a derivation tree) to generate well-structured inputs for the target program. However, a key limitation is that many existing fuzzers still produce a large number of "coverage-equivalent" inputs—those that revisit already explored program paths—thereby restricting their ability to uncover new bugs and improve coverage. To address this issue, RSFuzz automatically identifies recurrent sequences of production rules that lead to coverage-equivalent inputs and prevents their reuse during fuzzing. A key challenge in practice lies in the large number of coverage-equivalent input groups, each with many inputs and corresponding derivation trees, making it difficult to identify the underlying recurrent sequences. RSFuzz tackles this challenge with a customized algorithm that iteratively groups coverage-equivalent inputs, selects promising groups, and extracts recurrent sequences by abstracting derivation trees based on accumulated data while running any grammar-based fuzzer. We integrated RSFuzz with existing random, probabilistic, and grammar-coverage based fuzzers and evaluated it on 12 real-world programs using XML, JSON, CSV, and Markdown input formats. Experimental results show that incorporating RSFuzz into the three fuzzers detects 121, 46, and 17 additional crashes with distinct stack traces, increases line coverage by 6.0%, 4.8%, and 3.0%, and reduces duplicate-coverage input generation by 23.3%, 28.7% and 14.9%, respectively, compared to their performance without RSFuzz.
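The idea of grouping coverage-equivalent inputs and looking for rule sequences they share can be illustrated with a toy sketch. This is not RSFuzz's algorithm, which the abstract describes only at a high level. The grammar, the `toy_coverage` stand-in for real coverage feedback, and the n-gram heuristic in `recurrent_sequences` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy grammar: nonterminal -> list of alternatives (lists of symbols).
GRAMMAR = {
    "<value>": [["<number>"], ["[", "<value>", "]"]],
    "<number>": [["0"], ["1"]],
}

def generate(symbol, rng, depth, rules):
    """Expand `symbol`; record each applied rule as (nonterminal, alternative index)."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol, emitted as-is
    alts = GRAMMAR[symbol]
    # Once the depth budget is spent, take the first (non-recursive) alternative.
    idx = 0 if depth <= 0 else rng.randrange(len(alts))
    rules.append((symbol, idx))
    return "".join(generate(s, rng, depth - 1, rules) for s in alts[idx])

def toy_coverage(text):
    """Assumed stand-in for coverage: which made-up 'branches' the input exercises."""
    checks = [("list", "[" in text), ("zero", "0" in text), ("one", "1" in text)]
    return frozenset(name for name, hit in checks if hit)

def recurrent_sequences(samples, n=2, min_share=0.8):
    """Group (text, rules) samples by coverage; per group, return rule n-grams
    that occur in at least `min_share` of the group's members."""
    groups = defaultdict(list)
    for text, rules in samples:
        groups[toy_coverage(text)].append(rules)
    result = {}
    for cov, members in groups.items():
        counts = Counter()
        for rules in members:
            # Count each distinct n-gram once per input.
            counts.update({tuple(rules[i:i + n]) for i in range(len(rules) - n + 1)})
        # Singleton groups carry no evidence of recurrence, so they yield nothing.
        result[cov] = [
            gram for gram, c in counts.items()
            if len(members) > 1 and c / len(members) >= min_share
        ]
    return result
```

A fuzzer built on this sketch could lower the probability of expanding a rule sequence that `recurrent_sequences` reports for an already well-populated coverage group. How RSFuzz actually selects groups and abstracts derivation trees is described in the paper, not here.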



