Announcing the Atheris Python Fuzzer
Fuzz testing is a well-known technique for uncovering programming errors. Many of these detectable errors have serious security implications. Google has found thousands of security vulnerabilities and other bugs using this technique. Fuzzing is traditionally used on native languages such as C or C++, but last year, we built a new Python fuzzing engine. Today, we’re releasing the Atheris fuzzing engine as open source.
What can Atheris do?
One of the best uses for Atheris is differential fuzzing. A differential fuzzer looks for differences in the behavior of two libraries that are intended to do the same thing. One of the example fuzzers packaged with Atheris does exactly this, comparing the Python “idna” package to the C “libidn2” package. Both packages are intended to decode and resolve internationalized domain names. However, the example fuzzer idna_uts46_fuzzer.py shows that they don’t always produce the same results. If you ever decided to purchase a domain containing the Unicode codepoints U+0130 and U+1DF9, you’d discover that the idna and libidn2 libraries resolve that domain to two completely different websites.
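The general shape of a differential fuzzer can be sketched as follows. This is a minimal illustration, not the packaged idna_uts46_fuzzer.py: the functions fold_a and fold_b are hypothetical stand-ins for the two libraries under comparison, built on Python's str.lower and str.casefold, which agree on ASCII text but differ on some non-ASCII input such as “ß”.

```python
import sys


def fold_a(text):
    """Hypothetical stand-in for library A: simple lowercasing."""
    return text.lower()


def fold_b(text):
    """Hypothetical stand-in for library B: full Unicode case folding."""
    return text.casefold()


def TestOneInput(data):
    # Atheris hands the harness raw bytes; skip inputs that are not valid UTF-8.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return
    a, b = fold_a(text), fold_b(text)
    if a != b:
        # Any uncaught exception is reported by the fuzzer as a finding.
        raise RuntimeError(f"mismatch for {text!r}: {a!r} != {b!r}")


def main():
    # Imported here so the harness can be exercised without Atheris installed.
    import atheris
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling main() from an `if __name__ == "__main__":` guard would start the fuzzer. Calling TestOneInput directly with the UTF-8 encoding of “ß” raises the mismatch error, while plain ASCII input passes silently.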
In general, Atheris is useful on pure Python code whenever you have a way of expressing what the “correct” behavior is – or at least expressing what behaviors are definitely not correct. This could be as complex as custom code in the fuzzer that evaluates the correctness of a library’s output, or as simple as a check that no unexpected exceptions are raised. This last case is surprisingly useful. While the worst outcome from an unexpected exception is typically denial-of-service (by causing a program to crash), unexpected exceptions tend to reveal more serious bugs in libraries. As an example, the one YAML parsing library we tested Atheris on says that it will only raise YAMLErrors; however, yaml_fuzzer.py detects numerous other exceptions, such as ValueError from trying to interpret “-_” as an integer, or TypeError from trying to use a list as a key in a dict. (Bug report.) This indicates flaws in the parser.
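The “no unexpected exceptions” check can be sketched as below. This is an illustration under stated assumptions, not yaml_fuzzer.py: parse_scalar and ParseError are hypothetical, a toy parser that documents a single exception type but leaks a ValueError on input such as “-_”, mirroring the kind of flaw described above.

```python
import re


class ParseError(Exception):
    """The only exception the toy parser documents."""


# Deliberately sloppy: also matches strings such as "-_" that int() rejects.
_INT_LIKE = re.compile(r"-?[0-9_]+")


def parse_scalar(text):
    """Hypothetical parser: return an int for integer-like text, else the text."""
    if not text:
        raise ParseError("empty input")
    if _INT_LIKE.fullmatch(text):
        return int(text)  # may leak ValueError, which is undocumented
    return text


def TestOneInput(data):
    text = data.decode("utf-8", errors="replace")
    try:
        parse_scalar(text)
    except ParseError:
        # Documented failure mode: not a bug, so swallow it.
        return
    # Anything else (ValueError, TypeError, ...) propagates to the fuzzer.
```

The harness only catches the documented exception, so every other exception type surfaces as a finding without any extra correctness logic.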
Finally, Atheris supports fuzzing native Python extensions, using libFuzzer. libFuzzer is a fuzzing engine integrated into Clang, typically used for fuzzing C or C++. When using libFuzzer with Atheris, Atheris can still find all the bugs previously described, but can also find memory corruption bugs that only exist in native code. Atheris supports the Clang sanitizers Address Sanitizer and Undefined Behavior Sanitizer. These make it easy to detect corruption when it happens, rather than far later. In one case, the author of this document found an LLVM bug using an Atheris fuzzer (now fixed).
What does Atheris support?
OSS-Fuzz is a fuzzing service hosted by Google, where we execute fuzzers on open source code free of charge. OSS-Fuzz will soon support Atheris!
How can I get started?
pip3 install atheris
And then, just define a TestOneInput function that runs the code you want to fuzz:
def TestOneInput(data):
    if data == b"bad":
        raise RuntimeError("Badness!")
That’s it! Atheris will repeatedly invoke TestOneInput and monitor the execution flow, until a crash or exception occurs.
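The contract described above, calling TestOneInput until something raises, can be imitated by hand to replay saved inputs or to sanity-check a harness before fuzzing. The sketch below is only an illustration: replay is a hypothetical helper, not part of Atheris, and it performs none of the coverage-guided input generation that the real engine does.

```python
def TestOneInput(data):
    # Toy target: fail on one specific input (the error message is illustrative).
    if data == b"bad":
        raise RuntimeError("Badness!")


def replay(inputs):
    """Hypothetical helper: return the first input that makes the harness raise."""
    for data in inputs:
        try:
            TestOneInput(data)
        except Exception:
            return data
    return None


first_failure = replay([b"", b"good", b"bad", b"ugly"])
print(first_failure)
```

A helper like this can double as a regression test: once the fuzzer reports a failing input, replaying it after a fix should return None.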
For more details, including how to fuzz native code, see the README.
By Ian Eldred Pudney, Google Information Security