How-to Write a Python Fuzzer for TensorFlow
Posted by Laura Pak
Fuzz testing is a process of testing APIs with generated data. Fuzzing helps ensure that code does not break on the negative path by generating randomized inputs that try to cover every branch of the code. A popular choice is to pair fuzzers with sanitizers, which are tools that check for illegal conditions and thus flag the bugs triggered by the fuzzers’ inputs.
In this way, fuzzing can find:
- Buffer overflows
- Memory leaks
- Infinite recursion
- Round-trip consistency failures
- Uncaught exceptions
- And more.
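To illustrate the round-trip item in the list above, a fuzz target can assert that decoding an encoded input gives back the original bytes. This is a minimal sketch that uses Python's standard base64 module as a stand-in for the code under test; the function name round_trip_target is hypothetical and not part of TensorFlow.

```python
import base64


def round_trip_target(data: bytes) -> None:
    """Hypothetical fuzz target: encode, decode, and compare with the input."""
    encoded = base64.b64encode(data)
    decoded = base64.b64decode(encoded)
    # An uncaught AssertionError is the kind of failure a fuzzer would report.
    assert decoded == data, "round trip changed the data"
```

A fuzzing engine would call a function like this with many different byte strings; any input that breaks the equality shows up as a finding.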
The best way to fuzz is to have your fuzz tests running continuously. The more a test runs, the more inputs can be generated and tested against. In this article, you’ll learn how to add a Python fuzzer to TensorFlow.
The technical how-to
TensorFlow Python fuzzers run via OSS-Fuzz, the continuous fuzzing service for open source projects.
For Python fuzzers, OSS-Fuzz uses Atheris, a coverage-guided Python fuzzing engine. Atheris is based on the fuzzing engine libFuzzer, and it can be used with the dynamic memory error detector Address Sanitizer or the fast undefined behavior detector, Undefined Behavior Sanitizer. Atheris dependencies will be pre-installed on OSS-Fuzz base Docker images.
Here is a barebones example of a Python fuzzer for TF. The runtime will call TestCode with different random data.
import sys
import atheris_no_libfuzzer as atheris
atheris.Setup(sys.argv, TestCode, enable_python_coverage=True)  # TestCode(data) is your fuzz target
atheris.Fuzz()  # start the fuzzing loop
In the tensorflow repo, in the directory with the other fuzzers, add your own Python fuzzer like above. In TestCode, pick a TensorFlow API that you want to fuzz. In constant_fuzz.py, that API is tf.constant. That fuzzer simply passes data to the chosen API to see if it breaks. No need for code that catches the breakage; OSS-Fuzz will detect and report the bug.
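Put together, a fuzzer of this shape might look like the sketch below. It is not the contents of constant_fuzz.py. The module name atheris_no_libfuzzer is taken from the snippet in this post, and the optional api parameter is a hypothetical addition so the target can be tried out without TensorFlow installed.

```python
import sys


def TestCode(data, api=None):
    """Fuzz target: pass the fuzzer-provided bytes straight to one API."""
    if api is None:
        # Lazy import so this file can be loaded without TensorFlow installed.
        import tensorflow as tf
        api = tf.constant
    # No try/except here: an uncaught exception is what the fuzzer reports.
    api(data)


def main():
    # Module name as used in this post's snippet; also imported lazily.
    import atheris_no_libfuzzer as atheris
    atheris.Setup(sys.argv, TestCode, enable_python_coverage=True)
    atheris.Fuzz()
```

In a real fuzzer file you would call main() under an if __name__ == "__main__": guard. For a quick local check, the target can be called directly with a stand-in API, for example TestCode(b"abc", api=print).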
Sometimes an API needs more structured data than just one input. TensorFlow has a Python class called FuzzingHelper that lets you generate random int lists, a random bool, and so on. See an example of its use in sparseCountSparseOutput_fuzz.py, a fuzzer that checks for uncaught exceptions in the API it targets.
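The idea behind such a helper can be sketched in plain Python: consume the fuzzer's raw bytes piece by piece and turn each piece into a typed value. The ByteReader class below is a hypothetical stand-in written for this illustration. It is not FuzzingHelper, and its method names and signatures are assumptions rather than TensorFlow's API.

```python
class ByteReader:
    """Hypothetical structured-input helper (not TensorFlow's FuzzingHelper)."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0

    def _take(self, n: int) -> bytes:
        # Return the next n bytes, zero-padded once the input is exhausted.
        chunk = self._data[self._pos:self._pos + n]
        self._pos += n
        return chunk.ljust(n, b"\x00")

    def get_bool(self) -> bool:
        # Use the lowest bit of one byte.
        return bool(self._take(1)[0] & 1)

    def get_int(self, min_int: int = 0, max_int: int = 255) -> int:
        # Map 4 raw bytes into the inclusive range [min_int, max_int].
        raw = int.from_bytes(self._take(4), "big")
        return min_int + raw % (max_int - min_int + 1)

    def get_int_list(self, max_length: int = 8) -> list:
        # The list length is itself derived from the input bytes.
        length = self.get_int(0, max_length)
        return [self.get_int() for _ in range(length)]
```

Because every value is derived from the input bytes, the same input always produces the same arguments, which is what makes a reported crash reproducible.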
To build the fuzzer, declare a build target for it with attributes such as:
name = "fuzz_target_name",
srcs = ["your_fuzzer.py"],
tags = ["notap"],  # Important: include to run in OSS.
Testing your fuzzer with Docker
Make sure that your fuzzer builds in OSS-Fuzz with Docker.
First install Docker. In your terminal, run the command docker image prune to remove any dangling images.
Clone oss-fuzz from GitHub. The project for a Python TF fuzzer, tensorflow-py, contains a build.sh file to be executed in the Docker container defined in the Dockerfile. build.sh defines how to build binaries for fuzz targets in tensorflow-py. Specifically, it builds all the Python fuzzers found in $SRC/tensorflow/tensorflow, including your new fuzzer!
From the oss-fuzz directory, run the following command to open a shell in the project's container:
python infra/helper.py shell tensorflow
In that shell, run compile. It will run build.sh, which will attempt to build your new fuzzer.
Once your fuzzer is up and running, you can search this dashboard for your fuzzer to see what vulnerabilities your fuzzer has uncovered.
Fuzzing is an exciting way to test software from the unhappy path. Whether you want to dabble in security or gain a deeper understanding of TensorFlow’s internals, we hope this post gives you a good place to start.