-
Notifications
You must be signed in to change notification settings - Fork 417
Add script to generate fuzz coverage #3718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 I see @jkczyz was un-assigned. |
If needed I can also push my script to generate corpus from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should include this coverage data in our codecov output in PRs...On the one hand its not "reliable" coverage data in that we aren't actually testing the behavior, but on the other hand we are at least hitting the code lines and presumable would find crashes. WDYT?
contrib/generate_fuzz_coverage.sh
Outdated
|
||
if [ -d "$TARGET_CORPUS_DIR" ]; then | ||
echo "Running cargo llvm-cov for target: $TARGET_NAME with corpus $TARGET_CORPUS_DIR" | ||
if ! CARGO_TARGET_DIR=./target cargo +${TOOLCHAIN} llvm-cov run --no-report --manifest-path "$FUZZ_DIR/Cargo.toml" --target "$HOST_TRIPLE" --features "$COMBINED_FEATURES" --bin "$TARGET_NAME" -- "$TARGET_CORPUS_DIR" -runs=1; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than running the fuzz target specifically, can we just use cargo llvm-cov
to run the tests? For each binary in the fuzz tests we already support running cargo test
and it'll pull all the files from fuzz/test_cases/X
and run it through the fuzz processor. Seems easier than relying on the libfuzzer logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about that. Running on test_cases
with cargo test
would remove the fuzziness since now it will be run test_cases and wont run the fuzzer but since its coverage report we are basically doing the same thing. I would have to check on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone doing fuzzing can presumably copy their corpus from wherever it is to test_cases
. In the comment you note that we currently expect a corpus in corpus
, but it could well be any folder name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the script to use cargo test
instance now. Surprisingly reduced the code by 99%. Although I don't see the test_cases
in any branch . Maybe you can try running the script and provide some feedback since I dont know the format on how test_cases
directory has the inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, test_cases
isn't automatically created anywhere, its just read by the tests if it exists. Not sure what to do about that wrt the script here, maybe we should have it look for the llvm-libfuzzer, afl, and honggfuzz dirs, copy them to test (if its empty) and then run cargo test
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to go ahead and do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi . sorry for the late response . Have been busy recently with my academics . Would implement this by the end of this month hopefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the PR according to the comment , finding the corpus directories is not feasible since there is no clear indication/factor that can catch that this is a corpus directory apart from the object format which might also be different depending on the fuzzer. Also the location of corpus can be out of the project and some people tend to keep their project directory clean . Instead what I have done is to check for test_cases
and if its not present/empty it will throw a message and instruction to copy paste (but not halt the fuzz run).
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer. |
I would prefer to not add it in codecov until we have a good public corpora of fuzz inputs to run the report on. |
c45d613
to
8f005a0
Compare
One reason to do this would be to only include "trivially-reachable" things in the codecov output. We don't really want to include stuff as coverage if the fuzzer happened to reach it only once, but if the fuzzer found its way into it in a minute of CI time, its probably decent coverage :) |
Makes sense if thats the case. |
8f005a0
to
10679bb
Compare
Do you intend to go ahead and run this in CI? Or would you prefer not to? |
added this script to run in CI . |
a554c14
to
a460b7a
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, so sorry I missed your comments here.
ci/ci-tests.sh
Outdated
@@ -137,3 +137,28 @@ RUSTFLAGS="--cfg=splicing" cargo test --verbose --color always -p lightning | |||
RUSTFLAGS="--cfg=async_payments" cargo test --verbose --color always -p lightning | |||
[ "$CI_MINIMIZE_DISK_USAGE" != "" ] && cargo clean | |||
RUSTFLAGS="--cfg=lsps1_service" cargo test --verbose --color always -p lightning-liquidity | |||
|
|||
# Generate fuzz coverage report | |||
echo -e "\n\nGenerating fuzz coverage report" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, rather than doing this in ci-tests.sh
let's do this in a new script which is called from the coverage
job in CI.
contrib/generate_fuzz_coverage.sh
Outdated
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" | ||
# ignore anything in fuzz directory since we don't want coverage of targets | ||
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script currently checks for the existence of test corpus files but doesn't appear to use them when generating coverage with cargo llvm-cov
. For accurate fuzz coverage reporting, the command should incorporate these corpus files as test inputs.
Consider modifying the cargo llvm-cov
command to explicitly use the corpus files, perhaps with something like:
# For each target with a corpus
for target_dir in test_cases/*; do
if [ -d "$target_dir" ]; then
target_name=$(basename "$target_dir")
cargo llvm-cov --html --test-dir "$target_dir" --target "$target_name" ...
fi
done
This would ensure the coverage report reflects code paths exercised by the fuzzing corpus rather than just standard test execution.
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" | |
# ignore anything in fuzz directory since we don't want coverage of targets | |
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR" | |
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" | |
# Process each fuzz target with its corpus | |
mkdir -p "$OUTPUT_DIR" | |
for target_dir in fuzz/corpus/*; do | |
if [ -d "$target_dir" ]; then | |
target_name=$(basename "$target_dir") | |
echo "Generating coverage for target: $target_name" | |
# Run coverage for this specific target with its corpus | |
cargo llvm-cov --html --ignore-filename-regex "fuzz/" \ | |
--fuzz-target "$target_name" \ | |
--corpus-dir "$target_dir" \ | |
--output-path "$OUTPUT_DIR/$target_name" | |
fi | |
done | |
# Generate a combined report | |
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR/combined" | |
Spotted by Diamond
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I dont think this is the correct way ? Cargo-llvm will use all the tests_cases (if present) and all targets in this scenario. Individually calling them seems redundant work
c81c70f
to
4baa925
Compare
Failure caused because of no storage left on the ci infra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man I hate github CI. I think if we add a deletion of some of the files in target
after the first cargo llvm-cov
in the CI build.yml
it should pass.
934753f
to
e028e19
Compare
e028e19
to
f3ec93c
Compare
f3ec93c
to
029d266
Compare
029d266
to
d9b6814
Compare
CI passed . had to nuke the entire |
Might be worth giving CI a kick now that we fixed the |
adds the coverage report generation in CI . Also make the coverage script more CI friendly.
d9b6814
to
248ba29
Compare
This script generates the fuzz coverage in a html file when run .
Considerations
corpus
directory infuzz
directory.base32_target.rs
would be undercorpus/base32_target
.libFuzzer
it may/may not work with corpus generated from other fuzzers.How to run
# from the root directory chmod +x contrib/generate_fuzz_coverage.sh contrib/generate_fuzz_coverage.sh