Skip to content

Add script to generate fuzz coverage #3718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Prabhat1308
Copy link
Contributor

This script generates the fuzz coverage in a html file when run .

Considerations

  1. The script assumes a corpus directory in fuzz directory.
  2. Also assumes the corpus for a particular target to be under the name of the target itself. for eg - corpus for base32_target.rs would be under corpus/base32_target.
  3. The corpus I worked with was generated using libFuzzer it may/may not work with corpus generated from other fuzzers.

How to run

# from the root directory
chmod +x contrib/generate_fuzz_coverage.sh
contrib/generate_fuzz_coverage.sh

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Apr 8, 2025

👋 I see @jkczyz was un-assigned.
If you'd like another reviewer assignemnt, please click here.

@Prabhat1308
Copy link
Contributor Author

If needed I can also push my script to generate corpus from libFuzzer that create the corpus according to the considerations if the reviewers deem it worthy .

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz April 8, 2025 19:51
@jkczyz jkczyz requested review from TheBlueMatt and removed request for jkczyz April 8, 2025 21:50
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should include this coverage data in our codecov output in PRs...On the one hand its not "reliable" coverage data in that we aren't actually testing the behavior, but on the other hand we are at least hitting the code lines and presumable would find crashes. WDYT?


if [ -d "$TARGET_CORPUS_DIR" ]; then
echo "Running cargo llvm-cov for target: $TARGET_NAME with corpus $TARGET_CORPUS_DIR"
if ! CARGO_TARGET_DIR=./target cargo +${TOOLCHAIN} llvm-cov run --no-report --manifest-path "$FUZZ_DIR/Cargo.toml" --target "$HOST_TRIPLE" --features "$COMBINED_FEATURES" --bin "$TARGET_NAME" -- "$TARGET_CORPUS_DIR" -runs=1; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than running the fuzz target specifically, can we just use cargo llvm-cov to run the tests? For each binary in the fuzz tests we already support running cargo test and it'll pull all the files from fuzz/test_cases/X and run it through the fuzz processor. Seems easier than relying on the libfuzzer logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about that. Running on test_cases with cargo test would remove the fuzziness since now it will be run test_cases and wont run the fuzzer but since its coverage report we are basically doing the same thing. I would have to check on this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone doing fuzzing can presumably copy their corpus from wherever it is to test_cases. In the comment you note that we currently expect a corpus in corpus, but it could well be any folder name?

Copy link
Contributor Author

@Prabhat1308 Prabhat1308 Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the script to use cargo test instance now. Surprisingly reduced the code by 99%. Although I don't see the test_cases in any branch . Maybe you can try running the script and provide some feedback since I dont know the format on how test_cases directory has the inputs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, test_cases isn't automatically created anywhere, its just read by the tests if it exists. Not sure what to do about that wrt the script here, maybe we should have it look for the llvm-libfuzzer, afl, and honggfuzz dirs, copy them to test (if its empty) and then run cargo test?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to go ahead and do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi . sorry for the late response . Have been busy recently with my academics . Would implement this by the end of this month hopefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR according to the comment , finding the corpus directories is not feasible since there is no clear indication/factor that can catch that this is a corpus directory apart from the object format which might also be different depending on the fuzzer. Also the location of corpus can be out of the project and some people tend to keep their project directory clean . Instead what I have done is to check for test_cases and if its not present/empty it will throw a message and instruction to copy paste (but not halt the fuzz run).

@ldk-reviews-bot
Copy link

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@Prabhat1308
Copy link
Contributor Author

I wonder if we should include this coverage data in our codecov output in PRs...On the one hand its not "reliable" coverage data in that we aren't actually testing the behavior, but on the other hand we are at least hitting the code lines and presumable would find crashes. WDYT?

I would prefer to not add it in codecov until we have a good public corpora of fuzz inputs to run the report on.

@TheBlueMatt
Copy link
Collaborator

I would prefer to not add it in codecov until we have a good public corpora of fuzz inputs to run the report on.

One reason to do this would be to only include "trivially-reachable" things in the codecov output. We don't really want to include stuff as coverage if the fuzzer happened to reach it only once, but if the fuzzer found its way into it in a minute of CI time, its probably decent coverage :)

@Prabhat1308
Copy link
Contributor Author

I would prefer to not add it in codecov until we have a good public corpora of fuzz inputs to run the report on.

One reason to do this would be to only include "trivially-reachable" things in the codecov output. We don't really want to include stuff as coverage if the fuzzer happened to reach it only once, but if the fuzzer found its way into it in a minute of CI time, its probably decent coverage :)

Makes sense if thats the case.

@TheBlueMatt
Copy link
Collaborator

Do you intend to go ahead and run this in CI? Or would you prefer not to?

@Prabhat1308
Copy link
Contributor Author

added this script to run in CI .

@Prabhat1308 Prabhat1308 force-pushed the pv/add_cov_script branch 3 times, most recently from a554c14 to a460b7a Compare June 15, 2025 16:54
@Prabhat1308
Copy link
Contributor Author

1.63 versions are failing because llvm-cov requires 1.65 , other 2 failures are because of storage issue on the CI instance.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, so sorry I missed your comments here.

ci/ci-tests.sh Outdated
@@ -137,3 +137,28 @@ RUSTFLAGS="--cfg=splicing" cargo test --verbose --color always -p lightning
RUSTFLAGS="--cfg=async_payments" cargo test --verbose --color always -p lightning
[ "$CI_MINIMIZE_DISK_USAGE" != "" ] && cargo clean
RUSTFLAGS="--cfg=lsps1_service" cargo test --verbose --color always -p lightning-liquidity

# Generate fuzz coverage report
echo -e "\n\nGenerating fuzz coverage report"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, rather than doing this in ci-tests.sh let's do this in a new script which is called from the coverage job in CI.

Comment on lines 54 to 56
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz"
# ignore anything in fuzz directory since we don't want coverage of targets
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script currently checks for the existence of test corpus files but doesn't appear to use them when generating coverage with cargo llvm-cov. For accurate fuzz coverage reporting, the command should incorporate these corpus files as test inputs.

Consider modifying the cargo llvm-cov command to explicitly use the corpus files, perhaps with something like:

# For each target with a corpus
for target_dir in test_cases/*; do
  if [ -d "$target_dir" ]; then
    target_name=$(basename "$target_dir")
    cargo llvm-cov --html --test-dir "$target_dir" --target "$target_name" ...
  fi
done

This would ensure the coverage report reflects code paths exercised by the fuzzing corpus rather than just standard test execution.

Suggested change
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz"
# ignore anything in fuzz directory since we don't want coverage of targets
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR"
export RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz"
# Process each fuzz target with its corpus
mkdir -p "$OUTPUT_DIR"
for target_dir in fuzz/corpus/*; do
if [ -d "$target_dir" ]; then
target_name=$(basename "$target_dir")
echo "Generating coverage for target: $target_name"
# Run coverage for this specific target with its corpus
cargo llvm-cov --html --ignore-filename-regex "fuzz/" \
--fuzz-target "$target_name" \
--corpus-dir "$target_dir" \
--output-path "$OUTPUT_DIR/$target_name"
fi
done
# Generate a combined report
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR/combined"

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

Copy link
Contributor Author

@Prabhat1308 Prabhat1308 Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I dont think this is the correct way ? Cargo-llvm will use all the tests_cases (if present) and all targets in this scenario. Individually calling them seems redundant work

@Prabhat1308
Copy link
Contributor Author

Failure caused because of no storage left on the ci infra

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man I hate github CI. I think if we add a deletion of some of the files in target after the first cargo llvm-cov in the CI build.yml it should pass.

@Prabhat1308
Copy link
Contributor Author

Working locally when run with the CI workflow.

Screenshot 2025-07-08 at 9 52 01 PM

@Prabhat1308
Copy link
Contributor Author

CI passed . had to nuke the entire target directory

@tnull
Copy link
Contributor

tnull commented Jul 10, 2025

Might be worth giving CI a kick now that we fixed the No space left on device issue (for now) with #3916.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants