Conversation

@Marcono1234

Tries to improve the bounds checks for LZ4.

A few general notes:

  • I only checked LZ4.java; not sure if other classes or the native code might have similar issues
  • I am not familiar with this project, so in case you are thinking about changes in other parts of the code as well, feel free to directly include my changes and close this PR

See the review comments below for additional notes.

@Marcono1234 Marcono1234 marked this pull request as draft September 14, 2025 16:33
Comment on lines 72 to 75
+ if (bound < 0) {
+     throw new IllegalArgumentException("Invalid size");
+ }
+ return bound;
Author

Checks for overflow. An alternative could be to change the return type to long, but I am not sure if the callers can handle that better, and it would be an incompatible change to this public API.
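
To illustrate why the check matters, here is a minimal sketch of the overflow. The formula and the check are taken from the diff; the standalone class `CompressBoundDemo` and the method `compressBoundUnchecked` are hypothetical names used only for this example.

```java
final class CompressBoundDemo {
    // Old behavior: the sum can exceed Integer.MAX_VALUE and wrap around.
    static int compressBoundUnchecked(int size) {
        return size + size / 255 + 16;
    }

    // New behavior: a wrapped (negative) result is rejected.
    static int compressBound(int size) {
        int bound = size + size / 255 + 16;
        if (bound < 0) {
            throw new IllegalArgumentException("Invalid size");
        }
        return bound;
    }

    public static void main(String[] args) {
        // For a size close to Integer.MAX_VALUE the unchecked result is negative,
        // so a caller comparing it against a buffer length would be misled.
        System.out.println(compressBoundUnchecked(Integer.MAX_VALUE));
        try {
            compressBound(Integer.MAX_VALUE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

For non-negative sizes the sum stays below 2^32, so any overflow wraps to a negative value, which is why a single `bound < 0` test is enough in that range.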


  public static int compress(byte[] src, int srcOffset, byte[] dst, int dstOffset, int length) {
-     if (srcOffset < 0 || srcOffset + length > src.length || dstOffset < 0 || length < 0) {
+     if (srcOffset < 0 || srcOffset > src.length - length || dstOffset < 0 || length < 0) {
Author

@Marcono1234 Marcono1234 Sep 14, 2025

Should be equivalent (it subtracts length from both sides of the comparison), except that it is overflow-safe for large length values.
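
A small sketch of the difference between the two forms. The class `BoundsCheckDemo` and its method names are hypothetical, and the `dstOffset` part of the condition is left out for brevity.

```java
final class BoundsCheckDemo {
    // Old form: srcOffset + length can wrap around to a negative int,
    // so the comparison against srcLength no longer rejects the input.
    static boolean oldCheckRejects(int srcLength, int srcOffset, int length) {
        return srcOffset < 0 || srcOffset + length > srcLength || length < 0;
    }

    // New form: srcLength - length cannot overflow when length >= 0,
    // and a negative length is rejected by the last clause anyway.
    static boolean newCheckRejects(int srcLength, int srcOffset, int length) {
        return srcOffset < 0 || srcOffset > srcLength - length || length < 0;
    }

    public static void main(String[] args) {
        int srcLength = 10;
        // In-range input: both forms accept it.
        System.out.println(oldCheckRejects(srcLength, 2, 8) + " " + newCheckRejects(srcLength, 2, 8));
        // Huge length: only the new form rejects it.
        System.out.println(oldCheckRejects(srcLength, 1, Integer.MAX_VALUE) + " "
                + newCheckRejects(srcLength, 1, Integer.MAX_VALUE));
    }
}
```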

Comment on lines -539 to +551
- int s;
- do {
+ int s = 255;
+ while (ip < srcEnd - RUN_MASK && s == 255) {
      s = unsafe.getByte(src, ip++) & 0xff;
      length += s;
- } while (ip < srcEnd - RUN_MASK && s == 255);
+ }
Author

Previously performed an out-of-bounds read by one byte if the input buffer did not have enough data.

Not sure if that is only an issue for the first iteration of the enclosing for loop. In that case maybe the do-while loop could be kept and instead an additional bounds check could be added outside the for loop. Though I am not very familiar with the LZ4 format, and whether such a change would work.
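
The two loop shapes can be compared in isolation with a simplified sketch. It uses a plain `byte[]` instead of `unsafe.getByte`, so the out-of-bounds read shows up as an exception instead of a silent read. The class `LengthLoopDemo`, the starting `length` of 0, and the `RUN_MASK` value are assumptions made for illustration.

```java
final class LengthLoopDemo {
    static final int RUN_MASK = 15; // assumed value, for illustration only

    // Old shape: the byte is read before the bound is checked.
    static int readLengthDoWhile(byte[] src, int ip, int srcEnd) {
        int length = 0;
        int s;
        do {
            s = src[ip++] & 0xff;
            length += s;
        } while (ip < srcEnd - RUN_MASK && s == 255);
        return length;
    }

    // New shape: the bound is checked before every read, including the first.
    static int readLengthWhile(byte[] src, int ip, int srcEnd) {
        int length = 0;
        int s = 255;
        while (ip < srcEnd - RUN_MASK && s == 255) {
            s = src[ip++] & 0xff;
            length += s;
        }
        return length;
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        System.out.println(readLengthWhile(empty, 0, 0)); // no read happens
        try {
            readLengthDoWhile(empty, 0, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("do-while read past the end of the input");
        }
    }
}
```

On well-formed input with enough remaining bytes, both shapes return the same length; they differ only when the bound is already violated on entry.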

@Marcono1234 Marcono1234 marked this pull request as ready for review September 14, 2025 16:44
-     return size + size / 255 + 16;
+     int bound = size + size / 255 + 16;
+     if (bound < 0) {
+         throw new IllegalArgumentException("Invalid size");
Collaborator

Please add detail to the message about which input it fails on.

Author

@Marcono1234 Marcono1234 Oct 4, 2025

Do you mean something like "Invalid size: " + size? Besides the size parameter this method does not have any additional context it could add.

And because this method is public, it might not necessarily be a "too large" size, but could also be a negative size provided by the user (though in that case this validation does not catch all negative values).

The main point here is to protect against overflow, because otherwise at the call sites the bounds checks which rely on the result of compressBound would be ineffective in case of overflow.
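
As a sketch of the points above, the following variant includes the size in the message and shows that only some negative sizes are caught. The class name `CompressBoundMessageDemo` is hypothetical, and the message format is just the one floated in this comment, not necessarily what the project ends up using.

```java
final class CompressBoundMessageDemo {
    static int compressBound(int size) {
        int bound = size + size / 255 + 16;
        if (bound < 0) {
            // Include the offending input in the message.
            throw new IllegalArgumentException("Invalid size: " + size);
        }
        return bound;
    }

    public static void main(String[] args) {
        // A small negative size slips through: -1 + 0 + 16 is not negative.
        System.out.println(compressBound(-1));
        // A larger negative size is rejected, as is an overflowing positive one.
        for (int size : new int[] {-100, Integer.MAX_VALUE}) {
            try {
                compressBound(size);
            } catch (IllegalArgumentException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```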

Collaborator

@max-kammerer max-kammerer left a comment

Please add tests for the updated logic.

@Marcono1234
Author

Have added tests and also added a missing check for dstOffset > dst.length which I had overlooked before. I hope that is what you had in mind.

}

@Test
public void compress_BoundsChecks() {
Collaborator

Please use standard Java code style with camelCase, and avoid using _ here and below.

One possible variant: testCompressBoundsChecks

Author

OK, I can omit the _ if you prefer, though that makes it a bit more difficult to tell the tested method (e.g. compress) apart from the scenario (e.g. BoundsChecks).

Collaborator

Interesting point, but it differs from the test naming approach used in the project.

- int s;
- do {
+ int s = 255;
+ while (ip < srcEnd - RUN_MASK && s == 255) {
Collaborator

Does decompress_Incomplete cover this change?

Collaborator

If yes, please be aware of this check; most likely the first branch, which calls the native code, is executed:

    if (NativeLibrary.IS_SUPPORTED) {
        result = decompress0(src, srcOffset, dst, dstOffset, length, dst.length - dstOffset);
    } else {
        result = decompress(src, byteArrayOffset + srcOffset, dst, byteArrayOffset + dstOffset, length, dst.length - dstOffset);
    }

Author

Yes, the test decompress_Incomplete is supposed to cover this, but good point that this code is not reached if the native library is used.

But there is nothing I can do about it, can I? Or do you have any suggestion?
(Locally I was testing with a modified version that always used the non-native implementation.)

Collaborator

@Marcono1234 I think we can add an environment flag to choose which branch is tested, or simply make

private static int decompress

package-private and call it directly in the test.
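
Both options could look roughly like the following sketch. Everything here is hypothetical: `DispatchDemo` stands in for the real class, `NATIVE_SUPPORTED` stands in for `NativeLibrary.IS_SUPPORTED`, the methods return marker strings instead of decompressing anything, and a system property named `demo.forceJava` is used in place of an environment flag.

```java
final class DispatchDemo {
    // Stand-in for NativeLibrary.IS_SUPPORTED; hard-coded for illustration.
    static final boolean NATIVE_SUPPORTED = true;

    // Option 1: a (hypothetical) system property forces the pure-Java branch.
    static boolean useNative() {
        return NATIVE_SUPPORTED && !Boolean.getBoolean("demo.forceJava");
    }

    public static String decompress(byte[] src) {
        return useNative() ? decompressNative(src) : decompressJava(src);
    }

    private static String decompressNative(byte[] src) {
        return "native";
    }

    // Option 2: no access modifier (package-private) instead of private,
    // so a test in the same package can call this method directly.
    static String decompressJava(byte[] src) {
        return "java";
    }

    public static void main(String[] args) {
        System.out.println(decompress(new byte[0]));
        System.setProperty("demo.forceJava", "true");
        System.out.println(decompress(new byte[0]));
    }
}
```

The property approach exercises the public entry point including its argument checks, while the package-private approach needs no runtime configuration but bypasses the dispatching method.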
