Skip to content

Add an error margin when comparing floats. #129721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gmarouli
Copy link
Contributor

We add a margin of error when comparing floats to the DynamicFieldMatcher to account for a small loss of accuracy when loading fields from synthetic source.

Furthermore, we switched the set to a list to take the order of array elements also into account since it's supported.

Fixes: #129527, #129528, #129529

@gmarouli gmarouli added auto-backport Automatically create backport pull requests when merged :StorageEngine/Mapping The storage related side of mappings v8.19.0 labels Jun 19, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

}

for (int i = 0; i < normalizedActual.size(); i++) {
if (floatsEquals(normalizedActual.get(i), normalizedExpected.get(i))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if floatsEquals(...) returns true then there is no match? Maybe rename floatsEquals(...) to floatsNoEquals(...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh boy, my bad. Actually I prefer to keep it floatsEquals and fix the code because its more intuitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also good with me 👍

@gmarouli
Copy link
Contributor Author

Thank you @martijnvg for the review, such an oversight the method name and implementation 🙃 . I run the test locally with -Dtests.iters=N, because it's a rest test it takes a bit longer so around 600+ it reached the timeout. But I think it gave me enough confidence for now.

What do you think?

@gmarouli gmarouli requested review from martijnvg and kkrik-es June 20, 2025 08:53
@martijnvg
Copy link
Member

What do you think?

Yes, that should give good confidence.

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Comment on lines +76 to +84
if (normalizedActual.size() != normalizedExpected.size()) {
return noMatchSupplier.get();
}

for (int i = 0; i < normalizedActual.size(); i++) {
if (floatEquals(normalizedActual.get(i), normalizedExpected.get(i)) == false) {
return noMatchSupplier.get();
}
}
Copy link
Contributor

@jordan-powers jordan-powers Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these checks are on the normalized lists, which have null values removed. This means if there's an extra null somewhere, or if the null is in a different position between the expected and actual, we wouldn't catch it here. Not sure if this is intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct @jordan-powers , it's not exactly intentional, more preserving existing behaviour.

I have been contemplating about it. The check before was less strict, it only checked for the unique values to be the same, no order, no null, no duplicates. I thought we need to make it stricter at least by checking the order and the values. I wasn't sure about the nulls, mainly, to avoid tripping an NPE, somewhere else.

On the other hand, I guess since we made it stricter, we could take it "all the way".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more update, while snooping a little bit more around the code I found this:

// Synthetic source modifications:
// * null values are not present
// * duplicates are removed
return values.stream()
.filter(v -> v != null && Objects.equals(v, "null") == false)
.map(transform)
.distinct()
.collect(Collectors.toList());

It appears that this was intentional. But if I am not mistaken, it doesn't hold anymore, we do preserve order now, right @martijnvg ?

If that is correct, I will open a follow up PR where I address this in all both places and not only when checking doubles. Otherwise, I will revert back to the original so they are in sync.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed offline, indeed this is not the case anymore. But considering the scope of this PR we do not want to expand to the other normalisers. This can be changed in follow up work.

We chose to keep the non-null values in the given order for the double normalisation because this helps the float comparison with the error margin.

@gmarouli gmarouli merged commit d859366 into elastic:main Jun 23, 2025
32 checks passed
@gmarouli gmarouli deleted the test-fix/add-error-margin-for-float-equality branch June 23, 2025 15:46
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 129721

@gmarouli
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Jun 23, 2025
We add a margin of error when comparing floats to the DynamicFieldMatcher to account for a small loss of accuracy when loading fields from synthetic source.

(cherry picked from commit d859366)

# Conflicts:
#	muted-tests.yml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged backport pending :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine >test Issues or PRs that are addressing/adding tests v8.19.0 v9.1.0
Projects
None yet
5 participants