Skip to content

Add an error margin when comparing floats. #129721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 0 additions & 9 deletions muted-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -532,15 +532,6 @@ tests:
- class: org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT
method: test {knn-function.KnnSearchWithKOption SYNC}
issue: https://github.com/elastic/elasticsearch/issues/129512
- class: org.elasticsearch.xpack.logsdb.qa.StandardVersusStandardReindexedIntoLogsDbChallengeRestIT
method: testMatchAllQuery
issue: https://github.com/elastic/elasticsearch/issues/129527
- class: org.elasticsearch.xpack.logsdb.qa.BulkChallengeRestIT
method: testMatchAllQuery
issue: https://github.com/elastic/elasticsearch/issues/129528
- class: org.elasticsearch.xpack.logsdb.qa.StoredSourceLogsDbVersusReindexedLogsDbChallengeRestIT
method: testMatchAllQuery
issue: https://github.com/elastic/elasticsearch/issues/129529
- class: org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceTests
method: testIORateIsAdjustedForAllRunningMergeTasks
issue: https://github.com/elastic/elasticsearch/issues/129531
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,14 +17,14 @@
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.function.Supplier;

import static org.elasticsearch.datageneration.matchers.Messages.formatErrorMessage;
import static org.elasticsearch.datageneration.matchers.Messages.prettyPrintCollections;

class DynamicFieldMatcher {
private static final double FLOAT_ERROR_MARGIN = 1e-8;
private final XContentBuilder actualMappings;
private final Settings.Builder actualSettings;
private final XContentBuilder expectedMappings;
Expand Down Expand Up @@ -62,31 +62,51 @@ public MatchResult match(List<Object> actual, List<Object> expected) {

var normalizedActual = normalizeDoubles(actual);
var normalizedExpected = normalizeDoubles(expected);
Supplier<MatchResult> noMatchSupplier = () -> MatchResult.noMatch(
formatErrorMessage(
actualMappings,
actualSettings,
expectedMappings,
expectedSettings,
"Values of dynamically mapped field containing double values don't match after normalization, normalized "
+ prettyPrintCollections(normalizedActual, normalizedExpected)
)
);

return normalizedActual.equals(normalizedExpected)
? MatchResult.match()
: MatchResult.noMatch(
formatErrorMessage(
actualMappings,
actualSettings,
expectedMappings,
expectedSettings,
"Values of dynamically mapped field containing double values don't match after normalization, normalized "
+ prettyPrintCollections(normalizedActual, normalizedExpected)
)
);
if (normalizedActual.size() != normalizedExpected.size()) {
return noMatchSupplier.get();
}

for (int i = 0; i < normalizedActual.size(); i++) {
if (floatEquals(normalizedActual.get(i), normalizedExpected.get(i)) == false) {
return noMatchSupplier.get();
}
}
Comment on lines +76 to +84
Copy link
Contributor

@jordan-powers jordan-powers Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these checks are on the normalized lists, which have null values removed. This means if there's an extra null somewhere, or if the null is in a different position between the expected and actual, we wouldn't catch it here. Not sure if this is intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct @jordan-powers , it's not exactly intentional, more preserving existing behaviour.

I have been contemplating about it. The check before was less strict, it only checked for the unique values to be the same, no order, no null, no duplicates. I thought we need to make it stricter at least by checking the order and the values. I wasn't sure about the nulls, mainly, to avoid tripping an NPE, somewhere else.

On the other hand, I guess since we made it stricter, we could take it "all the way".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more update, while snooping a little bit more around the code I found this:

// Synthetic source modifications:
// * null values are not present
// * duplicates are removed
return values.stream()
.filter(v -> v != null && Objects.equals(v, "null") == false)
.map(transform)
.distinct()
.collect(Collectors.toList());

It appears that this was intentional. But if I am not mistaken, it doesn't hold anymore, we do preserve order now, right @martijnvg ?

If that is correct, I will open a follow up PR where I address this in all both places and not only when checking doubles. Otherwise, I will revert back to the original so they are in sync.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed offline, indeed this is not the case anymore. But considering the scope of this PR we do not want to expand to the other normalisers. This can be changed in follow up work.

We chose to keep the non-null values in the given order for the double normalisation because this helps the float comparison with the error margin.


return MatchResult.match();
}

return matchWithGenericMatcher(actual, expected);
}

private static Set<Float> normalizeDoubles(List<Object> values) {
/**
* We make the normalisation of double values stricter than {@link SourceTransforms#normalizeValues} to facilitate the equality of the
* values within a margin of error. Synthetic source does support duplicate values and preserves the order, but it loses some accuracy,
* this is why the margin of error is very important. In the future, we can make {@link SourceTransforms#normalizeValues} also stricter.
*/
private static List<Float> normalizeDoubles(List<Object> values) {
if (values == null) {
return Set.of();
return List.of();
}

Function<Object, Float> toFloat = (o) -> o instanceof Number n ? n.floatValue() : Float.parseFloat((String) o);
return values.stream().filter(Objects::nonNull).map(toFloat).collect(Collectors.toSet());

// We skip nulls because they trip the pretty print collections.
return values.stream().filter(Objects::nonNull).map(toFloat).toList();
}

private static boolean floatEquals(Float actual, Float expected) {
return Math.abs(actual - expected) < FLOAT_ERROR_MARGIN;
}

private MatchResult matchWithGenericMatcher(List<Object> actualValues, List<Object> expectedValues) {
Expand Down