-
Notifications
You must be signed in to change notification settings - Fork 108
Add validation/verification to the planner to avoid alias ambiguities and unresolved aliases #3405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… and unresolved aliases
@@ -229,6 +229,12 @@ private void yieldExpression(@Nonnull final RelationalExpression expression, fin | |||
} | |||
} | |||
|
|||
protected void validateNewExpression(@Nonnull final RelationalExpression expression) { | |||
Debugger.sanityCheck(() -> verifyChildrenMemoized(expression)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be part of a sanity check? The old code always validated that the children were memoized, so this is making things laxer, at least in the production configuration of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. My thinking was that this has never triggered and we possibly can use the cycles elsewhere.
@Nonnull final EvaluationContext evaluationContext) { | ||
final Set<CorrelationIdentifier> correlatedToWithoutChildren; | ||
if (expression instanceof RelationalExpressionWithChildren) { | ||
correlatedToWithoutChildren = ((RelationalExpressionWithChildren)expression).getCorrelatedToWithoutChildren(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this doesn't go down the children of this expression and validate those correlations as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That gets a little more tricky. In a lot of cases the expression that is yielded is the only thing that has changed. In those cases, this is exhaustive. If there is a memoization call that returns a new reference, and an expression in that new reference has a problem, we wouldn't catch it. Maybe I can call it when the exploration of that new reference starts. That however, would make the graph validation at beginning of planning superfluous.
correlatedToWithoutChildren = expression.getCorrelatedTo(); | ||
} | ||
|
||
final var visibleThroughEvaluationContext = evaluationContext.getBindings().getBoundCorrelationAliases(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for things like constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there are bunch of test cases written using incomplete graphs, and sometimes using temp tables. This seems to be the easier way out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant themselves live in a different namespace among the bindings, so this is for actual ___corr_XXX
final var parentRefPaths = traversal.getParentRefPaths(this); | ||
|
||
if (parentRefPaths.isEmpty()) { | ||
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: " + currentUnresolvedCorrelatedTo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: " + currentUnresolvedCorrelatedTo); | |
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: %s", currentUnresolvedCorrelatedTo); |
To prevent us from calculating the error message even if the verify succeeds
final var parentRefPaths = traversal.getParentRefPaths(this); | ||
|
||
if (parentRefPaths.isEmpty()) { | ||
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: " + | |
Verify.verify(currentUnresolvedCorrelatedTo.isEmpty(), "unresolved aliases: %s", |
|
||
// run sanity check to make sure that all aliases handed in can be uniquely resolved | ||
Debugger.sanityCheck(() -> | ||
currentRoot.verifyCorrelationsRecursive(evaluationContext.getBindings().getBoundCorrelationAliases())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How expensive do you think it would be to run this check here at the beginning of planning? Running it with every new expression (outside of a sanity check) is probably too much, but it would probably be a good idea to verify at the start that the initial input is okay. Maybe there's a concern that we'd start failing someone's query that is technically illegal (or that we identify as illegal as a bug) which results in failures in production code. We may also need to switch this away from the Verify
framework so that we generate an appropriate error code if we're using this validate user input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could always run it, sure! If this triggers I would always view it as a bug answer the plan generator should have caught it, so I guess verify
is fine.
@@ -321,9 +321,9 @@ private List<Long> multiplesOf(@Nonnull final List<Long> initial, long limit) th | |||
|
|||
final var logicalPlan = Reference.initialOf(LogicalSortExpression.unsorted(Quantifier.forEach(Reference.initialOf(recursiveUnionPlan)))); | |||
final var cascadesPlanner = (CascadesPlanner)planner; | |||
final var plan = cascadesPlanner.planGraph(() -> logicalPlan, Optional.empty(), IndexQueryabilityFilter.TRUE, EvaluationContext.empty()).getPlan(); | |||
final var evaluationContext = putTempTableInContext(seedingTempTableAlias, seedingTempTable, null); | |||
final var plan = cascadesPlanner.planGraph(() -> logicalPlan, Optional.empty(), IndexQueryabilityFilter.TRUE, evaluationContext).getPlan(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was "wrong" before, but by calling cascadesPlanner::planGraph
directly instead of FDBRecordStoreQueryTest::planGraph
, it's not sending the plans through serialization verification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me see if I can fix that.
.../test/java/com/apple/foundationdb/record/provider/foundationdb/query/RecursiveUnionTest.java
Show resolved
Hide resolved
@@ -245,6 +245,7 @@ public void bitmap(YamlTest.Runner runner) throws Exception { | |||
} | |||
|
|||
@TestTemplate | |||
@MaintainYamlTestConfig(YamlTestConfigFilters.CORRECT_EXPLAINS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaintainYamlTestConfig(YamlTestConfigFilters.CORRECT_EXPLAINS) |
Are there explains in the recursive-cte.yamsql
test file that need to be modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I need to revert that. No, I sometimes do that as it doesn't run the multi server stuff if I repeatedly run it.
if (rule instanceof ImplementationCascadesRule) { | ||
for (RelationalExpression expression : expectedList) { | ||
preExploreForRule(expression, true); | ||
preExploreForRule(expression, Traversal.withRoot(Reference.initialOf(expression)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this traversal be based on the root
rather than the expression
? It's possible the root and the expression can share subgraphs, so does that mean we'd need to copy the root and the expression as a unit to get the right traversal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I assume that root
includes group
which directly contains expression
. The rule under testing is run on group
. So the pre-explore has to run on group
.
This PR adds logic to
CascadesPlanner
REWRITING
andPLANNING
.This validation logic is implemented in
Reference.verifyCorrelationsRecursive
andReference.verifyCorrelationsForNewExpression
.I ran extensive verification tests with these checks enabled, but as coded up in the PR it will only be run if an insane
Debugger
is installed. That is currently the case for:fdb-record-layer-core:test
but not for:fdb-relational-layer-core:test
.Upon running these tests the following problems surfaced:
RecordQuery
that causedInComparisonToExplodeRule
to yield an incorrectExplodeExpression
of aQOV(parameter)
whereparameter
is a bound parameter marker. SinceQOV(...)
is only properly defined over correlations,QOV(parameter)
is an illegal alias reference asparameter
is not a correlation. This was fixed by introducingParameterValue
that can dereference an actual parameter. Other rules such asImplementInJoinRule
andImplementInUnionRule
had to be adapted to create the properInSource
flavors.RecursiveUnionExpression
(andRecordQueryRecursiveUnionPlan
) can correlate AND descendants can refer to the temp aliases. Logic was added to allow the alias resolution to understand that these temp table aliases are valid aliases to be used in subgraphs underneathRecursiveUnionExpression
(andRecordQueryRecursiveUnionPlan
).QueryPlan.strictlySorted(...)
was called (we incorrectly did not inherit the right quantifier alias). The fix is a one-liner inRecordQueryPredicatesFilterPlan
.EvaluationContext
to be passed in that the verification logic can use to identify additionally visible aliases.RuleTestHelper
asa. some query graphs were incomplete (dangling unresolved aliases)
b. the
Traversal
that is used inside ofCascadesRuleCall
to validate new expressions was incorrectly maintained inRuleTestHelper