HIVE-28222: Ambiguous table alias exception for queries with self joins #5998

Open: 1 commit to be merged into base branch master

@zabetak (Member) commented Jul 29, 2025

What changes were proposed in this pull request?

  1. Collect all table aliases under joins
  2. Introduce derived table whenever the same alias appears more than once
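
The two steps above can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `AliasDemo`, `Scan`, `Join`, `collectAliases`, and `repeatedAliases` are hypothetical names that do not come from the patch, the nodes are plain Java classes rather than Calcite `RelNode`s, and the sketch only detects the repeated aliases instead of generating the derived table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AliasDemo {
    // Hypothetical toy plan nodes; these are NOT Calcite's RelNode classes.
    static abstract class Node {}

    static final class Scan extends Node {
        final String alias;
        Scan(String alias) { this.alias = alias; }
    }

    static final class Join extends Node {
        final Node left;
        final Node right;
        Join(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Step 1: gather every table alias reachable through nested joins, left to right.
    static List<String> collectAliases(Node node) {
        List<String> aliases = new ArrayList<>();
        collect(node, aliases);
        return aliases;
    }

    private static void collect(Node node, List<String> out) {
        if (node instanceof Scan) {
            out.add(((Scan) node).alias);
        } else if (node instanceof Join) {
            Join join = (Join) node;
            collect(join.left, out);
            collect(join.right, out);
        }
    }

    // Step 2 (detection only): aliases that occur more than once are the ones
    // that would need to be wrapped in a derived table with a fresh alias.
    static List<String> repeatedAliases(Node node) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String alias : collectAliases(node)) {
            counts.merge(alias, 1, Integer::sum);
        }
        List<String> repeated = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                repeated.add(entry.getKey());
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        // (t1 JOIN t2) JOIN t1: the alias t1 appears twice under the top join.
        Node plan = new Join(new Join(new Scan("t1"), new Scan("t2")), new Scan("t1"));
        System.out.println(collectAliases(plan));  // [t1, t2, t1]
        System.out.println(repeatedAliases(plan)); // [t1]
    }
}
```

In this toy model, a non-empty result from `repeatedAliases` marks the situation in which the generated AST would be ambiguous.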

Why are the changes needed?

Some queries that join the same table more than twice fail during compilation, while the optimized AST (obtained from CBO) is being transformed into an Operator tree. The "Ambiguous table alias" exception is raised when a HiveJoin contains the same table scan multiple times (without an interleaving project), because the generated AST is ambiguous.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile_regex=cbo_self_join.*

@@ -123,29 +127,35 @@ public static RelNode convertOpTree(RelNode rel, List<FieldSchema> resultSchema,
return newTopNode;
}

private static String getTblAlias(RelNode rel) {
private static class AliasCollector extends RelHomogeneousShuttle {
List<String> aliases = new ArrayList<>();
Contributor

What happens when accept is called more than once on the same instance? Is this list reset before each call?
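
The concern can be illustrated with a small sketch. `StatefulCollector` and `visit` are hypothetical names, not the class or method from the patch; the sketch only mimics a visitor that stores its results in an instance field and never clears it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collector, NOT the AliasCollector from the patch.
final class StatefulCollector {
    final List<String> aliases = new ArrayList<>();

    // Each visit appends to the same list; nothing clears it between calls.
    void visit(String... tableAliases) {
        for (String alias : tableAliases) {
            aliases.add(alias);
        }
    }

    public static void main(String[] args) {
        StatefulCollector shared = new StatefulCollector();
        shared.visit("a", "b");
        shared.visit("c");
        System.out.println(shared.aliases); // [a, b, c]: results of both visits are mixed

        StatefulCollector fresh = new StatefulCollector();
        fresh.visit("c");
        System.out.println(fresh.aliases);  // [c]: one instance per traversal stays clean
    }
}
```

Under these assumptions, creating a new collector per traversal (or clearing the list at the start of each one) avoids carrying aliases over from an earlier call.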

Member Author

I discovered some issues with the previous code, so I reworked the approach and force-pushed a new version. Apologies for rebasing and squashing; I hadn't seen your comments before pushing the new version.

Contributor

No worries. Please let me know if you need another review.
