[SPARK-52934][PYTHON] Allow yielding scalar values with Arrow-optimized Python UDTF #51640

Closed
ueshin wants to merge 2 commits

ueshin commented Jul 24, 2025

What changes were proposed in this pull request?

Allows yielding scalar values with Arrow-optimized Python UDTF.

Also fixes the error class raised when the number of columns yielded differs from the return schema.
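To illustrate the column-count case, here is a minimal plain-Python sketch of that kind of check. The names `check_column_count` and `SchemaMismatchError` are hypothetical and chosen only for illustration; this is not PySpark's implementation, and the actual error class touched by this PR is not reproduced here.

```python
class SchemaMismatchError(Exception):
    """Hypothetical error for a row whose width differs from the return schema."""


def check_column_count(row, expected_columns):
    """Return the row unchanged if it has exactly one value per expected column."""
    if len(row) != len(expected_columns):
        raise SchemaMismatchError(
            f"expected {len(expected_columns)} column(s), got {len(row)}"
        )
    return row


# A one-column schema accepts a one-element row and rejects a two-element row.
ok_row = check_column_count((1,), ["a"])
```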

Why are the changes needed?

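Arrow-optimized Python UDTFs behave differently under the legacy and the new serialization:

  • the legacy serialization allows yielding scalar values
  • the new serialization does not

For example, the following UDTF yields a bare scalar instead of a tuple:

from pyspark.sql.functions import udtf

@udtf(returnType="a: int")
class TestUDTF:
    def eval(self, a: int):
        yield a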

The behavior should be consistent.
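One way to picture the consistent behavior is to treat a yielded scalar as a one-column row. The sketch below is plain Python and runs without Spark. The helper `normalize_row` is a hypothetical name, and the assumption that a scalar maps to a one-element tuple is ours; the way PySpark actually handles yielded values (for example `Row` objects or `None`) may differ.

```python
def normalize_row(value):
    """Hypothetical helper: wrap a scalar in a 1-tuple, pass tuples/lists through."""
    if isinstance(value, (tuple, list)):
        return tuple(value)
    return (value,)


class TestUDTF:
    # Same shape as the UDTF in the description, without the @udtf decorator
    # so that the sketch does not need a Spark session.
    def eval(self, a: int):
        yield a


# Each yielded value becomes a row with one column, matching "a: int".
rows = [normalize_row(v) for v in TestUDTF().eval(5)]
```

With this normalization, `yield a` and `yield (a,)` produce the same one-column row.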

Does this PR introduce any user-facing change?

Yes, the new code path allows yielding scalar values.

How was this patch tested?

Updated the related tests.

Was this patch authored or co-authored using generative AI tooling?

No.

allisonwang-db left a comment

Nice!

ueshin commented Jul 24, 2025

Thanks! merging to master.

ueshin closed this in dd2ee24 Jul 24, 2025