Optimize coalesce kernel for StringView (10-50% faster) (#7650)
# Which issue does this PR close?
- Part of #7456
# Rationale for this change
Currently the `coalesce` kernel buffers views / data until there are
enough rows and then concatenates the results together. `StringViewArray`s can
be even worse, as there is a second copy in `gc_string_view_batch`.
This is wasteful because it:
1. Buffers memory (has 2x the peak usage)
2. Copies the data twice
We can make it faster and more memory efficient by directly creating the
output array.
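As a rough illustration of the idea (not the arrow-rs implementation), the sketch below copies incoming rows straight into a pre-allocated output instead of holding on to input batches and concatenating them later. The `Coalescer` type, its fields, and the use of plain `i64` rows are hypothetical simplifications chosen for this example.

```rust
/// Hypothetical, simplified coalescer: rows are copied once, directly into
/// the output under construction, rather than buffered and concatenated.
struct Coalescer {
    target_rows: usize,
    in_progress: Vec<i64>,    // output batch currently being filled
    completed: Vec<Vec<i64>>, // finished output batches
}

impl Coalescer {
    fn new(target_rows: usize) -> Self {
        assert!(target_rows > 0, "target_rows must be non-zero");
        Self {
            target_rows,
            in_progress: Vec::with_capacity(target_rows),
            completed: Vec::new(),
        }
    }

    /// Copy the rows of `batch` into the output, emitting a completed
    /// batch every time `target_rows` rows have accumulated.
    fn push_batch(&mut self, mut batch: &[i64]) {
        while !batch.is_empty() {
            let room = self.target_rows - self.in_progress.len();
            let take = room.min(batch.len());
            self.in_progress.extend_from_slice(&batch[..take]);
            batch = &batch[take..];
            if self.in_progress.len() == self.target_rows {
                let full = std::mem::replace(
                    &mut self.in_progress,
                    Vec::with_capacity(self.target_rows),
                );
                self.completed.push(full);
            }
        }
    }

    /// Flush any partially filled batch and return all output batches.
    fn finish(mut self) -> Vec<Vec<i64>> {
        let last = std::mem::take(&mut self.in_progress);
        if !last.is_empty() {
            self.completed.push(last);
        }
        self.completed
    }
}
```

In this model each input row is copied exactly once, and the only extra memory held at any time is the single in-progress output batch.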
# What changes are included in this PR?
1. Add a specialization for incrementally building `StringViewArray`
without buffering
Note this PR does NOT (yet) add specialized filtering -- instead it
focuses on reducing the overhead of appending views by not copying
them (again!) with `gc_string_view_batch`.
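To make "appending views without a second copy" more concrete, here is a minimal model, assuming a simplified view representation rather than the real 16-byte Arrow layout. The `View` enum, the `ViewOutput` type, and its methods are hypothetical names for this sketch and are not the arrow-rs API. Short strings (at most 12 bytes in the Arrow format) live inline in the view, while longer strings have their bytes copied once into an output buffer and their view rewritten to point there.

```rust
/// Simplified stand-in for a string view (not the arrow-rs type).
#[derive(Debug, Clone, PartialEq)]
enum View {
    /// Short strings are stored inside the view itself.
    Inline(Vec<u8>),
    /// Longer strings point into a separate data buffer.
    Ref { buffer: usize, offset: usize, len: usize },
}

/// Hypothetical output under construction: views plus compacted buffers.
#[derive(Default)]
struct ViewOutput {
    views: Vec<View>,
    buffers: Vec<Vec<u8>>,
}

impl ViewOutput {
    /// Append `views` (which index into `in_buffers`) to the output.
    /// Out-of-line bytes are copied exactly once, into one new buffer
    /// sized for this call, and each view is rewritten to point at it.
    fn append(&mut self, views: &[View], in_buffers: &[Vec<u8>]) {
        let needed: usize = views
            .iter()
            .map(|v| match v {
                View::Ref { len, .. } => *len,
                View::Inline(_) => 0,
            })
            .sum();
        let mut data = Vec::with_capacity(needed);
        let out_idx = self.buffers.len();
        for v in views {
            match v {
                View::Inline(_) => self.views.push(v.clone()),
                View::Ref { buffer, offset, len } => {
                    let src = &in_buffers[*buffer][*offset..*offset + *len];
                    let new_offset = data.len();
                    data.extend_from_slice(src);
                    self.views.push(View::Ref {
                        buffer: out_idx,
                        offset: new_offset,
                        len: *len,
                    });
                }
            }
        }
        // Only register the buffer if some view actually refers to it.
        if !data.is_empty() {
            self.buffers.push(data);
        }
    }

    /// Resolve the bytes of the `i`-th string in the output.
    fn get(&self, i: usize) -> &[u8] {
        match &self.views[i] {
            View::Inline(b) => b.as_slice(),
            View::Ref { buffer, offset, len } => {
                &self.buffers[*buffer][*offset..*offset + *len]
            }
        }
    }
}
```

Because the output buffers only ever contain bytes that some output view references, this model needs no separate garbage-collection pass over the finished array.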
# Open questions:
1. There is substantial overlap / duplication with `StringViewBuilder` --
I wonder if we can / should consolidate them somehow
The differences are:
1. Block size calculation (it looks at the buffer sizes of
the incoming buffers)
2. Finishing the array allocates sufficient space for views
# Are there any user-facing changes?
The kernel is faster; there are no API changes.