refactor: add apply_window_if_present and get_window_order_by methods #1947

Merged
merged 2 commits into from
Jul 30, 2025

Conversation

chelsea-lin
Contributor

Fixes internal issue 430350912 🦕

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jul 29, 2025
@chelsea-lin
Contributor Author

This change is one of several smaller changes broken out of #1889.

@chelsea-lin chelsea-lin force-pushed the main_chelsealin_windows0 branch from 0345f55 to f5bee10 Compare July 29, 2025 18:29
@chelsea-lin chelsea-lin marked this pull request as ready for review July 29, 2025 18:29
@chelsea-lin chelsea-lin requested review from a team as code owners July 29, 2025 18:29
@chelsea-lin chelsea-lin requested a review from sycai July 29, 2025 18:29
)
self.assertEqual(
result.sql(dialect="bigquery"),
"value OVER (ORDER BY `col1` ASC NULLS LAST RANGE BETWEEN 86400000000 PRECEDING AND 43200000000 FOLLOWING)",
Contributor

For time-based rolling windows like this, we need to wrap the rolling column with UNIX_MICROS().

The result should look like this:

... OVER (ORDER BY UNIX_MICROS(`col1`) ASC ...) 

Could you update the compiler code to make it happen?
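
To make the requested shape concrete, here is a minimal sketch of how such a window clause could be assembled. It is not the bigframes compiler code: `to_micros` and `range_window_sql` are hypothetical helpers written only for illustration, and the column name and offsets are taken from the test expectation quoted above. The underlying assumption is that BigQuery's `RANGE` frame with numeric offsets needs a numeric `ORDER BY` key, which is why the timestamp column is wrapped in `UNIX_MICROS` and the offsets are expressed in microseconds.

```python
from datetime import timedelta


def to_micros(delta: timedelta) -> int:
    """Convert a timedelta to whole microseconds using integer arithmetic."""
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def range_window_sql(column: str, preceding: timedelta, following: timedelta) -> str:
    """Build an OVER clause for a time-based RANGE window.

    Hypothetical helper for illustration only; it is not part of bigframes
    or sqlglot. The ordering key is wrapped in UNIX_MICROS so that the
    numeric RANGE offsets are compared against a numeric expression.
    """
    order_key = f"UNIX_MICROS(`{column}`)"
    return (
        f"OVER (ORDER BY {order_key} ASC NULLS LAST "
        f"RANGE BETWEEN {to_micros(preceding)} PRECEDING "
        f"AND {to_micros(following)} FOLLOWING)"
    )


# One day preceding and twelve hours following, as in the test above.
clause = range_window_sql("col1", timedelta(days=1), timedelta(hours=12))
print(clause)
```

With these inputs the offsets come out as 86400000000 and 43200000000 microseconds, matching the bounds in the expected SQL string; only the `ORDER BY` key differs.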

Contributor

Here is some example code you can use to quickly verify whether the generated SQL is valid end to end:

import bigframes.pandas as bpd
import pandas as pd

df = bpd.read_gbq("bigframes-dev.sycai.test_rolling")

df.set_index('ts_col').sort_index().rolling(window=pd.Timedelta('3s')).min().sql

Contributor Author

This pull request focuses on implementing just two utility functions for windows; the full window compilation is drafted in PR #1889. With the full implementation, the end-to-end code generates the right golden SQL (see here). IIUC, the ops.UnixMicros operator will be responsible for generating the SQL UNIX_MICROS call. Let me know if you have further questions.

Contributor

Got it! Thanks for the explanation

@chelsea-lin chelsea-lin enabled auto-merge (squash) July 30, 2025 21:52
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_windows0 branch from 05e9e0c to 55bb9b9 Compare July 30, 2025 21:58
@chelsea-lin chelsea-lin merged commit b6aeca3 into main Jul 30, 2025
18 of 25 checks passed
@chelsea-lin chelsea-lin deleted the main_chelsealin_windows0 branch July 30, 2025 22:29
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
2 participants