Update datetime filter #396

Merged
merged 8 commits into from
Jun 11, 2025

Conversation

jonhealy1
Collaborator

@jonhealy1 jonhealy1 commented Jun 6, 2025

Related Issue(s):

Description:

  • Improved datetime query handling to only check start and end datetime values when datetime is None

PR Checklist:

  • Code is formatted and linted (run pre-commit run --all-files)
  • Tests pass (run make test)
  • Documentation has been updated to reflect changes, if applicable
  • Changes are added to the changelog

@jonhealy1 jonhealy1 marked this pull request as ready for review June 8, 2025 11:53
@jonhealy1 jonhealy1 requested review from jamesfisher-geo and rhysrevans3 and removed request for jamesfisher-geo June 8, 2025 17:10
Comment on lines +323 to +334
filter=[
    Q("exists", field="properties.start_datetime"),
    Q("exists", field="properties.end_datetime"),
    Q(
        "range",
        properties__start_datetime={"lte": datetime_search["lte"]},
    ),
    Q(
        "range",
        properties__end_datetime={"gte": datetime_search["gte"]},
    ),
],
Collaborator

Is this enough to cover all possible combinations of datetime overlap? This looks like it will only return items whose date range entirely encapsulates the searched-for date range.

Collaborator Author

This test case shows overlap.

# Test 5: Range matching null-datetime-item but not range-item's datetime
    feature_ids = await _search_and_get_ids(
        app_client,
        params={
            "datetime": "2020-01-01T12:00:00Z/2020-01-02T12:00:00Z",
            "collections": [collection_id],
        },
    )
    assert feature_ids == {
        "null-datetime-item",  # Overlaps: search range [12:00-01-01 to 12:00-02-01] overlaps item range [00:00-01-01 to 00:00-02-01]
    }, "Range search excluding range-item datetime failed"

Collaborator

If I'm reading the test correctly, the null-datetime-item item has a date range of 2020-01-01T00:00:00Z to 2020-01-02T00:00:00Z and the search is 2020-01-01T12:00:00Z/2020-01-02T12:00:00Z, so the searched date range is entirely within the item's date range. If the searched-for range's end date was extended by a day, to 2020-01-01T12:00:00Z/2020-01-03T12:00:00Z, I suspect the item wouldn't be returned. Is that the desired behaviour? I thought that if there was any overlap then the item should be returned.

Collaborator Author

I might be mixed up, but the item has a 24-hour date range and so does the query, so they do overlap?

Collaborator Author

The end date of the search is outside the item's end date.

Collaborator

Sorry, yes, you're 100% right! I didn't realise datetime_search["lte"] was the end date of the search and datetime_search["gte"] was the start. This is a really neat way of doing this search!

Collaborator Author

It is pretty neat
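
The overlap check discussed in this thread can be illustrated in plain Python. This is only a minimal sketch, not code from the PR: the helper names `parse_utc` and `ranges_overlap` are hypothetical, and the item range is the one quoted above for null-datetime-item. It mirrors the two range clauses: the item's start must be at or before the end of the search (`datetime_search["lte"]`), and the item's end must be at or after the start of the search (`datetime_search["gte"]`).

```python
from datetime import datetime


def parse_utc(value: str) -> datetime:
    """Parse an RFC 3339 timestamp; the trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def ranges_overlap(item_start: str, item_end: str, search_gte: str, search_lte: str) -> bool:
    """Hypothetical helper mirroring the two range clauses in the filter.

    properties.start_datetime <= datetime_search["lte"]  (end of the search)
    properties.end_datetime   >= datetime_search["gte"]  (start of the search)
    """
    return (
        parse_utc(item_start) <= parse_utc(search_lte)
        and parse_utc(item_end) >= parse_utc(search_gte)
    )


# Item range quoted in the review for null-datetime-item.
ITEM_START = "2020-01-01T00:00:00Z"
ITEM_END = "2020-01-02T00:00:00Z"

# Test 5 search window: partial overlap, so the item matches.
test5 = ranges_overlap(ITEM_START, ITEM_END, "2020-01-01T12:00:00Z", "2020-01-02T12:00:00Z")

# Search end extended by a day: still overlaps, so the item still matches.
extended = ranges_overlap(ITEM_START, ITEM_END, "2020-01-01T12:00:00Z", "2020-01-03T12:00:00Z")

# Search window entirely after the item: no overlap.
disjoint = ranges_overlap(ITEM_START, ITEM_END, "2020-01-03T00:00:00Z", "2020-01-04T00:00:00Z")

print(test5, extended, disjoint)
```

Under these assumptions the first two searches match and the third does not, which is the "any overlap" behaviour asked about above.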

),
Q(
    "bool",
    must_not=[Q("exists", field="properties.datetime")],
Collaborator

@rhysrevans3 rhysrevans3 Jun 9, 2025

Same as below.

),
Q(
    "bool",
    must_not=[Q("exists", field="properties.datetime")],
Collaborator

The current best practices recommend that you populate datetime even if you have a date range.

"The specification does allow one to set the datetime field to null, but it is strongly recommended to populate the single datetime field, as that is what many clients will search on. If it is at all possible to pick a nominal or representative datetime then that should be used."

So we should probably loosen the search (remove this line?) or update the recommended practice.

Collaborator Author

I don't know. I think the best practices are recommended but may not always be relevant for all types of data.

Collaborator Author

@rhysrevans3 Can you look at this issue #396? I am not 100% sure on what the right approach should be.

Collaborator Author

I think that for some items, defined by start and end datetimes, it may not make sense to set a datetime value just for the sake of doing so. The STAC spec itself allows null datetime values.

Collaborator

I think you're right: when start and end dates are used, both cases (a null datetime and a set datetime) need to be handled. What's the expected behaviour when the datetime is set? I would expect it to search on all date fields. If that's the case, then I think the must_not=[Q("exists", field="properties.datetime")] can be removed. I think the current query will ignore start and end dates if the datetime is set. Is that expected?
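
To make the effect of that `must_not` clause concrete, here is a rough sketch of the query expressed as a raw Elasticsearch/OpenSearch query DSL dict. The overall shape (a `should` with a datetime branch and a range branch) is an assumption pieced together from the fragments quoted in this review, and `build_datetime_filter` and `require_null_datetime` are hypothetical names, not identifiers from the PR.

```python
def build_datetime_filter(gte: str, lte: str, require_null_datetime: bool = True) -> dict:
    """Return a bool query as a plain dict (assumed shape, for illustration only)."""
    # Branch 1: items whose nominal datetime falls inside the search window.
    datetime_branch = {
        "bool": {
            "filter": [
                {"exists": {"field": "properties.datetime"}},
                {"range": {"properties.datetime": {"gte": gte, "lte": lte}}},
            ]
        }
    }
    # Branch 2: items whose [start_datetime, end_datetime] overlaps the window.
    range_branch = {
        "bool": {
            "filter": [
                {"exists": {"field": "properties.start_datetime"}},
                {"exists": {"field": "properties.end_datetime"}},
                {"range": {"properties.start_datetime": {"lte": lte}}},
                {"range": {"properties.end_datetime": {"gte": gte}}},
            ]
        }
    }
    if require_null_datetime:
        # The clause under discussion: only use start/end when datetime is null.
        range_branch["bool"]["must_not"] = [{"exists": {"field": "properties.datetime"}}]
    return {"bool": {"should": [datetime_branch, range_branch], "minimum_should_match": 1}}


strict = build_datetime_filter("2020-01-01T12:00:00Z", "2020-01-02T12:00:00Z")
loose = build_datetime_filter(
    "2020-01-01T12:00:00Z", "2020-01-02T12:00:00Z", require_null_datetime=False
)
```

With `require_null_datetime=True`, an item that sets datetime can only match through the first branch, so its start and end dates are ignored. Dropping the clause lets such an item match through either branch.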

Collaborator Author

Are you thinking of documenting this in the readme or maybe something else?

Collaborator Author

We could have a few options: the default (this one, maybe), start and end datetimes included, or only datetime.
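
One way to picture those three options is as a mode switch over a pure-Python matching function. Everything here is hypothetical: the `DatetimeMode` enum, its member names, and `item_matches` do not exist in the project, and the sample item is invented to show where the modes differ.

```python
from datetime import datetime
from enum import Enum


class DatetimeMode(Enum):
    """Hypothetical names for the three options floated above."""
    NULL_DATETIME_RANGE = "null_datetime_range"  # default: range used only when datetime is null
    ALWAYS_RANGE = "always_range"                # start/end datetimes always included
    DATETIME_ONLY = "datetime_only"              # only the nominal datetime is searched


def _parse(value: str) -> datetime:
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def item_matches(props: dict, gte: str, lte: str, mode: DatetimeMode) -> bool:
    """Model how an item's properties would match a search window in each mode."""
    start, end = _parse(gte), _parse(lte)
    nominal = props.get("datetime")
    item_start = props.get("start_datetime")
    item_end = props.get("end_datetime")

    by_datetime = nominal is not None and start <= _parse(nominal) <= end
    has_range = item_start is not None and item_end is not None
    by_range = has_range and _parse(item_start) <= end and _parse(item_end) >= start

    if mode is DatetimeMode.DATETIME_ONLY:
        return by_datetime
    if mode is DatetimeMode.ALWAYS_RANGE:
        return by_datetime or by_range
    return by_datetime or (nominal is None and by_range)


# Invented item: nominal datetime set to the start of its range.
item = {
    "datetime": "2020-01-01T00:00:00Z",
    "start_datetime": "2020-01-01T00:00:00Z",
    "end_datetime": "2020-01-02T00:00:00Z",
}
window = ("2020-01-01T12:00:00Z", "2020-01-02T12:00:00Z")
results = {mode.value: item_matches(item, *window, mode) for mode in DatetimeMode}
print(results)
```

For this item the window overlaps the start/end range but misses the nominal datetime, so only the `ALWAYS_RANGE` mode returns it. An item with a null datetime and the same range would be returned by both the default and `ALWAYS_RANGE` modes.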

Collaborator

I was more thinking of a way to tell the user doing the search which behaviour to expect. But I can't think of a good place to put it, other than the queryables endpoint maybe? But including it in the readme is probably enough.

Collaborator Author

I'll open an issue for this so we can work on it later. We can definitely use logging too.

@jonhealy1 jonhealy1 requested a review from rhysrevans3 June 10, 2025 04:34
@jonhealy1 jonhealy1 merged commit ae81078 into main Jun 11, 2025
15 checks passed
@jonhealy1 jonhealy1 deleted the update-datetime-filter branch June 11, 2025 07:53
2 participants