Expose connection errors in HTTPFileSystem._exists #1849

Open: btbest wants to merge 1 commit into master from expose-connection-errors

Conversation

@btbest btbest commented May 23, 2025

I can't see why blanket-catching all types of network and client-side issues would be a good thing to do here; the git history doesn't explain it, and test_http passes for me locally with this change. Encountering an error during session.get doesn't imply the file doesn't exist, so it generally seems wrong to return False.

I'm running into this issue because I want to implement retrying if the connection drops or times out. With the current implementation, I can't distinguish whether the server actually doesn't have the file vs. the user's mobile data reception being terrible.
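The retry use case above could look roughly like the sketch below. It is an illustration only, not fsspec code: `exists_with_retry` and its `check` argument are hypothetical names, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever the HTTP client actually raises. It only works if the existence check lets connection errors propagate instead of turning them into False.

```python
import time
from typing import Callable


def exists_with_retry(
    check: Callable[[], bool], attempts: int = 3, delay: float = 0.0
) -> bool:
    """Hypothetical helper: retry `check` only when the connection fails.

    `check` is assumed to return True/False when the server gave a real
    answer, and to raise ConnectionError/TimeoutError otherwise.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # A real answer (True or False) ends the loop immediately.
            return check()
        except (ConnectionError, TimeoutError) as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed at the connection level: surface that fact
    # rather than claiming the file does not exist.
    raise last_error
```

If `check` swallowed connection errors and returned False, the loop would exit on the first dropped connection with a wrong answer, which is the problem described above.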

@btbest
Author

btbest commented May 23, 2025

Context: ilastik/ilastik#3019 (the last commit has the workaround without upstreaming this fix)

@martindurant
Member

Hm, in the case that the URL points to a non-existent server, exists(url) would now cause an exception rather than returning False.
It is possible that we could differentiate between "does not exist" and "other, maybe retriable error". But HTTP is tricky and generally non-standard, so it would need to consider cases like Gateway Error and Not Authenticated. In general, the right choice will depend on the remote you expect to interact with.

@btbest
Author

btbest commented May 26, 2025

> In general, the right choice will depend on the remote you expect to interact with.

I'd like to make the right choice, but the current implementation makes that impossible, because it doesn't allow the consumer to distinguish errors like non-existent server, Gateway Error and Not Authenticated from "checked successfully, and the file isn't there".

Even return r.status < 400 is semantically questionable, because 5xx statuses, and 4xx statuses other than 404, don't mean that the answer to """Is there a file at the given path""" is "No" (they mean "you can't know" or "you aren't allowed to know").
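The three-way distinction argued for here could be sketched as below. This is only an illustration of the reasoning, not the PR's implementation: `exists_from_status` and `IndeterminateExistence` are hypothetical names, and treating 404 as the only definite "No" follows the comment above rather than any fsspec behaviour.

```python
class IndeterminateExistence(Exception):
    """Hypothetical error: the server did not say whether the file exists."""


def exists_from_status(status: int) -> bool:
    """Map an HTTP status code to an existence answer, or raise.

    Assumption for this sketch: anything below 400 means "exists",
    404 means "does not exist", and every other 4xx/5xx status means
    the question could not be answered.
    """
    if status < 400:
        return True
    if status == 404:
        return False
    # e.g. 401/403 (not allowed to know) or 502/503 (cannot know right now)
    raise IndeterminateExistence(f"cannot determine existence from HTTP {status}")
```

A caller can then decide per remote whether an `IndeterminateExistence` should be retried, reported, or treated as False.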

@btbest btbest force-pushed the expose-connection-errors branch 2 times, most recently from a529636 to 65f8d2f Compare May 26, 2025 09:48
@btbest btbest force-pushed the expose-connection-errors branch from 65f8d2f to 04867d4 Compare May 26, 2025 09:51
@btbest
Author

btbest commented May 26, 2025

I get that who knows what people have implemented relying on this, so I've put it behind a kwarg to maintain the default behaviour (sorry about the multiple pushes; for a moment I thought "strict" would mask a kwarg downstream).
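An opt-in flag of this kind might look like the sketch below. The names are assumptions for illustration: `raise_on_error` is a made-up keyword and not the one used in the PR, `fetch_status` is a hypothetical stand-in for the real HTTP request, and `OSError` stands in for the client library's error types.

```python
from typing import Callable


def exists(
    fetch_status: Callable[[str], int], url: str, *, raise_on_error: bool = False
) -> bool:
    """Hypothetical existence check with opt-in error propagation.

    `fetch_status` is assumed to return the HTTP status code for `url`
    or raise an OSError subclass when the request itself fails.
    """
    try:
        status = fetch_status(url)
    except OSError:
        if raise_on_error:
            # New, opt-in behaviour: let the caller see the failure.
            raise
        # Default behaviour kept for existing callers: report "not there".
        return False
    return status < 400
```

Because the flag defaults to False, existing callers keep getting False on connection failures, while callers that pass `raise_on_error=True` can tell a dropped connection apart from a missing file.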
