@tjvishnu tjvishnu commented Oct 4, 2025

- Removed references to ADLS Gen1, which is retired.
- Added information on connecting to Blobs as well as ADLS Gen2.

Updated project description to reflect support for ADLS Gen2.
Collaborator

@kyleknap kyleknap left a comment

Looks good! Just had a few small comments.

README.md Outdated
- `location_mode`: valid values are "primary" or "secondary" and apply to RA-GRS accounts

For more argument details see all arguments for [`AzureBlobFileSystem` here](https://github.com/fsspec/adlfs/blob/f15c37a43afd87a04f01b61cd90294dd57181e1d/adlfs/spec.py#L328) and [`AzureDatalakeFileSystem` here](https://github.com/fsspec/adlfs/blob/f15c37a43afd87a04f01b61cd90294dd57181e1d/adlfs/spec.py#L69).
For more argument details see all arguments for [`AzureBlobFileSystem` here](https://github.com/fsspec/adlfs/blob/f15c37a43afd87a04f01b61cd90294dd57181e1d/adlfs/spec.py#L328)
Collaborator

Instead of linking to the code, let's link to the rendered HTML docs: https://fsspec.github.io/adlfs/api/#adlfs.AzureBlobFileSystem. There is a hardcoded commit SHA here, so if arguments are added in the future, this link will drift away from what is in main.
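
As a rough illustration of the drift concern, a small check like the one below could flag commit-pinned GitHub links in a README. This is only a sketch: the regular expression and the `find_pinned_links` helper are hypothetical and are not part of adlfs or its tooling.

```python
import re

# Matches GitHub "blob" URLs whose ref is a full 40-character commit SHA.
PINNED_BLOB_LINK = re.compile(
    r"https://github\.com/[\w.-]+/[\w.-]+/blob/(?P<sha>[0-9a-f]{40})/[^\s)]+"
)


def find_pinned_links(markdown_text):
    """Return (sha, url) pairs for every commit-pinned GitHub blob link."""
    return [
        (match.group("sha"), match.group(0))
        for match in PINNED_BLOB_LINK.finditer(markdown_text)
    ]


# The README line under review, pinned to a specific commit.
readme_line = (
    "For more argument details see all arguments for [`AzureBlobFileSystem` here]"
    "(https://github.com/fsspec/adlfs/blob/"
    "f15c37a43afd87a04f01b61cd90294dd57181e1d/adlfs/spec.py#L328)"
)
# The suggested replacement, which points at the rendered docs instead.
docs_line = "See https://fsspec.github.io/adlfs/api/#adlfs.AzureBlobFileSystem"

pinned = find_pinned_links(readme_line)
unpinned = find_pinned_links(docs_line)
```

The pinned README line yields one match, while the docs link yields none, which is the property the suggested change relies on.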

README.md Outdated
By default, write operations create BlockBlobs in Azure, which, once written can not be appended. It is possible to create an AppendBlob using `mode="ab"` when creating and operating on blobs. Currently, AppendBlobs are not available if hierarchical namespaces are enabled.

### Older versions
ADLS Gen1 filesystem has officially been [retired](https://learn.microsoft.com/en-us/lifecycle/products/azure-data-lake-storage-gen1)). Hence the older versions of this package, which was designed to connect to ADLS Gen1 is obsolete.
Collaborator

A couple of suggestions on this section:

  • It looks like there is an extra ) that we should remove at the end of the link.
  • Maybe we can remove this second sentence and just add on: "and support in adlfs is obsolete"? Mainly, the AzureBlobFileSystem has been around for several years so technically these older versions should work just fine if customers were using az or abfs.

Collaborator

kyleknap commented Oct 6, 2025

Sort of related, but something we should do outside of this PR... Right now there is a deprecation warning that ADLS Gen 1 support will be moved to an optional dependency. It would probably make sense to update this warning to say that support will be removed in a future version of adlfs, given that ADLS Gen 1 has now been retired for more than a year.
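
For illustration only, an updated warning might look something like the sketch below. The message wording, the `warn_gen1_removal` helper, and the choice of `FutureWarning` are all assumptions here; the actual warning in adlfs may be worded and raised differently.

```python
import warnings

# Hypothetical wording; the real message in adlfs may differ.
GEN1_REMOVAL_MESSAGE = (
    "ADLS Gen1 has been retired; support for it will be removed "
    "in a future version of adlfs."
)


def warn_gen1_removal():
    """Emit the removal warning, attributed to the caller's line."""
    warnings.warn(GEN1_REMOVAL_MESSAGE, FutureWarning, stacklevel=2)


# Capture the warning so its category and text can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_gen1_removal()
```

`FutureWarning` is used in this sketch because Python shows it to end users by default, whereas `DeprecationWarning` is hidden in most contexts; either could be reasonable.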

Collaborator

@kyleknap kyleknap left a comment

Looks good! @martindurant this should be good for a review.


The `adl://` and `abfs://` protocols are included in fsspec's known_implementations registry
The `az://` and `abfs://` protocols are included in fsspec's known_implementations registry
in fsspec > 0.6.1, otherwise users must explicitly inform fsspec about the supported adlfs protocols.
Member

Could remove this mention of ancient fsspec; I doubt things here would still work that far back.

To use the Gen2 filesystem you can use the protocol `abfs` or `az`:
To connect to Blobs or Azure Data Lake Storage (ADLS) Gen2 filesystem you can use the protocol `abfs` or `az`:
Member

abfs: also for adls2?

Collaborator

Yep. abfs:// supports both non-hierarchical blob and ADLS Gen 2 accounts.

Right now, adlfs only uses the blob endpoint no matter the account type, which works correctly for both types of accounts since ADLS Gen 2 is built on Blob.

However, this also means that adlfs is missing out on potential optimizations, especially for renames and recursive deletes, that it could get by detecting whether a storage account is ADLS Gen 2 enabled and using the ADLS Gen 2 endpoint instead. Other Azure Storage filesystem tools, such as the ABFS driver and BlobFuse, follow this pattern: they treat ADLS Gen 2 as a feature of Azure Blob and pick the endpoint and operation based on the storage account type and the operation being performed. That reduces the cognitive overhead of thinking through account types and which API set to call (e.g., the blob endpoint or the ADLS Gen 2 endpoint). Moving adlfs in that direction would bring those benefits and make it more consistent with other Azure Storage software.
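
A minimal sketch of that idea, assuming the public-cloud endpoint suffixes, is shown below. The `pick_endpoint` function, the `hns_enabled` flag, and the operation names are hypothetical; adlfs does not do this today, and how the account type would be detected is left out.

```python
# Public-cloud endpoint suffixes for the Blob and ADLS Gen 2 (DFS) APIs.
BLOB_SUFFIX = "blob.core.windows.net"
DFS_SUFFIX = "dfs.core.windows.net"

# Operations named above as candidates for the ADLS Gen 2 endpoint
# (hypothetical operation names).
HNS_OPTIMIZED_OPERATIONS = {"rename", "recursive_delete"}


def pick_endpoint(account_name, hns_enabled, operation):
    """Return the endpoint URL a filesystem might use for one operation."""
    # Only hierarchical-namespace accounts can benefit from the DFS endpoint.
    use_dfs = hns_enabled and operation in HNS_OPTIMIZED_OPERATIONS
    suffix = DFS_SUFFIX if use_dfs else BLOB_SUFFIX
    return f"https://{account_name}.{suffix}"


rename_on_gen2 = pick_endpoint("myaccount", True, "rename")
rename_on_blob = pick_endpoint("myaccount", False, "rename")
read_on_gen2 = pick_endpoint("myaccount", True, "read")
```

With this shape, callers keep using a single protocol such as `abfs://`, and the endpoint choice stays an internal detail.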

Azure Datalake Gen1 and Azure Datalake Gen2, that facilitate
interactions between both Azure Datalake implementations and Dask. This is done leveraging the
[intake/filesystem_spec](https://github.com/intake/filesystem_spec/tree/master/fsspec) base class and Azure Python SDKs.
The package includes pythonic filesystem implementations for both [Azure Blobs](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-overview) and [Azure Datalake Gen2 (ADLS)](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction), that facilitate interactions between these implementations and Dask. This is done leveraging the [intake/filesystem_spec](https://github.com/intake/filesystem_spec/tree/master/fsspec) base class and Azure Python SDKs.