Add HGNetV2 to KerasHub #2293
Conversation
@mattdangerw @divyashreepathihalli We should be able to wrap up this PR soon. I've made considerable progress on D-FINE that I'm eager to push to GitHub, but the change volume is large, and I want to avoid making this PR unmanageable. With that in mind, I've completed the functionality tests and the numerics matching in the Colab Notebook linked in the PR description, and I've also written a standalone example from the user's perspective, as you requested last time, to wrap up this model.
Thanks! Left some thoughts on the exposed API.
Keep in mind that a key goal with these models is being able to hot swap one classifier model for another in a high-level task without changing the high-level fine-tuning code. I think there are a few spots where we can do that better here (in the inline comments).
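As a minimal sketch of that hot-swap goal from the user's side (the preset name is a placeholder, not a released preset, and the data is random stand-in input):

import numpy as np
import keras_hub

# Placeholder data standing in for a real labeled dataset.
images = np.random.uniform(0, 255, size=(8, 224, 224, 3))
labels = np.random.randint(0, 10, size=(8,))

# The high-level fine-tuning code stays identical no matter which
# classifier family backs the preset; only the preset string changes.
classifier = keras_hub.models.ImageClassifier.from_preset(
    "hgnetv2_b4_imagenet",  # hypothetical preset name
    num_classes=10,
)
classifier.fit(x=images, y=labels, batch_size=4)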
class HGNetV2Backbone(Backbone):
    """This class represents a Keras Backbone of the HGNetV2 model.

    This class implements an HGNetV2 backbone architecture.
Can we add a little more detail here?
This class implements an HGNetV2 (High Performance GPU Net) backbone architecture, a convnet architecture for high performance image classification.
Or something like that, not actually sure what the best brief description would be
        stage_mid_channels,
        stage_out_channels,
        stage_num_blocks,
        stage_numb_of_layers,
numb_of_layers -> num_layers
        use_learnable_affine_block: bool, whether to use learnable affine
            transformations.
        num_channels: int, the number of channels in the input image.
        stage_in_channels: list of ints, the input channels for each stage.
I wonder if we could do something more like this? https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/xception/xception_backbone.py#L25C9-L28 A list of lists here, to keep the number of args to this down? What do you think?
I also think we use the term filters more often than channels for args like this.
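For illustration, one hypothetical shape such a consolidated argument could take (the name and the numbers are made up, loosely mirroring Xception's stackwise-style args):

# Hypothetical: a single list-of-lists argument replacing the parallel
# stage_in_channels / stage_mid_channels / stage_out_channels /
# stage_num_blocks / stage_numb_of_layers lists. Each inner list
# describes one stage; the values here are illustrative only.
stackwise_stage_filters = [
    # [in_filters, mid_filters, out_filters, num_blocks, num_layers]
    [48, 48, 128, 1, 6],
    [128, 96, 512, 1, 6],
    [512, 192, 1024, 3, 6],
    [1024, 384, 2048, 1, 6],
]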
        hidden_sizes: list of ints, the sizes of the hidden layers.
        stem_channels: list of ints, the channels for the stem part.
        hidden_act: str, the activation function for hidden layers.
        use_learnable_affine_block: bool, whether to use learnable affine
what is used if this is false?
        stage_num_blocks: list of ints, the number of blocks in each stage.
        stage_numb_of_layers: list of ints, the number of layers in each block.
        stage_downsample: list of bools, whether to downsample in each stage.
        stage_light_block: list of bools, whether to use light blocks in each
what's a light block?
@keras_hub_export("keras_hub.models.HGNetV2ImageClassifier")
class HGNetV2ImageClassifier(ImageClassifier):
Add a docstring, this will be public.
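As a rough sketch of the shape such a docstring could take, following the usual KerasHub task style (wording and the listed args are illustrative, not the PR's final text):

@keras_hub_export("keras_hub.models.HGNetV2ImageClassifier")
class HGNetV2ImageClassifier(ImageClassifier):
    """HGNetV2 image classification task.

    `HGNetV2ImageClassifier` wraps a `HGNetV2Backbone` and a
    classification head to map images to a fixed set of labels.

    Args:
        backbone: A `HGNetV2Backbone` instance.
        num_classes: int. The number of classes to predict.
        preprocessor: `None` or a matching preprocessor layer. If `None`,
            inputs should be preprocessed before calling the model.
    """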
        activation=None,
        dropout=0.0,
        head_dtype=None,
        use_learnable_affine_block_head=False,
Can we just follow the backbone for this? Do we ever want to disagree with the backbone here in practical usage? We could always add later.
        backbone,
        preprocessor,
        num_classes,
        head_filters,
Probably we should add a default here. Can we do head_filters=None and read a value from the backbone that's a good default? All our other classifiers are instantiable from a preset with a random head by just passing num_classes. If we require another arg, then you couldn't sub this in for other classifier models easily.
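A minimal sketch of that defaulting pattern (falling back to the backbone's last hidden size is an assumption here, not necessarily the right default):

class HGNetV2ImageClassifier(ImageClassifier):
    def __init__(
        self,
        backbone,
        num_classes,
        preprocessor=None,
        head_filters=None,
        **kwargs,
    ):
        # Assumption: the last entry of the backbone's hidden_sizes is a
        # reasonable head width, so presets load with just num_classes
        # like other KerasHub classifiers.
        if head_filters is None:
            head_filters = backbone.hidden_sizes[-1]
        ...  # rest of the existing constructor unchanged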
@keras_hub_export("keras_hub.layers.HGNetV2ImageConverter")
class HGNetV2ImageConverter(PreprocessingLayer):
Let's try to base this directly off the main image converter class. It's ok if we don't do the exact same resizing and cropping as upstream; that is something end users will be able to configure anyway (you could always chain the image converter with other Keras image preprocessing layers). mean and std will be covered by offset and scale in ImageConverter, you just need to convert the scalars. The ResizeThenCrop we don't support, but we can discuss separately whether that should be part of the image converter, or if we just allow users to do that with ResizeThenCrop directly.
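For reference, converting the usual divide-by-255-then-normalize preprocessing into ImageConverter's scale/offset form looks like this (assuming the standard ImageNet constants upstream; the image size is illustrative):

import keras_hub

# (x / 255 - mean) / std  ==  x * scale + offset, per channel, with
# scale = 1 / (255 * std) and offset = -mean / std.
mean = [0.485, 0.456, 0.406]  # standard ImageNet mean
std = [0.229, 0.224, 0.225]  # standard ImageNet std

scale = [1.0 / (255.0 * s) for s in std]
offset = [-m / s for m, s in zip(mean, std)]

converter = keras_hub.layers.ImageConverter(
    image_size=(224, 224),
    scale=scale,
    offset=offset,
)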
# Metadata for loading pretrained model weights.
backbone_presets = {
    "hgnetv2_b4.ssld_stage2_ft_in1k": {
do we have dots in other preset names? i've never seen it. self consistency is more important than consistency with our source for these models, so probably go dot to underscore.
also, please take a look at other preset names in keras-hub and try to be as consistent as possible. i think we just use imagenet instead of in1k, for example?
Description of the Change
This PR adds HGNetV2, an end-to-end image classification model, to KerasHub as a building block toward supporting D-FINE. Several of D-FINE's presets depend on derivatives of HGNetV2Backbone, and this model puts the required infrastructure in place to serve them. Its addition unlocks the follow-on integration work for D-FINE on KerasHub. Concurrently, I am exploring the integration paradigm for D-FINE with the HGNetV2 layers.
Closes the first half of #2271
Reference
Please read Page 15/18, Section A.1.1 of the D-FINE paper, and the HF config files to verify this point. The "backbone": null argument in the HuggingFace configuration translates to an HGNetV2 backbone.
Colab Notebook
Usage and Numerics Matching Colab
Checklist