Using Knowledge Graphs to harvest datasets for efficient CLIP model training
We use knowledge graphs and web image search to build a diverse dataset of 33M images paired with 46M texts. We show that this dataset can be used to train a generic CLIP model in a short amount of time.
Using a 10M-image subset focused on living organisms, we train domain expert models that excel at fine-grained classification of animals, plants, and fungi.
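To give a flavor of the harvesting idea, here is a minimal, simplified sketch (not the pipeline from our paper, whose details are in the preprint): entities from a knowledge graph supply search queries, and web image search returns candidate image-text pairs. The sketch assumes Wikidata as the knowledge graph, and `search_images` is a hypothetical stand-in for an actual image-search backend.

```python
# Simplified sketch of the harvesting idea (NOT the paper's actual pipeline):
# pull entity labels from a knowledge graph (here: Wikidata via its public
# SPARQL endpoint), then use each label as a web image-search query.
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

def entity_labels(limit: int = 100) -> list[str]:
    """Fetch English labels of some taxa from Wikidata."""
    query = f"""
    SELECT ?label WHERE {{
      ?taxon wdt:P31 wd:Q16521 .    # instance of: taxon
      ?taxon rdfs:label ?label .
      FILTER(LANG(?label) = "en")
    }} LIMIT {limit}
    """
    resp = requests.get(
        SPARQL_ENDPOINT,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "entity-harvest-sketch/0.1"},
    )
    resp.raise_for_status()
    return [b["label"]["value"] for b in resp.json()["results"]["bindings"]]

def search_images(query: str) -> list[dict]:
    """Hypothetical placeholder: plug in an image-search backend that
    returns records like {"url": ..., "alt_text": ...}."""
    raise NotImplementedError

# Pair each entity with the texts of its image-search results
for label in entity_labels(limit=10):
    for hit in search_images(label):
        print(label, hit["url"], hit["alt_text"])
```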
Stay tuned for the dataset release. For now, we have released our preprint and trained CLIP models.
Our CLIP models are available on 🤗 Hugging Face.
Models are named as follows:
- Architecture
- Training data: EntityNet-33M (all images) or LivingThings-10M (images of living organisms only)
- Pretrained from scratch or finetuned
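
For example, `CLIP-ViT-B-16-EntityNet-33M` (used in the snippet below) denotes a ViT-B/16 trained on the full EntityNet-33M dataset.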
Models can be used with open_clip as follows:
```python
import torch
from PIL import Image
import open_clip

# Load the pretrained model and its preprocessing transforms from the Hugging Face Hub
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:lmb-freiburg/CLIP-ViT-B-16-EntityNet-33M"
)
model.eval()
tokenizer = open_clip.get_tokenizer("ViT-B-16", context_length=32)

# Prepare one image and a set of candidate class prompts
image = preprocess(Image.open("assets/rabbit.jpg")).unsqueeze(0)
text = tokenizer(["a dog", "a cat", "a rabbit"])

with torch.no_grad(), torch.autocast("cuda"):
    # Embed the image and the texts, then L2-normalize the features
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # logit_scale is stored in log space, so exponentiate it before scaling
    logits = model.logit_scale.exp() * image_features @ text_features.T

pred_class = logits.argmax(-1).item()
print(pred_class)  # prints: 2 (i.e., "a rabbit")
```
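
To turn the scaled logits into class probabilities (the usual CLIP zero-shot recipe), apply a softmax over the text dimension:

```python
# Convert the scaled logits into a probability distribution over the prompts
probs = logits.softmax(dim=-1)
print(probs)  # highest probability on "a rabbit"
```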
News:
- 06 May 2025: Models released on Hugging Face
- 05 May 2025: Preprint released on arXiv
- 23 April 2025: Talk at the Stuttgart AI, Machine Learning and Computer Vision Meetup
- 20 March 2025: Poster at the 2025 ELLIS Winter School on Foundation Models
Roadmap:
- [x] Publish preprint
- [x] Upload CLIP models to Hugging Face
- [ ] Release the training dataset
- [ ] Release the model training code
- [ ] Release the evaluation code