clean up repo to reduce size #2479

Open
aaravgarg opened this issue May 30, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@aaravgarg (Collaborator)

There are too many large files (e.g., photos and videos) in the repo.

These need to be stored in a better, more optimized way, because right now the repo takes forever to clone.

aaravgarg added the bug label on May 30, 2025
@jagdish-15 commented Jun 1, 2025

Hello @aaravgarg,

I've worked out how to significantly reduce the repository size using Git LFS, with the following file patterns tracked in .gitattributes:

*.jpeg filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.mp4 filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.mov filter=lfs diff=lfs merge=lfs -text
*.dill filter=lfs diff=lfs merge=lfs -text
*.gif filter=lfs diff=lfs merge=lfs -text
*.elf filter=lfs diff=lfs merge=lfs -text

I have rewritten the entire Git history to apply this properly; without a history rewrite, there is no effective way to shrink the repository.
As a result, the clone size has dropped from 607 MB to 132 MB.
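
For anyone curious how a rewrite like this is done: git-lfs ships migration tooling for it. A minimal sketch, assuming git-lfs is installed and the commands are run in a fresh clone (this shows the general approach, not necessarily the exact commands I ran):

# rewrite every branch and tag so matching files become LFS pointers
git lfs migrate import --everything --include="*.jpeg,*.jpg,*.png,*.mp4,*.bin,*.mov,*.dill,*.gif,*.elf"

# drop the now-unreferenced blobs from the local clone
git reflog expire --expire=now --all
git gc --prune=now --aggressive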

Here is proof of the cloning process output:

Cloning into 'omi-lfs-clean'...
remote: Enumerating objects: 66576, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 66576 (delta 0), reused 1 (delta 0), pack-reused 66570 (from 1)
Receiving objects: 100% (66576/66576), 132.80 MiB | 6.39 MiB/s, done.
Resolving deltas: 100% (44659/44659), done.
Updating files: 100% (3464/3464), done.
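
Anyone who wants to verify the size of their own clone can use plain Git for it (a standard command, nothing specific to this cleanup):

git count-objects -vH

It prints the on-disk pack size in human-readable units.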

I’ve also updated the documentation (contribution.mdx) to explain how to clone the repo without downloading all LFS files by default, by adding the following at the end of the file:

Cloning the Repository Without Downloading All LFS Files

Because some files tracked by Git LFS can be very large, cloning the full repo may take a long time or consume lots of bandwidth.

To avoid downloading all large files immediately, you can clone the repository with only LFS pointer files (placeholders) using:

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/BasedHardware/Omi.git

Then, when you need the actual large files, run:

git lfs pull

To download only specific large files or directories, use:

git lfs pull -I "path/to/large/file1" -I "path/to/large/dir/*"

Replace "path/to/large/file1" and "path/to/large/dir/*" with the actual paths you want.

This keeps your initial clone lightweight and lets you fetch only the LFS files you need.
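
Related tip, outside the doc change itself: if someone wants a pattern like that to stick for every future fetch, git-lfs also reads a per-repo config key, e.g.:

git config lfs.fetchinclude "path/to/large/dir/*"

With that set, subsequent git lfs pull runs only fetch objects matching the pattern until the setting is removed.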


All these changes are hosted in a non-fork repository because GitHub does not allow pushing rewritten history and new LFS objects to a public fork.

Here is the cleaned and restructured repository:
🔗 https://github.com/jagdish-15/omi-lfs-clean

Unfortunately, GitHub doesn’t allow PRs from non-fork repos!

Also note: accepting this history rewrite means all existing forks will need to be re-synced, and contributors will have to (example commands below):

  • Re-clone the repo, or
  • Rebase and clean their local clones, then
  • Force-push their changes (if needed) after cleanup
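
As a rough sketch, re-syncing a fork after the rewrite could look like this (assuming remotes named origin and upstream and a main default branch; adjust to the real setup):

git fetch upstream
git checkout main
git reset --hard upstream/main
git push --force origin main

Feature branches still in flight would then need to be rebased onto the new main before being force-pushed.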

Let me know how we can go ahead with this! I'm happy to coordinate or adjust as needed.
