Local vad service #2502

Open
wants to merge 2 commits into
base: main
from

Conversation

thestumonkey
Contributor

In a local environment, speech detection was not working, so the "teach omi your voice" prompt was shown all the time.

  • The Silero model was not returning segments because it needs 16000 Hz audio, not 8000 Hz
  • The code was hardcoded to look for a separate VAD service, which can be configured in backend/modal
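
To illustrate the sample-rate point above, here is a minimal sketch of bringing 8000 Hz audio up to 16000 Hz before it reaches the model. The helper name `resample_linear` is hypothetical and is not taken from this PR. It uses plain linear interpolation for clarity, and a real pipeline would likely use a proper resampler with filtering instead.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal by linear interpolation (hypothetical helper)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples or src_rate == dst_rate:
        return list(samples)

    # Number of output samples covering the same duration as the input.
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# Assumption: the incoming audio is 8000 Hz and the model expects 16000 Hz.
audio_16k = resample_linear([0.0, 1.0, 2.0], 8000, 16000)
```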

This PR:

  • Fixes Silero model use
  • Adds logic to fall back to Silero if HOSTED_VAD_API_URL is not set
  • Simplifies the Docker container to use the NVIDIA CUDA image directly instead of building from scratch
  • Changes the default VAD port so it doesn't conflict with the server
  • Updates the code to call the correct /v1/vad endpoint instead of the root
  • Adds logic to accept the Google service account as either a file or a JSON string
  • Updates the docs
  • Updates a couple of libraries that were failing on the iOS app build
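
Two of the items above, the Silero fallback and the file-or-JSON service account, can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function names `select_vad_backend` and `load_service_account` are hypothetical, and only `HOSTED_VAD_API_URL` and the `/v1/vad` path come from the description.

```python
import json
import os


def select_vad_backend(env=None):
    """Pick the hosted VAD service if configured, else fall back to Silero.

    Hypothetical helper; returns a (backend_name, endpoint_url) tuple.
    """
    env = os.environ if env is None else env
    url = (env.get("HOSTED_VAD_API_URL") or "").strip()
    if url:
        # Call the /v1/vad endpoint rather than the service root.
        return ("hosted", url.rstrip("/") + "/v1/vad")
    return ("silero", None)


def load_service_account(value):
    """Accept either a path to a JSON key file or the JSON text itself.

    Hypothetical helper; assumes inline JSON always starts with '{'.
    """
    text = value.strip()
    if text.startswith("{"):
        return json.loads(text)
    with open(text, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

With no `HOSTED_VAD_API_URL` in the environment, `select_vad_backend({})` returns `("silero", None)`, which matches the local-development case this PR targets.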

Changed port so it doesn't run on same one as server

changed to using a docker image to build CUDA

updated vad to have correct modal endpoint and fix the silero fallback if not set

added flag to disable translation as often not needed in dev

updated docs for the vad service

vercel bot commented Jun 3, 2025

@thestumonkey is attempting to deploy a commit to the kodjima33's projects Team on Vercel.

A member of the Team first needs to authorize it.
