Replace mimic3 with piper #686

Open
wants to merge 5 commits into
base: main

val-ba
Contributor

@val-ba val-ba commented Apr 14, 2025

Summary

fixes #530

Proposed changes

Related issues

Checklist

  • Run colcon build
  • Write documentation
  • Test on your machine
  • Test on the robot
  • Create issues for future work
  • Triage this PR and label it

@val-ba val-ba requested review from texhnolyze and Flova April 14, 2025 13:09
@val-ba val-ba self-assigned this Apr 14, 2025
@val-ba val-ba linked an issue Apr 14, 2025 that may be closed by this pull request
@github-project-automation github-project-automation bot moved this to 🆕 New in Software Apr 14, 2025
@val-ba val-ba moved this from 🆕 New to 👀 In review in Software Apr 14, 2025
@jaagut jaagut marked this pull request as ready for review May 4, 2025 12:53
Member

@Flova Flova left a comment

A few things:

  • How is the model downloaded?
  • I think we should implement caching so the TTS output is not computed every time (we had this in mimic3)
  • With mimic we had a separate server that was started by systemd (configured in Ansible); that way we could e.g. announce the IP at startup

@github-project-automation github-project-automation bot moved this from 👀 In review to 🏗 In progress in Software May 8, 2025
@@ -229,3 +229,6 @@ doku/*
**/workspace_status.json

.pytest_cache/

# tts model
*/bitbots_tts/model/*
Contributor

I would suggest **/bitbots_tts/model/ to ignore the whole directory at any depth (in case we move it at some point).

@@ -16,6 +17,12 @@

from bitbots_msgs.msg import Audio

# Load the Piper voice
bb_tts_dir = Path(__file__).parent.parent / "model" # TODO: check how to get nice relative paths
Contributor

Depending on where you want to put the model folder, you should probably use get_package_prefix or get_package_share_path, which would give you e.g. ~/colcon_ws/install/share/bitbots_tts.

The current solution should also work, but I would then use an absolute path to ensure that it still works when building without symlinks.
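
A minimal sketch of the lookup suggested above. It assumes the model folder gets installed into the package's share directory, which this PR does not confirm. `get_package_share_path` is the `ament_index_python` function mentioned in the comment; `find_model_dir` and its `fallback` parameter are hypothetical names used only for illustration.

```python
from pathlib import Path


def find_model_dir(package_name: str, fallback: Path) -> Path:
    """Return <share dir of package>/model, or an absolute fallback path.

    Hypothetical helper: assumes the model folder is installed under the
    package's share directory.
    """
    try:
        # ament_index_python is only importable inside a ROS 2 environment.
        from ament_index_python.packages import get_package_share_path

        return get_package_share_path(package_name) / "model"
    except (ImportError, LookupError):
        # ImportError: no ROS 2 environment is sourced.
        # LookupError: covers PackageNotFoundError, which derives from KeyError.
        # resolve() makes the fallback absolute, as recommended in the review.
        return fallback.resolve()
```

In the node this could replace the module-level path, e.g. `bb_tts_dir = find_model_dir("bitbots_tts", Path(__file__).parent.parent / "model")`.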

}
with io.BytesIO() as buffer:
    with wave.open(buffer, "wb") as wav_file:
        voice.synthesize(text, wav_file, **synthesize_args)
Contributor

Does this mean we generate a new WAV file each time?
mimic implicitly cached the already generated voices, which, if I remember correctly, was also the reason for the web server.

Have you measured how long the WAV generation takes?

Contributor Author

@val-ba val-ba May 13, 2025

It runs in real time on my laptop, but I have not tried it on the robot yet. Maybe we should cache the results again though, just to be sure.
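
The caching discussed in this thread could look roughly like the sketch below. It is not the mimic3 behaviour and not part of this PR. `WavCache` and the `synthesize` callable are hypothetical; in the node, the callable would wrap the `voice.synthesize(...)` call from the diff and return the WAV bytes from the buffer. The cache key here is only the text, so if `synthesize_args` can vary between calls they would need to be part of the key as well.

```python
import hashlib
from pathlib import Path
from typing import Callable


class WavCache:
    """Disk cache for synthesized speech, keyed by a hash of the text."""

    def __init__(self, cache_dir: Path, synthesize: Callable[[str], bytes]) -> None:
        self._dir = cache_dir
        self._dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical callable: takes the text, returns complete WAV bytes.
        self._synthesize = synthesize

    def get(self, text: str) -> bytes:
        # Hash the text so arbitrary sentences map to safe file names.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        path = self._dir / f"{key}.wav"
        if path.exists():
            # Cache hit: skip synthesis entirely.
            return path.read_bytes()
        # Cache miss: synthesize once and store the result for later calls.
        data = self._synthesize(text)
        path.write_bytes(data)
        return data
```

Because the cache lives on disk, phrases that repeat often would only be synthesized once per robot, even across restarts.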

Labels
None yet
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

Replace mimic3 with piper
3 participants