Replace mimic3 with piper #686
Conversation
A few things:
- How is the model downloaded?
- I think we should implement caching so the TTS output is not synthesized every time (we had this in mimic3)
- With mimic we had a separate server that was started by systemd (configured in Ansible), so that we could, e.g., announce the IP at startup
@@ -229,3 +229,6 @@ doku/*
**/workspace_status.json

.pytest_cache/

# tts model
*/bitbots_tts/model/*
I would suggest **/bitbots_tts/model/ to ignore the whole directory at any depth (in case we move it at some point).
@@ -16,6 +17,12 @@

from bitbots_msgs.msg import Audio

# Load the Piper voice
bb_tts_dir = Path(__file__).parent.parent / "model"  # TODO: check how to get nice relative paths
Depending on where you want to locate the model folder, you should probably use get_package_prefix or get_package_share_path, which would give you e.g. ~/colcon_ws/install/share/bitbots_tts.
The current solution should also work, but I would then use an absolute path to ensure that it works when building without symlinks.
}
with io.BytesIO() as buffer:
    with wave.open(buffer, "wb") as wav_file:
        voice.synthesize(text, wav_file, **synthesize_args)
This would mean that we will generate a new wav file each time?
mimic was implicitly caching the already generated voices, which was also the reason for the web server if I remember correctly.
Have you tried how long the wav generation takes?
It is real time on my laptop, but I did not try it on the robot yet. Maybe we should cache them again though... just to be sure
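One possible shape for such a cache, as a minimal sketch only: store the generated WAV bytes on disk, keyed by a hash of the text, and synthesize only on a miss. The `WavCache` class and the `synthesize` callable are hypothetical stand-ins (the callable would wrap the `voice.synthesize(...)` call from the diff); if the synthesis arguments can change, they would need to be part of the key as well.

```python
import hashlib
from pathlib import Path
from typing import Callable


class WavCache:
    """Disk cache for synthesized speech, keyed by a hash of the text."""

    def __init__(self, cache_dir: Path, synthesize: Callable[[str], bytes]) -> None:
        # `synthesize` is a hypothetical callable returning complete WAV bytes.
        self._dir = cache_dir
        self._dir.mkdir(parents=True, exist_ok=True)
        self._synthesize = synthesize

    def get(self, text: str) -> bytes:
        """Return WAV bytes for `text`, synthesizing only on a cache miss."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        path = self._dir / f"{key}.wav"
        if path.exists():
            # Cache hit: reuse the previously generated audio.
            return path.read_bytes()
        # Cache miss: generate once and store for later requests.
        data = self._synthesize(text)
        path.write_bytes(data)
        return data
```

Because the cache lives on disk, phrases that repeat across restarts would only be synthesized once on the robot.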
Summary
fixes #530
Proposed changes
Related issues
Checklist
colcon build