fastspeech models #4337
base: develop
Thanks for your contribution!
Predict:
  batch_size: 1
  model_dir: "fastspeech2csmsc"
  input: "今天天气真不错"
Shouldn't input be a phone (phoneme) sequence?
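To illustrate the distinction the reviewer is drawing, here is a minimal sketch of turning the sample sentence into a phone sequence. The lexicon, the phone inventory, and the function name `text_to_phones` are all hypothetical and are not taken from the PR; the sketch ignores tone sandhi and polyphone disambiguation.

```python
# Hypothetical toy lexicon: one (initial, final+tone) pair per character.
# It covers only the sample sentence and ignores tone sandhi and polyphones.
TOY_LEXICON = {
    "今": ("j", "in1"),
    "天": ("t", "ian1"),
    "气": ("q", "i4"),
    "真": ("zh", "en1"),
    "不": ("b", "u4"),
    "错": ("c", "uo4"),
}


def text_to_phones(text):
    """Flatten each character's (initial, final) pair into one phone list."""
    phones = []
    for char in text:
        initial, final = TOY_LEXICON[char]
        phones.extend([initial, final])
    return phones


# The raw text from the config versus what a phone-level input would look like.
phones = text_to_phones("今天天气真不错")
```

If the acoustic module expects phones, the Predict example would carry a sequence like `phones` rather than the raw string.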
Predict:
  batch_size: 1
  model_dir: "pwgan_csmsc"
  input: "今天天气真不错"
Shouldn't input be an npy file or a tensor?
# limitations under the License.

from .dataset_checker import TextToSpeechAcousticDatasetChecker
# from .trainer import TextToSpeechTrainer
Remember to update this spot as well.
@@ -0,0 +1,208 @@
---
The file name isn't right; it should end in .en.md
@@ -0,0 +1,209 @@
---
Same as above.
## III. Quick Integration
Before quick integration, first install the PaddleX wheel package. For wheel installation methods, please refer to [PaddleX Local Installation Tutorial](../../../installation/installation.md). After installing the wheel package, inference for the multilingual speech synthesis acoustic module can be completed with just a few lines of code. You can freely switch models within this module, or integrate model inference from the multilingual speech synthesis module into your project.
<!-- Before running the following code, please download the [sample audio](https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav){target="_blank"} to your local machine. -->
At this point your input is an npy file, so the sample should no longer be audio, right?
@@ -0,0 +1,14 @@
# copyright (c) 2025 PaddlePaddle Authors. All Rights Reserve.
useless file?
@@ -0,0 +1,14 @@
# copyright (c) 2025 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
useless file?
@@ -0,0 +1,18 @@
# copyright (c) 2024 PaddlePaddle Authors. All Rights Reserve.
2024 -> 2025
…o develop_fastspeech_item
paddlex/modules/__init__.py
Outdated
    TextRecExportor,
    TextRecTrainer,
)
from .text_to_speech_vocoder import TextToSpeechVocoderDatasetChecker
Should the other classes, such as the trainer and exporter, be imported here as well?
    entities = MODELS

    def __init__(self, config):
        """
data format error
class TextToSpeechAcousticEvaluator(BaseEvaluator):
    """Instance Fastspeech2Model Model Evaluator"""
Fastspeech2
There are a few issues; please take another look when you have time.
use_trt: False
use_mkldnn: False
cpu_threads: 1
precision: "fp32"
output: "output"
model_name: "fastspeech2_csmsc"
speaker_dict: None
lang: zh
speaker_id: 0
Apart from output, these parameters are only inference-related, aren't they? Shouldn't they be moved under Predict?
Also, why is there still a model_name?
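One possible reading of this suggestion, sketched with plain Python dictionaries rather than the project's YAML. The key names and values come from the snippet under review; the helper `move_inference_keys`, the choice of which keys count as inference-only, and the section layout are assumptions for illustration, not PaddleX's actual schema. `model_name` is left in place because the reviewer only questions it.

```python
# Keys the reviewer describes as inference-only (an interpretation, not a spec).
INFERENCE_ONLY_KEYS = {
    "use_trt", "use_mkldnn", "cpu_threads", "precision",
    "speaker_dict", "lang", "speaker_id",
}


def move_inference_keys(flat_section, predict_section):
    """Return (remaining, predict) with inference-only keys moved into predict."""
    remaining = {k: v for k, v in flat_section.items()
                 if k not in INFERENCE_ONLY_KEYS}
    moved = {k: v for k, v in flat_section.items()
             if k in INFERENCE_ONLY_KEYS}
    # Existing Predict entries are kept; the moved keys are added alongside them.
    return remaining, {**predict_section, **moved}


# Values copied from the reviewed snippet (YAML `None` shown as Python None).
flat = {
    "use_trt": False, "use_mkldnn": False, "cpu_threads": 1,
    "precision": "fp32", "output": "output",
    "model_name": "fastspeech2_csmsc",
    "speaker_dict": None, "lang": "zh", "speaker_id": 0,
}
remaining, predict = move_inference_keys(flat, {"batch_size": 1})
```

After the move, only `output` and `model_name` stay in the original section, and everything needed at inference time sits next to `batch_size` under Predict.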
use_trt: False
use_mkldnn: False
cpu_threads: 1
precision: "fp32"
output: "output"
model_name: "pwgan_csmsc"
speaker_dict: None
lang: zh
speaker_id: 0
Same here.
# # tone sandhi
# sub_finals = self.tone_modifier.modified_tone(word, pos,
#                                               sub_finals)
# er hua
# if with_erhua:
#     sub_initials, sub_finals = self._merge_erhua(
#         sub_initials, sub_finals, word, pos)
If this code is unused, please delete it.
# polyphone disambiguation
# word_pinyins = self.corrector.correct_pronunciation(
#     word, word_pinyins)
If this code is unused, please delete it.
# fix wordseg bad case for sandhi
# seg_cut = self.tone_modifier.pre_merge_for_modify(seg_cut)
# To get better results for polyphonic words, predict over the whole sentence here
If this code is unused, please delete it.
)
elif char in self.char_bopomofo_dict:
    partial_result[i] = pypinyin_result[i][0]
    # partial_result[i] = self.style_convert_func(self.char_bopomofo_dict[char][0])
If this code is unused, please delete it.
    self.sample_rate = sample_rate
def write(self, out_path, obj):
The code style looks off; please run pre-commit.
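Assuming the style problem is the missing blank line between the constructor and the next method, the layout a formatter would likely produce looks roughly like the sketch below. PEP 8 asks for a single blank line around method definitions inside a class. The class name `AudioWriterSketch` and the placeholder method body are hypothetical, since the diff shows neither the class header nor the real implementation.

```python
class AudioWriterSketch:
    """Hypothetical writer class; only the layout matters here."""

    def __init__(self, sample_rate):
        self.sample_rate = sample_rate

    def write(self, out_path, obj):
        # Placeholder body: the real method's logic is not shown in the diff.
        raise NotImplementedError


# Constructing the class works; calling write() would raise NotImplementedError.
writer = AudioWriterSketch(sample_rate=24000)
```

Running `pre-commit run --all-files` locally applies whatever hooks the repository configures, so fixes of this kind usually come for free before the next push.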
    self.sample_rate = sample_rate
def _write_obj(self, out_path, obj):
pre-commit
No description provided.