Running this command gets me consistent output in approx. 7 seconds: melo 我的名字叫小杨 dog.wav --language ZH
/Users/zihaolam/Projects/tts-editor/MeloTTS/melo/main.py:71: UserWarning: You specified a speaker but the language is English.
warnings.warn("You specified a speaker but the language is English.")
loading pickled model from cache
loaded pickled model from cache, took 8.529947996139526
> Text split to sentences.
我的名字叫小杨
> ===========================
0%| | 0/1 [00:00<?, ?it/s]Building prefix dict from the default dictionary ...
Loading model from cache /var/folders/j4/zkddp3ms6493qzbf3qf7rfwr0000gn/T/jieba.cache
Loading model cost 0.406 seconds.
Prefix dict has been built successfully.
Some weights of the model checkpoint at bert-base-multilingual-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
/Users/zihaolam/Projects/tts-editor/MeloTTS/.venv/lib/python3.9/site-packages/torch/nn/functional.py:4522: UserWarning: MPS: The constant padding of more than 3 dimensions is not currently supported natively. It uses View Ops default implementation to run. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Pad.mm:472.)
return torch._C._nn.pad(input, pad, mode, value)
/Users/zihaolam/Projects/tts-editor/MeloTTS/melo/commons.py:123: UserWarning: MPS: no support for int64 for min_max, downcasting to a smaller data type (int32/float32). Native support for int64 has been added in macOS 13.3. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/ReduceOps.mm:612.)
max_length = length.max()
100%|██████████████████████████████████████████████████████████| 1/1 [00:07<00:00, 7.51s/it]
import os
import pickle
import time

def get_model_pkl_path(language: str):
    return os.path.join(os.path.dirname(__file__), f"model_{language}.pkl")

def get_model(language: str, device: str):
    model_pkl_path = get_model_pkl_path(language)
    if not os.path.exists(model_pkl_path):
        from melo.api import TTS
        model = TTS(language=language, device=device)
        with open(model_pkl_path, "wb") as f:
            pickle.dump(model, f)
    else:
        with open(model_pkl_path, "rb") as f:
            start = time.time()
            print("loading pickled model from cache")
            model = pickle.load(f)
            print("loaded pickled model from cache, took", time.time() - start)
    return model
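For what it's worth, the load-or-build caching pattern itself behaves as expected. Here is a minimal self-contained sketch of the same logic with a plain dict standing in for the TTS model (the `get_cached` helper and its names are my own illustration, not part of the MeloTTS API):

```python
import os
import pickle
import tempfile
import time

def get_cached(cache_path: str, build):
    """Load an object from a pickle cache, building and caching it on a miss."""
    if os.path.exists(cache_path):
        start = time.time()
        with open(cache_path, "rb") as f:
            obj = pickle.load(f)
        print("loaded from cache, took", time.time() - start)
    else:
        obj = build()
        with open(cache_path, "wb") as f:
            pickle.dump(obj, f)
    return obj

calls = []
def build():
    # Stand-in for the expensive TTS(...) construction.
    calls.append(1)
    return {"language": "ZH", "weights": list(range(5))}

cache = os.path.join(tempfile.mkdtemp(), "model_ZH.pkl")
first = get_cached(cache, build)   # cache miss: builds and pickles the object
second = get_cached(cache, build)  # cache hit: loads the pickle, build() is skipped
```

This confirms the second call never re-runs `build()`, so the remaining 7 seconds must be spent elsewhere than model construction.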
Using pickle for the TTS model still does not help: synthesizing a short sentence still takes approx. 7 seconds. Is there a way to improve the speed, or to cache anything further to reduce this cold start?
The Gradio web UI takes approx. 1 second to generate the same text. However, I would like to use the CLI instead of running a Python server. Is there anything I can optimise so that the CLI takes the same time as the web UI/server?
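One way to tell whether the gap is inference time or per-invocation import overhead (the web UI pays its imports once at startup, while the CLI pays them on every run) is to time the imports in isolation. A rough sketch; the stdlib module names below are placeholders so the snippet runs anywhere, whereas in practice you would time the heavy dependencies (e.g. `torch`, `transformers`, `melo.api`):

```python
import importlib
import time

def timed_import(name: str) -> float:
    """Return the wall-clock seconds spent importing a module."""
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start

# Placeholder modules; substitute the real heavy dependencies here.
for mod in ("json", "decimal", "email"):
    print(f"import {mod}: {timed_import(mod) * 1000:.1f} ms")
```

CPython's built-in `python -X importtime` flag gives the same breakdown per module without any code changes, which may be a quicker way to see where the cold start goes.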