Synthesizer API
The TTS.utils.synthesizer.Synthesizer class provides an inference API for
TTS and voice conversion models. End users will normally use the higher-level
Python inference API instead, which offers many convenience
functions and uses the Synthesizer under the hood. However, you may use the
Synthesizer class directly for more control or additional outputs, including
timestamps.
Usage
Load a model by name or from a checkpoint file and run synthesis:
from TTS.utils.manage import ModelManager
from TTS.utils.synthesizer import Synthesizer

# Download the model by name and get the paths to its checkpoint and config.
model_path, config_path, _ = ModelManager().download_model("tts_models/en/ljspeech/vits")

# Load the model, synthesize speech, and write the result to a WAV file.
synth = Synthesizer(tts_checkpoint=model_path, tts_config_path=config_path)
wav = synth.tts("Hello World")
synth.save_wav(wav, "test_audio.wav")
Get additional outputs as a Python dictionary with return_dict=True:
>>> print(synth.tts("Hello world. This is a test.", return_dict=True))
{
    'wav': [...],
    'text': 'Hello world. This is a test.',
    'segments': [
        {'id': 0, 'start': 0.0, 'end': 0.92, 'text': 'Hello world.'},
        {'id': 1, 'start': 1.37, 'end': 2.50, 'text': 'This is a test.'}
    ]
}
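One possible use of the segments list is to generate subtitles for the synthesized audio. The sketch below turns segment dictionaries of the shape shown above into SubRip (SRT) text. It assumes that start and end are offsets in seconds from the beginning of the audio, which the example output suggests but does not state. The helpers format_timestamp and segments_to_srt are hypothetical names written for this illustration and are not part of the Synthesizer API.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Build SRT subtitle text from a list of segment dictionaries."""
    blocks = []
    for seg in segments:
        start = format_timestamp(seg["start"])  # assumed to be seconds
        end = format_timestamp(seg["end"])      # assumed to be seconds
        # SRT cue numbers start at 1, while the segment ids above start at 0.
        blocks.append(f"{seg['id'] + 1}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


# Sample data copied from the output above. In practice this would come from
# synth.tts(..., return_dict=True)["segments"].
segments = [
    {"id": 0, "start": 0.0, "end": 0.92, "text": "Hello world."},
    {"id": 1, "start": 1.37, "end": 2.50, "text": "This is a test."},
]
srt_text = segments_to_srt(segments)
print(srt_text)
```

Because the helpers only read the id, start, end, and text keys, they should work on any list with that shape and do not need a loaded model.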