xTTS provides a list of voices, making it only possible to do a voice conversion, not a voice clone #29

MonX94 · 2024-07-30T15:03:42Z

I'm not sure whether this is reproducible, but in my case xTTS v2 and YourTTS present a list of voices, while the expected behavior is to clone a voice from a given audio file. By contrast, the demonstration video shows no dropdown menu with a list of voices. I assume the dropdown was added after those videos were recorded, so this case isn't accounted for. I believe the UI should offer a choice between voice cloning and voice conversion whenever the model supports cloning.

If I don't choose a voice, I get this error in the terminal: [!] Looks like you are using a multi-speaker model. You need to define either a 'speaker_idx' or a 'speaker_wav' to use a multi-speaker model.

If I do choose a voice, I get the following error:

ConfigureVoiceTab.py 88 sample
utils.sampleVoice(self.txt_sample_text.Value)

utils.py 61 sampleVoice
play(AudioSegment.from_file(app_state.sample_speaker.speak(text, output)))

Voice.py 106 speak
self.voice.tts_with_vc_to_file(

api.py 455 tts_with_vc_to_file
wav = self.tts_with_vc(

api.py 419 tts_with_vc
self.load_vc_model_by_name("voice_conversion_models/multilingual/vctk/freevc24")

api.py 158 load_vc_model_by_name
model_path, config_path, _, _, _ = self.download_model_by_name(model_name)

api.py 131 download_model_by_name
model_path, config_path, model_item = self.manager.download_model(model_name)

manage.py 409 download_model
output_model_path, output_config_path = self._find_files(output_path)

manage.py 432 _find_files
raise ValueError(" [!] Model file not found in the output path")

ValueError:
 [!] Model file not found in the output path

I assume there is a problem with the voice conversion model, even though I have it installed. That is a separate issue, though.


MonX94 · 2024-07-31T20:16:09Z

I fixed it locally for my own use in a hacky way: for xTTS models I disabled voice conversion and do voice cloning instead. I can upload the change if there's a need for it (why would anyone use voice conversion with a voice cloning model anyway?), but to be release-worthy it needs to be designed properly.
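For readers who want a concrete picture of the idea, here is a minimal sketch of that kind of dispatch. It is not the code from the fork. The helper names (`supports_cloning`, `synthesize`), the substring list used to detect cloning-capable models, and the return labels are all assumptions made for illustration. The `tts` argument is assumed to behave like a Coqui `TTS.api.TTS` object, which exposes `tts_to_file` and `tts_with_vc_to_file`.

```python
from typing import Any, Optional

# Assumption: cloning-capable models can be recognized by a substring
# of their model name. This list is illustrative, not exhaustive.
CLONING_MODEL_HINTS = ("xtts", "your_tts")


def supports_cloning(model_name: str) -> bool:
    """Return True if the model name looks like a cloning-capable model."""
    name = model_name.lower()
    return any(hint in name for hint in CLONING_MODEL_HINTS)


def synthesize(
    tts: Any,
    model_name: str,
    text: str,
    output_path: str,
    speaker_wav: str,
    language: Optional[str] = None,
) -> str:
    """Write speech to output_path and report which path was taken.

    Hypothetical helper: `tts` is expected to provide `tts_to_file` and
    `tts_with_vc_to_file`, as the Coqui TTS API object does.
    """
    if supports_cloning(model_name):
        # Clone directly from the reference recording; no separate
        # voice conversion model is loaded on this path.
        tts.tts_to_file(
            text=text,
            speaker_wav=speaker_wav,
            language=language,
            file_path=output_path,
        )
        return "clone"

    # Fallback for models without cloning: synthesize, then convert the
    # result toward the reference voice.
    tts.tts_with_vc_to_file(
        text=text,
        speaker_wav=speaker_wav,
        file_path=output_path,
    )
    return "conversion"
```

With this shape, xTTS never reaches `tts_with_vc_to_file`, so the voice conversion model that fails to load in the traceback above is not touched for those models. A UI toggle could replace the name-based check if both modes should stay selectable.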

DragonVsKira · 2024-09-28T22:36:02Z

can u show me

MonX94 · 2024-10-01T01:59:41Z

@DragonVsKira, sorry for the late response. See my fork: https://github.com/MonX94/WeeaBlind/tree/xtts_no_voice_conversion. @FlorianEagox, you might also be interested. It's basically a few tweaks in a single file.

Fix issue #29

FlorianEagox added a commit that referenced this issue Nov 21, 2024

Merge pull request #33 from MonX94/xtts_no_voice_conversion

c282e34

Fix issue #29
