Feature Request: Automatic Trimming #2

torridgristle · 2019-09-06T16:26:35Z

Some voice samples have a long tail or just plain silence after the word. Detecting an audio level below a threshold and marking that region as a tail seems useful: the tail can be cut off when the sample is used in the middle of a word or sentence, and preserved when it's at the end of a sentence.
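A rough sketch of the threshold idea, just to make the request concrete. The names `find_tail_start` and `trim_tail`, the threshold value, and the minimum tail length are all made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0. A real implementation would more likely measure level over short windows (RMS) than per sample.

```python
def find_tail_start(samples, threshold=0.02, min_tail=4):
    """Return the index where the trailing quiet tail begins.

    Scans backwards from the end while samples stay below the threshold.
    If the quiet run is shorter than min_tail samples, it is not treated
    as a tail and len(samples) is returned.
    """
    end = len(samples)
    i = end
    while i > 0 and abs(samples[i - 1]) < threshold:
        i -= 1
    return i if end - i >= min_tail else end


def trim_tail(samples, keep_tail, threshold=0.02, min_tail=4):
    """Cut the tail mid-sentence, keep it at the end of a sentence."""
    if keep_tail:
        return list(samples)
    return list(samples[:find_tail_start(samples, threshold, min_tail)])


if __name__ == "__main__":
    clip = [0.0, 0.5, -0.4, 0.3, 0.01, -0.005, 0.0, 0.0, 0.0]
    print(trim_tail(clip, keep_tail=False))  # used mid-sentence
    print(trim_tail(clip, keep_tail=True))   # used at the end of a sentence
```

Marking the tail once per sample and deciding at stitch time whether to keep it means the same sample can serve both positions.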

There are also some long vowel sounds that could be automatically shortened just fine with the cross-fading in Audition. I think a good way to handle this would be to align all the parts of words to a user-specified BPM, since people tend to speak with a rhythm, and cut them to divisions of a beat. You'd likely need a way to index the lengths of all the different parts of words so that di-ff-er-in-tuh isn't pieced together slowly, though, so maybe it's not a great idea. FL Studio's Speech Synthesizer, based on DECtalk or KlattTalk, has a BPM option and it sounds pretty natural given the quality of speech synthesis.
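For what it's worth, the beat-division part could look something like the sketch below. `beat_grid_step` and `snap_duration` are hypothetical names, and four divisions per beat is an arbitrary assumption. It only snaps a segment length to the nearest grid step and does not address the harder problem of knowing how long each part of a word should be.

```python
def beat_grid_step(bpm, divisions_per_beat=4):
    """Length in seconds of one division of a beat at the given BPM."""
    return 60.0 / bpm / divisions_per_beat


def snap_duration(duration, bpm, divisions_per_beat=4):
    """Round a segment duration to the nearest grid step (at least one step)."""
    step = beat_grid_step(bpm, divisions_per_beat)
    steps = max(1, round(duration / step))
    return steps * step


if __name__ == "__main__":
    # At 120 BPM with 4 divisions per beat, one step is 0.125 seconds.
    for length in (0.30, 0.03, 0.52):
        print(length, "->", snap_duration(length, bpm=120))
```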

MysteryPancake · 2019-09-07T02:02:45Z

Silence removal and better speech awareness are definitely things I want to add. I am also hoping to add an option to detect and adjust the pitch of each segment to maintain consistency (like Adobe Voco).

The best results usually come from aligning to another speech sample, as this ensures decent timing. Adobe Voco accomplishes this by aligning to a speech synthesizer. The transcript-only mode does not align to a speech synthesizer by default; instead it chooses the longest phones (this is why di-ff-er-in-tuh is pieced together slowly). This can be changed with the "Choose Method" dropdown, and an alignment file can be specified in the "Destination" section.
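To illustrate the difference between the two approaches, here is a minimal sketch. It is not the website's actual code: the candidate layout (dicts with `start` and `end` in seconds) and the function names `pick_longest` and `pick_closest` are assumptions made up for this example.

```python
def duration(candidate):
    """Length of a candidate phone in seconds."""
    return candidate["end"] - candidate["start"]


def pick_longest(candidates):
    """Transcript-only style: take the longest available phone."""
    return max(candidates, key=duration)


def pick_closest(candidates, target_duration):
    """Aligned style: take the phone whose length best matches the
    duration given by an alignment of the destination speech."""
    return min(candidates, key=lambda c: abs(duration(c) - target_duration))


if __name__ == "__main__":
    candidates = [
        {"start": 0.0, "end": 0.30},
        {"start": 1.0, "end": 1.08},
        {"start": 2.0, "end": 2.15},
    ]
    print(pick_longest(candidates))
    print(pick_closest(candidates, target_duration=0.1))
```

Always taking the longest phone stretches every syllable, whereas matching a target duration keeps the pacing of the reference speech.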

At the moment the only aligners that work on the website are Gentle and WebMAUS. By default the website sends requests to Gentle, but it's easier to run Gentle directly so that no proxy is needed. The JSON file from Gentle can be uploaded to the website in the "Destination" section, which often gives better results because it carries decent timing information.
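As a hedged example of the timing information involved, the sketch below reads word timings out of a Gentle-style JSON string. The field names (`words`, `case`, `word`, `start`, `end`) are what Gentle's output usually contains as far as I know, but treat them as assumptions and check them against an actual file. `read_gentle_words` is a hypothetical helper, not part of Gentle or the website.

```python
import json


def read_gentle_words(gentle_json):
    """Return (word, start, end) tuples for successfully aligned words.

    Words that Gentle could not align are skipped, since they carry
    no usable timing.
    """
    data = json.loads(gentle_json)
    timings = []
    for word in data.get("words", []):
        if word.get("case") != "success":
            continue
        timings.append((word["word"], word["start"], word["end"]))
    return timings


if __name__ == "__main__":
    sample = json.dumps({
        "transcript": "different words",
        "words": [
            {"word": "different", "case": "success", "start": 0.12, "end": 0.58},
            {"word": "words", "case": "not-found-in-audio"},
        ],
    })
    for word, start, end in read_gentle_words(sample):
        print(word, start, end)
```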

I'm really surprised you managed to work out these problems already! I hope to improve this a lot more so thank you for the feedback!

MysteryPancake · 2019-09-07T02:32:50Z

If you're interested, I am currently working on an audio editor which is supposed to act as an online replacement for Adobe Audition. It's still very buggy with lots of missing features and there is currently no way to align from this page, but eventually the old website will be replaced with this editor.

I hope to add a bunch of features such as silence removal, crossfading, stretching, and pitching, as well as options to easily swap out words and phones with alternatives.

MysteryPancake · 2019-09-12T22:57:33Z

I have added my todo list to the issues section, as it was previously in a text file. Since this issue is now on that list, I will close it for now as I work down the list.

MysteryPancake self-assigned this Sep 7, 2019

MysteryPancake added the enhancement label Sep 7, 2019

MysteryPancake closed this as completed Sep 12, 2019
