Categories
News

Thorsten-Voice on Mastodon?!

As I’ve heard lots of great things about Mastodon, I decided to give it a try. Let’s see how it feels 😊.

If you would like to see me there, please follow my thorstenvoice account.

New free 🇩🇪 voice dataset published – “Hessisch”

I’ve released my latest Thorsten-Voice dataset. This one is recorded in my regional German dialect, “(Süd) Hessisch”, which is mainly spoken in the southern part of my German home state “Hessen”. The recordings are based on neutral textual input.

Feel free to use it to train an AI model. It is available as a free download on Zenodo.

More info is available here.

🗣️ Thorsten-Voice @ Huggingface

Even though I’ve published some audio examples of my artificial voice here, you might want to try out “my” voice with your own texts.

So I set up a Hugging Face Space for it. Try it out right now with your own texts in the browser.

https://huggingface.co/spaces/Thorsten-Voice/demo

ThorstenVoice-Dataset-2022.10 is released 🎉

Yeah 🥳, the new ThorstenVoice dataset is available for public download. Like the previous datasets, it is CC0 licensed, so it can be used by anyone.

If you use this dataset, please cite/quote/reference it using
DOI: 10.5281/zenodo.7265581 – Thank you 😊.

@dataset{muller_thorsten_2022_7265581,
  author       = {Müller, Thorsten and
                  Kreutz, Dominik},
  title        = {ThorstenVoice Dataset 2022.10},
  month        = oct,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {1.0},
  doi          = {10.5281/zenodo.7265581},
  url          = {https://doi.org/10.5281/zenodo.7265581}
}

More information and the download link are available on Zenodo.

New Thorsten-Voice voice dataset (soon)

The existing free German TTS models “Thorsten” Tacotron2 DDC and VITS are based on my free and newly recorded voice dataset, which will be published soon.

Its name is, totally creatively, “Thorsten-22.10”.

You can listen to some samples from that voice dataset, which was recorded for TTS model training.

Number of recordings: 12,432
Audio duration: 11+ hours
Sample rate: 22,050 Hz
Channels: Mono
Normalization: -24 dB
Speed (average): 17.5 characters per second
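
As a rough sanity check, the figures above can be combined into per-recording averages. The sketch below assumes exactly 11 hours of audio (the post says “11+”, so the real averages are somewhat higher) and uses illustrative variable names that are not part of the dataset.

```python
# Rough per-recording averages derived from the published dataset figures.
# Assumption: exactly 11 hours of audio; the post states "11+ hours",
# so the results are approximate lower bounds.
NUM_RECORDINGS = 12_432
AUDIO_HOURS = 11
CHARS_PER_SECOND = 17.5

total_seconds = AUDIO_HOURS * 3600
avg_seconds_per_recording = total_seconds / NUM_RECORDINGS
avg_chars_per_recording = avg_seconds_per_recording * CHARS_PER_SECOND

print(f"Average recording length: at least {avg_seconds_per_recording:.1f} s")
print(f"Average text length: roughly {avg_chars_per_recording:.0f} characters")
```

In other words, a typical recording is a single short sentence of a few seconds.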

Examples of artificially synthesized phrases from a TTS model based on this voice dataset.

🗣️ New German “Thorsten” TTS model released 🎉

YEARS of passion for open voice tech,
MONTHS of recording sessions,
WEEKS of computed training time,
DAYS of audio optimization,
HOURS of disillusion.

All for that ONE MOMENT, to share the next generation of the open “Thorsten-Voice” with the community!

This model is based on a completely newly recorded and optimized voice dataset (Thorsten-22.05-neutral).

It’s trained using Coqui 🐸 TTS (for all “TTS-Insiders”, it’s a VITS model).

tl;dr

- pip install tts==0.7.1
- tts-server --model_name tts_models/de/thorsten/vits
- Open your web browser at http://localhost:5002
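
The same model can also be used without the web server. The sketch below builds a call to the `tts` command-line tool that ships with Coqui TTS. The helper name `build_tts_command`, the sample sentence, and the output file name are illustrative assumptions, and the command only runs if the package from the steps above is installed.

```python
import shutil
import subprocess
from typing import List

MODEL_NAME = "tts_models/de/thorsten/vits"  # model name from the steps above


def build_tts_command(text: str, out_path: str) -> List[str]:
    """Return the argument list for a one-off synthesis with the `tts` CLI."""
    return ["tts", "--text", text, "--model_name", MODEL_NAME, "--out_path", out_path]


if __name__ == "__main__":
    # Hypothetical sample sentence and output file, chosen for illustration.
    command = build_tts_command("Hallo, ich bin Thorsten.", "thorsten.wav")
    if shutil.which("tts") is None:
        print("The tts command was not found; install Coqui TTS first.")
    else:
        subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with umlauts or punctuation in the text.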

Just have fun 🗣️🎉😄

Dominik & Thorsten

“Thorsten” samples from Mycroft skills

Dominik and I are still working on a new version of the “Thorsten” voice for use with Mycroft installations.

This is our current work-in-progress state
(thanks, Olaf, for supporting us with compute power for the HiFi-GAN training).

“Bitte warte einen Moment, bis ich fertig mit dem booten bin.”
“Ich bin jetzt bereit.”
“Ich verstehe das nicht, aber ich lerne jeden Tag neue Dinge.”
“Es ist im Moment klarer Himmel bei 18 Grad.”
“Mein Name ist Mycroft und ich bin funky.”

Audio samples of next “Thorsten” voice model

After I (again) invested months of my free time in audio recordings (this time with a good microphone and recording setup) and Dominik applied his “audio magic”, things really got started for both of us.

We have tried (and are still trying) various configurations, but we want to share our current results with you.

  • More than 12,000 mono audio recordings made by me at a sample rate of 22 kHz
  • Trained with Coqui TTS (0.5.0)
  • Tacotron2 DDC (TTS model)
  • HiFi-GAN (vocoder). Thanks, Olaf, for supporting us with compute power.
  • Lots of love 🙂

Of course, this “Thorsten” model still generates speech offline and is available free of charge under the CC0 license.

But how does it sound?

Info about “Berlin” (Source: Wikipedia)

There is no date yet for when the model and the underlying dataset will be released, as the fine-tuning work is still ongoing. However, we are closer to the goal than to the start :-).

We would appreciate your feedback on the current status of the model, either via the contact form or by email to tm@thorsten-voice.de.
