Thorsten-Voice, the free German AI voice.

News

Thorsten-Voice on Mastodon?!

Post author By Thorsten Müller
Post date 23. March 2024

I’ve heard lots of great things about Mastodon, so I decided to give it a try. Let’s see how it feels 😊.

If you would like to see me there, please follow my thorstenvoice account.

New free 🇩🇪 voice dataset published – “Hessisch”

Post author By Thorsten Müller
Post date 15. January 2024

I’ve released my latest Thorsten-Voice dataset. This one is spoken in my regional German dialect, “(Süd) Hessisch” (South Hessian), which is mainly spoken in the southern part of my German home state “Hessen” (Hesse). The recordings are based on neutral text input.

Feel free to use it to train an AI model. It is available as a free download on Zenodo.

More info is available here.

🗣️ Thorsten-Voice @ Hugging Face

Post author By Thorsten Müller
Post date 8. March 2023

Even though I’ve published some audio examples of my artificial voice here, you might want to try out “my” voice with your own texts.

So I set up a Hugging Face Space for it. Try it out right now with your own texts in the browser.

https://huggingface.co/spaces/Thorsten-Voice/demo

ThorstenVoice-Dataset-2022.10 is released 🎉

Post author By Thorsten Müller
Post date 8. November 2022

Yeah 🥳, the new ThorstenVoice dataset is available for public download. Like the previous datasets, it is CC0 licensed, so it can be used by anyone.

If you use this dataset please cite/quote/reference it using
DOI: 10.5281/zenodo.7265581 – Thank you 😊.

@dataset{muller_thorsten_2022_7265581,
  author       = {Müller, Thorsten and
                  Kreutz, Dominik},
  title        = {ThorstenVoice Dataset 2022.10},
  month        = oct,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {1.0},
  doi          = {10.5281/zenodo.7265581},
  url          = {https://doi.org/10.5281/zenodo.7265581}
}

More information and the download link are available on Zenodo.

New Thorsten-Voice voice dataset (soon)

Post author By Thorsten Müller
Post date 13. October 2022

The existing free German TTS models “Thorsten” Tacotron2 DDC and VITS are based on my free, newly recorded voice dataset, which will be published soon.

Its name (totally creative) is “Thorsten-22.10”.

You can listen to some samples from this voice dataset, which was recorded for TTS model training.

Number of recordings	12,432
Audio duration	11+ hours
Sample rate	22,050 Hz
Channels	Mono
Normalization	-24 dB
Speed (average)	17.5 characters per second
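
If you want to check that an audio clip matches the format listed above (22,050 Hz, mono), a minimal Python sketch could look like the following. It assumes the clips are PCM WAV files that the standard-library `wave` module can read; `describe_wav` and `matches_dataset_format` are hypothetical helper names used only for illustration, not part of any Thorsten-Voice tooling. Normalization level is not checked here.

```python
import wave

EXPECTED_SAMPLE_RATE = 22050  # Hz, taken from the table above
EXPECTED_CHANNELS = 1         # mono, taken from the table above


def describe_wav(path):
    """Read basic format information from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_dataset_format(info):
    """Return True if sample rate and channel count match the table above."""
    return (
        info["sample_rate"] == EXPECTED_SAMPLE_RATE
        and info["channels"] == EXPECTED_CHANNELS
    )
```

Calling `matches_dataset_format(describe_wav("clip.wav"))` on a file of your own (the file name is a placeholder) returns `True` only for mono clips at 22,050 Hz.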

Examples of phrases synthesized by a TTS model trained on this voice dataset.

🗣️ New German “Thorsten” TTS model released 🎉

Post author By Thorsten Müller
Post date 23. June 2022

YEARS of passion for open voice tech,
MONTHS of recording sessions,
WEEKS of compute time for training,
DAYS of audio optimization,
HOURS of disillusionment.

All for that ONE MOMENT: sharing the next generation of the open “Thorsten-Voice” with the community!

This model is based on a completely newly recorded and optimized voice dataset (Thorsten-22.05-neutral).

It was trained using Coqui 🐸 TTS (for all “TTS insiders”: it’s a VITS model).

tl;dr

- pip install tts==0.7.1
- tts-server --model_name tts_models/de/thorsten/vits
- Open a web browser at http://localhost:5002
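
Instead of the browser, the running server can also be queried from a script. The following is a minimal Python sketch under stated assumptions: it assumes the server started in the steps above exposes an `/api/tts` endpoint that accepts the sentence as a `text` query parameter and returns WAV audio, as the Coqui TTS demo server does in the versions I am aware of. Please verify this against your installed version. `build_tts_url` and `synthesize` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SERVER = "http://localhost:5002"  # address from the steps above


def build_tts_url(text, server=SERVER):
    """Build the request URL for one sentence (endpoint path is an assumption)."""
    return f"{server}/api/tts?{urlencode({'text': text})}"


def synthesize(text, out_path="output.wav", server=SERVER):
    """Fetch synthesized audio from the running server and save it to a file."""
    with urlopen(build_tts_url(text, server)) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With the server running, `synthesize("Hallo, ich bin Thorsten.")` should write the spoken sentence to `output.wav` in the current directory.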

Just have fun 🗣️🎉😄

Dominik & Thorsten

“Thorsten” samples from Mycroft skills

Post author By Thorsten Müller
Post date 8. March 2022

Dominik and I are still experimenting to provide a new version of the “Thorsten” voice for use with Mycroft installations.

This is the current work-in-progress state
(thanks, Olaf, for supporting us with compute power for the HiFi-GAN training).

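“Bitte warte einen Moment, bis ich fertig mit dem booten bin.” (“Please wait a moment until I have finished booting.”)

“Ich bin jetzt bereit.” (“I am ready now.”)

“Ich verstehe das nicht, aber ich lerne jeden Tag neue Dinge.” (“I don’t understand that, but I learn new things every day.”)

“Es ist im Moment klarer Himmel bei 18 Grad.” (“Right now there are clear skies at 18 degrees.”)

“Mein Name ist Mycroft und ich bin funky.” (“My name is Mycroft and I am funky.”)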

Audio samples of next “Thorsten” voice model

Post author By Thorsten Müller
Post date 7. March 2022

After I (again) invested months of my free time in audio recordings (this time with a good microphone and recording setup) and Dominik applied his “audio magic”, things really got going for both of us.

We have tried (and are still trying) various configurations, but we want to share our current results with you.

- More than 12,000 mono audio recordings made by me at a sample rate of 22 kHz
- Trained with Coqui TTS (0.5.0)
- Tacotron2 DDC (TTS model)
- HiFi-GAN (vocoder). Thanks, Olaf, for supporting us with compute power.
- Lots of love 🙂

Of course, this “Thorsten” model can still be used offline and is available free of charge under the CC0 license.

But how does it sound?

Info about “Berlin” (Source: Wikipedia)

There is no release date yet for the model and the underlying dataset, as the “fine-tuning” work is still ongoing. However, we are closer to the finish than to the start :-).

We would appreciate your feedback on the current status of the model, either via the contact form or by email to tm@thorsten-voice.de.