F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching! The quality is pretty impressive for open source, and it even supports mps for Mac! I was able to get it going on my Mac with no problem. #TTS #ML #AI
github.com/SWivid/F5-TTS
@ZBennoui
Миша
in reply to Chi Kim • • •Luis Carlos
in reply to Chi Kim • • •Musharraf
in reply to Luis Carlos • • •Too heavy for local use. Unless you have a GPU with at least 8GB of VRAM. Or, you know, a Mac.
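To make that rule of thumb concrete, here is a minimal sketch of a device check, assuming PyTorch is the backend (mps is a PyTorch device name). The helper names `pick_device` and `detect_device` and the 8 GB cutoff are illustrative assumptions based on the figure above, not anything taken from the F5-TTS code.

```python
def pick_device(has_cuda: bool, vram_gb: float, has_mps: bool,
                min_vram_gb: float = 8.0) -> str:
    """Hypothetical helper: map hardware facts to a PyTorch device string.

    The 8 GB default mirrors the rule of thumb in this thread; it is an
    assumption, not a documented F5-TTS requirement.
    """
    if has_cuda and vram_gb >= min_vram_gb:
        return "cuda"
    if has_mps:
        return "mps"  # Apple Silicon GPU backend in PyTorch
    return "cpu"      # works, but expect slow generation


def detect_device() -> str:
    """Probe the current machine; falls back to CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return pick_device(False, 0.0, False)

    has_cuda = torch.cuda.is_available()
    vram_gb = 0.0
    if has_cuda:
        # total_memory is in bytes; decimal GB keeps an "8 GB" card at >= 8.
        vram_gb = torch.cuda.get_device_properties(0).total_memory / 1e9

    # torch.backends.mps only exists in newer PyTorch builds, so guard it.
    mps = getattr(torch.backends, "mps", None)
    has_mps = mps is not None and mps.is_available()
    return pick_device(has_cuda, vram_gb, has_mps)


if __name__ == "__main__":
    print(detect_device())
```

A CUDA card under the cutoff falls through to mps or cpu here, which is a deliberately conservative choice; smaller cards may still work with shorter inputs.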
Luis Carlos
in reply to Musharraf • • •the esoteric programmer
in reply to Chi Kim • • •Musharraf
in reply to the esoteric programmer • • •@esoteric_programmer
It converts the following text:
Every time I see someone light up, um, because of something I've made, it's like, wow, a little piece of my inner child gets healed, you know? And, um, when...snip
To the attached speech.
Musharraf
in reply to Musharraf • • •You can easily plug this into an open-source LLM and get something akin to NotebookLM.
Totally free and open-source, with very high quality.
the esoteric programmer
in reply to Musharraf • • •Musharraf
in reply to the esoteric programmer • • •Other than the text completion part, you are almost correct.
You give it some text and an audio sample, and it tries to replicate the given voice's characteristics.
Research is active in the areas of speaker verification and audio deepfake detection to combat misuse.
the esoteric programmer
in reply to Musharraf • • •Musharraf
in reply to the esoteric programmer • • •Yes, that's how you'd do it.
Musharraf
in reply to the esoteric programmer • • •But again, can't you say the same thing about ElevenLabs? And other voice conversion tech?
the esoteric programmer
in reply to Musharraf • • •Erion
in reply to the esoteric programmer • • •Zachary Bennoui
in reply to Musharraf • • •