Isn't it possible to "pregenerate" the speech with all the necessary IDs, so that you can navigate and interrupt at will?
Just as one generates SSML from rich text (including maths formulas) before generating speech.
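
To make the idea concrete, here is a minimal sketch of what "pregenerating with IDs" could look like, assuming the text is already split into sentences and that the TTS engine reports SSML `<mark>` events (many do, but not all). The function name `build_ssml` and the `s0`, `s1`, ... ID scheme are just illustrative choices, not anything from an existing tool.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, lang="en"):
    """Return (ssml, ids): one SSML document with a <mark> before each sentence.

    The mark names are hypothetical IDs (s0, s1, ...) that a player could
    map to audio offsets in order to jump to, or stop at, a given sentence.
    """
    parts = [
        '<speak version="1.1" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
    ]
    ids = []
    for index, text in enumerate(sentences):
        mark_id = f"s{index}"
        ids.append(mark_id)
        # escape() protects &, < and > so the SSML stays well-formed XML
        parts.append(f'<mark name="{mark_id}"/><s>{escape(text)}</s>')
    parts.append("</speak>")
    return "".join(parts), ids


ssml, ids = build_ssml(["If a < b, stop.", "Otherwise continue."])
print(ids)
print(ssml)
```

Whether the marks come back with timestamps, and how precisely, depends on the engine, so the navigation side is left out here.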

It would be even better to capture intonations, breaths and the like unchanged, instead of letting the TTS generate a "pleasant full phrase" (a wrong expectation).

I find your post intriguingly close to the emerging reaction against the AI-generated #mundaneslop.

Items tagged with: mundaneslop

Paul L

5 days ago