in reply to Jakob Rosin

@cachondo @kevinrj @FreakyFwoof Oh really, that’s interesting. Just a bit about how this is working under the hood, for people who might be curious. I believe it’s using a language model to synthesize the speech rather than the older neural approaches Personal Voice used until recently. This is the same technology ElevenLabs, Google, Microsoft, and Amazon are using, which is why it sounds so good and why you don’t need to provide as much training data. To my knowledge, this is the first time an LM-based TTS system has been deployed for screen reader use, and I’m wondering if later in the beta cycle they will add the newer Siri voices that are based on the same technology. It sounds really good, and one thing you didn’t mention during the demonstration: the voice even breathes at punctuation marks.