--- Log opened Thu Mar 20 00:00:51 2025
00:19 -!- darsie [~darsie@84-113-82-174.cable.dynamic.surfer.at] has joined #hplusroadmap
00:50 < fenn> i wasn't so impressed with the CosyVoice TTS. this is the new hotness, but it's a full LLM https://sparkaudio.github.io/spark-tts/
00:51 < fenn> relatively small ~2.5GB unquantized
00:53 < fenn> it's interesting how it interprets the chinese accents as various english accents
00:53 < kanzure> can speech models be used for accent classification?
01:04 -!- TMM [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
01:04 -!- TMM [hp@amanda.tmm.cx] has joined #hplusroadmap
01:06 < fenn> i'm sure it's represented in the embedding somehow, you just have to decompose it appropriately
01:26 < fenn> SparkTTS has all speaker attributes encoded in the first 32 tokens presented to the audio decoder
01:27 < fenn> "time invariant acoustic characteristics"
01:29 < hprmbridge> kanzure> oh
01:33 < fenn> i think this includes stuff like emotion and room tone so you'd probably want to recalibrate it every scene
01:35 < fenn> you're reading books right so i would put the boundaries at every quote mark
01:36 < fenn> or not, i'm sure it's fine at pretending to be a person pretending to be a character, since it was primarily trained on that kind of audio
01:39 < fenn> i was researching the sesame conversational speech system last night. they use a transformer which gets the entire audio recording in its context, but also the text. it generates new audio conditioned on the text, but the LLM generating the text only gets a transcript so it's completely oblivious to the emotional and side channel content
01:39 < fenn> so sesame is capable of interpreting the user's emotions and audio events, in a sort of split brained way
01:40 < fenn> sparktts would only get text input, which is unfortunate because it seems like it would have worked fine if they just trained it on multi speaker data
01:41 < fenn> like obviously you want to speak with a TTS system, that's the main use case right
02:18 -!- Jenda [~jenda@coralmyn.hrach.eu] has quit [Read error: Connection reset by peer]
02:51 -!- archels [~neuralnet@static.65.156.69.159.clients.your-server.de] has quit [Ping timeout: 252 seconds]
03:31 -!- archels [~neuralnet@static.65.156.69.159.clients.your-server.de] has joined #hplusroadmap
04:23 -!- Jenda [~jenda@coralmyn.hrach.eu] has joined #hplusroadmap
05:54 < hprmbridge> kanzure> for boundaries just use LLMs to give scene directions
06:38 -!- srat3 [~srat3@user/srat3] has quit [Read error: Connection reset by peer]
06:38 -!- srat3 [~srat3@user/srat3] has joined #hplusroadmap
06:47 -!- Guest32 [~Guest32@213.134.170.141] has joined #hplusroadmap
06:49 -!- Guest32 [~Guest32@213.134.170.141] has quit [Client Quit]
07:10 -!- darsie [~darsie@84-113-82-174.cable.dynamic.surfer.at] has quit [Quit: Avoid fossil fuels and animal products. Have no/fewer children. Protest, elect sane politicians. Invest ecologically.]
07:11 -!- darsie [~darsie@84-113-82-174.cable.dynamic.surfer.at] has joined #hplusroadmap
11:44 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has quit [Quit: Konversation terminated!]
11:45 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
12:18 -!- Gooberpatrol_66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
12:19 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has quit [Ping timeout: 260 seconds]
13:44 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap
16:22 -!- Gooberpatrol_66 [~Gooberpat@user/gooberpatrol66] has quit [Quit: Konversation terminated!]
16:23 -!- Gooberpatrol_66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
16:37 -!- TMM [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
16:37 -!- TMM [hp@amanda.tmm.cx] has joined #hplusroadmap
18:08 -!- darsie [~darsie@84-113-82-174.cable.dynamic.surfer.at] has quit [Ping timeout: 252 seconds]
20:10 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
20:11 -!- Gooberpatrol_66 [~Gooberpat@user/gooberpatrol66] has quit [Ping timeout: 260 seconds]
20:12 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has quit [Client Quit]
20:13 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
23:47 -!- TMM [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
23:47 -!- TMM [hp@amanda.tmm.cx] has joined #hplusroadmap
--- Log closed Fri Mar 21 00:00:52 2025