This is a laughing matter.
Laughter is an extremely wide and varied topic, sociologically, physiologically, and linguistically. There's not nearly enough room in a single blog post to do it justice, but we wanted to share some modeling updates and concepts.
Laughter in Speech
When we think of laughter, we often think of independent sequences of ha ha ha variations. And while that is definitely a type of laughter, more relevant for conversational TTS is laughter's interaction with flowing speech. When people laugh within an utterance, they commonly do so in short, sometimes repeated, bursts of exhalation. This can take place anywhere in the utterance and generally blends into the surrounding words.
We've modeled this. Not only that, we've modeled it in such a way that the user can control where the laughter goes in the sentence. Most TTS offerings don't model laughter at all, and if they did, the laughter would be inserted stochastically, outside the user's control. Rime's tech allows us to generate either deterministic (user-placed) or stochastic laughter.
Here are some Rime-generated TTS clips of CEO Lily Clifford's voice saying a single sentence with laughter generated in a variety of places in the sentence, each subtly changing the rhetorical effect.
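To make the difference between user-placed and stochastic laughter concrete, here is a toy sketch in Python. It only manipulates text and is not Rime's API: the `<laugh>` marker and the function names `place_laughter` and `sprinkle_laughter` are made up for illustration, and a real system would do this at the modeling level rather than with string edits.

```python
import random

# Hypothetical marker for a short burst of laughter; not a documented Rime tag.
LAUGH = "<laugh>"


def place_laughter(sentence, positions):
    """User-placed laughter: put LAUGH before the words at the given
    0-based word indices. An index equal to the word count puts LAUGH
    at the very end of the sentence."""
    words = sentence.split()
    wanted = set(positions)
    out = []
    for i, word in enumerate(words):
        if i in wanted:
            out.append(LAUGH)
        out.append(word)
    if len(words) in wanted:
        out.append(LAUGH)
    return " ".join(out)


def sprinkle_laughter(sentence, rate=0.2, seed=None):
    """Stochastic laughter: each slot between words (plus the start and
    the end) gets LAUGH with probability `rate`. Passing a seed makes
    the random choice repeatable."""
    rng = random.Random(seed)
    slot_count = len(sentence.split()) + 1
    positions = [i for i in range(slot_count) if rng.random() < rate]
    return place_laughter(sentence, positions)


print(place_laughter("I can not believe it", [2]))  # I can <laugh> not believe it
print(sprinkle_laughter("I can not believe it", rate=0.3))  # placement varies per run
```

With `place_laughter`, the same indices always give the same placement, which is what lets a single sentence be rendered several ways for different rhetorical effects. With `sprinkle_laughter`, the placement is left to chance.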
Funny Words
Little exhalations between words, like those above, are relatively easy to model in that the bits of laughter behave much like short function words themselves. A harder and more interesting challenge is modeling words being spoken "in a laughing manner". At Rime, we've developed techniques to model and generate this as well. Listen to the clip below of this author's voice. The ALL CAPS words are spoken with simultaneous laughter.
Different Types of Laughter
There are many types of laughter in speech, but even within a single type, there's variation in length and intensity. Take the generated clip below. I created this with quite a bit of laughter, to convey a guy nearly breathlessly relating a funny story (or as funny as I could make it off the top of my head). Note the variations in length and intensity.
Wrapping Up
At Rime, we're working at the forefront of conversational speech AI, modeling real, true-to-life conversational speech. And our cutting-edge modeling gives the user ultimate control over the output, if they wish. Stay tuned for more updates, on laughter and otherwise!