Introducing Mist v2, the world's fastest conversational text-to-speech model
We are thrilled to announce the release of Mist v2, the latest evolution of Rime's groundbreaking conversational voice model. Building upon the success of the original Mist, which set new standards in synthetic speech, Mist v2 now powers tens of millions of phone calls every month, delivering even faster and more lifelike interactions.
Since its initial launch, Mist has been at the forefront of voice AI technology, offering ultra-fast, human-speed latency and realistic conversational-style voices in a diverse array of accents and voice types. With Mist v2, we've made significant advancements to further enhance performance, realism, and usability across a wide range of applications.
Key Enhancements
- Enhanced Realism: Building upon our extensive proprietary dataset of conversational speech, Mist v2 delivers voices that are more natural and engaging, incorporating subtle nuances like filler words, backchanneling affirmations, and natural breathing patterns.
- Multilingual Support: Mist v2 now supports both English and Spanish, enabling fluid, natural conversations in either language. This makes it an ideal solution for customer support and automated voice agents serving diverse populations.
- Advanced Pronunciation Control: Mist v2 introduces new pronunciation controls, allowing developers to fine-tune speech output for clarity, tone, and emphasis. This is particularly valuable for brand-specific voice applications and domain-specific vocabulary.
- Ultra-Fast On-Prem Latency: With a model latency of just 70 ms when deployed on-premises, Mist v2 is built for real-time interactions, ensuring a smooth and responsive experience for end users without compromising quality.
- Expanded Voice Diversity: We continue to broaden our roster of voices, offering a wider range of accents, demographics, and speaking styles to meet diverse user needs.
Mist v2 is available via API today. Sign up to start building with Mist v2 and experience the future of conversational voice synthesis. Whether you are deploying customer support agents, IVR systems, or other voice-driven applications, Mist v2 provides the speed, realism, and control needed to elevate your AI-powered voice experiences.
Integrating Mist v2 is easy. Just set the modelId parameter in your API request payloads:
import requests

resp = requests.post(
    "https://users.rime.ai/v1/rime-tts",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "text": "I would love to have a conversation with you.",
        "speaker": "ana",
        "modelId": "mistv2",  # this parameter controls which model you hit
    },
)
print(resp.json())  # the JSON response carries base64-encoded audio
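Since the response carries base64-encoded audio, you will usually want to decode it before playback. The following is a minimal sketch, not an official client: the helper name `save_audio` is hypothetical, and the JSON field name `audioContent` is an assumption, so check the actual response body and pass the right key via `field` if it differs.

```python
import base64
from pathlib import Path


def save_audio(payload: dict, out_path: str, field: str = "audioContent") -> int:
    """Decode base64 audio from a parsed JSON response and write it to disk.

    `field` is an assumed key name, not confirmed by this post; adjust it to
    match the response you actually receive. Returns the bytes written.
    """
    if field not in payload:
        raise KeyError(f"no {field!r} key in response; got {sorted(payload)}")
    # validate=True rejects characters outside the base64 alphabet
    audio = base64.b64decode(payload[field], validate=True)
    Path(out_path).write_bytes(audio)
    return len(audio)
```

With the request above, this could be called as `save_audio(resp.json(), "speech.out")`, choosing a file extension that matches whatever audio format the API returns.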
At Rime, we believe that the future of voice technology lies in creating nuanced, curated voices that reflect the specific and unique ways that real people talk. Mist v2 embodies this philosophy, offering voices that are not only realistic but also demographically and stylistically diverse.
Join us in the next step of the evolution of synthetic speech. Contact us to learn more about how Mist v2 can enhance your voice applications.
Experience the future of voice technology with Mist v2.