Bringing Deep Linguistic Expertise to Voice AI

    Rime’s mission is to make voice AI sound and feel human.

    Founded by linguists, engineers, and product experts, Rime blends advanced ML with deep linguistic and sociolinguistic insight to create ultra-realistic voices that breathe, laugh, code-switch, and carry the subtle rhythms of real speech. These voices build trust, empathy, and engagement in every interaction to drive business outcomes.

    Our Story

    Rime was founded in 2022 by Lily Clifford (Stanford NLP PhD dropout), Brooke Larson (PhD linguist, ex-Amazon Alexa), and Ares Geovanos (Stanford engineer, product veteran). The team set out to move beyond robotic, over-polished speech toward genuine, human-like conversation.

    They built an in-house recording studio in San Francisco and captured the biggest proprietary dataset of full-duplex, spontaneous speech, including interruptions, laughter, and vocal disfluencies. That dataset became the foundation for Rime’s models. Backed by Unusual Ventures, Cadenza Capital, Founders You Should Know, and additional angel investors, Rime now powers tens of millions of conversations monthly across industries from food service to healthcare.

    Our Data

    Rime’s proprietary dataset is one of the largest collections of expressive, multilingual conversational speech in the world, captured both in-studio and across diverse locations in the United States. This dataset reflects a wide range of accents, dialects, demographics, and real-world communication patterns.

    Every voice model is trained to handle the practical realities of enterprise communication: brand names, tricky pronunciations, lists, spellings, numbers, IDs, and more. Fine-grained custom pronunciation tools give businesses the ability to control exactly how a word sounds, down to the syllable, to ensure accuracy and brand consistency.

    Our Models

    Rime offers two flagship text-to-speech models:

    • Arcana v2: The most human and expressive TTS available, with 300+ voices (including bilingual and multilingual options), instant code-switching between English, Spanish, and Spanglish, and unmatched realism. Designed for warmth, nuance, and emotional resonance, Arcana v2 makes conversations with AI agents indistinguishable from those with real people.
    • Mist v2: A high-speed, enterprise-grade model delivering latency under 200 ms in the cloud and under 100 ms on-prem. Mist v2 offers extensive customization, streaming APIs, and deployment flexibility, supporting even the highest-volume voice applications.

    Architectural innovations allow for smooth, natural delivery, ultra-low latency, and scalability without compromising quality. Enterprise customers can deploy Rime in the cloud, in a secure VPC, or entirely on-premises, with HIPAA and SOC 2 Type II compliance for regulated industries.

    Our Team

    Rime’s strength comes from its world-class team: linguists who understand the fine details of speech, engineers who optimize for millisecond-level performance, and product builders who know what drives customer engagement. This lean team is united by the belief that voice AI should be as rich, diverse, and expressive as the people it serves.

    Join us!

    We're a small, passionate team based in San Francisco, and we're always looking for curious minds to join us on our journey. If you're excited about pushing the boundaries of voice AI and creating technology that connects with real people, we'd love to hear from you!