Text to Speech Models

15d

Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies

Mistral AI has launched Voxtral Transcribe 2, a new on-device speech-to-text model family featuring real-time transcription, speaker diarization, and open-weights licensing—aimed at cheaper, ...

1d on MSN

Indian AI lab Sarvam’s new models are a major bet on the viability of open-source AI

The new lineup includes 30-billion and 105-billion parameter models; a text-to-speech model; a speech-to-text model; and a vision model to parse documents.

18d

New Apple study shows how grouping similar sounds can speed up AI speech generation

Apple researchers figured out a way to speed up AI speech generation from text without sacrificing audio quality or breaking ...

Geeky Gadgets

OpenAI AI Audio : TTS Speech-to-Text Audio Integrated Agents

OpenAI has introduced a series of AI audio models, fundamentally redefining how voice-based AI can be integrated into modern applications with ChatGPT. These advancements include state-of-the-art ...

11d

‘Pathbreaking AI’: Amitabh Kant Hails Sarvam AI’s Indigenous Text-to-Speech Model, Bulbul V3

Sarvam CEO Pratyush Kumar says Bulbul V3 is designed to generate natural, expressive speech for Indian languages and to hold ...

TechRepublic

OpenAI Gives Its Agents a Voice – Now a ‘Medieval Knight’ Can Read Your Work Emails

The text-to-speech and speech-to-text tools are all based on GPT-4o. OpenAI hinted it may ...

Geeky Gadgets

ChatTTS a new open source AI voice text-to-speech AI model

ChatTTS is an open-source AI voice text-to-speech (TTS) model that has gained significant popularity on GitHub due to its impressive features and user-friendly design. This model is specifically ...

VentureBeat

Meta Introduces Spirit LM open source model that combines text and speech inputs/outputs

Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.

AI Business

Mistral Drops new Speech-to-Text AI Models

French AI startup Mistral has released a pair of new speech-to-text models that aim to set fresh benchmarks for speed, ...

11d

New Apple-backed AI model can generate sound and speech from silent videos

VSSFlow leverages a creative architecture to generate sounds and speech with a single unified system, with state-of-the-art results.
