Mistral AI has launched Voxtral Transcribe 2, a new on-device speech-to-text model family featuring real-time transcription, speaker diarization, and open-weights licensing—aimed at cheaper, ...
The new lineup includes 30-billion and 105-billion parameter models; a text-to-speech model; a speech-to-text model; and a vision model to parse documents.
Apple researchers figured out a way to speed up AI speech generation from text without sacrificing audio quality or breaking ...
OpenAI has introduced a series of AI audio models, fundamentally redefining how voice-based AI can be integrated into modern applications with ChatGPT. These advancements include state-of-the-art ...
Sarvam CEO Pratyush Kumar says Bulbul V3 is designed to generate natural, expressive speech for Indian languages and to hold ...
OpenAI Gives Its Agents a Voice – Now a ‘Medieval Knight’ Can Read Your Work Emails. The text-to-speech and speech-to-text tools are all based on GPT-4o. OpenAI hinted it may ...
ChatTTS is an open-source AI voice text-to-speech (TTS) model that has gained significant popularity on GitHub due to its impressive features and user-friendly design. This model is specifically ...
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
French AI startup Mistral has released a pair of new speech-to-text models that aim to set fresh benchmarks for speed, ...
VSSFlow leverages a creative architecture to generate sounds and speech with a single unified system, with state-of-the-art results.