ElevenLabs launches Scribe, a speech-to-text model with high accuracy across 99 languages. Learn about its features, pricing, and potential impact on industries needing scalable transcription solutions.
ElevenLabs, known for its AI voice cloning and generation capabilities, has introduced Scribe, a speech-to-text model designed to set a new standard in transcription accuracy across multiple languages. But how does Scribe truly perform, and what implications does it hold for businesses and content creators alike?
ElevenLabs has officially launched Scribe v1, positioning it as a leader in speech-to-text conversion. The company reports that Scribe outperforms competitors like Google’s Gemini 2.0 Flash, OpenAI’s Whisper v3, and Deepgram Nova-3 in terms of accuracy. Scribe achieves a 96.7% accuracy rate for English.
Flavio Schneider, lead researcher at ElevenLabs, described Scribe as the “smartest audio understanding model” yet, emphasizing its ability to understand audio context, detect non-verbal cues, and accurately diarize speakers even in challenging audio environments. Scribe can distinguish and isolate up to 32 different speakers in the same audio file, according to ElevenLabs' documentation.
Scribe is engineered to tackle the complexities of real-world audio. Its key features include:
Scribe demonstrates the lowest word error rates (WER) in multiple languages, with reported accuracy of 98.7% for Italian and 96.7% for English, based on benchmark results from FLEURS and Common Voice.
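For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the model's output, divided by the number of reference words. The article does not say how ElevenLabs derives its accuracy percentages; if accuracy is taken as 1 minus WER, 96.7% accuracy would correspond to a WER of about 3.3%. The sketch below is a generic illustration of the metric, not ElevenLabs' evaluation code, and the function name and sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample: one substituted word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}, accuracy (1 - WER): {1 - wer:.3f}")
```

Real benchmarks usually normalize punctuation and casing before scoring, so published figures depend on those choices as well as on the model.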
The pricing is set at $0.40 per hour of input audio, with a 50% discount offered for a limited time. While Scribe is designed for high-accuracy transcription, ElevenLabs is also developing a low-latency version to support real-time applications in the future.
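To put the pricing in concrete terms, here is a rough cost estimator built on the two figures quoted above ($0.40 per hour and a temporary 50% discount). It assumes billing scales linearly with audio duration, which the article does not confirm; the function and constant names are illustrative only.

```python
PRICE_PER_HOUR_USD = 0.40   # list price quoted in the article
LAUNCH_DISCOUNT = 0.50      # limited-time discount quoted in the article


def transcription_cost(audio_seconds: float, discounted: bool = False) -> float:
    """Estimate the cost in USD, assuming linear (pro-rated) billing."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    rate = PRICE_PER_HOUR_USD
    if discounted:
        rate *= 1 - LAUNCH_DISCOUNT
    return round(audio_seconds / 3600 * rate, 4)


if __name__ == "__main__":
    # 100 hours of recordings at list price and at the discounted rate.
    print(transcription_cost(100 * 3600))
    print(transcription_cost(100 * 3600, discounted=True))
```

Under these assumptions, a 100-hour archive would cost about $40 at list price and about $20 while the discount applies.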
To try Scribe, you can visit the ElevenLabs website and access the dashboard. From there, you can upload audio or video files to generate formatted transcripts. For developers, the Speech to Text API allows for seamless integration into existing workflows.
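As a rough idea of what an API integration might look like, the sketch below builds a multipart upload request using only the Python standard library. The endpoint URL, the `xi-api-key` header, and the `model_id`, `diarize`, and `file` form fields are assumptions based on ElevenLabs' public API conventions and are not stated in this article, so check them against the official documentation before relying on them. The helper name `build_transcription_request` is hypothetical.

```python
import urllib.request
import uuid

# Assumed endpoint; verify against the official ElevenLabs API reference.
API_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_transcription_request(
    api_key: str,
    audio: bytes,
    filename: str,
    model_id: str = "scribe_v1",  # assumed model identifier
    diarize: bool = True,         # assumed flag for speaker diarization
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    fields = {"model_id": model_id, "diarize": "true" if diarize else "false"}

    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    # The audio file itself goes in a part named "file" (assumed field name).
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "xi-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen` and parsing the JSON response; the response schema is not described in this article, so it is left out here.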
Scribe presents a scalable and accurate transcription solution for enterprises across various industries. Its capacity to handle multiple languages with precision makes it particularly beneficial for multinational businesses, media companies, and customer support services. The API-based integration facilitates easy adoption into enterprise workflows, and the forthcoming low-latency version could establish Scribe as a viable option for real-time communication tools.
Yet another speech-to-text model can feel like humans inventing a slightly better wheel each year. Still, I must admit, Scribe's accuracy and language support are noteworthy. If it truly delivers on its promises, it could be a game-changer.
Now, imagine this: Scribe integrated into education, instantly transcribing lectures for students with learning differences. Or picture it powering global communication, translating and transcribing conversations in real time, breaking down language barriers like never before. Think about its application in legal settings, providing detailed records of depositions and court proceedings. And don't even get me started on the potential for automated content creation: imagine AI-generated scripts and subtitles, making media accessible to everyone.
The entertainment industry could be upended, with Scribe powering real-time dubbing and subtitling, bringing global content to wider audiences. Healthcare could see a dramatic improvement in record-keeping, with patient-doctor conversations accurately transcribed and analyzed. The possibilities are extensive, and if Scribe lives up to the hype, it might just be the transcription solution we've been waiting for.