
Model Guide

Choose the right speech model for your transcription needs.

MurmurTone uses state-of-the-art AI models for speech recognition. Each model offers different trade-offs between speed, accuracy, and resource usage. This guide helps you pick the best one for your workflow.

Quick Recommendation: Most users should choose the Recommended model. It offers the best balance of accuracy and speed, and works well on most modern computers.

Model Comparison

Model        | Speed   | Accuracy  | RAM     | Best For
Quick        | Fastest | Good      | ~1 GB   | Quick notes, chat messages
Standard     | Fast    | Better    | ~1.5 GB | Casual dictation, everyday use
Professional | Slower  | Excellent | ~5 GB   | Important documents, reports
Studio       | Slowest | Best      | ~10 GB  | Professional transcription, accents
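
As a rough illustration of how the RAM column can drive the choice, the sketch below picks the largest listed model that fits in the memory you have free. The function name and the 1 GB headroom default are assumptions made for this example, not part of MurmurTone; the Recommended model is left out because the table above gives no RAM figure for it.

```python
# Approximate RAM figures copied from the comparison table (in GB).
# The Recommended model is omitted: the table lists no RAM figure for it.
MODEL_RAM_GB = {
    "Quick": 1.0,
    "Standard": 1.5,
    "Professional": 5.0,
    "Studio": 10.0,
}


def largest_model_that_fits(free_ram_gb, headroom_gb=1.0):
    """Return the largest listed model that fits in free RAM, or None.

    headroom_gb is a hypothetical safety margin left for the rest of the
    system; it is not a documented MurmurTone setting.
    """
    budget = free_ram_gb - headroom_gb
    fitting = [name for name, ram in MODEL_RAM_GB.items() if ram <= budget]
    # The dict is written smallest to largest, so the last fit is the largest.
    return fitting[-1] if fitting else None


print(largest_model_that_fits(8.0))   # Professional
print(largest_model_that_fits(16.0))  # Studio
```

Treat the result as an upper bound: a smaller model may still be the better choice when speed matters more than accuracy.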

Which Model Should I Use?

I need fast transcription for quick notes

Use Quick or Standard. These models transcribe almost instantly and are perfect for casual use like chat messages, quick reminders, or informal notes where minor errors are acceptable.

I want the best balance of speed and accuracy

Use Recommended. It handles professional vocabulary well, runs efficiently on most computers, and produces accurate results for emails, documentation, and everyday work.

I need high accuracy for important documents

Use Professional or Studio. These models excel at complex vocabulary, technical terms, and proper nouns. Choose Professional if you have 8+ GB RAM, or Studio if you have a powerful GPU and need the absolute best accuracy.

I work with accents or non-native English speakers

Use Studio. Larger models handle accent variation significantly better. If Studio is too slow, Professional is a good compromise.

I need to translate speech to English

Use Recommended or larger. All models support translation from 90+ languages to English, but larger models produce more accurate translations. Set your Language to the source language (e.g., Spanish) and enable the Translation option in settings.

Technical Specifications

For power users, here are the detailed specifications of each model:

Model        | Parameters | Download Size | WER*  | Languages
Quick        | 39 M       | 75 MB         | ~10%  | 99
Standard     | 74 M       | 145 MB        | ~7%   | 99
Recommended  | 244 M      | 484 MB        | ~5%   | 99
Professional | 769 M      | 1.5 GB        | ~4%   | 99
Studio       | 1.55 B     | 3 GB          | ~2.7% | 99

*WER (Word Error Rate) measured on LibriSpeech clean test set. Lower is better. Real-world accuracy varies based on audio quality, background noise, and vocabulary.
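
To make the WER figures concrete: word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. It is a generic illustration, not MurmurTone's benchmark code, and it does no text normalization (case, punctuation), which real evaluations usually apply first.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the quick brown fox jumps",
                      "the quick brown box jumps"))  # 0.2
```

Read this way, a WER of about 5% corresponds to roughly 50 word errors in a 1,000-word dictation under benchmark conditions.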

GPU Acceleration

If you have an NVIDIA GPU with 4+ GB VRAM, MurmurTone can use it to dramatically speed up transcription.

Set your Processing Mode to "Auto" and MurmurTone will automatically use your GPU if available.
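
The decision that "Auto" makes can be pictured roughly as follows. This is a minimal sketch under stated assumptions: MurmurTone's actual detection logic is not documented here, the GpuInfo type and auto_device function are hypothetical names, and applying the 4 GB threshold as a hard cutoff is an assumption based on the requirement above.

```python
from dataclasses import dataclass
from typing import Optional

MIN_VRAM_GB = 4.0  # the "4+ GB VRAM" figure from the text above


@dataclass
class GpuInfo:
    """Hypothetical description of a detected graphics card."""
    vendor: str
    vram_gb: float


def auto_device(gpu: Optional[GpuInfo]) -> str:
    """Return "gpu" when a qualifying NVIDIA card is present, else "cpu"."""
    if (
        gpu is not None
        and gpu.vendor.lower() == "nvidia"
        and gpu.vram_gb >= MIN_VRAM_GB
    ):
        return "gpu"
    return "cpu"


print(auto_device(GpuInfo("NVIDIA", 8.0)))  # gpu
print(auto_device(None))                    # cpu
```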

Running out of memory? If MurmurTone crashes or becomes unresponsive, try a smaller model. The Recommended model works reliably on most systems with 8 GB RAM.

Downloading Models

The Quick and Standard models are included with MurmurTone. Larger models are downloaded on demand when you select them in Settings.

Download sizes: Recommended 484 MB, Professional 1.5 GB, Studio 3 GB.

Models are cached locally after download, so you only need to download once.
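
The download-once behavior follows a common caching pattern, sketched below. The function name, the file layout, and the injected download callable are all hypothetical; MurmurTone's real cache location and file names are not documented here.

```python
from pathlib import Path
from typing import Callable


def ensure_model(name: str, cache_dir: Path,
                 download: Callable[[str, Path], None]) -> Path:
    """Return the cached path for a model, downloading only on first use."""
    path = cache_dir / f"{name.lower()}.bin"  # hypothetical file layout
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        download(name, path)  # caller supplies the actual transfer
    return path
```

Calling ensure_model twice with the same name and cache directory invokes the download callable only the first time; the second call finds the file on disk and returns immediately.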