Model Guide

Choose the right speech model for your transcription needs.

MurmurTone uses state-of-the-art AI models for speech recognition. Each model offers different trade-offs between speed, accuracy, and resource usage. This guide helps you pick the best one for your workflow.

Quick Recommendation

For most users, the Recommended model is the best choice. It offers the best balance of accuracy and speed, and works well on most modern computers.

Model Comparison

Model	Speed	Accuracy	RAM	Best For
`Quick`	Fastest	Good	~1 GB	Quick notes, chat messages
`Standard`	Fast	Better	~1.5 GB	Casual dictation, everyday use
`Recommended`	Moderate	Great	~2.5 GB	Documentation, emails, general work
`Professional`	Slower	Excellent	~5 GB	Important documents, reports
`Studio`	Slowest	Best	~10 GB	Professional transcription, accents
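If you prefer to pick by available memory, the table can be read as a simple lookup. The following is a minimal sketch, not part of MurmurTone: the function name and the 2 GB headroom reserved for the operating system and other apps are assumptions chosen for illustration, and the RAM figures are the approximate values from the table.

```python
# Approximate RAM needs in GB, copied from the comparison table.
# Ordered from smallest to largest model.
MODEL_RAM_GB = {
    "Quick": 1.0,
    "Standard": 1.5,
    "Recommended": 2.5,
    "Professional": 5.0,
    "Studio": 10.0,
}


def largest_model_that_fits(available_ram_gb, headroom_gb=2.0):
    """Return the largest model whose RAM need fits, or None if none fit.

    headroom_gb is a hypothetical safety margin left for the rest of
    the system; it is an assumption, not a MurmurTone setting.
    """
    budget = available_ram_gb - headroom_gb
    fitting = [name for name, ram in MODEL_RAM_GB.items() if ram <= budget]
    # Dicts keep insertion order, so the last fitting entry is the largest.
    return fitting[-1] if fitting else None


print(largest_model_that_fits(8))   # machine with 8 GB of free RAM
print(largest_model_that_fits(16))  # machine with 16 GB of free RAM
```

A larger model is not always the better choice: if speed matters more than accuracy, a smaller model than the one returned here may suit you better.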

Which Model Should I Use?

I need fast transcription for quick notes

Use Quick or Standard. These models transcribe almost instantly and are perfect for casual use like chat messages, quick reminders, or informal notes where minor errors are acceptable.

I want the best balance of speed and accuracy

Use Recommended, the best fit for most users. It handles professional vocabulary well, runs efficiently on most computers, and produces accurate results for emails, documentation, and everyday work.

I need high accuracy for important documents

Use Professional or Studio. These models excel at complex vocabulary, technical terms, and proper nouns. Choose Professional if you have at least 8 GB of RAM, or Studio if you have a powerful GPU (10+ GB VRAM for GPU acceleration) and need the highest accuracy.

I work with accents or non-native English speakers

Use Studio. Larger models handle accent variation significantly better. If Studio is too slow, Professional is a good compromise.

I need to translate speech to English

Use Recommended or larger. All models support translation from 90+ languages to English, but larger models produce more accurate translations. Set your Language to the source language (e.g., Spanish) and enable the Translation option in settings.

Technical Specifications

For power users, here are the detailed specifications of each model:

Model	Parameters	Download Size	WER*	Languages
`Quick`	39 M	75 MB	~10%	99
`Standard`	74 M	145 MB	~7%	99
`Recommended`	244 M	484 MB	~5%	99
`Professional`	769 M	1.5 GB	~4%	99
`Studio`	1.55 B	3 GB	~2.7%	99

*WER (Word Error Rate) measured on LibriSpeech clean test set. Lower is better. Real-world accuracy varies based on audio quality, background noise, and vocabulary.
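For readers unfamiliar with the metric, WER is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below shows the standard calculation with a word-level edit distance. It assumes simple whitespace splitting with no case or punctuation normalization, which may differ from how the published figures were produced.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four reference words.
print(word_error_rate("the quick brown fox", "the quick brown box"))
```

By this measure, a WER of about 5% means roughly one error for every 20 words spoken.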

GPU Acceleration

If you have an NVIDIA GPU with 4+ GB VRAM, MurmurTone can use it to dramatically speed up transcription:

Quick/Standard: 10-20x faster with GPU
Recommended: 8-15x faster with GPU
Professional: 6-10x faster with GPU (requires 6+ GB VRAM)
Studio: 5-8x faster with GPU (requires 10+ GB VRAM)

Set your Processing Mode to "Auto" and MurmurTone will automatically use your GPU if available.
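To get a feel for what these ranges mean in practice, you can divide a known CPU transcription time by the speedup bounds. This is a rough sketch under stated assumptions: the function and table names are hypothetical, the speedup and VRAM figures are the ones listed above, and real timings depend on your hardware and audio.

```python
# (minimum speedup, maximum speedup, minimum VRAM in GB) per model,
# taken from the GPU list in this guide.
GPU_PROFILE = {
    "Quick": (10, 20, 4),
    "Standard": (10, 20, 4),
    "Recommended": (8, 15, 4),
    "Professional": (6, 10, 6),
    "Studio": (5, 8, 10),
}


def estimate_gpu_seconds(model, cpu_seconds, vram_gb):
    """Return (best_case, worst_case) GPU time in seconds.

    Returns None when vram_gb is below the model's listed minimum,
    meaning GPU acceleration should not be expected for that model.
    """
    low, high, min_vram = GPU_PROFILE[model]
    if vram_gb < min_vram:
        return None
    return (cpu_seconds / high, cpu_seconds / low)


# A clip that takes 80 seconds on CPU with Studio, on a 12 GB GPU.
print(estimate_gpu_seconds("Studio", 80, 12))
```

In this example the estimate falls between 10 and 16 seconds, which illustrates why Studio becomes practical mainly on machines with a capable GPU.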

Running out of memory? If MurmurTone crashes or becomes unresponsive, try a smaller model. The Recommended model works reliably on most systems with 8 GB RAM.

Downloading Models

The Quick and Standard models are included with MurmurTone. Larger models are downloaded on demand the first time you select them in Settings.

Download sizes:

Recommended: ~484 MB
Professional: ~1.5 GB
Studio: ~3 GB

Models are cached locally after download, so you only need to download once.