
Parlotype adds Gemma 4 with five on-device speech models for Windows
Maksim Demin's Parlotype now supports Gemma 4 alongside Whisper, offering five quantized variants that trade off accuracy, speed, and disk use in a Windows .NET app.
Maksim Demin's voice-to-text app Parlotype now supports Gemma 4 as an alternative to Whisper, with five quantized variants available for on-device speech recognition on Windows [devto]. Built with .NET 10 and Avalonia UI, the app lets users pick the model that best fits their hardware and accuracy needs [GitHub].
The integration runs llama-server, shipped as a pre-built Vulkan/CUDA binary, to provide cross-vendor GPU support and an OpenAI-compatible HTTP API [devto]. Demin tested each Gemma 4 variant against Whisper on 50 samples from LibriSpeech test-other, measuring character error rate (CER), speed, and disk footprint. The E2B-it-BF16 model achieved the lowest CER at 5.8%, outperforming Whisper, but it requires 9.6 GiB of storage [GitHub] and was passed over as the default because of that size.
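For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the reference transcript and the model's output, divided by the reference length. The sketch below shows that standard definition. It is not Demin's benchmark code, the function names are hypothetical, and any text normalization (casing, punctuation) applied in the real test is not specified in the article and is left out here.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn hyp into ref (classic dynamic-programming edit distance)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # cost of deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate for a single utterance."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


def corpus_cer(pairs: list[tuple[str, str]]) -> float:
    """CER over many (reference, hypothesis) pairs, weighted by reference length.
    Assumption: pooling edits over all samples; averaging per-sample CER is
    another common convention and gives slightly different numbers."""
    total_edits = sum(levenshtein(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    return total_edits / total_chars
```

Under this definition, a 5.8% CER means roughly 6 character edits per 100 reference characters. Because a hypothesis can contain many extra characters, CER can exceed 100% on badly wrong output.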
Instead, Parlotype ships with E4B Q4_K_M as the default, which balances a 6.1% CER against a 5.9 GiB footprint [GitHub]. The other variants allow further tradeoffs: smaller models run faster on low-end hardware, while higher-precision versions suit users who prioritize accuracy. All models use Gemma 4’s conformer encoder, which improves noise resilience over Whisper’s architecture [devto].
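The size-versus-accuracy reasoning can be illustrated with a small selection sketch. It is a hypothetical illustration, not Parlotype's actual logic: the `Variant` type and `pick_variant` function are invented for this example, and the catalog lists only the two variants whose figures are quoted in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    cer_percent: float  # character error rate reported in the article
    disk_gib: float     # on-disk footprint reported in the article


# Only the two variants with published figures in this article are listed.
CATALOG = [
    Variant("E2B-it-BF16", cer_percent=5.8, disk_gib=9.6),
    Variant("E4B Q4_K_M", cer_percent=6.1, disk_gib=5.9),
]


def pick_variant(catalog: list[Variant], max_disk_gib: float) -> Optional[Variant]:
    """Return the lowest-CER variant that fits the disk budget, or None."""
    fitting = [v for v in catalog if v.disk_gib <= max_disk_gib]
    if not fitting:
        return None
    return min(fitting, key=lambda v: v.cer_percent)
```

With an 8 GiB budget this picks E4B Q4_K_M, and with 16 GiB it picks E2B-it-BF16. A fuller version would also weigh inference speed, which the article mentions but does not quantify.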
The variant catalog reflects a shift toward user-tuned on-device AI, where local performance and resource constraints dictate model choice. By exposing these options directly, Parlotype avoids forcing a single model onto desktops whose hardware varies widely.


