System-Wide Offline Dictation, Correction, and Suggestion Tool

This project provides a powerful, system-wide dictation tool that goes beyond simple speech-to-text. It automatically corrects your dictated text, suggests synonyms to improve your writing, and even includes a hotkey to look up homophones (e.g., “there” vs. “their”) for any word on your screen.

It’s a complete, offline writing assistant built on Vosk and LanguageTool.

Watch a short AI demo: System-wide offline dictation

Key Features

  • Offline & Private: 100% local. No data ever leaves your machine.

  • Dictate, Correct & Enhance: Automatic grammar/spelling correction and synonym suggestions.

  • Conservative RAM Usage: Intelligently manages memory, preloading models only if enough free RAM is available, ensuring other applications (like your PC games) always have priority.

  • Cross-Platform: Works on Linux, macOS, and Windows.

  • Fully Automated: Manages its own LanguageTool server. A single script handles the startup process on Linux/macOS.

  • Blazing Fast: Intelligent caching ensures instant “Listening…” notifications and fast processing.
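
The "Conservative RAM Usage" rule above can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the 2 GiB reserve, and the Linux-only /proc/meminfo reader are assumptions made for illustration. The idea is simply to preload a model only when it fits in free memory with room to spare, and to stay conservative when free memory cannot be determined.

```python
from typing import Optional

GIB = 1024 ** 3


def read_available_ram_bytes(meminfo_path: str = "/proc/meminfo") -> Optional[int]:
    """Return MemAvailable in bytes on Linux, or None if it cannot be read."""
    try:
        with open(meminfo_path, encoding="utf-8") as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    # Line format: "MemAvailable:   12345678 kB"
                    return int(line.split()[1]) * 1024
    except (OSError, ValueError, IndexError):
        pass
    return None


def should_preload(available: Optional[int], model_size: int, reserve: int) -> bool:
    """Preload only if the model fits while leaving `reserve` bytes free."""
    if available is None:  # free RAM unknown: stay conservative
        return False
    return available - model_size >= reserve


if __name__ == "__main__":
    # Hypothetical numbers: a 1.8 GiB model and a 2 GiB reserve for other apps.
    print(should_preload(read_available_ram_bytes(), int(1.8 * GIB), 2 * GIB))
```

Keeping the decision in a pure function (should_preload) separate from the platform-specific reader makes the rule easy to test and to port to macOS or Windows.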

Documentation

For a complete technical reference, including all modules and scripts, please visit our official documentation page. It is automatically generated and always up-to-date.

Go to Documentation >>



Installation

The setup is a two-step process:

  1. Clone this repository to your computer.

  2. Run the one-time setup script for your operating system.

The setup scripts handle everything: system dependencies, Python environment, and downloading the necessary models and tools (~4GB) directly from our GitHub Releases for maximum speed.

For Linux, macOS, and Windows

Open a terminal in the project’s root directory and run the script for your system:

# For Ubuntu/Debian, Manjaro/Arch, macOS, or their derivatives

bash setup/{your-os}_setup.sh

# For Windows, in an administrator PowerShell

.\setup\windows11_setup.ps1

For Windows

  1. Install AutoHotkey v2. This is required for the text-typing watcher.

  2. Run the setup script with administrator privileges (for example, via “Run with PowerShell”).


Usage

1. Start the Services

On Linux & macOS

A single script handles everything. It starts the main dictation service and the file watcher automatically in the background.

# Run this from the project's root directory
./scripts/restart_venv_and_run-server.sh

On Windows

Starting the service is a two-step manual process:

  1. Start the Main Service: Run start_dictation_v2.0.bat, or start the service from the .venv environment with python3.

2. Configure Your Hotkey

To trigger dictation, you need a global hotkey that creates a specific file. We highly recommend the cross-platform tool CopyQ.
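
As a rough illustration of the trigger-file idea, here is a minimal Python sketch. It is an assumption about how such a mechanism can work, not the project's implementation: the function names are hypothetical, and whether the real service deletes the trigger file after noticing it is not stated here. The hotkey side only has to create the file; the listening side polls for it and consumes it so that each key press fires once.

```python
import tempfile
import time
from pathlib import Path


def fire_trigger(trigger: Path) -> None:
    """Create the trigger file (what `touch` does on Linux/macOS)."""
    trigger.parent.mkdir(parents=True, exist_ok=True)
    trigger.touch()


def consume_trigger(trigger: Path) -> bool:
    """Return True once per trigger: delete the file if it exists."""
    try:
        trigger.unlink()
        return True
    except FileNotFoundError:
        return False


def wait_for_trigger(trigger: Path, timeout: float, poll: float = 0.05) -> bool:
    """Poll until the trigger appears or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if consume_trigger(trigger):
            return True
        time.sleep(poll)
    return False


if __name__ == "__main__":
    # A temporary directory keeps the demo away from the real trigger path.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "sl5_record.trigger"
        fire_trigger(demo)
        print(wait_for_trigger(demo, timeout=1.0))
```

A file-based trigger keeps the hotkey tool and the service fully decoupled: any program that can create a file can start dictation.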

Our Recommendation: CopyQ

Create a new command in CopyQ with a global shortcut.

Command for Linux/macOS:

touch /tmp/sl5_record.trigger

Command for Windows (using CopyQ):

copyq:
var filePath = 'c:/tmp/sl5_record.trigger';

var f = File(filePath);

if (f.openAppend()) {
    f.close();
} else {
    popup(
        'error',
        'cannot open:\n' + filePath
        + '\n' + f.errorString()
    );
}

Command for Windows (using AutoHotkey):

; trigger-hotkeys.ahk
; AutoHotkey v2 script
#SingleInstance Force ; Ensures that only one instance of the script is running

;===================================================================
; Hotkeys for firing the STT trigger
; Press F9, F10, or F11 to write the trigger file.
;===================================================================
f9::
f10::
f11::
{
    local TriggerFile := "c:\tmp\sl5_record.trigger"
    FileAppend("t", TriggerFile)
    ToolTip("STT trigger fired!")
    SetTimer(() => ToolTip(), -1500) ; Hide the tooltip after 1.5 seconds
}

3. Start Dictating!

Click in any text field, press your hotkey, and a “Listening…” notification will appear. Speak clearly, then pause. The corrected text will be typed for you.
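
To give a feel for the correction step, here is a minimal sketch of how suggestions from a LanguageTool server could be applied to recognized text. A LanguageTool /v2/check response contains a "matches" list whose entries carry an offset, a length, and a list of replacements. The function name and the "always take the first suggestion" policy are assumptions for illustration; the project's own correction logic may differ.

```python
from typing import Any, Dict, List


def apply_first_suggestions(text: str, matches: List[Dict[str, Any]]) -> str:
    """Replace each flagged span with its first suggested replacement."""
    # Work from the end of the text so earlier offsets stay valid.
    for match in sorted(matches, key=lambda m: m["offset"], reverse=True):
        replacements = match.get("replacements") or []
        if not replacements:
            continue  # nothing suggested: leave the span untouched
        start = match["offset"]
        end = start + match["length"]
        text = text[:start] + replacements[0]["value"] + text[end:]
    return text


# Hand-written sample in the shape of the "matches" list (not real server output).
sample_matches = [
    {"offset": 0, "length": 4, "replacements": [{"value": "This"}]},
    {"offset": 8, "length": 1, "replacements": [{"value": "a"}]},
    {"offset": 10, "length": 4, "replacements": []},
]
print(apply_first_suggestions("this is e test", sample_matches))  # This is a test
```

Applying the edits back to front avoids recomputing offsets after each replacement changes the length of the text.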


Advanced Configuration (Optional)

You can customize the application’s behavior by creating a local settings file.

  1. Navigate to the config/ directory.

  2. Create a copy of settings_local.py_Example.txt and rename it to settings_local.py.

  3. Edit settings_local.py to override any setting from the main config/settings.py file.

This settings_local.py file is ignored by Git, so your personal changes won’t be overwritten by updates.
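
The override pattern described above can be sketched as follows. This is an illustration only: how the project actually loads settings_local.py is not shown here, and the setting names LANGUAGE and PRELOAD_MODELS are hypothetical. The point is that local values are laid over the defaults, so you only list the settings you want to change.

```python
from types import SimpleNamespace
from typing import Any, Dict


def merge_settings(defaults: Dict[str, Any], local: Dict[str, Any]) -> SimpleNamespace:
    """Overlay local values on defaults; names starting with "_" are skipped."""
    merged = dict(defaults)
    merged.update({k: v for k, v in local.items() if not k.startswith("_")})
    return SimpleNamespace(**merged)


# Hypothetical setting names, for illustration only.
defaults = {"LANGUAGE": "en-US", "PRELOAD_MODELS": True}
local = {"LANGUAGE": "de-DE", "_scratch": 1}

settings = merge_settings(defaults, local)
print(settings.LANGUAGE, settings.PRELOAD_MODELS)  # de-DE True
```

Settings that are absent from the local file keep their default values, which is why an update to config/settings.py does not clobber personal changes.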

Key Scripts for Windows Users

Here is a list of the most important scripts to set up, update, and run the application on a Windows system.

Setup & Update

  • setup/setup.bat: The main script for the initial one-time setup of the environment.

  • Alternatively, run: powershell -Command "Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process -Force; .\setup\windows11_setup.ps1"

  • update.bat: Run this from the project folder to get the latest code and dependencies.

Running the Application

  • start_dictation_v2.0.bat: A primary script to start the dictation service.

Core & Helper Scripts

  • dictation_service.py: The core Python service (usually started by one of the scripts above).

  • get_suggestions.py: A helper script for specific functionalities.

  • type_watcher.ahk: The AutoHotkey script that listens for recognized text and types it out system-wide.

Command used to generate this script list:

{
  find . -maxdepth 1 -type f \( -name "dictation_service.py" -o -name "get_suggestions.py" \)
  find . -path "./.venv" -prune -o -path "./.env" -prune -o -path "./backup" -prune \
    -o -path "./LanguageTool-6.6" -prune \
    -o -type f \( -name "*.bat" -o -name "*.ahk" -o -name "*.ps1" \) -print \
    | grep -vE "make.bat|notification_watcher.ahk"
}

For a graphical view of what is going on behind the scenes, generate a dependency graph:

pydeps -v -o dependencies.svg scripts/py/func/main.py

Models Used:

Recommendation: Download the models from the mirror at https://github.com/sl5net/Vosk-System-Listener/releases/tag/v0.2.0.1 (probably faster).

The zipped models must be saved in the models/ folder:

mv vosk-model-*.zip models/

| Model | Size | Word error rate/Speed | Notes | License |
|---|---|---|---|---|
| vosk-model-en-us-0.22 | 1.8G | 5.69 (librispeech test-clean); 6.05 (tedlium); 29.78 (callcenter) | Accurate generic US English model | Apache 2.0 |
| vosk-model-de-0.21 | 1.9G | 9.83 (Tuda-de test); 24.00 (podcast); 12.82 (cv-test); 12.42 (mls); 33.26 (mtedx) | Big German model for telephony and server | Apache 2.0 |

This table provides an overview of different Vosk models, including their size, word error rate or speed, notes, and license information.

License of LanguageTool: GNU Lesser General Public License (LGPL) v2.1 or later


Support the Project

If you find this tool useful, please consider buying us a coffee! Your support helps fuel future improvements.

ko-fi