System-Wide Offline Voice to Commands or Text, Pluggable System

Quick Start

  1. Download or clone this repository

  2. Run the setup script for your OS (see setup/ folder):

    • Linux (Arch/Manjaro): bash setup/manjaro_arch_setup.sh (🧩 see also docs/LINUX_WAYLAND_dotool)

    • Linux (Ubuntu/Debian): bash setup/ubuntu_setup.sh

    • Linux (openSUSE): bash setup/suse_setup.sh

    • macOS: bash setup/macos_setup.sh

    • Windows: setup/windows11_setup_with_ahk_copyq.bat

  3. Start Aura: ./scripts/restart_venv_and_run-server.sh

  4. Press your hotkey and speak. See the full guide for details.

⚠️ System Requirements & Compatibility

  • Windows: ✅ Fully supported (uses AutoHotkey/PowerShell).

  • macOS: ✅ Fully supported (uses AppleScript).

  • Linux (X11/Xorg): ✅ Fully supported.

  • Linux (Wayland): ✅ Fully supported (tested on KDE Plasma 6 / Wayland).

  • Linux (CachyOS / Arch-based rolling release): ✅ Fully supported. Requires mimalloc (sudo pacman -S mimalloc) due to glibc 2.43 compatibility.

Welcome to SL5 Aura Service! This document provides a quick overview of our key features and their operating system compatibility.

Aura isn't just a transcriber; it's a powerful, offline processing engine that transforms your voice into precise actions and text.

It's a complete, offline voice assistant built on Vosk (for Speech-to-Text) and LanguageTool (for Grammar/Style), now featuring an optional Local LLM (Ollama) Fallback for creative responses and advanced fuzzy matching. It is designed for ultimate customization through a pluggable rule system and a dynamic scripting engine.

Translations: This document also exists in other languages.

Note: Many texts are machine-generated translations of the original English documentation and are intended for general guidance only. In case of discrepancies or ambiguities, the English version always prevails. We welcome help from the community to improve this translation!

📺 Terminal Demo

Terminal Demo

🎥 Video Tutorial

SL5 Aura (v0.16.1): HowTo crash SL5 Aura?

(Alternative link: skipvids.com)

Key Features

  • Offline & Private: 100% local. No data ever leaves your machine.

  • Dynamic Scripting Engine: Go beyond text replacement. Rules can execute custom Python scripts (on_match_exec) to perform advanced actions like calling APIs (e.g., search Wikipedia), interacting with files (e.g., manage a to-do list), or generating dynamic content (e.g., a context-aware email greeting).

  • Context-Aware Rules: Restrict rules to specific applications. Using only_in_windows, you can ensure a rule only triggers if a specific window title (e.g., "Terminal", "VS Code" or "Browser") is active. This works cross-platform (Linux, Windows, macOS).

  • High-Control Transformation Engine: Implements a configuration-driven, highly customizable processing pipeline. Rule priority, command detection, and text transformations are determined purely by the sequential order of rules in the Fuzzy Maps, requiring configuration, not coding.

  • Conservative RAM Usage: Intelligently manages memory, preloading models only if enough free RAM is available, ensuring other applications (like your PC games) always have priority.

  • Cross-Platform: Works on Linux, macOS, and Windows.

  • Fully Automated: Manages its own LanguageTool server (you can also use an external one).

  • Blazing Fast: Intelligent caching ensures instant "Listening…" notifications and fast processing.
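
As a rough illustration of the context-aware rules described above, the following sketch checks a rule's only_in_windows list against the active window title. Only the key name only_in_windows comes from this document; the dictionary layout, the helper names, and the example rules are assumptions made for illustration and do not show Aura's actual rule format.

```python
import re

# Hypothetical rule layout; only the key name `only_in_windows`
# is taken from the feature description.
RULES = [
    {"pattern": r"^git status$", "replacement": "git status\n",
     "only_in_windows": ["Terminal", "VS Code"]},
    {"pattern": r"^new paragraph$", "replacement": "\n\n"},  # unrestricted
]


def rule_applies(rule, active_window_title):
    """Return True if the rule may fire for the currently active window."""
    allowed = rule.get("only_in_windows")
    if not allowed:
        return True  # rules without a restriction apply everywhere
    return any(re.search(p, active_window_title, re.IGNORECASE) for p in allowed)


def first_match(text, active_window_title, rules=RULES):
    """Return the replacement of the first applicable matching rule, else None."""
    for rule in rules:
        if rule_applies(rule, active_window_title) and re.search(rule["pattern"], text):
            return rule["replacement"]
    return None
```

With these example rules, the phrase "git status" is only turned into a command while a window whose title contains "Terminal" or "VS Code" is active; in any other window it is left for later processing.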

Documentation

For a complete technical reference, including all modules and scripts, please visit our official documentation page. It is automatically generated and always up-to-date.

Go to Documentation >>

Build Status

Linux Manjaro | Linux Ubuntu | Linux Suse | macOS | Windows 11

Documentation

Read this in other languages:

🇬🇧 English | 🇸🇦 العربية | 🇩🇪 Deutsch | 🇪🇸 Español | 🇫🇷 Français | 🇮🇳 हिन्दी | 🇯🇵 日本語 | 🇰🇷 한국어 | 🇵🇱 Polski | 🇵🇹 Português | 🇧🇷 Português Brasil | 🇨🇳 简体中文


Installation

The setup is a two-step process:

  1. Download the latest release or the master branch ( https://github.com/sl5net/SL5-aura-service/archive/master.zip ), or clone this repository to your computer.

  2. Run the one-time setup script for your operating system.

The setup scripts handle everything: system dependencies, Python environment, and downloading the necessary models and tools (~4GB) directly from our GitHub Releases for maximum speed.

For Linux, macOS, and Windows (with Optional Language Exclusion)

To save disk space and bandwidth, you can exclude specific language models (de, en) or all optional models (all) during setup. Core components (LanguageTool, lid.176) are always included.

Open a terminal in the project's root directory and run the script for your system:

# For Ubuntu/Debian, Manjaro/Arch, macOS, or other derivatives
# (Note: Use bash or sh to execute the setup script)

bash setup/{your-os}_setup.sh [OPTION]

# For Arch-based systems (Manjaro, CachyOS, EndeavourOS, etc.):
bash setup/manjaro_arch_setup.sh

# On CachyOS / Arch-based rolling releases, also install mimalloc:
sudo pacman -S mimalloc


# Examples:
# Install everything (Default):
# bash setup/manjaro_arch_setup.sh

# Exclude German models:
# bash setup/manjaro_arch_setup.sh exclude=de

# Exclude all VOSK language models:
# bash setup/manjaro_arch_setup.sh exclude=all

# For Windows, in an administrator PowerShell session

setup/windows11_setup.ps1 -Exclude [OPTION]

# Examples:
# Install everything (Default):
# setup/windows11_setup.ps1

# Exclude English models:
# setup/windows11_setup.ps1 -Exclude "en"

# Exclude German and English models:
# setup/windows11_setup.ps1 -Exclude "de,en"

# Or (recommended): start the BAT file:
windows11_setup.bat -Exclude "en"
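
To make the meaning of the exclude option concrete, here is a minimal Python sketch of how such a value could be resolved into a download list. It is not taken from the setup scripts: the function name, the mapping from language codes to model names, and the error handling are assumptions; only the option values (de, en, all), the core components, and the model names appear in this document.

```python
# Hypothetical mapping from language code to model; the real setup
# scripts define their own lists.
OPTIONAL_MODELS = {"de": "vosk-model-de-0.21", "en": "vosk-model-en-us-0.22"}
CORE_COMPONENTS = ["LanguageTool", "lid.176"]  # always installed


def resolve_downloads(exclude=""):
    """Return the components to install for a given exclude value.

    `exclude` is "" (install everything), "all", or a comma-separated
    list of language codes such as "de" or "de,en".
    """
    codes = {c.strip().lower() for c in exclude.split(",") if c.strip()}
    if "all" in codes:
        return list(CORE_COMPONENTS)
    unknown = codes - set(OPTIONAL_MODELS)
    if unknown:
        raise ValueError(f"unknown language code(s): {sorted(unknown)}")
    models = [name for code, name in OPTIONAL_MODELS.items() if code not in codes]
    return CORE_COMPONENTS + models
```

Under these assumptions, resolve_downloads("de") keeps the core components plus the English model, and resolve_downloads("all") keeps only the core components.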

For Windows

Run the setup script with administrator privileges.

Install a helper tool such as CopyQ or AutoHotkey v2. This is required for the text-typing watcher.

The installation is fully automated and takes about 8-10 minutes on a fresh system when two models are installed.

  1. Navigate to the setup folder.

  2. Double-click on windows11_setup_with_ahk_copyq.bat.

    • The script will automatically prompt for Administrator privileges.

    • It installs the Core System, Language Models, AutoHotkey v2, and CopyQ.

  3. Once the installation is complete, Aura Dictation will launch automatically.

Note: You do not need to install Python or Git beforehand; the script handles everything.


Advanced / Custom Installation

If you prefer not to install the client tools (AHK/CopyQ) or want to save disk space by excluding specific languages, you can run the core script via the command line:

# Core Setup only (No AHK, No CopyQ)
setup/windows11_setup.bat

# Exclude specific language models (saves space):
# Exclude English:
setup/windows11_setup.bat -Exclude "en"

# Exclude German and English:
setup/windows11_setup.bat -Exclude "de,en"

Usage

1. Start the Services

On Linux & macOS

A single script handles everything. It starts the main dictation service and the file watcher automatically in the background.

# Run this from the project's root directory
./scripts/restart_venv_and_run-server.sh

On Windows

Starting the service is a manual process:

  1. Start the main service: run start_aura.bat, or start the service with python3 from the activated .venv.

2. Configure Your Hotkey

To trigger dictation, you need a global hotkey that creates a specific file. We highly recommend the cross-platform tool CopyQ.

Our Recommendation: CopyQ

Create a new command in CopyQ with a global shortcut.

Command for Linux/macOS:

touch /tmp/sl5_record.trigger

Command for Windows when using CopyQ:

copyq:
var filePath = 'c:/tmp/sl5_record.trigger';

var f = File(filePath);

// Opening in append mode creates the trigger file if it does not exist yet.
if (f.openAppend()) {
    f.close();
} else {
    popup(
        'error',
        'cannot open:\n' + filePath
        + '\n' + f.errorString()
    );
}

Command for Windows when using AutoHotkey:

; trigger-hotkeys.ahk
; AutoHotkey v2 script
#SingleInstance Force ; Ensures that only one instance of the script is running

;===================================================================
; Hotkeys that fire the Aura trigger
; Press F9, F10, or F11 to write the trigger file.
;===================================================================
f9::
f10::
f11::
{
    local TriggerFile := "c:\tmp\sl5_record.trigger"
    FileAppend("t", TriggerFile)
    ToolTip("Aura trigger fired!")
    SetTimer(() => ToolTip(), -1500) ; hide the tooltip after 1.5 seconds
}
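
All hotkey variants above do the same thing: they create a trigger file that the service then notices. How Aura detects the file internally is not described here; the sketch below simply assumes polling and removing the file, so that each hotkey press is handled once. The function names and the polling interval are illustrative assumptions.

```python
import time
from pathlib import Path

# Path used by the Linux/macOS hotkey command in this section.
TRIGGER_FILE = Path("/tmp/sl5_record.trigger")


def consume_trigger(trigger=TRIGGER_FILE):
    """Return True once per hotkey press: if the trigger exists, remove it."""
    try:
        trigger.unlink()
        return True
    except FileNotFoundError:
        return False


def wait_for_trigger(trigger=TRIGGER_FILE, poll_seconds=0.1, timeout=None):
    """Poll until the trigger file appears; return False on timeout."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if consume_trigger(trigger):
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

Deleting the file as part of the check keeps a single key press from being counted twice, and a file-based trigger works with any hotkey tool that can create a file.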

3. Start Dictating!

Click in any text field, press your hotkey, and a "Listening…" notification will appear. Speak clearly, then pause. The corrected text will be typed for you.


Advanced Configuration (Optional)

You can customize the application's behavior by creating a local settings file.

  1. Navigate to the config/ directory.

  2. Create a copy of config/settings_local.py_Example.txt and rename it to config/settings_local.py.

  3. Edit config/settings_local.py (it overrides any setting from the main config/settings.py file).

The config/settings_local.py file is likely ignored by Git (check .gitignore to be sure), so your personal changes should not be overwritten by updates.
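
The override order can be pictured as a simple overlay: values from the local file replace the defaults, and everything else is inherited. The following is only a sketch of that idea; the setting names are invented for illustration and the way Aura actually loads config/settings.py and config/settings_local.py may differ.

```python
from types import SimpleNamespace


def merge_settings(defaults, local_overrides):
    """Overlay local values on the defaults; local values win on conflict."""
    merged = dict(defaults)
    merged.update(local_overrides)
    return SimpleNamespace(**merged)


# Hypothetical setting names, used only to show the override order.
defaults = {"PRELOAD_MODELS": ["de", "en"], "NOTIFICATION_LEVEL": 1}
local = {"PRELOAD_MODELS": ["en"]}

settings = merge_settings(defaults, local)
print(settings.PRELOAD_MODELS)      # ['en']
print(settings.NOTIFICATION_LEVEL)  # 1
```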

Plug-in Structure and Logic

The system's modularity allows for robust extension via the plugins/ directory.

The processing engine strictly adheres to a Hierarchical Priority Chain:

  1. Module Loading Order (High Priority): Rules loaded from core language packs (de-DE, en-US) take precedence over rules loaded from the plugins/ directory (which load last alphabetically).

  2. In-File Order (Micro Priority): Within any given map file (FUZZY_MAP_pre.py), rules are processed strictly by line number (top-to-bottom).

This architecture ensures that core system rules are protected, while project-specific or context-aware rules (like those for CodeIgniter or game controls) can be easily added as low-priority extensions via plug-ins.
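
The two priority levels can be sketched as a plain ordering problem: first order the rule sources, then keep each file's own top-to-bottom order. The helper names and the example data below are assumptions made for illustration; they are not Aura's loader.

```python
def order_rule_sources(core_packs, plugin_dirs):
    """Core language packs first, then plugins in alphabetical order."""
    return list(core_packs) + sorted(plugin_dirs)


def flatten_rules(sources, rules_by_source):
    """Concatenate rules source by source, keeping each file's own order."""
    ordered = []
    for source in sources:
        ordered.extend(rules_by_source.get(source, []))
    return ordered


# Hypothetical rule names; plugin names are taken from the plugin list in this README.
sources = order_rule_sources(["de-DE", "en-US"], ["wannweil", "git", "poker"])
rules = flatten_rules(sources, {
    "en-US": ["core rule 1", "core rule 2"],
    "git": ["git rule 1"],
    "wannweil": ["wannweil rule 1"],
})
```

In this example the final rule list starts with the en-US core rules, followed by the git plugin and then the wannweil plugin, so a plugin rule can only act after the core rules have had their turn.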

Key Scripts for Windows Users

Here is a list of the most important scripts to set up, update, and run the application on a Windows system.

Setup & Update

  • update.sh (for a Unix-style shell): chmod +x update.sh; ./update.sh

  • setup/setup.bat: The main script for the initial one-time setup of the environment.

  • Alternatively, run: powershell -Command "Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process -Force; .\setup\windows11_setup.ps1"

  • update.bat: Run this from the project folder to get the latest code and dependencies.

Running the Application

  • start_aura.bat: A primary script to start the dictation service.

Core & Helper Scripts

  • aura_engine.py: The core Python service (usually started by one of the scripts above).

  • get_suggestions.py: A helper script for specific functionalities.

🚀 Key Features & OS Compatibility

Legend for OS Compatibility:

  • 🐧 Linux (e.g., Arch, Ubuntu)

  • 🍏 macOS

  • 🪟 Windows

  • 📱 Android (for mobile-specific features)


Core Speech-to-Text (Aura) Engine

Our primary engine for offline speech recognition and audio processing.

Aura-Core/ 🐧 🍏 🪟
├─ aura_engine.py (Main Python service orchestrating Aura) 🐧 🍏 🪟
├┬ Live Hot-Reload (Config & Maps) 🐧 🍏 🪟
│├ Secure Private Map Loading (Integrity-First) 🔒 🐧 🍏 🪟
││ * Workflow: Loads password-protected ZIP archives.
│├ Text Processing & Correction/ Grouped by Language (e.g. de-DE, en-US, …)
│├ 1. normalize_punctuation.py (Standardizes punctuation post-transcription) 🐧 🍏 🪟
│├ 2. Intelligent Pre-Correction (FuzzyMap Pre - The Primary Command Layer) 🐧 🍏 🪟
││ * Dynamic Script Execution: Rules can trigger custom Python scripts (on_match_exec) to perform advanced actions like API calls, file I/O, or generate dynamic responses.
││ * Cascading Execution: Rules are processed sequentially and their effects are cumulative. Later rules apply to text modified by earlier rules.
││ * Highest Priority Stop Criterion: If a rule achieves a Full Match (^…$), the entire processing pipeline for that token stops immediately. This mechanism is critical for implementing reliable voice commands.
│├ 3. correct_text_by_languagetool.py (Integrates LanguageTool for grammar/style correction) 🐧 🍏 🪟
│├ 4. Hierarchical RegEx-Rule-Engine with Ollama AI Fallback 🐧 🍏 🪟
││ * Deterministic Control: Uses RegEx-Rule-Engine for precise, high-priority command and text control.
││ * Ollama AI (Local LLM) Fallback: Serves as an optional, low-priority check for creative answers, Q&A, and advanced Fuzzy Matching when no deterministic rule is met.
││ * Status: Local LLM integration.
│└ 5. Intelligent Post-Correction (FuzzyMap): Post-LT Refinement 🐧 🍏 🪟
│  * Applied after LanguageTool to correct LT-specific outputs. Follows the same strict cascading priority logic as the Pre-Correction layer.
│  * Dynamic Script Execution: Rules can trigger custom Python scripts (on_match_exec) to perform advanced actions like API calls, file I/O, or generate dynamic responses.
│  * Fuzzy Fallback: The Fuzzy Similarity Check (controlled by a threshold, e.g., 85%) acts as the lowest priority error-correction layer. It is only executed if the entire preceding deterministic/cascading rule run failed to find a match (current_rule_matched is False), optimizing performance by avoiding slow fuzzy checks whenever possible.
├┬ Model Management/
│├─ prioritize_model.py (Optimizes model loading/unloading based on usage) 🐧 🍏 🪟
│└─ setup_initial_model.py (Configures the first-time model setup) 🐧 🍏 🪟
├─ Adaptive VAD Timeout 🐧 🍏 🪟
├─ Adaptive Hotkey (Start/Stop) 🐧 🍏 🪟
└─ Instant Language Switching (Experimental via model preloading) 🐧 🍏

SystemUtilities/
├┬ LanguageTool Server Management/
│├─ start_languagetool_server.py (Initializes the local LanguageTool server) 🐧 🍏 🪟
│└─ stop_languagetool_server.py (Shuts down the LanguageTool server) 🐧 🍏
└─ monitor_mic.sh (e.g. for use with a headset, without keyboard or monitor) 🐧 🍏 🪟
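
The cascading logic listed under the Pre- and Post-Correction layers (sequential rules with cumulative effects, an immediate stop on a full ^…$ match, and a fuzzy check only when nothing matched) can be sketched as follows. This is a simplified model under stated assumptions: rules are plain (pattern, replacement) pairs, a "full match" is recognised by the pattern's anchors, and difflib stands in for whatever similarity measure Aura actually uses.

```python
import re
from difflib import SequenceMatcher


def apply_rules(text, rules, fuzzy_targets=(), threshold=0.85):
    """Apply (pattern, replacement) rules top to bottom.

    - Effects are cumulative: later rules see the text left by earlier ones.
    - A matching rule whose pattern is anchored as ^...$ stops the run.
    - Only if no rule matched at all is the fuzzy fallback tried.
    """
    rule_matched = False
    for pattern, replacement in rules:
        if not re.search(pattern, text, re.IGNORECASE):
            continue
        rule_matched = True
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
        if pattern.startswith("^") and pattern.endswith("$"):
            return text  # full match: stop the pipeline immediately
    if not rule_matched:
        for target in fuzzy_targets:
            ratio = SequenceMatcher(None, text.lower(), target.lower()).ratio()
            if ratio >= threshold:
                return target
    return text
```

Because the fuzzy comparison only runs when every deterministic rule has failed, the slower similarity check is skipped for the common case where a regular expression already handled the input.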

Model & Package Management

Tools for robust handling of large language models.

ModelManagement/ 🐧 🍏 🪟
├─ Robust Model Downloader (GitHub Release chunks) 🐧 🍏 🪟
├─ split_and_hash.py (Utility for repo owners to split large files and generate checksums) 🐧 🍏 🪟
└─ download_all_packages.py (Tool for end-users to download, verify, and reassemble multi-part files) 🐧 🍏 🪟
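
The split / verify / reassemble workflow behind these two tools can be sketched with the standard library alone. This is not the code of split_and_hash.py or download_all_packages.py: the part naming scheme, the use of SHA-256, and the function names are assumptions chosen for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large models do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(path, chunk_size):
    """Write numbered .partNNN files next to `path`; return (parts, checksum)."""
    path = Path(path)
    parts = []
    with open(path, "rb") as fh:
        index = 0
        while chunk := fh.read(chunk_size):
            part = path.with_name(f"{path.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts, sha256_of(path)


def reassemble(parts, target, expected_sha256):
    """Concatenate the parts into `target` and verify the checksum."""
    target = Path(target)
    with open(target, "wb") as out:
        for part in sorted(parts):  # zero-padded names sort in order
            out.write(Path(part).read_bytes())
    if sha256_of(target) != expected_sha256:
        target.unlink()
        raise ValueError("checksum mismatch after reassembly")
    return target
```

Verifying the checksum of the reassembled file, rather than trusting each download, catches both corrupted parts and parts joined in the wrong order.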

Development & Deployment Helpers

Scripts for environment setup, testing, and service execution.

Tip: glogg lets you search your log files for interesting events using regular expressions. During installation, tick the checkbox that associates it with log files; afterwards you can double-click log/aura_engine.log to open it.
https://glogg.bonnefon.org/

Tip: After defining your regex patterns, run python3 tools/map_tagger.py to automatically generate searchable examples for the CLI tools. See Map Maintenance Tools for details.

DevHelpers/
├┬ Virtual Environment Management/
│├ scripts/restart_venv_and_run-server.sh (Linux/macOS) 🐧 🍏
│└ scripts/restart_venv_and_run-server.ahk (Windows) 🪟
├┬ System-wide Dictation Integration/
│├ Vosk-System-Listener Integration 🐧 🍏 🪟
│├ scripts/monitor_mic.sh (Linux-specific microphone monitoring) 🐧
│└ scripts/type_watcher.ahk (AutoHotkey listens for recognized text and types it out system-wide) 🪟
└┬ CI/CD Automation/
 └ Expanded GitHub Workflows (Installation, testing, docs deployment) 🐧 🍏 🪟 (Runs on GitHub Actions)

Upcoming / Experimental Features

Features currently under development or in draft status.

ExperimentalFeatures/
├─ ENTER_AFTER_DICTATION_REGEX Example activation rule "(ExampleAplicationThatNotExist|Pi, your personal AI)" 🐧
├┬ Plugins
│╰┬ Live Lazy-Reload (*) 🐧 🍏 🪟
│ │ (Changes to plugin activation/deactivation, and to plugin configurations, are applied on the next processing run without a service restart.)
│ ├ git commands (Voice control for sending git commands) 🐧 🍏 🪟
│ ├ wannweil (Map for the location Wannweil, Germany) 🐧 🍏 🪟
│ ├ Poker Plugin (Draft) (Voice control for poker applications) 🐧 🍏 🪟
│ └ 0 A.D. Plugin (Draft) (Voice control for the game 0 A.D.) 🐧
├─ Sound Output when Starting or Ending a Session (Description pending) 🐧
├─ Speech Output for Visually Impaired (Description pending) 🐧 🍏 🪟
└─ SL5 Aura Android Prototype (Not fully offline yet) 📱
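
The Live Lazy-Reload entry above says that changes are picked up on the next processing run without a restart. One common way to get that behaviour is to compare a file's modification time before each run and re-read it only when it has changed. The class below is a sketch of that general technique, not Aura's implementation; the class name and the loader callback are assumptions.

```python
import os


class LazyReloader:
    """Re-read a file only when its modification time has changed."""

    def __init__(self, path, loader):
        self.path = path
        self.loader = loader  # callable(path) -> parsed content
        self._mtime = None
        self._value = None

    def get(self):
        """Return the cached content, reloading it first if the file changed."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            self._value = self.loader(self.path)
            self._mtime = mtime
        return self._value
```

Calling get() at the start of each processing run costs one stat call when nothing changed, which is why this pattern suits configuration and map files that are edited rarely but read often.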


(Note: Specific Linux distributions like Arch (ARL) or Ubuntu (UBT) are covered by the general Linux 🐧 symbol. Detailed distinctions might be covered in installation guides.)

Command used to generate this script list:
{ find . -maxdepth 1 -type f \( -name "aura_engine.py" -o -name "get_suggestions.py" \) ; find . -path "./.venv" -prune -o -path "./.env" -prune -o -path "./backup" -prune -o -path "./LanguageTool-6.6" -prune -o -type f \( -name "*.bat" -o -name "*.ahk" -o -name "*.ps1" \) -print | grep -vE "make.bat|notification_watcher.ahk"; }

A graphical look at what is behind it:

yappi_call_graph

pydeps -v -o dependencies.svg scripts/py/func/main.py

Used Models:

Recommendation: use the models from the mirror at https://github.com/sl5net/SL5-aura-service/releases/tag/v0.2.0.1 (probably faster).

The zipped models must be saved in the models/ folder:

mv vosk-model-*.zip models/

| Model | Size | Word error rate/Speed | Notes | License |
|---|---|---|---|---|
| vosk-model-en-us-0.22 | 1.8G | 5.69 (librispeech test-clean); 6.05 (tedlium); 29.78 (callcenter) | Accurate generic US English model | Apache 2.0 |
| vosk-model-de-0.21 | 1.9G | 9.83 (Tuda-de test); 24.00 (podcast); 12.82 (cv-test); 12.42 (mls); 33.26 (mtedx) | Big German model for telephony and server | Apache 2.0 |

This table provides an overview of different Vosk models, including their size, word error rate or speed, notes, and license information.

License of LanguageTool: GNU Lesser General Public License (LGPL) v2.1 or later


Support the Project

If you find this tool useful, please consider buying us a coffee! Your support helps fuel future improvements.

ko-fi

Stripe-Buy Now