v1.0.0

Voice Reply

stolot0mt0m

Local text-to-speech using Piper voices via sherpa-onnx. Runs 100% offline, with no API keys required. Use it when the user asks for a voice reply, an audio response, or a spoken answer, or wants to hear something read aloud. Ships with a German voice (thorsten) and an English voice (ryan). Outputs Telegram-compatible voice notes with the [[audio_as_voice]] tag.

Downloads: 1.2k
Stars: 2
Versions: 1
Updated: 2026-02-24

Install

npx clawhub@latest install voice-reply

Documentation

Generate voice audio replies using local Piper TTS via sherpa-onnx. Completely offline, no cloud APIs needed.

Features

  • 100% Local - No internet connection required after setup
  • No API Keys - Free to use, no accounts needed
  • Multi-language - German and English voices included
  • Telegram Ready - Outputs voice notes that display as voice bubbles
  • Auto-detect Language - Automatically selects the voice based on the text

Prerequisites

1. sherpa-onnx runtime installed

2. Piper voice models downloaded

3. ffmpeg for audio conversion
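
A quick way to confirm the three prerequisites is a small check function. The sketch below is not part of the skill. The function name `check_prereqs` is hypothetical, and the default paths are assumptions taken from the manual installation steps further down.

```shell
# Report missing Voice Reply prerequisites; returns 0 only when all are present.
# Default paths are assumptions that mirror the manual installation steps.
check_prereqs() {
  local sherpa_dir="${SHERPA_ONNX_DIR:-/opt/sherpa-onnx}"
  local voices_dir="${PIPER_VOICES_DIR:-/opt/piper-voices}"
  local missing=0 found_model=0 model

  # 1. sherpa-onnx runtime: the offline TTS binary must be executable.
  if [ ! -x "$sherpa_dir/bin/sherpa-onnx-offline-tts" ]; then
    echo "missing: $sherpa_dir/bin/sherpa-onnx-offline-tts"
    missing=1
  fi

  # 2. Piper voices: expect at least one *.onnx file one level below voices_dir.
  for model in "$voices_dir"/*/*.onnx; do
    if [ -e "$model" ]; then
      found_model=1
      break
    fi
  done
  if [ "$found_model" -eq 0 ]; then
    echo "missing: voice model (*.onnx) under $voices_dir"
    missing=1
  fi

  # 3. ffmpeg must be on PATH for the audio conversion step.
  if ! command -v ffmpeg >/dev/null 2>&1; then
    echo "missing: ffmpeg"
    missing=1
  fi

  return "$missing"
}
```

Running `check_prereqs && echo "all prerequisites found"` prints one line per missing item, or the success message when nothing is missing.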

Installation

Quick Install

cd scripts

sudo ./install.sh

Manual Installation

#### 1. Install sherpa-onnx

sudo mkdir -p /opt/sherpa-onnx

cd /opt/sherpa-onnx

sudo curl -L -o sherpa.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.23/sherpa-onnx-v1.12.23-linux-x64-shared.tar.bz2"

sudo tar -xjf sherpa.tar.bz2 --strip-components=1

sudo rm sherpa.tar.bz2

#### 2. Download Voice Models

sudo mkdir -p /opt/piper-voices

cd /opt/piper-voices

German - thorsten (medium quality, natural male voice)

sudo curl -L -o thorsten.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-medium.tar.bz2"

sudo tar -xjf thorsten.tar.bz2 && sudo rm thorsten.tar.bz2

English - ryan (high quality, clear US male voice)

sudo curl -L -o ryan.tar.bz2 "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-high.tar.bz2"

sudo tar -xjf ryan.tar.bz2 && sudo rm ryan.tar.bz2

#### 3. Install ffmpeg

sudo apt install -y ffmpeg

#### 4. Set Environment Variables

Add these to your OpenClaw service environment or your shell profile:

export SHERPA_ONNX_DIR="/opt/sherpa-onnx"

export PIPER_VOICES_DIR="/opt/piper-voices"
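
If the variables go into a shell profile, it helps to add them idempotently so that repeated setup runs do not pile up duplicate lines. This is a minimal sketch. The helper name `add_export` is hypothetical, and `~/.profile` in the usage note is only one possible target file.

```shell
# Append 'export NAME="VALUE"' to a profile file unless that exact line exists.
# $1 = profile file, $2 = variable name, $3 = value
add_export() {
  local profile="$1" name="$2" value="$3"
  local line="export $name=\"$value\""

  touch "$profile"
  # -x matches whole lines, -F treats the line as a fixed string, not a regex.
  if ! grep -qxF "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}
```

For example, `add_export "$HOME/.profile" SHERPA_ONNX_DIR /opt/sherpa-onnx` followed by `add_export "$HOME/.profile" PIPER_VOICES_DIR /opt/piper-voices` leaves exactly one line per variable, however often it is run.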

Usage

{baseDir}/bin/voice-reply "Text to speak" [language]

Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| text | The text to convert to speech | (required) |
| language | de for German, en for English | auto-detect |

Examples

German (explicit)

{baseDir}/bin/voice-reply "Hallo, ich bin dein Assistent!" de

English (explicit)

{baseDir}/bin/voice-reply "Hello, I am your assistant!" en

Auto-detect (detects German from umlauts and common words)

{baseDir}/bin/voice-reply "Guten Tag, wie geht es dir?"

Auto-detect (defaults to English)

{baseDir}/bin/voice-reply "The weather is nice today."
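
The auto-detect behaviour in the last two examples can be approximated with a simple heuristic. The sketch below illustrates the idea and is not the skill's actual detection code. The function name `detect_lang` and the word list are assumptions.

```shell
# Guess the language of a text: "de" if it contains umlauts/ß or a common
# German word, otherwise "en". The word list is an illustrative assumption.
detect_lang() {
  local text="$1"

  # Umlauts and ß are strong hints for German.
  case "$text" in
    *ä*|*ö*|*ü*|*Ä*|*Ö*|*Ü*|*ß*)
      echo de
      return 0
      ;;
  esac

  # Otherwise look for common German words (whole words, case-insensitive).
  if printf '%s\n' "$text" | grep -qiwE 'und|ich|ist|nicht|der|das|wie|guten|hallo|danke'; then
    echo de
  else
    echo en
  fi
}
```

A heuristic like this misclassifies short German texts that contain neither umlauts nor listed words, so passing the language explicitly is the safer choice when it is known.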

Output Format

The script outputs two lines that OpenClaw processes for Telegram:

[[audio_as_voice]]

MEDIA:/tmp/voice-reply-output.ogg

  • [[audio_as_voice]] - Tag that tells Telegram to display the audio as a voice bubble
  • MEDIA:path - Path to the generated OGG Opus audio file
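
To make the output format concrete, here is a rough sketch of how text could become these two lines: synthesize a WAV file with sherpa-onnx-offline-tts, convert it to OGG Opus with ffmpeg, then print the tag and the path. The function names, the 32k bitrate, and the mono setting are assumptions, and the real script may differ.

```shell
# Print the two lines OpenClaw expects: the tag first, then the MEDIA path.
emit_voice_note() {
  printf '%s\n' '[[audio_as_voice]]' "MEDIA:$1"
}

# Synthesize text to WAV, convert it to OGG Opus, then emit the result.
# $1 = voice model directory, $2 = text, $3 = output .ogg path
speak_to_ogg() {
  local model_dir="$1" text="$2" out_ogg="$3"
  local wav="${out_ogg%.ogg}.wav"
  local tts="${SHERPA_ONNX_DIR:-/opt/sherpa-onnx}/bin/sherpa-onnx-offline-tts"
  local model="" f

  if [ ! -x "$tts" ]; then
    echo "TTS binary not found" >&2
    return 1
  fi

  # Use the first *.onnx file in the voice directory as the model.
  for f in "$model_dir"/*.onnx; do
    if [ -f "$f" ]; then
      model="$f"
      break
    fi
  done
  if [ -z "$model" ]; then
    echo "Failed to generate audio" >&2
    return 1
  fi

  "$tts" \
    --vits-model="$model" \
    --vits-tokens="$model_dir/tokens.txt" \
    --vits-data-dir="$model_dir/espeak-ng-data" \
    --output-filename="$wav" \
    "$text" >/dev/null 2>&1 || return 1

  # Telegram voice notes use Opus audio in an OGG container.
  ffmpeg -y -loglevel error -i "$wav" -c:a libopus -b:a 32k -ac 1 "$out_ogg" || return 1
  rm -f "$wav"

  emit_voice_note "$out_ogg"
}
```

Printing the tag and path only after both steps succeed keeps a failed run from announcing a missing or stale audio file.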

Available Voices

| Language | Voice | Quality | Description |
|----------|-------|---------|-------------|
| German (de) | thorsten | medium | Natural male voice, clear pronunciation |
| English (en) | ryan | high | Clear US male voice, professional tone |

Adding More Voices

Browse available Piper voices at:

  • https://rhasspy.github.io/piper-samples/
  • https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models

Download a voice archive, extract it into $PIPER_VOICES_DIR, then edit the voice-reply script so that it maps a language code to the new voice.
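
One way to structure that script change is a single lookup function that maps a language code to a model directory. The sketch below is illustrative only. The function name `voice_dir_for` is hypothetical, the directory names are assumed to match the archive names, and the French entry is an example addition that the skill does not ship.

```shell
# Map a language code to the model directory name under $PIPER_VOICES_DIR.
# Directory names are assumed to match the downloaded archive names.
voice_dir_for() {
  case "$1" in
    de) echo "vits-piper-de_DE-thorsten-medium" ;;
    en) echo "vits-piper-en_US-ryan-high" ;;
    fr) echo "vits-piper-fr_FR-siwis-medium" ;;  # hypothetical added voice
    *)
      echo "unsupported language: $1" >&2
      return 1
      ;;
  esac
}
```

With this layout, adding a voice means extracting the archive and adding one `case` branch. The full model path is then `"$PIPER_VOICES_DIR/$(voice_dir_for fr)"`.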

Troubleshooting

"TTS binary not found"

Ensure SHERPA_ONNX_DIR is set and contains bin/sherpa-onnx-offline-tts.

"Failed to generate audio"

Check that the voice model directory contains the *.onnx model file, tokens.txt, and the espeak-ng-data/ directory.
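
That check can be scripted per voice directory. This is a minimal sketch, and the function name `check_voice_model` is hypothetical.

```shell
# Verify that a voice directory holds the files the TTS step needs.
# $1 = voice model directory; prints one line per missing item.
check_voice_model() {
  local dir="$1" status=0 has_onnx=0 f

  for f in "$dir"/*.onnx; do
    if [ -f "$f" ]; then
      has_onnx=1
      break
    fi
  done
  if [ "$has_onnx" -eq 0 ]; then
    echo "missing: *.onnx model in $dir"
    status=1
  fi

  if [ ! -f "$dir/tokens.txt" ]; then
    echo "missing: $dir/tokens.txt"
    status=1
  fi

  if [ ! -d "$dir/espeak-ng-data" ]; then
    echo "missing: $dir/espeak-ng-data/"
    status=1
  fi

  return "$status"
}
```

For example, `check_voice_model "$PIPER_VOICES_DIR/vits-piper-en_US-ryan-high"` prints nothing and returns 0 when the directory is complete, assuming the archive extracts to a directory of that name.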

Audio plays as file instead of voice bubble

Ensure the output includes the [[audio_as_voice]] tag on its own line, directly before the MEDIA: line.

Credits

  • [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) - Offline speech processing
  • [Piper](https://github.com/rhasspy/piper) - Fast local TTS voices
  • [Thorsten Voice](https://github.com/thorstenMueller/Thorsten-Voice) - German voice dataset
