Fix Voice Message Speed Control in OpenClaw + Telegram
Convert TTS output to OGG Opus so Telegram voice messages get proper speed control buttons.
The problem
Your OpenClaw agent sends TTS voice messages on Telegram. They play fine, but the speed control button is missing. No 1.5x. No 2x. Just fixed-speed audio.
If your agent sends briefings or long content, you’re stuck at 1x. A 3-minute update takes 3 minutes. At 2x, it takes 90 seconds.
The issue is the audio format. OpenClaw’s TTS outputs MP3. Telegram’s sendVoice API technically accepts MP3, but it doesn’t treat it as a real voice message. No waveform. No speed control.
The fix
Convert MP3 to OGG Opus before sending.
ffmpeg -i input.mp3 -c:a libopus -b:a 48k -vbr on -compression_level 10 -frame_duration 60 -application voip output.ogg
Send the .ogg file instead. Telegram now shows the waveform and speed buttons (0.5x/1x/1.5x/2x).
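If you want to test the converted file outside your agent, you can upload it to the Bot API directly with curl. This is a minimal sketch: `send_voice` is a hypothetical helper name, and the token and chat ID are placeholders you supply. `sendVoice`, `chat_id` and `voice` are the Bot API's own method and field names. How OpenClaw itself sends the file is not shown here.

```bash
#!/bin/bash
# Sketch: upload an OGG Opus file as a Telegram voice message.
# send_voice is a hypothetical helper, not part of OpenClaw.
send_voice() {
  local token="$1" chat_id="$2" file="$3"
  # -F sends multipart/form-data; the @ prefix makes curl upload the file's contents.
  curl -sS \
    -F "chat_id=${chat_id}" \
    -F "voice=@${file}" \
    "https://api.telegram.org/bot${token}/sendVoice"
}

# Example (placeholders, not real credentials):
# send_voice "$BOT_TOKEN" "$CHAT_ID" output.ogg
```

The call prints Telegram's JSON response, so a failed upload shows up as an error description instead of failing silently.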
Why OGG Opus
Telegram’s docs say voice messages must be “.ogg encoded with OPUS.” But they don’t say what happens when you ignore this. MP3 files still play. They just lose features.
The ffmpeg flags that matter:
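-c:a libopus selects the Opus codec.
-b:a 48k sets a 48 kbps target bitrate, which is enough for clear voice.
-vbr on enables variable bitrate, which is also the encoder's default.
-application voip tunes the encoder for speech intelligibility rather than music. This one’s key.
-frame_duration 60 uses 60 ms frames instead of the default 20 ms. Longer frames cut per-packet overhead at low bitrates, at the cost of slightly higher latency.
-compression_level 10 sets the encoder to its highest-effort mode: best quality for the bitrate, slowest encode. It is also the default.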
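To confirm that a converted file really carries an Opus stream, you can ask ffprobe (it ships with ffmpeg) for the codec name. A minimal sketch, where `is_opus` is a hypothetical helper name; it checks only the codec of the first audio stream, not the container.

```bash
#!/bin/bash
# Sketch: succeed if the file's first audio stream is Opus.
# is_opus is a hypothetical helper name; the ffprobe options are standard.
is_opus() {
  local codec
  # Print only the codec name of the first audio stream, e.g. "opus" or "mp3".
  codec="$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1")" || return 1
  [ "$codec" = "opus" ]
}

# Example:
# is_opus output.ogg && echo "ready for Telegram" || echo "still needs converting"
```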
Quick setup for OpenClaw users
Tell your agent this:
“From now on, before sending any TTS voice message, convert the MP3 to OGG Opus using this command:
ffmpeg -i input.mp3 -c:a libopus -b:a 48k -vbr on -compression_level 10 -frame_duration 60 -application voip output.ogg. Send the .ogg file instead of the .mp3. Always do this for every voice message.”
Or create a conversion script on your server:
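#!/bin/bash
# Save as ~/tts-to-opus.sh
# Usage: ~/tts-to-opus.sh input.mp3 (prints the path of the .ogg it wrote)
set -euo pipefail
# Abort with a usage message if no input file was given.
INPUT="${1:?usage: tts-to-opus.sh input.mp3}"
# Swap the .mp3 extension for .ogg.
OUTPUT="${INPUT%.mp3}.ogg"
# Global options (-y, -loglevel) go first so ffmpeg never sees them as trailing options.
ffmpeg -y -loglevel error -i "$INPUT" -c:a libopus -b:a 48k -vbr on \
  -compression_level 10 -frame_duration 60 \
  -application voip "$OUTPUT"
echo "$OUTPUT"
Make it executable:
chmod +x ~/tts-to-opus.sh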
Then use it: ~/tts-to-opus.sh my-briefing.mp3
Cost implications
Zero. ffmpeg is free. OGG Opus is royalty-free. Conversion takes under 2 seconds for a 3-minute file. No extra API costs.
If you’re paying for TTS (ElevenLabs, Azure, Google), the conversion adds nothing to your bill. It’s a local processing step.
Future improvements
OpenClaw could handle this internally. If the target is Telegram and the media is voice, auto-convert to OGG Opus before sending. That would make the fix invisible.
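The rule described above could look roughly like the sketch below. It is not OpenClaw code: `prepare_voice`, its two arguments and the `telegram` target label are all hypothetical names chosen for illustration. The function prints the path that should be sent.

```bash
#!/bin/bash
# Sketch of an "auto-convert for Telegram voice" rule (hypothetical, not OpenClaw code).
prepare_voice() {
  local target="$1" input="$2" output
  # Other targets get the original file untouched.
  if [ "$target" != "telegram" ]; then
    echo "$input"
    return 0
  fi
  # Only MP3 input needs converting; anything else passes through.
  case "$input" in
    *.mp3) ;;
    *) echo "$input"; return 0 ;;
  esac
  output="${input%.mp3}.ogg"
  ffmpeg -y -loglevel error -i "$input" -c:a libopus -b:a 48k -vbr on \
    -compression_level 10 -frame_duration 60 -application voip "$output" \
    || return 1
  echo "$output"
}

# Example:
# FILE="$(prepare_voice telegram my-briefing.mp3)"
```

Returning the original path for every non-Telegram or non-MP3 case keeps the rule safe to apply to every outgoing message.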
Until then, the ffmpeg command works.