Code review of the bridge/bot/docker work found:
- TTS bug: bridge called PiperVoice.synthesize(text, wav) but that method
returns AudioChunks and takes a SynthesisConfig as its 2nd arg, not a wav
file -> TTS would fail. Switched to synthesize_wav(text, wav_file).
Verified: produces a valid 22050Hz mono WAV.
- run-bot.sh now waits if ANY of DISCORD_BOT_TOKEN/APP_ID/GUILD_ID is missing
(config.ts throws on a missing one), preventing a supervisor crash-loop.
Verified clean: discord.js Events.ClientReady == 'clientReady' (existing
handler correct); image rebuilds.
GPU acceleration is now on by default and verified end-to-end on the
Blackwell RTX 5050 (sm_120):
- Ollama offloads 100% to GPU (log: library=CUDA compute=12.0,
BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI
(devices: nvidia.com/gpu=all) to both ollama and javis.
- Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12,
LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on
sm_120; bridge auto-falls back to CPU when no GPU is present.
- Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling,
~5GB Q4). Embed stays nomic-embed-text.
- README documents the host one-time setup (nvidia-container-toolkit +
`nvidia-ctk cdi generate`) and GPU on/off.
Verified: image builds; GPU visible in both containers via compose;
ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK;
bridge /health 200.