Code review of the bridge/bot/docker work found:
- TTS bug: bridge called PiperVoice.synthesize(text, wav) but that method
returns AudioChunks and takes a SynthesisConfig as its 2nd arg, not a wav
file -> TTS would fail. Switched to synthesize_wav(text, wav_file).
Verified: produces a valid 22050Hz mono WAV.
- run-bot.sh now waits if ANY of DISCORD_BOT_TOKEN/APP_ID/GUILD_ID is missing
(config.ts throws on a missing one), preventing a supervisor crash-loop.
Verified clean: discord.js Events.ClientReady == 'clientReady' (existing
handler correct); image rebuilds.
GPU acceleration is now on by default and verified end-to-end on the
Blackwell RTX 5050 (sm_120):
- Ollama offloads 100% to GPU (log: library=CUDA compute=12.0,
BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI
(devices: nvidia.com/gpu=all) to both ollama and javis.
- Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12,
LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on
sm_120; bridge auto-falls back to CPU when no GPU is present.
- Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling,
~5GB Q4). Embed stays nomic-embed-text.
- README documents the host one-time setup (nvidia-container-toolkit +
`nvidia-ctk cdi generate`) and GPU on/off.
Verified: image builds; GPU visible in both containers via compose;
ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK;
bridge /health 200.
`docker compose up -d --build` now brings up the whole thing automatically —
no host setup needed:
- All-in-one javis image: TigerVNC+XFCE desktop, Chrome, Python brain bridge,
Node/bun bot, managed by supervisord (verified: all 6 programs RUNNING).
- ollama service + one-shot ollama-init that auto-pulls chat+embed models
(verified end-to-end; `ollama list` shows pulled models).
- Discord token deferred: without DISCORD_BOT_TOKEN the desktop, bridge,
Ollama and models all run; only the bot waits (no crash loop).
- Slim container deps (bridge/requirements-bridge.txt) drop the unused
PyQt6/torch/chatterbox/sounddevice stack. Piper voice + Whisper models
auto-download into named volumes.
- Configurable host ports (VNC_PORT/NOVNC_PORT/BRIDGE_PORT) to avoid clashing
with a host VNC already on 5901. Bridge binds 0.0.0.0 in-container.
Verified: image builds; brain imports; bridge /health 200; noVNC 200;
X display :1 @1920x1080; auto-pull completes; supervisorctl status all RUNNING.