Two broadcast-experience improvements:
- Audio: the Go-Live stream was video-only. Capture the desktop sound (the
default PipeWire/Pulse sink monitor, @DEFAULT_MONITOR@) as a second ffmpeg
input and mux AAC into the mpegts; the library re-encodes it to Opus for
Discord. Controlled by STREAM_AUDIO / STREAM_AUDIO_SOURCE (default on). ffmpeg
inherits XDG_RUNTIME_DIR to reach the pulse socket. Verified: the streamer now
reports "Found audio stream" and the monitor carries Chrome audio (~-11 dB).
- Subtitles: in the browse scenario, default captions OFF, but auto-enable a
Korean track when the video offers one (getOption captions tracklist ->
setOption / unloadModule).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Go-Live broadcast looked badly choppy: video and scrolling stuttered while
the cursor stayed smooth. Root cause is TigerVNC: it only refreshes its
framebuffer while a VNC client is attached, but the broadcast reads that
framebuffer with x11grab (not as a VNC client). With no viewer attached the
captured screen idled at ~1.5 fps (measured 3/30 distinct frames); the cursor
looked smooth only because x11grab overlays the live cursor on every frame.
- Add a headless RFB keepalive (vnc-keepalive.ts) that stays connected for the
life of the stream and requests incremental framebuffer updates at the stream
framerate. SelfbotStreamer starts it on broadcast start and tears it down on
stop/self-end. Measured 3/30 -> 57/60 distinct frames at 60 fps. Fail-open;
authenticates with VNC_PASSWORD or the ~/.config/tigervnc/passwd file.
- Fix a resource leak: when the Go-Live ended on its own, only the active flag
was cleared, leaving the x11grab->nvenc ffmpeg running forever (pinning a CPU
core while no media was transmitted, with only the gateway TCP left and no UDP
media). The self-end path now tears down capture, keepalive and voice like
stop() does.
- Tests for both paths (self-end teardown; keepalive DES auth, port mapping,
password resolution). Add @types/bun so bun:test typechecks; document the
keepalive and recommended Chrome flags in README and .env.example.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bump the default broadcast to 1080p 60fps at 8 Mbps and route both encode
stages through the GPU (RTX 5050, h264_nvenc) so 60fps stays smooth without
loading the 4-core host.
- selfbot.ts: capture ffmpeg uses h264_nvenc when streamHw is on (falls back
to software x264 otherwise), and prepareStream now passes Encoders.nvenc()
so the library's transcode runs on the GPU too. Guard loadLib for Encoders.
- config.ts: VNC_FRAMERATE default 30 -> 60, VNC_BITRATE_KBPS 4000 -> 8000.
- .env.example: document the new 1080p60/8 Mbps defaults and STREAM_HW.
Verified locally: h264_nvenc x11grab holds a steady 60fps with headroom,
Encoders.nvenc() returns valid h264_nvenc settings, and tsc --noEmit passes.
Live Discord voice-channel verification pending a host reboot.
GPU acceleration is now on by default and verified end-to-end on the
Blackwell RTX 5050 (sm_120):
- Ollama offloads 100% to GPU (log: library=CUDA compute=12.0,
BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI
(devices: nvidia.com/gpu=all) to both ollama and javis.
- Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12,
LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on
sm_120; bridge auto-falls back to CPU when no GPU is present.
- Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling,
~5GB Q4). Embed stays nomic-embed-text.
- README documents the host one-time setup (nvidia-container-toolkit +
`nvidia-ctk cdi generate`) and GPU on/off.
Verified: image builds; GPU visible in both containers via compose;
ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK;
bridge /health 200.
`docker compose up -d --build` now brings up the whole thing automatically —
no host setup needed:
- All-in-one javis image: TigerVNC+XFCE desktop, Chrome, Python brain bridge,
Node/bun bot, managed by supervisord (verified: all 6 programs RUNNING).
- ollama service + one-shot ollama-init that auto-pulls chat+embed models
(verified end-to-end; `ollama list` shows pulled models).
- Discord token deferred: without DISCORD_BOT_TOKEN the desktop, bridge,
Ollama and models all run; only the bot waits (no crash loop).
- Slim container deps (bridge/requirements-bridge.txt) drop the unused
PyQt6/torch/chatterbox/sounddevice stack. Piper voice + Whisper models
auto-download into named volumes.
- Configurable host ports (VNC_PORT/NOVNC_PORT/BRIDGE_PORT) to avoid clashing
with a host VNC already on 5901. Bridge binds 0.0.0.0 in-container.
Verified: image builds; brain imports; bridge /health 200; noVNC 200;
X display :1 @1920x1080; auto-pull completes; supervisorctl status all RUNNING.