- Add STREAM_BROWSER (.env) gating screen-share/browser mode. false => the
/자비스 stream command stays voice + API/MCP only (no Go-Live); true (default)
=> screen share as before. (Browser-driven info retrieval in true mode is a
follow-up build; the bot has no browser-control tools yet.)
- Make the two test-time fixes (fullscreen toolbar hiding and subtitle
enforcement) broadcast-wide defaults via broadcast-helper.mjs: it now also
watches every tab for HTML5 fullscreen and toggles Chrome window fullscreen so
the address bar is hidden for ANY video (xfwm4 won't hide it on 'f' alone),
restoring the window on exit. Subtitles were already enforced per video.
scenario.mjs drops its own fullscreen toggle and relies on the helper.
- Revert the addition of the test-settings env vars to .env.example (not wanted).
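A minimal sketch of how a boolean flag such as STREAM_BROWSER might be read, assuming an unset value means true as described above. The helper names `envFlag` and `streamMode` and the accepted spellings are hypothetical and are not taken from the repo's config code.

```typescript
// Hypothetical helper: parse a boolean .env flag with a default.
// The accepted spellings below are an assumption for illustration.
function envFlag(value: string | undefined, defaultValue: boolean): boolean {
  if (value === undefined || value.trim() === "") return defaultValue;
  const v = value.trim().toLowerCase();
  if (v === "false" || v === "0" || v === "off" || v === "no") return false;
  if (v === "true" || v === "1" || v === "on" || v === "yes") return true;
  return defaultValue; // unrecognised text falls back to the default
}

// Illustrative mode switch: true => Go-Live screen share,
// false => voice + API/MCP only.
function streamMode(
  env: Record<string, string | undefined>,
): "screen-share" | "voice-only" {
  return envFlag(env["STREAM_BROWSER"], true) ? "screen-share" : "voice-only";
}
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps the default-true behaviour easy to unit test.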
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses review of the ad/subtitle work. This is the other half of the
ad-skip.mjs -> broadcast-helper.mjs rename; the prior commit recorded only the
deletion:
- ad mute leak: the ad-skipper muted during an ad but never un-muted, so the
main video stayed silent after the first ad. Save the pre-ad muted/playbackRate
and restore them when the ad ends (verified: muted false -> true -> false).
- captions were only applied once when scenario.mjs ran, not for the whole
broadcast. The persistent helper now applies the rule (OFF by default, Korean
ON if offered) per video and ENFORCES it every tick. A one-shot setting did not
hold because YouTube silently re-enabled captions (verified: captions now stay
off across 8s).
- ad-skip + captions merged into broadcast-helper.mjs (one CDP process).
- the 60fps MV test now lives in the repo: scenario.mjs gains MV_QUERY (search +
auto-pick the first >=60fps result) and WATCH_SECONDS, plus the
fullscreen-toolbar-hide fix. The broadcast runs via the committed
stream-hold.ts (audio + keepalive), not an out-of-repo copy.
- document the test env vars (CDP_PORT, HOLD_MS, TEST_*, MV_QUERY, WATCH_SECONDS).
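The ad mute fix above can be sketched as a small save/restore state machine. This is an illustration under stated assumptions, not the code in broadcast-helper.mjs: the `AdMuteGuard` class, the `PlayerLike` shape, and the `AD_PLAYBACK_RATE` value are hypothetical, and how the helper detects an ad is left out.

```typescript
// Minimal shape of the properties touched; an HTMLVideoElement has both.
interface PlayerLike {
  muted: boolean;
  playbackRate: number;
}

// Assumed speed-up applied while an ad plays (illustrative value only).
const AD_PLAYBACK_RATE = 16;

// Hypothetical guard: call onTick() on every poll with the current ad state.
class AdMuteGuard {
  private saved: PlayerLike | null = null;

  onTick(player: PlayerLike, adShowing: boolean): void {
    if (adShowing && this.saved === null) {
      // Ad just started: remember the pre-ad settings before overriding them.
      this.saved = { muted: player.muted, playbackRate: player.playbackRate };
      player.muted = true;
      player.playbackRate = AD_PLAYBACK_RATE;
    } else if (!adShowing && this.saved !== null) {
      // Ad just ended: restore exactly what was there before the ad.
      player.muted = this.saved.muted;
      player.playbackRate = this.saved.playbackRate;
      this.saved = null;
    }
  }
}
```

Saving only on the first ad tick matters: saving on every tick would overwrite the pre-ad values with the muted ones and reproduce the leak.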
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two broadcast-experience improvements:
- Audio: the Go-Live stream was video-only. Capture the desktop sound (the
default PipeWire/Pulse sink monitor, @DEFAULT_MONITOR@) as a second ffmpeg
input and mux AAC into the mpegts; the library re-encodes it to Opus for
Discord. Controlled by STREAM_AUDIO / STREAM_AUDIO_SOURCE (default on). ffmpeg
inherits XDG_RUNTIME_DIR to reach the pulse socket. Verified: the streamer now
reports "Found audio stream" and the monitor carries Chrome audio (~-11 dB).
- Subtitles: in the browse scenario, default captions OFF, but auto-enable a
Korean track when the video offers one (getOption captions tracklist ->
setOption / unloadModule).
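The audio change above amounts to adding a second ffmpeg input and an AAC output stream. The sketch below builds such an argument list; it is an assumption-laden illustration, not the code in selfbot.ts. The `CaptureOptions` type and `buildCaptureArgs` function are hypothetical names, and the software libx264 codec is used only to keep the example short.

```typescript
interface CaptureOptions {
  display: string;     // X display to grab, e.g. ":1"
  width: number;
  height: number;
  framerate: number;
  audio: boolean;      // mirrors STREAM_AUDIO
  audioSource: string; // mirrors STREAM_AUDIO_SOURCE, e.g. "@DEFAULT_MONITOR@"
}

// Hypothetical builder for the capture ffmpeg command line.
function buildCaptureArgs(o: CaptureOptions): string[] {
  const args: string[] = [
    // Input 0: the X11 screen.
    "-f", "x11grab",
    "-framerate", String(o.framerate),
    "-video_size", `${o.width}x${o.height}`,
    "-i", o.display,
  ];
  if (o.audio) {
    // Input 1: the PulseAudio/PipeWire monitor of the default sink.
    args.push("-f", "pulse", "-i", o.audioSource);
  }
  args.push("-c:v", "libx264");
  if (o.audio) {
    // Mux AAC into the transport stream; it is re-encoded downstream.
    args.push("-c:a", "aac");
  }
  args.push("-f", "mpegts", "pipe:1");
  return args;
}
```

The spawned process would also need XDG_RUNTIME_DIR in its environment so that ffmpeg can reach the pulse socket, as noted in the Audio item.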
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Go-Live broadcast looked badly choppy: video and scrolling stuttered while
the cursor stayed smooth. Root cause is TigerVNC: it only refreshes its
framebuffer while a VNC client is attached, but the broadcast reads that
framebuffer with x11grab (not as a VNC client). With no viewer attached the
captured screen idled at ~1.5 fps (measured 3/30 distinct frames); the cursor
looked smooth only because x11grab overlays the live cursor on every frame.
- Add a headless RFB keepalive (vnc-keepalive.ts) that stays connected for the
life of the stream and requests incremental framebuffer updates at the stream
framerate. SelfbotStreamer starts it on broadcast start and tears it down on
stop/self-end. Measured 3/30 -> 57/60 distinct frames at 60 fps. Fail-open;
authenticates with VNC_PASSWORD or the ~/.config/tigervnc/passwd file.
- Fix a resource leak: when the Go-Live ended on its own, only the active flag
was cleared, leaving the x11grab->nvenc ffmpeg running forever. It pinned a CPU
core although no media was being sent: only the gateway TCP connection remained,
with no UDP media. The self-end path now tears down capture, keepalive and voice
like stop() does.
- Tests for both paths (self-end teardown; keepalive DES auth, port mapping,
password resolution). Add @types/bun so bun:test typechecks; document the
keepalive and recommended Chrome flags in README and .env.example.
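For readers unfamiliar with RFB, the keepalive's core message can be sketched as follows. This illustrates the protocol's FramebufferUpdateRequest layout and the usual display-to-port convention; it is not the contents of vnc-keepalive.ts, and the function names are hypothetical. The handshake and DES authentication are omitted.

```typescript
// RFB FramebufferUpdateRequest: 10 bytes, all multi-byte fields big-endian.
//   u8 message-type (3), u8 incremental, u16 x, u16 y, u16 width, u16 height
function framebufferUpdateRequest(
  width: number,
  height: number,
  incremental: boolean = true,
): Uint8Array {
  const buf = new Uint8Array(10);
  const view = new DataView(buf.buffer);
  view.setUint8(0, 3);                   // message type
  view.setUint8(1, incremental ? 1 : 0); // only changed regions if 1
  view.setUint16(2, 0);                  // x (DataView defaults to big-endian)
  view.setUint16(4, 0);                  // y
  view.setUint16(6, width);
  view.setUint16(8, height);
  return buf;
}

// Conventional mapping from an X display such as ":1" to its VNC TCP port.
function vncPortForDisplay(display: string): number {
  const m = /^:(\d+)/.exec(display);
  if (!m) throw new Error(`unrecognised display: ${display}`);
  return 5900 + Number(m[1]);
}

// Interval between requests when polling at the stream framerate.
function requestIntervalMs(fps: number): number {
  return 1000 / fps;
}
```

A keepalive loop would write this buffer to the socket roughly every `requestIntervalMs(60)` milliseconds and discard the updates it receives, since the broadcast reads the framebuffer through x11grab rather than over VNC.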
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bump the default broadcast to 1080p 60fps at 8 Mbps and route both encode
stages through the GPU (RTX 5050, h264_nvenc) so 60fps stays smooth without
loading the 4-core host.
- selfbot.ts: capture ffmpeg uses h264_nvenc when streamHw is on (falls back
to software x264 otherwise), and prepareStream now passes Encoders.nvenc()
so the library's transcode runs on the GPU too. Guard loadLib for Encoders.
- config.ts: VNC_FRAMERATE default 30 -> 60, VNC_BITRATE_KBPS 4000 -> 8000.
- .env.example: document the new 1080p60/8 Mbps defaults and STREAM_HW.
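The codec fallback in the selfbot.ts item can be sketched as below. This is a simplified illustration under assumptions: `videoEncoderArgs` and `DEFAULTS` are hypothetical names, and the real capture command carries more options than the codec and bitrate shown here.

```typescript
// Defaults taken from the config.ts item above.
const DEFAULTS = {
  VNC_FRAMERATE: 60,
  VNC_BITRATE_KBPS: 8000,
};

// Hypothetical helper: choose the ffmpeg video codec from the STREAM_HW flag.
function videoEncoderArgs(streamHw: boolean, bitrateKbps: number): string[] {
  // h264_nvenc encodes on the NVIDIA GPU; libx264 is the software fallback.
  const codec = streamHw ? "h264_nvenc" : "libx264";
  return ["-c:v", codec, "-b:v", `${bitrateKbps}k`];
}
```

For example, `videoEncoderArgs(true, DEFAULTS.VNC_BITRATE_KBPS)` yields the arguments for an 8 Mbps GPU encode, while passing false keeps the same bitrate on the CPU.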
Verified locally: h264_nvenc x11grab holds a steady 60fps with headroom,
Encoders.nvenc() returns valid h264_nvenc settings, and tsc --noEmit passes.
Live Discord voice-channel verification pending a host reboot.
GPU acceleration is now on by default and verified end-to-end on the
Blackwell RTX 5050 (sm_120):
- Ollama offloads 100% to GPU (log: library=CUDA compute=12.0,
BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI
(devices: nvidia.com/gpu=all) to both ollama and javis.
- Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12,
LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on
sm_120; bridge auto-falls back to CPU when no GPU is present.
- Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling,
~5GB Q4). Embed stays nomic-embed-text.
- README documents the host one-time setup (nvidia-container-toolkit +
`nvidia-ctk cdi generate`) and GPU on/off.
Verified: image builds; GPU visible in both containers via compose;
ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK;
bridge /health 200.
`docker compose up -d --build` now brings up the whole stack automatically, with
no host setup needed:
- All-in-one javis image: TigerVNC+XFCE desktop, Chrome, Python brain bridge,
Node/bun bot, managed by supervisord (verified: all 6 programs RUNNING).
- ollama service + one-shot ollama-init that auto-pulls chat+embed models
(verified end-to-end; `ollama list` shows pulled models).
- Discord token deferred: without DISCORD_BOT_TOKEN the desktop, bridge,
Ollama and models all run; only the bot waits (no crash loop).
- Slim container deps (bridge/requirements-bridge.txt) drop the unused
PyQt6/torch/chatterbox/sounddevice stack. Piper voice + Whisper models
auto-download into named volumes.
- Configurable host ports (VNC_PORT/NOVNC_PORT/BRIDGE_PORT) to avoid clashing
with a host VNC already on 5901. Bridge binds 0.0.0.0 in-container.
Verified: image builds; brain imports; bridge /health 200; noVNC 200;
X display :1 @1920x1080; auto-pull completes; supervisorctl status all RUNNING.