javis_bot/bot/scripts/stream-test
javis-bot c420d5da53 feat(stream): true-mode browser-action core + Gemini scaffold + mode design
First increment of the STREAM_BROWSER real-time-info modes (true = browser,
false = Gemini):

- browse-search.mjs: drives the on-screen Chrome via CDP so the action shows on
  the broadcast. `search` returns the top Google results (title/url/snippet);
  `youtube` plays the first result. Verified live: real-time Seoul weather
  results, and IU 'Good Day' MV playback.
- .env.example: GEMINI_API_KEY / GEMINI_MODEL for the false-mode Gemini account.
- docs/stream_browser_modes.md: architecture + integration map (brain config,
  the two mode-gated tools, registry, design decisions) for the remaining wiring.

The Python brain wiring (config.py mode/gemini fields, browseAndSearch +
geminiSearch tools, registry, specs, llm_contexts) lands next - it needs a
running brain and a Gemini key to verify, rather than committing untested edits
into the 39k-line engine.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 16:36:35 +09:00

stream-test

Operational scripts for manually verifying the selfbot Go-Live broadcast with a real browsing session captured from the X display.

Files

  • stream-hold.ts - joins the voice channel and keeps the Go-Live stream up until stopped. All parameters are read from .env (DISCORD_SELFBOT_TOKEN, DISCORD_GUILD_ID, DISCORD_VOICE_CHANNEL_ID, VNC_RESOLUTION, VNC_FRAMERATE, VNC_BITRATE_KBPS, STREAM_HW, VNC_DISPLAY).
  • human.mjs - human-like interaction helpers. Input is injected into the X server with xdotool (synthetic X input, not a physical HID device, but the browser and the captured screen see genuine pointer/keyboard events with a visibly moving cursor); Playwright only locates elements. Every action is performed as such input: address-bar navigation (Ctrl+L + typing), search typing, clicking the video / settings menu / autoplay toggle / play button, fullscreen via the f key, and scrolling. Elements are brought into view with a real wheel scroll (no DOM scrollIntoView); if an element has no on-screen box, the click fails rather than falling back to a synthetic click. The CDP/DOM API is used only to read state for verification, never to act.
  • scenario.mjs - the browse scenario (YouTube -> 1080p -> fullscreen -> Naver -> 나무위키), driven with the human helpers. Connects to a Chrome already running with --remote-debugging-port (CDP_PORT, default 9222) on the streamed display. Defaults to a fixed concert clip; set MV_QUERY to instead search and auto-pick the first result that really reports >=60fps. WATCH_SECONDS (default 20) sets the windowed/fullscreen watch durations.
  • broadcast-helper.mjs - persistent CDP helper that injects one watcher into every tab (current and future) and (1) auto-skips YouTube ads - clicks "Skip ad" instantly, closes overlay ads, fast-forwards unskippable ads (seek-to-end + 16x + mute) and RESTORES the pre-ad muted/playbackRate when the ad ends; and (2) applies the subtitle rule per video: captions OFF by default, Korean ON when the video offers a Korean track. Run it alongside the broadcast; it reconnects across Chrome restarts.
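
The fast-forward-and-restore rule described for broadcast-helper.mjs could look roughly like the sketch below. This is not the helper's actual code: the function names and the WeakMap bookkeeping are assumptions made for illustration, and `video` stands for any object exposing the standard HTMLMediaElement fields (`muted`, `playbackRate`, `currentTime`, `duration`).

```javascript
// Hypothetical helpers; names are illustrative, not taken from broadcast-helper.mjs.
const savedState = new WeakMap(); // video -> { muted, playbackRate } captured before the ad

function fastForwardAd(video) {
  // Capture the pre-ad state only once per ad, so repeated watcher ticks
  // do not overwrite it with the ad-time values.
  if (!savedState.has(video)) {
    savedState.set(video, { muted: video.muted, playbackRate: video.playbackRate });
  }
  video.muted = true;
  video.playbackRate = 16; // the 16x named in the list above
  // Seek to the end when the duration is known; otherwise rely on 16x alone.
  if (Number.isFinite(video.duration) && video.duration > 0) {
    video.currentTime = video.duration;
  }
}

function restoreAfterAd(video) {
  const state = savedState.get(video);
  if (!state) return false; // no ad was fast-forwarded on this element
  video.muted = state.muted;
  video.playbackRate = state.playbackRate;
  savedState.delete(video);
  return true;
}
```

Keeping the saved state keyed by element means a viewer-chosen rate such as 1.25x comes back after the ad instead of being reset to 1x.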

Run

# keep the broadcast up (separate process / service)
bun bot/scripts/stream-test/stream-hold.ts

# keep ads auto-skipped + subtitles correct for the whole broadcast:
node bot/scripts/stream-test/broadcast-helper.mjs

# with Chrome running on the streamed display with remote debugging enabled, run a browse pass:
node bot/scripts/stream-test/scenario.mjs
# ...or the 60fps MV variant:
MV_QUERY="4K 60fps MV" WATCH_SECONDS=30 node bot/scripts/stream-test/scenario.mjs

Recommended Chrome flags on the streamed display (avoids the "restore pages?" bubble after an unclean exit and keeps a single clean window):

google-chrome --remote-debugging-port=9222 --start-maximized \
  --hide-crash-restore-bubble --disable-session-crashed-bubble \
  --autoplay-policy=no-user-gesture-required <url>
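
Before starting a browse pass it can help to confirm that Chrome is actually listening on the debugging port. A minimal sketch, assuming Chrome runs on the same host and Node 18+ (for the global `fetch`): only CDP_PORT and its default of 9222 come from this README, the helper names are hypothetical, and `/json/version` is the DevTools HTTP endpoint Chrome exposes when started with --remote-debugging-port.

```javascript
// Hypothetical helpers; only CDP_PORT and the 9222 default come from this README.
function cdpEndpoint(env = process.env) {
  const port = Number(env.CDP_PORT || '9222');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid CDP_PORT: ${env.CDP_PORT}`);
  }
  return `http://127.0.0.1:${port}`; // assumption: Chrome is on the local host
}

// Asks Chrome for its version info; resolves to the JSON body (which includes
// webSocketDebuggerUrl) or rejects if nothing is listening on the port.
async function probeChrome(env = process.env) {
  const res = await fetch(`${cdpEndpoint(env)}/json/version`);
  if (!res.ok) throw new Error(`CDP probe failed: HTTP ${res.status}`);
  return res.json();
}
```

Calling `probeChrome()` first turns a missing or misconfigured Chrome into an immediate, readable error instead of a failure partway through the scenario.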

Smooth capture (VNC keepalive)

TigerVNC only refreshes its framebuffer while a VNC client is attached. The Discord broadcast reads the framebuffer with x11grab (not as a VNC client), so with no viewer attached the captured screen idles at ~1.5 fps and the stream looks badly choppy while the cursor still moves smoothly (x11grab overlays the live cursor each frame). SelfbotStreamer fixes this automatically: it keeps a tiny headless RFB client (vnc-keepalive.ts) connected for the life of the stream, requesting incremental updates at the stream framerate. Measured: 3/30 distinct frames without it, ~57/60 with it. The keepalive authenticates with VNC_PASSWORD (or the ~/.config/tigervnc/passwd file) and is fail-open.
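
The core of such a keepalive is the RFB FramebufferUpdateRequest message sent once per frame interval. The sketch below is not vnc-keepalive.ts itself: the function names are hypothetical, the RFB handshake and VNC authentication are left out, and `socket` is assumed to be an already-handshaken connection with a `write` method. The 10-byte message layout follows RFC 6143.

```javascript
// Builds the 10-byte RFB FramebufferUpdateRequest (message type 3, RFC 6143).
function framebufferUpdateRequest(width, height, incremental = true) {
  const buf = Buffer.alloc(10);
  buf.writeUInt8(3, 0);                   // message-type
  buf.writeUInt8(incremental ? 1 : 0, 1); // incremental flag
  buf.writeUInt16BE(0, 2);                // x-position
  buf.writeUInt16BE(0, 4);                // y-position
  buf.writeUInt16BE(width, 6);            // width
  buf.writeUInt16BE(height, 8);           // height
  return buf;
}

// Hypothetical loop: one incremental request per frame interval.
// Returns a function that stops the keepalive.
function startKeepalive(socket, { width, height, framerate }) {
  const request = framebufferUpdateRequest(width, height);
  const timer = setInterval(() => socket.write(request), Math.round(1000 / framerate));
  return () => clearInterval(timer);
}
```

Requesting incremental updates keeps the server refreshing its framebuffer while only changed regions are sent back, which the keepalive can simply discard.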

A/B framerate/resolution

Lower settings to compare what Discord actually delivers to viewers, e.g.:

VNC_RESOLUTION=1280x720 VNC_FRAMERATE=30 bun bot/scripts/stream-test/stream-hold.ts
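
To put numbers on an A/B pair, the two variables can be parsed and reduced to raw pixels per second. This is an illustrative sketch, not how stream-hold.ts reads its settings: the function name is hypothetical and the fallback values are placeholders, since the script's real defaults are not stated here.

```javascript
// Hypothetical parser; the fallback values are placeholders, not the project's real defaults.
function readStreamSettings(
  env = process.env,
  fallback = { width: 1920, height: 1080, framerate: 60 },
) {
  let { width, height, framerate } = fallback;
  if (env.VNC_RESOLUTION) {
    const match = /^(\d+)x(\d+)$/.exec(env.VNC_RESOLUTION);
    if (!match) {
      throw new Error(`VNC_RESOLUTION must look like 1280x720, got "${env.VNC_RESOLUTION}"`);
    }
    width = Number(match[1]);
    height = Number(match[2]);
  }
  if (env.VNC_FRAMERATE) {
    framerate = Number(env.VNC_FRAMERATE);
    if (!Number.isFinite(framerate) || framerate <= 0) {
      throw new Error(`VNC_FRAMERATE must be a positive number, got "${env.VNC_FRAMERATE}"`);
    }
  }
  // Raw pixel throughput the encoder has to handle at these settings.
  return { width, height, framerate, pixelsPerSecond: width * height * framerate };
}
```

With the command above, 1280x720 at 30 fps works out to 27,648,000 pixels per second, against 124,416,000 for 1920x1080 at 60 fps, a 4.5x difference in raw input to the encoder.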

Notes

  • Selfbot streaming violates Discord ToS; use a burner account.
  • Requires xdotool, an X display, and a system ffmpeg with x11grab/nvenc.
  • Prerequisites (playwright, system Chrome) are not bot dependencies; install them separately on the machine where you run the scenario.