javis_bot

Author	SHA1	Message	Date
javis-bot	208fbbc851	feat(selfbot): broadcast desktop audio + smart subtitles in the browse scenario Two broadcast-experience improvements: - Audio: the Go-Live stream was video-only. Capture the desktop sound (the default PipeWire/Pulse sink monitor, @DEFAULT_MONITOR@) as a second ffmpeg input and mux AAC into the mpegts; the library re-encodes it to Opus for Discord. Controlled by STREAM_AUDIO / STREAM_AUDIO_SOURCE (default on). ffmpeg inherits XDG_RUNTIME_DIR to reach the pulse socket. Verified: the streamer now reports "Found audio stream" and the monitor carries Chrome audio (~-11 dB). - Subtitles: in the browse scenario, default captions OFF, but auto-enable a Korean track when the video offers one (getOption captions tracklist -> setOption / unloadModule). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 15:50:32 +09:00
javis-bot	4176a68873	fix(selfbot): smooth VNC capture via keepalive + stop ffmpeg leak on stream end The Go-Live broadcast looked badly choppy: video and scrolling stuttered while the cursor stayed smooth. Root cause is TigerVNC: it only refreshes its framebuffer while a VNC client is attached, but the broadcast reads that framebuffer with x11grab (not as a VNC client). With no viewer attached the captured screen idled at ~1.5 fps (measured 3/30 distinct frames); the cursor looked smooth only because x11grab overlays the live cursor on every frame. - Add a headless RFB keepalive (vnc-keepalive.ts) that stays connected for the life of the stream and requests incremental framebuffer updates at the stream framerate. SelfbotStreamer starts it on broadcast start and tears it down on stop/self-end. Measured 3/30 -> 57/60 distinct frames at 60 fps. Fail-open; authenticates with VNC_PASSWORD or the ~/.config/tigervnc/passwd file. - Fix a resource leak: when the Go-Live ended on its own, only the active flag was cleared, leaving the x11grab->nvenc ffmpeg running forever (pinning a CPU core while no media was transmitted, with only the gateway TCP left and no UDP media). The self-end path now tears down capture, keepalive and voice like stop() does. - Tests for both paths (self-end teardown; keepalive DES auth, port mapping, password resolution). Add @types/bun so bun:test typechecks; document the keepalive and recommended Chrome flags in README and .env.example. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 15:21:44 +09:00
javis-bot	1e30a49562	fix: cap selfbot stream -maxrate at lib's 10 Mbps ceiling; add stream-test tooling - selfbot.ts: the @dank074 lib advertises a hardcoded max_bitrate of 10 Mbps to Discord (BaseMediaConnection: `max_bitrate: 10000 * 1000`). Our encoder used -maxrate = 1.5x target (12 Mbps at 8 Mbps target), so high-motion bursts exceeded the negotiated ceiling and WebRTC dropped packets (viewer stutter). Cap -maxrate at 10 Mbps. - Add bot/scripts/stream-test/: env-driven stream-hold.ts (persistent Go-Live holder), human.mjs (real xdotool mouse/keyboard + char-by-char typing), and scenario.mjs (YouTube/Naver browse). Channel/guild/video are env-parametrised. - .env.example: document DISCORD_VOICE_CHANNEL_ID for the stream-test scripts.	2026-06-10 12:50:24 +09:00
javis-bot	7a148f8caa	fix: don't unlock active in startup catch when a newer attempt owns it The startup catch cleared this.active unconditionally. In a stop()+restart race during the slow login/pauses, the first attempt's catch would fire after the second start() had already taken the lock, unlocking it mid-startup and letting a third start() race in. Guard the active/state reset with `this.controller === controller`, matching the field-null and playStream .finally guards. Verified live: stop during login then restart keeps the restart's lock (active stays true), and it clears to false only once truly stopped; no crash.	2026-06-10 12:10:01 +09:00
javis-bot	2fd5e0fe9e	chore: lengthen humanised selfbot startup delays Join/go-live still felt a touch fast. Widen the pauses: ~2.5-4.5s after coming online before joining voice, ~6-10s after joining before Go Live.	2026-06-10 11:47:31 +09:00
javis-bot	2c7f0a95b5	fix: make humanised selfbot startup abort- and concurrency-safe The human-pause delays leave start() in-flight for several seconds, which exposed two races: - stop() during a pause only ended the pause; start() continued and called joinVoice on the streamer stop() had already nulled (null deref). - `active` was set only just before go-live, so a second /stream during the delay passed the guard and both calls raced on the same overwritten streamer. Now start() locks `active` before any await, keeps controller/streamer/capture as local refs, and calls signal.throwIfAborted() after each await so an interleaved stop() unwinds into a catch that tears down via the local refs and clears instance state only if it still points at this attempt. isActive() now reflects "starting" during the delay too. Verified live: concurrent start is rejected ("이미 송출 중입니다", i.e. "already broadcasting"), stop() mid-startup returns a cancel message with isActive=false and no uncaught error, and the happy path still goes live and tears down cleanly. tsc --noEmit passes.	2026-06-10 11:42:57 +09:00
javis-bot	b6cf05f6cf	feat: humanise selfbot voice-join and go-live pacing Joining voice and starting the broadcast instantly looks like a bot. Add randomised, human-plausible pauses (~0.9-2.2s after coming online before joining the channel, ~2.5-5s after joining before hitting Go Live) so the cadence isn't machine-instant or fingerprintable. The pause resolves immediately on stop() so teardown never hangs mid-wait. Verified live: end-to-end join -> settle -> Go Live took ~8s before the stream went live, held for 15s, and tore down cleanly. tsc --noEmit passes.	2026-06-10 11:37:39 +09:00
javis-bot	40fd7dbb59	fix: single-pass NVENC encode for selfbot stream (no double encode) Address review: the capture ffmpeg had no -b:v, so it encoded at nvenc's low default (~2.47 Mbps) and the library then re-encoded to 8 Mbps, which only upscaled already-lost detail. The double encode also kept CPU decode + scale + re-encode in the library, contradicting the "GPU handles it" claim. Now the system ffmpeg produces the final Discord-ready H264 in one pass (-b:v/-maxrate at the configured bitrate, -bf 0, 1s keyframes, yuv420p, -forced-idr) and prepareStream uses noTranscoding:true to remux only. One GPU encode, no library decode/scale/re-encode. Verified locally: high-motion source fills 8.7 Mbps at these args (vs the ~2.47 Mbps no-bitrate default), real :1 desktop holds 60fps at realtime, and the capture -> copy/remux chain yields h264 1920x1080 yuv420p 60fps has_b_frames=0. tsc --noEmit passes. Live Discord test pending reboot.	2026-06-10 11:23:52 +09:00
javis-bot	ad0caa8142	feat: 1080p60 NVENC selfbot broadcast (8 Mbps default) Bump the default broadcast to 1080p 60fps at 8 Mbps and route both encode stages through the GPU (RTX 5050, h264_nvenc) so 60fps stays smooth without loading the 4-core host. - selfbot.ts: capture ffmpeg uses h264_nvenc when streamHw is on (falls back to software x264 otherwise), and prepareStream now passes Encoders.nvenc() so the library's transcode runs on the GPU too. Guard loadLib for Encoders. - config.ts: VNC_FRAMERATE default 30 -> 60, VNC_BITRATE_KBPS 4000 -> 8000. - .env.example: document the new 1080p60/8 Mbps defaults and STREAM_HW. Verified locally: h264_nvenc x11grab holds a steady 60fps with headroom, Encoders.nvenc() returns valid h264_nvenc settings, and tsc --noEmit passes. Live Discord voice-channel verification pending a host reboot.	2026-06-10 11:17:44 +09:00
javis-bot	5137fdeaf7	selfbot streaming: verified live; capture via system ffmpeg x11grab End-to-end verified with a real burner token + voice channel: login OK, posts to the text channel, joins voice, and Go-Live streams the host :1 desktop. - selfbot.ts now captures the X display with the SYSTEM ffmpeg (reliable x11grab) and pipes it into prepareStream, instead of relying on the lib's bundled libav input devices (not portable). Capture process is killed on stop. - package.json: trustedDependencies (node-av, @lng2004/node-datachannel) so the native streaming deps build automatically on bun install (incl. Docker). - Dropped the unused nvenc path (the lib's exported `nvenc` is undefined at runtime); software H264 encode for now.	2026-06-10 10:38:28 +09:00
javis-bot	b56c9c7721	Address remaining review items (queue, selfbot v6 API, ldconfig, resample) - voice.ts: reply playback is now a FIFO queue (AudioPlayerStatus.Idle drains it) so concurrent speakers no longer cut each other's replies off. - selfbot.ts: rewritten against the REAL @dank074/discord-video-stream v6 API (verified from its d.ts): prepareStream(input, opts, signal)->{command,output}, playStream(output, streamer, {type:"go-live"}, signal), Streamer.joinVoice. x11grab via customInputOptions; optional NVENC encode (RTX 5050) via exported `nvenc`. package.json pinned to ^6.0.0 (was a wrong ^4.2.1). - Dockerfile: dropped the hardcoded python3.12 LD_LIBRARY_PATH. faster-whisper >=1.1 self-locates the pip CUDA libs; ldconfig (full path, glob) registers them as a robust fallback. Verified: ld.so cache lists libcublas/libcudnn and GPU whisper works with LD_LIBRARY_PATH empty. - bridge: STT resample 48k->16k upgraded from nearest-neighbor to linear (np.interp). Verified: tsc clean, image builds, GPU whisper OK via ldconfig, compose valid.	2026-06-09 18:47:25 +09:00
javis-bot	c4abf63f38	Add Discord-native hybrid front-end for Jarvis (bot + bridge) Transform isair/jarvis into a Discord-controlled voice assistant running on the Ubuntu VNC desktop, keeping the mature ~39k-line Python brain intact. - bot/ (Node + bun, discord.js): /자비스 slash commands (ephemeral), voice channel join + voice receive/playback, pluggable VNC screen broadcast (selfbot live / noVNC / screenshot) - bridge/ (Python, Flask): wraps jarvis STT + run_reply_engine + Piper TTS behind a thin localhost HTTP API - .env.example, scripts/ (start_bridge/start_bot/dev), README rewrite, docs/language-comparison.md and docs/vnc-xfce-setup.md Language decision: hybrid (Python brain + Node/bun Discord layer) because Discord blocks bot video; native screen broadcast only works via a Node selfbot library.	2026-06-09 14:51:05 +09:00
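
The ownership guard described in 2c7f0a95b5 and 7a148f8caa (take the lock before any await, and let a failed attempt clear state only if it still owns the lock) can be sketched roughly as below. This is a minimal illustration, not the code in selfbot.ts: `StartLock`, `Attempt`, `begin`, `stop` and `release` are hypothetical names, and the plain `Attempt` object stands in for the AbortController that the commits compare with `this.controller === controller`.

```typescript
// Hypothetical sketch of the start()/stop() ownership guard.
// Attempt stands in for the per-start AbortController mentioned in the commits.
interface Attempt {
  aborted: boolean;
}

class StartLock {
  private active = false;
  private current: Attempt | null = null;

  // Take the lock synchronously (before any await) so that a second
  // start during the human-pause delay is rejected instead of racing.
  begin(): Attempt | null {
    if (this.active) return null;
    this.active = true;
    this.current = { aborted: false };
    return this.current;
  }

  // stop(): abort whichever attempt currently owns the lock and release it.
  stop(): void {
    if (this.current) this.current.aborted = true;
    this.current = null;
    this.active = false;
  }

  // Called from an attempt's catch path. A stale attempt (one that was
  // stopped and then superseded by a newer start) must not unlock the
  // newer attempt, so state is cleared only on an identity match.
  release(attempt: Attempt): boolean {
    if (this.current !== attempt) return false;
    this.current = null;
    this.active = false;
    return true;
  }

  isActive(): boolean {
    return this.active;
  }
}
```

In this sketch a stop-then-restart sequence leaves the restart's lock intact: the first attempt's late `release` returns false because `current` already points at the second attempt.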

12 Commits
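
The bitrate cap from 1e30a49562 comes down to simple arithmetic: at an 8 Mbps target, a 1.5x burst allowance asks for 12 Mbps, which exceeds the 10 Mbps the library advertises to Discord. A minimal sketch of that clamp follows. The helper names `maxrateKbps` and `rateArgs` are hypothetical, and keeping the 1.5x factor below the ceiling is an assumption (the commit only states that -maxrate is capped at 10 Mbps). `-b:v` and `-maxrate` are standard ffmpeg options.

```typescript
// Ceiling the library advertises to Discord, per the commit message
// (`max_bitrate: 10000 * 1000` bits/s), expressed here in kbps.
const LIB_MAX_BITRATE_KBPS = 10000;

// Hypothetical helper: burst allowance of 1.5x the target, clamped to the
// negotiated ceiling. Keeping the 1.5x factor at all is an assumption.
function maxrateKbps(targetKbps: number, burstFactor: number = 1.5): number {
  return Math.min(Math.round(targetKbps * burstFactor), LIB_MAX_BITRATE_KBPS);
}

// Hypothetical helper: the rate-control portion of the ffmpeg argument list.
function rateArgs(targetKbps: number): string[] {
  return ["-b:v", `${targetKbps}k`, "-maxrate", `${maxrateKbps(targetKbps)}k`];
}
```

With the 8000 kbps default this yields `-b:v 8000k -maxrate 10000k`; at the earlier 4000 kbps default the clamp is inactive and the burst limit stays at 6000 kbps. A target above the ceiling would need its own clamp, which this sketch leaves out.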