javis_bot

Author	SHA1	Message	Date
javis-bot	ef6f6ff57d	feat(stream): STREAM_BROWSER flag + make toolbar-hide/subtitles broadcast-wide - Add STREAM_BROWSER (.env) gating screen-share/browser mode. false => the /자비스 stream command stays voice + API/MCP only (no Go-Live); true (default) => screen share as before. (Browser-driven info retrieval in true mode is a follow-up build; the bot has no browser-control tools yet.) - Make the two test-time fixes broadcast-wide defaults via broadcast-helper.mjs: it now also watches every tab for HTML5 fullscreen and toggles Chrome window fullscreen so the address bar is hidden for ANY video (xfwm4 won't hide it on 'f' alone), restoring on exit. Subtitles were already enforced per video. scenario.mjs drops its own fullscreen toggle and relies on the helper. - Revert the test-settings env vars from .env.example (not wanted). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 16:17:29 +09:00
javis-bot	f93b241575	fix(stream-test): restore audio after ads, enforce subtitle rule broadcast-wide, commit the 60fps MV path Addresses review of the ad/subtitle work (the ad-skip.mjs -> broadcast-helper.mjs rename's other half; the prior commit only recorded the deletion): - ad mute leak: the ad-skipper muted during an ad but never un-muted, so the main video stayed silent after the first ad. Save the pre-ad muted/playbackRate and restore them when the ad ends (verified: muted false -> true -> false). - captions were only applied once when scenario.mjs ran, not for the whole broadcast. The persistent helper now applies the rule (OFF by default, Korean ON if offered) per video and ENFORCES it every tick - one-shot did not hold because YouTube silently re-enabled captions (verified it stays off across 8s). - ad-skip + captions merged into broadcast-helper.mjs (one CDP process). - the 60fps MV test now lives in the repo: scenario.mjs gains MV_QUERY (search + auto-pick the first >=60fps result) and WATCH_SECONDS, plus the fullscreen-toolbar-hide fix. The broadcast runs via the committed stream-hold.ts (audio + keepalive), not an out-of-repo copy. - document the test env vars (CDP_PORT, HOLD_MS, TEST_*, MV_QUERY, WATCH_SECONDS). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 16:09:31 +09:00
javis-bot	208fbbc851	feat(selfbot): broadcast desktop audio + smart subtitles in the browse scenario Two broadcast-experience improvements: - Audio: the Go-Live stream was video-only. Capture the desktop sound (the default PipeWire/Pulse sink monitor, @DEFAULT_MONITOR@) as a second ffmpeg input and mux AAC into the mpegts; the library re-encodes it to Opus for Discord. Controlled by STREAM_AUDIO / STREAM_AUDIO_SOURCE (default on). ffmpeg inherits XDG_RUNTIME_DIR to reach the pulse socket. Verified: the streamer now reports "Found audio stream" and the monitor carries Chrome audio (~-11 dB). - Subtitles: in the browse scenario, default captions OFF, but auto-enable a Korean track when the video offers one (getOption captions tracklist -> setOption / unloadModule). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 15:50:32 +09:00
javis-bot	4176a68873	fix(selfbot): smooth VNC capture via keepalive + stop ffmpeg leak on stream end The Go-Live broadcast looked badly choppy: video and scrolling stuttered while the cursor stayed smooth. Root cause is TigerVNC: it only refreshes its framebuffer while a VNC client is attached, but the broadcast reads that framebuffer with x11grab (not as a VNC client). With no viewer attached the captured screen idled at ~1.5 fps (measured 3/30 distinct frames); the cursor looked smooth only because x11grab overlays the live cursor on every frame. - Add a headless RFB keepalive (vnc-keepalive.ts) that stays connected for the life of the stream and requests incremental framebuffer updates at the stream framerate. SelfbotStreamer starts it on broadcast start and tears it down on stop/self-end. Measured 3/30 -> 57/60 distinct frames at 60 fps. Fail-open; authenticates with VNC_PASSWORD or the ~/.config/tigervnc/passwd file. - Fix a resource leak: when the Go-Live ended on its own, only the active flag was cleared, leaving the x11grab->nvenc ffmpeg running forever (pinning a CPU core while no media was transmitted, with only the gateway TCP left and no UDP media). The self-end path now tears down capture, keepalive and voice like stop() does. - Tests for both paths (self-end teardown; keepalive DES auth, port mapping, password resolution). Add @types/bun so bun:test typechecks; document the keepalive and recommended Chrome flags in README and .env.example. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 15:21:44 +09:00
javis-bot	1e30a49562	fix: cap selfbot stream -maxrate at lib's 10 Mbps ceiling; add stream-test tooling - selfbot.ts: the @dank074 lib advertises a hardcoded max_bitrate of 10 Mbps to Discord (BaseMediaConnection: `max_bitrate: 10000 * 1000`). Our encoder used -maxrate = 1.5x target (12 Mbps at 8 Mbps target), so high-motion bursts exceeded the negotiated ceiling and WebRTC dropped packets (viewer stutter). Cap -maxrate at 10 Mbps. - Add bot/scripts/stream-test/: env-driven stream-hold.ts (persistent Go-Live holder), human.mjs (real xdotool mouse/keyboard + char-by-char typing), and scenario.mjs (YouTube/Naver browse). Channel/guild/video are env-parametrised. - .env.example: document DISCORD_VOICE_CHANNEL_ID for the stream-test scripts.	2026-06-10 12:50:24 +09:00
javis-bot	ad0caa8142	feat: 1080p60 NVENC selfbot broadcast (8 Mbps default) Bump the default broadcast to 1080p 60fps at 8 Mbps and route both encode stages through the GPU (RTX 5050, h264_nvenc) so 60fps stays smooth without loading the 4-core host. - selfbot.ts: capture ffmpeg uses h264_nvenc when streamHw is on (falls back to software x264 otherwise), and prepareStream now passes Encoders.nvenc() so the library's transcode runs on the GPU too. Guard loadLib for Encoders. - config.ts: VNC_FRAMERATE default 30 -> 60, VNC_BITRATE_KBPS 4000 -> 8000. - .env.example: document the new 1080p60/8 Mbps defaults and STREAM_HW. Verified locally: h264_nvenc x11grab holds a steady 60fps with headroom, Encoders.nvenc() returns valid h264_nvenc settings, and tsc --noEmit passes. Live Discord voice-channel verification pending a host reboot.	2026-06-10 11:17:44 +09:00
javis-bot	0dbc0300d7	Enable GPU: LLM + Whisper on the RTX 5050, pick qwen3:8b GPU acceleration is now on by default and verified end-to-end on the Blackwell RTX 5050 (sm_120): - Ollama offloads 100% to GPU (log: library=CUDA compute=12.0, BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI (devices: nvidia.com/gpu=all) to both ollama and javis. - Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12, LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on sm_120; bridge auto-falls back to CPU when no GPU is present. - Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling, ~5GB Q4). Embed stays nomic-embed-text. - README documents the host one-time setup (nvidia-container-toolkit + `nvidia-ctk cdi generate`) and GPU on/off. Verified: image builds; GPU visible in both containers via compose; ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK; bridge /health 200.	2026-06-09 15:49:21 +09:00
javis-bot	25c77ac794	Dockerize: one-command stack with auto Ollama model pull `docker compose up -d --build` now brings up the whole stack automatically, with no host setup needed: - All-in-one javis image: TigerVNC+XFCE desktop, Chrome, Python brain bridge, Node/bun bot, managed by supervisord (verified: all 6 programs RUNNING). - ollama service + one-shot ollama-init that auto-pulls chat+embed models (verified end-to-end; `ollama list` shows pulled models). - Discord token deferred: without DISCORD_BOT_TOKEN the desktop, bridge, Ollama and models all run; only the bot waits (no crash loop). - Slim container deps (bridge/requirements-bridge.txt) drop the unused PyQt6/torch/chatterbox/sounddevice stack. Piper voice + Whisper models auto-download into named volumes. - Configurable host ports (VNC_PORT/NOVNC_PORT/BRIDGE_PORT) to avoid clashing with a host VNC already on 5901. Bridge binds 0.0.0.0 in-container. Verified: image builds; brain imports; bridge /health 200; noVNC 200; X display :1 @1920x1080; auto-pull completes; supervisorctl status all RUNNING.	2026-06-09 15:27:41 +09:00
javis-bot	c4abf63f38	Add Discord-native hybrid front-end for Jarvis (bot + bridge) Transform isair/jarvis into a Discord-controlled voice assistant running on the Ubuntu VNC desktop, keeping the mature ~39k-line Python brain intact. - bot/ (Node + bun, discord.js): /자비스 slash commands (ephemeral), voice channel join + voice receive/playback, pluggable VNC screen broadcast (selfbot live / noVNC / screenshot) - bridge/ (Python, Flask): wraps jarvis STT + run_reply_engine + Piper TTS behind a thin localhost HTTP API - .env.example, scripts/ (start_bridge/start_bot/dev), README rewrite, docs/language-comparison.md and docs/vnc-xfce-setup.md Language decision: hybrid (Python brain + Node/bun Discord layer) because Discord blocks bot video; native screen broadcast only works via a Node selfbot library.	2026-06-09 14:51:05 +09:00

9 Commits
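
The bitrate arithmetic in commit 1e30a49562 (a 1.5x burst on an 8 Mbps target gives 12 Mbps, above the 10 Mbps ceiling the library advertises) can be sketched as below. This is only an illustration under stated assumptions: `capMaxrateKbps`, `LIB_MAX_BITRATE_KBPS` and the `burstFactor` parameter are hypothetical names, not identifiers from selfbot.ts, and the actual change may be structured differently.

```typescript
// Hypothetical helper illustrating the -maxrate cap described in commit 1e30a49562.
// Names are illustrative assumptions; they do not come from selfbot.ts.

// The commit message quotes the library's advertised ceiling as
// `max_bitrate: 10000 * 1000` bits per second, i.e. 10,000 kbps.
const LIB_MAX_BITRATE_KBPS = 10_000;

// Compute the encoder's peak bitrate: allow bursts above the target,
// but never exceed the ceiling negotiated with Discord.
function capMaxrateKbps(targetKbps: number, burstFactor = 1.5): number {
  const uncapped = Math.round(targetKbps * burstFactor);
  return Math.min(uncapped, LIB_MAX_BITRATE_KBPS);
}

console.log(capMaxrateKbps(8000)); // 10000 (uncapped would be 12000)
console.log(capMaxrateKbps(4000)); // 6000 (already under the ceiling)
```

At the 8000 kbps default the cap is what applies, which matches the message's claim that high-motion bursts previously exceeded the negotiated ceiling; at the older 4000 kbps default the 1.5x burst stays under it.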