javis_bot

Author	SHA1	Message	Date
javis-bot	b56c9c7721	Address remaining review items (queue, selfbot v6 API, ldconfig, resample) Some checks failed Release / semantic-release (push) Successful in 22s Details tests / Unit tests (Linux, Python 3.11) (push) Successful in 9m56s Details Release / build-linux (push) Failing after 7m15s Details Release / build-windows (push) Has been cancelled Details Release / build-macos (arm64, macos-latest) (push) Has been cancelled Details Release / build-macos (x64, macos-15-intel) (push) Has been cancelled Details Release / release-main (push) Has been cancelled Details Release / release-develop (push) Has been cancelled Details - voice.ts: reply playback is now a FIFO queue (AudioPlayerStatus.Idle drains it) so concurrent speakers no longer cut each other's replies off. - selfbot.ts: rewritten against the REAL @dank074/discord-video-stream v6 API (verified from its d.ts): prepareStream(input, opts, signal)->{command,output}, playStream(output, streamer, {type:"go-live"}, signal), Streamer.joinVoice. x11grab via customInputOptions; optional NVENC encode (RTX 5050) via exported `nvenc`. package.json pinned to ^6.0.0 (was a wrong ^4.2.1). - Dockerfile: dropped the hardcoded python3.12 LD_LIBRARY_PATH. faster-whisper >=1.1 self-locates the pip CUDA libs; ldconfig (full path, glob) registers them as a robust fallback. Verified: ld.so cache lists libcublas/libcudnn and GPU whisper works with LD_LIBRARY_PATH empty. - bridge: STT resample 48k->16k upgraded from nearest-neighbor to linear (np.interp). Verified: tsc clean, image builds, GPU whisper OK via ldconfig, compose valid.	2026-06-09 18:47:25 +09:00
javis-bot	964123682f	Review fixes: correct Piper TTS API + bot env gating Some checks failed Release / semantic-release (push) Successful in 21s Details tests / Unit tests (Linux, Python 3.11) (push) Successful in 9m53s Details Release / build-linux (push) Failing after 7m12s Details Release / build-windows (push) Has been cancelled Details Release / build-macos (arm64, macos-latest) (push) Has been cancelled Details Release / build-macos (x64, macos-15-intel) (push) Has been cancelled Details Release / release-main (push) Has been cancelled Details Release / release-develop (push) Has been cancelled Details Code review of the bridge/bot/docker work found: - TTS bug: bridge called PiperVoice.synthesize(text, wav) but that method returns AudioChunks and takes a SynthesisConfig as its 2nd arg, not a wav file -> TTS would fail. Switched to synthesize_wav(text, wav_file). Verified: produces a valid 22050Hz mono WAV. - run-bot.sh now waits if ANY of DISCORD_BOT_TOKEN/APP_ID/GUILD_ID is missing (config.ts throws on a missing one), preventing a supervisor crash-loop. Verified clean: discord.js Events.ClientReady == 'clientReady' (existing handler correct); image rebuilds.	2026-06-09 16:16:55 +09:00
javis-bot	0dbc0300d7	Enable GPU: LLM + Whisper on the RTX 5050, pick qwen3:8b Some checks failed Release / semantic-release (push) Successful in 19s Details tests / Unit tests (Linux, Python 3.11) (push) Successful in 9m54s Details Release / build-linux (push) Failing after 7m14s Details Release / build-windows (push) Has been cancelled Details Release / build-macos (arm64, macos-latest) (push) Has been cancelled Details Release / build-macos (x64, macos-15-intel) (push) Has been cancelled Details Release / release-main (push) Has been cancelled Details Release / release-develop (push) Has been cancelled Details GPU acceleration is now on by default and verified end-to-end on the Blackwell RTX 5050 (sm_120): - Ollama offloads 100% to GPU (log: library=CUDA compute=12.0, BLACKWELL_NATIVE_FP4=1). compose passes GPU via CDI (devices: nvidia.com/gpu=all) to both ollama and javis. - Whisper STT on GPU: faster-whisper>=1.1.0 + nvidia-cublas/cudnn cu12, LD_LIBRARY_PATH baked into the image. Verified float16 transcribe on sm_120; bridge auto-falls back to CPU when no GPU is present. - Model: default chat model -> qwen3:8b (best 8GB-VRAM tool-calling, ~5GB Q4). Embed stays nomic-embed-text. - README documents the host one-time setup (nvidia-container-toolkit + `nvidia-ctk cdi generate`) and GPU on/off. Verified: image builds; GPU visible in both containers via compose; ollama ps = 100% GPU; faster-whisper cuda OK + CPU fallback OK; bridge /health 200.	2026-06-09 15:49:21 +09:00
javis-bot	c4abf63f38	Add Discord-native hybrid front-end for Jarvis (bot + bridge) Some checks failed Release / semantic-release (push) Successful in 59s Details tests / Unit tests (Linux, Python 3.11) (push) Successful in 13m45s Details Release / build-linux (push) Failing after 7m47s Details Release / build-windows (push) Has been cancelled Details Release / build-macos (arm64, macos-latest) (push) Has been cancelled Details Release / build-macos (x64, macos-15-intel) (push) Has been cancelled Details Release / release-main (push) Has been cancelled Details Release / release-develop (push) Has been cancelled Details Transform isair/jarvis into a Discord-controlled voice assistant running on the Ubuntu VNC desktop, keeping the mature ~39k-line Python brain intact. - bot/ (Node + bun, discord.js): /자비스 slash commands (ephemeral), voice channel join + voice receive/playback, pluggable VNC screen broadcast (selfbot live / noVNC / screenshot) - bridge/ (Python, Flask): wraps jarvis STT + run_reply_engine + Piper TTS behind a thin localhost HTTP API - .env.example, scripts/ (start_bridge/start_bot/dev), README rewrite, docs/language-comparison.md and docs/vnc-xfce-setup.md Language decision: hybrid (Python brain + Node/bun Discord layer) because Discord blocks bot video; native screen broadcast only works via a Node selfbot library.	2026-06-09 14:51:05 +09:00

4 Commits