javis_bot/.env.example
javis-bot c420d5da53 feat(stream): true-mode browser-action core + Gemini scaffold + mode design
First increment of the STREAM_BROWSER real-time-info modes (true = browser,
false = Gemini):

- browse-search.mjs: drives the on-screen Chrome via CDP so the action shows on
  the broadcast. `search` returns the top Google results (title/url/snippet);
  `youtube` plays the first result. Verified live: real-time Seoul weather
  results, and IU 'Good Day' MV playback.
- .env.example: GEMINI_API_KEY / GEMINI_MODEL for the false-mode Gemini account.
- docs/stream_browser_modes.md: architecture + integration map (brain config,
  the two mode-gated tools, registry, design decisions) for the remaining wiring.

The Python brain wiring (config.py mode/gemini fields, browseAndSearch +
geminiSearch tools, registry, specs, llm_contexts) lands next. Verifying it
needs a running brain and a Gemini key, so it is held back rather than
committed untested into the 39k-line engine.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 16:36:35 +09:00

# ============================================================================
# Javis Bot — environment configuration
# Copy to `.env` and fill in. Never commit your real `.env`.
# ============================================================================
# ---------------------------------------------------------------------------
# Discord bot (normal bot account) — voice I/O + slash commands
# ---------------------------------------------------------------------------
# From https://discord.com/developers/applications → your app
DISCORD_BOT_TOKEN=
DISCORD_APP_ID=
# The (single) server this bot serves. Guild-scoped commands appear instantly.
DISCORD_GUILD_ID=
# Voice channel used by the stream-test scripts (bot/scripts/stream-test).
DISCORD_VOICE_CHANNEL_ID=
# ---------------------------------------------------------------------------
# Brain bridge (Python service in bridge/) — STT + reply engine + TTS
# ---------------------------------------------------------------------------
BRIDGE_URL=http://127.0.0.1:8765
BRIDGE_HOST=127.0.0.1
BRIDGE_PORT=8765
JARVIS_BRAIN_ENABLED=1
JARVIS_TTS_ENABLED=1
# faster-whisper device/compute. GPU by default (RTX 5050 / sm_120, verified).
# Falls back to CPU automatically if no GPU is passed to the container.
WHISPER_DEVICE=cuda
WHISPER_COMPUTE_TYPE=float16
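# Sketch of an explicit CPU-only setup, for hosts where the automatic fallback
# is not wanted. Assumption: both values are handed to faster-whisper unchanged;
# int8 is its usual CPU compute type.
# WHISPER_DEVICE=cpu
# WHISPER_COMPUTE_TYPE=int8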
# Optional explicit Piper voice model (.onnx). If empty, the jarvis default is used.
TTS_PIPER_MODEL_PATH=
# ---------------------------------------------------------------------------
# Jarvis brain (Ollama-backed). In Docker these populate the rendered
# config (docker/jarvis-config.template.json). See src/jarvis/config.py.
# ---------------------------------------------------------------------------
# In docker-compose this is overridden to http://ollama:11434 automatically.
OLLAMA_BASE_URL=http://127.0.0.1:11434
# qwen3:8b — best 8GB-VRAM pick: strongest tool-calling, ~5GB Q4, fits the RTX 5050.
OLLAMA_CHAT_MODEL=qwen3:8b
OLLAMA_EMBED_MODEL=nomic-embed-text
# faster-whisper model size used for STT (pairs with WHISPER_DEVICE above).
WHISPER_MODEL=small
# ---------------------------------------------------------------------------
# Docker desktop (VNC) — used only by the container image
# ---------------------------------------------------------------------------
# VNC viewer password (max 8 chars effective). Watch the screen at localhost:5901.
# Also used by the broadcast keepalive: TigerVNC only refreshes its framebuffer
# while a VNC client is attached, so the stream keeps a tiny client connected to
# avoid a choppy (~1.5 fps) capture. Must match the VNC server's password. If
# unset, the keepalive falls back to the obfuscated passwd file (VNC_PASSWD_FILE,
# default ~/.config/tigervnc/passwd).
VNC_PASSWORD=javis123
# VNC_PASSWD_FILE=/home/claude/.config/tigervnc/passwd
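# Sketch of creating the obfuscated passwd file by hand. Assumptions: TigerVNC's
# vncpasswd is installed and the default path above is in use; 'your-pass' is a
# placeholder.
#   printf '%s\n' 'your-pass' | vncpasswd -f > ~/.config/tigervnc/passwd
#   chmod 600 ~/.config/tigervnc/passwd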
# Auto-opened page in the in-container Chrome.
CHROME_START_URL=about:blank
# ---------------------------------------------------------------------------
# Screen-share + browser mode.
# true = the bot may go Live (screen-share the VNC desktop) and drive the
# on-screen browser for real-time info (search / play / read screen).
# false = no screen share; voice only, real-time info via the Gemini API.
# ---------------------------------------------------------------------------
STREAM_BROWSER=true
# Gemini account (used for real-time info when STREAM_BROWSER=false). Get a key
# at https://aistudio.google.com/app/apikey and paste it here.
GEMINI_API_KEY=
GEMINI_MODEL=gemini-2.0-flash
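# Sketch of a voice-only setup: with the screen share off, real-time questions
# go through Gemini, so the three values above would read as follows. The key
# shown is a placeholder, not a real credential.
# STREAM_BROWSER=false
# GEMINI_API_KEY=<your-key>
# GEMINI_MODEL=gemini-2.0-flash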
# ---------------------------------------------------------------------------
# VNC screen broadcast
# selfbot = real live "Go Live" stream (needs a USER/burner token; ToS risk)
# novnc = share a noVNC browser link (safe, real-time, not native)
# screenshot = periodic screenshots to the channel (safe, low fps)
# none = disabled
# ---------------------------------------------------------------------------
STREAM_BACKEND=selfbot
# The VNC desktop runs on X display :1 (see docs/vnc-xfce-setup.md)
VNC_DISPLAY=:1
VNC_RESOLUTION=1920x1080
# 1080p60 broadcast. 8 Mbps suits 60fps (YouTube-style 1080p60 sits ~8-12 Mbps);
# drop to 30/4000 for a lighter stream. Max bitrate is 1.5x this value.
VNC_FRAMERATE=60
VNC_BITRATE_KBPS=8000
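# Worked example of the 1.5x rule above: 8000 kbps gives a 12000 kbps ceiling.
# The lighter preset mentioned above (ceiling 1.5 x 4000 = 6000 kbps) would be:
# VNC_FRAMERATE=30
# VNC_BITRATE_KBPS=4000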
# --- selfbot backend ---
# A THROWAWAY/burner Discord user account token. NEVER your main account.
# Using a selfbot violates Discord ToS and can get the account banned.
DISCORD_SELFBOT_TOKEN=
# Hardware (NVENC) encode for the stream. 1 = use the GPU (recommended for
# 1080p60), 0 = software x264. Requires an NVIDIA GPU + ffmpeg built with nvenc.
STREAM_HW=1
# Capture desktop audio into the broadcast so the stream has sound. 1 = on,
# 0 = mute. Pulls the PipeWire/Pulse monitor of the default sink; override the
# source with STREAM_AUDIO_SOURCE (e.g. a specific "<sink>.monitor").
STREAM_AUDIO=1
STREAM_AUDIO_SOURCE=@DEFAULT_MONITOR@
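# Sketch of an explicit override. The sink name below is hypothetical; on a
# PulseAudio or pipewire-pulse host, `pactl list short sources` should list the
# real ".monitor" sources to choose from.
# STREAM_AUDIO_SOURCE=alsa_output.example-sink.monitor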
# --- novnc backend ---
# e.g. http://192.168.10.9:6080/vnc.html (websockify --web=/usr/share/novnc 6080 localhost:5901)
NOVNC_URL=
# --- screenshot backend ---
SCREENSHOT_INTERVAL_SEC=5
# ---------------------------------------------------------------------------
# Voice behaviour
# ---------------------------------------------------------------------------
# Silence (ms) that marks the end of an utterance before sending to the brain.
VOICE_SILENCE_MS=800