javis_bot/docs/stream_browser_modes.md

# Real-time info modes (`STREAM_BROWSER`)

The bot answers via the Python brain (`bridge/server.py` -> `src/jarvis`). The reply
engine fetches real-time info by calling a tool, and `STREAM_BROWSER` selects which one:

- **true** (default): drive the on-screen Chrome (CDP at `CDP_PORT`, default 9222)
  to Google-search / play YouTube / read the page. The action is visible on the
  Go-Live broadcast. The browser is already up on the VNC display `:1`.
- **false**: use the Google Gemini API (grounded with Google Search) for
  real-time info. No screen share needed (voice + API only).
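
A minimal sketch of how the brain side might read the flag, assuming it arrives as a plain environment variable. The function name `read_stream_browser` and the accepted "false" spellings are hypothetical; the source only specifies `true` (default) and `false`.

```python
import os
from typing import Mapping, Optional

# Assumption: these spellings count as "false"; anything else keeps the default.
_FALSE_VALUES = {"0", "false", "no", "off"}


def read_stream_browser(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True for browser mode, False for Gemini mode (default: True)."""
    env = os.environ if env is None else env
    raw = env.get("STREAM_BROWSER")
    if raw is None or raw.strip() == "":
        return True  # unset or empty falls back to the documented default
    return raw.strip().lower() not in _FALSE_VALUES
```

Passing `env` explicitly keeps the parser testable without touching the real process environment.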

## Components

| Piece | Path | Status |
|---|---|---|
| Mode flag (bot) | `bot/src/config.ts` `screenBrowser`, enforced in `selfbot.ts` | done |
| Browser search core (Node/CDP) | `bot/scripts/stream-test/browse-search.mjs` | this change |
| Brain mode read | `src/jarvis/config.py` `stream_browser` from env | TODO |
| Gemini key/model | `GEMINI_API_KEY`, `GEMINI_MODEL` (.env) + `config.py` | scaffolded |
| `browseAndSearch` tool (true) | `src/jarvis/tools/builtin/browse_and_search.py` -> subprocess the Node core | TODO |
| `geminiSearch` tool (false) | `src/jarvis/tools/builtin/gemini_search.py` (REST, no new dep) | TODO |
| Registry (mode-gated) | `src/jarvis/tools/registry.py` `BUILTIN_TOOLS` | TODO |
| Specs + `docs/llm_contexts.md` | alongside each tool | TODO |
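
The mode-gated registry row could look roughly like the sketch below. It assumes the registry is a name-to-callable mapping, which the source does not confirm; `build_builtin_tools` and the two placeholder functions are hypothetical stand-ins for the real tools.

```python
from typing import Callable, Dict

ToolFn = Callable[[str], str]


def _browse_and_search(query: str) -> str:
    # Placeholder for the real browser tool (true mode).
    return f"[browser placeholder] {query}"


def _gemini_search(query: str) -> str:
    # Placeholder for the real Gemini tool (false mode).
    return f"[gemini placeholder] {query}"


def build_builtin_tools(stream_browser: bool) -> Dict[str, ToolFn]:
    """Register exactly one real-time info tool, chosen by the mode flag."""
    if stream_browser:
        return {"browseAndSearch": _browse_and_search}
    return {"geminiSearch": _gemini_search}
```

Registering only one of the two tools means the reply engine never sees a tool it cannot use in the current mode.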

## Design decisions

- The browser tool (Python) **runs a Node script as a subprocess** rather than adding a
  Python CDP/Playwright dependency: the Node layer already owns Chrome/CDP
  (`broadcast-helper.mjs`, `selfbot.ts`), so the brain shells out to
  `node browse-search.mjs <query>` and wraps the JSON result in the engine's
  `UNTRUSTED WEB EXTRACT` envelope. This keeps the 39k-line Python brain free of
  a new browser-automation dependency.
- Gemini uses the REST endpoint (`generativelanguage.googleapis.com`) via stdlib
  `urllib` with the `google_search` grounding tool - no SDK dependency.
- Both tools return the same `ToolExecutionResult(success, reply_text)` envelope shape
  as `webSearch`, so downstream synthesis is unchanged. The brain reads
  `STREAM_BROWSER` once at startup and registers the matching tool, so a mode
  change takes effect only after a restart.
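
A sketch of the subprocess design for the true-mode tool, under stated assumptions: the local `ToolExecutionResult` dataclass only mirrors the two fields named above, the exact delimiters of the `UNTRUSTED WEB EXTRACT` envelope are guessed, and the `command` and `timeout_s` parameters are hypothetical conveniences for testing.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class ToolExecutionResult:
    # Stand-in for the engine's real class; only the two documented fields.
    success: bool
    reply_text: str


NODE_SCRIPT = "bot/scripts/stream-test/browse-search.mjs"


def wrap_untrusted(payload: Any) -> str:
    """Wrap parsed JSON in an envelope (delimiter format is an assumption)."""
    body = json.dumps(payload, ensure_ascii=False, indent=2)
    return f"UNTRUSTED WEB EXTRACT\n{body}\nEND UNTRUSTED WEB EXTRACT"


def browse_and_search(
    query: str,
    command: Optional[List[str]] = None,
    timeout_s: float = 60.0,
) -> ToolExecutionResult:
    """Run the Node search core with the query as its last argument."""
    argv = list(command) if command is not None else ["node", NODE_SCRIPT]
    argv.append(query)  # passed as an argv item, never through a shell
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s, check=False
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return ToolExecutionResult(False, f"browser search failed: {exc}")
    if proc.returncode != 0:
        return ToolExecutionResult(
            False, f"browser search exited with code {proc.returncode}"
        )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return ToolExecutionResult(False, "browser search returned non-JSON output")
    return ToolExecutionResult(True, wrap_untrusted(payload))
```

Passing the query as a list element avoids shell quoting issues, and every failure path still returns the same envelope shape so the caller needs no special cases.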

## To finish / verify

- Provide `GEMINI_API_KEY` to build and verify the false-mode path (a real call is
  needed to confirm the grounding output).
- Wire up `config.py`, the two Python tools, and the registry, then update the specs
  and `docs/llm_contexts.md` (new Gemini LLM context).
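
For the false-mode path, a stdlib-only request could be sketched as follows. It assumes the `v1beta` `generateContent` REST route and the `x-goog-api-key` header; the function names are hypothetical, and the model name is left as a parameter because the source does not state a default for `GEMINI_MODEL`. The response parsing is untested against a live call, which is exactly what the first item above asks for.

```python
import json
import urllib.request
from typing import Any, Dict

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(query: str, api_key: str, model: str) -> urllib.request.Request:
    """Build a generateContent POST with the google_search grounding tool."""
    body = {
        "contents": [{"parts": [{"text": query}]}],
        "tools": [{"google_search": {}}],
    }
    return urllib.request.Request(
        f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text parts of the first candidate; empty string if none."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p["text"] for p in parts if "text" in p).strip()


def gemini_search(query: str, api_key: str, model: str, timeout_s: float = 30.0) -> str:
    """Perform the real network call; needs a valid GEMINI_API_KEY."""
    request = build_request(query, api_key, model)
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return extract_text(json.load(resp))
```

Keeping request building and response parsing separate from the network call lets both be checked offline before a key is available.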