Dockerize: one-command stack with auto Ollama model pull

`docker compose up -d --build` now brings up the whole stack automatically,
with no host setup beyond Docker itself:

- All-in-one javis image: TigerVNC+XFCE desktop, Chrome, Python brain bridge,
  Node/bun bot, managed by supervisord (verified: all 6 programs RUNNING).
- ollama service + one-shot ollama-init that auto-pulls chat+embed models
  (verified end-to-end; `ollama list` shows pulled models).
- Discord token deferred: without DISCORD_BOT_TOKEN the desktop, bridge,
  Ollama and models all run; only the bot waits (no crash loop).
- Slim container deps (bridge/requirements-bridge.txt) drop the unused
  PyQt6/torch/chatterbox/sounddevice stack. Piper voice + Whisper models
  auto-download into named volumes.
- Configurable host ports (VNC_PORT/NOVNC_PORT/BRIDGE_PORT) to avoid clashing
  with a host VNC already on 5901. Bridge binds 0.0.0.0 in-container.

Verified: image builds; brain imports; bridge /health 200; noVNC 200;
X display :1 @1920x1080; auto-pull completes; supervisorctl status all RUNNING.
javis-bot
2026-06-09 15:27:41 +09:00
parent c4abf63f38
commit 25c77ac794
14 changed files with 448 additions and 4 deletions

.dockerignore Normal file

@@ -0,0 +1,18 @@
.git
.github
**/node_modules
bot/node_modules
.venv
**/__pycache__
**/*.pyc
.pytest_cache
*.db
*.sqlite
.env
.env.local
release_output.log
build
dist
tests
evals
docs/img


@@ -27,11 +27,22 @@ WHISPER_COMPUTE_TYPE=auto
TTS_PIPER_MODEL_PATH=
# ---------------------------------------------------------------------------
# Jarvis brain (Ollama-backed). See src/jarvis/config.py for the full list.
# Jarvis brain (Ollama-backed). In Docker these populate the rendered
# config (docker/jarvis-config.template.json). See src/jarvis/config.py.
# ---------------------------------------------------------------------------
# In docker-compose this is overridden to http://ollama:11434 automatically.
OLLAMA_BASE_URL=http://127.0.0.1:11434
# OLLAMA_CHAT_MODEL=...
# WHISPER_MODEL=...
OLLAMA_CHAT_MODEL=llama3.1:8b
OLLAMA_EMBED_MODEL=nomic-embed-text
WHISPER_MODEL=small
# ---------------------------------------------------------------------------
# Docker desktop (VNC) — used only by the container image
# ---------------------------------------------------------------------------
# VNC viewer password (max 8 chars effective). Watch the screen at localhost:5901.
VNC_PASSWORD=javis123
# Auto-opened page in the in-container Chrome.
CHROME_START_URL=about:blank
# ---------------------------------------------------------------------------
# VNC screen broadcast

Dockerfile Normal file

@@ -0,0 +1,51 @@
# ============================================================================
# Javis Bot — all-in-one container
# VNC + XFCE desktop + Chrome + Python brain bridge + Node/bun Discord bot.
# Ollama (the LLM backend) runs as a separate service (see docker-compose.yml).
# ============================================================================
FROM ubuntu:24.04
ENV DEBIAN_FRONTEND=noninteractive \
    LANG=C.UTF-8 \
    DISPLAY=:1 \
    PATH=/opt/venv/bin:/root/.bun/bin:/usr/local/bin:/usr/bin:/bin
# --- System packages: desktop, VNC, Chrome deps, ffmpeg, python, ocr ---
RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates curl wget gnupg unzip procps \
      tigervnc-standalone-server tigervnc-common tigervnc-tools \
      xfce4 xfce4-goodies dbus-x11 x11-utils xfonts-base \
      fonts-noto-cjk fonts-noto-cjk-extra fonts-nanum \
      ffmpeg tesseract-ocr \
      python3 python3-venv python3-pip \
      novnc websockify supervisor gettext-base \
    && rm -rf /var/lib/apt/lists/*
# --- Google Chrome (stable) ---
RUN wget -q -O /tmp/chrome.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb \
    && (apt-get update && apt-get install -y --no-install-recommends /tmp/chrome.deb || (apt-get -f install -y)) \
    && rm -f /tmp/chrome.deb && rm -rf /var/lib/apt/lists/*
# --- bun (Discord bot runtime/package manager) ---
RUN curl -fsSL https://bun.sh/install | bash
# --- Python brain/bridge deps (slim set) ---
COPY bridge/requirements-bridge.txt /app/bridge/requirements-bridge.txt
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir --upgrade pip \
    && /opt/venv/bin/pip install --no-cache-dir -r /app/bridge/requirements-bridge.txt
# --- Discord bot deps (cache layer on lockfile) ---
COPY bot/package.json bot/bun.lock /app/bot/
RUN cd /app/bot && bun install --frozen-lockfile || bun install
# --- App source ---
COPY . /app
WORKDIR /app
# --- Default Piper voice (best-effort at build; entrypoint retries if absent) ---
RUN bash docker/download-piper.sh || true
EXPOSE 5901 6080 8765
ENTRYPOINT ["/app/docker/entrypoint.sh"]


@@ -47,7 +47,41 @@ Discord ──voice / video / slash──▶ bot/ (Node + bun, discord.js
---
## Install & Run
## Run: Docker (recommended)
Runs everything in containers with no host setup. The VNC desktop, Chrome, the Python bridge, and the Node bot run in one container (`javis`); the LLM backend (Ollama) runs in a separate container. **Just bring the stack up and the Ollama models are pulled automatically.**
```bash
# Build and start. That is all there is to it.
docker compose up -d --build
```
A single `docker compose up` automatically:
- Starts the Ollama server, and `ollama-init` **auto-pulls** the chat and embedding models
- Starts the VNC+XFCE desktop, Chrome, and the Python bridge
- Downloads the Whisper STT model and the Piper TTS voice (cached in volumes)
View the screen: VNC viewer → `localhost:5901` (password = `VNC_PASSWORD` in `.env`, default `javis123`) or browser → `http://localhost:6080/vnc.html`.
Logs: `docker compose logs -f javis`.
### Add the Discord token last
Everything above works without a token (only the bot waits). When you are ready:
```bash
cp .env.example .env   # fill in DISCORD_BOT_TOKEN / DISCORD_APP_ID / DISCORD_GUILD_ID
docker compose up -d   # the bot starts and registers the /자비스 commands
```
Invoke it in Discord with `/자비스 join`. (To change a model such as `OLLAMA_CHAT_MODEL`, set it in `.env` and run `docker compose up -d`.)
- GPU (RTX 5050) acceleration: install nvidia-container-toolkit on the host, uncomment the GPU block in `docker-compose.yml`, and set `WHISPER_DEVICE=cuda` / `WHISPER_COMPUTE_TYPE=float16` in `.env`.
- Data (the memory DB), the Whisper cache, and the Piper voices persist in named volumes.
- Selfbot video-streaming dependencies are not included in the image by default. To use them, run `cd /app/bot && bun add discord.js-selfbot-v13 @dank074/discord-video-stream` in the container and restart (or add them to the Dockerfile).
---
## Run: manual (without Docker)
```bash
# 1) Environment variables


@@ -0,0 +1,27 @@
# Slim dependency set for the containerized brain bridge.
# Excludes the upstream desktop GUI / dictation / packaging / alternate-TTS
# stack (PyQt6, pyinstaller, sounddevice, webrtcvad, pynput, pygame,
# chatterbox-tts/torch, mlx) which are unused in the Discord+VNC deployment.
# --- Brain runtime (imported when the reply engine loads) ---
python-dotenv==1.0.1
faster-whisper==1.0.3
mcp==1.13.1
numpy<2.0.0
rapidfuzz==3.6.1
requests==2.32.3
# --- Bridge HTTP service ---
flask>=3.0.0
# --- Text-to-speech (Piper) ---
piper-tts>=1.3.0
# --- Built-in tools (lazily imported; needed for full functionality) ---
beautifulsoup4>=4.12.0
lxml>=4.9.0
html2text>=2020.1.16
geoip2==4.8.0
Pillow==10.4.0
pytesseract==0.3.13
faiss-cpu>=1.7.4

docker-compose.yml Normal file

@@ -0,0 +1,84 @@
# ============================================================================
# Javis Bot — Docker Compose
# ollama : the LLM backend for the jarvis brain
# ollama-init : one-shot, auto-pulls the chat + embed models on startup
# javis : all-in-one container (VNC desktop + Chrome + bridge + bot)
#
# Just bring it up — everything (incl. Ollama models) comes up automatically:
# docker compose up -d --build
#
# The Discord token can be added LAST: without it the desktop, brain bridge,
# Ollama and models all run; only the bot waits. Then put DISCORD_BOT_TOKEN in
# .env and re-run `docker compose up -d`.
#
# Watch the desktop: VNC viewer -> localhost:5901 (or browser -> localhost:6080)
# ============================================================================
services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    volumes:
      - ollama_models:/root/.ollama
    # --- GPU (optional): needs nvidia-container-toolkit on the host ---
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]
  # Auto-pull the models the brain needs, then exit. Idempotent (re-runnable).
  ollama-init:
    image: ollama/ollama:latest
    depends_on:
      - ollama
    restart: "no"
    environment:
      OLLAMA_HOST: http://ollama:11434
      CHAT_MODEL: ${OLLAMA_CHAT_MODEL:-llama3.1:8b}
      EMBED_MODEL: ${OLLAMA_EMBED_MODEL:-nomic-embed-text}
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        echo "[ollama-init] waiting for ollama server...";
        until ollama list >/dev/null 2>&1; do sleep 2; done;
        echo "[ollama-init] pulling $$CHAT_MODEL";
        ollama pull "$$CHAT_MODEL";
        echo "[ollama-init] pulling $$EMBED_MODEL";
        ollama pull "$$EMBED_MODEL";
        echo "[ollama-init] models ready.";
  javis:
    build: .
    restart: unless-stopped
    env_file:
      - path: .env
        required: false
    environment:
      # Point the brain at the ollama service and the bot at the in-container bridge.
      OLLAMA_BASE_URL: http://ollama:11434
      OLLAMA_CHAT_MODEL: ${OLLAMA_CHAT_MODEL:-llama3.1:8b}
      OLLAMA_EMBED_MODEL: ${OLLAMA_EMBED_MODEL:-nomic-embed-text}
      WHISPER_MODEL: ${WHISPER_MODEL:-small}
      BRIDGE_URL: http://127.0.0.1:8765
    depends_on:
      - ollama
    shm_size: "1gb" # Chrome needs a larger /dev/shm
    ports:
      # Host ports are overridable. If the HOST already runs VNC on 5901
      # (see docs/vnc-xfce-setup.md), set VNC_PORT=5902 in .env.
      - "${VNC_PORT:-5901}:5901" # VNC
      - "${NOVNC_PORT:-6080}:6080" # noVNC (open in a browser)
      - "${BRIDGE_PORT:-8765}:8765" # brain bridge (usually internal-only)
    volumes:
      - javis_data:/data # jarvis db + memory
      - whisper_cache:/root/.cache/huggingface # cached Whisper models
      - piper_voices:/opt/piper-voices # TTS voices
    # --- GPU (optional): mirror the ollama GPU block above to accelerate Whisper ---
volumes:
  ollama_models:
  javis_data:
  whisper_cache:
  piper_voices:
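
The `${VNC_PORT:-5901}` form in the port mappings above falls back to the default when the variable is unset or empty. The following is a minimal Python sketch of that rule, not Compose's own implementation; `resolve` and `port_mappings` are hypothetical helper names introduced only for illustration.

```python
def resolve(env: dict[str, str], name: str, default: str) -> str:
    """Mimic `${NAME:-default}`: use the default when NAME is unset or empty."""
    value = env.get(name, "")
    return value if value else default


def port_mappings(env: dict[str, str]) -> list[str]:
    """Build the three host:container mappings listed in the compose file."""
    return [
        f"{resolve(env, 'VNC_PORT', '5901')}:5901",      # VNC
        f"{resolve(env, 'NOVNC_PORT', '6080')}:6080",    # noVNC
        f"{resolve(env, 'BRIDGE_PORT', '8765')}:8765",   # brain bridge
    ]


# Only the host side changes; the container side stays fixed.
print(port_mappings({"VNC_PORT": "5902"}))
```

Because `.env` is also loaded into the container through `env_file`, a `BRIDGE_PORT` override there changes the host side of the mapping and is visible to `entrypoint.sh` as well, while the container side of the mapping and `BRIDGE_URL` stay at 8765. Whether the bridge itself reads `BRIDGE_PORT` is not shown in this diff, so that interaction is worth checking before overriding it.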

docker/download-piper.sh Executable file

@@ -0,0 +1,30 @@
#!/usr/bin/env bash
# Download the default Piper voice model if it is not already present.
# Used both at image build time and (as a fallback) at container start.
set -euo pipefail
VOICE="${PIPER_VOICE:-en_GB-alan-medium}"
DEST_DIR="${PIPER_VOICE_DIR:-/opt/piper-voices}"
BASE="https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"
# en_GB-alan-medium -> en/en_GB/alan/medium
lang2="${VOICE%%-*}" # en_GB
lang1="${lang2%%_*}" # en
rest="${VOICE#*-}" # alan-medium
name="${rest%%-*}" # alan
quality="${rest#*-}" # medium
path="${lang1}/${lang2}/${name}/${quality}"
mkdir -p "$DEST_DIR"
onnx="$DEST_DIR/${VOICE}.onnx"
json="$DEST_DIR/${VOICE}.onnx.json"
if [ -f "$onnx" ] && [ -f "$json" ]; then
  echo "[piper] voice already present: $onnx"
  exit 0
fi
echo "[piper] downloading voice $VOICE ..."
wget -q -O "$onnx" "${BASE}/${path}/${VOICE}.onnx"
wget -q -O "$json" "${BASE}/${path}/${VOICE}.onnx.json"
echo "[piper] saved to $onnx"
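
The parameter expansions in the script above split a voice name such as `en_GB-alan-medium` into the directory layout used on the download host. As a rough cross-check, the same mapping can be sketched in Python; `piper_voice_path` and `piper_voice_urls` are hypothetical names, and only the base URL is taken from the script.

```python
BASE = "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"


def piper_voice_path(voice: str) -> str:
    """Map 'en_GB-alan-medium' to 'en/en_GB/alan/medium'."""
    locale, rest = voice.split("-", 1)    # 'en_GB', 'alan-medium'
    name, quality = rest.split("-", 1)    # 'alan', 'medium'
    language = locale.split("_", 1)[0]    # 'en'
    return f"{language}/{locale}/{name}/{quality}"


def piper_voice_urls(voice: str) -> tuple[str, str]:
    """Return the (.onnx, .onnx.json) URLs the script would fetch."""
    prefix = f"{BASE}/{piper_voice_path(voice)}/{voice}"
    return f"{prefix}.onnx", f"{prefix}.onnx.json"


print(piper_voice_path("en_GB-alan-medium"))
```

One difference from the shell version: a name with fewer than two hyphens raises `ValueError` here, whereas the shell expansions would silently produce a malformed path.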

docker/entrypoint.sh Executable file

@@ -0,0 +1,42 @@
#!/usr/bin/env bash
# Container entrypoint: render config from env, set the VNC password, ensure the
# Piper voice exists, then hand off to supervisord (which runs the desktop,
# bridge, and bot).
set -euo pipefail
# --- Defaults (override via .env / compose) ---
: "${VNC_PASSWORD:=javis123}"
: "${VNC_RESOLUTION:=1920x1080}"
: "${OLLAMA_BASE_URL:=http://ollama:11434}"
: "${OLLAMA_CHAT_MODEL:=llama3.1:8b}"
: "${OLLAMA_EMBED_MODEL:=nomic-embed-text}"
: "${WHISPER_MODEL:=small}"
: "${WHISPER_DEVICE:=cpu}"
: "${WHISPER_COMPUTE_TYPE:=int8}"
: "${JARVIS_DB_PATH:=/data/jarvis.db}"
: "${BRIDGE_HOST:=0.0.0.0}"
: "${BRIDGE_PORT:=8765}"
: "${PIPER_VOICE:=en_GB-alan-medium}"
: "${PIPER_VOICE_DIR:=/opt/piper-voices}"
: "${TTS_PIPER_MODEL_PATH:=${PIPER_VOICE_DIR}/${PIPER_VOICE}.onnx}"
export VNC_RESOLUTION OLLAMA_BASE_URL OLLAMA_CHAT_MODEL OLLAMA_EMBED_MODEL \
  WHISPER_MODEL WHISPER_DEVICE WHISPER_COMPUTE_TYPE JARVIS_DB_PATH \
  PIPER_VOICE PIPER_VOICE_DIR TTS_PIPER_MODEL_PATH BRIDGE_HOST BRIDGE_PORT
mkdir -p /data /app/config "$(dirname "$JARVIS_DB_PATH")"
# --- VNC password file ---
mkdir -p /root/.vnc
echo "$VNC_PASSWORD" | tigervncpasswd -f > /root/.vnc/passwd
chmod 600 /root/.vnc/passwd
# --- Render jarvis brain config from template ---
envsubst < /app/docker/jarvis-config.template.json > /app/config/jarvis.json
export JARVIS_CONFIG_PATH=/app/config/jarvis.json
# --- Ensure the Piper voice exists (best effort) ---
bash /app/docker/download-piper.sh || echo "[entrypoint] piper download failed; TTS may be unavailable"
echo "[entrypoint] display=$DISPLAY ollama=$OLLAMA_BASE_URL whisper=$WHISPER_MODEL/$WHISPER_DEVICE"
exec supervisord -c /app/docker/supervisord.conf


@@ -0,0 +1,18 @@
{
  "db_path": "${JARVIS_DB_PATH}",
  "sqlite_vss_path": null,
  "ollama_base_url": "${OLLAMA_BASE_URL}",
  "ollama_embed_model": "${OLLAMA_EMBED_MODEL}",
  "ollama_chat_model": "${OLLAMA_CHAT_MODEL}",
  "tts_enabled": true,
  "tts_engine": "piper",
  "tts_piper_model_path": "${TTS_PIPER_MODEL_PATH}",
  "whisper_model": "${WHISPER_MODEL}",
  "whisper_backend": "faster-whisper",
  "whisper_device": "${WHISPER_DEVICE}",
  "whisper_compute_type": "${WHISPER_COMPUTE_TYPE}",
  "location_enabled": true,
  "web_search_enabled": true,
  "wikipedia_fallback_enabled": true,
  "mcps": {}
}
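
The entrypoint renders this template with `envsubst`, which is plain text substitution and does no JSON escaping. A minimal sketch of the same idea using Python's `string.Template` is shown below, mainly to illustrate why a rendered file is worth validating. The shortened template and the `render_config` helper are assumptions for illustration, and unlike `envsubst` (which replaces unset variables with an empty string) `Template.substitute` raises `KeyError` for a missing variable.

```python
import json
from string import Template

# Shortened stand-in for docker/jarvis-config.template.json.
TEMPLATE = """{
  "db_path": "${JARVIS_DB_PATH}",
  "ollama_base_url": "${OLLAMA_BASE_URL}",
  "whisper_model": "${WHISPER_MODEL}",
  "mcps": {}
}"""


def render_config(template: str, env: dict[str, str]) -> dict:
    """Substitute ${VAR} placeholders, then parse the result as JSON."""
    rendered = Template(template).substitute(env)
    # json.loads raises json.JSONDecodeError if a value broke the syntax,
    # for example a value containing an unescaped double quote.
    return json.loads(rendered)


config = render_config(TEMPLATE, {
    "JARVIS_DB_PATH": "/data/jarvis.db",
    "OLLAMA_BASE_URL": "http://ollama:11434",
    "WHISPER_MODEL": "small",
})
print(config["ollama_base_url"])
```

Parsing the rendered text right after substitution surfaces a malformed value at startup rather than when the brain first loads its config.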

docker/run-bot.sh Executable file

@@ -0,0 +1,22 @@
#!/usr/bin/env bash
# Wait for the brain bridge, then run the Discord bot.
#
# The Discord token is intentionally deferred: if DISCORD_BOT_TOKEN is not set
# yet, the rest of the stack (desktop, bridge, ollama) still runs fully. The bot
# just waits. Add the token to .env and `docker compose up -d` to start it.
set -e
cd /app/bot
if [ -z "${DISCORD_BOT_TOKEN:-}" ]; then
  echo "[bot] DISCORD_BOT_TOKEN is not set; the bot is waiting. Add the token to .env and run 'docker compose up -d' to start it."
  echo "[bot] (Meanwhile the VNC desktop, bridge, and Ollama keep running normally.)"
  exec sleep infinity
fi
BRIDGE="${BRIDGE_URL:-http://127.0.0.1:8765}"
for i in $(seq 1 60); do
  curl -fsS "$BRIDGE/health" >/dev/null 2>&1 && break
  sleep 1
done
bun run register || echo "[bot] slash command registration failed (continuing)"
exec bun run start
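
The script above polls the bridge's `/health` endpoint up to 60 times and then proceeds whether or not the bridge answered. The sketch below shows the same bounded-retry pattern in Python, returning a boolean so the caller can decide what to do on timeout. `bridge_healthy` and `wait_until` are hypothetical helpers, not part of this repository.

```python
import time
import urllib.request
from typing import Callable


def bridge_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True when GET <base_url>/health answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False


def wait_until(check: Callable[[], bool], attempts: int = 60, delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` after each failure."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False
```

A caller could run `wait_until(lambda: bridge_healthy("http://127.0.0.1:8765"))` and log a warning when it returns `False`. The `sleep` parameter is injectable only so the loop can be exercised without real delays.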

docker/run-chrome.sh Executable file

@@ -0,0 +1,14 @@
#!/usr/bin/env bash
# Wait for the desktop, then launch Chrome on :1 so the VNC screen shows a
# controllable browser (jarvis can also drive it). Runs as root -> --no-sandbox.
set -e
for i in $(seq 1 40); do
  xdpyinfo -display :1 >/dev/null 2>&1 && break
  sleep 1
done
sleep 3
export DISPLAY=:1
exec google-chrome \
  --no-sandbox --no-first-run --disable-dev-shm-usage \
  --password-store=basic --start-maximized \
  "${CHROME_START_URL:-about:blank}"

docker/run-xfce.sh Executable file

@@ -0,0 +1,12 @@
#!/usr/bin/env bash
# Wait for the X server, then start the XFCE session (with a dbus session).
set -e
for i in $(seq 1 30); do
  xdpyinfo -display :1 >/dev/null 2>&1 && break
  sleep 1
done
export DISPLAY=:1
export XDG_DATA_DIRS=/usr/local/share:/usr/share
export XDG_CONFIG_DIRS=/etc/xdg
# startxfce4 bails when X is already up; call the session directly.
exec dbus-launch --exit-with-session xfce4-session

docker/run-xvnc.sh Executable file

@@ -0,0 +1,10 @@
#!/usr/bin/env bash
# Start the TigerVNC X server on display :1.
# NOTE: do NOT pass `-extension RENDER` — it blanks XFCE menus/panels
# (see docs/vnc-xfce-setup.md §3-4).
set -e
: "${VNC_RESOLUTION:=1920x1080}"
exec /usr/bin/Xvnc :1 \
  -geometry "$VNC_RESOLUTION" -depth 24 \
  -rfbport 5901 -rfbauth /root/.vnc/passwd \
  -SecurityTypes VncAuth -localhost no -AlwaysShared

docker/supervisord.conf Normal file

@@ -0,0 +1,71 @@
[supervisord]
nodaemon=true
user=root
logfile=/var/log/supervisord.log
pidfile=/run/supervisord.pid
[unix_http_server]
file=/run/supervisor.sock
[supervisorctl]
serverurl=unix:///run/supervisor.sock
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:xvnc]
command=/app/docker/run-xvnc.sh
priority=100
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:xfce]
command=/app/docker/run-xfce.sh
priority=200
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:novnc]
command=websockify --web=/usr/share/novnc 6080 localhost:5901
priority=250
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:bridge]
command=/opt/venv/bin/python -m bridge.server
directory=/app
priority=300
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:chrome]
command=/app/docker/run-chrome.sh
priority=350
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:bot]
command=/app/docker/run-bot.sh
directory=/app/bot
priority=400
autorestart=true
startretries=999
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
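
In Supervisor, programs with a lower `priority` value are started first, and the default is 999. The sketch below reads a config in this format with Python's `configparser` and lists the resulting start order. The sample text is a trimmed copy of the file above and `start_order` is a hypothetical helper. Priority only orders the launches and does not wait for readiness, which is why the run scripts poll `xdpyinfo` and `/health` themselves.

```python
import configparser

SAMPLE = """
[program:bot]
command=/app/docker/run-bot.sh
priority=400

[program:xvnc]
command=/app/docker/run-xvnc.sh
priority=100

[program:bridge]
command=/opt/venv/bin/python -m bridge.server
priority=300

[program:xfce]
command=/app/docker/run-xfce.sh
priority=200
"""


def start_order(conf_text: str) -> list[str]:
    """Return program names sorted by Supervisor start priority (lowest first)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    programs = []
    for section in parser.sections():
        if section.startswith("program:"):
            name = section.split(":", 1)[1]
            # 999 is Supervisor's default when no priority is given.
            priority = parser.getint(section, "priority", fallback=999)
            programs.append((priority, name))
    return [name for _, name in sorted(programs)]


print(start_order(SAMPLE))
```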