
Self-Hosting AI on Your Laptop

Demo stack for the DevOpsDays Austin self-hosting talk. This brings up a local AI stack in Docker with no cloud dependencies. Welcome to freedom...

What's included

Service      URL                       Description
Open WebUI   http://localhost:8080     ChatGPT-like interface for local models
Perplexica   http://localhost:3000     AI-powered web search (like Perplexity)
SearXNG      http://localhost:8888     Private meta-search engine
Dockhand     http://localhost:3100     Docker container management UI
OpenCode     Terminal                  AI coding assistant backed by local models
Ollama API   http://localhost:11434    LLM inference backend (native on M-series, containerized on Intel/Linux)
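
Once the stack is running, a quick way to confirm every HTTP service is answering is to loop over the ports above with curl (nothing assumed here beyond the URLs in the table):

for url in http://localhost:8080 http://localhost:3000 http://localhost:8888 http://localhost:3100 http://localhost:11434; do
  # -s silences progress output, -o /dev/null drops the body, -w prints only the HTTP status code
  echo "$url -> $(curl -s -o /dev/null -w '%{http_code}' "$url")"
done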

Prerequisites

All platforms:

  • Docker Desktop (or Docker Engine on Linux)
  • Node.js 18+ (for OpenCode)
  • 20+ GB free disk space for models

Apple Silicon Mac only:

  • Ollama installed natively via the Mac app (Docker containers on macOS can't use the Apple GPU, so inference runs on the host)

Linux only:

  • NVIDIA driver and the NVIDIA Container Toolkit, so the containerized Ollama can use the GPU


Quick Start — Apple Silicon (M1/M2/M3/M4/M5)

1. Install and configure Ollama

Download and install Ollama for Mac, then configure it to accept connections from Docker containers:

# Allow Ollama to listen on all interfaces (not just localhost)
launchctl setenv OLLAMA_HOST "0.0.0.0"

Restart the Ollama app from the menu bar after running that command. Verify it's working:

ollama list
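
To double-check that containers will be able to reach Ollama, hit the API from a throwaway container. On Docker Desktop the host is reachable as host.docker.internal (a quick sketch, assuming that alias is available):

# should return the JSON list of models you've pulled
docker run --rm curlimages/curl -s http://host.docker.internal:11434/api/tags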

2. Pull models

ollama pull llama3.2
ollama pull mistral
ollama pull nomic-embed-text

  • llama3.2 (~2 GB) — chat model for Open WebUI
  • mistral (~4.1 GB) — chat model for Perplexica and OpenCode
  • nomic-embed-text (~275 MB) — embeddings for Open WebUI RAG

Start these early in the talk so they're ready for the live demo at the end.
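
If you'd rather kick all three off with a single command, a simple loop works (sequential, so the downloads don't compete for bandwidth):

for model in llama3.2 mistral nomic-embed-text; do
  ollama pull "$model"
done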

3. Pull and start the stack

cd mac-arm64
docker compose pull
docker compose up -d
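
Once it's up, a quick status check confirms every container started and stayed running:

docker compose ps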

4. Set up OpenCode

npm install -g opencode-ai
mkdir -p ~/.config/opencode
cp ../opencode/opencode.json ~/.config/opencode/opencode.json

5. Open the UIs

  • Open WebUI: http://localhost:8080
  • Perplexica: http://localhost:3000
  • Dockhand: http://localhost:3100


Quick Start — Intel Mac

1. Tune Docker Desktop memory

Open Docker Desktop → Settings → Resources:

  • Memory: 10 GB+ (8 GB minimum)
  • CPUs: 4+
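
After applying the settings, you can confirm what the Docker VM actually got (NCPU and MemTotal are standard docker info fields):

# prints the CPUs and memory (in bytes) allocated to the Docker Desktop VM
docker info --format '{{.NCPU}} CPUs, {{.MemTotal}} bytes of RAM'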

2. Pull and start the stack

cd mac-intel
docker compose pull
docker compose up -d

3. Pull models

docker exec ollama ollama pull llama3.2:3b-instruct-q4_K_M
docker exec ollama ollama pull mistral
docker exec ollama ollama pull nomic-embed-text

  • llama3.2:3b-instruct-q4_K_M (~2 GB) — chat model for Open WebUI
  • mistral (~4.1 GB) — chat model for Perplexica and OpenCode
  • nomic-embed-text (~275 MB) — embeddings for Open WebUI RAG
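
You can confirm the downloads landed by listing the models inside the container:

docker exec ollama ollama list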

4. Set up OpenCode

npm install -g opencode-ai
mkdir -p ~/.config/opencode
cp ../opencode/opencode.json ~/.config/opencode/opencode.json

5. Open the UIs

  • Open WebUI: http://localhost:8080
  • Perplexica: http://localhost:3000
  • Dockhand: http://localhost:3100


Quick Start — x86 Linux (NVIDIA GPU)

1. Pull and start the stack

cd linux-nvidia
docker compose pull
docker compose up -d
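
Before pulling models, it's worth confirming the container can actually see the GPU. With the NVIDIA Container Toolkit in place, nvidia-smi is available inside the container:

docker exec ollama nvidia-smi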

2. Pull models

docker exec ollama ollama pull llama3.1:8b
docker exec ollama ollama pull nomic-embed-text

  • llama3.1:8b (~4.7 GB) — chat model for Open WebUI, Perplexica, and OpenCode
  • nomic-embed-text (~275 MB) — embeddings for Open WebUI RAG
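
After your first chat request, you can check that the model is running on the GPU rather than falling back to CPU (ollama ps shows loaded models and where they're running):

docker exec ollama ollama ps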

3. Set up OpenCode

npm install -g opencode-ai
mkdir -p ~/.config/opencode
cp ../opencode/opencode.json ~/.config/opencode/opencode.json

Linux users: edit ~/.config/opencode/opencode.json and change the default model:

"model": "ollama-local/llama3.1:8b"

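If you prefer to make that change non-interactively, a jq one-liner works (a sketch, assuming jq is installed; it changes only the model key and leaves the rest of the file alone):

jq '.model = "ollama-local/llama3.1:8b"' ~/.config/opencode/opencode.json > /tmp/opencode.json \
  && mv /tmp/opencode.json ~/.config/opencode/opencode.json
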
4. Open the UIs

  • Open WebUI: http://localhost:8080
  • Perplexica: http://localhost:3000
  • Dockhand: http://localhost:3100


Using the services

Open WebUI — Select a model from the dropdown and start chatting. Enable web search via the search icon in the chat bar. No login required (auth disabled for demo).

Perplexica — AI-powered search that cites sources. Uses SearXNG under the hood — no API keys needed. On first launch, go to Settings and set the chat model to mistral (M-series / Intel Mac) or llama3.1:8b (Linux) before running any searches.

Dockhand — Browse and manage all running containers. On first load, click the "No environments" dropdown in the top bar and select Local Docker to activate the pre-configured environment.

OpenCode — Terminal AI coding assistant. Navigate to any project directory and run opencode. Uses mistral by default (or llama3.1:8b on Linux). Best for single-file edits and explanations — 7B models are hit-or-miss on multi-step agentic tasks.

Stopping the stack

# From inside the platform directory you started the stack in
docker compose down

To also delete all data volumes (full reset):

docker compose down -v
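
Note that on Apple Silicon the models live in the native Ollama install, so -v won't reclaim that space. If you want it back, remove the models with the ollama CLI:

ollama rm llama3.2
ollama rm mistral
ollama rm nomic-embed-text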

Tips

Start pulls early: Kick off docker compose pull and your ollama pull commands at the beginning of the talk so everything is ready for the live demo at the end.

Slow first response: The first request after startup loads the model into memory (~30 seconds). Subsequent requests are much faster.
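
If you want to hide that pause during the demo, you can pre-load a model ahead of time; sending a generate request with no prompt makes Ollama load the model into memory without producing any output:

curl http://localhost:11434/api/generate -d '{"model": "mistral"}'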

Out of memory (Intel Mac / Linux): If containers crash, use a smaller model. llama3.2:3b-instruct-q4_K_M is the most conservative option.
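
A quick way to see which container is actually eating the memory before you swap models:

docker stats --no-stream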

Ollama not reachable from containers (M-series): Make sure you ran launchctl setenv OLLAMA_HOST "0.0.0.0" and restarted the Ollama app. You can verify with curl http://localhost:11434 from your terminal.


Taking it further — NixOS

The nixOS/ folder contains illustrative NixOS configurations for four common self-hosting use cases. These aren't meant to be run directly — they're reference material for the talk showing how the same services you're running in Docker Compose can be expressed declaratively in NixOS.

Host                 Role
hosts/ai-server      GPU-accelerated Ollama + Open WebUI
hosts/media-server   Jellyfin + Immich photo management
hosts/home-gateway   AdGuard DNS blocking + Caddy reverse proxy
hosts/cloud-vps      Public blog, SSO, and Tailscale mesh VPN

A shared modules/common.nix applies SSH hardening, user config, and firewall defaults to every host automatically — no repetition.