Initial commit: _deploy_app skill

Deploy new apps or push updates to existing deployments via
Docker Compose + Caddy + Gitea webhooks. Multi-server profiles,
auto-detection of deployment status, full infrastructure provisioning.

- SKILL.md: 715-line workflow documentation
- scripts/detect_deployment.py: deployment status detection
- scripts/validate_compose.py: compose file validation
- references/: infrastructure, compose patterns, Caddy patterns
- assets/: Makefile and compose templates
- config.json: mew server profile
Author: Darren Neese
Date: 2026-03-25 21:12:30 -04:00
Commit: 994332a3f0
11 changed files, 3006 insertions, 0 deletions

.gitignore (vendored) -- new file, 3 lines
__pycache__/
*.pyc
.DS_Store

SKILL.md -- new file, 715 lines
---
name: _deploy_app
description: Deploy a new app or push updates to an existing deployment. Detects deployment status, provisions infrastructure (Gitea repo, DNS, Caddy, webhook), and deploys Docker Compose stacks. Supports multiple servers via profiles. Use when the user says "deploy this", "push to production", "deploy to mew", or invokes "/_deploy_app".
---
# /_deploy_app
Deploy a new application to a production server or push updates to an existing deployment. Detect deployment status automatically, provision all required infrastructure (Gitea repo, Cloudflare DNS, Caddy reverse proxy, webhook auto-deploy), and bring Docker Compose stacks online.
## When to Use
- Deploy a new app to a production server for the first time
- Push updates to an existing deployment (triggers auto-deploy via webhook)
- Check deployment status of the current project
- Trigger phrases: "deploy this", "push to production", "deploy to mew", `/_deploy_app`
## Prerequisites
| Requirement | Details |
|-------------|---------|
| SSH access | `ssh mew` (or target server alias) must work without password prompt |
| Gitea secrets | `~/.claude/secrets/gitea.json` with `token`, `url`, `owner`, `ssh_host` |
| Cloudflare secrets | `~/.claude/secrets/cloudflare.json` with `token`, `zones` (domain-to-zone_id map) |
| Docker Compose | Project must have `docker-compose.yaml` (or this skill helps generate one) |
| Webhook secret | Stored in `~/.claude/secrets/gitea.json` as `webhook_secret` |
If any secrets file is missing, stop and tell the user which file is needed and what keys it must contain.
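The prerequisite check can be sketched as a small preflight helper. This is a sketch only: the file names and required keys are taken from the table above, and `missing_secrets` is a hypothetical helper name, not part of the skill's scripts.

```python
# Preflight check for the secrets files listed in the Prerequisites table.
import json
import os

REQUIRED = {
    "gitea.json": ["token", "url", "owner", "ssh_host", "webhook_secret"],
    "cloudflare.json": ["token", "zones"],
}

def missing_secrets(secrets_dir):
    """Return {filename: problem} for every secrets file that is absent
    or lacks a required key."""
    problems = {}
    for fname, keys in REQUIRED.items():
        path = os.path.join(secrets_dir, fname)
        if not os.path.exists(path):
            problems[fname] = "file missing"
            continue
        with open(path) as f:
            data = json.load(f)
        absent = [k for k in keys if k not in data]
        if absent:
            problems[fname] = "missing keys: " + ", ".join(absent)
    return problems
```

If the returned dict is non-empty, report exactly which file and keys are needed, then stop.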
## Workflow Overview
```mermaid
flowchart TD
A[📋 Load Server Profile] --> B[🔍 Detect Project]
B --> C{🐍 Run detect_deployment.py}
C -->|Not Deployed| D[🆕 New Deploy Workflow]
C -->|Already Deployed| E[🔄 Redeploy Workflow]
D --> D1[Ensure Compose + .env] --> D2[Create Gitea Repo]
D2 --> D3[Git Init + Push] --> D4[Create DNS Record]
D4 --> D5[Clone on Server] --> D6[Add to Deploy Map]
D6 --> D7[Add Caddy Site Block] --> D8[Docker Compose Up]
D8 --> D9[Add Webhook] --> F
E --> E1[Commit Changes] --> E2[Git Push]
E2 --> E3[Wait for Webhook] --> F
F[✅ Verify & Report]
style D fill:#c8e6c9
style E fill:#bbdefb
style F fill:#fff9c4
```
---
## Step 1: Load Server Profile
Read `~/.claude/skills/_deploy_app/config.json` to determine the target server.
### If `config.json` exists
1. Read the file and parse the `active_profile` key.
2. If `--profile=name` was passed, use that profile instead. Error if it does not exist.
3. Display a summary:
> **🎯 Deployment target:** {profile.name}
> **🌐 Domain:** *.{profile.domain}
> **🖥️ Server:** {profile.ssh_host} ({profile.ssh_user}@{profile.ssh_host})
> **🔀 Proxy:** {profile.proxy_type}
> **📁 Deploy path:** {profile.deploy_path}
4. Ask: "Proceed with this profile, or switch to another?"
### If `config.json` is missing
Walk the user through creating their first profile:
| # | Question | Default |
|---|----------|---------|
| 1 | Profile name (short ID, e.g. `mew`) | _(required)_ |
| 2 | Description (human-readable) | _(required)_ |
| 3 | SSH host alias (e.g. `mew`) | _(required)_ |
| 4 | SSH user | `darren` |
| 5 | Server hostname (for local detection) | _(required)_ |
| 6 | Server IP (for DNS A records) | _(required)_ |
| 7 | Wildcard domain (e.g. `lavender.spl.tech`) | _(required)_ |
| 8 | Deploy path on server | `/srv/git` |
| 9 | Proxy type (`caddy` or `none`) | `caddy` |
| 10 | Caddy compose path _(skip if none)_ | `/data/docker/caddy` |
| 11 | Caddy container name _(skip if none)_ | `caddy` |
| 12 | Docker proxy network name | `proxy` |
| 13 | Gitea host (e.g. `git.lavender-daydream.com`) | _(from secrets)_ |
Write `config.json`, confirm, then proceed.
### Profile variables reference
| Variable | Description | Example |
|----------|-------------|---------|
| `{profile.name}` | Human-readable name | `Mew Server` |
| `{profile.ssh_host}` | SSH alias or hostname | `mew` |
| `{profile.ssh_user}` | SSH login user | `darren` |
| `{profile.server_hostname}` | Actual hostname (for local detection) | `mew` |
| `{profile.server_ip}` | Public IP address | `155.94.170.136` |
| `{profile.domain}` | Wildcard domain | `lavender.spl.tech` |
| `{profile.deploy_path}` | Root path for deployed repos | `/srv/git` |
| `{profile.proxy_type}` | `caddy` or `none` | `caddy` |
| `{profile.caddy_compose_path}` | Caddy's docker-compose directory | `/data/docker/caddy` |
| `{profile.caddy_container}` | Caddy container name | `caddy` |
| `{profile.proxy_network}` | Docker network for proxy traffic | `proxy` |
### Execution context detection
Determine whether commands run locally or remotely:
```bash
current_host=$(hostname)
```
- If `current_host` matches `{profile.server_hostname}` --> **local execution** (run commands directly)
- If no match --> **remote execution** (wrap commands in `ssh {profile.ssh_user}@{profile.ssh_host} "command"`)
Store this decision as `run_on_server` for use throughout the workflow.
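A minimal sketch of this decision (the profile fields are the ones from the profile variables reference above; `build_command` is a hypothetical helper name):

```python
# Decide local vs remote execution and build the final command string.
def build_command(command, hostname, profile):
    """Run locally when the hostname matches the profile's server_hostname,
    otherwise wrap the command in ssh."""
    if hostname == profile["server_hostname"]:
        return command                      # local execution
    escaped = command.replace('"', '\\"')   # escape double quotes for ssh
    return f'ssh {profile["ssh_user"]}@{profile["ssh_host"]} "{escaped}"'
```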
---
## Step 2: Detect Project
Scan the current working directory to gather project metadata.
1. **Detect app name** -- use the directory basename, lowercased and hyphenated.
2. **Check for git remote** -- if `origin` exists, extract the repo name from the URL.
3. **Detect project type** -- look for these files (in order):
- `docker-compose.yaml` / `docker-compose.yml` / `compose.yaml`
- `Dockerfile`
- `package.json`
- `requirements.txt` / `pyproject.toml`
- `go.mod`
4. **Determine container port** -- parse the compose file for `ports:` mapping or `EXPOSE` in Dockerfile. Default to `3000` if not detectable.
5. **Determine container name** -- from the compose file's main service, or `{app_name}` as fallback.
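The detection order above can be sketched as follows (a sketch only; `detect_project` and the type labels are illustrative, not part of the skill's scripts):

```python
# Project detection: lowercased/hyphenated app name plus first-match type.
import os

MARKERS = [
    (("docker-compose.yaml", "docker-compose.yml", "compose.yaml"), "compose"),
    (("Dockerfile",), "dockerfile"),
    (("package.json",), "node"),
    (("requirements.txt", "pyproject.toml"), "python"),
    (("go.mod",), "go"),
]

def detect_project(path):
    """Return (app_name, project_type) for the given directory."""
    base = os.path.basename(os.path.abspath(path))
    app_name = base.lower().replace(" ", "-").replace("_", "-")
    for files, ptype in MARKERS:
        if any(os.path.exists(os.path.join(path, f)) for f in files):
            return app_name, ptype
    return app_name, "unknown"
```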
---
## Step 3: Check Deployment Status
Run the detection script to determine if this app is already deployed:
```bash
python3 ~/.claude/skills/_deploy_app/scripts/detect_deployment.py \
--repo-name {owner}/{app_name} \
--config ~/.claude/skills/_deploy_app/config.json
```
Parse the JSON output:
| Field | Meaning |
|-------|---------|
| `deployed` | `true` if the app exists on the server |
| `gitea_repo_exists` | `true` if the Gitea repo exists |
| `dns_exists` | `true` if the DNS record exists |
| `caddy_configured` | `true` if Caddy has a site block |
| `webhook_exists` | `true` if the Gitea webhook is configured |
| `container_running` | `true` if the Docker container is up |
**If `deployed` is true** --> go to [Step 4b: Redeploy Workflow](#step-4b-redeploy-workflow).
**If `deployed` is false** --> go to [Step 4a: New Deploy Workflow](#step-4a-new-deploy-workflow).
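Parsing and branching on the script's output might look like this (sketch only; the field names come from the table above, the function names are hypothetical):

```python
# Branch on detect_deployment.py output and list what still needs provisioning.
import json

CHECKS = ["gitea_repo_exists", "dns_exists", "caddy_configured",
          "webhook_exists", "container_running"]

def choose_workflow(raw_json):
    """Return 'redeploy' or 'new-deploy' based on the `deployed` field."""
    status = json.loads(raw_json)
    return "redeploy" if status.get("deployed") else "new-deploy"

def missing_resources(status):
    """List which resources the new-deploy workflow still needs to create."""
    return [c for c in CHECKS if not status.get(c)]
```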
---
## Step 4a: New Deploy Workflow
Execute all substeps in order. Each substep is idempotent -- skip if the resource already exists.
### 4a.1: Confirm Domain
Propose a default domain and ask the user to confirm or override:
> **Proposed domain:** `{app_name}.{profile.domain}`
> Accept this domain, or provide a different one?
Store the confirmed domain as `{domain}`.
### 4a.2: Ensure docker-compose.yaml
If no compose file exists in the project:
1. Read `~/.claude/skills/_compose/references/proxy-patterns.md` for Caddy patterns.
2. Generate a compose file appropriate for the detected project type.
3. **MUST** include the proxy network as an external network:
```yaml
networks:
  proxy:
    external: true

services:
  {app_name}:
    # ... service config ...
    networks:
      - proxy
      - default
```
4. Pin all image versions -- never use `latest`.
5. Set `restart: unless-stopped` on all services.
### 4a.3: Ensure .env
If `.env` does not exist:
1. Generate with real random secrets:
```bash
openssl rand -hex 16
```
2. Include all required environment variables with sensible defaults.
3. Group by section with comments (e.g. `# === Database ===`).
### 4a.4: Ensure .env.example
Copy `.env` and replace all secret values with descriptive placeholders:
```
DB_PASSWORD=changeme-use-a-strong-password
SECRET_KEY=generate-with-openssl-rand-hex-32
```
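Deriving `.env.example` from `.env` can be sketched with a name-based heuristic (an assumption, not the skill's actual implementation; variables that look secret-like get placeholder values, and the result should still be reviewed manually):

```python
# Blank secret-like values while preserving comments, blanks, and layout.
SECRET_HINTS = ("PASSWORD", "SECRET", "TOKEN", "KEY")

def to_example(env_text):
    out = []
    for line in env_text.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            name, _, value = line.partition("=")
            if any(h in name.upper() for h in SECRET_HINTS):
                value = "changeme-see-deployment-docs"
            out.append(f"{name}={value}")
        else:
            out.append(line)
    return "\n".join(out)
```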
### 4a.5: Ensure Makefile
Copy from `~/.claude/skills/_deploy_app/assets/Makefile.template` if it exists, otherwise generate:
```makefile
.PHONY: up down logs pull restart ps
up:
	docker compose up -d
down:
	docker compose down
logs:
	docker compose logs -f
pull:
	docker compose pull
restart:
	docker compose restart
ps:
	docker compose ps
```
### 4a.6: Ensure .gitignore
At minimum, include:
```
.env
*.log
```
**⚠️ IMPORTANT:** Do NOT modify an existing `.gitignore` without explicit user permission (per global guardrails).
### 4a.7: Validate Compose File
Run the compose validation script:
```bash
python3 ~/.claude/skills/_compose/scripts/validate-compose.py \
./docker-compose.yaml --strict
```
Fix all errors before proceeding. Review warnings and fix where appropriate.
### 4a.8: Create Gitea Repository
Read the Gitea token from secrets and create the repo:
```bash
GITEA_TOKEN=$(python3 -c "import json; print(json.load(open('$HOME/.claude/secrets/gitea.json'))['token'])")
GITEA_URL=$(python3 -c "import json; print(json.load(open('$HOME/.claude/secrets/gitea.json'))['url'])")
GITEA_OWNER=$(python3 -c "import json; print(json.load(open('$HOME/.claude/secrets/gitea.json'))['owner'])")
curl -s -X POST -H "Authorization: token $GITEA_TOKEN" \
  "$GITEA_URL/api/v1/user/repos" \
  -H "Content-Type: application/json" \
  -d '{"name": "{app_name}", "private": false, "auto_init": false}'
```
If the repo already exists (HTTP 409), skip this step.
### 4a.9: Git Init and Push
```bash
git init -b main
git add -A
git commit -m "Initial commit"
git remote add origin git@{gitea_ssh_host}:{owner}/{app_name}.git
git push -u origin main
```
If git is already initialized, add the remote (if missing) and push.
### 4a.10: Create Cloudflare DNS Record
Read the Cloudflare token and zone ID, then create an A record:
```bash
CF_TOKEN=$(python3 -c "import json; print(json.load(open('$HOME/.claude/secrets/cloudflare.json'))['token'])")
```
Determine the zone ID by matching the domain's root against the `zones` map in `cloudflare.json`.
**Check if record already exists:**
```bash
curl -s -H "Authorization: Bearer $CF_TOKEN" \
"https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records?type=A&name={domain}"
```
**If no record exists, create one:**
```bash
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records" \
-H "Authorization: Bearer $CF_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"A","name":"{domain}","content":"{profile.server_ip}","ttl":1,"proxied":false}'
```
- `proxied: false` -- Caddy handles TLS via Let's Encrypt; Cloudflare proxy would interfere.
- `ttl: 1` -- automatic TTL.
**If record exists but points to the wrong IP**, update it with PUT.
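The zone lookup mentioned above (matching the domain's root against the `zones` map) can be sketched as a longest-suffix match, so `app.lavender.spl.tech` prefers the `lavender.spl.tech` zone over `spl.tech` (`find_zone_id` is an illustrative name):

```python
# Longest-suffix match of a domain against the zones map in cloudflare.json.
def find_zone_id(domain, zones):
    best = None
    for zone_name, zone_id in zones.items():
        if domain == zone_name or domain.endswith("." + zone_name):
            if best is None or len(zone_name) > len(best[0]):
                best = (zone_name, zone_id)
    if best is None:
        raise ValueError(f"no zone in cloudflare.json matches {domain}")
    return best[1]
```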
### 4a.11: Clone Repository on Server
Use the `run_on_server` helper:
```bash
# On server:
cd {profile.deploy_path} && git clone git@{gitea_ssh_host}:{owner}/{app_name}.git
```
If the directory already exists, pull instead:
```bash
# On server:
cd {profile.deploy_path}/{app_name} && git pull origin main
```
### 4a.12: Add to Deploy Map
Read the current deploy map, add the new entry, and write back:
```bash
# On server:
jq '. + {"{owner}/{app_name}": "{profile.deploy_path}/{app_name}"}' \
  /etc/deploy-listener/deploy-map.json \
  | sudo tee /etc/deploy-listener/deploy-map.json.tmp \
  && sudo mv /etc/deploy-listener/deploy-map.json.tmp /etc/deploy-listener/deploy-map.json
```
If `/etc/deploy-listener/deploy-map.json` does not exist, create it with just this entry.
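An equivalent sketch of the update in Python, run on the server (it mirrors the `jq`/`tee`/`mv` pipeline above, starting from an empty map when the file is missing; `add_to_deploy_map` is an illustrative name):

```python
# Add a repo -> deploy-path entry to the deploy map, writing atomically.
import json
import os

def add_to_deploy_map(map_path, repo, deploy_dir):
    data = {}
    if os.path.exists(map_path):
        with open(map_path) as f:
            data = json.load(f)
    data[repo] = deploy_dir
    tmp = map_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f, indent=2)
    os.replace(tmp, map_path)   # atomic rename on POSIX
```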
### 4a.13: Add Caddy Site Block
**⚠️ Skip this step if `{profile.proxy_type}` is `none`.** Note to the user: "Reverse proxy is set to `none` -- configure your own proxy to point to this stack."
Append a new site block to the Caddyfile on the server:
```
# === {App Name} ===
{domain} {
    encode zstd gzip
    reverse_proxy {container_name}:{port}
}
```
Where `{container_name}` is the main app container and `{port}` is its internal port.
**Caddyfile location:** `{profile.caddy_compose_path}/Caddyfile`
After appending, restart Caddy:
```bash
# On server:
cd {profile.caddy_compose_path} && docker compose restart {profile.caddy_container}
```
### 4a.14: Deploy the Stack
On the server, bring up the Docker Compose stack:
```bash
# On server:
cd {profile.deploy_path}/{app_name}
```
**If a Dockerfile exists** (custom build):
```bash
docker compose up -d --build
```
**If only pre-built images** (no Dockerfile):
```bash
docker compose pull && docker compose up -d
```
### 4a.15: Add Gitea Webhook
Create a push webhook so future `git push` events trigger auto-deploy:
```bash
WEBHOOK_SECRET=$(python3 -c "import json; print(json.load(open('$HOME/.claude/secrets/gitea.json'))['webhook_secret'])")
curl -s -X POST -H "Authorization: token $GITEA_TOKEN" \
  "$GITEA_URL/api/v1/repos/{owner}/{app_name}/hooks" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "gitea",
    "active": true,
    "branch_filter": "main master",
    "config": {
      "url": "https://deploy.{profile.domain}/webhook",
      "content_type": "json",
      "secret": "'"$WEBHOOK_SECRET"'"
    },
    "events": ["push"]
  }'
```
### 4a.16: Verify Deployment
Wait at least 10 seconds for TLS certificate provisioning (first-time issuance via Let's Encrypt can take longer), then verify:
**HTTP check:**
```bash
curl -s -o /dev/null -w "%{http_code}" https://{domain}/
```
| Code | Meaning |
|------|---------|
| `200` | ✅ Deployment verified |
| `301`/`302` | ✅ Redirect -- likely working (SSL or app redirect) |
| `502` | ❌ Caddy cannot reach container -- check proxy network and container status |
| `0` / timeout | ❌ DNS not propagated or Caddy not restarted |
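The table above can be folded into a small classifier plus retry loop (a sketch; `fetch` stands in for the actual `curl` invocation, and the labels are illustrative):

```python
# Classify the HTTP status per the table, retrying before giving up.
import time

def classify_http(code):
    if code == 200:
        return "ok"
    if code in (301, 302):
        return "redirect-likely-ok"
    if code == 502:
        return "caddy-cannot-reach-container"
    if code == 0:
        return "dns-or-caddy-not-ready"
    return "unexpected"

def verify(fetch, retries=3, delay=10):
    """Return True once the domain answers with a passing status."""
    for _ in range(retries):
        if classify_http(fetch()) in ("ok", "redirect-likely-ok"):
            return True
        time.sleep(delay)
    return False
```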
**Container check:**
```bash
# On server:
docker ps --filter name={container_name} --format '{{.Status}}'
```
### 4a.17: Report
Display a deployment summary:
> **✅ Deployment complete!**
>
> | Item | Status |
> |------|--------|
> | 🌐 URL | https://{domain} |
> | 🐳 Container | {status from docker ps} |
> | 📦 Gitea repo | {gitea_url}/{owner}/{app_name} |
> | 🪝 Webhook | Active (auto-deploy on push to main) |
> | 📡 DNS | {domain} → {profile.server_ip} |
---
## Step 4b: Redeploy Workflow
For apps that are already deployed, the workflow is simplified: commit, push, and let the webhook handle the rest.
### 4b.1: Check for Uncommitted Changes
```bash
git status --porcelain
```
If output is non-empty, display the changes and ask:
> "There are uncommitted changes. Commit and deploy, or abort?"
### 4b.2: Commit Changes
If the user confirms:
```bash
git add -A
git commit -m "{descriptive message based on changed files}"
```
### 4b.3: Push to Remote
```bash
git push origin main
```
If no remote named `origin` exists pointing to the Gitea host, add it first:
```bash
git remote add origin git@{gitea_ssh_host}:{owner}/{app_name}.git
git push -u origin main
```
### 4b.4: Wait for Webhook
Wait 5 seconds for the webhook to fire and the deploy listener to process:
```bash
sleep 5
```
### 4b.5: Verify
Check the deploy listener logs on the server:
```bash
# On server:
journalctl -u deploy-listener -n 10 --no-pager
```
Curl the live domain:
```bash
curl -s -o /dev/null -w "%{http_code}" https://{domain}/
```
Check container status:
```bash
# On server:
docker ps --filter name={container_name} --format '{{.Status}}'
```
### 4b.6: Report
> **🔄 Redeployment complete!**
> **🌐 URL:** https://{domain}
> **🐳 Container:** {status}
> **🪝 Triggered via:** webhook (push to main)
---
## Helper: run_on_server
All server-side commands use this pattern for execution context:
```python
def run_on_server(command):
    if is_local:
        # hostname matches profile.server_hostname
        run(command)
    else:
        # wrap in SSH
        run(f'ssh {profile.ssh_user}@{profile.ssh_host} "{command}"')
```
When constructing SSH commands:
- Escape double quotes inside the command.
- For multi-line commands, use `ssh ... 'bash -s' << 'EOF'` heredoc syntax.
- For commands requiring sudo, ensure the SSH user has passwordless sudo configured for the needed commands.
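The heredoc form for multi-line commands can be sketched as (`ssh_heredoc` is a hypothetical helper, not part of the skill):

```python
# Build the `ssh ... 'bash -s' << 'EOF'` invocation for a multi-line script.
# The quoted 'EOF' delimiter prevents local variable expansion.
def ssh_heredoc(user, host, script):
    return f"ssh {user}@{host} 'bash -s' << 'EOF'\n{script}\nEOF"
```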
---
## Configuration
### config.json Structure
```json
{
  "active_profile": "mew",
  "profiles": {
    "mew": {
      "name": "Mew Server",
      "ssh_host": "mew",
      "ssh_user": "darren",
      "server_hostname": "mew",
      "server_ip": "155.94.170.136",
      "domain": "lavender.spl.tech",
      "deploy_path": "/srv/git",
      "proxy_type": "caddy",
      "caddy_compose_path": "/data/docker/caddy",
      "caddy_container": "caddy",
      "proxy_network": "proxy"
    }
  }
}
```
### Secrets Files
**`~/.claude/secrets/gitea.json`:**
```json
{
  "token": "gitea-api-token",
  "url": "https://git.lavender-daydream.com",
  "ssh_host": "git.lavender-daydream.com",
  "owner": "darren",
  "webhook_secret": "shared-secret-for-webhooks"
}
```
**`~/.claude/secrets/cloudflare.json`:**
```json
{
  "token": "cloudflare-api-bearer-token",
  "zones": {
    "lavender.spl.tech": "zone-id-here",
    "spl.tech": "zone-id-here"
  }
}
```
---
## Adding a New Server Profile
Follow these steps to add a second (or third) deployment target.
### 1. Prepare the Server
On the new server:
1. Install Docker and Docker Compose.
2. Set up the deploy listener (`deploy-listener.py`) as a systemd service.
3. Create the deploy map: `sudo mkdir -p /etc/deploy-listener && echo '{}' | sudo tee /etc/deploy-listener/deploy-map.json`
4. Set up Caddy (or chosen reverse proxy) with Docker Compose.
5. Create the deploy path: `sudo mkdir -p /srv/git`
6. Ensure SSH key access from the workstation (`ssh new-server` must work).
7. Ensure the server can clone from Gitea (add SSH key to Gitea if needed).
### 2. Add the Profile
Edit `~/.claude/skills/_deploy_app/config.json` and add a new entry under `profiles`:
```json
{
  "active_profile": "mew",
  "profiles": {
    "mew": { "...existing..." },
    "new-server": {
      "name": "New Server Description",
      "ssh_host": "new-server",
      "ssh_user": "darren",
      "server_hostname": "new-server",
      "server_ip": "1.2.3.4",
      "domain": "new.example.com",
      "deploy_path": "/srv/git",
      "proxy_type": "caddy",
      "caddy_compose_path": "/data/docker/caddy",
      "caddy_container": "caddy",
      "proxy_network": "proxy"
    }
  }
}
```
### 3. Deploy to the New Server
Use the `--profile` flag:
```
/_deploy_app --profile=new-server
```
Or set `active_profile` to the new server name in `config.json`.
---
## Troubleshooting
| Problem | Cause | Fix |
|---------|-------|-----|
| **Caddy 502 Bad Gateway** | Container not on the `proxy` network, or container not started | Verify: `docker network inspect proxy` -- check the app container is listed. Run `docker network connect proxy {container_name}` if missing. |
| **Caddy 502 after restart** | Caddy restarted before container was ready | Wait for container healthcheck, then restart Caddy: `docker compose restart caddy` |
| **Webhook not firing** | Webhook misconfigured or deploy listener down | Check Gitea webhook delivery history: Gitea UI → Repo → Settings → Webhooks → Recent Deliveries. Check deploy listener: `systemctl status deploy-listener` |
| **DNS not resolving** | Cloudflare propagation delay or wrong zone | Verify with `dig {domain}`. Check Cloudflare dashboard. Propagation is usually instant but can take up to 5 minutes. |
| **Git push rejected** | Remote URL incorrect or SSH key not authorized | Verify remote: `git remote -v`. Test SSH: `ssh -T git@{gitea_ssh_host}`. Check Gitea deploy keys. |
| **Deploy listener not running** | Service crashed or not enabled | Check: `systemctl status deploy-listener`. Restart: `sudo systemctl restart deploy-listener`. Enable: `sudo systemctl enable deploy-listener`. |
| **Container exits immediately** | Missing .env, bad config, or port conflict | Check logs: `docker compose logs {service}`. Verify `.env` exists on server. Check port conflicts: `ss -tlnp \| grep {port}`. |
| **TLS cert not provisioned** | DNS not pointed, or rate limited | Caddy auto-provisions via Let's Encrypt. Verify DNS resolves first. Check Caddy logs: `docker compose logs caddy`. Let's Encrypt rate limits: 50 certs per domain per week. |
| **Permission denied on server** | SSH user lacks sudo or file ownership wrong | Verify user is in the `git` group: `groups darren`. Check file ownership: `ls -la {deploy_path}/{app_name}`. |
---
## Important Rules
- Always load and confirm the server profile before doing anything else.
- Always run `detect_deployment.py` before choosing the new-deploy or redeploy path.
- Never start Docker containers without explicit user confirmation on first deploy.
- Always create real random secrets in `.env` -- never use placeholder passwords.
- Always pin Docker image versions -- never use `latest`.
- Always include the proxy network in generated compose files.
- Always verify the deployment with both an HTTP check and a container status check.
- Never modify `.gitignore` without explicit user permission.
- Check all git remotes for public providers (github.com, gitlab.com, etc.) before pushing -- warn the user if found.
- If `config.json` is modified, write it back immediately.
- Prefer the `_compose` skill's `validate-compose.py` script for compose file validation.
## Resources
### scripts/
- **`detect_deployment.py`** -- Check Gitea API, Cloudflare DNS, server filesystem, and Docker status to determine if an app is already deployed. Return structured JSON. _(To be created.)_
- **`validate_compose.py`** -- Delegate to `~/.claude/skills/_compose/scripts/validate-compose.py`.
### references/
- **`compose-patterns.md`** -- Common Docker Compose patterns for different app types (Node.js, Python, Go, static sites). _(To be created.)_
- Reuse `~/.claude/skills/_compose/references/proxy-patterns.md` for proxy configuration guidance.
- Reuse `~/.claude/skills/_compose/references/troubleshooting.md` for Docker troubleshooting.
### assets/
- **`Makefile.template`** -- Standard Makefile for deployed apps. _(To be created.)_
## Cross-Platform Notes
- All API calls use `curl`, available on both Windows and Linux.
- Python scripts use `#!/usr/bin/env python3` for portability.
- SSH commands work from both WSL and native Linux.
- Use `$HOME` (not `~`) in scripts for compatibility.
- Path separators: always forward slashes.

assets/Makefile.template -- new file, 22 lines
.PHONY: up down logs pull restart ps build
up:
	docker compose up -d
down:
	docker compose down
logs:
	docker compose logs -f
pull:
	docker compose pull
restart:
	docker compose restart
ps:
	docker compose ps
build:
	docker compose up -d --build

Compose template (file name missing in source) -- new file, 21 lines
# Template: Replace {app_name}, {image}, and {port} with actual values
services:
  {app_name}:
    image: {image}
    container_name: {app_name}
    restart: unless-stopped
    # Uncomment if building from Dockerfile:
    # build: .
    env_file:
      - .env
    networks:
      - proxy
      - default
    # Uncomment and set the internal port:
    # expose:
    #   - "{port}"

networks:
  proxy:
    name: proxy
    external: true

config.example.json -- new file, 42 lines
{
  "active_profile": "my-server",
  "profiles": {
    "my-server": {
      "server": {
        "name": "my-server",
        "ip": "1.2.3.4",
        "ssh_user": "deploy-user",
        "ssh_host": "my-server",
        "deploy_map": "/etc/deploy-listener/deploy-map.json",
        "deploy_env": "/etc/deploy-listener/deploy-listener.env",
        "caddyfile": "/data/docker/caddy/Caddyfile",
        "caddy_container": "caddy",
        "caddy_compose_dir": "/data/docker/caddy",
        "compose_dir": "/srv/git",
        "proxy_network": "proxy",
        "proxy_gateway": "10.0.12.1"
      },
      "gitea": {
        "external_url": "https://git.example.com",
        "internal_url": "http://gitea-container-ip:3000",
        "api_path": "/api/v1",
        "default_owner": "your-username",
        "ssh_host": "git.example.com",
        "ssh_port": 2222
      },
      "webhook": {
        "url": "https://deploy.example.com/webhook",
        "events": ["push"],
        "branch_filter": "main master"
      },
      "domains": {
        "default_pattern": "{app}.example.com",
        "available": ["example.com"]
      },
      "secrets": {
        "gitea_token": "~/.claude/secrets/gitea.json",
        "cloudflare_token": "~/.claude/secrets/cloudflare.json"
      }
    }
  }
}

config.json -- new file, 42 lines
{
  "active_profile": "mew",
  "profiles": {
    "mew": {
      "server": {
        "name": "mew",
        "ip": "155.94.170.136",
        "ssh_user": "darren",
        "ssh_host": "mew",
        "deploy_map": "/etc/deploy-listener/deploy-map.json",
        "deploy_env": "/etc/deploy-listener/deploy-listener.env",
        "caddyfile": "/data/docker/caddy/Caddyfile",
        "caddy_container": "caddy",
        "caddy_compose_dir": "/data/docker/caddy",
        "compose_dir": "/srv/git",
        "proxy_network": "proxy",
        "proxy_gateway": "10.0.12.1"
      },
      "gitea": {
        "external_url": "https://git.lavender-daydream.com",
        "internal_url": "http://10.0.12.5:3000",
        "api_path": "/api/v1",
        "default_owner": "darren",
        "ssh_host": "git.lavender-daydream.com",
        "ssh_port": 2222
      },
      "webhook": {
        "url": "https://deploy.lavender.spl.tech/webhook",
        "events": ["push"],
        "branch_filter": "main master"
      },
      "domains": {
        "default_pattern": "{app}.lavender.spl.tech",
        "available": ["lavender-daydream.com", "spl.tech"]
      },
      "secrets": {
        "gitea_token": "~/.claude/secrets/gitea.json",
        "cloudflare_token": "~/.claude/secrets/cloudflare.json"
      }
    }
  }
}

Caddyfile patterns reference (file name missing in source) -- new file, 252 lines
# Caddyfile Patterns Reference
Reusable Caddyfile site block patterns for the mew server. All blocks go in `/data/docker/caddy/Caddyfile`. After editing, reload or restart Caddy (see infrastructure.md for details).
---
## 1. Standard Reverse Proxy
The most common pattern. Terminate TLS, compress responses, and forward to a container.
```
# === My App ===
myapp.lavender-daydream.com {
    encode zstd gzip
    reverse_proxy myapp:3000
}
```
### Breakdown
- **Domain line**: Caddy automatically provisions a Let's Encrypt certificate for this domain.
- **`encode zstd gzip`**: Compress responses with zstd (preferred) or gzip (fallback). Include this in every site block.
- **`reverse_proxy myapp:3000`**: Forward requests to the container named `myapp` on port 3000. Caddy resolves the container name via the shared `proxy` Docker network.
### Prerequisites
- DNS A record pointing the domain to `155.94.170.136`.
- The target container is running and joined to the `proxy` network.
- The container name and port match what is specified in the `reverse_proxy` directive.
---
## 2. WebSocket Support
For applications that use WebSocket connections (chat apps, real-time dashboards, collaborative editors, etc.).
```
# === Real-time App ===
realtime.lavender-daydream.com {
    encode zstd gzip
    reverse_proxy realtime-app:3000 {
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Proto {scheme}
    }
}
```
### Notes
- Caddy 2 handles WebSocket upgrades transparently. There is no special `websocket` directive needed — `reverse_proxy` detects the `Upgrade: websocket` header and handles the protocol switch automatically.
- The `header_up` directives forward the real client IP and protocol to the backend, which is important for applications that log connections or enforce security based on client IP.
- If the application uses a non-standard WebSocket path (e.g., `/ws` or `/socket.io`), this pattern still works without changes — Caddy proxies all paths by default.
---
## 3. Multiple Domains
Serve the same application from multiple domains (e.g., bare domain and `www` subdomain, or a vanity domain alongside the primary).
```
# === My App (multi-domain) ===
myapp.lavender-daydream.com, www.myapp.lavender-daydream.com {
    encode zstd gzip
    reverse_proxy myapp:3000
}
```
### With Redirect
Redirect one domain to the canonical domain instead of serving from both:
```
# === My App (canonical redirect) ===
www.myapp.lavender-daydream.com {
    redir https://myapp.lavender-daydream.com{uri} permanent
}

myapp.lavender-daydream.com {
    encode zstd gzip
    reverse_proxy myapp:3000
}
```
### Notes
- Caddy provisions separate TLS certificates for each domain listed.
- Ensure DNS A records exist for every domain in the site block.
- Use `permanent` (301) redirects for SEO-friendly canonical domain enforcement.
- The `{uri}` placeholder preserves the request path and query string during the redirect.
---
## 4. HTTPS Upstream
For services that speak HTTPS internally (e.g., Cockpit, some management UIs). Caddy must be told to connect to the upstream over TLS.
```
# === Cockpit ===
cockpit.lavender-daydream.com {
    encode zstd gzip
    reverse_proxy https://cockpit:9090 {
        transport http {
            tls_insecure_skip_verify
        }
    }
}
```
### Notes
- Prefix the upstream address with `https://` to instruct Caddy to connect over TLS.
- `tls_insecure_skip_verify` disables certificate verification for the upstream connection. Use this when the upstream uses a self-signed certificate, which is common for management interfaces like Cockpit.
- Do NOT use `tls_insecure_skip_verify` if the upstream has a valid, trusted certificate — remove the entire `transport` block in that case.
- This pattern is uncommon. Most containers speak plain HTTP internally, and Caddy handles TLS termination on the frontend only.
---
## 5. Rate Limiting
Protect sensitive endpoints (login forms, APIs, webhooks) from abuse with rate limiting.
```
# === Rate-Limited App ===
myapp.lavender-daydream.com {
    encode zstd gzip

    # Rate limit login endpoint: 10 requests per minute per IP
    @login {
        path /api/auth/login
    }
    rate_limit @login {
        zone login_zone {
            key {remote_host}
            events 10
            window 1m
        }
    }

    # Rate limit API endpoints: 60 requests per minute per IP
    @api {
        path /api/*
    }
    rate_limit @api {
        zone api_zone {
            key {remote_host}
            events 60
            window 1m
        }
    }

    reverse_proxy myapp:3000
}
```
### Notes
- Rate limiting requires the `caddy-ratelimit` plugin. Verify it is included in the Caddy build before using these directives. If it is not available, implement rate limiting at the application level instead.
- The `@name` syntax defines a named matcher that scopes the rate limit to specific paths.
- `key {remote_host}` rate-limits per client IP address.
- `events` is the maximum number of requests allowed within the `window` period.
- Clients that exceed the limit receive a `429 Too Many Requests` response.
- Apply stricter limits to authentication endpoints and more generous limits to general API usage.
### Alternative: Application-Level Rate Limiting
If the Caddy rate-limit plugin is not installed, skip the `rate_limit` directives and use the standard reverse proxy pattern. Configure rate limiting within the application instead (e.g., `express-rate-limit` for Node.js, `slowapi` for FastAPI).
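As a concrete illustration of the fixed-window idea those libraries implement, here is a minimal sketch (not production code; real deployments should use a maintained library such as the ones named above):

```python
# Fixed-window rate limiter: allow `events` requests per `window_seconds`
# per client key (e.g. the remote IP). Exceeding the limit maps to a
# 429 Too Many Requests response at the application layer.
import time

class FixedWindowLimiter:
    def __init__(self, events, window_seconds):
        self.events = events
        self.window = window_seconds
        self.counts = {}   # key -> (window_start, count)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.counts.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0           # window expired: reset
        if count >= self.events:
            self.counts[key] = (start, count)
            return False                    # over limit -> reject (429)
        self.counts[key] = (start, count + 1)
        return True
```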
---
## 6. Path-Based Routing
Route different URL paths to different backend services. Common for monorepo deployments where `/api` goes to a backend service and `/` goes to a frontend.
```
# === Full-Stack App (path-based) ===
myapp.lavender-daydream.com {
    encode zstd gzip

    # API requests → backend container
    handle /api/* {
        reverse_proxy myapp-api:8000
    }

    # WebSocket endpoint → backend container
    handle /ws/* {
        reverse_proxy myapp-api:8000
    }

    # Everything else → frontend container
    handle {
        reverse_proxy myapp-frontend:80
    }
}
```
### Notes
- `handle` blocks are evaluated in the order they appear. More specific paths must come before the catch-all.
- The final `handle` (with no path argument) is the catch-all — it matches everything not matched above.
- Use `handle_path` instead of `handle` if you need to strip the path prefix before forwarding. For example:
```
handle_path /api/* {
    reverse_proxy myapp-api:8000
}
```
This strips `/api` from the request path, so `/api/users` becomes `/users` when it reaches the backend. Only use this if the backend does not expect the `/api` prefix.
- Ensure all referenced containers (`myapp-api`, `myapp-frontend`) are on the `proxy` network.
### Variation: Static Files + API
Serve static files directly from Caddy for the frontend, with API requests proxied to a backend:
```
# === Static Frontend + API Backend ===
myapp.lavender-daydream.com {
encode zstd gzip
handle /api/* {
reverse_proxy myapp-api:8000
}
handle {
root * /srv/myapp/dist
try_files {path} /index.html
file_server
}
}
```
This requires the static files to be accessible from within the Caddy container (via a volume mount).
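A sketch of the volume entry that makes this work, assuming the Caddy compose layout described in the infrastructure reference (paths are illustrative):

```yaml
# In /data/docker/caddy/docker-compose.yaml (illustrative — adjust to the real file)
services:
  caddy:
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - /srv/myapp/dist:/srv/myapp/dist:ro   # static build output, read-only
```

The container-side path must match the `root` directive in the site block.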
---
## Universal Conventions
Apply these conventions to every site block:
1. **Comment header**: Place `# === App Name ===` above each site block.
2. **Compression**: Always include `encode zstd gzip` as the first directive.
3. **Container names**: Use container names, not IP addresses, in `reverse_proxy`.
4. **One domain per block** unless intentionally serving multiple domains (pattern 3).
5. **Order matters**: Place more specific `handle` blocks before less specific ones.
6. **Test after changes**: After modifying the Caddyfile, reload Caddy and verify the site responds:
```bash
docker exec caddy caddy reload --config /etc/caddy/Caddyfile
curl -I https://myapp.lavender-daydream.com
```
If reload fails, check Caddy logs:
```bash
docker logs caddy --tail 50
```

# Docker Compose Patterns Reference
Reusable `docker-compose.yaml` templates for common application types deployed on mew. Every template includes the external `proxy` network required for Caddy reverse proxying.
---
## 1. Node.js / Express with Dockerfile Build
Build a Node.js app from a local Dockerfile. The container exposes an internal port that Caddy proxies to.
```yaml
version: "3.8"
services:
app:
build:
context: .
dockerfile: Dockerfile
container_name: myapp
restart: unless-stopped
expose:
- "3000"
environment:
- NODE_ENV=production
- PORT=3000
env_file:
- .env
networks:
- proxy
networks:
proxy:
name: proxy
external: true
```
### Companion Dockerfile
```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
```
### Notes
- Use `expose` (not `ports`) to keep the port internal to Docker networks only.
- Set `container_name` to a unique, descriptive name — Caddy uses this name in its `reverse_proxy` directive.
- The app listens on port 3000 inside the container. Caddy reaches it via `myapp:3000`.
---
## 2. Python / FastAPI with Dockerfile Build
Build a Python FastAPI app from a local Dockerfile. Uses Uvicorn as the ASGI server.
```yaml
version: "3.8"
services:
app:
build:
context: .
dockerfile: Dockerfile
container_name: myapi
restart: unless-stopped
expose:
- "8000"
environment:
- PYTHONUNBUFFERED=1
env_file:
- .env
networks:
- proxy
networks:
proxy:
name: proxy
external: true
```
### Companion Dockerfile
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
### Notes
- `PYTHONUNBUFFERED=1` ensures log output appears immediately in `docker compose logs`.
- For production, consider adding `--workers 4` to the Uvicorn command or switching to Gunicorn with Uvicorn workers.
- Caddy reaches this via `myapi:8000`.
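Both scaling options from the notes above can be sketched as Dockerfile `CMD` lines. The worker count of 4 is an assumption — tune it to the host's CPU count — and option B requires adding `gunicorn` to `requirements.txt`:

```dockerfile
# Option A: Uvicorn with multiple worker processes
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]

# Option B: Gunicorn supervising Uvicorn workers (add gunicorn to requirements.txt)
# CMD ["gunicorn", "main:app", "-k", "uvicorn.workers.UvicornWorker", "-w", "4", "-b", "0.0.0.0:8000"]
```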
---
## 3. Static Site (nginx)
Serve pre-built static files (HTML, CSS, JS) via nginx.
```yaml
version: "3.8"
services:
app:
image: nginx:alpine
container_name: mysite
restart: unless-stopped
expose:
- "80"
volumes:
- ./dist:/usr/share/nginx/html:ro
- ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
networks:
- proxy
networks:
proxy:
name: proxy
external: true
```
### Companion nginx.conf
```nginx
server {
listen 80;
server_name _;
root /usr/share/nginx/html;
index index.html;
location / {
try_files $uri $uri/ /index.html;
}
# Cache static assets
location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff2?)$ {
expires 30d;
add_header Cache-Control "public, immutable";
}
}
```
### Notes
- Mount the build output directory (e.g., `./dist`) into the nginx html root.
- The `try_files` fallback to `/index.html` supports client-side routing (React Router, Vue Router, etc.).
- Mount the nginx config as read-only (`:ro`).
- Caddy reaches this via `mysite:80`.
---
## 4. Pre-built Image Only
Pull and run a published Docker image with no local build. Suitable for off-the-shelf applications like wikis, dashboards, and link pages.
```yaml
version: "3.8"
services:
app:
image: lscr.io/linuxserver/bookstack:latest
container_name: bookstack
restart: unless-stopped
expose:
- "6875"
env_file:
- .env
volumes:
- ./data:/config
networks:
- proxy
networks:
proxy:
name: proxy
external: true
```
### Notes
- Replace the `image` and `expose` port with whatever the application requires.
- Check the image documentation for required environment variables and volume mount paths.
- Persist application data by mounting a local `./data` directory.
- Caddy reaches this via `bookstack:6875`.
---
## 5. App with PostgreSQL Database
A two-service stack with an application and a PostgreSQL database. The database is on an internal-only network. The app joins both the internal and proxy networks.
```yaml
version: "3.8"
services:
app:
build:
context: .
dockerfile: Dockerfile
container_name: myapp
restart: unless-stopped
expose:
- "3000"
env_file:
- .env
depends_on:
db:
condition: service_healthy
networks:
- proxy
- internal
db:
image: postgres:16-alpine
container_name: myapp-db
restart: unless-stopped
environment:
POSTGRES_DB: ${POSTGRES_DB:-myapp}
POSTGRES_USER: ${POSTGRES_USER:-myapp}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?Set POSTGRES_PASSWORD in .env}
volumes:
- pgdata:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-myapp}"]
interval: 10s
timeout: 5s
retries: 5
networks:
- internal
volumes:
pgdata:
networks:
proxy:
name: proxy
external: true
internal:
driver: bridge
```
### Notes
- The database is **only** on the `internal` network — it is not reachable from Caddy or any other container outside this stack.
- The app is on **both** `proxy` (so Caddy can reach it) and `internal` (so it can reach the database).
- `depends_on` with `condition: service_healthy` ensures the app waits for PostgreSQL to be ready before starting.
- The `${POSTGRES_PASSWORD:?...}` syntax causes compose to fail with an error if the variable is not set, preventing accidental deploys with no database password.
- Use a named volume (`pgdata`) for database persistence.
- In the app's `.env`, set the database URL:
```
DATABASE_URL=postgresql://myapp:secretpassword@myapp-db:5432/myapp
```
Note the hostname is the database container name (`myapp-db`), not `localhost`.
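The URL's parts can be checked with the standard library; a quick illustrative parse of the example value:

```python
from urllib.parse import urlsplit

# Illustrative: confirm the hostname in DATABASE_URL is the container name.
parts = urlsplit("postgresql://myapp:secretpassword@myapp-db:5432/myapp")
print(parts.hostname)          # myapp-db  (container name, not localhost)
print(parts.port)              # 5432
print(parts.path.lstrip("/"))  # myapp  (database name)
```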
---
## 6. App with Environment File
Pattern for managing configuration through `.env` files with a `.env.example` template checked into version control.
```yaml
version: "3.8"
services:
app:
build:
context: .
dockerfile: Dockerfile
container_name: myapp
restart: unless-stopped
expose:
- "3000"
env_file:
- .env
networks:
- proxy
networks:
proxy:
name: proxy
external: true
```
### Companion .env.example
Check this file into version control as a template. The actual `.env` file contains secrets and is listed in `.gitignore` on public repos only (on private Gitea repos, `.env` is committed per project conventions).
```env
# Application
NODE_ENV=production
PORT=3000
APP_URL=https://myapp.lavender-daydream.com
# Database (if applicable)
DATABASE_URL=postgresql://user:password@myapp-db:5432/myapp
# Secrets
SESSION_SECRET=generate-a-random-string-here
API_KEY=your-api-key-here
# Email (Mailgun)
MAILGUN_API_KEY=
MAILGUN_DOMAIN=
MAILGUN_FROM=noreply@lavender-daydream.com
# Deploy listener webhook secret (must match /etc/deploy-listener/deploy-listener.env)
WEBHOOK_SECRET=must-match-deploy-listener
```
### Notes
- The `env_file` directive in compose loads all variables from `.env` into the container environment.
- `env_file` variables are injected into the container environment only; compose variable interpolation (`${VAR}` syntax in the compose file itself) instead reads the shell environment plus a `.env` file in the project directory. Because this stack's `env_file` is that same `.env`, the values end up available to both.
- Always provide a `.env.example` with placeholder values and comments explaining each variable.
- For the deploy listener to work, the repo's webhook secret must match the value in `/etc/deploy-listener/deploy-listener.env`.
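Because `.env.example` is the template, it is easy for the real `.env` to fall behind it. A hedged sketch of a drift check (hypothetical helper, stdlib only):

```python
from pathlib import Path

def env_keys(path: Path) -> set[str]:
    """Collect variable names from a dotenv-style file, skipping comments and blanks."""
    keys: set[str] = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys

def missing_vars(example: Path, actual: Path) -> set[str]:
    """Variables declared in .env.example but absent from .env."""
    return env_keys(example) - env_keys(actual)
```

Running it before `docker compose up` catches variables that were added to the template but never set.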
---
## Universal Compose Conventions
These conventions apply to ALL stacks on mew:
1. **Always include the proxy network** if Caddy needs to reach the container:
```yaml
networks:
proxy:
name: proxy
external: true
```
2. **Use `expose`, not `ports`**: Keep ports internal to Docker networks. Never bind to the host unless absolutely necessary.
3. **Set `container_name`** explicitly: Caddy resolves containers by name. Avoid auto-generated names.
4. **Set `restart: unless-stopped`**: Containers restart automatically after crashes or server reboots, but stay stopped if manually stopped.
5. **Use `env_file` for secrets**: Do not hardcode secrets in the compose file.
6. **Use health checks** for databases and critical dependencies to ensure proper startup ordering.
7. **Persist data with named volumes or bind mounts**: Never rely on container-internal storage for important data.

# Infrastructure Reference — mew Server (155.94.170.136)
This document describes every infrastructure component on the mew server relevant to deploying Docker Compose applications behind Caddy with automated Gitea-triggered deployments.
---
## 1. Deploy Listener
### Overview
A Python webhook listener that receives push events from Gitea/Forgejo and automatically deploys the corresponding Docker Compose stack.
### Filesystem Locations
| Item | Path |
|------|------|
| Script | `/usr/local/bin/deploy-listener.py` |
| Systemd unit | `deploy-listener.service` |
| Deploy map | `/etc/deploy-listener/deploy-map.json` |
| Environment file | `/etc/deploy-listener/deploy-listener.env` |
| Service user home | `/var/lib/deploy` |
### Service User
- **User**: `deploy`
- **Groups**: `docker`, `git`
- **Home directory**: `/var/lib/deploy`
The `deploy` user has Docker socket access through the `docker` group and repository access through the `git` group.
### Network Binding
- **Port**: 50500
- **Bind address**: 0.0.0.0
- **Firewall**: UFW blocks external access to port 50500. Only Docker's internal 10.0.0.0/8 range is allowed. Caddy reaches the listener at `10.0.12.1:50500` (the proxy network gateway).
### Deploy Map
Location: `/etc/deploy-listener/deploy-map.json`
Format — a JSON object mapping `owner/repo` to the absolute path of the compose directory:
```json
{
"darren/compose-bookstack": "/srv/git/compose-bookstack",
"darren/compose-linkstack": "/srv/git/compose-linkstack",
"darren/my-app": "/srv/git/my-app"
}
```
Add a new entry to this file for every application that should be auto-deployed on push.
### Environment File
Location: `/etc/deploy-listener/deploy-listener.env`
```env
WEBHOOK_SECRET=<the-shared-secret>
LISTEN_PORT=50500
```
The `WEBHOOK_SECRET` value must match the secret configured in each Gitea/Forgejo webhook.
### Request Validation & Behavior
1. **HMAC-SHA256 validation**: The listener reads the `X-Gitea-Signature` or `X-Forgejo-Signature` header and validates the request body against the `WEBHOOK_SECRET` using HMAC-SHA256. Requests that fail validation are rejected.
2. **Branch filter**: Only pushes to `main` or `master` (checked via the `ref` field) trigger a deploy. All other branches are ignored.
3. **Deploy map lookup**: The `repository.full_name` field (e.g., `darren/my-app`) is looked up in the deploy map. If not found, the request is ignored.
4. **Deploy sequence**: On a valid push, the listener executes:
```bash
cd /srv/git/my-app
git pull
docker compose pull
docker compose up -d
```
5. **Concurrency control**: A file lock prevents concurrent deploys. If a deploy is already running, the incoming request is queued or rejected.
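The signature check in step 1 can be sketched with the standard library (illustrative, not the listener's actual code; the header value is the hex HMAC-SHA256 of the raw request body):

```python
import hashlib
import hmac

def verify_signature(body: bytes, secret: str, signature_header: str) -> bool:
    """Check a Gitea/Forgejo push payload against the shared webhook secret."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest performs a constant-time comparison, avoiding a timing side channel.
    return hmac.compare_digest(expected, signature_header)
```

Note that the HMAC is computed over the exact bytes as sent, so the raw body must be read before any JSON parsing.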
### Health Check
Verify the listener is running:
```bash
curl https://deploy.lavender.spl.tech/health
```
A successful response confirms the listener is reachable through Caddy and functioning.
### Systemd Management
```bash
# Check status
sudo systemctl status deploy-listener
# Restart
sudo systemctl restart deploy-listener
# View logs
sudo journalctl -u deploy-listener -f
```
---
## 2. Caddy Reverse Proxy
### Overview
Caddy serves as the TLS-terminating reverse proxy for all applications on mew. It automatically provisions and renews certificates via Let's Encrypt.
### Filesystem Locations
| Item | Path |
|------|------|
| Caddyfile | `/data/docker/caddy/Caddyfile` |
| Compose file | `/data/docker/caddy/docker-compose.yaml` |
| Container name | `caddy` |
| Image | `caddy:2-alpine` |
### Network
- **Network name**: `proxy`
- **Type**: external Docker network
- **Subnet**: 10.0.12.0/24
- **Gateway**: 10.0.12.1
- All application containers MUST join the `proxy` network for Caddy to reach them by container name.
### TLS
- **Method**: Automatic via Let's Encrypt
- **Email**: `postmaster@lavender-daydream.com`
- No manual certificate management required. Caddy handles provisioning, renewal, and OCSP stapling automatically.
### Deploy Endpoint
The deploy listener is exposed externally through Caddy:
```
deploy.lavender.spl.tech → 10.0.12.1:50500
```
This routes through the proxy network gateway to the host-bound deploy listener.
### Reloading the Caddyfile
**Standard reload** (when Caddyfile content changed but inode is the same):
```bash
docker exec caddy caddy reload --config /etc/caddy/Caddyfile
```
**Full restart** (required when the Caddyfile inode changed, e.g., after replacing the file rather than editing in-place):
```bash
cd /data/docker/caddy && docker compose restart caddy
```
Always check whether the file was edited in-place or replaced. If replaced, you MUST restart rather than reload.
### Site Block Format
Follow this exact format when adding new site blocks to the Caddyfile:
```
# === App Name ===
domain.example.com {
encode zstd gzip
reverse_proxy container_name:port
}
```
- Place the comment header (`# === App Name ===`) above each block for readability.
- Always include `encode zstd gzip` for compression.
- Use the container name (not IP) in the `reverse_proxy` directive — Caddy resolves container names on the proxy network.
---
## 3. Gitea API
### Connection Details
| Item | Value |
|------|-------|
| Internal URL (from mew host) | `http://10.0.12.5:3000` |
| External URL | `https://git.lavender-daydream.com` |
| API base path | `/api/v1` |
| Token location | `~/.claude/secrets/gitea.json` |
### Authentication
Include the token as a header on every API request:
```
Authorization: token {GITEA_TOKEN}
```
### Key Endpoints
#### Check if a repo exists
```
GET /api/v1/repos/{owner}/{repo}
```
- **200**: Repo exists (response includes repo details).
- **404**: Repo does not exist.
#### Create a new repo
```
POST /api/v1/user/repos
Content-Type: application/json
{
"name": "my-app",
"private": false,
"auto_init": false
}
```
Set `auto_init` to `false` when pushing an existing local repo. Set to `true` if you want Gitea to create an initial commit.
#### Add a webhook
```
POST /api/v1/repos/{owner}/{repo}/hooks
Content-Type: application/json
{
"type": "gitea",
"active": true,
"branch_filter": "main master",
"config": {
"url": "https://deploy.lavender.spl.tech/webhook",
"content_type": "json",
"secret": "<WEBHOOK_SECRET>"
},
"events": ["push"]
}
```
The `secret` in the webhook config MUST match the `WEBHOOK_SECRET` in `/etc/deploy-listener/deploy-listener.env`.
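A sketch of registering this webhook with stdlib `urllib` (the function name is illustrative; token and secret values are placeholders):

```python
import json
import urllib.request

def build_webhook_request(base_url: str, token: str, owner: str, repo: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST that registers the push webhook."""
    payload = {
        "type": "gitea",
        "active": True,
        "branch_filter": "{main,master}",  # glob matching main or master
        "config": {
            "url": "https://deploy.lavender.spl.tech/webhook",
            "content_type": "json",
            "secret": secret,
        },
        "events": ["push"],
    }
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/hooks",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    req.add_header("Authorization", f"token {token}")
    req.add_header("Content-Type", "application/json")
    return req
```

Send the request with `urllib.request.urlopen(req)`; a 201 response indicates the hook was created.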
#### List repos
```
GET /api/v1/repos/search?limit=50
```
Returns up to 50 repositories per page; use the `page` query parameter to paginate.
---
## 4. Forgejo API
### Connection Details
| Item | Value |
|------|-------|
| Container name | `forgejo` |
| Internal port | 3000 |
| External URL | `https://forgejo.lavender-daydream.com` |
| SSH port | 2223 |
### API Compatibility
Forgejo is a fork of Gitea. The API format, endpoints, authentication, and request/response structures are identical to those documented in the Gitea section above. Use the same patterns — just substitute the Forgejo base URL.
### SSH Access
```bash
git remote add forgejo ssh://git@forgejo.lavender-daydream.com:2223/owner/repo.git
```
---
## 5. Cloudflare DNS
### Token & Zone Configuration
Location: `~/.claude/secrets/cloudflare.json`
Format:
```json
{
"CLOUDFLARE_API_TOKEN": "your-api-token-here",
"zones": {
"lavender-daydream.com": "zone_id_for_lavender_daydream",
"spl.tech": "zone_id_for_spl_tech"
}
}
```
### Authentication
Include the token as a Bearer header:
```
Authorization: Bearer {CLOUDFLARE_API_TOKEN}
```
### Create an A Record
```
POST https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records
Content-Type: application/json
{
"type": "A",
"name": "{subdomain}",
"content": "155.94.170.136",
"ttl": 1,
"proxied": false
}
```
- **`name`**: The subdomain portion (e.g., `myapp` for `myapp.lavender-daydream.com`, or the full FQDN).
- **`content`**: Always `155.94.170.136` (mew's public IP).
- **`ttl`**: `1` means automatic TTL.
- **`proxied`**: Set to `false` so Caddy handles TLS directly. Setting to `true` would route through Cloudflare's proxy and interfere with Let's Encrypt.
### Choosing the Zone
Pick the zone based on the desired domain suffix:
- `*.lavender-daydream.com` → use the `lavender-daydream.com` zone ID
- `*.spl.tech` → use the `spl.tech` zone ID
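Zone selection is a suffix match against the configured zones; a minimal sketch (function name is illustrative, zone IDs are placeholders):

```python
def pick_zone(fqdn: str, zones: dict[str, str]) -> tuple[str, str]:
    """Return (zone_name, zone_id) for the configured zone whose name suffixes fqdn."""
    matches = [z for z in zones if fqdn == z or fqdn.endswith("." + z)]
    if not matches:
        raise ValueError(f"no configured zone matches {fqdn!r}")
    zone = max(matches, key=len)  # longest suffix wins if zones ever overlap
    return zone, zones[zone]

zones = {"lavender-daydream.com": "zone-a", "spl.tech": "zone-b"}
print(pick_zone("myapp.lavender-daydream.com", zones))  # ('lavender-daydream.com', 'zone-a')
```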
---
## 6. Docker Networking
### The `proxy` Network
| Property | Value |
|----------|-------|
| Name | `proxy` |
| Subnet | 10.0.12.0/24 |
| Gateway | 10.0.12.1 |
| Type | External (created once, referenced by all stacks) |
### Requirements
- **Every application container** that Caddy must reach MUST join the `proxy` network.
- Caddy resolves container names to IPs on this network — use container names (not IPs) in `reverse_proxy` directives.
- The network is created externally (not by any single compose file). If it does not exist, create it:
```bash
docker network create --subnet=10.0.12.0/24 --gateway=10.0.12.1 proxy
```
### Compose Configuration
Every compose file that needs Caddy access must include:
```yaml
networks:
proxy:
name: proxy
external: true
```
And each service that Caddy proxies to must list `proxy` in its `networks` key:
```yaml
services:
app:
# ...
networks:
- proxy
```
If the stack also has internal-only services (e.g., a database), create an additional internal network:
```yaml
networks:
proxy:
name: proxy
external: true
internal:
driver: bridge
```
---
## 7. Compose Stack Locations
### Core Infrastructure Stacks
Location: `/data/docker/`
These are foundational services that support the entire server:
| Directory | Service |
|-----------|---------|
| `/data/docker/caddy/` | Caddy reverse proxy |
| `/data/docker/gitea/` | Gitea git forge |
| `/data/docker/forgejo/` | Forgejo git forge |
| `/data/docker/email/` | Email services |
| `/data/docker/website/` | Main website |
| `/data/docker/linkstack-berlyn/` | Berlyn's linkstack |
### Application Stacks
Location: `/srv/git/`
These are deployed applications managed by the deploy listener:
| Directory | Application |
|-----------|-------------|
| `/srv/git/compose-bookstack/` | BookStack wiki |
| `/srv/git/compose-linkstack/` | LinkStack |
| `/srv/git/compose-portainer/` | Portainer |
| `/srv/git/compose-wishthis/` | WishThis |
| `/srv/git/compose-anythingllm/` | AnythingLLM |
### Ownership & Permissions
- **Owner**: `root:git`
- **Permissions**: `2775` (setgid)
- The setgid bit ensures new files and directories inherit the `git` group, so both `root` and members of the `git` group (including `deploy` and `darren`) can read/write.
### Standard Stack Contents
Each compose stack directory should contain:
| File | Purpose |
|------|---------|
| `docker-compose.yaml` | Service definitions |
| `.env` | Environment variables (secrets, config) |
| `Makefile` | Convenience targets (`make up`, `make down`, `make logs`) |
| `README.md` | Stack documentation |

#!/usr/bin/env python3
"""
detect_deployment.py — Detect whether an app is already deployed.
Checks three signals:
1. Deploy map — repo entry in /etc/deploy-listener/deploy-map.json
2. Gitea — repo exists on the Gitea instance
3. Caddy — a Caddyfile reverse_proxy maps to the app's container name
Outputs JSON to stdout and uses exit code to indicate status:
0 — deployed (at least deploy_map + gitea match)
1 — not deployed
Usage:
python3 detect_deployment.py --repo-name darren/my-app [--config path/to/config.json]
Cross-platform: works on Linux and Windows (WSL / Git Bash).
Stdlib only — no pip dependencies.
"""
from __future__ import annotations
import argparse
import json
import os
import platform
import re
import socket
import subprocess
import sys
import urllib.error
import urllib.request
from pathlib import Path
from typing import Any
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _load_json(path: str | Path) -> Any:
"""Load and return parsed JSON from *path*."""
with open(path, encoding="utf-8") as fh:
return json.load(fh)
def _resolve_config_path(explicit: str | None, script_dir: Path) -> Path:
"""Return the config.json path — explicit arg wins, otherwise default."""
if explicit:
return Path(explicit).resolve()
return (script_dir.parent / "config.json").resolve()
def _is_local(server_hostname: str) -> bool:
"""Return True when we appear to be running *on* the target server."""
local_name = socket.gethostname().lower()
if local_name == server_hostname.lower():
return True
# Also check common aliases
try:
server_ip = socket.gethostbyname(server_hostname)
local_ips = socket.gethostbyname_ex(socket.gethostname())[2]
if server_ip in local_ips:
return True
except (socket.gaierror, socket.herror, OSError):
pass
return False
def _run_local(cmd: list[str], timeout: int = 15) -> tuple[int, str, str]:
"""Run *cmd* locally and return (returncode, stdout, stderr)."""
try:
proc = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=timeout,
)
return proc.returncode, proc.stdout, proc.stderr
except FileNotFoundError:
return 127, "", f"Command not found: {cmd[0]}"
except subprocess.TimeoutExpired:
return 124, "", f"Command timed out after {timeout}s"
def _run_ssh(ssh_user: str, ssh_host: str, remote_cmd: str, timeout: int = 15) -> tuple[int, str, str]:
"""Execute *remote_cmd* over SSH and return (returncode, stdout, stderr)."""
ssh_target = f"{ssh_user}@{ssh_host}" if ssh_user else ssh_host
cmd = [
"ssh",
"-o", "BatchMode=yes",
"-o", "ConnectTimeout=10",
"-o", "StrictHostKeyChecking=accept-new",
ssh_target,
remote_cmd,
]
return _run_local(cmd, timeout=timeout)
def _read_remote_file(ssh_user: str, ssh_host: str, remote_path: str) -> str | None:
"""Read a file on the remote server via SSH. Returns contents or None on failure."""
rc, stdout, stderr = _run_ssh(ssh_user, ssh_host, f"cat {remote_path}")
if rc == 0:
return stdout
return None
def _read_file_local_or_remote(
path: str,
is_local: bool,
ssh_user: str,
ssh_host: str,
) -> str | None:
"""Read a file — locally if we are on the server, otherwise via SSH."""
if is_local:
try:
return Path(path).read_text(encoding="utf-8")
except (OSError, UnicodeDecodeError):
return None
return _read_remote_file(ssh_user, ssh_host, path)
def _read_gitea_token(token_path: str) -> str | None:
"""Read the Gitea API token from a file path."""
try:
return Path(token_path).expanduser().read_text(encoding="utf-8").strip()
except (OSError, UnicodeDecodeError):
return None
# ---------------------------------------------------------------------------
# Signal 1: Deploy Map
# ---------------------------------------------------------------------------
def check_deploy_map(
repo_name: str,
deploy_map_path: str,
is_local: bool,
ssh_user: str,
ssh_host: str,
) -> tuple[bool, dict[str, Any]]:
"""
Check if *repo_name* exists in the deploy map JSON file.
Returns (found: bool, details: dict).
"""
details: dict[str, Any] = {}
raw = _read_file_local_or_remote(deploy_map_path, is_local, ssh_user, ssh_host)
if raw is None:
details["error"] = "Could not read deploy map"
return False, details
try:
deploy_map = json.loads(raw)
except json.JSONDecodeError as exc:
details["error"] = f"Invalid JSON in deploy map: {exc}"
return False, details
# The deploy map may use different key formats — check common patterns:
# "owner/repo", "repo", or nested by owner.
repo_lower = repo_name.lower()
repo_short = repo_name.split("/")[-1].lower() if "/" in repo_name else repo_name.lower()
# Flat dict keyed by "owner/repo"
if isinstance(deploy_map, dict):
for key, value in deploy_map.items():
if key.lower() == repo_lower or key.lower() == repo_short:
details["matched_key"] = key
if isinstance(value, dict):
details["stack_dir"] = value.get("stack_dir", value.get("path", ""))
else:
details["stack_dir"] = str(value)
return True, details
# List of entries with a "repo" field
if isinstance(deploy_map, list):
for entry in deploy_map:
if not isinstance(entry, dict):
continue
entry_repo = (entry.get("repo") or entry.get("repository") or "").lower()
if entry_repo == repo_lower or entry_repo == repo_short:
details["matched_key"] = entry_repo
details["stack_dir"] = entry.get("stack_dir", entry.get("path", ""))
return True, details
return False, details
# ---------------------------------------------------------------------------
# Signal 2: Gitea
# ---------------------------------------------------------------------------
def check_gitea(
repo_name: str,
gitea_url: str,
gitea_token: str | None,
) -> tuple[bool, dict[str, Any]]:
"""
Check if *repo_name* (owner/repo) exists on Gitea.
Returns (exists: bool, details: dict).
"""
details: dict[str, Any] = {}
# Ensure owner/repo format
if "/" not in repo_name:
details["error"] = "repo_name must be in owner/repo format for Gitea check"
return False, details
    owner, repo = repo_name.split("/", 1)
    if not gitea_url:
        details["error"] = "Gitea URL not configured"
        return False, details
    api_url = f"{gitea_url.rstrip('/')}/api/v1/repos/{owner}/{repo}"
    details["gitea_url"] = api_url
req = urllib.request.Request(api_url, method="GET")
req.add_header("Accept", "application/json")
if gitea_token:
req.add_header("Authorization", f"token {gitea_token}")
try:
with urllib.request.urlopen(req, timeout=15) as resp:
if resp.status == 200:
body = json.loads(resp.read().decode("utf-8"))
details["full_name"] = body.get("full_name", "")
details["html_url"] = body.get("html_url", "")
details["description"] = body.get("description", "")
return True, details
except urllib.error.HTTPError as exc:
if exc.code == 404:
details["status"] = 404
return False, details
details["error"] = f"HTTP {exc.code}: {exc.reason}"
return False, details
except urllib.error.URLError as exc:
details["error"] = f"URL error: {exc.reason}"
return False, details
except OSError as exc:
details["error"] = f"Connection error: {exc}"
return False, details
return False, details
# ---------------------------------------------------------------------------
# Signal 3: Caddy
# ---------------------------------------------------------------------------
def check_caddy(
repo_name: str,
caddyfile_path: str,
is_local: bool,
ssh_user: str,
ssh_host: str,
) -> tuple[bool, dict[str, Any]]:
"""
Check if the Caddyfile has a reverse_proxy pointing to this app's container.
Heuristic: look for the container name (short repo name) in reverse_proxy
directives or upstream blocks.
Returns (found: bool, details: dict).
"""
details: dict[str, Any] = {}
container_name = repo_name.split("/")[-1].lower() if "/" in repo_name else repo_name.lower()
raw = _read_file_local_or_remote(caddyfile_path, is_local, ssh_user, ssh_host)
if raw is None:
details["error"] = "Could not read Caddyfile"
return False, details
# Parse the Caddyfile looking for:
# reverse_proxy <container_name>:<port>
# reverse_proxy http://<container_name>:<port>
# Also capture the domain (site block header) associated with the match.
lines = raw.splitlines()
current_domain = ""
# Pattern: matches a Caddy site-block header (domain line) — simplified heuristic
domain_pattern = re.compile(r"^(\S+\.\S+)\s*\{?\s*$")
proxy_pattern = re.compile(
r"reverse_proxy\s+(?:https?://)?" + re.escape(container_name) + r"[:\s]",
re.IGNORECASE,
)
for line in lines:
stripped = line.strip()
domain_match = domain_pattern.match(stripped)
if domain_match:
current_domain = domain_match.group(1)
if proxy_pattern.search(stripped):
details["domain"] = current_domain
details["matched_line"] = stripped
details["container_name"] = container_name
return True, details
return False, details
# ---------------------------------------------------------------------------
# Main orchestration
# ---------------------------------------------------------------------------
def detect(repo_name: str, config: dict[str, Any]) -> dict[str, Any]:
"""Run all three detection signals and return the combined result dict."""
server = config.get("server", {})
ssh_host = server.get("ssh_host", "")
ssh_user = server.get("ssh_user", "")
server_hostname = server.get("hostname", ssh_host)
deploy_map_path = server.get("deploy_map_path", "/etc/deploy-listener/deploy-map.json")
caddyfile_path = server.get("caddyfile_path", "/etc/caddy/Caddyfile")
gitea_cfg = config.get("gitea", {})
gitea_url = gitea_cfg.get("url", "")
gitea_token_path = config.get("secrets", {}).get("gitea_token", "")
gitea_token = _read_gitea_token(gitea_token_path) if gitea_token_path else None
local = _is_local(server_hostname)
# --- Signal 1: Deploy Map ---
dm_found, dm_details = check_deploy_map(
repo_name, deploy_map_path, local, ssh_user, ssh_host,
)
# --- Signal 2: Gitea ---
gt_found, gt_details = check_gitea(repo_name, gitea_url, gitea_token)
# --- Signal 3: Caddy ---
cd_found, cd_details = check_caddy(
repo_name, caddyfile_path, local, ssh_user, ssh_host,
)
# Merge details
all_details: dict[str, Any] = {}
if dm_details.get("stack_dir"):
all_details["stack_dir"] = dm_details["stack_dir"]
if cd_details.get("domain"):
all_details["domain"] = cd_details["domain"]
if gt_details.get("html_url"):
all_details["gitea_url"] = gt_details["html_url"]
deployed = dm_found and gt_found # primary condition
return {
"deployed": deployed,
"signals": {
"deploy_map": dm_found,
"gitea": gt_found,
"caddy": cd_found,
},
"details": all_details,
}
# ---------------------------------------------------------------------------
# CLI entry point
# ---------------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description="Detect whether an app is already deployed.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument(
"--repo-name",
required=True,
help="Repository name in owner/repo format (e.g. darren/my-app).",
)
parser.add_argument(
"--config",
default=None,
help="Path to config.json. Default: <script_dir>/../config.json",
)
args = parser.parse_args()
script_dir = Path(__file__).resolve().parent
config_path = _resolve_config_path(args.config, script_dir)
if not config_path.exists():
print(
json.dumps({"error": f"Config file not found: {config_path}"}),
file=sys.stderr,
)
sys.exit(1)
try:
config = _load_json(config_path)
except (json.JSONDecodeError, OSError) as exc:
print(
json.dumps({"error": f"Failed to load config: {exc}"}),
file=sys.stderr,
)
sys.exit(1)
result = detect(args.repo_name, config)
# JSON to stdout
print(json.dumps(result, indent=2))
# Exit code: 0 = deployed, 1 = not deployed
sys.exit(0 if result["deployed"] else 1)
if __name__ == "__main__":
main()

#!/usr/bin/env python3
"""
validate_compose.py — Production-readiness validator for Docker Compose files.
Usage:
python3 validate_compose.py <path/to/docker-compose.yaml> [--strict]
Exit codes:
0 — passed (no errors; warnings may exist)
1 — failed (one or more errors found)
On Windows, use `python` instead of `python3` if needed.
"""
import argparse
import re
import sys
from pathlib import Path
from typing import Any
try:
import yaml # type: ignore[import-untyped]
except ImportError:
    # PyYAML is a third-party dependency, not part of the stdlib — fail with
    # a clear message rather than a bare traceback.
print("❌ ERROR: PyYAML is required. Install with: pip install pyyaml")
sys.exit(1)
# ---------------------------------------------------------------------------
# Result accumulator
# ---------------------------------------------------------------------------
class ValidationResult:
"""Accumulates errors, warnings, and info messages from all checks."""
def __init__(self) -> None:
self.errors: list[str] = []
self.warnings: list[str] = []
self.infos: list[str] = []
def error(self, msg: str) -> None:
self.errors.append(msg)
def warn(self, msg: str) -> None:
self.warnings.append(msg)
def info(self, msg: str) -> None:
self.infos.append(msg)
@property
def passed(self) -> bool:
return len(self.errors) == 0
def print_report(self) -> None:
"""Print a formatted validation report to stdout."""
total = len(self.errors) + len(self.warnings) + len(self.infos)
if total == 0:
print("✅ No issues found.")
return
        if self.errors:
            print(f"\n🔴 ERRORS ({len(self.errors)})")
            for e in self.errors:
                print(f"  {e}")
        if self.warnings:
            print(f"\n🟡 WARNINGS ({len(self.warnings)})")
            for w in self.warnings:
                print(f"  {w}")
if self.infos:
print(f"\n🔵 INFO ({len(self.infos)})")
for i in self.infos:
print(f" {i}")
print()
if self.passed:
print(f"✅ Passed ({len(self.warnings)} warning(s), {len(self.infos)} info(s))")
else:
print(f"❌ Failed ({len(self.errors)} error(s), {len(self.warnings)} warning(s))")
# ---------------------------------------------------------------------------
# Helper utilities
# ---------------------------------------------------------------------------
_SECRET_PATTERNS = [
re.compile(r"(password|passwd|secret|token|key|api_key|apikey|auth|credential)", re.I),
]
# Reserved for future value-based secret heuristics (currently unused): matches
# values of 8+ chars that are not ${VAR} references or obvious placeholders.
_HARDCODED_VALUE_PATTERN = re.compile(
    r"^(?!.*\$\{)(?!changeme)(?!placeholder)(?!your-).{8,}$"
)
_ENV_VAR_REF_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
_PREFERRED_PORT_MIN = 50000
_PREFERRED_PORT_MAX = 60000
_DB_CACHE_IMAGES = [
"postgres", "postgresql",
"mariadb", "mysql",
"mongo", "mongodb",
"redis", "valkey",
"memcached",
"cassandra",
"couchdb",
"influxdb",
]
def _iter_services(compose: dict[str, Any]):
"""Yield (name, service_dict) for every service in the compose file."""
for name, svc in (compose.get("services") or {}).items():
yield name, (svc or {})
def _get_depends_on_names(depends_on: Any) -> list[str]:
"""Normalise depends_on to a flat list of service name strings."""
if isinstance(depends_on, list):
return depends_on
if isinstance(depends_on, dict):
return list(depends_on.keys())
return []
def _image_name_and_tag(image: str) -> tuple[str, str]:
    """Split 'image:tag' into (image_name, tag). Tag defaults to '' if absent."""
    # Only treat a ':' that appears after the last '/' as the tag separator,
    # so registry ports (e.g. 'registry:5000/app') are not mistaken for tags.
    slash = image.rfind("/")
    colon = image.rfind(":")
    if colon > slash:
        return image[:colon], image[colon + 1:]
    return image, ""
def _is_db_cache_image(image: str) -> bool:
name, _ = _image_name_and_tag(image)
base = name.split("/")[-1].lower()
return any(base == db or base.startswith(db) for db in _DB_CACHE_IMAGES)
def _collect_all_string_values(obj: Any, result: list[str]) -> None:
"""Recursively collect all string leaf values from a nested structure."""
if isinstance(obj, str):
result.append(obj)
elif isinstance(obj, dict):
for v in obj.values():
_collect_all_string_values(v, result)
elif isinstance(obj, list):
for item in obj:
_collect_all_string_values(item, result)
def _parse_host_port(port_spec: Any) -> int | None:
"""
Extract the host (published) port from a port mapping.
Supports:
- "8080:80"
- "127.0.0.1:8080:80"
- {"published": 8080, "target": 80}
- 8080 (short form — interpreted as host==container)
"""
if isinstance(port_spec, dict):
published = port_spec.get("published")
if published is not None:
try:
return int(published)
except (ValueError, TypeError):
pass
return None
spec = str(port_spec)
parts = spec.split(":")
# "hostip:hostport:containerport" → parts[-2] is host port
# "hostport:containerport" → parts[0] is host port
# "containerport" → no explicit host port mapping
if len(parts) >= 2:
try:
return int(parts[-2].split("/")[0])
except (ValueError, IndexError):
pass
elif len(parts) == 1:
try:
return int(parts[0].split("/")[0])
except (ValueError, IndexError):
pass
return None
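# A minimal sketch of the string-form parsing rules above, inlined so the
# example is self-contained; the function name here is hypothetical and the
# dict form is deliberately omitted for brevity.
def _demo_host_port(spec):
    parts = str(spec).split(":")
    idx = -2 if len(parts) >= 2 else 0
    try:
        return int(parts[idx].split("/")[0])
    except (ValueError, IndexError):
        return None

assert _demo_host_port("8080:80") == 8080            # "hostport:containerport"
assert _demo_host_port("127.0.0.1:8080:80") == 8080  # "hostip:hostport:containerport"
assert _demo_host_port(8080) == 8080                 # short form
assert _demo_host_port("80/tcp") == 80               # protocol suffix stripped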
# ---------------------------------------------------------------------------
# Individual checks (original set)
# ---------------------------------------------------------------------------
def validate_image_tags(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn on :latest or untagged images."""
for name, svc in _iter_services(compose):
image = svc.get("image", "")
if not image:
continue
img_name, tag = _image_name_and_tag(image)
if not tag:
result.warn(f"[{name}] Image '{img_name}' has no tag — pin to a specific version.")
elif tag == "latest":
result.warn(f"[{name}] Image '{img_name}:latest' — never use :latest in production.")
def validate_restart_policy(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check that all services have a restart policy."""
for name, svc in _iter_services(compose):
restart = svc.get("restart")
if not restart:
result.warn(f"[{name}] No restart policy — add 'restart: unless-stopped'.")
def validate_healthchecks(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check that all services define or inherit a healthcheck."""
for name, svc in _iter_services(compose):
hc = svc.get("healthcheck")
if hc is None:
result.info(f"[{name}] No healthcheck defined — add one if the image supports it.")
elif isinstance(hc, dict) and hc.get("disable"):
result.info(f"[{name}] Healthcheck explicitly disabled.")
def validate_no_hardcoded_secrets(compose: dict[str, Any], result: ValidationResult) -> None:
"""Detect hardcoded secrets in environment and labels."""
for name, svc in _iter_services(compose):
env = svc.get("environment") or {}
items: list[tuple[str, str]] = []
if isinstance(env, dict):
items = list(env.items())
elif isinstance(env, list):
for entry in env:
if "=" in str(entry):
k, v = str(entry).split("=", 1)
items.append((k, v))
for key, value in items:
if not value or str(value).startswith("${"):
continue
for pat in _SECRET_PATTERNS:
if pat.search(key):
                    result.error(
                        f"[{name}] Possible hardcoded secret in env var '{key}' — "
                        "use ${VAR_NAME} references and store values in .env."
                    )
break
def validate_resource_limits(compose: dict[str, Any], result: ValidationResult, strict: bool) -> None:
"""In strict mode, require resource limits on all services."""
if not strict:
return
for name, svc in _iter_services(compose):
deploy = svc.get("deploy") or {}
resources = deploy.get("resources") or {}
limits = resources.get("limits") or {}
mem = limits.get("memory") or svc.get("mem_limit")
cpus = limits.get("cpus") or svc.get("cpus")
if not mem:
result.error(f"[{name}] No memory limit set (strict mode) — add deploy.resources.limits.memory.")
if not cpus:
result.warn(f"[{name}] No CPU limit set — consider adding deploy.resources.limits.cpus.")
def validate_logging(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn when no logging config is specified."""
for name, svc in _iter_services(compose):
if not svc.get("logging"):
result.info(
f"[{name}] No logging config — consider adding logging.driver and options "
"(e.g. json-file with max-size/max-file)."
)
def validate_privileged_mode(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn on privileged containers."""
for name, svc in _iter_services(compose):
if svc.get("privileged"):
result.warn(f"[{name}] Running in privileged mode — grant only if strictly required.")
def validate_host_network(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn on host network mode."""
for name, svc in _iter_services(compose):
network_mode = svc.get("network_mode", "")
if network_mode == "host":
result.warn(f"[{name}] Using host network mode — isolate with a bridge network if possible.")
def validate_sensitive_volumes(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn on sensitive host paths mounted into containers."""
sensitive_paths = ["/etc", "/var/run/docker.sock", "/proc", "/sys", "/root", "/home"]
for name, svc in _iter_services(compose):
volumes = svc.get("volumes") or []
for vol in volumes:
if isinstance(vol, str):
host_part = vol.split(":")[0]
elif isinstance(vol, dict):
host_part = str(vol.get("source", ""))
else:
continue
for sensitive in sensitive_paths:
if host_part == sensitive or host_part.startswith(sensitive + "/"):
                    result.warn(
                        f"[{name}] Sensitive host path mounted: '{host_part}' — "
                        "verify this is intentional."
                    )
def validate_traefik_network_consistency(compose: dict[str, Any], result: ValidationResult) -> None:
"""Ensure services with Traefik labels are joined to the Traefik network."""
traefik_network_names: set[str] = set()
# Heuristic: networks named 'traefik*' or 'proxy*' are Traefik-facing
for net_name in (compose.get("networks") or {}).keys():
if "traefik" in net_name.lower() or "proxy" in net_name.lower():
traefik_network_names.add(net_name)
for name, svc in _iter_services(compose):
labels = svc.get("labels") or {}
label_items: list[str] = []
if isinstance(labels, dict):
label_items = list(labels.keys())
elif isinstance(labels, list):
label_items = [str(l).split("=")[0] for l in labels]
has_traefik_label = any("traefik" in lbl.lower() for lbl in label_items)
if not has_traefik_label:
continue
svc_networks = set()
svc_net_section = svc.get("networks") or {}
if isinstance(svc_net_section, list):
svc_networks = set(svc_net_section)
elif isinstance(svc_net_section, dict):
svc_networks = set(svc_net_section.keys())
if traefik_network_names and not svc_networks.intersection(traefik_network_names):
result.warn(
f"[{name}] Has Traefik labels but is not on a Traefik-facing network "
f"({', '.join(traefik_network_names)})."
)
def validate_traefik_router_uniqueness(compose: dict[str, Any], result: ValidationResult) -> None:
"""Error on duplicate Traefik router names across services."""
seen_routers: dict[str, str] = {}
router_pattern = re.compile(r"traefik\.http\.routers\.([^.]+)\.", re.I)
for name, svc in _iter_services(compose):
labels = svc.get("labels") or {}
label_keys: list[str] = []
if isinstance(labels, dict):
label_keys = list(labels.keys())
elif isinstance(labels, list):
label_keys = [str(l).split("=")[0] for l in labels]
for key in label_keys:
m = router_pattern.match(key)
if m:
router_name = m.group(1).lower()
if router_name in seen_routers:
result.error(
f"[{name}] Duplicate Traefik router name '{router_name}' "
f"(also used in service '{seen_routers[router_name]}')."
)
else:
seen_routers[router_name] = name
def validate_container_name_uniqueness(compose: dict[str, Any], result: ValidationResult) -> None:
"""Error on duplicate container_name values."""
seen: dict[str, str] = {}
for name, svc in _iter_services(compose):
container_name = svc.get("container_name")
if not container_name:
continue
if container_name in seen:
result.error(
f"[{name}] Duplicate container_name '{container_name}' "
f"(also used by service '{seen[container_name]}')."
)
else:
seen[container_name] = name
def validate_depends_on(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check that depends_on references valid service names."""
service_names = set((compose.get("services") or {}).keys())
for name, svc in _iter_services(compose):
deps = _get_depends_on_names(svc.get("depends_on") or [])
for dep in deps:
if dep not in service_names:
result.error(
f"[{name}] depends_on references unknown service '{dep}'."
)
def validate_networks(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check that service networks are declared at the top level."""
declared = set((compose.get("networks") or {}).keys())
for name, svc in _iter_services(compose):
svc_nets = svc.get("networks") or {}
if isinstance(svc_nets, list):
used = set(svc_nets)
elif isinstance(svc_nets, dict):
used = set(svc_nets.keys())
else:
used = set()
for net in used:
if net not in declared:
result.error(
f"[{name}] Uses network '{net}' which is not declared in the "
"top-level 'networks' section."
)
def validate_volumes(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check for undefined named volumes and orphaned top-level volume declarations."""
declared_volumes = set((compose.get("volumes") or {}).keys())
used_volumes: set[str] = set()
for name, svc in _iter_services(compose):
for vol in (svc.get("volumes") or []):
if isinstance(vol, str):
parts = vol.split(":")
ref = parts[0]
elif isinstance(vol, dict):
ref = str(vol.get("source", ""))
else:
continue
# Named volumes don't start with . / ~ or a drive letter pattern
if ref and not re.match(r"^[./~]|^[A-Za-z]:[/\\]", ref):
used_volumes.add(ref)
if declared_volumes and ref not in declared_volumes:
result.error(
f"[{name}] Uses named volume '{ref}' which is not declared "
"in the top-level 'volumes' section."
)
for vol in declared_volumes:
if vol not in used_volumes:
result.warn(f"Top-level volume '{vol}' is declared but never used by any service.")
def validate_port_conflicts(compose: dict[str, Any], result: ValidationResult) -> None:
"""Error on duplicate host port bindings."""
seen_ports: dict[int, str] = {}
for name, svc in _iter_services(compose):
for port_spec in (svc.get("ports") or []):
host_port = _parse_host_port(port_spec)
if host_port is None:
continue
if host_port in seen_ports:
result.error(
f"[{name}] Host port {host_port} conflicts with service "
f"'{seen_ports[host_port]}'."
)
else:
seen_ports[host_port] = name
# ---------------------------------------------------------------------------
# NEW checks
# ---------------------------------------------------------------------------
def validate_circular_dependencies(compose: dict[str, Any], result: ValidationResult) -> None:
"""Detect circular dependencies in the depends_on graph using DFS."""
services = compose.get("services") or {}
# Build adjacency list: service_name -> list of dependencies
graph: dict[str, list[str]] = {}
for name, svc in services.items():
graph[name] = _get_depends_on_names((svc or {}).get("depends_on") or [])
visited: set[str] = set()
in_stack: set[str] = set()
def dfs(node: str, path: list[str]) -> bool:
"""Return True if a cycle is detected."""
visited.add(node)
in_stack.add(node)
for neighbour in graph.get(node, []):
if neighbour not in graph:
# Unknown dependency — already caught by validate_depends_on
continue
if neighbour not in visited:
if dfs(neighbour, path + [neighbour]):
return True
elif neighbour in in_stack:
cycle_path = "".join(path + [neighbour])
result.error(
f"Circular dependency detected: {cycle_path}"
)
return True
in_stack.discard(node)
return False
for service_name in graph:
if service_name not in visited:
dfs(service_name, [service_name])
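# Illustrative sketch of the DFS cycle detection above, run on a toy
# dependency graph; a standalone re-implementation for demonstration only
# (the function name and the arrow-joined path format are assumptions).
def _demo_find_cycles(graph):
    visited, in_stack, found = set(), set(), []
    def dfs(node, path):
        visited.add(node)
        in_stack.add(node)
        for nb in graph.get(node, []):
            if nb not in visited:
                if dfs(nb, path + [nb]):
                    return True
            elif nb in in_stack:
                found.append(" → ".join(path + [nb]))
                return True
        in_stack.discard(node)
        return False
    for node in graph:
        if node not in visited:
            dfs(node, [node])
    return found

assert _demo_find_cycles({"a": ["b"], "b": ["c"], "c": ["a"]}) == ["a → b → c → a"]
assert _demo_find_cycles({"web": ["db"], "db": []}) == []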
def validate_port_range(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn if host ports are outside the preferred 50000-60000 range."""
for name, svc in _iter_services(compose):
for port_spec in (svc.get("ports") or []):
host_port = _parse_host_port(port_spec)
if host_port is None:
continue
if not (_PREFERRED_PORT_MIN <= host_port <= _PREFERRED_PORT_MAX):
result.warn(
f"[{name}] Host port {host_port} is outside the preferred range "
f"{_PREFERRED_PORT_MIN}-{_PREFERRED_PORT_MAX}."
)
def validate_network_isolation(compose: dict[str, Any], result: ValidationResult) -> None:
"""Warn if database/cache services are exposed on external networks."""
top_level_networks = compose.get("networks") or {}
for name, svc in _iter_services(compose):
image = svc.get("image", "")
if not image or not _is_db_cache_image(image):
continue
svc_nets = svc.get("networks") or {}
if isinstance(svc_nets, list):
net_names = svc_nets
elif isinstance(svc_nets, dict):
net_names = list(svc_nets.keys())
else:
net_names = []
for net_name in net_names:
net_config = top_level_networks.get(net_name) or {}
# A network is considered "external" if it has external: true
# or if it is named in a way that suggests it is the proxy/public network.
is_external = net_config.get("external", False)
is_proxy_net = any(
kw in net_name.lower() for kw in ("traefik", "proxy", "public", "frontend")
)
if is_external or is_proxy_net:
result.warn(
f"[{name}] Database/cache service is connected to external or proxy "
f"network '{net_name}' — use an internal network for isolation."
)
def validate_version_tags(compose: dict[str, Any], result: ValidationResult) -> None:
"""Check image version tag quality beyond just the :latest check."""
semver_full = re.compile(r"^\d+\.\d+\.\d+") # major.minor.patch
semver_minor = re.compile(r"^\d+\.\d+$") # major.minor only
semver_major = re.compile(r"^\d+$") # major only
for name, svc in _iter_services(compose):
image = svc.get("image", "")
if not image:
continue
img_name, tag = _image_name_and_tag(image)
if not tag:
# Already caught by validate_image_tags — skip to avoid duplicate noise
continue
if tag == "latest":
# Also caught above — error level comes from validate_image_tags
result.error(
f"[{name}] Image '{img_name}:latest' — :latest is forbidden in production."
)
continue
if semver_full.match(tag):
# Fully pinned — great
pass
elif semver_minor.match(tag):
result.warn(
f"[{name}] Image '{img_name}:{tag}' uses major.minor only — "
"pin to a full major.minor.patch tag for reproducible builds."
)
elif semver_major.match(tag):
result.warn(
f"[{name}] Image '{img_name}:{tag}' uses major version only — "
"pin to at least major.minor.patch."
)
else:
# Non-semver tags (sha digests, named releases, etc.) — accept as info
result.info(
f"[{name}] Image '{img_name}:{tag}' uses a non-semver tag — "
"verify this is a pinned, stable release."
)
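# Illustrative sketch of the tag-quality tiers checked above, as a standalone
# classifier (the function name and tier labels are hypothetical).
def _demo_tag_tier(tag):
    if re.match(r"^\d+\.\d+\.\d+", tag):
        return "pinned"          # full major.minor.patch — ideal
    if re.match(r"^\d+\.\d+$", tag):
        return "major.minor"     # warned: not fully reproducible
    if re.match(r"^\d+$", tag):
        return "major"           # warned: floats across minor releases
    return "non-semver"          # info: verify it is a stable release

assert _demo_tag_tier("16.4.1") == "pinned"
assert _demo_tag_tier("16.4") == "major.minor"
assert _demo_tag_tier("16") == "major"
assert _demo_tag_tier("bookworm") == "non-semver"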
def validate_env_references(compose: dict[str, Any], result: ValidationResult) -> None:
"""
Check that ${VAR} references in service configs have matching definitions.
Scans all string values in each service's config for ${VAR} patterns, then
checks whether those variables appear in the service's `environment` block
or are referenced via `env_file`. Cannot validate the contents of .env files
— only structural consistency within the compose file itself is checked.
"""
for name, svc in _iter_services(compose):
# Collect all ${VAR} references from the service's values
all_values: list[str] = []
_collect_all_string_values(svc, all_values)
referenced_vars: set[str] = set()
for val in all_values:
for match in _ENV_VAR_REF_PATTERN.finditer(val):
referenced_vars.add(match.group(1))
if not referenced_vars:
continue
# Collect defined variable names from the environment block
env_section = svc.get("environment") or {}
defined_vars: set[str] = set()
if isinstance(env_section, dict):
defined_vars = set(env_section.keys())
elif isinstance(env_section, list):
for entry in env_section:
key = str(entry).split("=")[0]
defined_vars.add(key)
has_env_file = bool(svc.get("env_file"))
for var in sorted(referenced_vars):
if var in defined_vars:
continue # Explicitly defined — fine
if has_env_file:
# Likely in the .env file — we can't verify, so just note it
result.info(
f"[{name}] ${{{var}}} is referenced but not defined inline — "
"ensure it is present in the env_file."
)
else:
result.warn(
f"[{name}] ${{{var}}} is referenced but has no inline definition "
"and no env_file is configured — ensure it is in your .env file."
)
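# Illustrative sketch of the ${VAR} extraction used above, run against a
# typical connection-string value (the URL is a made-up example).
_demo_refs = {
    m.group(1)
    for m in re.finditer(
        r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
        "postgres://${DB_USER}:${DB_PASSWORD}@db:5432/${DB_NAME}",
    )
}
assert _demo_refs == {"DB_USER", "DB_PASSWORD", "DB_NAME"}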
# ---------------------------------------------------------------------------
# Orchestrator
# ---------------------------------------------------------------------------
def run_all_checks(compose: dict[str, Any], strict: bool) -> ValidationResult:
"""Run every registered check and return the aggregated result."""
result = ValidationResult()
# Original checks
validate_image_tags(compose, result)
validate_restart_policy(compose, result)
validate_healthchecks(compose, result)
validate_no_hardcoded_secrets(compose, result)
validate_resource_limits(compose, result, strict)
validate_logging(compose, result)
validate_privileged_mode(compose, result)
validate_host_network(compose, result)
validate_sensitive_volumes(compose, result)
validate_traefik_network_consistency(compose, result)
validate_traefik_router_uniqueness(compose, result)
validate_container_name_uniqueness(compose, result)
validate_depends_on(compose, result)
validate_networks(compose, result)
validate_volumes(compose, result)
validate_port_conflicts(compose, result)
# New checks
validate_circular_dependencies(compose, result)
validate_port_range(compose, result)
validate_network_isolation(compose, result)
validate_version_tags(compose, result)
validate_env_references(compose, result)
return result
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description="Validate a Docker Compose file for production readiness.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument("compose_file", help="Path to the docker-compose.yaml file.")
parser.add_argument(
"--strict",
action="store_true",
help="Enable strict mode: resource limits become errors, not warnings.",
)
args = parser.parse_args()
compose_path = Path(args.compose_file)
if not compose_path.exists():
print(f"❌ File not found: {compose_path}")
sys.exit(1)
try:
with compose_path.open(encoding="utf-8") as fh:
compose = yaml.safe_load(fh)
except yaml.YAMLError as exc:
print(f"❌ YAML parse error: {exc}")
sys.exit(1)
if not isinstance(compose, dict):
print("❌ Compose file did not parse to a mapping — is it a valid YAML file?")
sys.exit(1)
mode_label = " [STRICT]" if args.strict else ""
print(f"🐳 Validating: {compose_path}{mode_label}")
print("-" * 60)
result = run_all_checks(compose, strict=args.strict)
result.print_report()
sys.exit(0 if result.passed else 1)
if __name__ == "__main__":
main()