Deployment

CI handles everything. After merging to main, the GitHub Actions pipeline:

Builds the Docker image and pushes it to GHCR
SSHs to the droplet and pulls the new image
Migrations and seeds then run automatically on container start (via scripts/docker-entrypoint.sh)
The auto-transition scheduler starts automatically in production Node (no env var is needed to enable it)

So a fresh deploy of the test cycles vertical needs no manual steps. Push to main and wait.

Host resources (swap, disk) + runner recovery

The droplet runs the app container, self-managed Postgres, and (since the May-2026 migration) the self-hosted GitHub Actions runner. That's a lot on one box, so two failure modes are worth knowing.

Out of memory → "lost communication". A cold next build or Docker image build can spike past 2 GB of RAM. Without swap, the kernel OOM killer terminates processes under memory pressure. When it kills the runner agent, the job fails with "The self-hosted runner lost communication with the server" and the step logs come back empty, because the runner died before flushing them. The fix is a one-time, idempotent script:

sudo ./scripts/setup-swap.sh 4G    # creates /swapfile, persists to fstab, swappiness=10
free -h                            # confirm Swap line is non-zero

setup-server.sh now runs this automatically, so freshly provisioned boxes get swap. A box provisioned before this existed picks it up on the next setup-server.sh run, or you can run setup-swap.sh directly as shown above.
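
For orientation, an idempotent swap setup of the kind described in the comment above might look like the sketch below. It is not the contents of scripts/setup-swap.sh: the function names, the /etc/sysctl.d file name, and the exact fstab line are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an idempotent swap setup. This is NOT the contents of
# scripts/setup-swap.sh; function names and the sysctl.d file name are assumptions.

# Append a swap entry to the fstab file $1 for swap file $2, unless one exists.
ensure_fstab_entry() {
    grep -q "^$2[[:space:]]" "$1" 2>/dev/null ||
        printf '%s none swap sw 0 0\n' "$2" >> "$1"
}

# Create, enable, and persist a swap file of size $1 (e.g. 4G). Run as root.
# The body runs in a subshell so `set -eu` and variables stay local.
setup_swap() (
    set -eu
    swapfile=/swapfile
    # Skip creation when the file is already an active swap device.
    if ! swapon --show=NAME --noheadings | grep -qx "$swapfile"; then
        fallocate -l "$1" "$swapfile"
        chmod 600 "$swapfile"
        mkswap "$swapfile"
        swapon "$swapfile"
    fi
    ensure_fstab_entry /etc/fstab "$swapfile"
    # Apply swappiness now and persist it across reboots.
    sysctl -w vm.swappiness=10
    printf 'vm.swappiness=10\n' > /etc/sysctl.d/99-swappiness.conf
)

# Usage (as root): setup_swap 4G
```

The two guards (the swapon check and the fstab grep) are what make a rerun safe: a second invocation neither recreates the file nor duplicates the fstab entry.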

Disk full → stuck deploy. When the disk fills up, image pulls fail halfway and containers won't start. deploy.sh logs a warning when the Docker data root drops below 3 GB free. To reclaim space safely (neither command touches volumes, which matters because Postgres data lives in one):

docker system prune -af && docker builder prune -af
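
To check headroom by hand before or after pruning, a free-space check along the lines of the deploy.sh warning could be sketched as follows. The real check in deploy.sh may differ; the function names are hypothetical, and treating 3 GB as 3 × 1024 × 1024 one-kilobyte blocks is an assumption.

```shell
# Hypothetical sketch of a free-space check; the real deploy.sh may differ.
MIN_FREE_KB=$((3 * 1024 * 1024))   # 3 GB threshold from the text, in 1K blocks

# Print available space, in 1K blocks, on the filesystem holding path $1.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 (and warn on stderr) when $1, in 1K blocks, is below the threshold.
is_low_disk() {
    if [ "$1" -lt "$MIN_FREE_KB" ]; then
        echo "WARNING: only $(( $1 / 1024 )) MB free on Docker data root" >&2
        return 0
    fi
    return 1
}

# Usage on the droplet (requires Docker):
#   root=$(docker info --format '{{.DockerRootDir}}')
#   is_low_disk "$(avail_kb "$root")" || echo "disk OK"
```

Asking Docker for its root directory keeps the check on the filesystem that actually holds images, which is not always the one mounted at /.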

Recovering a dead runner. If a deploy failed with the "lost communication" error, the runner agent may still be offline. Since the check job (and everything downstream) runs on it, nothing will deploy until it's back:

ssh deploy@<droplet>
systemctl status 'actions.runner.*'      # is it active?
sudo systemctl restart 'actions.runner.*'  # if not
free -h && df -h /                         # confirm RAM/disk headroom

Database

The Postgres database is currently a self-managed instance on the droplet with no automated backups and no verified restore path. A move to managed Postgres (daily backups + point-in-time recovery) is planned.

For connection topology, backup policy, the restore drill, the cutover plan, and all DB-related env vars (DATABASE_URL, DB_CA_CERT, DB_SSL_MODE, DB_POOL_MAX, DB_STATEMENT_TIMEOUT_MS), see the operational runbook: Database.

When you might still SSH

The seed:tc-onboarding seed needs at least one user with role_id = 1 (ADMIN). If the database has no admin yet, the seed fails non-fatally on container start. After someone signs in to create their users row, promote them:
```
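# Log in to the droplet
ssh deploy@<droplet>
# Promote the user to ADMIN (role_id = 1); replace the email with theirs
docker exec hackorda-app sh -c \
  "psql \"\$DATABASE_URL\" -c \"UPDATE users SET role_id=1 WHERE email='you@example.com';\""
# Recreate the container so the entrypoint reruns the seed
docker compose -f ~/hackorda-mvp/docker-compose.prod.yml up -d --force-recreate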
```
The recreate triggers the seed again, and now it succeeds.
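
If you want to confirm that an admin exists before recreating the container, a small check like the one below may help. It is a sketch: has_admin is a hypothetical helper, and it assumes psql is available inside the hackorda-app container, as the promotion command already does.

```shell
# Succeeds when stdin holds a positive integer, e.g. the output of a
# `psql -tA` count(*) query. Hypothetical helper, not part of the repo.
has_admin() {
    read -r n && [ "$n" -gt 0 ] 2>/dev/null
}

# Usage on the droplet:
#   docker exec hackorda-app sh -c \
#     'psql "$DATABASE_URL" -tAc "SELECT count(*) FROM users WHERE role_id = 1;"' \
#     | has_admin && echo "admin present"
```

The -t and -A flags make psql print just the bare number, so the helper can compare it directly.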

AI triage agent

The issue intake agent (src/lib/ai/intake.ts) is fire-and-forget — when a tester submits an issue, we call Anthropic in the background and write back aiSuggestions (suggested title / severity / bug type). The UI shows them as soft hints with one-click Apply buttons. Drafts skip the analysis.

Enable it by setting one env var on the droplet:

ANTHROPIC_API_KEY=sk-ant-…

Without it, isAiEnabled() returns false and every intake call no-ops. Issues still file fine; the AI card just never appears.

Cost: ~$0.005–0.015 per issue (Sonnet, ≤3 attached images). Each run's cost is logged to ai_runs.cost_usd_cents for monitoring. To see what the agent has been doing:

select kind, status, model, cost_usd_cents, latency_ms, created_at, error_message
from ai_runs order by created_at desc limit 20;
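
For rough budgeting, the per-issue range quoted above can be multiplied out. The helper below is a hypothetical sketch; the count of 200 issues is an example value, not a measured figure.

```shell
# Estimate total spend for $1 issues using the per-issue range quoted above
# ($0.005 to $0.015). Prints "LOW-HIGH" in USD. Hypothetical helper.
estimate_ai_cost() {
    awk -v n="$1" 'BEGIN { printf "%.2f-%.2f\n", n * 0.005, n * 0.015 }'
}

estimate_ai_cost 200   # prints 1.00-3.00
```

Summing cost_usd_cents in ai_runs over the same period gives the actual figure to compare against this estimate.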

Provenance: every call appends one row to ai_runs. Failed runs land with status='failed' and the error message captured. The issue gets ai_intake_run_id pointing at the row that produced its current suggestions.

Disable for one cycle: unset the env var on the droplet, restart the container.

Optional kill-switch

If you ever scale to multiple containers and want only one to run the scheduler:

SCHEDULER_DISABLED=true

…on the containers you want quiet. Default is on in production — you only set this if you specifically want to disable it.
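
In a multi-container setup it can be useful to verify which containers still have the scheduler on. The sketch below inspects a container's configured environment. It assumes the variable is passed through the container config and that only the literal value true disables the scheduler; scheduler_enabled is a hypothetical helper.

```shell
# Succeeds when the env listing on stdin (one VAR=value per line) does not
# contain SCHEDULER_DISABLED=true, i.e. the scheduler is left on.
# Assumption: only the literal value "true" disables the scheduler.
scheduler_enabled() {
    ! grep -qx 'SCHEDULER_DISABLED=true'
}

# Usage on the droplet, once per container:
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' hackorda-app \
#     | scheduler_enabled && echo "scheduler on"
```

Exactly one container should report the scheduler as on.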

Smoke test

After a deploy:

https://hackorda.kz/app/test-cycles shows the Hackorda Onboarding — QA shake-down cycle
File an issue with a screenshot → it appears in All Issues
docker logs hackorda-app | grep scheduler shows [scheduler] registered
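
The first and last checks can be scripted; filing an issue stays manual. This is a minimal sketch, assuming the page answers an unauthenticated request with a non-error status (it only proves the server responds, not that the cycle is listed). The function names are hypothetical.

```shell
# Succeeds when the log text on stdin contains the scheduler registration line.
scheduler_registered() {
    grep -qF '[scheduler] registered'
}

# Run on the droplet after a deploy. Hypothetical helper, not part of the repo.
smoke() {
    curl -fsS -o /dev/null https://hackorda.kz/app/test-cycles ||
        { echo "page check failed" >&2; return 1; }
    # 2>&1 because docker logs replays the container's stderr on stderr.
    docker logs hackorda-app 2>&1 | scheduler_registered ||
        { echo "scheduler line not found" >&2; return 1; }
    echo "smoke OK"
}

# Usage: smoke
```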

That's it. Hand QA the onboarding handbook and test cases.