Agent Platform
Strategic document. Defines how Hackorda becomes accessible to and runnable by AI agents — the MCP/connector surface, the compute map, the permissions model extended to agents, and the two-phase roadmap. Direction set with the owner on 2026-05-30.
Companion docs: Core Feature Roadmap · Infra Roadmap · System Overview · Flows · Feature Matrix
1. The reference model: Firecrawl + AgentMail
Both Firecrawl and AgentMail made the same strategic bet: expose your domain capability as infrastructure that AI agents can call, not just as a UI humans click. They did this by:
- Wrapping existing API endpoints as typed MCP tools with clear names, inputs, and outputs.
- Adding agent-native auth: scoped API keys with permission bounds, so an agent can be given exactly cycles:read + issues:write and nothing more.
- Designing outputs for LLM consumption: structured JSON, not HTML; summaries + IDs, not full blobs; pagination that agents can walk.
- Publishing an MCP server that any MCP-compatible host (Claude Desktop, Cursor, custom agent) can install in seconds.
The result: their product became a verb in AI workflows. A user doesn't go to Firecrawl's UI to scrape; they tell their agent "scrape this site" and the agent calls Firecrawl under the hood.
Hackorda can do the same — and the domain is even better suited: QA workflows are repetitive, structured, and benefit from automation at every step.
2. The two-phase strategy
Phase 1 — Agents USE Hackorda (the MCP surface)
External AI agents (Claude, GPT, custom agents in customer workflows, or Hackorda's own workflow automation) call Hackorda as a service:
Customer agent or AI workflow
│
│ MCP tools / REST API + API key
▼
Hackorda MCP Server
│
▼
Existing Next.js API (/api/test-cycles/*, /api/admin/*)
│
▼
Postgres (existing data model)
Compute: light. A stateless MCP server wrapping the existing API, with the same compute footprint as an extra API route.
Business model: usage-based API (per tool call) or included in per-seat SaaS. Agents calling Hackorda = more activity = more value lock-in.
Phase 2 — Hackorda RUNS agents (agentic QA)
Hackorda itself operates autonomous AI testers that drive real browsers, find bugs, and file them as a service. Humans assign test targets; agents do the testing.
Admin assigns target URL + test scenario
│
▼
Hackorda Agent Runner
├── Spawns sandboxed browser (Playwright)
├── Drives agent loop (LLM → actions → observations)
├── Detects anomalies / unexpected states
└── Files structured issues via the same API
│
▼
Normal triage/payout flow
Compute: heavy. Sandboxed browser processes per concurrent run, LLM inference per test session, artifact storage (screenshots, traces). Billed separately per run.
Business model: per-test-run or per-bug-found. Replaces (or augments) human testers for regression / smoke test coverage.
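The agent loop in the diagram above (LLM → actions → observations) can be sketched as a bounded step loop. This is a minimal, synchronous sketch under stated assumptions: `Action`, `Observation`, `decide`, and `act` are hypothetical names standing in for the LLM call and the Playwright driver, neither of which is shown here, and a real runner would be asynchronous.

```typescript
// Hypothetical shapes: the source does not define an action or observation format.
type Action =
  | { kind: "click" | "navigate" | "type"; target: string }
  | { kind: "done" };

interface Observation {
  url: string;
  anomaly?: string; // set when the page reaches an unexpected state
}

interface RunResult {
  steps: number;
  anomalies: Observation[];
}

// decide: stands in for the LLM call that picks the next action.
// act: stands in for the browser driver that performs it.
function runAgentLoop(
  decide: (history: Observation[]) => Action,
  act: (action: Action) => Observation,
  maxSteps: number,
): RunResult {
  const history: Observation[] = [];
  const anomalies: Observation[] = [];
  let steps = 0;
  while (steps < maxSteps) {
    const action = decide(history);
    if (action.kind === "done") break;
    const observation = act(action);
    steps += 1;
    history.push(observation);
    if (observation.anomaly !== undefined) anomalies.push(observation);
  }
  return { steps, anomalies };
}
```

The `maxSteps` cap bounds LLM spend per run; the collected anomalies are what would be turned into filed issues.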
3. Full permissions model — agents as actors
The existing model has two axes: system role × per-cycle role (see
system-overview.md §4). Agents add a third actor
type that must be mapped into both axes cleanly.
3.1 The five actor types
| Actor | Identity | System role | Per-cycle role | Notes |
|---|---|---|---|---|
| Human admin | Clerk user | ADMIN (1) or SUPER_ADMIN (5) | — | Global access |
| Human tester | Clerk user | QA (4) | tester or lead | Cycle-scoped |
| Human observer | Clerk user | QA (4) | observer | Read-only in cycle |
| Agent (external) | API key | Bound to key's permission set | Optional cycle scope | Programmatic access |
| Agent (runner) | Internal system token | SUPER_ADMIN (5) internally | Writes to any cycle | Only the Runner service, not exposed externally |
3.2 Agent API key model
External agents authenticate via a scoped API key — a new auth layer sitting alongside Clerk. Each key has:
ApiKey {
  id          uuid
  orgId       uuid          -- which org this key belongs to
  name        text          -- e.g. "Claude workflow - staging"
  keyHash     text          -- SHA-256 of the actual key
  scopes      text[]        -- e.g. ['cycles:read', 'issues:write', 'triage:read']
  cycleIds    uuid[]        -- optional: restrict to specific cycles; empty = all in org
  rateLimit   int           -- calls per minute
  expiresAt   timestamptz   -- optional expiry
  lastUsedAt  timestamptz
  createdBy   uuid          -- user who created it
}
The key is passed as Authorization: Bearer hk_live_..., whose prefix is distinct from Clerk tokens so the middleware can route correctly.
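Because only the SHA-256 hash is stored, the plaintext key can be shown once at creation and verified later by re-hashing. The following is a minimal sketch using Node's built-in `crypto` module; the function names and the 32-byte secret length are assumptions, not part of the design above.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const KEY_PREFIX = "hk_live_"; // prefix taken from the text above

// SHA-256 hex digest, matching the keyHash column in the model.
function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}

// Returns the plaintext key (shown to the user once) and the hash to persist.
function generateApiKey(): { plaintext: string; keyHash: string } {
  const secret = randomBytes(32).toString("hex"); // assumed length: 256 bits
  const plaintext = `${KEY_PREFIX}${secret}`;
  return { plaintext, keyHash: hashApiKey(plaintext) };
}

// Compares a presented key against a stored hash in constant time.
function matchesStoredHash(presented: string, storedHash: string): boolean {
  const a = Buffer.from(hashApiKey(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

In practice the lookup would likely be a query on `keyHash` rather than a pairwise comparison; the comparison helper is shown only to make the hash relationship explicit.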
3.3 Permission scopes catalog
Scopes follow <resource>:<action> naming, matching the MCP tool catalog:
| Scope | What it allows |
|---|---|
| cycles:read | List + get cycles, docs, members |
| cycles:write | Create/update cycles (admin scope) |
| issues:read | Read issues, comments, attachments |
| issues:write | File issues, add comments |
| issues:triage | Approve/reject/reclassify (admin scope) |
| runs:write | Start/end test runs |
| payouts:read | Read payout status + balance |
| payouts:write | Trigger batch + mark paid (super-admin scope) |
| analytics:read | Usage events, cycle reports |
| ai:write | Trigger AI re-analysis on an issue |
| admin:read | Read org/product/user data (admin scope) |
Scopes are additive + least-privilege. A key for a triage automation
gets issues:read + issues:triage; a key for a filing agent gets
cycles:read + issues:write + runs:write.
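A scope check under this model reduces to set membership. The sketch below assumes that "additive" means no scope implies another (for example, issues:write does not grant issues:read); `missingScopes` is a hypothetical helper name.

```typescript
// Scope names are taken from the catalog above.
type Scope =
  | "cycles:read" | "cycles:write"
  | "issues:read" | "issues:write" | "issues:triage"
  | "runs:write"
  | "payouts:read" | "payouts:write"
  | "analytics:read" | "ai:write" | "admin:read";

// Returns the required scopes the key is missing; an empty array means allowed.
function missingScopes(granted: readonly Scope[], required: readonly Scope[]): Scope[] {
  const have = new Set<Scope>(granted);
  return required.filter((scope) => !have.has(scope));
}
```

Returning the missing scopes, rather than a boolean, lets an error response tell the agent exactly which scope to request.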
3.4 Per-cycle scoping
A key with cycleIds: [uuid1, uuid2] can only interact with those cycles —
the API enforces the same checkTestCycleAccess() gate that human testers
hit, but resolves via the key's cycleIds instead of the testers table.
An empty cycleIds means all cycles in the key's org.
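The cycle restriction can be sketched as a small predicate. This is not `checkTestCycleAccess()` itself, whose signature is not shown here; the types and the function name are hypothetical, and the org check is an assumption drawn from "all cycles in the key's org".

```typescript
interface KeyCycleScope {
  orgId: string;
  cycleIds: string[]; // empty = all cycles in the key's org
}

interface CycleRef {
  id: string;
  orgId: string;
}

function keyCanAccessCycle(key: KeyCycleScope, cycle: CycleRef): boolean {
  if (cycle.orgId !== key.orgId) return false; // never cross org boundaries
  if (key.cycleIds.length === 0) return true; // unrestricted within the org
  return key.cycleIds.includes(cycle.id);
}
```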
3.5 How the middleware changes
Incoming request
│
├── Authorization: Bearer ey... (Clerk token) → existing Clerk flow
│
└── Authorization: Bearer hk_... (API key) → new path:
│
├── Resolve key from DB (cache in Redis/memory)
├── Validate scopes vs route requirements
├── Validate cycleId restriction if present
├── Rate-limit check
└── Inject ApiKeyContext (replaces AuthContext)
No change to existing human flows: the new path is additive.
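The first branch in the diagram, choosing between the Clerk flow and the API key path, can be sketched as a prefix check on the bearer token. `AuthRoute` and `routeAuthorization` are hypothetical names, and sending every non-`hk_` bearer token to Clerk is an assumption.

```typescript
type AuthRoute =
  | { kind: "clerk"; token: string }
  | { kind: "apiKey"; key: string }
  | { kind: "none" };

// Classifies an Authorization header without validating the credential itself.
function routeAuthorization(header: string | undefined): AuthRoute {
  if (header === undefined) return { kind: "none" };
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  if (match === null) return { kind: "none" };
  const token = match[1];
  return token.startsWith("hk_")
    ? { kind: "apiKey", key: token }
    : { kind: "clerk", token };
}
```

Key resolution, scope validation, the cycle restriction, and rate limiting would then run only on the `apiKey` branch, in the order the diagram lists them.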
4. The MCP tool catalog
The MCP server exposes Hackorda's domain as typed tools. Each tool maps to one or more existing API routes. Initial catalog — Phase 1 launch set:
4.1 Cycle tools
| Tool name | Maps to | Scope | Description |
|---|---|---|---|
| list_cycles | GET /api/test-cycles/browse | cycles:read | List cycles the key's org has access to. Returns id, name, status, product. |
| get_cycle | GET /api/test-cycles/[id] | cycles:read | Full cycle detail: docs, members, payout rates, status. |
| create_cycle | POST /api/admin/test-cycles | cycles:write | Create a cycle for an org/product. |
| update_cycle_status | PATCH /api/admin/test-cycles/[id] | cycles:write | Advance status: planned → active → review → closed. |
| list_cycle_docs | GET /api/test-cycles/[id]/documents | cycles:read | List docs (briefs, runbooks, reports) for a cycle. |
| get_cycle_doc | GET /api/test-cycles/[id]/documents/[docId] | cycles:read | Full markdown content of a cycle doc. |
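The rows above can be read as a small registry that the MCP server consults to turn a tool call into an HTTP request. The sketch below copies the tool names, routes, and scopes from the table; `ToolRoute`, `CYCLE_TOOLS`, and `resolvePath` are hypothetical names, and the actual server may be structured differently.

```typescript
interface ToolRoute {
  method: "GET" | "POST" | "PATCH";
  path: string; // Next.js-style template with [param] segments
  scope: string;
}

// Rows copied from the cycle tools table.
const CYCLE_TOOLS: Record<string, ToolRoute> = {
  list_cycles: { method: "GET", path: "/api/test-cycles/browse", scope: "cycles:read" },
  get_cycle: { method: "GET", path: "/api/test-cycles/[id]", scope: "cycles:read" },
  create_cycle: { method: "POST", path: "/api/admin/test-cycles", scope: "cycles:write" },
  update_cycle_status: { method: "PATCH", path: "/api/admin/test-cycles/[id]", scope: "cycles:write" },
  list_cycle_docs: { method: "GET", path: "/api/test-cycles/[id]/documents", scope: "cycles:read" },
  get_cycle_doc: { method: "GET", path: "/api/test-cycles/[id]/documents/[docId]", scope: "cycles:read" },
};

// Fills [param] segments in a route template; throws if a parameter is missing.
function resolvePath(template: string, params: Record<string, string | undefined>): string {
  return template.replace(/\[(\w+)\]/g, (_segment, name: string) => {
    const value = params[name];
    if (value === undefined) throw new Error(`missing path parameter: ${name}`);
    return encodeURIComponent(value);
  });
}
```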
4.2 Issue tools
| Tool name | Maps to | Scope | Description |
|---|---|---|---|
| list_issues | GET /api/test-cycles/issues | issues:read | Cross-cycle issue list. Filterable by severity, status, payout status, cycle. |
| get_issue | GET /api/test-cycles/[id]/issues/[issueId] | issues:read | Full issue: description, steps, attachments, AI suggestions, payout. |
| file_issue | POST /api/test-cycles/[id]/issues | issues:write | File a new bug. Accepts title, description, steps, expected/actual, severity, attachments. Returns issueId. |
| comment_on_issue | POST /api/test-cycles/[id]/issues/[issueId]/comments | issues:write | Add a comment (markdown). |
| get_issue_comments | GET /api/test-cycles/[id]/issues/[issueId]/comments | issues:read | Thread of comments on an issue. |
| trigger_ai_analysis | POST /api/test-cycles/[id]/issues/[issueId]/intake | ai:write | Re-run AI intake on an issue (get title/severity/type suggestions). |
4.3 Triage tools (admin scope)
| Tool name | Maps to | Scope | Description |
|---|---|---|---|
| list_triage_queue | GET /api/admin/triage | issues:triage | All pending issues awaiting a payout decision. |
| decide_issue | POST /api/admin/triage/decide | issues:triage | Approve or reject a payout. Accepts issueId, decision, optional new severity + amount. |
| get_payout_status | GET /api/admin/test-cycles/[id]/payouts/by-tester | payouts:read | Payout breakdown per tester for a cycle. |
4.4 Run tools
| Tool name | Maps to | Scope | Description |
|---|---|---|---|
| start_run | POST /api/test-cycles/[id]/runs | runs:write | Start a test run in a cycle. Returns runId. |
| complete_run | PATCH /api/test-cycles/[id]/runs/[runId] | runs:write | Mark a run complete with notes. |
4.5 Resource / read tools
| Tool name | Maps to | Scope | Description |
|---|---|---|---|
| get_balance | GET /api/me/balance | payouts:read | Tester's earnings breakdown (pending verification, available, paid). |
| list_organizations | GET /api/admin/organizations | admin:read | Orgs the key has access to. |
| get_cycle_report | GET /api/admin/test-cycles/[id] | cycles:read | Cycle summary: issue counts by severity/status, payout totals. |
4.6 Tool output design principles
All tools return agent-readable JSON — not paginated HTML or UI-shaped responses:
- IDs always present for follow-up calls (issueId, cycleId, runId).
- Status as enum strings ("open", "approved"), not display labels.
- Truncate long markdown bodies by default; pass full=true to get the full content.
- List responses include total + cursor for agents that need to walk pages.
- Error responses always include code (machine-readable) + message (human-readable).
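The list and error principles above can be sketched as two response shapes plus a helper that walks pages by cursor. Only `total`, `cursor`, `code`, and `message` come from the principles; the `items` field, the null end-of-list cursor, and the helper name are assumptions. The helper is synchronous to keep the sketch short, while a real client would await each page.

```typescript
// Hypothetical envelope for list tools.
interface ListResponse<T> {
  items: T[];
  total: number;
  cursor: string | null; // assumed: null when there are no more pages
}

// Hypothetical error shape.
interface ToolError {
  code: string;    // machine-readable
  message: string; // human-readable
}

// Collects every item of a list tool by following the cursor until it is null.
function collectAll<T>(fetchPage: (cursor: string | null) => ListResponse<T>): T[] {
  const items: T[] = [];
  let cursor: string | null = null;
  do {
    const page: ListResponse<T> = fetchPage(cursor);
    items.push(...page.items);
    cursor = page.cursor;
  } while (cursor !== null);
  return items;
}
```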
5. Key use cases
UC-1: Triage automation agent
An agent in an admin's workflow runs each morning, reviews the triage queue, applies consistent severity standards, and pre-approves low-risk issues.
list_triage_queue()
  → for each issue:
      trigger_ai_analysis(issueId)   # ensure fresh suggestions
      get_issue(issueId)             # read the issue after re-analysis
      if issue.aiSuggestions.confidence > 0.9 and issue.severity == 'low':
        decide_issue(issueId, decision='approve', severity='low')
      else:
        # leave for human review
Scope needed: issues:read + issues:triage + ai:write
Value: Admin saves 60–80% of triage time on low-severity backlog.
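The approval rule in the pseudocode can be isolated as a pure function, which makes the policy easy to test or dry-run before any `decide_issue` call is made. This is a sketch under assumptions: the type names, the severity levels other than 'low', and `planTriage` are hypothetical.

```typescript
interface TriageCandidate {
  issueId: string;
  severity: "low" | "medium" | "high" | "critical"; // assumed severity levels
  aiConfidence: number; // stands in for issue.aiSuggestions.confidence
}

type TriageAction =
  | { issueId: string; action: "approve"; severity: "low" }
  | { issueId: string; action: "leave_for_human" };

const CONFIDENCE_THRESHOLD = 0.9; // threshold from the pseudocode above

// Pre-approves only low-severity issues with confidence strictly above the threshold.
function planTriage(queue: TriageCandidate[]): TriageAction[] {
  return queue.map((issue): TriageAction =>
    issue.severity === "low" && issue.aiConfidence > CONFIDENCE_THRESHOLD
      ? { issueId: issue.issueId, action: "approve", severity: "low" }
      : { issueId: issue.issueId, action: "leave_for_human" },
  );
}
```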
UC-2: Automated bug filing from CI
A CI pipeline (GitHub Actions, Jenkins) catches a failing test and automatically files a structured bug report in the active cycle.
# In CI workflow, on test failure:
list_cycles(status='active', productId=env.PRODUCT_ID)
  → get the active cycle id
start_run(cycleId)
  → runId
file_issue(cycleId, {
  title: test.name,
  description: test.failureMessage,
  stepsToReproduce: test.steps,
  severity: 'high',
  url: deployUrl,
  type: 'bug'
})
complete_run(cycleId, runId)
Scope needed: cycles:read + issues:write + runs:write
Value: Zero-latency bug reporting from automated test suites. Every
CI failure becomes a tracked, payable bug if a tester confirms it.
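The `file_issue` payload in the pseudocode can be given a typed shape so a CI step builds it consistently. A minimal sketch: the field names come from the pseudocode, while `TestFailure`, `FileIssueInput`, `toFileIssueInput`, and treating steps as a string array are assumptions.

```typescript
// Hypothetical shape of what a CI test reporter provides.
interface TestFailure {
  name: string;
  failureMessage: string;
  steps: string[];
}

// Field names follow the file_issue call in the pseudocode above.
interface FileIssueInput {
  title: string;
  description: string;
  stepsToReproduce: string[];
  severity: "high";
  url: string;
  type: "bug";
}

// Maps one failing test to a file_issue payload.
function toFileIssueInput(test: TestFailure, deployUrl: string): FileIssueInput {
  return {
    title: test.name,
    description: test.failureMessage,
    stepsToReproduce: test.steps,
    severity: "high", // fixed value, as in the pseudocode
    url: deployUrl,
    type: "bug",
  };
}
```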
UC-3: QA agent in Claude/Cursor
A developer uses Claude Desktop with the Hackorda MCP server installed. "What bugs are open in the v0.9 cycle?" → agent calls list_issues and returns a structured summary. "File that as a bug" → agent calls file_issue with the conversation context.
Scope needed: cycles:read + issues:read + issues:write
Value: QA workflow lives inside the developer's existing AI assistant.
No context switch to a separate tool.
UC-4: Linear → Hackorda sync agent (pairs with roadmap F)
An agent polls Linear for status changes and updates the corresponding Hackorda issue's
externalStatus, keeping the payout pipeline accurate without manual intervention.
Scope needed: issues:read + issues:write (to update external status)
Value: Closes the "deferred" Linear webhook gap without a full webhook
infrastructure build.
UC-5 (Phase 2): Autonomous regression tester
Admin schedules "run a smoke test against staging.product.com after every deploy." Hackorda's agent runner boots a sandboxed browser, navigates the app following the cycle's test plan, and files any anomalies it finds.
Scope needed: Internal runner token (not external API key). Compute: See §6.2.
6. Compute map
6.1 Phase 1 — MCP surface (light)
| Component | Compute | Where |
|---|---|---|
| MCP server process | ~50 MB, stateless | Same DO droplet as app, or tiny dedicated |
| API key table + scope check | Postgres query | Existing DB (Neon) |
| Rate limiter | Existing rate_limit_buckets table | Existing DB |
| Key cache | In-process LRU (< 1 MB) | MCP server memory |
No new infra needed for Phase 1. The MCP server is a thin gateway process deployable on the existing droplet.
6.2 Phase 2 — Agent runner (heavy)
| Component | Compute | Scale |
|---|---|---|
| Browser sandbox | 1–2 CPU + 2 GB RAM per concurrent run (Playwright) | 1 container per run |
| LLM inference | Anthropic API calls per agent step (~10–50 steps/run) | Per-run cost |
| Artifact storage | Screenshots, traces, video (~50–200 MB/run) | DO Spaces / S3 |
| Runner orchestrator | 1 small process | Shared with worker (Phase 1) |
| Sandbox isolation | Docker-in-Docker or separate container per run | 1 run = 1 container |
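The per-run sandbox footprint in the table bounds how many runs a single VM can host at once. A minimal sketch, assuming the upper figure of 2 CPU and 2 GB per run and ignoring overhead from the orchestrator and Docker daemon; `Resources` and `maxConcurrentRuns` are hypothetical names.

```typescript
interface Resources {
  vcpus: number;
  ramGb: number;
}

// Concurrent runs a VM can host: the scarcer resource sets the limit.
function maxConcurrentRuns(vm: Resources, perRun: Resources): number {
  if (perRun.vcpus <= 0 || perRun.ramGb <= 0) {
    throw new Error("per-run requirements must be positive");
  }
  return Math.floor(Math.min(vm.vcpus / perRun.vcpus, vm.ramGb / perRun.ramGb));
}
```

For a 4 vCPU / 8 GB VM this gives 2 concurrent runs, with CPU rather than memory as the binding constraint.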
Rough per-run cost estimate:
- Browser container: ~$0.01–0.05 (5–15 min of a 2 vCPU/2 GB droplet)
- Anthropic calls: ~$0.05–0.20 (10–50 steps × ~$0.003/step, Sonnet)
- Storage: ~$0.001–0.005 (50–200 MB at DO Spaces pricing)
- Total: ~$0.10–0.30 per run
Infra shape for Phase 2:
Runner VM (separate from app, 4 vCPU / 8 GB):
├── Runner orchestrator process (pg-boss worker)
├── Docker daemon
├── Container pool: up to N concurrent browser sandboxes
└── Artifact uploader → DO Spaces
N concurrent runs = N × (2 CPU + 2 GB)
A 4 vCPU / 8 GB droplet → 2 concurrent runs
Scale: add runner VMs horizontally
7. The full wiki structure
Based on the Firecrawl/Linear model — docs organized for both humans and AI agents (agents increasingly read docs to understand how to use a platform).
docs/
├── README.md ← index / "start here"
├── system-overview.md ← architecture, stack, permissions
├── roadmap.md ← infra roadmap (Phase 0→4)
├── feature-roadmap.md ← product roadmap (buckets A→I)
├── feature-matrix.md ← current feature state
├── agent-platform.md ← this doc: agent strategy
│
├── guides/ ← NEW: task-oriented how-tos
│ ├── agent-quickstart.md ← "file your first bug via MCP in 5 min"
│ ├── api-key-setup.md ← create + scope an API key
│ ├── mcp-server-install.md ← Claude Desktop, Cursor, custom agent
│ └── ci-integration.md ← GitHub Actions + Hackorda
│
├── reference/ ← NEW: exhaustive reference (agent-readable)
│ ├── mcp-tools.md ← every MCP tool, inputs/outputs, examples
│ ├── api-routes.md ← existing (update with agent endpoints)
│ ├── permissions.md ← NEW: full scope/role/key model
│ ├── webhooks.md ← NEW (Phase 1B): event webhooks
│ └── errors.md ← error codes catalog
│
├── flows/ ← canonical user journeys (F-01→F-16)
│ └── (existing 16 flow files)
│
├── use-cases/ ← NEW: UC-1→UC-N (this doc §5, expanded)
│ ├── triage-automation.md
│ ├── ci-bug-filing.md
│ ├── claude-desktop-qa.md
│ └── autonomous-regression.md ← Phase 2
│
├── authentication.md ← existing (update with API key model)
├── deployment.md ← existing
└── ops/
├── database.md ← existing
    └── self-hosted-runner.md ← existing
8. The build roadmap for agent features
Phase 1A — Foundation (prerequisite, no user-visible features)
- api_keys table + Drizzle schema
- API key middleware (sits alongside Clerk middleware)
- Admin UI: create / revoke / scope API keys per org
- Rate limiting that reuses the existing rate_limit_buckets table
Phase 1B — MCP server
- packages/mcp-server/: standalone Node process using @modelcontextprotocol/sdk
- Implements the Phase 1 launch set (§4.1–4.5)
- Deployed alongside the app
- Published to npm for self-hosting (optional, low cost)
- Agent-readable error messages + structured outputs
Phase 1C — Connector ecosystem
- REST API documentation (OpenAPI spec → usable in any connector platform)
- Zapier / Make.com connector (file issue, list issues, update status)
- Webhook outbound events (issue filed, triage decided, payout released)
Phase 2A — Runner infrastructure
- runner/ service: pg-boss worker + Docker orchestrator
- Sandbox container image (Playwright + Anthropic SDK + artifact uploader)
- Admin UI: schedule a run, set target URL + test plan
- Run result → structured issues + tester attribution
Phase 2B — Agentic test plans
- AI agent uses cycle's doc (test plan/runbook) as the instruction set
- Executes steps, takes screenshots at each, compares to expected state
- Files anomalies with full context: screenshot, page URL, console errors
9. Business value summary
| Capability | Value to customers | Revenue model |
|---|---|---|
| MCP server | QA workflow in their AI assistant (no context switch) | Included in SaaS plan or per-seat API tier |
| API keys | Integrate Hackorda into CI/CD, internal tools | Unblocks enterprise customers who won't use OAuth |
| Webhooks | Real-time sync to Slack, Linear, Jira without polling | Sticky integration = retention |
| Agent triage | Reduces admin triage time by 60–80% | Feature premium or included in growth tier |
| Agent runner | Continuous regression testing without human testers | Per-run billing — new revenue stream |
| Connector marketplace | Lower barrier to adoption | Marketplace distribution = acquisition |
Why this matters competitively: QA platforms that become composable infrastructure (callable by agents) have a different retention curve than those that are just UI tools. Every AI workflow that depends on Hackorda is a workflow that can't easily be ripped out.
10. Open decisions (owner's call)
- API key billing tier — included in existing seats, or a separate API plan? (Affects pricing architecture before Phase 1A ships.)
- MCP server distribution — hosted only (SaaS), self-hosted (open source npm package), or both? Firecrawl does both.
- Webhook delivery — in-process (fire-and-forget, same durability gap as current AI calls) vs durable (through the Phase 1 job queue). Should wait for Phase 1 queue.
- Runner isolation — Docker-in-Docker on a shared VM vs dedicated per-run Firecracker/Fly Machines. DinD is simpler; Firecracker is more secure for untrusted targets.
- Agent runner pricing — per-run flat fee vs per-minute vs per-bug-found? Per-bug-found is most aligned but hard to meter.