Hackorda Docs

Agent Platform

Strategic document. Defines how Hackorda becomes accessible to and runnable by AI agents — the MCP/connector surface, the compute map, the permissions model extended to agents, and the two-phase roadmap. Direction set with the owner on 2026-05-30.

Companion docs: Core Feature Roadmap · Infra Roadmap · System Overview · Flows · Feature Matrix


1. The reference model: Firecrawl + AgentMail

Both Firecrawl and AgentMail made the same strategic bet: expose your domain capability as infrastructure that AI agents can call, not just as a UI humans click. They did this by:

  1. Wrapping existing API endpoints as typed MCP tools with clear names, inputs, and outputs.
  2. Adding agent-native auth: scoped API keys with permission bounds, so an agent can be given exactly cycles:read + issues:write and nothing more.
  3. Designing outputs for LLM consumption — structured JSON, not HTML; summaries + IDs, not full blobs; pagination that agents can walk.
  4. Publishing an MCP server that any MCP-compatible host (Claude Desktop, Cursor, custom agent) can install in seconds.
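
As a rough illustration of steps 1 to 3 above, the sketch below wraps a hypothetical "list issues" fetcher as a typed tool that declares its required scope and returns summaries with IDs instead of UI-shaped records. The `ToolDefinition` shape, `summarizeIssue`, and all field names are assumptions made for this example; they are not taken from Hackorda's codebase or from the MCP SDK.

```typescript
// Hypothetical shapes for illustration only.
type Scope = `${string}:${string}`;

interface ToolDefinition<In, Out> {
  name: string;            // clear tool name an agent can pick by reading it
  requiredScope: Scope;    // permission bound the calling key must hold
  run: (input: In) => Out; // wraps an existing API endpoint
}

interface IssueRecord {
  id: string;
  title: string;
  severity: "low" | "medium" | "high";
  descriptionHtml: string; // UI-shaped field an agent does not need
}

interface IssueSummary {
  issueId: string;
  title: string;
  severity: string;
}

// Step 3: return summaries + IDs rather than full blobs.
function summarizeIssue(issue: IssueRecord): IssueSummary {
  return { issueId: issue.id, title: issue.title, severity: issue.severity };
}

// Step 1: an existing endpoint (injected here as fetchIssues) wrapped as a typed tool.
function makeListIssuesTool(
  fetchIssues: (cycleId: string) => IssueRecord[],
): ToolDefinition<{ cycleId: string }, { issues: IssueSummary[] }> {
  return {
    name: "list_issues",
    requiredScope: "issues:read",
    run: ({ cycleId }) => ({ issues: fetchIssues(cycleId).map(summarizeIssue) }),
  };
}
```

The fetcher is injected so the same tool definition can sit in front of a real HTTP call or a test stub.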

The result: their product became a verb in AI workflows. A user doesn't go to Firecrawl's UI to scrape; they tell their agent "scrape this site" and the agent calls Firecrawl under the hood.

Hackorda can do the same — and the domain is even better suited: QA workflows are repetitive, structured, and benefit from automation at every step.


2. The two-phase strategy

Phase 1 — Agents USE Hackorda (the MCP surface)

External AI agents (Claude, GPT, custom agents in customer workflows, or Hackorda's own workflow automation) call Hackorda as a service:

Customer agent or AI workflow
        │
        │  MCP tools / REST API + API key
        ▼
  Hackorda MCP Server
        │
        ▼
  Existing Next.js API (/api/test-cycles/*, /api/admin/*)
        │
        ▼
  Postgres (existing data model)

Compute: light. A stateless MCP server wrapping the existing API — same compute footprint as an extra API route.

Business model: usage-based API (per tool call) or included in per-seat SaaS. Agents calling Hackorda = more activity = more value lock-in.

Phase 2 — Hackorda RUNS agents (agentic QA)

Hackorda itself operates autonomous AI testers that drive real browsers, find bugs, and file them as a service. Humans assign test targets; agents do the testing.

Admin assigns target URL + test scenario
        │
        ▼
  Hackorda Agent Runner
  ├── Spawns sandboxed browser (Playwright)
  ├── Drives agent loop (LLM → actions → observations)
  ├── Detects anomalies / unexpected states
  └── Files structured issues via the same API
        │
        ▼
  Normal triage/payout flow

Compute: heavy. Sandboxed browser processes per concurrent run, LLM inference per test session, artifact storage (screenshots, traces). Billed separately per run.
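
The agent loop named in the diagram (LLM → actions → observations) could look roughly like the sketch below. It is deliberately synchronous and dependency-free: `decide` stands in for the LLM call and `act` for the browser, and both are hypothetical interfaces. A real runner would use async Playwright and LLM clients, and a richer anomaly check than "any console error".

```typescript
// Hypothetical interfaces; a real runner would wrap Playwright and an LLM client (both async).
interface Observation {
  url: string;
  consoleErrors: string[];
}

type Action =
  | { kind: "navigate"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "done" };

interface Anomaly {
  step: number;
  url: string;
  detail: string;
}

function runAgentLoop(
  decide: (obs: Observation, step: number) => Action, // stands in for the LLM
  act: (action: Action) => Observation,               // stands in for the browser
  initial: Observation,
  maxSteps: number,                                   // hard cap so a run always terminates
): Anomaly[] {
  const anomalies: Anomaly[] = [];
  let obs = initial;
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(obs, step);
    if (action.kind === "done") break;
    obs = act(action);
    // Simplest possible detector: every console error is recorded as an anomaly.
    for (const detail of obs.consoleErrors) {
      anomalies.push({ step, url: obs.url, detail });
    }
  }
  return anomalies;
}
```

The returned anomalies are what the runner would turn into structured issues; the `maxSteps` cap also bounds LLM spend per run.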

Business model: per-test-run or per-bug-found. Replaces (or augments) human testers for regression / smoke test coverage.


3. Full permissions model — agents as actors

The existing model has two axes: system role × per-cycle role (see system-overview.md §4). Agents add two new actor types, external and runner, that must be mapped onto both axes cleanly.

3.1 The five actor types

| Actor | Identity | System role | Per-cycle role | Notes |
| --- | --- | --- | --- | --- |
| Human admin | Clerk user | ADMIN (1) or SUPER_ADMIN (5) | | Global access |
| Human tester | Clerk user | QA (4) | tester or lead | Cycle-scoped |
| Human observer | Clerk user | QA (4) | observer | Read-only in cycle |
| Agent (external) | API key | Bound to key's permission set | Optional cycle scope | Programmatic access |
| Agent (runner) | Internal system token | SUPER_ADMIN (5) internally | Writes to any cycle | Only the Runner service, not exposed externally |

3.2 Agent API key model

External agents authenticate via a scoped API key — a new auth layer sitting alongside Clerk. Each key has:

ApiKey {
  id            uuid
  orgId         uuid           -- which org this key belongs to
  name          text           -- e.g. "Claude workflow - staging"
  keyHash       text           -- SHA-256 of the actual key
  scopes        text[]         -- e.g. ['cycles:read', 'issues:write', 'triage:read']
  cycleIds      uuid[]         -- optional: restrict to specific cycles; empty = all in org
  rateLimit     int            -- calls per minute
  expiresAt     timestamptz    -- optional expiry
  lastUsedAt    timestamptz
  createdBy     uuid           -- user who created it
}

The key is passed as Authorization: Bearer hk_live_... — distinct prefix from Clerk tokens so the middleware can route correctly.
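
A minimal sketch of how such a key might be minted, hashed for the `keyHash` column, and recognized in the `Authorization` header. The function names, the 24-byte random length, and the exact character set accepted by the regex are assumptions for illustration; only the `hk_live_` prefix and the SHA-256 hashing come from the text above.

```typescript
import { createHash, randomBytes } from "node:crypto";

const KEY_PREFIX = "hk_live_"; // prefix taken from the text above

// Only the SHA-256 digest is persisted (the keyHash column), never the plaintext key.
function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}

// Assumption: 24 random bytes, hex-encoded. The source does not specify a key length.
function generateApiKey(): { plaintext: string; keyHash: string } {
  const plaintext = KEY_PREFIX + randomBytes(24).toString("hex");
  return { plaintext, keyHash: hashApiKey(plaintext) };
}

// Returns the API key from an Authorization header, or null if the bearer
// token is not an hk_ key (for example a Clerk JWT starting with "ey").
function parseApiKeyHeader(header: string): string | null {
  const match = /^Bearer (hk_[A-Za-z0-9_]+)$/.exec(header);
  return match?.[1] ?? null;
}
```

On each request the middleware would hash the presented key and look the digest up, so a database leak does not expose usable keys.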

3.3 Permission scopes catalog

Scopes follow <resource>:<action> naming, matching the MCP tool catalog:

| Scope | What it allows |
| --- | --- |
| cycles:read | List + get cycles, docs, members |
| cycles:write | Create/update cycles (admin scope) |
| issues:read | Read issues, comments, attachments |
| issues:write | File issues, add comments |
| issues:triage | Approve/reject/reclassify (admin scope) |
| runs:write | Start/end test runs |
| payouts:read | Read payout status + balance |
| payouts:write | Trigger batch + mark paid (super-admin scope) |
| analytics:read | Usage events, cycle reports |
| ai:write | Trigger AI re-analysis on an issue |
| admin:read | Read org/product/user data (admin scope) |

Scopes are additive + least-privilege. A key for a triage automation gets issues:read + issues:triage; a key for a filing agent gets cycles:read + issues:write + runs:write.
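
The additive check can be sketched as a set difference: a request passes only if the key holds every scope the route requires. This version assumes no wildcards and no implied scopes (for example, `issues:write` does not grant `issues:read`); the source does not say either way, and `missingScopes` is a hypothetical helper name.

```typescript
type Scope = string; // "<resource>:<action>", e.g. "issues:read"

// Returns the required scopes the key does NOT hold; an empty array means access is allowed.
function missingScopes(granted: readonly Scope[], required: readonly Scope[]): Scope[] {
  const have = new Set(granted);
  return required.filter((scope) => !have.has(scope));
}

// The two example keys from the paragraph above.
const triageKey: Scope[] = ["issues:read", "issues:triage"];
const filingKey: Scope[] = ["cycles:read", "issues:write", "runs:write"];
```

Returning the missing scopes rather than a boolean lets the error response tell an agent exactly which permission to request.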

3.4 Per-cycle scoping

A key with cycleIds: [uuid1, uuid2] can only interact with those cycles — the API enforces the same checkTestCycleAccess() gate that human testers hit, but resolves via the key's cycleIds instead of the testers table. An empty cycleIds means all cycles in the key's org.
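
The rule above reduces to a small predicate. This is a sketch of the key-side resolution only, not the actual `checkTestCycleAccess()` implementation; the interface and function names are hypothetical, and the org check is an assumption implied by "all cycles in the key's org".

```typescript
interface KeyCycleScope {
  orgId: string;
  cycleIds: string[]; // empty = all cycles in the key's org
}

interface CycleRef {
  id: string;
  orgId: string;
}

function keyCanAccessCycle(key: KeyCycleScope, cycle: CycleRef): boolean {
  if (cycle.orgId !== key.orgId) return false; // never cross org boundaries
  if (key.cycleIds.length === 0) return true;  // unrestricted within the org
  return key.cycleIds.includes(cycle.id);
}
```

Note that "empty means all" makes an accidentally cleared list widen access, so the admin UI may want to confirm that case explicitly.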

3.5 How the middleware changes

Incoming request
   │
   ├── Authorization: Bearer ey...  (Clerk token)  → existing Clerk flow
   │
   └── Authorization: Bearer hk_...  (API key)     → new path:
            │
            ├── Resolve key from DB (cache in Redis/memory)
            ├── Validate scopes vs route requirements
            ├── Validate cycleId restriction if present
            ├── Rate-limit check
            └── Inject ApiKeyContext (replaces AuthContext)

No change to existing human flows — the new path is additive.
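
The branch order in the diagram can be sketched as one pure function. Key lookup and the rate limiter are injected so the example has no database dependency; `authenticate`, the `AuthResult` shape, and the error code strings are assumptions for illustration, not existing Hackorda identifiers.

```typescript
interface ApiKeyRecord {
  id: string;
  scopes: string[];
  cycleIds: string[];     // empty = all cycles in the org
  expiresAt: Date | null; // null = no expiry
}

type AuthResult =
  | { path: "clerk" } // hand off to the existing Clerk flow untouched
  | { path: "apiKey"; ok: true; keyId: string }
  | { path: "apiKey"; ok: false; code: string };

const deny = (code: string): AuthResult => ({ path: "apiKey", ok: false, code });

function authenticate(
  authorization: string,
  lookup: (token: string) => ApiKeyRecord | undefined, // DB or cache lookup, injected
  requiredScopes: string[],
  cycleId: string | null,
  withinRateLimit: (keyId: string) => boolean,
  now: Date,
): AuthResult {
  const token = authorization.replace(/^Bearer /, "");
  if (!token.startsWith("hk_")) return { path: "clerk" };

  const key = lookup(token);
  if (!key) return deny("invalid_key");
  if (key.expiresAt !== null && key.expiresAt.getTime() <= now.getTime()) return deny("key_expired");
  if (!requiredScopes.every((s) => key.scopes.includes(s))) return deny("insufficient_scope");
  if (cycleId !== null && key.cycleIds.length > 0 && !key.cycleIds.includes(cycleId)) {
    return deny("cycle_forbidden");
  }
  if (!withinRateLimit(key.id)) return deny("rate_limited");
  return { path: "apiKey", ok: true, keyId: key.id };
}
```

Running the rate-limit check last means rejected requests do not consume a key's quota; whether that is desirable is a policy choice.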


4. The MCP tool catalog

The MCP server exposes Hackorda's domain as typed tools. Each tool maps to one or more existing API routes. Initial catalog — Phase 1 launch set:

4.1 Cycle tools

| Tool name | Maps to | Scope | Description |
| --- | --- | --- | --- |
| list_cycles | GET /api/test-cycles/browse | cycles:read | List cycles the key's org has access to. Returns id, name, status, product. |
| get_cycle | GET /api/test-cycles/[id] | cycles:read | Full cycle detail: docs, members, payout rates, status. |
| create_cycle | POST /api/admin/test-cycles | cycles:write | Create a cycle for an org/product. |
| update_cycle_status | PATCH /api/admin/test-cycles/[id] | cycles:write | Advance status: planned → active → review → closed. |
| list_cycle_docs | GET /api/test-cycles/[id]/documents | cycles:read | List docs (briefs, runbooks, reports) for a cycle. |
| get_cycle_doc | GET /api/test-cycles/[id]/documents/[docId] | cycles:read | Full markdown content of a cycle doc. |
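
The status progression that update_cycle_status advances can be guarded with a small transition check. The sketch assumes a cycle may only move one step forward at a time; the source lists the order (planned → active → review → closed) but does not say whether skipping or reverting is allowed, and `canAdvance` is a hypothetical name.

```typescript
const CYCLE_STATUSES = ["planned", "active", "review", "closed"] as const;
type CycleStatus = (typeof CYCLE_STATUSES)[number];

// Assumption: only single-step forward moves are valid.
function canAdvance(from: CycleStatus, to: CycleStatus): boolean {
  return CYCLE_STATUSES.indexOf(to) === CYCLE_STATUSES.indexOf(from) + 1;
}
```

Rejecting invalid transitions in the tool layer gives an agent a clear machine-readable failure instead of a silently ignored update.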

4.2 Issue tools

| Tool name | Maps to | Scope | Description |
| --- | --- | --- | --- |
| list_issues | GET /api/test-cycles/issues | issues:read | Cross-cycle issue list. Filterable by severity, status, payout status, cycle. |
| get_issue | GET /api/test-cycles/[id]/issues/[issueId] | issues:read | Full issue: description, steps, attachments, AI suggestions, payout. |
| file_issue | POST /api/test-cycles/[id]/issues | issues:write | File a new bug. Accepts title, description, steps, expected/actual, severity, attachments. Returns issueId. |
| comment_on_issue | POST /api/test-cycles/[id]/issues/[issueId]/comments | issues:write | Add a comment (markdown). |
| get_issue_comments | GET /api/test-cycles/[id]/issues/[issueId]/comments | issues:read | Thread of comments on an issue. |
| trigger_ai_analysis | POST /api/test-cycles/[id]/issues/[issueId]/intake | ai:write | Re-run AI intake on an issue (get title/severity/type suggestions). |
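
Because each tool maps to a route template plus a scope, the mapping itself can live in a small registry. The sketch below copies three rows from the tables above and fills the `[param]` placeholders; the registry shape and the `resolveRoute` helper are assumptions, not existing code.

```typescript
interface ToolRoute {
  method: "GET" | "POST" | "PATCH";
  path: string;  // Next.js-style template with [param] segments
  scope: string;
}

// Three entries copied from the catalog; a real registry would list every tool.
const TOOL_ROUTES: Record<string, ToolRoute> = {
  get_cycle: { method: "GET", path: "/api/test-cycles/[id]", scope: "cycles:read" },
  get_issue: { method: "GET", path: "/api/test-cycles/[id]/issues/[issueId]", scope: "issues:read" },
  file_issue: { method: "POST", path: "/api/test-cycles/[id]/issues", scope: "issues:write" },
};

// Replaces each [param] with its URL-encoded value; throws on unknown tools or missing params.
function resolveRoute(
  tool: string,
  params: Record<string, string>,
): { method: string; url: string; scope: string } {
  const route = TOOL_ROUTES[tool];
  if (!route) throw new Error(`unknown tool: ${tool}`);
  const url = route.path.replace(/\[(\w+)\]/g, (_match, name: string) => {
    const value = params[name];
    if (value === undefined) throw new Error(`missing path parameter: ${name}`);
    return encodeURIComponent(value);
  });
  return { method: route.method, url, scope: route.scope };
}
```

Keeping scope next to the route means the scope check and the HTTP call are driven by the same table, so they cannot drift apart.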

4.3 Triage tools (admin scope)

| Tool name | Maps to | Scope | Description |
| --- | --- | --- | --- |
| list_triage_queue | GET /api/admin/triage | issues:triage | All pending issues awaiting a payout decision. |
| decide_issue | POST /api/admin/triage/decide | issues:triage | Approve or reject a payout. Accepts issueId, decision, optional new severity + amount. |
| get_payout_status | GET /api/admin/test-cycles/[id]/payouts/by-tester | payouts:read | Payout breakdown per tester for a cycle. |

4.4 Run tools

| Tool name | Maps to | Scope | Description |
| --- | --- | --- | --- |
| start_run | POST /api/test-cycles/[id]/runs | runs:write | Start a test run in a cycle. Returns runId. |
| complete_run | PATCH /api/test-cycles/[id]/runs/[runId] | runs:write | Mark a run complete with notes. |

4.5 Resource / read tools

| Tool name | Maps to | Scope | Description |
| --- | --- | --- | --- |
| get_balance | GET /api/me/balance | payouts:read | Tester's earnings breakdown (pending verification, available, paid). |
| list_organizations | GET /api/admin/organizations | admin:read | Orgs the key has access to. |
| get_cycle_report | GET /api/admin/test-cycles/[id] | cycles:read | Cycle summary: issue counts by severity/status, payout totals. |

4.6 Tool output design principles

All tools return agent-readable JSON — not paginated HTML or UI-shaped responses:

  • IDs always present for follow-up calls (issueId, cycleId, runId).
  • Status as enum strings ("open", "approved") not display labels.
  • Truncate long markdown bodies by default; pass full=true to get the full content.
  • List responses include total + cursor for agents that need to walk pages.
  • Error responses always include code (machine-readable) + message (human-readable).
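
A sketch of three of these principles as helpers: truncation with a `full` override, a list envelope with `total` + `cursor`, and the `code` + `message` error shape. The 500-character preview length, the offset-based cursor encoding, and all helper names are assumptions; the source does not specify them.

```typescript
interface ListResponse<T> {
  items: T[];
  total: number;
  cursor: string | null; // null = no more pages
}

interface ToolError {
  code: string;    // machine-readable
  message: string; // human-readable
}

const PREVIEW_CHARS = 500; // assumed default; the source gives no limit

function truncateBody(markdown: string, full: boolean): { body: string; truncated: boolean } {
  if (full || markdown.length <= PREVIEW_CHARS) return { body: markdown, truncated: false };
  return { body: markdown.slice(0, PREVIEW_CHARS), truncated: true };
}

// Offset-based cursor, treated as opaque by the caller.
function paginate<T>(all: T[], cursor: string | null, pageSize: number): ListResponse<T> {
  const start = cursor === null ? 0 : Number.parseInt(cursor, 10);
  const end = start + pageSize;
  return {
    items: all.slice(start, end),
    total: all.length,
    cursor: end < all.length ? String(end) : null,
  };
}

function toolError(code: string, message: string): { error: ToolError } {
  return { error: { code, message } };
}
```

An agent walks pages by passing the returned `cursor` back unchanged until it is `null`.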

5. Key use cases

UC-1: Triage automation agent

An agent in an admin's workflow runs each morning, reviews the triage queue, applies consistent severity standards, and pre-approves low-risk issues.

for each item in list_triage_queue():
    trigger_ai_analysis(item.issueId)    # ensure fresh suggestions
    issue = get_issue(item.issueId)      # fetch after re-analysis so suggestions are current
    if issue.aiSuggestions.confidence > 0.9 and issue.severity == 'low':
        decide_issue(item.issueId, decision='approve', severity='low')
    # otherwise: leave for human review

Scope needed: issues:read + issues:triage + ai:write
Value: Admin saves 60–80% of triage time on low-severity backlog.
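
One way to keep the auto-approval rule auditable is to separate the decision from the tool calls. The sketch below is a pure planning function over already-fetched issues; the `TriageIssue` shape follows the pseudocode above, while the extra severity values and the `planTriage` name are assumptions.

```typescript
interface TriageIssue {
  issueId: string;
  severity: "low" | "medium" | "high" | "critical"; // values beyond 'low'/'high' are assumed
  aiSuggestions: { confidence: number };
}

interface TriageAction {
  issueId: string;
  action: "approve" | "leave_for_human";
}

const CONFIDENCE_THRESHOLD = 0.9; // threshold from the pseudocode above

// Pure function: decides what to do, but performs no API calls.
function planTriage(queue: TriageIssue[]): TriageAction[] {
  return queue.map((issue): TriageAction => {
    const autoApprove =
      issue.severity === "low" && issue.aiSuggestions.confidence > CONFIDENCE_THRESHOLD;
    return { issueId: issue.issueId, action: autoApprove ? "approve" : "leave_for_human" };
  });
}
```

The agent would call decide_issue only for entries marked `approve`, and the plan can be logged or dry-run before any payout decision is made.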


UC-2: Automated bug filing from CI

A CI pipeline (GitHub Actions, Jenkins) catches a failing test and automatically files a structured bug report in the active cycle.

# In CI workflow, on test failure:
list_cycles(status='active', productId=env.PRODUCT_ID)
  → get the active cycle id
start_run(cycleId)
  → runId
file_issue(cycleId, {
  title: test.name,
  description: test.failureMessage,
  stepsToReproduce: test.steps,
  severity: 'high',
  url: deployUrl,
  type: 'bug'
})
complete_run(cycleId, runId)

Scope needed: cycles:read + issues:write + runs:write
Value: Zero-latency bug reporting from automated test suites. Every CI failure becomes a tracked, payable bug if a tester confirms it.
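
The two pieces of glue a CI job needs before calling the tools are picking the active cycle and shaping the failure into a file_issue payload. The sketch below assumes `steps` is a list of strings and that a job should fail loudly unless exactly one cycle is active; neither is stated in the source, and both helper names are hypothetical.

```typescript
interface CycleSummary {
  id: string;
  status: string;
}

interface TestFailure {
  name: string;
  failureMessage: string;
  steps: string[]; // assumed to be a list of plain-text steps
}

interface FileIssueInput {
  title: string;
  description: string;
  stepsToReproduce: string[];
  severity: "high";
  url: string;
  type: "bug";
}

// Assumption: filing is only unambiguous when exactly one cycle is active.
function pickActiveCycleId(cycles: CycleSummary[]): string {
  const active = cycles.filter((c) => c.status === "active");
  const [only] = active;
  if (active.length !== 1 || only === undefined) {
    throw new Error(`expected exactly one active cycle, found ${active.length}`);
  }
  return only.id;
}

// Field names mirror the file_issue call in the pseudocode above.
function toFileIssueInput(test: TestFailure, deployUrl: string): FileIssueInput {
  return {
    title: test.name,
    description: test.failureMessage,
    stepsToReproduce: test.steps,
    severity: "high",
    url: deployUrl,
    type: "bug",
  };
}
```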


UC-3: QA agent in Claude/Cursor

A developer uses Claude Desktop with the Hackorda MCP server installed. "What bugs are open in the v0.9 cycle?" → agent calls list_issues and returns a structured summary. "File that as a bug" → agent calls file_issue with the conversation context.

Scope needed: cycles:read + issues:read + issues:write
Value: QA workflow lives inside the developer's existing AI assistant. No context switch to a separate tool.


UC-4: Linear → Hackorda sync agent (pairs with roadmap F)

An agent polls Linear for status changes and updates the corresponding Hackorda issue's externalStatus, keeping the payout pipeline accurate without manual intervention.

Scope needed: issues:read + issues:write (to update external status)
Value: Closes the "deferred" Linear webhook gap without a full webhook infrastructure build.
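
The core of such a sync agent is a diff between what Hackorda has recorded and what the latest poll returned. The sketch below covers only that diff; fetching from Linear and writing back to Hackorda are out of scope, and the `linearId` field and `diffExternalStatus` name are assumptions (only `externalStatus` appears in the source).

```typescript
interface TrackedIssue {
  issueId: string;
  linearId: string;       // assumed link field to the Linear issue
  externalStatus: string; // last status Hackorda recorded
}

interface StatusUpdate {
  issueId: string;
  externalStatus: string;
}

// linearStatuses: latest status per Linear issue id, as returned by one poll.
function diffExternalStatus(
  tracked: TrackedIssue[],
  linearStatuses: Map<string, string>,
): StatusUpdate[] {
  const updates: StatusUpdate[] = [];
  for (const issue of tracked) {
    const latest = linearStatuses.get(issue.linearId);
    // Skip issues missing from the poll and issues whose status is unchanged.
    if (latest !== undefined && latest !== issue.externalStatus) {
      updates.push({ issueId: issue.issueId, externalStatus: latest });
    }
  }
  return updates;
}
```

Writing only the changed issues keeps the agent within its rate limit and makes each poll idempotent.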


UC-5 (Phase 2): Autonomous regression tester

Admin schedules "run a smoke test against staging.product.com after every deploy." Hackorda's agent runner boots a sandboxed browser, navigates the app following the cycle's test plan, and files any anomalies it finds.

Scope needed: Internal runner token (not external API key).
Compute: See §6.2.


6. Compute map

6.1 Phase 1 — MCP surface (light)

| Component | Compute | Where |
| --- | --- | --- |
| MCP server process | ~50 MB, stateless | Same DO droplet as app, or tiny dedicated |
| API key table + scope check | Postgres query | Existing DB (Neon) |
| Rate limiter | Existing rate_limit_buckets table | Existing DB |
| Key cache | In-process LRU (< 1 MB) | MCP server memory |

No new infra needed for Phase 1. The MCP server is a thin gateway process deployable on the existing droplet.
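
To show what the per-key "calls per minute" limit involves, here is a fixed-window counter held in memory. It is only a stand-in: the plan above is to reuse the `rate_limit_buckets` table, whose schema and algorithm are not described in this doc, so the class name, the fixed-window choice, and the in-memory `Map` are all assumptions.

```typescript
class FixedWindowLimiter {
  private readonly limitPerMinute: number;
  private readonly buckets = new Map<string, { windowStart: number; count: number }>();

  constructor(limitPerMinute: number) {
    this.limitPerMinute = limitPerMinute;
  }

  // Returns true and counts the call if it is allowed. nowMs is injected for testability.
  allow(keyId: string, nowMs: number): boolean {
    const windowStart = Math.floor(nowMs / 60_000) * 60_000;
    let bucket = this.buckets.get(keyId);
    if (!bucket || bucket.windowStart !== windowStart) {
      bucket = { windowStart, count: 0 }; // a new minute resets the counter
      this.buckets.set(keyId, bucket);
    }
    if (bucket.count >= this.limitPerMinute) return false;
    bucket.count += 1;
    return true;
  }
}
```

A fixed window permits short bursts across a window boundary; if that matters, a sliding window or token bucket would be the usual alternatives.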

6.2 Phase 2 — Agent runner (heavy)

| Component | Compute | Scale |
| --- | --- | --- |
| Browser sandbox | 1–2 CPU + 2 GB RAM per concurrent run (Playwright) | 1 container per run |
| LLM inference | Anthropic API calls per agent step (~10–50 steps/run) | Per-run cost |
| Artifact storage | Screenshots, traces, video (~50–200 MB/run) | DO Spaces / S3 |
| Runner orchestrator | 1 small process | Shared with worker (Phase 1) |
| Sandbox isolation | Docker-in-Docker or separate container per run | 1 run = 1 container |

Rough per-run cost estimate:

  • Browser container: ~$0.01–0.05 (5–15 min of a 2 vCPU/2 GB droplet)
  • Anthropic calls: ~$0.05–0.20 (10–50 steps × ~$0.003/step, Sonnet)
  • Storage: ~$0.001–0.005 (50–200 MB at DO Spaces pricing)
  • Total: ~$0.10–0.30 per run

Infra shape for Phase 2:

Runner VM (separate from app, 4 vCPU / 8 GB):
  ├── Runner orchestrator process (pg-boss worker)
  ├── Docker daemon
  ├── Container pool: up to N concurrent browser sandboxes
  └── Artifact uploader → DO Spaces

  N concurrent runs = N × (2 CPU + 2 GB)
  A 4 vCPU / 8 GB droplet → 2 concurrent runs
  Scale: add runner VMs horizontally
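
The sizing rule above is simple enough to express as arithmetic. The sketch takes the per-run footprint of 2 CPU + 2 GB from the formula and ignores overhead from the orchestrator and Docker daemon, which is a simplifying assumption; the function names are hypothetical.

```typescript
const CPU_PER_RUN = 2;    // vCPU per browser sandbox, from the formula above
const RAM_GB_PER_RUN = 2; // GB per browser sandbox

// Concurrency is bounded by whichever resource runs out first.
function concurrentRuns(vcpu: number, ramGb: number): number {
  return Math.min(Math.floor(vcpu / CPU_PER_RUN), Math.floor(ramGb / RAM_GB_PER_RUN));
}

// Number of identical runner VMs needed to reach a target concurrency.
function runnerVmsNeeded(targetConcurrency: number, vcpu: number, ramGb: number): number {
  const perVm = concurrentRuns(vcpu, ramGb);
  if (perVm < 1) throw new Error("VM too small for a single sandbox");
  return Math.ceil(targetConcurrency / perVm);
}
```

For a 4 vCPU / 8 GB droplet, `concurrentRuns(4, 8)` gives 2, matching the figure above; the limit is CPU rather than RAM, so a CPU-heavier VM shape would raise concurrency without adding memory.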

7. The full wiki structure

Based on the Firecrawl/Linear model — docs organized for both humans and AI agents (agents increasingly read docs to understand how to use a platform).

docs/
├── README.md                        ← index / "start here"
├── system-overview.md               ← architecture, stack, permissions
├── roadmap.md                       ← infra roadmap (Phase 0→4)
├── feature-roadmap.md               ← product roadmap (buckets A→I)
├── feature-matrix.md                ← current feature state
├── agent-platform.md                ← this doc: agent strategy
│
├── guides/                          ← NEW: task-oriented how-tos
│   ├── agent-quickstart.md          ← "file your first bug via MCP in 5 min"
│   ├── api-key-setup.md             ← create + scope an API key
│   ├── mcp-server-install.md        ← Claude Desktop, Cursor, custom agent
│   └── ci-integration.md            ← GitHub Actions + Hackorda
│
├── reference/                       ← NEW: exhaustive reference (agent-readable)
│   ├── mcp-tools.md                 ← every MCP tool, inputs/outputs, examples
│   ├── api-routes.md                ← existing (update with agent endpoints)
│   ├── permissions.md               ← NEW: full scope/role/key model
│   ├── webhooks.md                  ← NEW (Phase 1B): event webhooks
│   └── errors.md                    ← error codes catalog
│
├── flows/                           ← canonical user journeys (F-01→F-16)
│   └── (existing 16 flow files)
│
├── use-cases/                       ← NEW: UC-1→UC-N (this doc §5, expanded)
│   ├── triage-automation.md
│   ├── ci-bug-filing.md
│   ├── claude-desktop-qa.md
│   └── autonomous-regression.md     ← Phase 2
│
├── authentication.md                ← existing (update with API key model)
├── deployment.md                    ← existing
└── ops/
    ├── database.md                  ← existing
    └── self-hosted-runner.md        ← existing

8. The build roadmap for agent features

Phase 1A — Foundation (prerequisite, no user-visible features)

  • api_keys table + Drizzle schema
  • API key middleware (sits alongside Clerk middleware)
  • Admin UI: create / revoke / scope API keys per org
  • Rate-limiting reuse of existing rate_limit_buckets

Phase 1B — MCP server

  • packages/mcp-server/ — standalone Node process using @modelcontextprotocol/sdk
  • Implements the Phase 1 launch set (§4.1–4.5)
  • Deployed alongside the app
  • Published to npm for self-hosting (optional, low cost)
  • Agent-readable error messages + structured outputs

Phase 1C — Connector ecosystem

  • REST API documentation (OpenAPI spec → usable in any connector platform)
  • Zapier / Make.com connector (file issue, list issues, update status)
  • Webhook outbound events (issue filed, triage decided, payout released)

Phase 2A — Runner infrastructure

  • runner/ service: pg-boss worker + Docker orchestrator
  • Sandbox container image (Playwright + Anthropic SDK + artifact uploader)
  • Admin UI: schedule a run, set target URL + test plan
  • Run result → structured issues + tester attribution

Phase 2B — Agentic test plans

  • AI agent uses cycle's doc (test plan/runbook) as the instruction set
  • Executes steps, takes screenshots at each, compares to expected state
  • Files anomalies with full context: screenshot, page URL, console errors

9. Business value summary

| Capability | Value to customers | Revenue model |
| --- | --- | --- |
| MCP server | QA workflow in their AI assistant (no context switch) | Included in SaaS plan or per-seat API tier |
| API keys | Integrate Hackorda into CI/CD, internal tools | Unblocks enterprise customers who won't use OAuth |
| Webhooks | Real-time sync to Slack, Linear, Jira without polling | Sticky integration = retention |
| Agent triage | Reduces admin triage time by 60–80% | Feature premium or included in growth tier |
| Agent runner | Continuous regression testing without human testers | Per-run billing (new revenue stream) |
| Connector marketplace | Lower barrier to adoption | Marketplace distribution = acquisition |

Why this matters competitively: QA platforms that become composable infrastructure (callable by agents) have a different retention curve than those that are just UI tools. Every AI workflow that depends on Hackorda is a workflow that can't easily be ripped out.


10. Open decisions (owner's call)

  1. API key billing tier — included in existing seats, or a separate API plan? (Affects pricing architecture before Phase 1A ships.)
  2. MCP server distribution — hosted only (SaaS), self-hosted (open source npm package), or both? Firecrawl does both.
  3. Webhook delivery: in-process (fire-and-forget, with the same durability gap as the current AI calls) vs durable (through the infra roadmap's Phase 1 job queue). The durable option should wait until that queue exists.
  4. Runner isolation — Docker-in-Docker on a shared VM vs dedicated per-run Firecracker/Fly Machines. DinD is simpler; Firecracker is more secure for untrusted targets.
  5. Agent runner pricing — per-run flat fee vs per-minute vs per-bug-found? Per-bug-found is most aligned but hard to meter.
