REST API walkthrough

Updated 2026-05-30

This page walks through the full end-to-end flow: from verifying your credentials to receiving a delivered documentation package. Each step lists the endpoint and what to expect back.

For reference material — status codes, error shapes, rate limits — see errors and rate-limits. For the OpenAPI document, fetch GET /v1/openapi.json.

Step 1 — verify your credentials

Before doing anything else, confirm your key works and retrieve your account details:

GET /v1/me

Returns your user ID, display name, email, and current plan. If this returns 401, stop and fix your key before proceeding. See authentication.

/v1/* endpoints accept either bearer shape on the Authorization header — an sf_… API key or an oat_… OAuth access token issued via the MCP browser sign-in flow (see mcp). The full auth reference is in auth.
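As a minimal sketch of the header described above, the helper below builds request headers for either token shape. It assumes the standard `Bearer <token>` scheme on the Authorization header; the function name and the Accept header are illustrative and not part of the SpecStep API.

```python
def auth_headers(token: str) -> dict:
    """Build headers for a /v1/* call from an API key or an OAuth access token."""
    # The two documented token prefixes: sf_ (API key) and oat_ (OAuth access token).
    if not token.startswith(("sf_", "oat_")):
        raise ValueError("expected a token starting with sf_ or oat_")
    # Assumption: the token is sent with the standard Bearer scheme.
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}
```

Rejecting an unexpected prefix locally gives a clearer error than waiting for a 401 from GET /v1/me.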

Step 2 — start an interview

An interview is the structured conversation that captures your project's vision, requirements, and constraints. The generation that follows draws entirely from what the interview recorded.

POST /v1/interviews

Takes no request body. The response includes the new interview id, its initial status, and the AI Team's opening turn. You describe what you're building — project type, vision, constraints — in your first POST /v1/interviews/{id}/turns call (Step 3 below); the interview's project type is inferred from that first turn.

You can also attach reference documents — PDFs, images, YAML configs — to give the AI Team additional context. Upload, list, and delete them through the collection:

POST /v1/interviews/{interviewId}/reference-documents — upload (multipart form)

GET /v1/interviews/{interviewId}/reference-documents — list

DELETE /v1/interviews/{interviewId}/reference-documents/{documentId} — remove

Step 3 — submit interview turns

The interview proceeds as a sequence of turns. Each turn you submit becomes part of the recorded conversation. The AI Team responds within the same turn record.

POST /v1/interviews/{id}/turns

Send {"message": "..."}. The response includes the agent's reply and the updated interview state. Keep submitting turns until the AI Team has captured enough context to proceed to generation; you finalize the interview with:

POST /v1/interviews/{id}/complete

Safe retry (recommended). Send an Idempotency-Key HTTP header on every POST /v1/interviews/{id}/turns call. Any unique token works (UUID, ULID, hash of (interview_id, message) — 1..128 chars of [A-Za-z0-9._:-]). On a retry with the same key, you receive the original successful result instead of double-submitting your message. While the original is still in flight, the retry returns 409 Conflict with a Retry-After: 5 header and a structured ProblemDetails body containing retryable: true, retry_after_seconds: 5, turn_committed: false, and client_request_id. See errors for the full envelope.
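One way to produce a key that is stable across retries is to hash the (interview_id, message) pair, as the paragraph above suggests. This is a sketch only: the `turn:` prefix, the SHA-256 choice, and the helper names are assumptions, and the regex simply encodes the documented 1..128 character rule.

```python
import hashlib
import re

# Documented constraint: 1..128 chars drawn from [A-Za-z0-9._:-].
KEY_PATTERN = re.compile(r"[A-Za-z0-9._:-]{1,128}")


def idempotency_key(interview_id: str, message: str) -> str:
    """Derive a stable key from (interview_id, message) so retries reuse it."""
    # A separator byte keeps ("ab", "c") and ("a", "bc") from hashing alike.
    payload = f"{interview_id}\x00{message}".encode("utf-8")
    key = "turn:" + hashlib.sha256(payload).hexdigest()  # 5 + 64 = 69 chars
    if not KEY_PATTERN.fullmatch(key):
        raise ValueError(f"key violates the documented charset/length: {key!r}")
    return key


def turn_headers(interview_id: str, message: str) -> dict:
    """Headers to send on POST /v1/interviews/{id}/turns."""
    return {
        "Idempotency-Key": idempotency_key(interview_id, message),
        "Content-Type": "application/json",
    }
```

A content-derived key has one trade-off: deliberately sending the same message twice to the same interview yields the same key, so the second call replays the first result. Use a random UUID per logical submission if that matters.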

Async mode is the default (changed 2026-05-19). A bare POST /v1/interviews/{id}/turns (no ?mode= query, no X-SpecStep-Turn-Mode header) now commits your user turn immediately and runs the LLM call in the background. The response is 202 Accepted with a Location: /v1/interviews/turns/{jobId} header and a body like {status: "queued", job_id, interview_id, submission_id?, user_turn_committed: true, snapshot: null}. Poll the location for the agent's reply:

GET /v1/interviews/turns/{jobId}

Returns {status, job_id, interview_id, snapshot?, error_code?, error_message?, is_retryable?, created_at, completed_at?}. The status is one of queued, running, completed, failed, or cancelled (the last only after a cancel request, described below). When completed, snapshot carries the full interview state. When failed, error_code tells you what went wrong (INTERVIEW_TURN_TIMEOUT, INTERVIEW_TURN_STUCK_RUNNING, INTERVIEW_TURN_INTERNAL_ERROR, …) and is_retryable tells you whether to re-submit. Idempotency-Key works in async mode too.

If you supplied an Idempotency-Key whose original async call has already completed, you instead get 200 OK with status: "cached_replay" and the cached snapshot inline — no polling required.
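The submit-then-poll flow above can be sketched as a small loop. The HTTP transport is injected as a `get_json` callable (hypothetical; wire it to whatever client you use), and the 2-second interval and poll cap are assumptions, since the page does not recommend values for turn jobs.

```python
import time
from typing import Callable

# Statuses after which the job will not change again.
# "cancelled" is included because the cancel endpoint sets it as a terminal status.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_turn(
    get_json: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,   # assumption: not a documented value
    max_polls: int = 150,    # assumption: roughly 5 minutes at the default interval
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll GET /v1/interviews/turns/{jobId} until the job reaches a terminal status."""
    path = f"/v1/interviews/turns/{job_id}"
    for _ in range(max_polls):
        job = get_json(path)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"turn job {job_id} did not finish after {max_polls} polls")
```

On a `failed` result, check `is_retryable` before re-submitting the turn with the same Idempotency-Key.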

Sync mode (opt-in, scheduled for removal). Pass ?mode=sync (or X-SpecStep-Turn-Mode: sync) to use the legacy inline-reply path: the call waits for the LLM round-trip and returns the agent's reply inline as 200 OK with the full interview snapshot. This path is subject to the ~60s Front Door ceiling, so long interviews may return 504. Sync mode will be removed entirely after one release cycle of observed async adoption. New integrations should use async (the default).

Cancel an in-flight async turn (added 2026-05-18). If the user's submitted turn was wrong, an LLM call is dragging on, or you want to abandon a half-finished turn rather than wait for it (or for the stuck-job timeout), cancel the job by id:

POST /v1/interviews/turns/{jobId}/cancel

Returns 200 OK with {status: "cancelled", job_id, interview_id, created_at, completed_at} on the happy path. Queued jobs cancel cleanly; running jobs cancel best-effort — the job's terminal status will be cancelled, but the agent reply MAY still appear in the interview transcript if a mid-pipeline SaveChanges committed before the cancel landed. Idempotent on already-cancelled jobs. Returns 409 INTERVIEW_TURN_NOT_CANCELLABLE when the job is already completed or failed (the work landed; read the result via the poll endpoint above). 404 if the job isn't yours.

Completion auto-handoff (added 2026-05-17). When the agent signals completion on a turn — the interview transitions to complete and an intake_artifact_id is produced — SpecStep auto-starts a generation with sensible defaults (review_profile: "Normal", mirror_selection: "ClaudeMd", has_ui derived from the detected project type) and surfaces the result on the same snapshot:

  • started_generation_id — non-null on success; poll GET /v1/generations/{id} for progress.
  • auto_start_failure: {code, message} — non-null when auto-start failed (quota exceeded, validation error, transient provider failure, etc.). The interview turn still succeeded; the intake artifact is committed. Call POST /v1/generations manually with the intake_artifact_id if you want to retry with custom settings.

Both fields stay null when the turn didn't trigger completion. Auto-handoff is restricted to user-actor interviews; API-key actors receive auto_start_failure.code: "AUTO_START_NOT_SUPPORTED_FOR_ACTOR_TYPE" and call POST /v1/generations themselves. Same fields appear on the snapshot returned by GET /v1/interviews/turns/{jobId} when the async job's completion produced an intake artifact.
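The three outcomes described above (generation started, auto-start failed, no completion yet) can be told apart from the snapshot alone. The sketch below assumes the snapshot is already parsed into a dict; the function name and the returned action labels are illustrative.

```python
from typing import Optional, Tuple


def next_step_after_turn(snapshot: dict) -> Tuple[str, Optional[str]]:
    """Decide what to do after a turn, based on the auto-handoff fields."""
    generation_id = snapshot.get("started_generation_id")
    if generation_id:
        # Auto-start succeeded: poll GET /v1/generations/{id}.
        return ("poll_generation", generation_id)
    failure = snapshot.get("auto_start_failure")
    if failure:
        # The turn itself succeeded; call POST /v1/generations manually.
        return ("start_generation_manually", failure.get("code"))
    # Both fields null: the turn did not trigger completion.
    return ("continue_interview", None)
```

API-key callers should expect the second branch with code AUTO_START_NOT_SUPPORTED_FOR_ACTOR_TYPE on every completing turn.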

You can retrieve the full interview at any time:

GET /v1/interviews/{id}

Every interview read carries a transcript_size introspection block (added in v0.18, 2026-05-22) so clients can observe how full a transcript is before queuing the next turn:

"transcript_size": {
  "chars": 8421,
  "tokens_estimate": 2105,
  "max_chars": 800000,
  "max_tokens": 200000,
  "percent_used": 1.1
}
  • chars: total UTF-16 char count of every user + agent turn's content. System prompts, tool messages, and reference documents are excluded.
  • tokens_estimate: chars / 4, floored. Conservative upper bound across Anthropic + OpenAI tokenizers; not a substitute for the real tokenizer count.
  • max_chars / max_tokens: current platform ceilings. Surfaced only in v0.18; not enforced yet. A later release will reject submit-turn calls that would push a transcript past the ceiling with a structured error envelope.
  • percent_used: 100.0 * chars / max_chars, rounded to one decimal. Can exceed 100 once the ceiling is enforced and a stale client has built up an oversize transcript on the side.

The same block appears on get_interview MCP responses with byte-identical field names.
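To make the arithmetic concrete, the sketch below recomputes the derived fields from a character count and reproduces the sample block above. The function names are illustrative, and the server's exact rounding mode is an assumption (Python's `round` is used here).

```python
def utf16_len(text: str) -> int:
    """Count UTF-16 code units, the unit the chars field is documented to use."""
    return len(text.encode("utf-16-le")) // 2


def transcript_size(chars: int, max_chars: int = 800_000, max_tokens: int = 200_000) -> dict:
    """Recompute the derived transcript_size fields from a char count."""
    return {
        "chars": chars,
        "tokens_estimate": chars // 4,  # chars / 4, floored
        "max_chars": max_chars,
        "max_tokens": max_tokens,
        "percent_used": round(100.0 * chars / max_chars, 1),
    }
```

With the sample value, `transcript_size(8421)` gives tokens_estimate 2105 and percent_used 1.1. Characters outside the Basic Multilingual Plane count as two UTF-16 units, so `utf16_len` can exceed Python's `len` for text containing emoji.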

You can also list your own interviews:

GET /v1/interviews?status=...&limit=20

Optional status filter (comma-separated tokens — active, paused, abandoned, complete, awaiting_clarification, or exact enum names) and limit (default 20, max 100). Empty conversations (fewer than two turns) are filtered out so first-contact abandons don't clutter the list. Each row carries id, status, detected_type, display_title, turn_count, started_at, and last_activity_at.

When you want an interview gone from the workspace — accidentally started, no longer relevant — soft-delete it:

DELETE /v1/interviews/{id}

Returns 204 No Content on success and 404 if the interview isn't yours. Allowed in any status (Active, Paused, Complete, Abandoned, AwaitingClarification, ClarificationResolved) — soft-delete is a "remove from my workspace" affordance, not a state-machine transition. The conversation row stays in the database for audit; the row drops out of GET /v1/interviews and the workspace UI. Idempotent: a re-delete on an already-deleted row is also 204.

To recover a deleted interview, see the recycle bin below.

Step 3.5 — discover the enumerable inputs (optional)

Before constructing a generation kickoff, you can ask the API which review profiles, project types, mirror selections, and schema versions it accepts. This lets clients avoid hardcoding magic strings that shift between deploys.

GET /v1/capabilities

Anonymous-shaped — the values describe the public contract and don't depend on the caller. Returns {schema_version, rubric_version, quality_rubric_version, review_profiles, project_types, mirror_selections}. Useful for SDKs and AI agents building dynamic forms.

Step 3.6 — connect an external folder (optional)

Instead of uploading reference documents one at a time, connect a folder from OneDrive, SharePoint, or Google Drive (Dropbox coming) and have its files synced into the interview. Re-syncing later is a single call.

This surface is cookie-only by design — every /v1/external-connectors/* route requires a session cookie, and the OAuth callback rides on the user's browser session. API-key callers cannot register a connector or run a sync; programmatic agents should treat connectors as read-only state and rely on a human to set them up through the UI.

The OAuth handshake has three stages. The UI drives all three, but the contracts are documented here so that an SDK or AI agent embedded in the web shell can replicate the flow:

  1. POST /v1/external-connectors/{provider}/authorize with body {interviewId} — returns {authorize_url, state}. Redirect the browser to authorize_url.
  2. GET /v1/external-connectors/{provider}/oauth-callback?code=...&state=... — the provider redirects the browser here. The server exchanges the code and redirects to /interview/{interviewId}?pendingConnector={pendingId} so the UI can open a folder picker. The state token is one-shot — replaying it returns 400. Pending tokens expire after 15 minutes.
  3. POST /v1/external-connectors/pending/{pendingId}/commit with body {folderId, folderName} — registers the connector and runs the first sync. Returns {connector_id, files_synced, files_skipped}.

To populate the picker between stages 2 and 3:

GET /v1/external-connectors/pending/{pendingId}/folders?parent={id} — list folders one level at a time. Returns {folders: [{id, name, has_subfolders}, ...]}. Omit parent for the root.

{provider} accepts onedrive, microsoft-graph, sharepoint, google-drive (coming), and dropbox (v1.5). Kebab-case is what the UI sends.

Once committed, the steady-state surface:

GET /v1/external-connectors — list your connectors. Returns {connectors: [{id, provider, folder_name, status, created_at, last_synced_at}, ...]}. status is one of Active, Revoked, NeedsReconnect; provider is OneDrive, SharePoint, or GoogleDrive.

POST /v1/external-connectors/{connectorId}/sync with body {interviewId} — re-sync the connected folder into the named interview. Returns {files_synced, files_skipped}. 404 if the connector isn't yours.

DELETE /v1/external-connectors/{connectorId} — revoke. Returns 204 on success, 404 if not yours.

Step 3.7 — list intake artifacts (optional)

Once an interview is complete, it produces an intake artifact — the structured document that POST /v1/generations consumes. You can list these directly without filtering interview status inline:

GET /v1/intake-artifacts?status=ready&limit=50&offset=0

Optional status is ready (the only meaningful value today; null/blank = same as ready; unknown labels return an empty list). limit defaults to 50, max 200. Returns {artifacts: [{id, interview_id, project_name, schema_version, completed_at}, ...]}, newest first. Use the returned id as the intake_id argument to POST /v1/generations.
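Paging through the list with limit and offset might look like the sketch below. The response carries no documented total count, so the loop assumes a page shorter than the requested limit marks the end; `get_json` is a hypothetical injected transport.

```python
from typing import Callable, Iterator


def iter_intake_artifacts(
    get_json: Callable[[str], dict], page_size: int = 50
) -> Iterator[dict]:
    """Yield every ready intake artifact, newest first, one page at a time."""
    if not 1 <= page_size <= 200:  # documented maximum for limit is 200
        raise ValueError("page_size must be between 1 and 200")
    offset = 0
    while True:
        path = f"/v1/intake-artifacts?status=ready&limit={page_size}&offset={offset}"
        page = get_json(path)["artifacts"]
        yield from page
        # Assumption: a short (or empty) page means there is nothing further.
        if len(page) < page_size:
            return
        offset += page_size
```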

Step 4 — start a generation

Once the interview is complete, start a generation. This is the multi-stage process where the AI Team drafts, self-reviews, and (depending on the review profile) performs a fresh-eyes pass.

POST /v1/generations

The body uses intake_id (the intake artifact derived from the completed interview), the review profile, and a few configuration fields. Minimum required body:

{
  "intake_id": "01952fcb-cd11-7c3e-9a2e-3b1d8f5e6a04",
  "review_profile": "Normal",
  "project_type": "WebApp"
}

The response includes a generation id and a starting state of Queued. Defaults: review_profile is Normal, schema_version is 1.0.0, mirror_selection is None. Accepted review profiles are Fast, Normal, Extensive, and Researcher (case-sensitive). Accepted project_type values are WebApp, MobileApp, MobileGame, DesktopApp, BrowserExtension, AiAgent, and AiTool — call GET /v1/capabilities for the live list rather than hardcoding them. The generation runs asynchronously — you will need to poll for its progress.

Enum values (review_profile, project_type, mirror_selection) are case-sensitive strings on the wire. Callers sending integer ordinals continue to work for back-compat, but new code should send strings — the OpenAPI spec at /v1/openapi.json advertises the string form so codegen clients pick it up automatically.
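A client can validate enum strings against GET /v1/capabilities before kickoff rather than hardcoding them. The sketch below assumes review_profiles and project_types come back as flat lists of strings; the page names those keys but does not show their element shape. The function name is illustrative.

```python
def build_kickoff_body(
    intake_id: str,
    project_type: str,
    capabilities: dict,
    review_profile: str = "Normal",
) -> dict:
    """Build a POST /v1/generations body, checking enum strings case-sensitively."""
    # Assumption: each capabilities entry is a flat list of accepted strings.
    if review_profile not in capabilities["review_profiles"]:
        raise ValueError(f"unknown review_profile: {review_profile!r}")
    if project_type not in capabilities["project_types"]:
        raise ValueError(f"unknown project_type: {project_type!r}")
    return {
        "intake_id": intake_id,
        "review_profile": review_profile,
        "project_type": project_type,
    }
```

Because matching is exact, a value such as "normal" is rejected locally, which mirrors the case-sensitive behavior on the wire.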

Note: this endpoint is subject to the tighter 5-kickoffs-per-minute limit. See rate limits.

If you want to update an existing generation based on a new interview snapshot, use:

POST /v1/generations/{id}/update

This also counts as a kickoff for rate-limit purposes.

Step 5 — poll generation status

Generations take time — roughly 4 minutes for Fast, 12 for Normal, 32 for Extensive (p50 across recent completed runs). For a live pre-kickoff forecast tied to your account's history, use the MCP tool estimate_generation_cost — there is no REST estimate endpoint. After kickoff, REST callers read the run's own forecast from the estimated_total_usd / estimated_total_p25_usd / estimated_total_p75_usd and estimated_duration_seconds / estimated_time_remaining_seconds fields on the status response below. Poll the status endpoint to track progress:

GET /v1/generations/{id}

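The response includes a state field. Poll until it reaches Complete, Failed, or Cancelled; a run can also stop in Paused or PausedAwaitingClarification, which wait on you rather than on the pipeline (see Generation states). A reasonable polling interval is 15 to 30 seconds. For richer detail about what each stage is doing, fetch: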

GET /v1/generations/{id}/events
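A status poller for generations could be sketched as follows. It returns on the three terminal states and also on the two paused states from the Generation states table, since those wait on the caller. The 20-second default sits inside the suggested 15 to 30 second range; the poll cap, function names, and injected `get_json` transport are assumptions.

```python
import time
from typing import Callable

TERMINAL_STATES = frozenset({"Complete", "Failed", "Cancelled"})
# States that will not advance until the caller resumes or answers a question.
WAITING_ON_CALLER = frozenset({"Paused", "PausedAwaitingClarification"})


def classify_state(state: str) -> str:
    """Map a generation state to 'terminal', 'needs_action', or 'in_progress'."""
    if state in TERMINAL_STATES:
        return "terminal"
    if state in WAITING_ON_CALLER:
        return "needs_action"
    return "in_progress"  # also covers states added in later API versions


def poll_generation(
    get_json: Callable[[str], dict],
    generation_id: str,
    interval: float = 20.0,
    max_polls: int = 180,  # assumption: about an hour at the default interval
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll GET /v1/generations/{id} until the run stops advancing on its own."""
    for _ in range(max_polls):
        generation = get_json(f"/v1/generations/{generation_id}")
        if classify_state(generation["state"]) != "in_progress":
            return generation
        sleep(interval)
    raise TimeoutError(f"generation {generation_id} still running after {max_polls} polls")
```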

For a per-agent narration of what's happening live — used by the Generation Details page's chat-feed view — fetch:

GET /v1/generations/{id}/conversation?take=...

Returns {items: [...]} where each entry is one AgentInvocation projected for display: invocation_id, agent_role, character_name, display_name, asset_slug, accent_hex, action, narration, status, started_at, ended_at, duration_ms, cost_usd. Optional take caps the most-recent N; omit for the full feed.

You can also list every generation on your account, which is useful when a generation never produced a Package row (failed mid-flight, paused, or cancelled) and so doesn't show up under /v1/packages:

GET /v1/generations?status=...&limit=50&offset=0&order=desc

Optional status accepts comma-separated tokens. Roll-up tokens collapse to canonical state sets: in_progress (Queued / Drafting / SpecialistReview / Reviewing / FreshEyes / RiskReview / SecurityReview / Assembling / Delivering), complete, failed, cancelled, paused. You can also pass exact state names (Drafting, Reviewing, etc.) — case-insensitive. order is desc (newest first, default) or asc. Each row includes id, short_id, project_name, state, review_profile, cost_usd, the timing stamps, current_round, failure_reason / failure_category, source_channel, and interview_id.
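For illustration, the sketch below builds the list URL from a set of status tokens and mirrors the in_progress roll-up on the client side. The helper names are hypothetical; the state list is copied from the paragraph above.

```python
from urllib.parse import urlencode

# States that the in_progress roll-up token expands to, per the list above.
IN_PROGRESS_STATES = (
    "Queued", "Drafting", "SpecialistReview", "Reviewing", "FreshEyes",
    "RiskReview", "SecurityReview", "Assembling", "Delivering",
)


def generations_list_path(statuses=(), limit=50, offset=0, order="desc") -> str:
    """Build the path and query string for GET /v1/generations."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    query = {}
    if statuses:
        query["status"] = ",".join(statuses)  # comma-separated tokens
    query.update(limit=limit, offset=offset, order=order)
    # safe="," keeps the token separator readable instead of encoding it as %2C.
    return "/v1/generations?" + urlencode(query, safe=",")


def is_in_progress(state: str) -> bool:
    """Client-side equivalent of the in_progress roll-up, matched case-insensitively."""
    return state.lower() in {s.lower() for s in IN_PROGRESS_STATES}
```

For example, `generations_list_path(["in_progress", "failed"])` yields `/v1/generations?status=in_progress,failed&limit=50&offset=0&order=desc`.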

When you want a generation gone from the workspace — e.g. a private experiment you don't want to keep visible — soft-delete it:

DELETE /v1/generations/{id}

Returns 204 No Content on success, 404 if the generation isn't yours, and 409 Conflict if the generation is still in flight (delete is only allowed on terminal-state rows: Complete, Failed, Cancelled — cancel the run first). The row is hidden from list and detail responses but kept on disk for audit + retention. Idempotent on already-deleted rows.

To recover a deleted generation, see the recycle bin.

Generation states

State Meaning
Queued Waiting to start.
Drafting Recommender, Architect, and Designer Critic produce sections.
SpecialistReview Domain-specialist agents pass over their owned sections (Hush for privacy, Argus for security, Marc for domain, etc.). Fires between drafting and the main review loop.
Reviewing Same-provider Critic reviews per round budget.
FreshEyes Different-provider Critic reviews (Normal / Extensive).
RiskReview The risk-review specialist passes over the package before assembly. Surfaces blocking + non-blocking risk callouts.
SecurityReview The security-review specialist passes over the package before assembly. Surfaces blocking + non-blocking security findings.
Assembling The orchestrator stitches the section drafts + per-agent metadata into the package zip.
Refining Pre-delivery auto-refinement between assembly and delivery: fills referenced-but-missing docs (or drops dangling references), reconciles cross-document contradictions, and resolves or escalates residual blockers. Always advances to Delivering regardless of any residual gaps — what it did is recorded per-pass in refinement_summary / reconciliation_summary / blocker_resolution_summary and consolidated into refinement_audit. A run with nothing to refine skips this state entirely and goes straight from Assembling to Delivering, so a missing Refining is normal, not an error.
Delivering The package is being committed to its delivery target (GitHub repo, signed blob URL).
Paused Generation paused; call resume to continue.
PausedAwaitingClarification A creating agent flagged an ambiguity it can't safely resolve and the orchestrator paused for the user to answer. See step 5.6.
Complete Generation finished; package is ready.
Failed Generation stopped; check detail for the reason.
Cancelled Stopped by request.
AddendumRunning A change-request addendum is replaying against a previously-completed package. Parent row stays Complete; this marker rides on the addendum-child generation.

Generation response fields

GET /v1/generations/{id} returns a JSON body with these fields. New since 2026-05-05: project_name, description, kind, and kind_label so callers can identify what each generation is about and disambiguate the deliverable from runnable code.

Field Notes
id / intake_id Stable UUIDs.
state / review_profile / current_round Pipeline state, profile, current review round.
started_at / completed_at / failed_at UTC timestamps.
failure_reason / failure_category Populated only on Failed rows. failure_reason is a sanitized, human-readable hint — provider-side errors are normalized to a short stable classifier (e.g. provider rate-limit, auth failure, transient) rather than echoing raw upstream messages, so it won't leak provider internals or PII. The exact strings are not a parseable contract; branch on failure_category for programmatic handling.
schema_version / rubric_version / manifest Pinned at kickoff.
running_cost_usd Live cost estimate (USD). Settles to the package's total_cost_usd on Complete.
billing_state Authoritative work / billing posture — NotStarted, Active, PausedRetrying, Complete, or PausedAwaitingInput (added 2026-06-01). Distinguishes "running and earning its cost" (Active) from "paused on a transient error and burning nothing" (PausedRetrying) from "paused waiting on you — your turn, nothing is stuck" (PausedAwaitingInput, e.g. answering a clarification). null on generations created before the status-projection rollout (2026-05-18). Pair with running_cost_usd to disambiguate "healthy" climbing cost from "runaway."
started_work_at Timestamp of the first transition into active work (Drafting). null while the generation is still queued. Lets callers compute "how long has this been actively working?" without re-scanning the events stream. null on pre-rollout generations.
phase_detail Human-readable phase label derived as a pure function of state + current_round (examples: "Drafting", "Specialist review (round 2)", "Awaiting your clarification"). Present on every projection row written after the rollout. null on pre-2026-05-18 generations.
progress_explanation One-sentence explanation of what's happening at the current progress_percent (e.g., "Specialists are reviewing the draft in parallel"). Closes the understanding gap the bare progress integer can't. null on pre-rollout generations.
estimated_duration_seconds Historical-median forecast of the run's eventual total wall-clock duration in seconds, keyed by review_profile. null when the historical sample is too small for a confident forecast (the floor is 5 completed generations in the rolling 30-day window) or on pre-rollout generations.
estimated_time_remaining_seconds Best-effort "expected remaining" computed as estimated_duration_seconds - elapsed_since_started_work_at, floored at 0. null while queued, terminal, when the forecast is unavailable, or when a still-running generation has already outrun its forecast (the ETA resets to estimating…).
estimated_completion_at Best-effort wall-clock completion timestamp: started_work_at + estimated_duration_seconds. null while queued, terminal, when the forecast is unavailable, or when a still-running generation has already outrun its forecast (the ETA resets to estimating…).
active_specialist During SpecialistReview only — slug of the most-recently-completed specialist in the current round (codd / halo / tally / vera / trip / merlin / polo). A pragmatic single-value summary of a parallel fan-out. null outside SpecialistReview, when no specialists have completed yet, or on pre-rollout generations.
retry_count Added 2026-05-19. Count of recoverable LLM-provider retries that have fired during this run (rate-limit / transient 5xx / timeout backoffs). Starts at 0 and only increments — never decreases mid-run. Resets to 0 on a host-restart rewind because the retry counter belongs to a single dispatch attempt. Lets callers tell apart "healthy first attempt" (0) from "currently riding out a transient hiccup" (>0).
last_retry_at Added 2026-05-19. UTC timestamp of the most recent retry attempt. null until the first retry fires.
next_retry_at Added 2026-05-19. UTC timestamp the retry policy is currently waiting for before the next attempt (last_retry_at + backoff_delay). null between retries — there isn't a pending one. Lets callers display "next retry in X seconds" without guessing the backoff curve.
recoverable_error_category Added 2026-05-19. Typed classifier for the recoverable failure that caused the most recent retry. One of rate_limit / provider_timeout / provider_server_error / schema_violation / other. Distinct from terminal failure_category — that one is set when the run fails for good; this one is set when an LLM call temporarily failed but the retry policy is still covering it. null when no retry has fired yet.
host_restart_resume_count Added 2026-05-27. How many times this run was automatically resumed after a host restart (capped at 5). Distinct from retry_count — that one is provider-level and resets to 0 on a host-restart rewind, so a run that recovered from several restarts still reads 0; this one spans the run's whole life and only climbs. A non-zero value is the honest reason the run's running_cost_usd or estimated_total_usd runs higher than a clean-run forecast: each resume re-runs work — the full-rewind path re-runs Drafting from scratch, while cheaper in-place resumes pick up from a saved checkpoint. Always present (defaults to 0).
refinement_summary Added 2026-05-29. Outcome of the pre-delivery refinement pass that fills referenced-but-missing docs before a package ships. null when the pass didn't run, made no change, and left no gap (the clean common case). When present, an object with: rounds_used (how many detect → refine → re-validate rounds ran); generated_count / dropped_count / residual_count; generated and dropped arrays of {path, referenced_by[]} — docs the pass filled with real content vs. dangling references it removed (a dead link is worse than no link); a residual array of {path, referenced_by[], reason} — references that couldn't be filled and ship as deferred stubs (these are the package's known gaps); and a ready-to-render summary string. Mirrors the "Pre-delivery refinements" section in the package's handoff.md.
reconciliation_summary Added 2026-05-29. Outcome of the pre-delivery contradiction-reconciliation pass that resolves cross-document architecture contradictions (e.g. one doc says PostgreSQL, another says DynamoDB) before a package ships. null when the pass found nothing to reconcile and left no residual contradiction (the clean common case). When present, an object with: rounds_used (how many detect → reconcile → re-validate rounds ran); reconciled_count / unresolved_count; a reconciled array of {category, summary, affected_locations[]} — contradictions resolved by redrafting the affected docs to agree on one decision; an unresolved array of {category, summary, affected_locations[], reason} — contradictions that couldn't be reconciled and ship as known gaps, with the reason; and a ready-to-render summary string. When the pass reconciles a contradiction, that finding no longer appears in consistency_findings either. Mirrors the "Pre-delivery reconciliation" section in the package's handoff.md.
blocker_resolution_summary Added 2026-05-29. Outcome of the pre-delivery blocker resolve-or-clarify pass that acts on residual Critic-flagged blockers before a package ships. null when there were no residual blockers to act on (the clean common case). When present, an object with: resolved_count / clarified_count / residual_count; a resolved array of {target_section, summary} — blockers the pass cleared by redrafting the affected section; a clarified array of {target_section, summary, question} — blockers escalated into a clarification question; a residual array of {target_section, summary, reason} — blockers that couldn't be resolved and ship as known gaps; and a ready-to-render summary string. Mirrors the "Pre-delivery blocker resolution" section in the package's handoff.md.
refinement_audit Added 2026-05-31. Consolidated audit of the whole pre-delivery refinement pipeline: one flat view of what it auto-fixed versus escalated, aggregated from the three fields above (refinement_summary / reconciliation_summary / blocker_resolution_summary) so you don't have to union three differently-shaped objects to answer "what did the pipeline change, and what did it give up on." null on a clean run where every refinement pass was a no-op (the same common case those three fields collapse to). When present, an object with: auto_fixed_count / escalated_count; an auto_fixed array (the pipeline changed the package) and an escalated array (the pipeline surfaced an unresolved gap), each of {pass, action, target, detail}, where pass is stub-fill / reconciliation / blocker-resolution, action is generated / dropped / reconciled / resolved for auto-fixed rows or residual-gap / unresolved-contradiction / clarified / residual-blocker for escalated rows, target is the doc path, section, or contradiction category, and detail is a human-readable summary, reason, or clarification question (may be empty); and a ready-to-render summary string. Mirrors the "Refinement audit" section in the package's handoff.md.
progress_percent Computed 0–100 progress signal driven by state + current_round. Always present.
estimated_total_usd Historical-median forecast of the run's eventual total cost. null when the historical sample is too small for a confident forecast. From 2026-05-27, on a run that has auto-resumed after host restarts the forecast is widened by host_restart_resume_count (each resume re-runs work), so a resume-prone run's estimate reflects the extra cost rather than reading wildly low against the actual.
estimated_total_p25_usd / estimated_total_p75_usd 25th / 75th percentile cost bounds for the same forecast. Both null when estimated_total_usd is null. Widened on resumed runs alongside estimated_total_usd.
estimated_total_sample_size Number of historical generations that contributed to the forecast. null when the forecast wasn't computed.
project_name Display name from the generation's override, or extracted from the intake's projectName / name / title. May be null when no name can be derived.
description Short intake-derived description (description / summary / vision / elevator_pitch / tagline), truncated to 280 chars with an ellipsis. May be null.
kind Stable constant "specification". Disambiguates the deliverable for callers — packages contain specs (architecture, design, plans), not application code.
kind_label Canonical disambiguation copy: "Specification package — describes how to build the software, not the application code itself." Render as-is alongside kind.
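The null cases for estimated_time_remaining_seconds are easier to see in code. The sketch below re-derives the value from started_work_at and estimated_duration_seconds as described in the table; it is a client-side illustration, not the server's implementation, and the function name and `terminal` flag are assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


def time_remaining_seconds(
    started_work_at: Optional[datetime],
    estimated_duration_seconds: Optional[int],
    now: datetime,
    terminal: bool = False,
) -> Optional[int]:
    """Expected remaining seconds, or None in the documented null cases."""
    # Null while queued (no started_work_at), terminal, or when no forecast exists.
    if terminal or started_work_at is None or estimated_duration_seconds is None:
        return None
    elapsed = (now - started_work_at).total_seconds()
    # Null when a still-running generation has already outrun its forecast.
    if elapsed > estimated_duration_seconds:
        return None
    return max(0, int(estimated_duration_seconds - elapsed))


# Usage: a run with a 12-minute forecast, 5 minutes into active work.
started = datetime(2026, 5, 30, 12, 0, tzinfo=timezone.utc)
now = datetime(2026, 5, 30, 12, 5, tzinfo=timezone.utc)
remaining = time_remaining_seconds(started, 720, now)  # 420 seconds left
```

In practice, prefer the server-provided field; a local recomputation is only useful for ticking a countdown between polls.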

Failure categories

When state is Failed, the response also includes a failure_category field — a machine-readable counterpart to the prose failure_reason string. The value is null while the generation is non-terminal; legacy Failed rows created before categorization was added carry Unknown. failure_reason is a sanitized short string, not the literal upstream error — branch on failure_category for programmatic handling.

failure_category When it fires
Unknown Legacy or unclassified failure — no specific category was recorded.
StuckInQueue The row was never picked up by a dispatcher worker; the queue sweep auto-failed it.
HostRestart A host restart caught the row mid-flight. Safe to retry.
OrchestratorCrash A non-LLM exception in the runner — zip builder, blob upload, or db write.
LlmAuthFailure The LLM provider returned 401 or 403 — a BYO key was revoked, or the platform key is misconfigured.
LlmQuotaExceeded The LLM provider returned 429 or otherwise signalled a rate-limit / quota cap.
NetworkTimeout A transient HTTP, timeout, or network failure — usually talking to the LLM provider, but also covers other outbound calls inside the runner (blob upload, package assembly).
ReviewBudgetExhausted The review loop completed all rounds but blocking issues remained. This is a content-quality outcome, not a system fault — retrying without changing the intake is unlikely to help.
RedraftNoProgress The orchestrator's redraft-no-progress guard fired: a re-draft round produced section content nearly identical to the prior draft AND the same blocking issues still applied. Continuing would burn another full LLM round on the same complaint. Try a higher review profile or a more concrete intake.
ReviewLoopStalled The Critic's blocking-issue set hasn't shrunk for 3 consecutive rounds. Distinct from ReviewBudgetExhausted — this fires earlier, before the budget is gone, when the loop is wasting it on the same complaints.
CostBudgetExceeded The cumulative LLM cost crossed the per-profile cap. A hard backstop in case the convergence guards fail to stop the run. Re-run with a smaller scope or raise the cap.
LeaseExpired A dispatcher worker claimed the row, lost its lease (host died, network partition, or pod evicted), and the post-batch sweep auto-failed it. Retryable — the next kickoff will pick up cleanly.
LlmContract The LLM returned content that didn't conform to the orchestrator's expected schema after the retry budget was exhausted. Usually transient; rerunning with a different review profile or a more constrained intake often clears it.

This enum is additive. New categories may appear in future API versions without a major-version bump. Treat unrecognized values as Unknown rather than rejecting the response — a closed-set switch will break when new categories arrive.
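A minimal sketch of the open-set handling described above, assuming the category arrives as a plain string in the failure_category field. The enum lists only the categories shown on this page, and parse_failure_category is a hypothetical helper name, not part of any SpecStep SDK.

```python
from enum import Enum


class FailureCategory(str, Enum):
    STUCK_IN_QUEUE = "StuckInQueue"
    HOST_RESTART = "HostRestart"
    ORCHESTRATOR_CRASH = "OrchestratorCrash"
    LLM_AUTH_FAILURE = "LlmAuthFailure"
    LLM_QUOTA_EXCEEDED = "LlmQuotaExceeded"
    NETWORK_TIMEOUT = "NetworkTimeout"
    REVIEW_BUDGET_EXHAUSTED = "ReviewBudgetExhausted"
    REDRAFT_NO_PROGRESS = "RedraftNoProgress"
    REVIEW_LOOP_STALLED = "ReviewLoopStalled"
    COST_BUDGET_EXCEEDED = "CostBudgetExceeded"
    LEASE_EXPIRED = "LeaseExpired"
    LLM_CONTRACT = "LlmContract"
    UNKNOWN = "Unknown"


def parse_failure_category(raw):
    """Map a wire value to a known member; anything unrecognized becomes UNKNOWN."""
    try:
        return FailureCategory(raw)
    except ValueError:
        # A category added in a later API version lands here instead of raising.
        return FailureCategory.UNKNOWN
```

Branching on the returned member rather than on the raw string keeps the fallback in one place.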

Step 5.5 — retry a failed generation

If polling lands on Failed, you can re-run the same intake against a fresh generation aggregate without rebuilding the interview:

POST /v1/generations/{id}/retry

Re-uses the original intake_id, review profile, schema/rubric/quality versions, mirror selection, and reference documents. The original Failed row is preserved for audit; the response is a new generation with its own id. Returns 202 Accepted with Location: /v1/generations/{newId} and a body shaped like the POST /v1/generations response — {id, state, download_url?, package_id?}. Poll the new id as in Step 5.

Quota and concurrency apply exactly as on the initial kickoff: QUOTA_EXCEEDED returns 402 Payment Required with the same X-Quota-Tier / X-Quota-Limit / X-Quota-Used / X-Quota-Reset headers, and CONCURRENCY_CAP_REACHED returns 409. The call also counts against the 5-kickoffs-per-minute limit — see rate limits.

A few cases will refuse the retry with 409: generations spawned as part of a Researcher run can't be retried individually (re-fire the parent Researcher run from the original interview), and generations created before the persisted-command feature can't be replayed this way (restart from the interview). Cross-owner retries are also blocked. See errors for the full list of conflict cases.
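One way to branch on the retry outcomes described above is sketched below. It assumes your HTTP client hands you the status code, a dict of response headers keyed exactly as written here, and the parsed JSON body; interpret_retry is a hypothetical helper name.

```python
def interpret_retry(status, headers, body):
    """Summarize a POST /v1/generations/{id}/retry response for the caller."""
    if status == 202:
        # The new generation has its own id; poll it as in Step 5.
        return {
            "outcome": "accepted",
            "generation_id": body["id"],
            "poll": headers.get("Location"),
        }
    if status == 402:
        # QUOTA_EXCEEDED: surface the X-Quota-* headers so the caller can wait or upgrade.
        quota = {k: v for k, v in headers.items() if k.startswith("X-Quota-")}
        return {"outcome": "quota_exceeded", "quota": quota}
    if status == 409:
        # Concurrency cap, Researcher child, pre-feature generation, or cross-owner retry.
        return {"outcome": "conflict"}
    return {"outcome": "unexpected", "status": status}
```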

Step 5.6 — respond to a mid-generation clarification

A creating agent (Architect, Recommender, or Designer Critic) can flag an ambiguity it can't safely resolve from the intake — "Should the API expose REST endpoints, or render server-side HTML?" — and pause the run. The state moves to PausedAwaitingClarification. Web users see the question in their workspace; API/MCP callers fetch and answer it through these endpoints.

GET /v1/generations/{id}/clarifications

Returns the structured questions for a paused generation. The body is {state, clarifications}. When state is anything other than PausedAwaitingClarification, clarifications is an empty array — callers can poll the same endpoint without branching on state first.

{
  "state": "PausedAwaitingClarification",
  "clarifications": [
    {
      "agent": "Architect",
      "section": "docs/02-architecture/03-api-design.md",
      "question": "Should the API expose REST endpoints, or render server-side HTML forms?",
      "why": "Component design and API design contradict each other; the intake doesn't say which to honor.",
      "proposedDefault": "server-rendered HTML forms"
    }
  ]
}

section may be null for non-section-scoped agents (the Recommender's stack-selection questions, for example). why and proposedDefault are advisory copy the agent supplied — surface them to your user verbatim if you have somewhere to render them.

POST /v1/generations/{id}/clarifications/answers

Submit answers and resume the generation. Body {answers: [{question, answer}]}. Match each question exactly to the verbatim text from the GET response — the endpoint pairs answers to pending clarifications by question text. Answers must cover every pending clarification (all-or-nothing for v1).

{
  "answers": [
    { "question": "Should the API expose REST endpoints, or render server-side HTML forms?", "answer": "Server-rendered HTML forms; no REST in v1." }
  ]
}

Returns 202 Accepted with {generation_id, status_url}. The orchestrator picks the run back up on the next dispatcher tick, threads the answers into the next agent call, and re-drafts the originally-stuck section. Poll the status_url to watch the state advance back through Drafting / Reviewing / etc.

400 with a problem-details body fires when the generation isn't in PausedAwaitingClarification or when the answer set doesn't cover every pending clarification (the missing questions are listed in detail). 404 if the generation isn't yours.
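Because answers are paired by verbatim question text and must cover every pending clarification, it can help to build the POST body directly from the GET response. The sketch below does that. Falling back to proposedDefault when you supply no answer is a design choice of this example, not documented server behavior, and build_answers is a hypothetical helper name.

```python
def build_answers(clarifications, chosen):
    """Build the answers body from the clarifications array of the GET response.

    chosen maps verbatim question text to your answer. Questions without an
    entry fall back to the agent's proposedDefault (an assumption of this sketch).
    """
    answers, missing = [], []
    for item in clarifications:
        question = item["question"]
        answer = chosen.get(question) or item.get("proposedDefault")
        if answer:
            # Reuse the question string untouched so the server can pair it.
            answers.append({"question": question, "answer": answer})
        else:
            missing.append(question)
    if missing:
        # The endpoint is all-or-nothing, so fail locally before sending.
        raise ValueError(f"no answer for: {missing}")
    return {"answers": answers}
```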

Step 5.7 — control a running generation

While a generation is in flight, three control endpoints let you pause, resume, or cancel it without rebuilding the intake. All three return 204 No Content on success and 404 if the generation isn't yours.

POST /v1/generations/{id}/pause

Halts the aggregate at its current state. Returns 409 with an application/problem+json body if the current state doesn't allow pausing — see errors for the conflict codes.

POST /v1/generations/{id}/resume

Transitions a Paused generation back to the state it was in before the pause — a row paused mid-Reviewing returns to Reviewing. The endpoint records the state transition only; it does not re-enqueue work to the orchestrator, so don't assume in-flight LLM work resumes automatically when the call returns. Returns 409 if the generation isn't currently Paused, or — rarely — if no pre-pause state was recorded. See errors.

POST /v1/generations/{id}/cancel

Stops the generation and signals the orchestrator to halt any in-flight LLM calls — useful for cutting cost once you realize a run is going the wrong way. Optional body: {"reason": "..."}. If omitted or empty, the aggregate stamps (no reason given). Returns 409 if the generation is already in a terminal state (Complete, Failed, Cancelled).

None of the three require a request body unless noted above.
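The three control calls differ only in the final path segment and the optional cancel reason, so one request builder can cover them. This is a rough sketch using the standard library; the base URL is a placeholder you supply, and control_request is a hypothetical helper name.

```python
import json
from urllib.request import Request

ACTIONS = {"pause", "resume", "cancel"}


def control_request(base_url, token, generation_id, action, reason=None):
    """Build (but do not send) a POST for pause, resume, or cancel."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if action == "cancel" and reason:
        # Only cancel takes a body, and only when a reason is given.
        data = json.dumps({"reason": reason}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = f"{base_url}/v1/generations/{generation_id}/{action}"
    return Request(url, data=data, headers=headers, method="POST")
```

Passing the result to urllib.request.urlopen sends it; a 404 or 409 surfaces there as an HTTPError you can catch.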

Step 6 — retrieve the package

When the generation reaches Complete, a documentation package is ready. Retrieve it:

GET /v1/packages/{id}

The package record includes metadata about what was generated, the review profile used, and links to the package contents. List all packages on your account:

GET /v1/packages?limit=50&offset=0&order=desc

Optional query parameters: limit (default 50, max 200), offset (default 0), and order (desc, newest-first, by default; asc for oldest-first). Each row also carries generation_state so you can tell which packages came from clean Complete runs versus partial / failed runs. The list endpoint resolves the project name + description in the same DB round-trip as the package row, so paging the listing doesn't fan into N+1 queries.

Both endpoints carry the same project_name / description / kind / kind_label projection as GET /v1/generations/{id} (added 2026-05-05).

generation_id is null for packages created by migrating existing documentation rather than by a generation run — those packages have no originating generation (Migrate Existing Docs, 2026-05-27). For generated packages it is always present. Filter or branch on null accordingly.

Each row of GET /v1/packages also carries an addenda_count integer — the number of change addenda attached to the package (see step 6.5). 0 when none exist. Saves a per-row follow-up when rendering an "N addenda" annotation.
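Walking the full listing with limit and offset might look like the sketch below. It assumes fetch_page is a callable you provide that performs the GET and returns the rows of one page as a list; the exact envelope of the list response is not shown on this page, so adapt the unwrapping to what you receive.

```python
def iter_packages(fetch_page, limit=50):
    """Yield every package row, one page at a time."""
    limit = max(1, min(limit, 200))  # the endpoint caps limit at 200
    offset = 0
    while True:
        rows = fetch_page(limit=limit, offset=offset)
        yield from rows
        if len(rows) < limit:
            # A short page means there is nothing left to fetch.
            return
        offset += limit


def migrated_only(rows):
    """Keep packages with a null generation_id (created by doc migration)."""
    return [row for row in rows if row["generation_id"] is None]
```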

When you want a package gone — duplicate of a newer iteration, sensitive content, etc. — soft-delete it:

DELETE /v1/packages/{id}

Returns 204 No Content on success and 404 if the package isn't yours. The package row drops out of GET /v1/packages but stays in the database for audit. Idempotent on already-deleted rows. Note: deleting a package does NOT cascade to its parent generation; if you want both gone, delete each independently. To recover a deleted package, see the recycle bin.

To fetch the current package for a generation without going through list and filtering, use:

GET /v1/generations/{id}/package

Returns the same shape as GET /v1/packages/{id} (id, generation_id, version, total_cost_usd, project_name, description, etc.) for the latest package on the generation. 404 while the generation is still in flight or when the package was soft-deleted. Future-proofs for multi-version packages: when a generation produces several package versions, this returns the most recent.

Step 6.2 — read package contents without downloading the zip

For agents that want to inspect a package's structure without fetching the full archive, two endpoints stream the zip's central directory + individual entries from blob storage:

GET /v1/packages/{id}/files

Returns {files: [{path, size_bytes}, ...]} sorted lexicographically by path. The full zip is never materialized on the server — the response reads only the central directory via Azure SDK range requests.

GET /v1/packages/{id}/files/{*path}

Returns the bytes of a single file. The response shape depends on the file type:

  • Text entries (markdown, YAML, JSON, plain text, CSV, SVG): Content-Type matching the entry, body is the raw text.
  • Binary entries (PNG, unknown extensions): served with the appropriate binary Content-Type.

Files larger than 256 KB return a 400 directing the caller at the bulk zip endpoint (GET /v1/packages/{id}/zip). Path-traversal segments (..) are rejected at the application layer.
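Since the listing already reports size_bytes, a client can choose between the per-file route and the bulk zip before making a request. The sketch below assumes 256 KB means 256 * 1024 bytes; file_route is a hypothetical helper name.

```python
from urllib.parse import quote

MAX_INLINE_BYTES = 256 * 1024  # assumption: the documented 256 KB limit in bytes


def file_route(package_id, entry):
    """Pick the route for one entry of the /files listing."""
    path = entry["path"]
    if ".." in path.split("/"):
        # The server rejects traversal segments; fail early on the client too.
        raise ValueError(f"path traversal segment in {path!r}")
    if entry["size_bytes"] > MAX_INLINE_BYTES:
        return f"/v1/packages/{package_id}/zip"
    # quote() leaves "/" intact and percent-encodes spaces and other unsafe characters.
    return f"/v1/packages/{package_id}/files/{quote(path)}"
```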

Step 6.3 — full-text search inside a package

GET /v1/packages/{id}/search?q=...&limit=20

Searches the package's indexed file contents (markdown, YAML, JSON, plain text, CSV, SVG entries; binary files are skipped during indexing). Returns {query, results: [{file_path, snippet, rank}, ...]} ranked by relevance, with snippets HTML-highlighted using <mark>...</mark> markers around the match terms.

Query syntax follows Postgres websearch_to_tsquery:

  • Quoted phrases: "agent topology"
  • Alternation: auth OR session
  • Exclusion: auth -test

Case-insensitive; English stemming is applied (searching matches search). An empty query returns an empty result set rather than every row. limit defaults to 20, max 50.
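Quoted phrases and exclusions contain characters that need URL encoding, so it is safer to build the query string programmatically. A minimal sketch follows; package_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def package_search_url(package_id, query, limit=20):
    """Build the per-package search path with an encoded query string."""
    limit = max(1, min(limit, 50))  # the endpoint caps limit at 50
    params = urlencode({"q": query, "limit": limit})
    return f"/v1/packages/{package_id}/search?{params}"
```

For example, package_search_url("p1", '"agent topology" -test') encodes the quotes and the space while leaving the exclusion hyphen as-is.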

To search across every non-deleted package you own, use the cross-package variant. It is useful when you don't already know which package contains what you're looking for:

GET /v1/packages/search?q=...&limit=10

Returns {query, results: [{package_id, project_name, version, total_hit_count, files: [{file_path, snippet, rank}, ...]}, ...]} — matched packages ranked by best per-file score, with up to 5 file hits embedded per package. total_hit_count carries the per-package true count so callers can render "showing N of M" or follow up with the per-package endpoint for a deeper look. limit defaults to 10, max 25.
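The "showing N of M" annotation can be derived from the embedded file hits and total_hit_count. A small sketch, assuming the response has already been parsed from JSON into a dict; summarize_hits is a hypothetical helper name.

```python
def summarize_hits(response):
    """Return one summary line per matched package."""
    lines = []
    for pkg in response["results"]:
        shown = len(pkg["files"])       # at most 5 hits are embedded per package
        total = pkg["total_hit_count"]  # true per-package count
        lines.append(
            f"{pkg['project_name']} v{pkg['version']}: showing {shown} of {total}"
        )
    return lines
```

When shown is smaller than total, following up with the per-package search endpoint retrieves the remaining hits.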

Indexing happens automatically at package completion; no client action is required to make a new package searchable.

Step 6.5 — file a change addendum

For a focused single-change request against a completed package — "Add Apple ID as an OAuth provider", "Localize French", etc. — file an addendum instead of running a full re-generation. An addendum is one targeted LLM call (~30 seconds, ~$0.40-0.50) that produces a 5-section markdown bundle (background.md, change-requirement.md, implementation-guide.md, test-plan.md, decision-log-entry.md) attached to the existing package — no version bump.

For multi-change rewrites that warrant a fresh package version (typically ~$2.50), use POST /v1/generations/{id}/update instead, which runs the full agent pipeline.

POST /v1/packages/{id}/addenda

Body: {title, description}. Both fields are required and non-blank — title ≤ 200 chars, description ≤ 4000 chars. The handler authorizes the caller against the package, downloads the original zip, calls a single LLM, builds the 5-file addendum zip, uploads it, and persists the row. The synchronous call takes ~30 seconds; design clients accordingly.

Returns 200 OK with {addendum_id, download_url, cost_usd}. The download_url is a SAS-tokened blob URL valid for one hour. 400 if title or description is missing/blank. 404 if the package isn't yours.
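Because the call is synchronous and incurs LLM cost, checking the documented field limits locally before sending avoids a wasted round-trip. The sketch below mirrors those limits; trimming surrounding whitespace is a choice of this example, and addendum_body is a hypothetical helper name.

```python
def addendum_body(title, description):
    """Validate and build the JSON body for POST /v1/packages/{id}/addenda."""
    title, description = title.strip(), description.strip()
    if not title or not description:
        raise ValueError("title and description are required and must be non-blank")
    if len(title) > 200:
        raise ValueError("title exceeds 200 characters")
    if len(description) > 4000:
        raise ValueError("description exceeds 4000 characters")
    return {"title": title, "description": description}
```

Given the ~30 second handler time, a client read timeout comfortably above that is a reasonable starting point.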

GET /v1/packages/{id}/addenda

Lists every addendum attached to the package, newest-first. Returns {addenda: [{id, package_id, title, description, cost_usd, submitted_by_user_id, created_at}, ...]}. 404 if the package isn't yours.

GET /v1/packages/{id}/addenda/{addendumId}/zip

302 redirect to a SAS-tokened download URL valid for one hour. Mirrors the per-package zip-download shape. 404 if the package or addendum isn't yours.

Step 6.6 — explain a package for an audience

When you want to share a completed package with someone who isn't going to read the full markdown bundle — an executive, an investor, a new engineer joining the project — ask SpecStep to rewrite it as a short audience-tailored explanation. One audience pick = one cached markdown explanation; repeats for the same audience are free.

GET /v1/explain/audiences

Public catalog of available audiences. Returns {audiences: [{slug, display_name, description}, ...]}. The V1 set is six entries: executive, product-manager, engineering-manager, new-engineer, investor, security. No authentication required.

POST /v1/packages/{id}/explain

Body: {audience} — must match one of the public catalog slugs above. Returns {markdown, audience, model, cost_usd, cached}. The cold call runs one LLM round-trip (~10 seconds, ~$0.05) and persists the result; subsequent calls for the same (package, audience) pair return the cached row with cached: true and cost_usd: 0. 400 EXPLAIN_AUDIENCE_UNKNOWN if the slug isn't in the catalog; 400 MISSING_AUDIENCE if the field is blank; 402 QUOTA_EXPLAIN_EXCEEDED if the monthly explanation quota is reached for your tier; 404 if the package isn't yours. A cold call may return 503 EXPLAIN_TIMEOUT when it exceeds its ~75s wall-clock budget — retry-friendly, no cost incurred, and distinct from a 504 gateway timeout.

GET /v1/packages/{id}/explanations

Lists every cached explanation already generated for the package, newest-first. Returns {explanations: [{id, audience, model, cost_usd, cached, created_at}, ...]}. Useful for showing "already generated" badges in a UI before the user picks an audience. 404 if the package isn't yours.

GET /v1/packages/{id}/explain/download?audience=<slug>&format=<fmt>

Streams a previously-cached explanation as a downloadable file — it does NOT generate, so you must have already run POST /v1/packages/{id}/explain for the same (package, audience) pair. Returns the rendered bytes (200) as an attachment named specstep-explanation-<audience>.<ext>. format must be one of md, txt, pdf, docx — anything else (or omitting it) is 400 UNKNOWN_EXPLANATION_FORMAT; audience must be a slug from the catalog above, else 400 EXPLAIN_AUDIENCE_UNKNOWN. 404 if the package isn't yours, or if no explanation has yet been generated for that (package, audience) pair — generate it first. A safe GET, so a browser <a download> link works with no antiforgery token.
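A download link for a cached explanation can be assembled as sketched below. It assumes the file extension in the attachment name equals the format value; explanation_download is a hypothetical helper name.

```python
from urllib.parse import urlencode

FORMATS = {"md", "txt", "pdf", "docx"}


def explanation_download(package_id, audience, fmt):
    """Return (path, expected attachment name) for a cached explanation."""
    if fmt not in FORMATS:
        # The server would answer 400 UNKNOWN_EXPLANATION_FORMAT.
        raise ValueError(f"format must be one of {sorted(FORMATS)}")
    query = urlencode({"audience": audience, "format": fmt})
    path = f"/v1/packages/{package_id}/explain/download?{query}"
    # Assumption: <ext> in the attachment name matches the format value.
    filename = f"specstep-explanation-{audience}.{fmt}"
    return path, filename
```

The endpoint does not generate, so call POST /v1/packages/{id}/explain for the same audience first or expect a 404.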

Step 6.7 — migrate existing docs into a package

Have pre-existing documentation that didn't come from a SpecStep generation? Upload a .zip of it and SpecStep classifies each file onto the canonical package layout, then assembles a package you can track development against — no generation run. Both routes take multipart/form-data with the archive in a file field (max 16 MB).

POST /v1/doc-migrations/preview

Dry run: classify the uploaded archive and return the proposed mapping without persisting anything. Returns {source_archive_name, source_byte_count, total_file_count, classified_count, unclassified_count, classifier_version, mapping: [{source_path, doc_type, target_path, layer, confidence}, ...], conflicting_target_paths: [...]}. Each mapping row shows where a source file would land in the normalized package; layer (Manifest / CanonicalTaxonomy / Heuristic / Fallback) + confidence tell you how sure the classifier is. Files that can't be placed map to _source/… (doc_type: "Unclassified"). A non-empty conflicting_target_paths means two files claim the same canonical slot — you must resolve those before committing. 400 for a non-zip, empty, or oversized upload.

POST /v1/doc-migrations/commit

Normalize + persist. Form fields: file (required), project_id (optional — defaults to your default project), version (optional SemVer, default 1.0.0), target_path_overrides (optional JSON object of source-path → target-path corrections from the reviewed preview). Builds the normalized package (canonical layout + _source/ for unclassified files + a specstep.yaml manifest marked source: migrated), stores it, creates the package, and links it to the project. Returns {migration_id, package_id, project_id, version, classified_count, unclassified_count}. The resulting package appears in GET /v1/packages with a null generation_id. 400 for a bad upload; 409 DOC_MIGRATION_UNRESOLVED_CONFLICTS when two sources still claim one canonical slot (supply target_path_overrides to resolve).
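To turn a preview into target_path_overrides, it helps to see which source files collide on each canonical slot. The sketch below groups them from the preview response; how you choose the corrected target paths is up to you, and both helper names are hypothetical.

```python
import json


def conflicting_sources(preview):
    """Group source paths by the canonical slot they both claim."""
    conflicts = set(preview["conflicting_target_paths"])
    grouped = {}
    for row in preview["mapping"]:
        if row["target_path"] in conflicts:
            grouped.setdefault(row["target_path"], []).append(row["source_path"])
    return grouped


def overrides_field(overrides):
    """Serialize {source_path: target_path} for the target_path_overrides form field."""
    return json.dumps(overrides, sort_keys=True)
```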

Step 7 — deliver the package

Delivery commits the package to the GitHub repository you have configured. Read the current repository configuration with GET /v1/source-control/preferences/installation and set it with PUT /v1/source-control/preferences/installation. Once configured, trigger delivery with:

POST /v1/packages/{id}/deliver

SpecStep commits the package to a new branch and opens a pull request. Your default branch is not touched directly. The response confirms the delivery was queued and includes a reference to the target repository and branch.

Step 7.5 — register webhooks instead of polling

Polling GET /v1/generations/{id} works, but for long-running generations or external automation that can't sit on an open connection, register a webhook subscription on your API key and let SpecStep POST state changes to you instead.

Each subscription belongs to a specific API key (cascade-deleted when the key is revoked) and listens for one or more event types. The signing secret is returned once at create or rotate time — store it; SpecStep never returns it again.

REST vs MCP auth — intentional asymmetry. The REST webhook routes below (create, delete, rotate-secret, test) accept BOTH a cookie session AND an API-key bearer token. Programmatic callers — CI pipelines, server-side scripts — can manage their own webhooks with the same key they use for everything else. The equivalent MCP tools (create_webhook, rotate_webhook_secret, test_webhook) refuse API-key principals; only list_my_webhooks and delete_webhook accept them. The reasoning: an AI agent acting on a leaked or scope-broadened key shouldn't be able to silently redirect or re-sign future event payloads to an attacker-controlled URL. REST callers are explicitly accepting that redirect risk by reaching for the REST endpoint instead of the MCP tool.

POST /v1/api-keys/{apiKeyId}/webhooks

Body:

{
  "url": "https://example.com/specstep-hook",
  "events": ["generation.state_changed", "generation.paused_awaiting_clarification", "generation.completed", "generation.failed"]
}

Returns 201 Created with the subscription record + the plaintext signing_secret. The URL must be HTTPS. Unknown event types return 400.

GET /v1/api-keys/{apiKeyId}/webhooks — list subscriptions for the key. signing_secret is null here; the value is only ever returned at create / rotate.

DELETE /v1/api-keys/{apiKeyId}/webhooks/{webhookId} — remove. Returns 204.

POST /v1/api-keys/{apiKeyId}/webhooks/{webhookId}/rotate-secret — issue a fresh signing secret and invalidate the old one. Returns the subscription record with the new plaintext signing_secret populated.

Test a webhook subscription

POST /v1/api-keys/{apiKeyId}/webhooks/{webhookId}/test — fire a synthetic webhook.test event against the configured URL and return the live delivery outcome. No request body. Returns {success, http_status, failure_reason, latency_ms, delivery_id}. 404 if the subscription isn't yours.

The synthetic event uses the webhook.test event type, so receivers can branch on it without affecting business state. It travels through the same signing and delivery path as a real event, so a successful test is a valid integration smoke test. Same auth contract as create / delete / rotate: owner-initiated, cookie or API key.

Event types

| Event type | When it fires |
| --- | --- |
| generation.state_changed | Every state transition (Queued → Drafting → Reviewing → ...). Most general — subscribe here when in doubt. |
| generation.paused_awaiting_clarification | The generation paused because a creating agent flagged an ambiguity. Pair with step 5.6 to read + answer the question. |
| generation.completed | The terminal Complete transition — package is ready. |
| generation.failed | The terminal Failed transition — failure_reason + failure_category are inlined in the body. |

Delivery shape

POST <your URL>
Content-Type: application/json
X-SpecStep-Webhook-Signature: sha256=<hex>
X-SpecStep-Webhook-Timestamp: <unix-seconds>
X-SpecStep-Webhook-Event: generation.state_changed
X-SpecStep-Webhook-Delivery: <delivery-uuid>

{
  "event": "generation.state_changed",
  "delivered_at": "2026-05-05T23:00:00Z",
  "generation": { ...same projection as GET /v1/generations/{id}... }
}

The body inlines the same project_name / description / kind / state / timing fields you'd get from GET /v1/generations/{id}. No follow-up GET required.

Verifying signatures

Compute HMAC-SHA256 over the raw body bytes, keyed with your signing secret, hex-encode it lowercase, and compare to the value in X-SpecStep-Webhook-Signature after the sha256= prefix. Use a constant-time comparison. Reject deliveries with an X-SpecStep-Webhook-Timestamp more than 5 minutes old to defeat replays. Use X-SpecStep-Webhook-Delivery as your dedup key; duplicate delivery IDs are safe to discard.
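The verification steps can be sketched with the standard library as follows. This assumes the signing secret is used as UTF-8 bytes for the HMAC key, which is not stated on this page; verify_delivery is a hypothetical helper name.

```python
import hashlib
import hmac
import time


def verify_delivery(secret, raw_body, signature_header, timestamp_header,
                    now=None, max_age=300):
    """Return True only for a correctly signed delivery no older than max_age seconds."""
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    # Assumption: the secret string is the HMAC key, encoded as UTF-8.
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # compare_digest runs in constant time with respect to the contents.
    if not hmac.compare_digest(expected.encode(), received.encode()):
        return False
    try:
        sent_at = int(timestamp_header)
    except (TypeError, ValueError):
        return False
    now = time.time() if now is None else now
    return now - sent_at <= max_age
```

Pass the body exactly as received, before any JSON parsing or re-serialization, since the digest is computed over the raw bytes.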

Delivery semantics

Best-effort with bounded retry: 5xx responses + transport failures retry up to 3 times with exponential backoff (1s, 4s, 16s). 4xx responses are treated as terminal and are not retried; fix the subscriber and delivery resumes with the next event. Per-subscription state (last delivery time, last status, last HTTP code) is exposed on the GET response so you can see whether a webhook is healthy without instrumenting your own receiver.

No persistent queue, no DLQ for v1 — if every retry fails, the event is logged on the server side and dropped. Build receivers that can tolerate occasional missed events; the canonical state is always GET /v1/generations/{id}.
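Retries mean a receiver can see the same delivery more than once, for example when it processed the event but responded too slowly. A bounded in-memory dedup keyed on X-SpecStep-Webhook-Delivery is sketched below; the capacity is an arbitrary illustrative value and DeliveryLog is a hypothetical name. A multi-process receiver would need a shared store instead.

```python
from collections import OrderedDict


class DeliveryLog:
    """Remember recent delivery ids so repeated deliveries can be discarded."""

    def __init__(self, capacity=10_000):
        self._seen = OrderedDict()
        self._capacity = capacity

    def first_time(self, delivery_id):
        """Return True the first time an id is seen, False on a repeat."""
        if delivery_id in self._seen:
            return False
        self._seen[delivery_id] = True
        if len(self._seen) > self._capacity:
            # Evict the oldest id to keep memory bounded.
            self._seen.popitem(last=False)
        return True
```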

Step 7.7 — recover deleted interviews, generations, and packages

Soft-deletes are reversible. Every soft-deleted row stays in the database; you can list and restore them on demand. The web UI surfaces this as "Settings → Recycle Bin"; the equivalent REST endpoints are below.

List your own deleted rows

GET /v1/interviews/deleted?limit=20&offset=0

GET /v1/generations/deleted?status=...&limit=50&offset=0&order=desc

GET /v1/packages/deleted?limit=50&offset=0&order=desc

Each returns the same row shape as its live counterpart. Standard pagination (limit, offset) plus order for the generations + packages variants. Caller-scoped — you only see your own rows. Anonymous → 401.

Restore a deleted row

POST /v1/interviews/{id}/restore

POST /v1/generations/{id}/restore

POST /v1/packages/{id}/restore

All three return 204 No Content on success and 404 if the row isn't yours or doesn't exist. No state guard on restore — even if the generation was Failed or Cancelled at delete time, restore returns it to your workspace in the same state. Idempotent on already-live rows. Each restore writes an audit row.

Restoration does NOT cascade. Restoring an interview does not auto-restore generations or packages produced from it; if you want all three back, restore each independently.
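Because restore does not cascade, bringing back an interview together with its generations and packages takes one call per row. The sketch below only enumerates the requests to send; restore_requests is a hypothetical helper name.

```python
def restore_requests(interview_ids=(), generation_ids=(), package_ids=()):
    """List the (method, path) pairs needed to restore each row independently."""
    requests = []
    for kind, ids in (
        ("interviews", interview_ids),
        ("generations", generation_ids),
        ("packages", package_ids),
    ):
        for row_id in ids:
            requests.append(("POST", f"/v1/{kind}/{row_id}/restore"))
    return requests
```

Each call returns 204 on success, and repeating a call on an already-live row is harmless since restore is idempotent.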

Delete forever (permanent removal)

Added 2026-05-14.

The Recycle Bin's "Delete forever" affordance maps to three hard-delete endpoints. Only soft-deleted rows can be hard-deleted — the endpoints return 409 Conflict on a live row.

DELETE /v1/interviews/{id}/permanent

DELETE /v1/generations/{id}/permanent

DELETE /v1/packages/{id}/permanent

All three return 204 No Content on success, 409 when the row isn't soft-deleted yet, and 404 when the row doesn't exist or isn't yours. The dependent rows (intake artifacts, generation events, package addenda, blob payloads) cascade automatically via the existing EF foreign-key configuration. There is no restore after a hard-delete.

Step 8 — public status endpoints

These endpoints are anonymous — no API key required, no authentication header. They are intentionally reachable when the rest of the API is degraded, so you can wire them into health checks and status pages without worrying about auth-path outages.

GET /v1/status/summary

Returns {overall, services, active_incidents, generated_at} — the current overall status, the per-service breakdown, any active incidents, and when the snapshot was produced.

GET /v1/status/uptime?days=30

Uptime report over the trailing window. days defaults to 30.

GET /v1/status/history?limit=50

Recent incidents, newest first. limit defaults to 50.

POST /v1/status/subscribe

Body {"email": "..."} subscribes the address to status updates. Returns 200 whether or not the email already exists — the endpoint doesn't enumerate subscribers — and the user-facing message is always "Check your email for a confirmation link." Invalid emails return 400; the per-IP throttle is 5 requests per 15 minutes and returns 429 past that.

Other useful endpoints

Account-level reads, BYO-provider key management, notification controls, bug reports, and schema retrieval — all callable with a bearer token, none required for the generation flow.

Account & usage

| Endpoint | Purpose |
| --- | --- |
| GET /v1/usage | Current quota usage for your account (generations used, remaining, reset date). |
| GET /v1/me/analytics | Personal usage analytics: generation counts, average duration, review-profile distribution. |
| GET /v1/me/provider-keys | List any BYO LLM provider keys (Anthropic, OpenAI) you have registered. |
| PATCH /v1/me/provider-keys/{provider} | Upsert a provider key. Body: {"secret": "..."}. Requires the Developer role. |
| DELETE /v1/me/provider-keys/{provider} | Remove a provider key. |

Notifications

| Endpoint | Purpose |
| --- | --- |
| GET /v1/notifications | List your notifications. |
| GET /v1/notification-preferences | Retrieve your notification preferences (email, SMS). |
| PUT /v1/notification-preferences | Update notification preferences. |

Retention preferences

Added 2026-05-14.

Per-user default-retention deadline applied to new packages. The Web UI surfaces this at Settings → Privacy & retention.

| Endpoint | Purpose |
| --- | --- |
| GET /v1/users/me/retention-preference | Returns {default_retention_days}. null means "indefinite" (the platform default). |
| PUT /v1/users/me/retention-preference | Body: {"default_retention_days": <int> \| null}. Range 1–3650; null clears the override. Returns 200 with the persisted snapshot. |

This preference applies only to NEW packages — existing rows aren't backfilled. To change retention on a specific package, see update_package (MCP) or the equivalent REST mutation.
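A small client-side check of the documented range before calling PUT might look like this sketch; retention_body is a hypothetical helper name.

```python
def retention_body(days):
    """Build the PUT body; None clears the override (indefinite retention)."""
    if days is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(days, bool) or not isinstance(days, int):
            raise ValueError("default_retention_days must be an integer or None")
        if not 1 <= days <= 3650:
            raise ValueError("default_retention_days must be between 1 and 3650")
    return {"default_retention_days": days}
```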

Data export (GDPR data portability)

Added 2026-05-15.

Self-serve export of your account data — interviews, generations, packages, audit log, notification + retention preferences. The job runs asynchronously; the response is 202 Accepted with the request id. When the export completes, the executor uploads a zip to blob storage and you receive an email with a signed download URL valid for 7 days.

| Endpoint | Purpose |
| --- | --- |
| POST /v1/users/me/data-export-request | Records the request. Returns 202 Accepted with {request_id, status: "queued"}. Cookie-only — bearer-token callers get 403. |
| GET /v1/users/me/data-export-request | Returns the latest request's snapshot — {request_id, status: "queued"\|"processing"\|"completed"\|"failed", requested_at, download_url?, download_url_expires_at?}. Poll until status reaches a terminal value. |

The audit log records data_export.requested on submission and data_export.completed on success.
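Polling the export snapshot until it reaches a terminal status can be sketched as below. fetch_status is a callable you provide that performs the GET and returns the parsed snapshot; the interval and poll cap are illustrative values, not documented limits.

```python
import time

TERMINAL = {"completed", "failed"}


def wait_for_export(fetch_status, sleep=time.sleep, interval=5.0, max_polls=120):
    """Poll until the export is completed or failed, then return the snapshot."""
    for _ in range(max_polls):
        snapshot = fetch_status()
        if snapshot["status"] in TERMINAL:
            return snapshot
        sleep(interval)
    raise TimeoutError("export did not reach a terminal status in time")
```

On completed, the snapshot may carry download_url and download_url_expires_at; on failed, submit a new request.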

Account deletion (GDPR right-to-erasure)

Added 2026-05-14.

Self-serve account deletion. The job runs asynchronously; ~30 seconds after submission a worker hard-deletes the user's interviews, generations, packages, API keys, OAuth tokens, external connectors, preferences, and the user row itself. Audit events are anonymized (actor_id replaced with the tombstone Guid.Empty) but retained per the 13-month audit-retention window. The user receives a confirmation email; the support inbox receives an operational copy.

| Endpoint | Purpose |
| --- | --- |
| POST /v1/users/me/deletion-request | Records the deletion request. Returns 202 Accepted with {request_id, status: "queued"}. Cookie-only — bearer-token callers get 403. |
| GET /v1/users/me/deletion-request | Polls the request's status. Returns {request_id, status: "queued"\|"processing"\|"completed"\|"failed", requested_at}. Once status = "completed" the response is the last thing you'll get back — the next request returns 401 because the user is gone. |

The audit log records account.deletion_requested on submission and account.deletion_completed once the cascade succeeds.

Extra Usage prepaid balance

Added 2026-05-14.

Top up a prepaid balance that absorbs overage when the monthly generation quota is exhausted. Web UI: Settings → Plan → Extra Usage.

| Endpoint | Purpose |
| --- | --- |
| GET /v1/me/extra-usage | Returns the user's balance + enabled flag + last-topup timestamp. |
| POST /v1/me/extra-usage/enable | Enables Extra Usage (off by default; overage hits the standard 402 quota-exceeded path until enabled). |
| POST /v1/me/extra-usage/disable | Disables Extra Usage. Existing balance is preserved but not consumed. |
| POST /v1/me/extra-usage/checkout | Creates a Stripe Checkout session for a buy-block top-up. Body: {"amount_usd": <int>}. Returns the checkout URL. |
| GET /v1/me/extra-usage/transactions?limit=50&offset=0 | Lists transactions (top-ups + debits) newest-first. |

Cookie-only — Extra Usage purchases are user-actioned, not API-key-driven.

Organizations

Added 2026-05-26.

Groups a team under one account — membership is optional, and the rest of the API behaves the same either way. Requires the Teams plan; the creator becomes the primary contact and first member. Web UI: Settings → Profile → Organization.

| Endpoint | Purpose |
| --- | --- |
| GET /v1/me/organization | Returns your organization — {id, name, primary_contact_user_id, address_line1, address_line2, city, region, postal_code, country, phone_number, member_count, created_at, updated_at}. Returns 204 No Content when you don't belong to one. |
| POST /v1/me/organization | Create an organization with yourself as the primary contact + first member. Body: {"name": "Acme Co", "address_line1"?, "address_line2"?, "city"?, "region"?, "postal_code"?, "country"?, "phone_number"?} — only name is required. Returns 201 Created with the organization. |

POST /v1/me/organization is gated on your subscription tier and current membership; both failures return an RFC 7807 problem with a code extension:

  • 403, code: TEAMS_TIER_REQUIRED — creating an organization requires the Teams plan. Upgrade, then retry.
  • 409, code: ALREADY_HAS_ORGANIZATION — you already belong to an organization. A user can belong to at most one; leave it before creating another.

API-key scoping & rotation

The canonical key-lifecycle docs live in authentication; the scoping + rotation notes belong here:

  • POST /v1/api-keys accepts an optional scopes: [...] array of permission codes — fetch the catalog from GET /v1/permissions. Pass null or omit the field for legacy unscoped behavior. Unknown codes return 400.
  • POST /v1/api-keys also accepts an optional project_id (UUID) to scope the key to a single project — omit or null to let the key access all of your projects. The project must be one you own or one in your organization, otherwise 400. If you belong to an organization, the key is automatically bound to that organization. The create response (and GET /v1/api-keys summaries) echo project_id + organization_id.
  • PATCH /v1/api-keys/{id}/scopes rotates the scope set on an existing key. Body: {scopes: [...] | null}. Returns 204 / 404 / 400.
  • POST /v1/api-keys/{id}/rotate rotates the key's secret in place. No request body. Returns 200 with a fresh raw_key (shown once — copy it immediately) while preserving the key's identity, scopes, and project/org binding; the old secret is invalidated immediately. 404 if the key isn't yours or is revoked. GET /v1/api-keys summaries expose last_rotated_at.

These endpoints are forbidden to API-key callers — only cookie- or OIDC-authenticated humans can mint, rotate, or re-scope keys, so a leaked key cannot mint a replacement, rotate its own secret, or escalate its own scopes.

Bug reports

Anyone authenticated can submit and read back their own bug reports.

POST /v1/bug-reports — submit. Body: {title, description, severity?, related_generation_id?, current_route?, user_agent?, client_type?}. severity is one of low, medium, high, critical (default medium); client_type is browser, api, or mcp (REST callers default to api; MCP and the browser form stamp their own value server-side, so callers can't spoof). The server enriches every submission with the caller's account name / email / plan, the build version, and a heuristic AI-tool tag derived from the User-Agent (Claude Code / Codex / Copilot / etc.). Returns 201 Created with the full record.
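A client can catch invalid enum values before submitting. The sketch below is illustrative only: `build_bug_report` is a hypothetical helper, and it assumes `title` and `description` are required because they are the only fields listed without a `?`.

```python
# Enum values as listed for POST /v1/bug-reports.
BUG_SEVERITIES = ("low", "medium", "high", "critical")
CLIENT_TYPES = ("browser", "api", "mcp")
OPTIONAL_FIELDS = (
    "severity", "related_generation_id", "current_route",
    "user_agent", "client_type",
)

def build_bug_report(title: str, description: str, **optional: str) -> dict:
    """Build the JSON body, checking field names and enum values locally."""
    if not title or not description:
        raise ValueError("title and description are required")
    unknown = set(optional) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Omitted values fall back to the documented defaults, so they always pass.
    if optional.get("severity", "medium") not in BUG_SEVERITIES:
        raise ValueError(f"severity must be one of {BUG_SEVERITIES}")
    if optional.get("client_type", "api") not in CLIENT_TYPES:
        raise ValueError(f"client_type must be one of {CLIENT_TYPES}")
    return {"title": title, "description": description, **optional}
```

Omitted optional fields are left out of the body so the server applies its own defaults (medium severity, api client type).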

GET /v1/bug-reports/me?limit=20 — your own reports, newest first.

GET /v1/bug-reports/{id} — a single report. Open to the submitter. Foreign callers get 404 so report ids can't be probed.

Quality feedback

Distinct from bug reports — feedback evaluates quality (was the interview good, is the package coherent, what's the build confidence). Bug reports are for broken behavior.

Anyone authenticated can submit and read back their own feedback. The two template-catalog reads are anonymous-OK; the templates are public-safe content the client needs to fill the rubric.

POST /v1/feedback: submit. Body: {type, title, full_report, summary?, severity?, client_type?, interview_id?, intake_artifact_id?, generation_id?, package_id?, interview_quality_score?, package_quality_score?, build_confidence_percent?, letter_grade?, structured_findings?, template_id?, rubric_version?, rubric_section_responses?, rubric_scores?, tags?, estimated_output_quality?, project_type?, review_profile?, transcript_evidence?, package_evidence?}. Returns 201 Created with the full record.

  • type is one of interview_quality, package_quality, end_to_end_run, tooling_experience, api_doc_quality, website_quality, launch_readiness, other.
  • severity is info, low, medium, high, or critical (default medium).
  • client_type is browser, api, or mcp. REST callers default to api; MCP stamps mcp server-side, so callers can't spoof it.
  • Run-bound types (interview_quality, package_quality, end_to_end_run) require at least one target GUID; tooling_experience, api_doc_quality, website_quality, launch_readiness, and other may submit without one.
  • The server enriches every submission with the caller's account name, email, plan, build version, and a heuristic AI-tool tag.
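The run-bound rule is easy to check client-side before submitting. The following is a rough sketch with a hypothetical `check_feedback` helper; it assumes the "target GUIDs" are the four `*_id` fields in the body (`interview_id`, `intake_artifact_id`, `generation_id`, `package_id`), which this page implies but does not state outright.

```python
RUN_BOUND_TYPES = {"interview_quality", "package_quality", "end_to_end_run"}
UNBOUND_TYPES = {
    "tooling_experience", "api_doc_quality", "website_quality",
    "launch_readiness", "other",
}
FEEDBACK_SEVERITIES = ("info", "low", "medium", "high", "critical")
# Assumption: these four ids are what the docs call "target GUIDs".
TARGET_FIELDS = ("interview_id", "intake_artifact_id", "generation_id", "package_id")

def check_feedback(body: dict) -> list:
    """Return a list of problems found locally; an empty list means none found."""
    problems = []
    for field in ("type", "title", "full_report"):
        if not body.get(field):
            problems.append(f"missing required field: {field}")
    ftype = body.get("type")
    if ftype and ftype not in RUN_BOUND_TYPES | UNBOUND_TYPES:
        problems.append(f"unknown type: {ftype}")
    if "severity" in body and body["severity"] not in FEEDBACK_SEVERITIES:
        problems.append(f"unknown severity: {body['severity']}")
    if ftype in RUN_BOUND_TYPES and not any(body.get(f) for f in TARGET_FIELDS):
        problems.append(f"{ftype} requires at least one target id")
    return problems
```

A local check like this only saves a round trip; the server remains the authority on what it accepts.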

Optional submitter context: estimated_output_quality is a short qualitative label (≤50 chars) — distinct from the numeric build_confidence_percent and the single-letter letter_grade. project_type and review_profile (≤50 chars each) denormalize the project type and review profile at submission time so the triage queue can filter on them even if the underlying interview is regenerated with different settings. transcript_evidence and package_evidence are optional arrays of quoted snippets (each ≤2000 chars) backing the findings — surface them when an LLM-class submitter (Codex, Claude Code) can quote the source material directly.

Each entry in structured_findings carries {severity, topic, title} plus three optional fields, each capped at 2000 chars: evidence (quoted text supporting the finding), expected_behavior (what the caller expected), and suggested_fix (the caller's proposed remediation). This mirrors the specialist-reviewer finding shape, so feedback findings and reviewer findings aggregate together.

Typed evidence (added 2026-05-21): each finding also accepts an optional typed_evidence array (up to 20 items) for machine-readable signal that would otherwise be flattened into prose. Each item is {kind, payload_json}, where kind is one of free, http_response, route, console_error, mcp_tool_call, transcript_turn, screenshot, json_payload and payload_json is a well-formed JSON document (≤4000 chars). Required keys depend on the kind:

  • http_response: a numeric status
  • route: a string url
  • console_error: a string message
  • mcp_tool_call: a string tool
  • transcript_turn: a numeric turnIndex
  • screenshot: a string path
  • free and json_payload: any well-formed JSON

Prose evidence and typed_evidence coexist; read responses echo typed_evidence in the same shape.
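The per-kind rules for typed_evidence translate directly into a lookup table. This is a minimal local validator, assuming `payload_json` is sent as a JSON-encoded string (the character cap suggests so, but this page does not confirm the wire type); `check_typed_evidence` is a hypothetical name.

```python
import json

# kind -> (required key, accepted Python types), or None when any JSON is fine.
REQUIRED_KEY = {
    "http_response": ("status", (int, float)),
    "route": ("url", (str,)),
    "console_error": ("message", (str,)),
    "mcp_tool_call": ("tool", (str,)),
    "transcript_turn": ("turnIndex", (int, float)),
    "screenshot": ("path", (str,)),
    "free": None,
    "json_payload": None,
}
MAX_ITEMS = 20
MAX_PAYLOAD_CHARS = 4000

def check_typed_evidence(items: list) -> list:
    """Return a list of problems; an empty list means the array looks valid."""
    problems = []
    if len(items) > MAX_ITEMS:
        problems.append(f"too many items: {len(items)} > {MAX_ITEMS}")
    for i, item in enumerate(items):
        kind, raw = item.get("kind"), item.get("payload_json")
        if kind not in REQUIRED_KEY:
            problems.append(f"item {i}: unknown kind {kind!r}")
            continue
        if not isinstance(raw, str) or len(raw) > MAX_PAYLOAD_CHARS:
            problems.append(f"item {i}: payload_json must be a string of at most {MAX_PAYLOAD_CHARS} chars")
            continue
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            problems.append(f"item {i}: payload_json is not well-formed JSON")
            continue
        rule = REQUIRED_KEY[kind]
        if rule is None:
            continue
        key, types = rule
        value = payload.get(key) if isinstance(payload, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            problems.append(f"item {i}: {kind} needs a valid {key!r}")
    return problems
```

Collecting every problem, instead of stopping at the first, lets a submitter fix a whole finding in one pass.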

GET /v1/feedback/me?limit=20 — your own feedback, newest first. Returns a slim list shape (added 2026-05-21): each row carries the scalars, scores, a 200-char summary_excerpt, and counts (tag_count, finding_count, transcript_evidence_count, package_evidence_count) instead of the full bodies — fetch the complete record (full_report, structured_findings, evidence arrays) from GET /v1/feedback/{id}.

GET /v1/feedback/{id} — a single feedback row. Open to the submitter. Foreign callers get 404 so feedback ids can't be probed.

PATCH /v1/feedback/{id}/amend: submitter self-correction (added 2026-05-21). While your row is still Open and within the amend window (10 minutes from submission), you can fix free-form content: {title?, summary?, full_report?, transcript_evidence?, package_evidence?, tags?}. Omitted fields are left unchanged. Identity-defining fields (type, severity, target GUIDs, template_id/rubric_version) and structured_findings are not amendable. Returns 200 with the updated record. Failures:

  • 404 if the row isn't yours (existence isn't leaked).
  • 400 FEEDBACK_AMEND_NOT_OPEN once the row has left Open (review has started).
  • 400 FEEDBACK_AMEND_WINDOW_EXPIRED after the window closes.

Once either 400 condition applies, the row is locked for self-correction.
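The two amend preconditions can be mirrored locally to avoid a doomed request. This sketch rests on several assumptions not confirmed here: the status value is compared as the string "Open", the status check runs before the window check, and a request at exactly 10 minutes still counts as inside the window. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

AMEND_WINDOW = timedelta(minutes=10)
AMENDABLE_FIELDS = {
    "title", "summary", "full_report",
    "transcript_evidence", "package_evidence", "tags",
}

def amend_blocker(status: str, submitted_at: datetime, now: datetime) -> Optional[str]:
    """Return the error code an amend would likely hit, or None if it looks allowed."""
    if status != "Open":  # assumption: status is exposed as this exact string
        return "FEEDBACK_AMEND_NOT_OPEN"
    if now - submitted_at > AMEND_WINDOW:
        return "FEEDBACK_AMEND_WINDOW_EXPIRED"
    return None

def build_amend_body(**changes) -> dict:
    """Keep only amendable fields; omitted fields stay unchanged server-side."""
    locked = set(changes) - AMENDABLE_FIELDS
    if locked:
        raise ValueError(f"not amendable: {sorted(locked)}")
    return dict(changes)
```

The server's clock decides the real outcome, so a client should still handle both 400 codes even when the local check passes.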

GET /v1/feedback/templates — anonymous-OK. Lists the code-defined rubric templates that ship with the platform.

GET /v1/feedback/templates/{id}/{version} — anonymous-OK. Returns the full sections array (each with id, title, prompt, and optional score_scale). Fill rubric_section_responses keyed by section id and rubric_scores for sections that have a non-null score_scale.
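Given a template's sections array, the two rubric fields can be filled like this. This is a sketch only: `fill_rubric` is a hypothetical helper, `score_scale` is treated as opaque (only null versus non-null matters), and keying `rubric_scores` by section id is an assumption carried over from how `rubric_section_responses` is keyed.

```python
def fill_rubric(sections: list, answers: dict, scores: dict) -> dict:
    """Build rubric_section_responses and rubric_scores from a template.

    `answers` and `scores` are both keyed by section id. Scores for
    sections whose score_scale is null are dropped.
    """
    responses = {}
    rubric_scores = {}
    for section in sections:
        sid = section["id"]
        if sid in answers:
            responses[sid] = answers[sid]
        if section.get("score_scale") is not None and sid in scores:
            rubric_scores[sid] = scores[sid]
    return {
        "rubric_section_responses": responses,
        "rubric_scores": rubric_scores,
    }
```

The returned dict can be merged into a POST /v1/feedback body alongside `template_id` and `rubric_version`.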

Seven built-in templates ship in v1. Pick the one whose scope matches the feedback — narrower rubrics keep the signal cleaner than the all-in-one.

Template id | Pairs with type | Scope
end-to-end-specstep-quality v1.0.0 | end_to_end_run | One full SpecStep run (interview through generated package). 13 sections covering interview quality, package coherence, build confidence, letter grade, top blockers, recommended fixes.
interview-quality v1.0.0 | interview_quality | Otto's performance during a single Interview. 7 sections covering pacing, follow-up quality, coverage breadth, rapport, gaps.
package-buildability v1.0.0 | package_quality | Whether a generated package is buildable as-is by an AI coder. 8 sections covering coherence, completeness, AI-coder clarity, edge cases, effort-estimate accuracy, top risks.
api-doc-quality v1.0.0 | api_doc_quality | The public /api-docs/* surface. 8 sections covering endpoint coverage, completeness, example clarity, error handling, schema clarity, missing sections, recommended improvements.
tooling-experience v1.0.0 | tooling_experience | SpecStep's tooling surfaces (MCP, CLI, IDE integration). 9 sections covering ergonomics, integration, error-message clarity, performance, friction points.
website-quality v1.0.0 | website_quality | The public marketing/docs site at specstep.com. 11 sections covering visual polish, copy quality, SEO + sitemap correctness, route correctness, mobile experience, console cleanliness, content sanitization.
launch-readiness v1.0.0 | launch_readiness | Cross-cutting pre-launch review. 12 sections covering Priority-0 blockers, public content sanitization, trust posture, API + MCP stability, mobile readiness, accessibility, performance, observability, and a final go / no-go recommendation.

Support tickets

POST /v1/support/ticket — submit a support ticket through the in-app channel.

Schema retrieval

Endpoint | Purpose
GET /v1/schema/package/{version} | JSON schema for the package format at a specific version.
GET /v1/schema/intake/{version} | JSON schema for the intake artifact format at a specific version.

Both return application/schema+json on success and 404 when the version isn't recognized. Useful for validating generated content programmatically.

Miscellaneous mutations

PATCH /v1/generations/{id}/name — set the display name for a generation. A request body is required; the name field can be a string, or null / empty / whitespace to clear the override and fall back to the intake-derived name. Sending no body at all returns 400. Returns 204 No Content on success; 404 if not yours.
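Because a missing body returns 400, a client should always serialize the name field, even when clearing it. A minimal sketch follows; `build_name_body` is a hypothetical helper.

```python
import json
from typing import Optional

def build_name_body(name: Optional[str]) -> bytes:
    """Serialize the PATCH body.

    A string sets the display name; None becomes JSON null, which clears
    the override. Either way a body is always produced.
    """
    return json.dumps({"name": name}).encode("utf-8")
```

Passing None produces {"name": null}, which falls back to the intake-derived name just as an empty or whitespace-only string does.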