What's new in SpecStep

Updated 2026-07-16.

Session-state tools — delete errant lessons, reprioritize backlog items

v0.27 · 2026-07-16

Two new session-state tools give your AI coder finer control over its own build record — it can delete a lesson it filed by mistake, and change a backlog item’s priority at any time.

What's new for you

delete_lesson hard-deletes a build lesson filed in error, freeing its name so you can re-file cleanly. A lesson that has already become an enforced rule is protected — archive those instead.
reprioritize_backlog_item changes a backlog item’s priority at any time, with no filer or time-window limit — the hygiene path for re-triaging your queue.
Both are documented in the public MCP tool reference.

Generation pipeline — determinism

v0.27 · 2026-07-16

Continued internal work on generation-pipeline determinism — settling each cross-document decision once and rendering every document from that single resolved source, so the same input yields the same package.

No user-visible changes yet: this work is foundational and not yet enabled for generations.

Generation pipeline — API, schema, and safety rules resolved once

v0.26 · 2026-07-06

More of what the platform decides about your project — API routes and their fields, age-appropriate content rules, and a set of cross-cutting product policies — is now settled once and carried into every document that references it.

What's new for you

A paid generation now always delivers its package, with any unresolved items clearly flagged, instead of withholding the package entirely.
An optional step in the pipeline that fails now degrades gracefully instead of interrupting your whole generation.
If your project needs age-appropriate content-visibility rules, those rules are resolved once and rendered consistently across the requirements, vision, and design sections of your package, rather than each section describing them independently.
Your API design and your acceptance criteria draw their endpoint and field details from the same resolved source, so route names and field names match across your package’s documents.

Under the hood

The pipeline extends its resolve-once approach — previously applied to your schema, authentication approach, and role model — to your API surface, a set of numeric and cross-cutting product-policy decisions, and content-visibility rules, so documents reference the same facts rather than each re-deriving its own answer.
A security fix strengthened how write requests handle credentials.
Additional internal consistency checks catch and remove schema elements that referenced columns which no longer exist, and correctly number generated acceptance criteria to avoid ID collisions.
The pipeline settles its core structural decisions more consistently for the same input.

Reliability & recovery

v0.26 · 2026-07-06

Two reliability fixes: the app now keeps working through a brief cache interruption, and our deployment checks no longer misreport a healthy deploy as failed.

What's new for you

If the cache is briefly unreachable, features that read from it keep serving instead of erroring — a transient infrastructure interruption no longer interrupts your work.
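
The fallback behaviour described above can be sketched as a read-through helper. This is only an illustration under stated assumptions, not SpecStep's implementation: `read_through`, `cache_get`, `load_from_source`, and `unreachable_cache` are hypothetical names, and treating `ConnectionError` as the signal for an unreachable cache is an assumption.

```python
from typing import Callable, Optional


def read_through(
    key: str,
    cache_get: Callable[[str], Optional[str]],
    load_from_source: Callable[[str], str],
) -> str:
    """Return the value for key, falling back to the source of record."""
    try:
        cached = cache_get(key)
    except ConnectionError:
        # Assumption: an unreachable cache raises ConnectionError.
        # Treat the outage like a cache miss so the request is still served.
        cached = None
    if cached is not None:
        return cached
    return load_from_source(key)


def unreachable_cache(key: str) -> Optional[str]:
    """Hypothetical stand-in for a cache client during an outage."""
    raise ConnectionError("cache unreachable")
```

With this shape, a call such as `read_through("project:1", unreachable_cache, load)` returns whatever `load` produces instead of raising.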

Under the hood

Deployment checks now tolerate a transient infrastructure interruption during verification instead of misreporting a healthy deploy as failed.
General infrastructure hygiene.

Session-state & project tools are self-service — and fully documented

v0.26 · 2026-07-01

The session-state tools — build sessions, decision log, backlog, projects, and build lessons & rules — are self-service: any API key carrying the session-state and project scopes can drive them against its own projects.

What's new for you

Build lessons & rules are self-service — file and promote your own lessons, and retrieve the rules that apply to a change, using a project-scoped API key.
A new public reference documents the full session-state & project tool surface, grouped by family, so you can discover every tool from the API docs.
The MCP quickstart now walks through migrating an existing project — importing your existing handoff notes, decision log, and backlog — plus the recommended setup right after you connect.
Starting a build session no longer fails on a missing client-type field; it defaults sensibly.

Under the hood

Rule retrieval now matches area path patterns at any depth, so the rules that apply to the code you’re changing surface for the change you’re making.
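
Matching a path pattern at any depth can be sketched as follows. The pattern syntax is an assumption made for illustration (the changelog does not specify one): `**` is taken to span zero or more path segments and `*` to match within a single segment. The names `matches` and `rules_for` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def matches(pattern: str, path: str) -> bool:
    """True if path matches pattern; '**' spans zero or more segments."""
    return _match(pattern.split("/"), path.split("/"))


def _match(pat: List[str], parts: List[str]) -> bool:
    if not pat:
        # Pattern exhausted: it matches only if the path is exhausted too.
        return not parts
    if pat[0] == "**":
        # Try letting '**' consume 0, 1, 2, ... leading path segments.
        return any(_match(pat[1:], parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    # Single-segment match; '*' here never crosses a '/'.
    return fnmatchcase(parts[0], pat[0]) and _match(pat[1:], parts[1:])


def rules_for(path: str, rules: Dict[str, str]) -> List[str]:
    """Return the names of rules whose area pattern matches the changed path."""
    return sorted(name for name, pattern in rules.items() if matches(pattern, path))
```

Under these assumptions, a rule registered for `src/**/auth/*.py` applies to both `src/auth/login.py` and `src/api/v1/auth/login.py`.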

Generation pipeline — consistency & determinism

v0.26 · 2026-07-01

No user-visible changes — internal generation-pipeline consistency and determinism work.

Generation-quality measurement instrumentation

v0.26 · 2026-06-28

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Generation pipeline — resolve once, derive everything

v0.25 · 2026-06-28

The generation pipeline now settles every core technical decision once (schema, authentication library, role model, allowed values, deletion semantics, framework choice) and renders that resolved set into every dependent document, rather than letting each document re-derive its own answer.

What's new for you

The data model, API design, acceptance criteria, non-functional requirements, and ADRs all draw from a single resolved set of core decisions — so the same data model, roles, and allowed values appear identically across every document in the package.
A schema-conformance check at the end of the Refining stage blocks delivery when any document’s tables still diverge from the resolved schema — mismatches no longer ship silently.
More contradiction findings are auto-resolved during Refining — fewer false flags on equivalent technology choices, example provider lists, and anonymity handling.
If a generation can’t finish in full, it now delivers the most complete draft it reached instead of failing with nothing.

Under the hood

Core decisions are distilled into a content-addressed resolved spec, cached so the same intake produces the same resolved decisions across restarts and resume cycles.
Dependent documents are rendered from that resolved spec via deterministic per-concern passes — the chosen authentication library, role definitions, and allowed values are stamped into every section’s prose and tables from one source of truth.
Prompt assembly now condenses or truncates sibling sections before sending, so every generation prompt fits the model’s context window rather than relying on the model to handle silent overflow.
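
The idea of a content-addressed, cached resolved spec can be illustrated with a short sketch. It is not the pipeline's actual code: `content_address` and `resolve_once` are hypothetical names, SHA-256 over canonical JSON is an assumed choice of address, and the in-memory dictionary stands in for whatever persistent store would be needed to survive restarts.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Stand-in for a persistent store; a real cache would outlive the process.
_cache: Dict[str, Dict[str, Any]] = {}


def content_address(intake: Dict[str, Any]) -> str:
    """Hash a canonical JSON encoding so key order never changes the address."""
    canonical = json.dumps(intake, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_once(
    intake: Dict[str, Any],
    resolver: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run resolver at most once per distinct intake and reuse the result."""
    key = content_address(intake)
    if key not in _cache:
        _cache[key] = resolver(intake)
    return _cache[key]
```

Because the address depends only on the intake's content, the same intake maps to the same stored decisions no matter how many times resolution is requested.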

Workspace packages list handles migrated packages

v0.25 · 2026-06-19

The Workspace packages list no longer crashes when a project contains a migrated package. Migrated packages (imported rather than generated) now appear in the list alongside generated ones.

What's new for you

The Workspace packages list shows migrated packages without crashing — imported packages appear right alongside packages you generated.

v0.25 polish & infrastructure

v0.25 · 2026-06-28

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Generation pipeline — resolve, don’t just flag

v0.24 · 2026-06-16

The generation pipeline now resolves the problems it finds rather than surfacing them for you to untangle: it fills its own gaps, reconciles internal contradictions, and commits explicit product defaults before delivery.

What's new for you

Specs fill their own gaps — missing acceptance criteria and unspecified operational defaults (response-time targets, data-retention periods, rate limits) all get an explicit product default instead of shipping as open questions.
Concrete technology choices are committed early in the generation, so every section of the package agrees on the same answer instead of each proposing its own.
When a blocker genuinely can’t be auto-resolved, the pipeline pauses and asks one plain-language clarification question per issue.
Packages now carry “Specified” status labels, not “Implemented” — the spec describes what you intend to build, not what’s already shipped.
Clarification questions are deduplicated — one per logical issue, with your answer applied across all the sections that depend on it.

Under the hood

Review depth is calibrated to the project’s stakes — a simple standalone tool isn’t evaluated against the same bar as a system that handles payments or personal data.
Contradictions between a specialist’s findings and the package’s established decisions are reconciled automatically, with the established decision winning.
A new guard catches any spec clause that would auto-approve low-confidence content in a child-safety context and rewrites it to deny-by-default.
Fewer false flags — hard-delete semantics for account deletion, requirement-ID matching that ignores padding differences, auth-framework analysis that groups equivalent patterns instead of treating them as contradictions, and compliance-language checks that skip quoted examples.
A cost circuit-breaker prevents a stuck refinement loop from running indefinitely between resume cycles.
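
Requirement-ID matching that ignores padding differences, mentioned in the list above, might look like the sketch below. The `PREFIX-number` ID shape and the names `normalize_requirement_id` and `same_requirement` are assumptions for illustration; the changelog does not describe the actual ID format.

```python
import re

# Assumed ID shape: letters, a hyphen, then digits (for example "REQ-007").
_ID = re.compile(r"^([A-Za-z]+)-(\d+)$")


def normalize_requirement_id(raw: str) -> str:
    """Strip zero padding from the numeric part; leave other strings as-is."""
    text = raw.strip()
    match = _ID.match(text)
    if match is None:
        return text
    prefix, number = match.groups()
    # int() drops leading zeros, so "007" and "7" normalize identically.
    return f"{prefix}-{int(number)}"


def same_requirement(a: str, b: str) -> bool:
    """Compare two IDs after removing padding differences."""
    return normalize_requirement_id(a) == normalize_requirement_id(b)
```

Comparing normalized forms means `REQ-007` and `REQ-7` are treated as one requirement, while `REQ-10` and `REQ-1` remain distinct.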

Project-scoped API keys stay in their project

v0.24 · 2026-06-13

Two access gaps for project-scoped API keys are closed: the work a key creates now lands in the right project, and the key can’t see your other projects.

What's new for you

Work filed via a project-scoped API key now lands in that key’s project — not in your Default project.
A project-scoped key can no longer list or read your other projects — it sees only the one it’s bound to.

Build lessons & rules

v0.24 · 2026-06-08

SpecStep now captures what each build session learned — automatically, per project — and surfaces those lessons where they’re useful: in the project activity feed and on your project’s detail page.

What's new for you

Hard-won build lessons are captured automatically at session end, so nothing slips through because someone forgot to write it down.
Lessons and rules are isolated per project — one project’s patterns never bleed into another’s, and each project’s team manages its own set.
Build lessons and derived rules appear in the project activity feed alongside generation events.
A read-only “Build lessons & rules” card on your project’s detail page shows captured lessons and the rules distilled from them.

Under the hood

Lesson and rule data is scoped to each project — no cross-project leakage by design.

v0.24 polish & infrastructure

v0.24 · 2026-06-16

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Project detail page — command-center redesign

v0.24 · 2026-06-08

The per-project page was a long vertical wall of raw data; it’s now a dashboard-first command center.

What's new for you

A build-progress ring hero and an at-a-glance health banner surface repo and package gaps, staleness, and backlog items that need attention — without scrolling.
KPI tiles now open slide-in detail panels for build progress, sessions and analytics, build cost, decisions, and backlog.
A recent-activity timeline, a milestones/next-up card drawn from the linked phase plan, and a recent-contributors card show who has logged work and what’s coming next.
All project configuration lives behind one Settings button — a side drawer with scoped sub-drawers replaced the scattered inline fields.

Under the hood

Two new read APIs, project activity and contributors, back the page and feed the slide-in panels.

Credit-based pricing and billing

v0.24 · 2026-06-08

Pricing now works in credits: every plan carries a credit balance that usage draws from.

What's new for you

Plans and quotas now show in credits wherever quota context appears in the product.
The pricing page has been rewritten to describe the credit model and what each tier includes.
The Terms of Service, Privacy Policy, and FAQ are updated to cover the credit model, how feedback and the AI inbox work, and data-transfer terms.

Teams and organizations

v0.24 · 2026-06-08

Users on a shared company email domain are now grouped into an organization automatically, with no manual invite needed for the common case.

What's new for you

Users with a shared company email domain are auto-joined to their organization on sign-in.
Each plan has a seat limit; new members see a clear message when the team is full.
Settings → Team gives admins a member roster with role management and a seat count.

Launch-readiness site refresh

v0.24 · 2026-06-08

A marketing and site polish wave: a clearer homepage, unified support routing, an updated About page, a brand kit, and API docs that reflect the current product.

What's new for you

Homepage restructured around what you build with SpecStep, with a “Try free →” primary call to action and leaner top nav.
Support, contact, and feedback now route through one place — no more searching for the right form.
The About page gains company and trust framing.
A one-click brand kit and context-aware taglines are available at /brand; the old tagline is retired.
The API and MCP docs and the interactive API reference are synced to the current product surface — operations, parameter names, and examples are current.
Settings shows your current plan and an honest source-control status.
The generation detail page has a cleaner next-action strip after a package is delivered.

Generation quality — fewer false flags, more auto-resolved

v0.24 · 2026-06-08

Spec packages are more trustworthy, with fewer false problem flags and more issues resolved automatically before you see them.

What's new for you

Fewer false “a required document is missing” alerts — every quality check is now routed explicitly before the package is delivered.
More contradiction types — including legal/safety language and credential-handling inconsistencies — are resolved automatically before delivery.
Build-readiness reasons are now specific: you see which document or check triggered a needs-refinement state, not a generic label.
A missing role definition is now generated rather than flagged as an unresolved gap.

Under the hood

A new requirement-fidelity audit (off by default) can detect drift between the generated package and your original intake.
False “claimed count” and false schema-conflict findings are filtered before delivery.

Cost to build — token usage and the session kit

v0.24 · 2026-06-08

See how many tokens each generation used and what it cost, broken down by model and rolled up per project.

What's new for you

A new usage dashboard on each project shows a token breakdown by model and the cost at each model’s published rates.
Token totals appear on the project page and on individual build sessions.
A session-skills kit (start-session / end-session) is available in the public marketplace and documented in the quickstart — install it for automatic token reporting and session continuity across computers.
The session-end reporter runs on Windows, macOS, and Linux without modification.

Generation reliability

v0.24 · 2026-06-08

Three reliability fixes that prevent generations from running twice, running away, or leaving you at a dead end after an interview error.

What's new for you

A generation now shows “Starting…” instead of “Queued / 0%” while it initializes — so you can see it’s underway.
When an interview fails on the first question, you see a clear error message and recovery actions instead of a blank page.

Under the hood

System-level guards prevent a generation from starting twice or running indefinitely under rapid service updates.
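
A guard against starting the same generation twice can be sketched as an atomic check-and-record step. This is a single-process illustration only: `StartGuard`, `try_start`, and the `generation_key` parameter are hypothetical, and a deployment with several service instances would need a shared store rather than an in-memory set.

```python
import threading
from typing import Set


class StartGuard:
    """Lets exactly one caller start work for a given key."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._started: Set[str] = set()

    def try_start(self, generation_key: str) -> bool:
        """Return True for the first caller per key, False for any repeat."""
        # The lock makes the membership check and the insert one atomic step,
        # so two concurrent requests cannot both see the key as unclaimed.
        with self._lock:
            if generation_key in self._started:
                return False
            self._started.add(generation_key)
            return True
```

A request handler would call `try_start` before launching a run and skip the launch whenever it returns `False`.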

Operator and access hardening

v0.24 · 2026-06-08

Destructive actions now ask for confirmation before they execute, and navigation makes the product and internal tooling clearly distinct.

What's new for you

Destructive actions — deletes, revocations, force-transitions — now require typing a confirmation phrase before they execute.
Navigation in the signed-in app clearly separates the product area from internal tooling.

Under the hood

No user-visible changes beyond the above — internal infrastructure, security hardening, and reliability fixes.

v0.24 polish & infrastructure

v0.24 · 2026-06-08

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Specs that catch and fix their own contradictions before delivery

v0.24 · 2026-06-01 → 06-05

Generations now reconcile their own internal conflicts, run more trustworthy reviews, and ship the complete AI-coder mirror set on every package.

What's new for you

Generated specs detect and resolve their own contradictions before delivery — a whole-package sweep loops until the conflicts are gone.
The required reviews are more trustworthy — they fit the model’s context window instead of falsely reporting “review failed” on large packages.
Every package now ships the full AI-coder mirror set — the partial subset that dropped some files is retired.
Build-readiness checks catch database-dialect, schema-gap, and stack-feasibility problems before you hand the spec to Claude Code, Cursor, or Copilot.
The traceability matrix is reconciled against the canonical requirements and architecture docs, so requirement references stay honest.

Under the hood

Stack decisions now propagate from a single canonical anchor derived from what the package actually adopts, so the spec stays internally consistent.
Database-dialect and nullable-foreign-key checks became deterministic, with added schema-vs-query column-existence verification.
Security review was completed and hardened — access controls and data handling were strengthened, and the review recovers cleanly on large packages.
New feasibility checks catch sign-in-versus-database mismatches and license-constraint conflicts before delivery, including at resume.
Refinement rejects generated stubs that would introduce an unresolvable architecture conflict, and reconciles compliance language so the spec doesn’t overclaim.

Cost to build — token usage rolled up per project

v0.24 · 2026-06-05

SpecStep now records each build session’s AI-coder token usage against its project, so per-session totals add up to a project-wide view of the token cost of building it.

What's new for you

Each build session’s AI-coder token usage is recorded against its project, so you can see the token cost of building a project.
Per-session totals roll up to a project-wide cost-to-build view, measured in tokens.
A session-end reporter sums the session transcript automatically, so the rollup stays current without manual bookkeeping.
The public API docs now describe the session-state kit — the start-session and end-session skills — and how to hook it up.

Interviews and generation kickoff, made reliable

v0.24 · 2026-06-04 → 06-05

A set of reliability fixes that stop interviews from double-starting generations, re-asking answered questions, or misreporting your project’s attributes.

What's new for you

One interview can no longer start two generations at once — a guard holds the line across the interview and the request path.
A web interview no longer double-starts its generation — fixes the “Couldn’t load the interview” error.
The interviewer stops re-asking questions you’ve already answered once you say generate.
A web app’s AI-features and compliance flags now report correctly — your project’s attributes are reconciled at intake instead of defaulting to false.

Security review completed and faster resume

v0.24 · 2026-06-01 → 06-05

Security review completed — access controls and data isolation hardened, and generations recover faster after a clarification.

No user-visible changes beyond the above — internal infrastructure, security hardening, and reliability fixes.

Live generation status you can trust

v0.23 · 2026-06-01

The state, progress, time estimate, round count, and billing posture shown while a generation runs are now consistent across the API, the MCP tools, and the web detail page.

What's new for you

After you answer a clarification, every status surface reflects the resumed state immediately — no more reading “paused, awaiting your answer” for a minute after you’ve already answered.
The progress bar never moves backwards — pausing for a clarification, or re-reviewing after you resolve a blocker, no longer drops the percentage.
The time estimate degrades to “Finalizing…” in the home stretch instead of going blank when a run outpaces its forecast.
The review-round label always reads honestly — “round N of M” never shows a number past the total.
A generation paused for your input shows a distinct “paused — your turn” billing state, separate from a transient-error retry.
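
The never-backwards progress bar and the capped round label above come down to two small clamping rules. The sketch below is an illustration with hypothetical function names, and it assumes percentages in the range 0 to 100 and a round total of at least 1.

```python
def next_progress(previous: int, reported: int) -> int:
    """Return the percentage to display: never lower than what was shown."""
    bounded = min(max(reported, 0), 100)  # keep the raw report within 0..100
    return max(previous, bounded)


def round_label(current: int, total: int) -> str:
    """Format the review-round label without exceeding the total."""
    shown = min(max(current, 1), total)  # assumes total >= 1
    return f"round {shown} of {total}"
```

If a resumed run reports 45% after the bar already showed 60%, `next_progress(60, 45)` keeps the display at 60, and a fourth pass in a three-round review still reads "round 3 of 3".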

Web Controls v1 — canonical design system

v0.23 · 2026-05-30 → 06-01

A visual refresh aligns every core control — buttons, inputs, cards, status badges, tabs, chips, and empty states — to a single consistent design system, in both light and dark themes.

What's new for you

Every core control now follows one visual spec, so the UI looks and feels coherent across the app.
Primary actions use high-contrast ink instead of the accent color, making the action hierarchy immediately clear.
The accent color is a consistent green across every surface that uses it.
Chips now use the smooth sans-serif font and no longer carry a bullet prefix; dark-mode button hover is visibly stronger.

Under the hood

A single canonical token set drives color, radius, and spacing for all controls across both themes.

Sharper, more reliable generations

v0.23 · 2026-05-30 → 06-01

A generation-quality push that makes spec packages more internally consistent, better grounded, and more reliably delivered.

What's new for you

References between documents are validated and reconciled before the package is assembled — renumbered paths, orphaned requirement citations, and naming inconsistencies are resolved automatically.
The traceability matrix now derives its coverage columns from the same catalog every other document uses — no more placeholder rows.
Backend and data products include an operational-readiness section automatically; AI-feature products include an AI-safety section automatically.
When a genuine architectural decision is unresolved, the run pauses and asks you rather than guessing and shipping.
Flagged-issue severity is scaled to your project: a small standalone tool is not evaluated against an enterprise-platform bar.
A deployment-feasibility check flags a contradiction before delivery — for example a serverless or static-only host paired with a component that needs a long-running self-hosted process — instead of shipping a plan that can’t be built as described.
The package’s stated generation cost is reconciled to one total, so the manifest, the handoff document, and the API never disagree on what the package cost to generate.

Under the hood

The Refining stage — which fills stub sections and reconciles documents before delivery — now runs in production; it had been registered but not active.
Stub-fill drafts are parallelized within a pass, and the stage re-runs on resume so the filled package is always what ships.
Mid-pass interruptions resume from the last checkpoint instead of failing the whole pass, and project attributes are re-fetched on every resume so a resumed run can't inherit a stale configuration.
Stale findings — flagged issues that a later step already resolved — are down-ranked or pruned automatically.
A refinement-audit summary is now visible on the generation detail and in the handoff document.

Consistent data tables across the app

v0.23 · 2026-05-30 → 06-01

Every data table in the app now shares the same component, the same visual style in light and dark themes, and server-side sort and search across the whole dataset — not just the visible page.

What's new for you

Tables on Projects, workspace packages, and billing and usage panels look the same in light and dark themes.
Sorting and search run against the full dataset — results don't change when you move to the next page.
Pagination is consistent across all tables, including a “load more” option that works without a known total.
Every empty table tells you what will appear there and how to add the first item.
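
The difference between sorting the visible page and sorting the whole dataset comes down to the order of operations: filter and sort first, then slice out the page. The sketch below illustrates that order with a hypothetical `query_page` function over in-memory rows; searching on a `name` field is an assumption, and a real server would push the same steps into its data store.

```python
from typing import Any, Dict, List


def query_page(
    rows: List[Dict[str, Any]],
    search: str,
    sort_key: str,
    page: int,
    page_size: int,
) -> List[Dict[str, Any]]:
    """Search and sort the full dataset, then return one page of it."""
    needle = search.lower()
    # 1. Search across every row, not just the rows currently on screen.
    matched = [row for row in rows if needle in str(row["name"]).lower()]
    # 2. Sort the full matched set so page boundaries are stable.
    ordered = sorted(matched, key=lambda row: row[sort_key])
    # 3. Only now cut out the requested page (pages are 1-indexed).
    start = (page - 1) * page_size
    return ordered[start:start + page_size]
```

Because the slice happens last, page 2 always continues exactly where page 1 ended in the chosen sort order.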

Build Lessons & Rules — the pipeline learns from its own history

v0.23 · 2026-05-30 → 06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Triage-flow dashboards

v0.23 · 2026-06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

“At Every Step” branding

v0.23 · 2026-05-31

The new brand tagline — “At Every Step” — now appears as a typographic lockup across page headers, social share cards, and the /brand page.

What's new for you

Page headers and social share previews now carry the “At Every Step” lockup alongside the SpecStep wordmark.
The /brand page reflects the updated tagline for anyone building with or alongside SpecStep assets.

See which AI client made each change

v0.23 · 2026-05-30

SpecStep now records which AI client — and which version — made each session-state write, using identity captured at connection time.

What's new for you

Each session-state record shows which agent made the change and its version.
The Connected MCP clients panel in Settings now lists API-key clients, not just OAuth ones.

Under the hood

Client identity is captured at connection time and stored durably, so attribution survives a cache miss.

v0.23 polish & infrastructure

v0.23 · 2026-05-30 → 06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.22 · 2026-05-30

Every package now goes through a final Refining stage before it’s handed off — filling in placeholder sections, reconciling contradictions between documents, and resolving genuine blockers.

What's new for you

Packages get a final Refining pass that fills placeholder sections and removes dangling references before delivery.
Contradictions between documents are reconciled automatically, with a reconciliation summary on the generation.
When a real blocker can’t be resolved from your inputs, the run pauses and asks you a focused clarifying question instead of guessing.
Refinement, reconciliation, and blocker-resolution summaries show up on the generation, the read APIs, and handoff.md.

Under the hood

An interrupted run resumes into the Refining stage instead of restarting, and reconciliation only redrafts the documents that actually disagree.

Sharper, more internally consistent specs

v0.22 · 2026-05-30

A wave of accuracy work keeps requirement IDs lined up across documents, adds a canonical architecture-decisions section, and cuts false-positive blockers.

What's new for you

Acceptance criteria and requirement references come from one canonical set, so IDs line up across every document in the package.
Packages include an architecture-decisions section the spec binds to instead of re-deriving.
The required safety and security review must pass before a package is delivered.
Far fewer false-positive blockers — measurable criteria are no longer flagged for wording, and contradictions are caught before delivery.
An age-appropriateness check catches a case that could surface older-tier content to minors.

Dates in your local timezone

v0.22 · 2026-05-30

Dates and times now render in your browser’s local timezone instead of UTC.

What's new for you

Timestamps across Projects, Generation, and your Settings tables now show in your local timezone.
Times stay correct even in background tabs.

Search across the site

v0.22 · 2026-05-29

A new search box in the top bar — plus a dedicated /search page — finds pages across the site.

What's new for you

Search from the top bar in both the marketing and signed-in views.
/ and ⌘K open search, and your recent searches are remembered.
A /search results page whose links work even before the page is fully interactive.

More reliable generations and honest ETAs

v0.22 · 2026-05-27

Progress and time-remaining on a running generation are now accurate and self-healing.

What's new for you

The generation page self-heals if a live update is dropped, so progress stops freezing.
Time-remaining stays honest past the original estimate instead of reading “finishing up” indefinitely.
Progress holds at its last point on failure instead of resetting to zero.
A resumed run reflects your current plan tier.

Redesigned “Explain this package” with downloads

v0.22 · 2026-05-27

The package-explanation modal got a clearer redesign and now lets you download the explanation.

What's new for you

A clearer, more readable explanation modal.
Download the explanation as Markdown, plain text, PDF, or Word.

Self-service session-state and projects

v0.22 · 2026-05-30

The session-state and project tools are now self-service for any signed-in user, with stronger per-user isolation and a default-project setting.

What's new for you

Session-state and project tools work for any signed-in user automatically.
A default-project setting so new work lands where you expect; organization sharing is preserved.
Stronger isolation keeps your session state and projects scoped to you.

v0.22 polish & infrastructure

v0.22 · 2026-05-27 → 05-30

Reliability fixes, consistency-checker calibration, and expanded test coverage across the series.

What's new for you

The top navigation stays on one line.

Auto-resume telemetry made honest

v0.21 · 2026-05-27

When a generation is interrupted and automatically resumes, the recovery used to read as “expensive and stalled” across the API and the web even though the run completed correctly. This release makes recovery read honestly — interrupted, recovering, completed.

What's new for you

Progress no longer jumps backward when a run resumes — get_generation, wait_for_generation, and list_generations now show the same monotonic progress the REST endpoint and the web already showed.
A new host_restart_resume_count field (MCP + REST) and a matching recovery badge on the workspace progress chip show when a run recovered — a run that resumed several times reads as honest recovery, not a silent stall.
The cost forecast is adjusted for resume-prone runs, so a recovered run's estimate reflects the extra work instead of reading far below the actual.
The generation event stream gained an auto-resume-completed event and now records every resume; the “time queued” signal measures the latest interruption, not time since the run first started.

Build-readiness and self-consistency in the package

v0.21 · 2026-05-27

A round of package-quality fixes: the package now reports its own build-readiness, its review summary no longer contradicts its contents, and every generation ships the AI-coder instruction files.

What's new for you

A structured build_readiness field on specstep.yaml — readiness is queryable, not just rendered, so your tooling can gate on it.
The package's reviews[] summary reconciles against the actual review payload, so it no longer says “not applicable” while carrying findings.
Generations started from the web “Start generation” button now include the AI-coder instruction files (CLAUDE.md, AGENTS.md, .cursorrules, .github/copilot-instructions.md) by default, matching every other way to start a run.
The generated CLAUDE.md is a clean verbatim mirror with an MCP-first reading order.
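
Gating on the build_readiness field could look like the sketch below. The field name comes from the notes above, but its internal shape is not documented here: the `status` and `reasons` keys, the `"ready"` value, and the function name `gate_on_readiness` are all hypothetical. The function takes the manifest as an already-parsed dictionary, so loading specstep.yaml is left to whichever YAML parser your tooling uses.

```python
from typing import Any, Dict, List, Tuple


def gate_on_readiness(manifest: Dict[str, Any]) -> Tuple[bool, List[str]]:
    """Return (ok, reasons) for a parsed specstep.yaml manifest."""
    readiness = manifest.get("build_readiness")
    if not isinstance(readiness, dict):
        return False, ["build_readiness is missing"]
    # Hypothetical sub-fields: "status" and "reasons" are assumed names.
    if readiness.get("status") == "ready":
        return True, []
    return False, list(readiness.get("reasons", []))
```

A CI step could call this and fail the build whenever the first element of the result is `False`, printing the reasons it received.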

Migrate existing docs — the in-app project card

v0.21 · 2026-05-27

The doc-migration capability shipped earlier this series as an API; now there's a UI for it — a project card that walks upload → preview → commit.

What's new for you

A “Migrate existing docs” card on your project: upload a ZIP, see each file's proposed destination in the SpecStep layout, re-route any row that landed wrong, then commit — no generation, no cost.
Migrated packages are first-class — they count in your usage view and appear in package search alongside generated ones.

Migrate existing docs into a SpecStep package

v0.21 · 2026-05-26

Upload a ZIP of documentation you already have, review a dry-run mapping of how each file will be organized, then commit it into a package linked to your project — no generation needed.

What's new for you

Upload a ZIP of existing documentation to get a preview of how each file maps to its canonical location in the package — review it before anything is written.
Confirm the mapping and commit it in one step. Non-markdown assets are preserved exactly as uploaded.
Migrated packages appear alongside generated packages in list_packages and get_package — your AI coder sees them immediately via the MCP connection.
Packages created this way are included in account data deletion, the same as any other package.
Available now via the REST API (POST /v1/doc-migrations/preview and /v1/doc-migrations/commit) and MCP tools (preview_doc_migration and commit_doc_migration).

Project analytics dashboard

v0.21 · 2026-05-26

Three new sections on your project page give you a live read on velocity, flow health, and how much of the scope has been built.

What's new for you

Velocity — KPI tiles and a weekly throughput chart show how much is getting done and how that rate is trending.
Flow health — a cumulative-flow chart and a flow-efficiency stat surface where work is accumulating or moving freely.
Progress vs. scope — a burn-up chart, a weekly lead-time view, and a cycle-time scatter let you see what's shipped against what was planned.
Starting an interview now links it to a project automatically, so your analytics start accruing without a manual step.
The linked project appears at the top of the interview page, with a dropdown to reassign it if needed.
Connecting a repository is more robust when the GitHub App isn't set up yet — it no longer errors.

Security — resource access checks hardened

v0.21 · 2026-05-26

Your projects', packages', and organization's data stay scoped to your own account, and a gap in package access checks was closed in this release.

What's new for you

Your data is accessible only to your account and organization — strengthened checks apply at every layer.
A gap that could allow access to packages across accounts is closed.

Project-scoped API keys with in-place rotation

v0.21 · 2026-05-26

Create an API key bound to a single project — it can only see that project's data. Rotate the secret in place without deleting and recreating the key.

What's new for you

The “API key” card on the project detail page lets you create a scoped developer key tied to that project.
Rotate the secret with an inline confirm step; the new secret is shown once immediately after rotation.
A scoped key returns only the data belonging to its project — useful for giving a Claude Code, Cursor, or Copilot integration access to one project without broader reach.
The rotation endpoint is available via the REST API (POST /v1/api-keys/{id}/rotate).
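
As a rough illustration of calling the rotation endpoint, the sketch below only builds the request and does not send it. The host `https://specstep.com` and the Bearer authorization scheme are assumptions, as is the helper name. Only the path POST /v1/api-keys/{id}/rotate comes from these notes.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://specstep.com"  # assumption: the API host is not stated here


def build_rotate_request(key_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the key-rotation endpoint."""
    safe_id = urllib.parse.quote(key_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/api-keys/{safe_id}/rotate",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
```

Because the new secret is shown only once, a caller would want to store the response body before doing anything else with it.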

Generation build-readiness, platform-aware recommendations, and lifecycle events

v0.21 · 2026-05-26

Several improvements to what you get out of a generation and how clearly it tells you what to do next.

What's new for you

The handoff document now includes a build-readiness banner listing unresolved blockers, low-confidence sections, empty reviews, and dangling references — the first thing your AI coder sees when it opens the package.
A file that's referenced but not yet generated ships as a navigable stub with a clear deferred status, instead of a dead link.
Clarification answers you gave during the interview are now honored if a generation is resumed — they were being stored but not applied.
The specialist caption shows a live “X of Y complete” count with an accurate roster as the generation runs.
Named lifecycle events appear in the generation event feed, so tools like Claude Code and Cursor can track generation progress precisely.
The platform now recommends services native to your chosen hosting target — pick a serverless host and the recommended stack favors that host's own database, key-value, and object-storage offerings instead of a generic default.
A launch-confidence warning appears when a generation skips a Security, Risk, Data Model, or Reliability review on a risk-bearing profile.

Generation resilience — restart recovery

v0.21 · 2026-05-26

A generation interrupted by an infrastructure restart recovers more reliably and picks up where it left off on the next boot.

AI Agents — internal tooling update

v0.21 · 2026-05-26

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Public API reference — cleaner surface and consistent summaries

v0.21 · 2026-05-26

The API reference at /api-docs/reference is cleaner and easier to navigate.

What's new for you

Internal and permission-gated endpoints are excluded from the public document — the reference shows only what's available to you.
Operations are grouped into readable categories by route, replacing a flat undifferentiated list.
Every public operation — 125 in total — now has a one-line summary, so scanning the reference gives you a clear picture of what each endpoint does.
The document title is now “SpecStep API” rather than the assembly name that was leaking into client code generators.

v0.21 polish & infrastructure

v0.21 · 2026-05-26

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Organizations for teams

v0.20 · 2026-05-26

Group your team under one organization. On the Teams plan, create an organization from your profile and become its primary contact — from then on, the work you create is associated with it automatically.

What's new for you

A new Organization field on your profile. On the Teams plan, create an organization right there — name, address, and phone — and you become its primary contact and first member.
Not on the Teams plan yet? The same field links straight to the plan you need to create one.
Once you belong to an organization, the interviews, generations, projects, and other records you create are associated with it automatically — no extra step.
Membership is optional — the rest of SpecStep works exactly the same whether or not you’re in an organization.

Projects workspace with a doc-vs-built dashboard

v0.20 · 2026-05-25

A full Projects area — group your builds, decisions, and backlog under named projects, track what’s been built against what was designed, and pull in live repository metrics when the SpecStep GitHub App is installed.

What's new for you

A new Projects section in the nav rail gives each project its own page — editable name, description, linked GitHub repo, and start date, plus paginated and filterable tables for build sessions, decisions, and backlog items.
A metrics dashboard shows generation counts, last-activity dates, and a weekly decision-velocity chart per project.
When the SpecStep GitHub App is installed on a linked repo, the project page surfaces repository metrics: pull-request count, code size, primary language, and test-file count.
Link a project to its documentation package — from the project page, the package page, or the new-project form, which auto-fills the project name and description from the chosen package.
The Build progress view parses the linked package’s phase plan into phases and individual tasks, marks each task built or not-built by scanning the repo’s commit history, and shows per-phase and overall completion bars — with a manual override available per task.

Under the hood

Generated phase plans now use a consistent, parseable task format so the commit-history scan is reliable.
Built status is computed from the repo’s commit history without cloning the repository.

Internally consistent generated packages

v0.20 · 2026-05-25

Packages no longer ship contradictory sections. When an AI agent’s output conflicts with an upstream document — architecture decisions, stack rationale — the platform catches it, reconciles the affected section, and marks it with a visible note so the resolution is transparent.

What's new for you

Contradictions between a specialist’s output and upstream documents are caught before the package ships — the conflicting section is reconciled and marked with an authoritative banner.
The banner persists if a section is redrafted later, so the reconciliation stays visible across edits.

Richer, safer, and more reliable generations

v0.20 · 2026-05-25

Five improvements to what you get from a generation and how reliably it arrives.

What's new for you

Security and data-model reviews now run on every tier — not just higher tiers.
Generated packages include AI-coder instruction files and an AGENTS.md by default, and recommend the SpecStep MCP server for session state — so Claude Code, Cursor, Copilot, and similar tools pick up your project context without extra setup.
The generation list shows live state and a counting-down ETA while a generation is running.
A generation interrupted by a transient infrastructure issue resumes automatically — your progress is preserved and the run picks up where it left off.
A pre-ship check drops stale blocker warnings that no longer match the final content, so the delivered package isn’t cluttered with issues that were already resolved.

Projects page polish

v0.20 · 2026-05-25

The Projects page got a visual refresh, and your active project now lives in a card on the page itself instead of the top bar.

What's new for you

The active-project switcher moved from the top bar into an “Active project” card on the Projects page.
The Projects list and per-project pages were restyled with sortable, paginated tables and clearer dates.

Under the hood

Internal tooling was redesigned for readability and reliability.

Accurate running-generation count and status filter

v0.20 · 2026-05-26

The generations list sometimes showed finished generations as still running and mis-sorted them in the status filter; the count and the filter are now accurate.

What's new for you

The running-generation count reflects reality — completed, failed, and cancelled runs no longer appear as running.
The status filter sorts each generation into the correct bucket.

Generation progress you can trust

v0.20 · 2026-05-26

The progress bar and time-remaining estimate now reflect what the platform is actually doing — moving smoothly, never jumping backward, and telling you when it’s revising.

What's new for you

The progress bar never jumps backward — progress only moves forward.
During a revision pass the status reads “Revising — round N of M” so you know the platform is refining your package, not stalled.
The time-remaining estimate counts down steadily from a stable baseline and accounts for revision rounds — it no longer resets mid-run.
Progress climbs continuously within each phase, not only when crossing a phase boundary.
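
The "never jumps backward" guarantee can be pictured as a small clamp on the client side. This is a minimal sketch of the idea, not the platform's implementation. The class name and the 0-to-1 progress scale are assumptions.

```python
class MonotonicProgress:
    """Track displayed progress so it only ever moves forward."""

    def __init__(self) -> None:
        self._shown = 0.0

    def update(self, reported: float) -> float:
        # Clamp to the assumed [0, 1] range, then refuse to go backward.
        clamped = min(max(reported, 0.0), 1.0)
        self._shown = max(self._shown, clamped)
        return self._shown
```

If a revision round reports a lower raw value, the displayed value simply holds until the raw value catches up.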

Real lines of code on the cost panel — total and per-agent

v0.20 · 2026-05-26

Completed runs now show the real lines of code produced — both the package total and a per-agent breakdown — on the cost panel.

What's new for you

The cost panel for a completed run shows the total lines of code in your generated package, counted from the real output, not an estimate.
It also breaks the lines down per agent, so you can see how much each contributor added or removed.

Consistency checks cover more review specialists

v0.20 · 2026-05-25

The consistency checks that catch and reconcile contradicting sections now cover more of the review specialists, so more conflicts are resolved before your package is delivered.

What's new for you

A wider set of specialist reviewers now participates in the contradiction-detection pass — more conflicts are caught and marked with an authoritative resolution before the package ships.

Connect a GitHub repository to a project

v0.20 · 2026-05-25

You can now connect a GitHub repository to a project directly from the project page, using a guided connect flow and a repo picker.

What's new for you

A “Connect GitHub repository” button on the project page walks you through connecting your GitHub account; once connected, a picker lists your accessible repositories so you can bind one to the project.
Projects you’d already linked by URL are connected automatically — no extra step needed.

The interview knows when it has enough — and you can nudge it to finish

v0.20 · 2026-05-25

Otto now recognizes when it has gathered enough to start generating, and you can nudge it to wrap up whenever you’re ready.

What's new for you

When Otto decides it has enough information, it offers to wrap up the interview rather than asking another question.
A “Wrap up and generate” button lets you finish the interview on your own schedule — no need to wait for Otto to reach the same conclusion.

Failed or cancelled runs don’t count against your quota

v0.20 · 2026-05-25

If a generation fails or you cancel it, that run no longer counts against your monthly quota.

What's new for you

Failed and cancelled generations are excluded from your monthly usage count — only completed runs consume quota.
The quota panel now explains this clearly so you can see what does and doesn’t count.

Known gaps front-and-center in the handoff doc + locked requirements traceability

v0.20 · 2026-05-25

Generated packages now surface known gaps up front in the handoff document, and requirements identifiers stay consistent across every document in the package.

What's new for you

Known gaps appear at the top of the handoff document — the first thing an AI coder or developer sees when they open the package.
Requirement identifiers are locked at generation time so they stay consistent across the traceability matrix, acceptance criteria, and all other documents.

Session-state tooling — cross-computer session continuity

v0.20 · 2026-05-25

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.20 polish and infrastructure

v0.20 · 2026-05-25

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Generation reliability and trust

v0.19 · 2026-05-25

A round of reliability work so a generation recovers automatically from transient interruptions, always shows what's really happening, and handles large or complex specs gracefully.

What's new for you

Generations recover automatically from a transient infrastructure interruption — instead of failing, a run shows “brief interruption — resuming shortly, your progress is preserved” and picks up where it left off.
The status always reflects what's really happening — a finished generation reads Complete instead of appearing stuck on its last phase.
Large or complex specs are handled gracefully — a longer allowance for each step, and a spec too large for the model is caught early with a clear, actionable message instead of an error.
When the system detects a genuine platform issue, it flags it for us to fix — so problems get caught and addressed without you having to report them.

v0.19 polish and infrastructure (late series)

v0.19 · 2026-05-25

Mostly under-the-hood work this round, with one visible refinement.

What's new for you

The agent activity log on the generation detail page now reads as a clean, even-width log.

Projects

v0.19 · 2026-05-25

A new Projects surface lets you group your work under named projects.

What's new for you

A new /projects page and per-project detail page, reachable from the nav rail, let you group generations and session-state under named projects.
Decision-log entries, build sessions, and backlog items can be scoped to a project.

SpecStep as a session-state MCP server

v0.19 · 2026-05-25

SpecStep can now serve as your AI agent's persistent session-state memory — its decision log, build sessions, and backlog — over MCP and REST, with a UI to browse it. Your agent's context survives across sessions and machines, so it can resume work without re-reading every file each time.

What's new for you

Store and retrieve your AI agent's decision log, build sessions, and backlog through the same tools over MCP or REST — both credentials work on either surface.
Browse the stored state in the app: list and detail views for each, plus cross-aggregate views that tie a session to its decisions and backlog.
Bring an existing decision log or backlog into the store with markdown import — paste it through the agent or upload the file.
Full-text search across decision-log entries, backlog items, and build sessions, including resume-by-description on build sessions.

SpecStep-as-session-state-MCP-server — foundation in place

v0.19 · 2026-05-23

Foundation of the arc that turns SpecStep itself into a session-state MCP server — AI coders working on any project will be able to track build sessions, decision logs, and backlog items via the same discipline SpecStep uses on its own build.

Under the hood

The first vertical — build sessions — ships end-to-end with five new MCP tools. The next verticals (decision-log and backlog) follow shortly; the final piece is the end-to-end workflow that an AI coder uses to resume work across sessions.

CI reliability — runner self-heal + smarter capacity alerts

v0.19 · 2026-05-23

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Agent conversation feed stays visible on every terminal state

v0.19 · 2026-05-23

The agent-conversation feed on a generation detail page used to disappear once the run reached a terminal state. It now stays visible regardless of state.

What's new for you

Diagnostic context (conversation turns, retry events, cost trace) stays visible after a generation finishes, regardless of its outcome.

Cost visibility on the generation detail page

v0.19 · 2026-05-23

Two new permissions and a five-PR arc put cost and per-agent lines-of-code on the generation detail page. Cost visibility is permission-gated: users with the new permission see real provider cost, while customers continue to see billed pricing only.

What's new for you

With the new cost-visibility permission, real provider cost renders alongside billed cost on every agent bubble and on the cost rail card.
A new tokens-visibility permission gates per-agent input and output token counts on the rail card.
The agent-conversation feed shows lines-of-code per response again — a regression dating to two weeks earlier is resolved.

v0.19 · 2026-05-23

Three specialists were silently under-routed because the intake-extractor's attributes shape was missing the canonical keys those agents key off. The gap is closed.

What's new for you

Generations that need a Marc-transcript pass (compliance / regulated industries), prompt-engineering subject matter, or product-management scoping now actually route to the right specialist.

New-users inbox — internal review tooling

v0.19 · 2026-05-23

No user-visible changes — internal review tooling for newly-signed-up users.

Marketing — agent visibility + HowItWorks animations restored

v0.19 · 2026-05-23

Closes the agent-visibility work that started earlier this series. The marketing home page now shows only the agents flagged for the home roster; the specialist count updates automatically. A new page lists agents currently in development. The HowItWorks step animations and modal popups were restored after a recent change broke them.

What's new for you

The marketing home page roster updates without a manual edit when a new agent launches and leaves the in-development list.
A new /agents/top-secret-mission page lists agents currently in development.
The HowItWorks section animates step-by-step again, and clicking a step opens a modal with the long-form detail.

Quality regressions caught automatically + new monitoring alerts

v0.19 · 2026-05-23

Two reliability arcs closed end-to-end. The system catches five named generation-quality failures automatically and files a bug report so the team sees the issue before the customer does. New monitoring covers week-over-week build-confidence drops, response-time regressions, and wire-shape drift across multiple users.

What's new for you

You hear back on quality issues without having to file them — the system catches its own regressions and surfaces them to the team automatically.

Cite a feedback ID in a commit and it auto-resolves on merge

v0.19 · 2026-05-23

A commit body containing Resolves feedback <id> now auto-resolves that feedback row when the PR merges, stamping the commit and PR URL on the row. The convention shortens the feedback-to-resolved cycle from "file feedback, ship fix, then triage" to "file feedback, ship fix" — the triage step happens automatically.

What's new for you

Submitted feedback auto-resolves when the corresponding fix lands, with a link to the commit and PR right on the feedback row.
Multiple feedback IDs per commit are supported; the keyword is case-insensitive.
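
To make the convention concrete, here is one way a merge hook might pull feedback IDs out of a commit body. The ID character set (letters, digits, hyphen, underscore) is an assumption, and the sketch only recognizes the full phrase repeated once per ID. The notes do not say whether other multi-ID spellings are accepted.

```python
import re

# Case-insensitive keyword, as described above. The ID charset is assumed.
_RESOLVES = re.compile(
    r"\bresolves\s+feedback\s+([A-Za-z0-9][A-Za-z0-9_-]*)",
    re.IGNORECASE,
)


def feedback_ids(commit_body: str) -> list[str]:
    """Return the distinct feedback IDs cited in a commit body, in order."""
    seen: list[str] = []
    for match in _RESOLVES.finditer(commit_body):
        fid = match.group(1)
        if fid not in seen:
            seen.append(fid)
    return seen
```

For example, a body with the lines `Resolves feedback fb-101` and `resolves FEEDBACK fb-202` yields both IDs (the `fb-` prefix is made up for the example).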

v0.19 polish and infrastructure

v0.19 · 2026-05-23

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Cheaper Architect runs + Comparator handles large packages + Interview prompt hardening

v0.18 · 2026-05-22

A pipeline-reliability sweep across the LLM-call surface. Architect runs cache stable user content above the model's empirical caching threshold, which materially reduces cost per generation. The Comparator handles large packages without context overflow and returns a job id immediately so MCP callers don't block on the model. Interview prompts gained settled-answer and completion-intent rules so a conversation no longer loops on a topic the user has signalled is done.

What's new for you

Architect runs cost less per generation after the cache change.
compare_packages returns a job id immediately and handles large packages reliably.
Interviews stop on closure cues ("we're done", "MVP is complete") instead of looping back to settled topics.
The intake-extractor returns better-structured stack recommendations — a previously-failing shape variant is now schema-constrained at the model boundary.

Architect handles longer sections + Interview replies on clarification request

v0.18 · 2026-05-23

Three coordinated fixes after the auto-filing pipeline caught the first concrete defects on real generations.

What's new for you

An Architect run that previously truncated silently on a long section now either completes successfully or surfaces an actionable error.
The Interview page lets you respond to a clarification request directly — the composer renders on AwaitingClarification instead of locking out.

Truthful uptime + /security page + cleaner OpenAPI for anonymous endpoints

v0.18 · 2026-05-22

Four PRs closing two launch-blocker bug reports on the public surfaces.

What's new for you

The /status page shows truthful uptime — "Unknown" displays for periods before monitoring started, instead of an inferred 100%.
The new /security page lists the platform's security posture in plain language and cross-links from /status.
The published OpenAPI spec correctly marks anonymous public REST endpoints as unauthenticated.
Status-page incident history surfaces the monitoring start date and a ticket-routing link; the subscribe form documents its scope and the double-opt-in path.
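
The "Unknown" rule for uptime amounts to refusing to divide when there are no observations. The sketch below illustrates that rule under an assumed check-counting model. The function name and inputs are hypothetical.

```python
def uptime_label(checks_ok: int, checks_total: int) -> str:
    """Format uptime for a period, or "Unknown" when nothing was monitored."""
    if checks_total == 0:
        # No monitoring data for this period: do not infer 100%.
        return "Unknown"
    return f"{100.0 * checks_ok / checks_total:.2f}%"
```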

OAuth and MCP-client unblock

v0.18 · 2026-05-22

Four PRs across the OAuth and MCP-client surface that close registration and tool-catalog gaps.

What's new for you

MCP clients that follow the OAuth 2.1 + refresh-token flow can now register at /oauth/register instead of being rejected.
OAuth-authenticated MCP sessions match the API-key flow's tool catalog — an OAuth session and an API key with the same permissions get the same MCP tools.
A new bulk_resolve_alerts MCP tool resolves alert cohorts in one call.

Reliability fixes — interview turns, bug-report PATCH, feedback interstitial

v0.18 · 2026-05-21

Three coordinated fixes from the post-launch triage day.

What's new for you

Second-and-subsequent interview turns no longer fail with a concurrency error on quick double-submit.
PATCH /v1/bug-reports/{id} accepts both PascalCase and snake_case status values, matching the error response's advertised shape.
Anonymous visitors hitting /feedback see a branded interstitial explaining the page instead of an abrupt sign-in redirect.
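
Accepting both casings on the bug-report PATCH usually comes down to normalizing before validation. This is a sketch of that normalization only. The status names in the usage note are hypothetical, since the notes do not list the allowed values.

```python
def to_pascal_case(status: str) -> str:
    """Normalize a snake_case or lowercase status to PascalCase.

    Values that already look PascalCase are returned unchanged.
    """
    if "_" in status or status.islower():
        return "".join(part.capitalize() for part in status.split("_") if part)
    return status
```

With this, `in_progress` and `InProgress` both normalize to `InProgress`, so one allow-list can cover both spellings.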

MCP server respects JSON-RPC notifications

v0.18 · 2026-05-21

The MCP HTTP-streamable transport now respects JSON-RPC 2.0 notifications: methods sent without an id are answered with HTTP 202 + empty body, and the lifecycle methods silently no-op instead of returning an error.

What's new for you

MCP clients that strictly follow the JSON-RPC 2.0 + MCP specs complete the handshake against https://specstep.com/mcp without modification.
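
The notification rule can be sketched as a small dispatcher: a message without an id gets HTTP 202 and an empty body, whether or not a handler exists. The function name, the handler table, and the 200 status for ordinary responses are assumptions for illustration. Error code -32601 is the standard JSON-RPC 2.0 "Method not found" code.

```python
import json
from typing import Any, Callable, Mapping

Handler = Callable[[Mapping[str, Any]], Any]


def handle_message(raw: str, handlers: Mapping[str, Handler]) -> tuple[int, str]:
    """Return (http_status, body) for one JSON-RPC 2.0 message."""
    msg = json.loads(raw)
    handler = handlers.get(msg.get("method", ""))
    params = msg.get("params") or {}

    if "id" not in msg:
        # Notification: run the handler if there is one, otherwise no-op.
        # Either way the caller gets 202 with an empty body, never an error.
        if handler is not None:
            handler(params)
        return 202, ""

    if handler is None:
        error = {"code": -32601, "message": "Method not found"}
        return 200, json.dumps({"jsonrpc": "2.0", "id": msg["id"], "error": error})

    result = handler(params)
    return 200, json.dumps({"jsonrpc": "2.0", "id": msg["id"], "result": result})
```

A client sending `notifications/initialized` with no id would therefore see 202 and nothing else, even if the server registers no handler for it.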

CI — merge-queue rollout and revert + workflow isolation + dynamic ports

v0.18 · 2026-05-22

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.18 polish and infrastructure (late series)

v0.18 · 2026-05-21 → 22

Cross-cutting polish that doesn't belong in a single themed entry: nightly workflow auto-resolves bug-report rows on next-green-nightly; the api-docs reference embed adopts SpecStep meta chrome; bug-report promotion auto-resolves the source feedback row and cascade-closes findings; sitemap completeness improvements; 17 missing agent detail pages were built out; documentation refresh on MCP tooling args.

What's new for you

17 previously-missing agent detail pages are now live on the marketing site.
Bug-report promotion from a feedback row now also auto-resolves the source feedback row.
/api-docs/reference (the Scalar embed) carries SpecStep meta chrome consistent with the rest of the site.

Quality feedback: validate, amend, and structured evidence

v0.18 · 2026-05-21

The quality-feedback surface gains a dry-run validate tool, a self-correction window for submitters, machine-readable typed evidence on findings, and a slimmed list shape that cuts multi-MB payloads.

What's new for you

validate_feedback (MCP) — dry-run validates a feedback submission shape and returns {valid, errors[]} without persisting. Fix template/section-id/cap violations before spending a submit_feedback call.
amend_feedback (MCP) + PATCH /v1/feedback/{id}/amend (REST) — the original submitter can self-correct an Open feedback row within a 10-minute window: title, summary, full_report, evidence, or tags. No review-queue intervention needed.
Findings now carry a typed_evidence array — structured machine-readable evidence alongside the prose string. Kinds: HTTP response, route, console error, MCP tool call, transcript turn, screenshot, JSON payload.
GET /v1/feedback and /v1/feedback/me now return a summary shape — scalars, a 200-character excerpt, and counts — instead of full bodies. The full record is one by-id call away. Multi-MB list payloads are gone.
Submitting a rubric template that doesn't pair with the feedback type now fails fast with FEEDBACK_TEMPLATE_TYPE_MISMATCH instead of being silently accepted.
Promoting feedback to a bug report is now atomic — a single staged commit — so a failure can no longer orphan a half-created bug report.
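
The amend rules above (original submitter, Open status, 10-minute window, five editable fields) can be read as a single precondition check. The sketch below is a client-side approximation, not the server's logic. It assumes the window is measured from submission time, which the notes do not state explicitly.

```python
from datetime import datetime, timedelta
from typing import Iterable

AMEND_WINDOW = timedelta(minutes=10)
AMENDABLE_FIELDS = frozenset({"title", "summary", "full_report", "evidence", "tags"})


def can_amend(
    status: str,
    is_original_submitter: bool,
    submitted_at: datetime,
    now: datetime,
    fields: Iterable[str],
) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed amendment."""
    if not is_original_submitter:
        return False, "only the original submitter can amend"
    if status != "Open":
        return False, "only Open feedback can be amended"
    # Assumption: the window starts when the feedback was submitted.
    if now - submitted_at > AMEND_WINDOW:
        return False, "amend window has closed"
    unknown = sorted(set(fields) - AMENDABLE_FIELDS)
    if unknown:
        return False, "fields not amendable: " + ", ".join(unknown)
    return True, "ok"
```

Running a check like this before calling amend_feedback avoids a round trip that is certain to be rejected.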

Interview turns default to async

v0.18 · 2026-05-21

submit_interview_turn now defaults to async — the call returns a job_id immediately instead of blocking — so long interview turns no longer risk a ~60s gateway timeout.

What's new for you

submit_interview_turn now defaults to mode: "async": you get a job_id to poll via get_interview_turn_status, or subscribe to the live push. You no longer need to opt in.
mode: "sync" remains for callers that want the inline reply, but it's subject to the ~60s gateway ceiling and is scheduled for removal.
The web Interview page runs async end-to-end with a live push and a polling fallback.
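
A caller on the async default would typically poll until the turn settles. The loop below is a minimal sketch. `get_status` stands in for whatever wraps get_interview_turn_status, and the `state` key with its `completed` and `failed` values is hypothetical, because the notes do not list the status vocabulary.

```python
import time
from typing import Any, Callable, Mapping

# Hypothetical terminal values; adjust to the real status vocabulary.
TERMINAL_STATES = frozenset({"completed", "failed"})


def wait_for_turn(
    job_id: str,
    get_status: Callable[[str], Mapping[str, Any]],
    *,
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Poll a job until it reaches a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(job_id)
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"interview turn {job_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` keeps the loop easy to test. Subscribing to the live push, where available, avoids polling altogether.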

v0.18 platform and CI foundation

v0.18 · 2026-05-21

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Retry visibility on running generations — see when the platform is retrying and why

v0.17 · 2026-05-19

When the platform retries an in-flight AI call (rate-limited upstream, transient blip), polling clients now see exactly what's happening instead of guessing from a climbing cost meter. Closes the last sub-finding from one of our beta customers' feedback reports about “is this run still healthy?”

What's new for you

GET /v1/generations/{id} (REST) + get_generation + wait_for_generation (MCP) carry four new fields: retry_count, last_retry_at, next_retry_at, recoverable_error_category (one of rate_limit, provider_timeout, provider_server_error, schema_violation, other).
The real-time push payload also carries the same four fields, so any live dashboard you build stays in sync without re-polling.
Retry counts start at zero on every fresh attempt and only increment forward — you can reason about “has the platform retried since I last looked?” with a simple integer compare.
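
The integer compare mentioned above might be written as follows. The field names come from these notes, while the helper names and the summary format are illustrative. Snapshots are assumed to be the parsed JSON of two successive polls of the same attempt.

```python
from typing import Any, Mapping

RECOVERABLE_ERROR_CATEGORIES = frozenset(
    {"rate_limit", "provider_timeout", "provider_server_error", "schema_violation", "other"}
)


def has_retried_since(previous: Mapping[str, Any], current: Mapping[str, Any]) -> bool:
    """True when retry_count grew between two polls of the same attempt."""
    return int(current.get("retry_count") or 0) > int(previous.get("retry_count") or 0)


def describe_retry(snapshot: Mapping[str, Any]) -> str:
    """Summarize the retry fields of one poll result in a single line."""
    count = int(snapshot.get("retry_count") or 0)
    if count == 0:
        return "no retries on this attempt"
    category = snapshot.get("recoverable_error_category") or "other"
    if category not in RECOVERABLE_ERROR_CATEGORIES:
        category = "other"
    next_at = snapshot.get("next_retry_at")
    suffix = f", next retry at {next_at}" if next_at else ""
    return f"retry_count={count} ({category}){suffix}"
```

Since the count restarts at zero on a fresh attempt, a lower value than last time signals a new attempt, not a lost retry.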

v0.17 · 2026-05-19

A persistent progress chip in the topbar so you don't lose track of running generations when you navigate away from the workspace. The workspace card itself also now stays in sync with the API in real time.

What's new for you

New progress chip in the topbar next to the bell. Hidden when nothing is running; shows the slowest in-flight generation's progress when one or more are. Hover (mouse) or click (touch / keyboard) to expand a per-row list of every running generation with its own progress bar.
A small x2 / x3 badge overlays the bar when more than one generation is running.
Color tells you the worst state at a glance: blue when work is actively progressing, amber when a retry is in flight, amber pulse when an interview clarification is waiting on you.
The workspace row's retry counter, billing state, and started-work time now update from the real-time push instead of waiting for a page refresh.
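
The chip's display rules can be summarized in a few lines. This sketch rests on two assumptions: "slowest" means the lowest progress value, and the color severity order is blue, then amber, then amber pulse. The `Run` type and the dictionary keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Run:
    progress: float  # assumed scale: 0.0 to 1.0
    retrying: bool = False
    awaiting_clarification: bool = False


def chip_state(runs: Sequence[Run]) -> Optional[dict]:
    """Return what the chip should show, or None when it should be hidden."""
    if not runs:
        return None
    if any(r.awaiting_clarification for r in runs):
        color = "amber-pulse"
    elif any(r.retrying for r in runs):
        color = "amber"
    else:
        color = "blue"
    return {
        "progress": min(r.progress for r in runs),  # slowest run drives the bar
        "badge": f"x{len(runs)}" if len(runs) > 1 else None,
        "color": color,
    }
```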

Better error when an explanation takes too long

v0.17 · 2026-05-19

The “Explain This To Me” modal occasionally surfaced a confusing HTTP 502 when the upstream AI was slow. It now runs under a 75-second wall-clock budget and shows a friendlier retry prompt.

What's new for you

When an explanation takes longer than expected, the modal now shows “Took longer than expected. Try again — most explanations finish within 30 seconds.” (was: “Couldn't generate the explanation (HTTP 502)”.)
Most “Try again” clicks succeed within a few seconds — the timeout fires when the upstream is genuinely slow, not on a healthy call.

Per-finding statuses on multi-issue feedback

v0.17 · 2026-05-18

When you submit feedback with multiple findings (e.g., a package-quality review covering five different concerns), each finding now tracks its own resolution status independently. The parent feedback row stays Open until every child finding has been resolved or dismissed.

What's new for you

Multi-issue feedback rows now expose per-finding statuses through the MCP and REST surfaces. You can see exactly which of your reported issues are in triage, which are resolved, and which were dismissed — not just a single status for the whole submission.
New MCP tool to walk findings cross-row: list_feedback_findings answers “show me every open finding I've reported, sorted by severity” without re-walking each parent.
A recurrence-chain tool (list_feedback_recurrences) traces every follow-up feedback row that referenced an earlier submission as the seed.

No more “Server restarted. Please refresh.” modal

v0.17 · 2026-05-18

The browser session now reconnects on its own after a transient infrastructure interruption. The previously-needed manual page refresh is gone.

What's new for you

The page reconnects automatically when the session hiccups. No more “Server restarted. Please refresh.” modal interrupting whatever you were doing.

v0.17 polish & infrastructure

v0.17 · 2026-05-18 → 19

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Explain a package for an audience — one click, audience-tailored markdown

v0.17 · 2026-05-18

Packages now ship with a one-click "Explain what's in it" surface that rewrites the package as a short, audience-tailored markdown summary. Pick the audience — executive, product manager, engineering manager, new engineer, investor, or security reviewer — and the result is generated, cached, and copy/paste-able. Repeats for the same audience are free.

What's new for you

New "Explain what's in it" button on the Generation Detail Package-ready card, and a lightbulb icon on every Workspace Complete row — both open the same audience picker. The first request for a given audience runs the AI in a few seconds; subsequent picks of the same audience return the cached markdown instantly at no cost.
Six curated audiences ship at launch: Executive (what was decided, business consequences, no jargon), Product manager (scope, user stories, acceptance-criteria summary), Engineering manager (architecture, risks, de-risking plan), New engineer (where to start reading, mental model, gotchas), Investor (what shipped, market position, what's next), and Security (threat surface, trust boundaries, what got reviewed).
Three new REST endpoints — GET /v1/explain/audiences (public catalog), POST /v1/packages/{id}/explain (cold or cached), GET /v1/packages/{id}/explanations (list cached audiences for a package). Two new MCP tools — list_audiences and explain_package — mirror the REST surface for agent callers.
Per-tier monthly quotas keep the cost predictable: 20 cold explanations / month on Free, 200 / month on Pro, unlimited on Team. Warm-cache hits never count.
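
The quota rule (cold explanations count, warm-cache hits do not) can be sketched as follows. The class, its method names, and the in-memory cache are hypothetical; only the per-tier limits come from the note above.

```python
# Monthly cold-explanation limits from the release note; None means unlimited.
MONTHLY_COLD_QUOTA = {"free": 20, "pro": 200, "team": None}


class ExplainQuota:
    """Hypothetical in-memory model of the cold/warm accounting."""

    def __init__(self, tier):
        self.limit = MONTHLY_COLD_QUOTA[tier]
        self.cold_used = 0
        self.cache = {}  # (package_id, audience) -> cached markdown

    def explain(self, package_id, audience, generate):
        key = (package_id, audience)
        if key in self.cache:
            # Warm hit: served from cache and never counted against quota.
            return self.cache[key]
        if self.limit is not None and self.cold_used >= self.limit:
            raise RuntimeError("monthly cold-explanation quota exhausted")
        markdown = generate(package_id, audience)  # the cold, billable call
        self.cold_used += 1
        self.cache[key] = markdown
        return markdown
```

Requesting the same audience twice for one package increments cold_used once; the second call returns the cached markdown.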

Clearer signals on running generations — billing state, started-work timestamp, completion forecast, active specialist

v0.17 · 2026-05-18

Polling a running generation now returns five new fields that let callers tell "actively working" from "paused on a transient error" without re-scanning the events stream. Closes a gap where climbing cost could look like runaway billing even when the platform was making real progress.

What's new for you

billing_state — NotStarted, Active, PausedRetrying, or Complete. Pair with running_cost_usd to disambiguate "healthy" climbing cost from runaway: Active + climbing cost = the platform is earning the spend; PausedRetrying + climbing cost = something's stuck and you should look.
started_work_at — the exact moment the dispatcher claimed the row. Lets callers compute "how long has this been actively working?" without scanning events.
estimated_time_remaining_seconds and estimated_completion_at — best-effort forecast computed from the historical-median model. Null while queued, terminal, or when the forecast is unavailable.
active_specialist — during specialist review, the slug of the most-recently-completed specialist in the current round (one of codd / halo / tally / vera / trip / merlin / polo). A pragmatic single-value summary of a parallel fan-out.
progress_explanation on the MCP get_generation / wait_for_generation responses — one-sentence narration of what's happening at the current progress_percent (e.g., "Specialists are reviewing the draft in parallel"). Closes the same understanding gap for agent callers.
All five fields are strictly additive; null on generations that started before the rollout.
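
A caller might combine billing_state with two successive running_cost_usd readings roughly like this. The function name and the returned labels are illustrative assumptions; only the field names and the four billing_state values come from the notes above.

```python
def classify_spend(billing_state, previous_cost_usd, running_cost_usd):
    """Label a poll result using billing_state plus the cost trend."""
    climbing = running_cost_usd > previous_cost_usd
    if billing_state == "Active" and climbing:
        return "healthy"  # the platform is doing work for the spend
    if billing_state == "PausedRetrying" and climbing:
        return "investigate"  # cost is rising while the run is stuck
    return "steady"  # NotStarted, Complete, or no cost movement
```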

Cancel an in-flight async interview turn

v0.17 · 2026-05-18

If you submitted an interview turn in async mode and realize the message was wrong — or the LLM call is dragging on — you no longer have to wait for it (or the stuck-job timeout). A new cancel surface ships on both REST and MCP.

What's new for you

New REST endpoint POST /v1/interviews/turns/{jobId}/cancel and new MCP tool cancel_interview_turn. Queued jobs cancel cleanly; running jobs cancel best-effort — the job's terminal status will be cancelled, but the agent reply MAY still appear in the interview transcript if a mid-pipeline commit landed before the cancel did.
Idempotent on already-cancelled jobs. Returns 409 INTERVIEW_TURN_NOT_CANCELLABLE when the job already reached completed or failed — the work landed; read the result via the existing poll endpoint.
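
On the client side, the cancel outcomes above could be interpreted along these lines. This is a sketch: it assumes a 2xx status on success and that the error code is available as a string, neither of which is specified in the note.

```python
def interpret_cancel_response(status_code, error_code=None):
    """Map a cancel response to the next step a client should take."""
    if 200 <= status_code < 300:
        # Covers first-time cancels and repeat calls on cancelled jobs.
        return "cancelled"
    if status_code == 409 and error_code == "INTERVIEW_TURN_NOT_CANCELLABLE":
        # The job already completed or failed, so read the result instead.
        return "poll-for-result"
    return "unexpected"
```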

Marketing site & api-docs polish — skip-link fixes, metadata, mobile rendering

v0.17 · 2026-05-18

A focused pass on the public-facing surfaces ahead of v1 promotion — SEO metadata, accessibility, and mobile rendering across the marketing site, status pages, and api-docs reference.

What's new for you

Skip-to-content links and section TOC anchors are now route-qualified everywhere — pressing Tab on a non-root page now jumps to the page's main content instead of being silently hijacked to the homepage by <base href="/">. Affects every non-root marketing route.
SEO metadata gaps closed across the homepage and status sub-pages: route-specific meta descriptions, canonical URLs, and visible h1s on /status/uptime and /status/history.
/api-docs/reference/ no longer overflows horizontally on narrow viewports — document-level max-width: 100vw clamp added to the Scalar reference mount.
The release-notes back-to-top pill now scrolls to the Contents heading instead of doing nothing (was a stale empty fragment).
Public OpenAPI document slimmed — an internal-only webhook callback is no longer listed (it was never a customer-integrator binding target). Narrows the documented surface without removing any caller-relevant route.

Package consistency checks catch cross-doc contradictions before delivery

v0.16 · 2026-05-17

A new consistency-checking stage runs across the assembled package immediately before delivery. Each check looks for a class of contradiction across the package’s docs — project-name drift, missing referenced files, requirement-identifier orphans, schema feasibility against acceptance criteria, architecture decisions that conflict across docs, storage assumed in-memory while requirements imply durable state, stale traceability matrix, JSON field-naming conventions that disagree between API design and acceptance criteria, and review-report freshness.

What's new for you

Packages now ship with a consistency_findings array in the manifest plus a banner in handoff.md when Critical-severity contradictions are detected. The AI coder reading the package gets a flagged punch list instead of discovering the contradiction at build time.
Catches the contradictions that previously made retest packages fail at the wire boundary: project-name drift across docs, missing referenced files, FR / NFR / AC identifier orphans, schema feasibility against AC, architecture-decision conflicts (JWT vs. session cookies, Postgres vs. MongoDB, etc.), storage-durability vs. stateful-requirement mismatches, stale traceability matrix, JSON naming-convention mixups, and empty / placeholder review reports.
Severity stratified — Critical bubbles to the handoff banner; High and Medium live in the manifest for awareness without blocking delivery.
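
A consumer of the manifest might split findings by severity like this. The "severity" key and the function name are assumptions about the shape of consistency_findings, which the note does not spell out.

```python
def split_findings(consistency_findings):
    """Separate banner-worthy findings from manifest-only ones."""
    banner = [f for f in consistency_findings if f["severity"] == "Critical"]
    manifest_only = [
        f for f in consistency_findings if f["severity"] in ("High", "Medium")
    ]
    return banner, manifest_only
```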

Interview pipeline hardening — idempotency, async submission, completion auto-handoff

v0.16 · 2026-05-17

The interview submission path picked up four classes of hardening: client-driven idempotency on the wire, an opt-in async mode for long turns, automatic generation handoff on interview completion, and graceful shutdown so a deploy mid-turn doesn’t strand work.

What's new for you

submit_interview_turn accepts a client_request_id — retry a request and the server returns the original response instead of re-running the model. Same shape on REST and MCP.
Pass mode: "async" to submit_interview_turn for an immediate job-id response when you don’t want to block the caller on a long turn. Poll status via the new get_interview_turn_status tool.
MCP unknown-argument errors return a structured envelope (typed code + offending key + suggestion) instead of a flat string.
When the interview finishes, the generation auto-starts and the response carries a next_action field pointing the agent at the running generation. No more “interview done, now what?” deadlock.
Deploys mid-turn no longer strand interview work — queued and in-flight rows rewind cleanly on graceful shutdown so the next revision picks them up.
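
The client_request_id behaviour described above follows the usual idempotency-key pattern. A minimal server-side sketch, assuming an in-memory store and a hypothetical run_model callable (the real storage and pipeline are not described in these notes):

```python
class IdempotentTurnHandler:
    """Replay the stored response when a client_request_id is retried."""

    def __init__(self, run_model):
        self._run_model = run_model  # hypothetical: performs the model call
        self._responses = {}  # client_request_id -> original response

    def submit(self, client_request_id, message):
        if client_request_id in self._responses:
            # Retry: return the original response without re-running the model.
            return self._responses[client_request_id]
        response = self._run_model(message)
        self._responses[client_request_id] = response
        return response
```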

Feedback workflow maturity — recurrence threading, terminal-state notifications, four new rubric templates

v0.16 · 2026-05-17

The Feedback aggregate shipped in v0.15; this series matured the workflow around it: threading recurrences when a resolved issue comes back, notifying submitters when their row reaches a terminal state, having the server emit recommendation envelopes that AI agents can auto-file against, and adding rubric templates for the four feedback categories that standing review called out.

What's new for you

When you file feedback or a bug report and a server-side fix lands, you get a notification when the row reaches Resolved / Won’t-fix.
If a resolved issue comes back, file a new row with recurrence_of_feedback_id (or recurrence_of_bug_report_id) and it threads to the original, so reviewers see the full history instead of starting over.
AI agents calling MCP tools that detect quality drag receive a feedback_recommendation envelope alongside the response, with a recommendation_token the agent can pass into submit_feedback. Repeat tokens within 30 days bump an occurrence counter on the existing row instead of filing a duplicate.
Four new feedback rubric templates ship alongside the existing end-to-end one: interview-quality, package-buildability, api-doc-quality, tooling-experience. Pick the one that matches the scope of your feedback — narrower rubrics keep the signal cleaner than the all-in-one.
New api_doc_quality feedback type pairs with the api-doc-quality rubric so feedback on the /api-docs/* surface has its own bucket.
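
The repeat-token rule could look roughly like this. The row shape, the function name, and the choice to measure the 30-day window from the first filing are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

DEDUPE_WINDOW = timedelta(days=30)


def file_recommendation(rows, token, now):
    """Bump the existing row for a repeated token, else file a new one."""
    row = rows.get(token)
    if row is not None and now - row["filed_at"] <= DEDUPE_WINDOW:
        row["occurrences"] += 1  # repeat inside the window: no duplicate row
        return row
    rows[token] = {"filed_at": now, "occurrences": 1}
    return rows[token]
```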

Privacy-conscious visitor analytics on the marketing site

v0.16 · 2026-05-17

The marketing site picked up a privacy-conscious analytics pipeline so we can see what’s working on the public surface without tracking individual visitors across days.

What's new for you

First-visit / repeat-visit / signup-conversion analytics with no third-party tracker, no cross-day correlation, and country-level geo attribution. DNT: 1 is honored end-to-end.
Privacy Policy §1.4 documents the analytics collection: the visitor identity is derived from hash(IP, user-agent) under a daily-rotating salt, and the salt is deleted 48 hours after rotation, so no database snapshot can be re-correlated with a fresh observation.
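
One way to build such an identifier is a keyed hash over the IP and user-agent. The sketch below uses HMAC-SHA256 with a random daily salt; that construction, the separator, and the function names are assumptions, not necessarily what the site does.

```python
import hashlib
import hmac
import secrets


def new_daily_salt():
    """Generate a fresh random salt, to be rotated once per day."""
    return secrets.token_bytes(32)


def visitor_id(salt, ip, user_agent):
    """Derive a pseudonymous visitor ID that is stable only under one salt."""
    message = f"{ip}|{user_agent}".encode("utf-8")
    return hmac.new(salt, message, hashlib.sha256).hexdigest()
```

The same visitor hashes to the same ID while a salt is live and to an unrelated ID after rotation, which is what prevents cross-day correlation once the old salt is gone.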

MCP `wait_for_generation` picks up progress and cost forecast

v0.16 · 2026-05-17

wait_for_generation is the recommended polling primitive but until now omitted the progress_percent and cost-forecast fields that get_generation returned. MCP clients had to call both tools to render a single progress screen. Closed in this series.

What's new for you

MCP clients polling a generation via wait_for_generation now get progress_percent (0–100), current_round, and the historical-median cost forecast (estimated_total_usd plus p25 / p75 / sample size) on every response.
Strictly additive — clients ignoring undocumented fields keep working.

Outbound email branding — SpecStep <donotreply@specstep.com>

v0.16 · 2026-05-17

Transactional emails now ship under the verified SpecStep custom-domain sender instead of a generic default address.

What's new for you

Transactional emails show SpecStep <donotreply@specstep.com> as the sender across every send surface (verification, password reset, support routing, retention warnings, terminal-state notifications). DNS-aligned, SPF / DKIM-verified.

v0.16 polish & infrastructure

v0.16 · 2026-05-17

Cross-cutting polish that doesn’t belong in a single themed entry.

What's new for you

Twelve custom monochrome rail icons replace the prior set across the left-rail navigation — consistent stroke weight, no-slope discipline, orthogonal glyphs for each major surface.
Layout cleanup: redundant section-level page titles dropped from a few internal layouts so the page title isn’t rendered twice.
The intake-extraction tool ships a tighter input schema so the model can’t produce shapes the validator would reject — closes an interview-side feedback recurrence.

Submit quality feedback on interviews, packages, and end-to-end runs

v0.15 · 2026-05-16

A new Feedback surface joins Bug Reports. Bug Reports captures broken behavior; Feedback captures quality reviews — was the interview good, is the generated package coherent, what's the build confidence for the end-to-end run.

What's new for you

New /feedback pages — submit a quality review on any interview, generated package, or end-to-end run from inside the app. Pick a target, score build confidence on a 0–10 scale, and narrate what worked and what didn't.
New submit_feedback MCP tool — same submission shape for AI clients. Kept distinct from submit_bug_report so quality signal stays separate from broken-behavior signal.
Built-in feedback templates ship out of the box for interview-rated, package-rated, and end-to-end-rated submissions.

Interview and package quality pass

v0.14 · 2026-05-16

A focused sweep on the agent-experience surface — tighter interview triggers, better project-name handling, clearer MCP errors, and a series of package-quality prompt tightenings.

What's new for you

The interview triggers child-safety questions when the discussed domain implicates minors; previously the trigger missed several common phrasings. The follow-up rating prompt fires earlier in the interview and the phase-progression check is stricter so the interview doesn't advance without the required signal.
Project-name extraction picks a canonical name even when the user gives a contradictory short name vs. long name — manifests, package zips, and emitted artifacts all agree on the same project name.
MCP tools surface unknown-argument typos instead of silently dropping them — misspell an argument name and the tool returns a structured error pointing at the offending key.
Generation packages ship with a decision-log stub and per-agent cost data, so downstream consumers can audit what each agent contributed.
Workspace-card progress numbers (cost, phase, elapsed) are recomputed at request time, so they're accurate rather than stale from cache.
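
The unknown-argument behaviour can be illustrated with a small validator. The error code string, the envelope keys, and the use of difflib for the suggestion are assumptions; the notes only say the error is structured and points at the offending key.

```python
import difflib


def check_arguments(supplied, allowed):
    """Return a structured error for the first unknown key, else None."""
    for key in supplied:
        if key not in allowed:
            close = difflib.get_close_matches(key, allowed, n=1)
            return {
                "code": "UNKNOWN_ARGUMENT",  # hypothetical code value
                "key": key,
                "suggestion": close[0] if close else None,
            }
    return None
```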

Reliable cancellation and smoother first-render

v0.14 · 2026-05-16

Two reliability threads closed in the same week: cancel-button behavior on long-running generations, and a series of first-render crashes on interactive workspace and generation components.

What's new for you

Cancelling a generation now reliably propagates to the agent — the cancel button confirms with a spinner, the workspace card flips to Cancelling within seconds, and the agent halts at the next clean phase boundary instead of running to completion.
Workspace, Generation detail, and Animated cost components no longer flash an error on first render — the interactive bits now wait until the page is ready to handle them.

Privacy and retention SLA closure

v0.14 · 2026-05-16

The two outstanding privacy follow-ups from the legal-page refresh closed end-to-end. The policy text that previously hedged can now say what it means.

What's new for you

When you delete your account, identity-bearing fields in audit records are scrubbed at deletion time, not just identity-replaced.
When you hard-delete a generation or package from the Recycle Bin, the underlying file storage is purged immediately.
A nightly sweep cleans up anything that escapes the immediate purge, backing the 24-hour retention SLA in the Privacy Policy.

Pro tier “Coming soon”

v0.14 · 2026-05-16

Payment processing isn't yet enabled in production; the Pro tier is gated behind a “Coming soon” placeholder on the pricing surfaces until it is.

What's new for you

The Pricing page shows “Coming soon” in place of a Pro price. Free tier remains live and immediately subscribable.
Pro checkout is blocked at the system level until payment processing is enabled — no half-completed sessions.

v0.14 polish & infrastructure (continued)

v0.14 → v0.15 · 2026-05-16

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.14 · 2026-05-15 → 16

The landing-page topbar and the in-app topbar drifted on four axes — font, brand-lockup size, redundant navigation links, and a noisy tier chip. Settled in two passes.

What's new for you

Identical topbar across landing and signed-in pages — same font, same brand-lockup size, same persona menu.
Click your name to open your workspace; the dropdown still surfaces Workspace, Settings, and Sign out as labelled choices.
Signed-in visitors on landing see their identity (avatar, name, notification bell) and the preview chip — the marketing shell no longer looks logged-out when you're logged in.
The persona button no longer shows the plan label next to your name; plan is one hover away in the dropdown.

Six new MCP tools — structured findings, change-request files, pre-flight validation, content diff

v0.14 · 2026-05-16

The MCP catalog grew from 50 to 56 tools in a single shipping pass that closed every deferred entry from the original MCP-additions plan.

What's new for you

Structured security and quality findings. get_security_findings and get_generation_quality_report return per-finding severity, topic, and title for the security, reliability, accessibility, cost, and risk reviews. Branch on max_severity instead of parsing the markdown.
File-level addendum access. list_change_request_files and get_change_request_file read individual files inside a change-request addendum zip without the fetch-then-unzip dance.
Pre-flight validation. validate_generation_request dry-runs a kickoff — same checks as start_generation, no enqueue. Returns the same error codes the live tool throws so you can branch before paying for the call.
Real content diff across packages. diff_package_files emits a line-level unified diff across same-named files in 2–5 packages. Sister of compare_packages; returns the actual delta, not byte counts.
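
To show what a line-level unified diff across same-named files means, here is a local sketch using Python's difflib. It compares only two packages, modelled as dicts of filename to text, and it is not the tool's implementation.

```python
import difflib


def diff_same_named_files(package_a, package_b):
    """Return unified-diff lines for each file present in both packages."""
    diffs = {}
    for name in sorted(package_a.keys() & package_b.keys()):
        lines = list(
            difflib.unified_diff(
                package_a[name].splitlines(),
                package_b[name].splitlines(),
                fromfile=f"a/{name}",
                tofile=f"b/{name}",
                lineterm="",
            )
        )
        if lines:  # identical files produce no diff lines and are skipped
            diffs[name] = lines
    return diffs
```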

Privacy Policy and Terms of Service refresh

v0.14 · 2026-05-16

Both legal pages refreshed to cover features that shipped since the last revision, plus parent-corporation disclosure and a stronger international transfer mechanism.

What's new for you

Privacy Policy — new disclosures for External Connectors, MCP OAuth tokens, bring-your-own LLM keys, outbound webhooks, source-control delivery, profile photos, Recycle Bin auto-purge, account deletion cascade, data export, and SMS notifications.
Privacy Policy §9 now bases the international transfer mechanism on the EU-US Data Privacy Framework with UK Extension and Swiss DPF, replacing the consent-based approach.
Terms of Service adds a new §8.5 covering third-party integrations and credentials (external storage, MCP OAuth, BYO LLM keys, source-control delivery, outbound webhooks) and a California auto-renewal disclosure in §6; §10 is rewritten to reflect 24-hour deletion processing and 7-day Data Export URLs.
Both pages disclose No Compromise AI as the Delaware parent corporation. Texas governing law unchanged.
Terms version bumped — authenticated users are prompted to re-accept on next visit. API and MCP OAuth callers receive a structured re-acceptance response until acceptance is recorded.

External Connectors v1 — SharePoint live, browser-based attach from MCP, Google Drive promoted

v0.14 · 2026-05-15

External Connectors crossed the v1 launch bar — SharePoint joined OneDrive and Google Drive as a live provider, MCP clients can start a folder attach without leaving the agent, and new MCP clients self-register without manual setup.

What's new for you

SharePoint connector ships as the third live provider alongside OneDrive and Google Drive. Connect a SharePoint site during the interview; reference documents flow in the same way.
attach_external_folder MCP tool returns a one-time browser URL; you open it, complete OAuth + folder pick + first sync, and your MCP client polls a sister tool until the attach completes. No copy-pasted commands.
Google Drive promoted from “Coming soon” to live across pricing, FAQ, About, and API docs.
MCP clients can self-register at POST /oauth/register. Existing MCP clients keep working under their previous registration.

Brand identity — public /brand page and vendor logos on integration tiles

v0.14 · 2026-05-15

The SpecStep marks had been on disk for weeks but never had a public home; integration partners and press had nowhere to grab them. Closed with a dedicated /brand page plus vendor logos replacing emoji glyphs across integration surfaces.

What's new for you

New public page at /brand — logo downloads (mark, wordmark, horizontal lockup, stacked lockup with PNGs from 16px to 1024px), color tokens, typography, canonical naming, “Powered by SpecStep” live-preview badges with paste-snippet HTML, MCP-tile size recommendations, and a permissive license that says editorial and integration use are free with no permission needed.
Connect-a-folder modal and landing-page External Connectors section show real Microsoft / OneDrive / SharePoint / Google / Google Drive / GitHub product logos instead of emoji glyphs. Dropbox appears as a “Coming soon” tile.
Landing-page sign-in row treats Microsoft, Google, and GitHub as equal peers with identical button styling.

v0.14 · 2026-05-14 → 15

Two GDPR-mandated surfaces shipped end-to-end — the right to erasure and the right to data portability.

What's new for you

Delete account in Settings requests a cascade delete of every owned resource (interviews, generations, packages, addendums, webhooks, external-connector credentials, and API keys), along with audit-event identity replacement and blob deletion. A confirmation email arrives when the cascade completes.
Export data in Settings requests a zip of every interview transcript, generation manifest, package, and account-metadata blob you own. An email arrives with a 7-day signed download link.
Both flows process within 24 hours of submission.

Consistent Save / Cancel and toast behavior across Settings

v0.14 · 2026-05-14 → 15

A focused pass on the form, save, and notification primitives that every Settings panel consumes. The user-visible result is identical save-cancel-toast behavior across every editable surface.

What's new for you

Every editable Settings panel uses the same Save / Cancel bar that anchors at the bottom while you're editing — no more guessing which surface uses which submit pattern.
Save success and failure show the same toast surface across every panel; transient errors no longer fail silently.
Form-section headings are consistent across Settings — same typeface, weight, and spacing on Profile, Notifications, Logs, Webhooks, Cost & Usage, and API Keys.
Empty states share one visual shape; the “Try refreshing” and “Try a different filter” affordances no longer drift per surface.

Workspace stat tiles — ISSUES and USAGE, plus linked filters and cost drill-down

v0.14 · 2026-05-15

The /workspace stats band gained two new tiles and every tile now navigates somewhere meaningful when clicked.

What's new for you

Two new tiles — ISSUES (count of generations in a failed terminal state) and USAGE (THIS MONTH) (total cost across the current calendar month).
IN FLIGHT and COMPLETED tiles are now anchors — clicking either jumps the table below to the matching filter, rows sorted most-recent-first.
Spend tiles navigate to the cost-and-usage drill-down for the matching period.

Wide-prose layout for /about, /support, and /release-notes

v0.14 · 2026-05-15

The three long-prose public pages were stuck in a 720-pixel column on desktop, leaving the right two-thirds of the screen blank. They now use the same wide layout the FAQ and API docs already use.

What's new for you

About, Support, and Release notes pages render full-width on desktop with a sticky right-rail navigation that tracks scroll position.
Page hero band hosts the title and last-updated stamp instead of stacking inline at the top of the prose.

Per-user retention preference and Recycle Bin hard-delete

v0.14 · 2026-05-14

Two retention-related surfaces. You can now set your own default retention window for new generations, and Recycle Bin gained a permanent “Delete forever” affordance for users who don't want soft-deleted rows hanging around for the standard 30-day window.

What's new for you

New Default retention setting in Settings — pick how long completed generations stay on your account before auto-purge (overrides the platform default for new generations only).
Recycle Bin has a Delete forever affordance per row — one-click hard-delete of a soft-deleted generation without waiting for the sweep.

v0.14 polish & infrastructure

v0.14 · 2026-05-14 → 16

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Pre-v1 launch polish across the entire product

v0.14 · 2026-05-14

A multi-session polish pass settled the user experience across every surface — marketing, API docs, onboarding, the generation flow, Settings, and account administration — before v1.

What's new for you

Marketing surfaces — pricing page cleaned up, hero accessibility and motion-calm pass, /about gains an inline jump-nav across seven sections, copy fixes on BillingSuccess and the team roster.
API docs — anchors corrected, enum rows restored, sub-nav reordered with a mobile-friendly TOC, code blocks colorized with a copy-button, semantic status pills on error and rate-limit tables, a dedicated page header with last-updated stamp.
Onboarding and Interview — clearer error and OAuth-consent copy, humanized timestamps on the Status page, first-load orientation on the Interview page, composer flatten plus legal-ack “Later” and profile-compare scroll and decision-table visibility, a tidier modal experience with Escape returning focus correctly.
Generation and Package — Workspace renders correctly on first paint and on mobile, real permalink and a cost share-of-budget bar on Generation Detail, the real package file tree, per-row short-id demoted to a hover tooltip.
Settings — sidebar reorganization, copy improvements plus empty-state teaching, a renamed Data Export path, semantic pill palette across the silent-deduped comparator reveal.
Account administration — search and filters and sortable column headers across the high-traffic tables, a fresh role-permissions audit-trail surface, TOC anchors on Engineering release notes resolve to the same page.
Visual settlement — Generation Detail folds progress and cost and pipeline into a single telemetry strip; the five system pages (Error, Access denied, Account disabled, Terms acceptance, OAuth Consent) share a common shell; the permalink URL element on Generation Detail gains a visible focus ring.

Under the hood

Cross-cutting sweeps — display helpers retire raw enum names, missing CSS primitives added, responsive stat-tile layout, sentence-case capitalization sweep across navigation and tab labels, an emoji-to-SVG sweep, public-copy-gate extension to catch internal identifiers across components.
Visual and accessibility baselines refreshed after the per-batch work; new capture workflows generate baselines on CI runners so refreshes are deterministic.

Extra Usage prepaid balance

v0.13 · 2026-05-13

Pro and Team accounts can now buy a prepaid Extra Usage block — additional generation credit that drains alongside the monthly allowance.

What's new for you

New Extra Usage card on the Billing page shows your current balance and lets you buy a block.
When your monthly allowance runs out, generations draw from the Extra Usage balance instead of failing.
Balance and purchase receipts are visible inline on the same surface.
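
The draw-down order described above (monthly allowance first, then Extra Usage) can be sketched as simple arithmetic. Amounts are integer cents and the function name is hypothetical; how the platform actually meters usage is not described here.

```python
def charge(cost_cents, allowance_left_cents, extra_balance_cents):
    """Spend from the monthly allowance first, then from Extra Usage."""
    from_allowance = min(cost_cents, allowance_left_cents)
    remainder = cost_cents - from_allowance
    if remainder > extra_balance_cents:
        raise ValueError("insufficient allowance and Extra Usage balance")
    return allowance_left_cents - from_allowance, extra_balance_cents - remainder
```

For example, a 300-cent generation against 100 cents of remaining allowance and a 500-cent Extra Usage balance leaves 0 and 300 respectively.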

AI provider preferences

v0.13 · 2026-05-13

Choose which AI providers SpecStep is allowed to use for your generations.

What's new for you

A new Settings panel lets you opt out of specific AI providers; agents pinned to a disabled provider fall back to your chosen default.
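
The fallback rule amounts to a one-line lookup. The function name and the provider names in the usage note are placeholders, not the platform's configuration model.

```python
def resolve_provider(pinned_provider, disabled_providers, default_provider):
    """Use the agent's pinned provider unless the user has opted out of it."""
    if pinned_provider in disabled_providers:
        return default_provider
    return pinned_provider
```

With "provider-a" disabled and "provider-b" as the default, an agent pinned to "provider-a" would resolve to "provider-b".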

Contact form, refreshed About + FAQ, support routing

v0.13 · 2026-05-13

The Contact page is now a form (no email address visible); support tickets route to a dedicated support inbox; About and FAQ have been refreshed for the current product, and /faq gains a right-rail table of contents matching the API docs.

What's new for you

/contact is a form — pick a reason (Sales / Partnership / Press / Integration / Feedback / Other), type your message, send. No email address listed for scrapers.
Support tickets now go to a dedicated support inbox, separate from general inquiries.
/about and /faq are current — cover MCP, browser sign-in for MCP clients, External Connectors (SharePoint and OneDrive live; Google Drive and Dropbox coming), Addendums, Recycle Bin, Webhooks, and the renamed review profiles.
/faq has a sticky right-rail table of contents that scroll-spies the active section as you scroll.

Pricing page evolution

v0.13 · 2026-05-13

The Pricing comparison grid gained an AI Providers band, a Document Management band (SharePoint and OneDrive live; Dropbox and Google Drive coming), and a Team-tier "Coming soon" overlay so the upgrade CTA doesn't fire before the tier ships.

What's new for you

AI Providers band shows which providers each tier may use.
Document Management band lists the connector lineup with status pills.
Team upgrade CTA shows a "Coming soon" overlay until the tier ships.
Per-tier alignment + readability cleanup so checkmarks and pills line up.

v0.13 · 2026-05-13

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Right-rail TOC + page-width polish

v0.13 · 2026-05-13

The layout width on /faq and /api-docs/* is now capped so the table of contents pins to the right edge and the article fills the center.

What's new for you

/faq + /api-docs/* article column now fills the available width up to the table of contents.
The TOC rail pins to the right edge of the centered container.
The rail's inner scrollbar is hidden visually (wheel-scrolling still works on the longer pages).

v0.13 polish & infrastructure

v0.13 · 2026-05-13

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.13 · 2026-05-13

MCP clients — Claude Desktop, Claude.ai, Cursor, Codex, GitHub Copilot, Continue, Cline — can now sign you in through a browser instead of requiring a manually pasted API key.

What's new for you

On first connection, your MCP client opens SpecStep's authorization page in a browser; you approve with your existing account session and the client receives a 90-day token automatically.
API keys keep working as-is — headless and CI flows are unaffected.
Settings → API keys gained a “Connected MCP clients” panel where you can see active MCP sessions and revoke any of them individually.

Under the hood

Full OAuth 2.1 browser flow with strict redirect enforcement — no redirect to arbitrary domains.
Authorization codes are single-use with a short TTL, consumed atomically so replay is structurally blocked.
REST and MCP share the same authentication model — either credential works on either surface.
New discovery and authorization endpoints are published at standard well-known paths.
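
Single-use codes with atomic consumption can be illustrated with a lock-guarded pop: whichever caller removes the code first wins, and any replay finds nothing. This is an in-memory sketch; the class name, the 60-second TTL, and the storage are assumptions.

```python
import threading
import time


class AuthCodeStore:
    """Hypothetical store for short-lived, single-use authorization codes."""

    def __init__(self, ttl_seconds=60.0):
        self._ttl = ttl_seconds
        self._codes = {}  # code -> (client_id, expires_at)
        self._lock = threading.Lock()

    def issue(self, code, client_id, now=None):
        now = time.monotonic() if now is None else now
        with self._lock:
            self._codes[code] = (client_id, now + self._ttl)

    def consume(self, code, now=None):
        """Return the client_id once; later or expired attempts get None."""
        now = time.monotonic() if now is None else now
        with self._lock:
            # pop removes the code in the same step that reads it,
            # so two concurrent consumers cannot both succeed.
            entry = self._codes.pop(code, None)
        if entry is None:
            return None
        client_id, expires_at = entry
        return client_id if now <= expires_at else None
```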

Per-intake cost and duration estimates

v0.13 · 2026-05-13

Profile cards on the Interview page now show cost and duration ranges grounded in your actual usage history — not hardcoded approximations.

What's new for you

Each profile card shows a tight cost range and duration estimate derived from rolling 30-day medians across the agents that will run for your project.
Estimates factor in your tier and the project attributes Otto has detected so far — so the numbers shift as the interview progresses.
Cards without enough history show a baseline estimate plus a “projected” label so you know it isn't history-grounded yet.
The compare panel below the cards draws from a profile-wide median forecast, so the side-by-side view stays current as Otto refines what he knows about your project.

Under the hood

Estimates recompute after every interview turn — no manual refresh needed.
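
A rolling 30-day median of the kind described above could be computed like this. The input shape (pairs of finish time and cost) and the function name are assumptions; returning None stands in for the not-enough-history case that gets a projected baseline.

```python
from datetime import datetime, timedelta
from statistics import median


def rolling_median_cost(runs, now, window_days=30):
    """Median cost of runs finished within the trailing window, else None."""
    cutoff = now - timedelta(days=window_days)
    recent = [cost for finished_at, cost in runs if finished_at >= cutoff]
    return median(recent) if recent else None
```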

Interview profile picker — compare panel and grid fix

v0.13 · 2026-05-13

Two improvements to the profile selection step: a side-by-side comparison panel and a layout fix that was pushing the Fast profile out of position.

What's new for you

A collapsible “Compare profiles” panel sits between the three main cards and the Researcher option — a 7-row table covering best use case, review rounds, specialists included, estimated cost and duration, and tier required.
The Fast, Normal, and Extensive cards now sit correctly in a three-column grid — a misplaced element was displacing Fast into the second cell and wrapping Extensive onto a second row.

Profile renames and a per-feature comparison grid

v0.13 · 2026-05-13

“Thorough” is now “Normal” and “Exhaustive” is now “Extensive” — names that map directly to what each tier actually delivers. The Pricing comparison grid is restructured into three category bands so the tier differences are scannable at a glance.

What's new for you

The three standard profiles are now Fast / Normal / Extensive — the progression reads as a straight scale rather than a scale with a marketing name in the middle.
The Pricing comparison table is rebuilt around three category bands — Review profiles, External connectors, Agents included — each with a section heading and one sub-row per item, with a checkmark in every tier column that includes it.
The External connectors band carries a “Free: connect + preview only” sub-note so Free users see they can still run the auto-respond magic moment even though generating from connector data requires a paid tier.
Checkmarks render as styled checkmarks — they were appearing as escaped HTML text before the fix.
If your tier doesn't support the previously selected profile default, the Interview page automatically switches your selection to Normal.

Under the hood

A migration renames the stored profile values — no data loss, no manual step.
The Agents-included band reads from the agent role catalog at render time, sorted by pipeline position so the order matches the orchestrator's flow.

External Connectors — pull reference docs from SharePoint, OneDrive, and Google Drive

v0.13 · 2026-05-12 → 13

Connect a SharePoint site, a OneDrive folder, or a Google Drive folder to an interview and Otto summarizes the contents and feeds them into your spec as reference documents — without you typing a summary yourself.

What's new for you

Connect a folder or site during the interview; Otto runs a background summarization pass and posts a follow-up turn with what he found — no manual copy-paste.
If the background call fails, Otto posts a recovery turn explaining what happened and what to try next.
Attached Files shows a “via SharePoint,” “via OneDrive,” or “via Google Drive” badge on any file that came from a connector.
Generation Detail shows a “Used N references from [Provider]” chip so you can see what each connector contributed — available on REST and MCP as well.
AI agents now read scanned PDFs natively without pre-processing.
Dropbox is coming next.
Free accounts can connect a folder and watch Otto summarize it; generating a spec that uses connector-sourced references requires Pro or Team.

Under the hood

Provider adapters live behind a common connector abstraction so additional providers (Dropbox, others) plug in cleanly.
The OAuth flow for each provider is handled through dedicated REST endpoints; connector credentials are stored per-workspace, not per-user.
The premium gate is enforced at generation time: Free users see the summarization step, but the generation call is blocked with a clear tier explanation before any provider cost is incurred.

MCP-native positioning on the marketing site

v0.13 · 2026-05-13

The landing page now leads with SpecStep's programmable surface — a new hero strip and a Tools section that shows the specific tools AI coders call through MCP.

What's new for you

The hero now includes an MCP-native callout strip above the fold — alongside REST + OpenAPI, Webhooks, and “Same key everywhere” — so the programmable platform story is visible before you scroll.
A new Tools section shows the categories of tools available through MCP, giving AI coders a concrete sense of what they can automate.
The sitemap is now generated dynamically and pings search indexes after each deploy so crawlers pick up changes faster.

Under the hood

Marketing HTML is served with cache headers that prevent crawlers from serving a stale deploy to users following a link.

Webhook subscriptions land in Settings

v0.12 · 2026-05-12

Manage your webhook subscriptions directly from the browser — no more calling the REST API to register an endpoint. New Settings → Webhooks tab with an API-key picker, an inline create form, and per-row Test / Rotate / Delete actions.

What's new for you

New Settings tab lists every subscription registered against the API key you pick — URL, subscribed events, last delivery status + HTTP code, and a “needs rotation” warning if the signing secret needs to be refreshed.
Test fires a synthetic event against the destination and renders the live outcome inline — delivery time and HTTP status, or the failure reason if it didn't land.
Rotate issues a fresh signing secret, shown once with the same copy + “I've copied it” gate the API-key flow uses. The reveal modal includes a “How to verify the signature” expander covering the header name, the HMAC algorithm, and constant-time comparison.
Delete takes a single confirm and removes the subscription — future events for that endpoint stop firing immediately.
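
A receiving endpoint could verify a signed delivery along the lines below. This is a sketch under stated assumptions: the header name is a placeholder, and HMAC-SHA256 with a hex digest is assumed. The real header name and algorithm are the ones shown in the "How to verify the signature" expander.

```python
import hashlib
import hmac

SIGNATURE_HEADER = "X-Example-Signature"  # placeholder, not the real header name


def verify_signature(secret: bytes, body: bytes, headers: dict) -> bool:
    """Recompute the HMAC over the raw body and compare in constant time."""
    received = headers.get(SIGNATURE_HEADER, "")
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verify against the raw request bytes, before any JSON parsing, since re-serializing the payload can change whitespace and break the digest.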

Under the hood

Webhook create, list, delete, and rotate logic flows through a single Application service so the REST endpoints, MCP tools, and the new Settings page all share one path.

Generation detail polish — readable timestamps and cleaner now-playing

v0.12 · 2026-05-12

Small UI fixes on the generation detail and Interview pages based on user feedback. Timestamps switched from 24-hour to 12-hour with am/pm; the conversation feed's now-playing peek no longer looks like cards are stacking up underneath the active one; the Interview side panel reorders to keep the AI Team visible above the file list.

What's new for you

Generation header timestamps read “May 12 · 8:10pm” instead of “May 12 · 20:10”; the right-rail Started / Completed dates read “5/12/26 8:10pm” instead of “2026-05-12 20:10”.
The conversation feed bubble's per-turn timestamp also flips to 12-hour lowercase am/pm.
The conversation feed's now-playing peek card leans to the bottom-left and auto-fades after ~1.5 seconds so previous activity stops visibly stacking under the active card.
Interview side panel: AI team now sits above Attached files so the file list doesn't push the AI team out of view.
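
The two timestamp shapes above can be reproduced with a few lines of formatting code. The helper names are made up for this sketch, and it hardcodes English month abbreviations rather than following the browser locale.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def clock(ts: datetime) -> str:
    """12-hour time with lowercase am/pm and no leading zero on the hour."""
    hour = ts.hour % 12 or 12
    suffix = "am" if ts.hour < 12 else "pm"
    return f"{hour}:{ts.minute:02d}{suffix}"


def header_stamp(ts: datetime) -> str:
    """Header style, month abbreviation and day followed by the time."""
    return f"{MONTHS[ts.month - 1]} {ts.day} · {clock(ts)}"


def rail_stamp(ts: datetime) -> str:
    """Right-rail style, numeric month/day/two-digit year followed by the time."""
    return f"{ts.month}/{ts.day}/{ts.year % 100:02d} {clock(ts)}"
```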

Independent security review — full sweep

v0.12 · 2026-05-12

An independent security review covered the codebase end-to-end. Authorization, webhook validation, and secret handling were hardened; the headline user-visible improvement is faster session revocation.

What's new for you

Revoking a sign-in session now takes effect within about 30 seconds instead of at the next sign-in.
API keys now carry per-key scopes — create a key scoped to the permissions you want and the server enforces the scope on every call. The Settings UI gained a scope picker so you can preview the scope set before issuing.
If you have unsigned terms-of-service updates, you'll see the acceptance prompt the next time you sign in; a stale cookie can no longer silently bypass it.
REST, MCP, and webhook error fields no longer surface raw provider exception messages — you get a sanitized, category-derived string instead.
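
Conceptually, per-key scope enforcement is a set comparison on every call. The sketch below shows the idea only: the scope strings and the exception name are hypothetical, since the changelog does not list the actual scope names.

```python
class ScopeError(PermissionError):
    """Raised when an API key lacks a scope the operation needs."""


def require_scopes(key_scopes, required):
    """Allow the call only if every required scope is present on the key."""
    missing = set(required) - set(key_scopes)
    if missing:
        raise ScopeError("missing scopes: " + ", ".join(sorted(missing)))
```

A key created with only a read scope would pass a read check and fail a write check, without regard to what the owning account is allowed to do.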

MCP surface expansion — 11 new tools

v0.12 · 2026-05-12

A surface review identified 11 high-value tools missing from the MCP surface. All 11 shipped, bringing the MCP tool count from 36 to 47.

What's new for you

Read change-request addenda from MCP — list_change_requests + get_change_request return the full addendum metadata plus a fresh download URL.
Pull a single intake artifact's full payload via get_intake_artifact; the list tool returns metadata only, while the new singular tool returns the full details for an in-flight or completed run.
Estimate a generation's cost before kicking it off via estimate_generation_cost(profile) and an addendum's cost via estimate_change_request_cost — both return the rolling 30-day median and p25/p75 from your historical runs.
List packages scoped to one generation via list_packages_for_generation(generation_id) — sister to the existing user-scoped list.
Manage webhook subscriptions from MCP — list_my_webhooks, create_webhook, delete_webhook, rotate_webhook_secret, and test_webhook which fires a synthetic event and returns the live delivery outcome.
Compare your own packages from MCP via compare_packages — returns identity verdict, per-package report, and cross-package report.

Performance pass

v0.12 · 2026-05-11 → 12

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Reliability fixes

v0.12 · 2026-05-11

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.12 polish & infrastructure

v0.12 · 2026-05-11 → 12

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Mid-flight recovery covers every pipeline state

v0.11 · 2026-05-10

A generation interrupted by a transient infrastructure issue now resumes cleanly from any pipeline state, not just the original three.

What's new for you

A generation that fails or restarts mid-flight resumes from wherever it was — no lost progress, no double-billing for work that already landed.
GitHub delivery retries are idempotent — a retry after a transient interruption won't push the same commit twice if the first attempt landed but the response was lost.

REST auth fixed and GitHub integration coverage

v0.11 · 2026-05-10

Two long-overdue cleanups: REST callers now get a proper JSON 401/403 on auth failure instead of an HTML redirect, and GitHub integration reliability improved with expanded automated coverage.

What's new for you

REST API callers — CLI, MCP clients, programmatic integrations — get a 401 or 403 with a JSON body on auth failure instead of a redirect to an HTML page. Your client can now tell “wrong credentials” from “session expired” without parsing HTML.
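
A client can now branch on the status code and JSON body instead of sniffing HTML. The sketch below assumes an "error" field with the values "invalid_credentials" and "session_expired"; those names are hypothetical, so check the REST guide for the actual response shape.

```python
import json

# Hypothetical error codes mapped to what the client should do next.
ACTIONS = {
    "invalid_credentials": "check_api_key",
    "session_expired": "sign_in_again",
}


def next_step(status: int, body: str) -> str:
    """Decide how to react to a response from the REST API."""
    if status not in (401, 403):
        return "no_auth_problem"
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # not JSON; fall back to the status code alone
    code = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(code, str) and code in ACTIONS:
        return ACTIONS[code]
    return "forbidden" if status == 403 else "unauthenticated"
```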

Test-automation reviewer pass

v0.11 · 2026-05-10

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Addenda are first-class generation rows

v0.11 · 2026-05-10

Addenda now run through the same dispatcher as full generations, with full lease, heartbeat, and sweep coverage alongside them.

What's new for you

Addendum POST returns the addendum's full shape — ID, download URL, cost — once the worker finishes. No more polling the page for status.
Expand a package row in the workspace to see all addenda attached to it without opening the detail page.
MCP list and search calls now show only the calling actor's own resources.

Intake-name extraction fixed and visual baselines committed

v0.11 · 2026-05-10

Two long-deferred items closed: new generations name themselves correctly from intake data, and visual-baseline end-to-end tests went from “feature exists but no baselines committed” to “18 baselines reviewed, committed, and gated in CI.”

What's new for you

New generations pick up their name from the intake — project names show through cleanly instead of falling back to “(unnamed).” Existing rows still show the old name; the rename pencil on the detail page handles those.

v0.11 polish & infrastructure

v0.11 · 2026-05-10

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Workspace tables get pagination, sort, and UX cleanup

v0.10 · 2026-05-09

The Generations and Packages tables in the workspace are now paginated and sortable. A batch of smaller paper cuts — broken date format, mismatched icon, missing project name in toast headlines — got fixed in the same pass.

What's new for you

Both tables paginate — default 5 rows per page, dropdown for 10, 25, 50, 100, or All; your choice persists across reloads.
Click any column header to sort ascending, descending, or off; the active column shows a directional glyph.
Inline rename on the generation detail page — click the pencil next to the title to rename in place, matching the workspace row pattern.
Failed-generation toast renders in red instead of the same color as a completed run, and the headline leads with the project name.
The package “Created” column reads in a human date format instead of ISO.
“Request a change” button in the packages row matches the Download and Delete chrome — a single icon button.
Detailed agent narration restored in the in-flight feed — specific rationale and confidence scores, not just a summary verb.
Package cost on new runs reflects the real per-agent invocation sum — was showing zero before.
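
The paging and three-state sort behavior described above can be modeled in a few lines. This is an illustrative sketch, not the workspace's actual code; the function names are invented, and a page size of None stands in for the "All" option.

```python
SORT_CYCLE = [None, "asc", "desc"]  # off -> ascending -> descending -> off


def next_sort_state(state):
    """Advance the sort state the way repeated header clicks would."""
    return SORT_CYCLE[(SORT_CYCLE.index(state) + 1) % len(SORT_CYCLE)]


def page_rows(rows, page, page_size=5, sort_key=None, sort_state=None):
    """Return one page of rows; page is 1-based, page_size=None means All."""
    if sort_key is not None and sort_state is not None:
        rows = sorted(rows, key=lambda row: row[sort_key],
                      reverse=(sort_state == "desc"))
    if page_size is None:
        return list(rows)
    start = (page - 1) * page_size
    return list(rows[start:start + page_size])
```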

Security review sweep — across every project

v0.10 · 2026-05-09

Independent security review completed; authorization, webhook validation, and secret handling hardened across every project in the codebase.

Reliability and notifications backlog burn-down

v0.10 · 2026-05-09

A focused pass closed 30+ backlog items, mostly in the notifications, source-control, and authorization layers.

What's new for you

Notifications are exactly-once on retried webhook events — duplicate event IDs no longer fire a second inbox row.
A failed notification can retry after a transient channel failure — the orchestrator no longer locks a failed row permanently.
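
Exactly-once delivery on retried events usually comes down to remembering which event IDs have already been handled. A minimal sketch of that idea follows, with an invented Inbox class and an in-memory set standing in for whatever durable store the platform actually uses.

```python
class Inbox:
    """Illustrative inbox that ignores events it has already recorded."""

    def __init__(self) -> None:
        self._seen = set()
        self.rows = []

    def deliver(self, event_id: str, message: str) -> bool:
        """Add one row per event id; return False for a duplicate delivery."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        self.rows.append(message)
        return True
```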

Addendum reliability and GitHub adapter coverage

v0.10 · 2026-05-09

A second-pass review tackled specific reliability and correctness items — largest themes: addendum reliability, GitHub adapter test coverage, and aggregate cost on the package wire shape.

What's new for you

The package now shows an aggregate cost covering the original generation plus every addendum — you can see what the package actually cost end-to-end.
Addendum content now renders as formatted Markdown, with no more raw-text headings on the page.
The recommender retries with an explicit “you missed these required fields” callout when its first response is incomplete.

v0.10 polish & infrastructure

v0.10 · 2026-05-08

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Change addenda for completed packages

v0.9 · 2026-05-08

Filing a focused change against a completed package no longer requires a full re-generation. An addendum is a single-LLM-call flow that produces a 5-section markdown bundle.

What's new for you

Every completed package row in the workspace has a “Request a change” button. Pick the mode (addendum or full re-generation), describe the change, submit.
Each addendum produces a 5-file zip (background, change requirement, implementation guide, test plan, and a decision-log entry), attached as a sibling artifact with no version bump.
The generation detail page gained a “Change addenda” section listing every addendum filed against the package, newest first, with per-row download.
The workspace package row shows an addendum count next to the version when at least one addendum is attached.
The notification bell surfaces “Change addendum ready” on completion, linking directly to the parent generation.

Under the hood

Three new REST endpoints for addenda — create, list, and per-addendum download — plus a new MCP tool request_change mirroring the same shape.

Clickwrap terms acceptance and AI-output disclaimers

v0.9 · 2026-05-08

Sign-in now gates on an explicit terms-acceptance checkbox, and the terms themselves gained new sections covering AI-generated output and preview-edition expectations.

What's new for you

The landing-page sign-in card requires an “I agree to the Terms and Privacy Policy” checkbox before the OAuth buttons become active.
Three new terms sections cover AI-output limitations and your validation responsibility, preview-edition expectations (no SLA, data may move), and warranty and liability disclosures.
Users who hit a permission-gated route now see a proper access-denied page; that route previously led to a broken 404.

Conversation feed — now-playing stack with diff chips

v0.9 · 2026-05-08

The live agent feed during a generation is now a single “now-playing” card stack instead of a scrolling list. Every action shows a +N/−N chip telling you exactly how much content the agent produced or removed.

What's new for you

While a generation runs, the feed centers the active agent's card and fades the previous one behind it — no more rapid-fire scrolling that was hard to follow.
Each completed action carries a green +N / red −N chip showing lines added or removed.
An expander shows the full prior history when you want it; the default view stays focused on what's happening now.
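
One plausible way to derive the added and removed counts for a chip is a line-level diff of the content before and after an action. The sketch below uses Python's difflib for that. It is an assumption about the approach, not a description of how the feed computes its numbers.

```python
import difflib


def diff_counts(before: str, after: str):
    """Return (lines_added, lines_removed) between two versions of a text."""
    old, new = before.splitlines(), after.splitlines()
    added = removed = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, removed
```

A changed line counts once as removed and once as added, which matches how most diff summaries report edits.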

Search across all your packages

v0.9 · 2026-05-08

A search box above the workspace package list now searches inside every package you own in one query — file contents, not just project names.

What's new for you

The new search field returns ranked file hits across every package; each hit shows a highlighted snippet with matched terms.
Quoted phrases, OR alternation, and term exclusion all work — standard web-search syntax.
Per-package search and cross-package search are both available as REST endpoints and as MCP tools, so agents can find a package by content in one round trip.
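
To make the query syntax concrete, here is a small parser and matcher for quoted phrases, OR, and exclusion. It assumes a leading minus sign marks an excluded term and uses plain substring matching, which is a simplification. The real search is ranked and may tokenize differently.

```python
import re

TOKEN = re.compile(r'(-?)"([^"]+)"|(\S+)')


def parse_query(query: str):
    """Return (groups, excluded); each group is a list of OR alternatives."""
    groups, excluded = [], []
    join_next = False
    for neg, phrase, word in TOKEN.findall(query):
        if word == "OR":
            join_next = bool(groups)
            continue
        if word.startswith("-") and len(word) > 1:
            neg, word = "-", word[1:]
        term = (phrase or word).lower()
        if neg:
            excluded.append(term)
        elif join_next:
            groups[-1].append(term)  # alternative to the previous term
        else:
            groups.append([term])
        join_next = False
    return groups, excluded


def matches(text: str, groups, excluded) -> bool:
    """True if every group has a hit and no excluded term appears."""
    haystack = text.lower()
    return (all(any(term in haystack for term in group) for group in groups)
            and not any(term in haystack for term in excluded))
```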

MCP gap closure — lifecycle, files, capabilities, intake artifacts

v0.9 · 2026-05-08

Eight new MCP tools close the recovery and discovery gaps for AI agents driving SpecStep — lifecycle controls, per-file access, capabilities discovery, and intake-artifact listing.

What's new for you

cancel_generation, retry_generation, pause_generation, resume_generation — an agent observing a stuck or runaway run can bail out cleanly without re-deriving the intake.
list_package_files + get_package_file — inspect package structure and read individual files without downloading the full zip.
get_capabilities — discover valid review-profile names, project types, and schema versions before constructing a kickoff.
list_intake_artifacts — find ready-to-generate intakes without filtering inline.
update_generation_name + get_latest_package_for_generation — small metadata tools that close the agent-side ergonomics gap.

Auto-filed bug reports with diagnostic context

v0.9 · 2026-05-08

When a generation fails for a platform reason, SpecStep now auto-files a bug report with diagnostic context — no waiting for a user to notice and report.

What's new for you

System-detected failed runs can now auto-file a bug report for the team.
The failure card now says “We've automatically reported this to our team — you don't need to file a separate ticket,” so users aren't left wondering whether anyone knows.

v0.9 polish & infrastructure

v0.9 · 2026-05-07

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Lyra HTML mockups in chat and packages

v0.8 · 2026-05-07

Lyra now sketches HTML/CSS mockups inline in the interview chat and drops them into your final package zip — visual scaffolding you can open in a browser, share, and iterate from.

What's new for you

Ask Otto for UI work and Lyra produces real HTML/CSS mockups inline — sandboxed preview, click to expand into a lightbox.
Download from the lightbox to save the mockup as a standalone file you can open, share, or hand to a designer.
Mockups also land in your generated package under the design directory so your build agent can read them alongside the spec.
Upload screenshots or design references during the interview — Lyra sees those images when drafting mockups, so the proposed UI matches what you uploaded.

Two more specialists — Marc and Trip

v0.8 · 2026-05-07

Two more specialists join the AI team — Marc for industry-specific context and Trip for user-journey rigor — bringing the roster from 22 to 24 agents.

What's new for you

Marc (Business Analyst) covers domain models, regulatory landscape, comparable products, and customer-journey patterns for industry-specific projects. Downstream architecture agents read Marc's output when picking the stack.
Trip (UX Researcher / Workflow Analyst) joins on user-facing work — user journeys, task flows, edge cases, empty states, and accessibility-adjacent workflow issues.
The marketing site now shows the live specialist count, driven from the agent catalog — future additions update everywhere automatically.

Soft-delete, restore, and a Recycle Bin

v0.8 · 2026-05-07

Every interview, generation, and package is now reversible — soft-delete from the web app, the REST API, or MCP; restore from a Recycle Bin in Settings; and a 10-second Undo toast catches the common case before you have to dig.

What's new for you

Delete an interview, generation, or package from its detail page — allowed in any terminal state.
After every delete, a corner toast appears with an Undo button for ~10 seconds — one click and the row comes back.
Recycle Bin under Settings → Recycle Bin lists your own soft-deleted rows across interviews, generations, and packages; Restore returns a row to your workspace.
Soft-delete and restore are also available via the REST API and MCP — agents can delete and restore items from the same session where they're working.
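
The general soft-delete pattern is to stamp a row with a deletion time instead of removing it, then filter on that stamp. The sketch below illustrates the pattern with an invented in-memory class. It says nothing about SpecStep's actual schema.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


class SoftDeleteStore:
    """Illustrative store where delete only marks a row as deleted."""

    def __init__(self) -> None:
        self._deleted_at: Dict[str, Optional[datetime]] = {}

    def add(self, row_id: str) -> None:
        self._deleted_at[row_id] = None

    def delete(self, row_id: str) -> None:
        if row_id not in self._deleted_at:
            raise KeyError(row_id)
        self._deleted_at[row_id] = datetime.now(timezone.utc)

    def restore(self, row_id: str) -> None:
        if row_id not in self._deleted_at:
            raise KeyError(row_id)
        self._deleted_at[row_id] = None

    def workspace(self) -> List[str]:
        """Rows a normal listing would show."""
        return [r for r, ts in self._deleted_at.items() if ts is None]

    def recycle_bin(self) -> List[str]:
        """Rows that were deleted and can still be restored."""
        return [r for r, ts in self._deleted_at.items() if ts is not None]
```

An Undo toast and a Recycle Bin restore can both call the same restore operation; they differ only in how soon after the delete they are offered.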

Marketing and app feel like one product

v0.8 · 2026-05-07

The signed-in app now feels like the same product as the marketing site — same wordmark, same nav character, same chrome treatment, dark mode working end-to-end.

What's new for you

Dark mode now works end-to-end on the marketing site — backgrounds, cards, the orchestration timeline, and the “What you get” file tree all flip cleanly.
The signed-in app's topbar adopted the marketing site's editorial treatment: sticky, blurred backdrop, same lockup wordmark, same nav-link style. Account chrome — notification bell, avatar, plan badge — is preserved.
The full marketing nav (How it works / What you get / Meet the team / Pricing / About / FAQ / API docs / Support / Contact) is present in the signed-in topbar — you no longer hit a dead end looking for marketing pages.
The page heading no longer shows a stray focus outline after you click somewhere — screen readers still announce navigation normally.

Landing-page content evolution

v0.8 · 2026-05-07

A round of landing-page updates — fresher copy, an expanded orchestration timeline, and full section nav.

What's new for you

The top nav adds How it works, What you get, and Meet the team links so visitors can jump directly to any section.
Hero eyebrow updated to “Experts On Demand. 24 specialists, 1 conversation.”
The How it works orchestration timeline now shows all 24 agents and extends the sample conversation with new turns from Marc, Trip, Codd, Atlas, and Tally.
The “What you get” file tree adds the new design/mockups directory that Lyra writes; the third card now mentions Lyra mockups, Merlin, and Polo.

Account Tiers and Generation Types

v0.8 · 2026-05-07

Two new administrative surfaces — one for managing which subscription plans unlock which generation profiles, and one for editing per-profile labels in real time.

What's new for you

Each profile's display name and description are editable in real time, so changes show up in the picker without a deploy.
The interview profile picker reflects label changes within about 60 seconds.

Per-profile cost in My Analytics

v0.8 · 2026-05-07

My Analytics now breaks your monthly spend down by generation profile so you can see where the dollars actually go.

What's new for you

New “By profile” panel on Settings → My Analytics shows your average cost per generation across Fast, Normal, Extensive, and Researcher for the trailing window.
The same data is available via the REST API for anyone building an external dashboard.

Preview-mode user approval gate

v0.8 · 2026-05-07

While SpecStep is in preview, new accounts can complete an interview but generation kickoff stays disabled until a SpecStep admin approves the account.

What's new for you

New users land in the workspace, can run interviews, and see the catalog — but the Start generation button is disabled with a “your account is pending approval” notice until an admin approves it.
All accounts that existed before this change were automatically approved — no disruption to existing users.

Tighter acceptance criteria

v0.8 · 2026-05-07

The Architect now drafts acceptance criteria with machine-verifiable Then clauses, and the Critic flags subjective language as blocking before the package ships.

What's new for you

Acceptance criteria in your generated requirements now read as Given/When/Then where each Then clause is something a test or integration check can verify — “returns 204 within 200 ms,” not “is fast.”
If a Then clause uses subjective language (“user-friendly,” “intuitive,” “fast”), the Critic flags it as blocking and the Architect re-drafts before the package ships.
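
A toy version of the subjective-language check might look like this. The word list contains only the three examples quoted above, and the function name is invented. The Critic is an AI agent, so its real judgment is certainly broader than a keyword match.

```python
import re

# Only the examples from the release note; a real list would be longer.
SUBJECTIVE_TERMS = {"user-friendly", "intuitive", "fast"}


def blocking_terms(then_clause: str):
    """Return the subjective terms found in a Then clause, sorted."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", then_clause.lower())
    return sorted(SUBJECTIVE_TERMS.intersection(words))
```

An empty result means the clause passes this check. Matching whole words keeps "breakfast" from tripping on "fast".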

Orchestrator and LLM reliability

v0.8 · 2026-05-06

Three reliability fixes cleared a class of “this used to fail mysteriously” production issues.

What's new for you

A generation that was in progress when the host restarted now auto-resumes from the last checkpoint instead of staying stuck; the retry-from-interview button is now a fallback rather than the primary path.
Cancel and Delete on the generation details page work correctly again — a regression had been swallowing both actions silently.
When the primary model returns an incomplete stack recommendation, the orchestrator now retries with a secondary model before failing the run.
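
The fallback behavior in the last item can be pictured as trying each model in order until one returns every required field. This sketch assumes models are plain callables returning dictionaries, and the field names in the usage are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Model = Callable[[str], Dict[str, str]]


def recommend_with_fallback(models: Sequence[Model], prompt: str,
                            required: Sequence[str]) -> Dict[str, str]:
    """Return the first complete recommendation, trying models in order."""
    missing: List[str] = list(required)
    for call_model in models:
        result = call_model(prompt)
        missing = [field for field in required if not result.get(field)]
        if not missing:
            return result
    # Every model was tried and the last answer was still incomplete.
    raise RuntimeError("incomplete recommendation, missing: " + ", ".join(missing))
```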

API docs cover the Recycle Bin

v0.8 · 2026-05-07

The public API docs at /api-docs/rest and /api-docs/mcp now document the soft-delete, restore, and Recycle Bin endpoints and tools introduced this release.

What's new for you

The REST guide covers all user-facing restore and deleted-list endpoints, including the status codes returned for each operation.
Per-entity DELETE behavior is documented for interviews, generations, and packages — including what happens when you try to delete an active generation.
New MCP tool entries documented: delete_interview, restore_interview, delete_generation, restore_generation, and the full update_package tool with all three operations.

Bug reports from anywhere

v0.8 · 2026-05-06

File a bug from the browser, the REST API, or the MCP tool surface — every report lands in the same queue with the same context attached.

What's new for you

The Submit-a-ticket form's “Bug” category now files a real bug report — same queue as the API and MCP paths.
API callers can submit via POST /v1/bug-reports and read their own submissions back by list or by ID.
MCP-capable agents get submit_bug_report, list_my_bug_reports, and get_bug_report — an agent can file the bug from the same session where it noticed the problem.
Every submission automatically carries your account name, plan tier, build version, and the AI tool you're using — triage gets the context without a follow-up.

More MCP introspection

v0.8 · 2026-05-06

The MCP tool surface grew from 12 to 19 tools — agents can now list and fetch interviews, list generations in any state, and read bug reports they filed.

What's new for you

New list_interviews and get_interview MCP tools mirror the REST list/by-ID pattern, with a matching REST endpoint.
New list_generations covers in-flight, paused, failed, and cancelled runs — list_packages only sees runs that produced a package, so this fills the gap.
list_packages gained limit, offset, and order arguments, plus a generation_state field on every row so you can tell clean completions from partial or failed runs.
get_status was renamed get_generation and its response now includes progress percentage, failure category, and a cost forecast with p25/p75 confidence bounds.
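
With limit and offset available, an agent can walk the full package list page by page. The helper below is a sketch that takes any callable accepting limit and offset keyword arguments. The "completed" state value is an assumption, since the changelog names the generation_state field but not its values.

```python
from typing import Callable, Dict, Iterator, List

Page = List[Dict[str, str]]


def iter_packages(list_packages: Callable[..., Page],
                  limit: int = 50) -> Iterator[Dict[str, str]]:
    """Yield every row by requesting successive pages until one comes up short."""
    offset = 0
    while True:
        page = list_packages(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit


def clean_completions(rows):
    """Keep rows whose generation_state marks a clean completion (assumed value)."""
    return [row for row in rows if row.get("generation_state") == "completed"]
```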

Under the hood

The get_status → get_generation rename is a breaking change; callers using the old name need to update.

Reliability and concurrency hardening

v0.8 · 2026-05-06

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Workspace and marketing polish

v0.8 · 2026-05-06

A round of UI fixes: the notification bell opens on hover, the marketing sign-in flow works again, three more accessibility contrast gaps closed, and action icons show hover tooltips.

What's new for you

The notification bell opens on mouse-hover instead of requiring a click; a hover bridge keeps the dropdown open as your cursor moves to it. Clicking an item still navigates straight to its source.
The marketing site's “Sign in” call-to-action works correctly again — a regression had been swallowing the click.
Cancel is reachable from a generation paused awaiting clarification — previously only Resume and Answer were shown.
Three more contrast violations closed: agent thumbnail accents in the How It Works legend, accent text in dark mode, and agent accents in the Beats section all meet WCAG 2.1 AA.
The eight new specialist agents now have headshots and loader animations on the marketing Meet the Team section.

Eight new specialist agents

v0.8 · 2026-05-06

SpecStep's roster grows from 14 to 22 agents — eight new specialists for reliability, data, accessibility, localization, AI/ML, compliance, cost, and risk. Each runs only when your project actually needs it, based on signals Otto picks up during the interview.

What's new for you

Atlas (Reliability) drafts the operational story — SLOs, alert catalog, capacity plan, RPO/RTO, on-call playbook stubs.
Codd (Data) designs schema with constraints, an index strategy per query path, online-migration patterns, and retention tied to the privacy story.
Halo (Accessibility) writes the WCAG 2.1 AA contract for UI projects: keyboard maps, screen-reader expectations, focus-visible rules, color-contrast minimums.
Polo (Localization) scopes the i18n plan: locale catalog, pluralization, RTL, date/number/currency formatting, translation flow, and the fallback chain.
Merlin (Prompt Engineer) tunes AI features: model selection rationale, prompt-injection defenses, eval strategy, and cost ceilings.
Reg (Compliance) maps your SOC 2, HIPAA, PCI-DSS, or GDPR posture into specific controls and audit evidence — consultation-only, with the standard AI-isn't-legal-advice disclosure.
Tally (Cost) puts numbers on build hours, run-rate, build-vs-buy decisions, and budget alarms.
Hazard (Risk) compiles the risk register: schedule, vendor lock-in, regulatory drift, capacity, team, and the AI-coder traps to watch for.
Vera (Test Automation) is now available on the Free tier — every project gets a real testing strategy regardless of plan.
Each agent has a dedicated page at /agents/{slug} with bio, tagline, and accent color.

Soft-delete generations from the workspace

v0.8 · 2026-05-06

Delete a finished generation from your workspace listing without losing the audit trail.

What's new for you

A delete button appears on every terminal-state generation row (Complete, Failed, Cancelled) — click to remove it from your workspace view.
The same delete affordance is available from the Generation Details page.
The row drops out of every workspace view and spend tile; the underlying record is retained for audit and retention purposes.

Marketing accessibility pass

v0.8 · 2026-05-06

WCAG 2.1 AA contrast pass on the marketing site, driven by an automated accessibility audit.

What's new for you

The hero headline's accent color now meets WCAG AA contrast against the page background.
The pitch section's numbered badges meet contrast minimums on every background tone.
Dimmed agent thumbnails in the How It Works legend stay legible at their reduced opacity instead of fading into the page.

Mid-generation clarifications

v0.7 · 2026-05-05 → 06

When an agent realizes mid-generation that it's missing context it can't reasonably guess at, it now pauses the run and asks you — in the original interview chat — rather than guessing wrong or failing the whole package.

What's new for you

The Recommender, Architect, and DesignerCritic can pause a generation with a specific question when the spec is missing critical detail.
Paused generations show an “Answer required” card on the details page with the questions inline and a one-click jump back into the interview chat.
Workspace rows for blocked generations link straight to the interview in warning-toned styling — so they stand out from in-flight rows at a glance.
Answers feed back into the resumed run; agents re-draft the originally-stuck section with the missing context filled in.
API and MCP callers get a structured surface: GET /v1/generations/{id}/clarifications, POST .../clarifications/answers, and the answer_clarifications MCP tool.

Marketing site rewrite

v0.7 · 2026-05-05

New look for every public page — new typography, new palette, new hero, a How-It-Works timeline, and a Meet-the-Team section with animated agent loaders.

What's new for you

Refreshed homepage with a 60-second pitch, a live-feeling How-It-Works timeline, an annotated .specstep/ file tree, and a 14-agent roster with click-to-expand bios.
Each agent has its own accent color and animated full-body loader on the team detail modal — Otto cyan, Stax blue, Alan purple, Lyra pink, and so on.
Pricing, About, Privacy, Terms, FAQ, Contact, Support, and per-agent pages all restyled to match.
New brand lockups, favicons, and Open Graph cards.

Live agent conversation feed

v0.7 · 2026-05-05 → 06

The Generation Details page and workspace in-flight rows now feel alive while a generation is running — you see agents check in one at a time, cost tick up as new invocations land, and a real progress bar advance.

What's new for you

Agent-by-agent conversation feed on the Generation Details page: chat-bubble entries with the agent's name and accent color, a one-line action verb, a longer narration, duration, and a running cost total.
Workspace rows for in-flight generations show the latest two agent turns inline, a thin progress bar, and an animated cost counter.
The cost row on the details page shows a running total alongside a historical-median estimated range — so you have a sense of where this generation is likely to land.
An “Active agent” indicator shows who's working on your generation right now, with a pulsing live dot.
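
As a rough illustration of how an estimated range can be derived from history, the sketch below returns the median of past generation costs bracketed by the first and third quartiles. The use of quartiles is an assumption made for this example; these notes do not specify how the displayed range is computed.

```python
import statistics
from typing import Sequence, Tuple


def estimated_cost_range(past_costs: Sequence[float]) -> Tuple[float, float, float]:
    """Return (low, median, high) for a list of historical costs.

    Assumption for this sketch: low and high are the first and third
    quartiles, so the range covers the middle half of past runs.
    """
    if len(past_costs) < 2:
        raise ValueError("need at least two historical costs")
    low, median, high = statistics.quantiles(past_costs, n=4, method="inclusive")
    return low, median, high
```

A range like this is only as good as the history behind it, so it works best when the past runs share a project type and review profile with the current one.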

Pipeline reliability and cost realism

v0.7 · 2026-05-04 → 06

Parallel pipeline execution and a series of efficiency improvements — generations now finish faster, fail more cleanly, and cost what they actually cost.

What's new for you

Faster generations: the Architect drafts sections in parallel waves; fresh-eyes review runs concurrently for the Extensive profile; DesignerCritic runs alongside the Recommender on UI projects.
Setup-time failures — auth gate, quota, and similar — now persist as Failed with a clear reason instead of silently sticking in Queued.
The Critic now re-drafts only the sections it flagged instead of the whole package — meaningfully more efficient on every run.
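
The idea behind re-drafting only flagged sections can be sketched as follows. The function and parameter names are hypothetical, and the `redraft` callback stands in for whatever the pipeline actually does; this is not the Critic's implementation.

```python
from typing import Callable, Dict, Iterable


def redraft_flagged(
    sections: Dict[str, str],
    flagged: Iterable[str],
    redraft: Callable[[str, str], str],
) -> Dict[str, str]:
    """Return a new package where only flagged sections are re-drafted.

    Unflagged sections are carried over unchanged, so the cost of a
    review round scales with the number of flagged sections.
    """
    flagged_names = set(flagged)
    return {
        name: redraft(name, text) if name in flagged_names else text
        for name, text in sections.items()
    }
```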

Workspace and API surface polish

v0.7 · 2026-05-05

Smaller improvements you'll feel right away: better spend visibility, project names that flow through every read API, and the right destination when you click into a generation.

What's new for you

Two new spend tiles on the workspace — rolling 7-day and 30-day — alongside the existing 24-hour tile, each with an all-time total beneath it.
Clicking a workspace row now opens the full Generation Details page directly instead of a small overlay that duplicated the same content.
Project name, description, and a stable “specification package” kind label are now on every GET /v1/generations/{id} response and the equivalent MCP responses — so external tools can show what a generation is about without parsing the intake.
The workspace's per-row Download button now resolves the correct package ID server-side; it previously returned a 404 because of a path mismatch.
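
An external tool might read the new project fields from a GET /v1/generations/{id} response along these lines. This is a sketch only: the JSON key names (`id`, `project_name`, `project_description`, `kind`) are assumptions for illustration, so check the API reference for the real field names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSummary:
    generation_id: str
    project_name: str
    project_description: str
    kind: str


def summarize_generation(response_body: str) -> GenerationSummary:
    """Build a display summary from a generation response body.

    All key names below are hypothetical. Missing optional fields fall
    back to placeholders so older payloads still render.
    """
    data = json.loads(response_body)
    return GenerationSummary(
        generation_id=str(data["id"]),
        project_name=data.get("project_name", "(unnamed project)"),
        project_description=data.get("project_description", ""),
        kind=data.get("kind", "specification package"),
    )
```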

Mid-flight recovery for generations

v0.7 · 2026-05-04

A generation that loses its host process or dispatcher worker mid-flight now resumes on the next host rather than silently stalling in Queued or vanishing entirely.

What's new for you

Generations whose host restarted while running now pick up where they left off instead of needing a manual retry.
The stuck-generation sweep that auto-fails abandoned rows now fires after 10 minutes, down from 30.
The workspace silently retries once when a system-caused failure happened within the first minute — you only see the error if it happens twice.
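
The two timing rules above can be expressed as a pair of small predicates. This is a sketch under stated assumptions: the names are hypothetical, and the thresholds are taken from the bullets above (a 10-minute sweep, a one-minute window, and a single silent retry).

```python
from datetime import datetime, timedelta

STUCK_AFTER = timedelta(minutes=10)          # sweep threshold, down from 30
SILENT_RETRY_WINDOW = timedelta(minutes=1)   # failures inside this window may retry
MAX_SILENT_RETRIES = 1


def is_stuck(last_activity: datetime, now: datetime) -> bool:
    """True when a generation has been idle long enough to be auto-failed."""
    return now - last_activity >= STUCK_AFTER


def should_silently_retry(
    system_caused: bool, time_to_failure: timedelta, retries_so_far: int
) -> bool:
    """True when a failure qualifies for the one silent retry."""
    return (
        system_caused
        and time_to_failure < SILENT_RETRY_WINDOW
        and retries_so_far < MAX_SILENT_RETRIES
    )
```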

Failure classification and retry UX

v0.7 · 2026-05-04

When a generation fails, the workspace tells you what kind of failure it was — in plain English — and gives you a one-click Retry instead of a generic error.

What's new for you

A failed generation row shows a short, plain-English reason with a “Show technical details” toggle for the full message.
An inline Retry button re-runs the same intake at the same profile in one click.
System-caused failures with a queue time under a minute auto-retry once, silently — you only see the error if it happens twice.
A friendly 401/403 page replaces the blank framework response when an unauthenticated or unauthorized request hits a protected page.
Review-budget exhaustion — the Critic ran out of rounds with blocking issues still open — gets its own surface with the issue summary and suggested next actions, not a generic error.

Internal observability improvements

v0.7 · 2026-05-04

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Tab-as-page routing for Settings and Billing

v0.7 · 2026-05-03 → 04

Each tab inside Settings and Billing is now its own routable page — deep links work, the back button works, and breadcrumbs anchor you in the navigation hierarchy.

What's new for you

Bookmark /settings/notifications directly — each tab is a real URL, not a query string.
The browser back button moves between tabs the way you'd expect.
Each section page shows breadcrumbs above the header so you always know where you are.

Public API documentation

v0.7 · 2026-05-03 → 04

A new /api-docs section on the marketing site documents every REST endpoint, error code, failure category, real-time contract, and MCP tool — with an OpenAPI spec auto-generated at build time.

What's new for you

Read the full REST API reference at /api-docs/rest before you write integration code.
Every endpoint — including retry, lifecycle controls (pause, resume, cancel, rename), and the status family — is documented with request and response examples.
The failure_category field on the generation response is documented alongside the additive contract that protects existing integrations.
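
One way to stay compatible with an additive contract is to treat failure_category as an open set and fall back gracefully on values your integration does not recognize. The sketch below assumes that reading of "additive"; the category values shown are hypothetical, not the documented list.

```python
from typing import Mapping, Optional

# Hypothetical category values for illustration only.
KNOWN_MESSAGES = {
    "system": "Something went wrong on our side. Retrying usually helps.",
    "quota": "Your plan's quota was reached.",
    "review_budget_exhausted": "Review ended with blocking issues still open.",
}

FALLBACK_MESSAGE = "The generation failed. See the technical details."


def describe_failure(generation: Mapping[str, object]) -> Optional[str]:
    """Return a user-facing message, or None if the generation did not fail.

    Unknown categories map to a generic message instead of raising, so a
    newly added category does not break this client.
    """
    category = generation.get("failure_category")
    if category is None:
        return None
    return KNOWN_MESSAGES.get(str(category), FALLBACK_MESSAGE)
```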

Service status and uptime page

v0.7 · 2026-05-03

A public /status page shows the current state of every part of the SpecStep platform, with a 90-day uptime history and an incident workflow.

What's new for you

/status shows green/yellow/red live, with the last 90 days of uptime as a daily strip.
Subscribe to status updates by email; verify in one click, unsubscribe in one click.
A history view at /status/history shows every past incident with its full timeline.

v0.7 infrastructure & polish

v0.7 · 2026-05-03

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Security review hardening

v0.6 · 2026-05-03

Independent security review completed; authorization, webhook validation, and secret handling hardened.

Independent review sweep

v0.6 · 2026-05-03

Independent security review completed; correctness and security fixes landed across every layer of the codebase.

v0.6 · 2026-05-03

A short polish pass cleared color-contrast and form-label failures and tightened a handful of resilience edges — rate limiter, splash screen, and the 429 page.

What's new for you

The persona dropdown has a proper form label; the notification bell announces its role to screen readers.
The rate limiter no longer throttles static assets; 429 responses now render a friendly page instead of a blank framework response.
The splash screen waits for the real-time circuit to be ready before dismissing — no more blank page on slow connections.
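
The rate-limiter change can be pictured as a sliding-window limiter that simply skips static paths. This is a minimal sketch, not the platform's limiter: the path prefixes, class name, and limits are all assumptions chosen for the example.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical static-asset prefixes; a real app would use its own.
STATIC_PREFIXES = ("/css/", "/js/", "/images/")


class RateLimiter:
    """Sliding-window limiter that never throttles static assets."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits: DefaultDict[str, List[float]] = defaultdict(list)

    def allow(self, client_id: str, path: str) -> bool:
        """Return False when the caller should receive a 429 response."""
        if path.startswith(STATIC_PREFIXES):
            return True  # static assets bypass the limiter entirely
        now = self.clock()
        recent = [t for t in self._hits[client_id] if now - t < self.window_seconds]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._hits[client_id] = recent
        return allowed
```

Exempting static assets matters because a single page load can request dozens of files, which would otherwise use up a visitor's budget before they make a real request.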

Meet the Team

v0.6 · 2026-05-03

Each agent now has a public profile — accent color, mission, signal sources, and how they show up in your generation — reachable from a Meet the Team strip on the homepage.

What's new for you

The homepage Meet the Team strip links to individual pages for the Code Reviewer, Privacy Attorney, Security Expert, and every other agent.
Each agent page shows the accent color, public summary, project types it consults on, and how to reach it from REST or MCP.

Agent identity and per-agent detail pages

v0.6 · 2026-05-03

Each agent — Code Reviewer, Privacy Attorney, Security Expert, and the rest — now has its own visual identity and a public profile.

What's new for you

Each agent displays its own accent color and curated summary, distinct from every other agent.
Admins can write or edit an agent's public summary and choose whether it appears on the homepage.

Interview UX redesign

v0.6 · 2026-05-03

The interview — where you talk through your idea with the AI Team — has been rebuilt to make the conversation easier to read and to show you what's been captured as it happens.

What's new for you

Agent turns and your replies are now visually distinct, so you can follow the thread at a glance.
A “Captured so far” panel surfaces the structured output the AI Team is building in real time.
An inline divider marks exactly when a new agent joins the conversation mid-interview.
The Cancel button now stops the AI call immediately — previously it set a flag and waited for the response to finish.
Each source reference appears as a labeled pill; the Researcher card spans the full row width.

Cancellation and reliability

v0.6 · 2026-05-03

Generations now have a real Cancel button, automatic recovery for stalls, and a live view of exactly what's happening while you wait.

What's new for you

Cancel a running generation at any time — it stops immediately and shows a Cancelled state.
The in-flight panel now shows the current stage, which agent is working, last activity time, and recent events.
Unlimited Access plans no longer hit the concurrency cap that applies to standard plans.

Site-wide design pass

v0.6 · 2026-05-03

Every major surface — Workspace, Interview, Marketing, Pricing, Landing, About/FAQ/Privacy, system pages, ticket form, and Generation Detail — went through a structured design and implementation pass.

What's new for you

Generation Detail page rebuilt with a metadata header, pipeline progress strip, and two-column layout.
Pricing page gains a Most Popular pill, a full comparison table, and explicit upgrade calls to action.
Landing page proof moves above the fold; an anatomy-of-package section shows exactly what ships in a generation.
The ticket form deflects to the FAQ before submitting, with category routing and pre-filled name and email.
Account Disabled and Billing Success pages have rewritten copy — accurate and appropriately toned for each moment.

Settings split into three surfaces

v0.6 · 2026-05-02

The single Settings page has been split into separate surfaces for personal preferences, administration, and billing — each with its own URL and sub-section navigation.

What's new for you

Personal preferences, administration controls, and billing are no longer on the same page — each lives at its own URL.
Sub-section navigation within each surface is consistent across all three areas.

v0.6 infrastructure & polish

v0.6 · 2026-05-02 → 03

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Run Comparison

v0.5 · 2026-05-02

Researcher mode fans a single idea into three parallel documentation runs, then scores and grades each result so you can see which configuration produces the best output before you commit.

What's new for you

Start a Researcher run from the Workspace to generate three side-by-side documentation packages from one idea.
A letter-grade scorecard in the Workspace shows which package won and why — build quality, content judgment, and near-duplicate detection all factored in.
Per-tier generation profiles replace a hardcoded limit table, so quota behavior is consistent across plans.

Security hardening

v0.5 · 2026-05-02

Independent security review completed; authorization, webhook validation, and secret handling hardened. Resource access checks were strengthened across every read path.

Security Expert agent

v0.5 · 2026-05-02

The AI Team gained a Security Expert that reviews your documentation package and produces a dedicated security-findings artifact.

What's new for you

A 04-security-review.md file now appears in every documentation package, covering findings the Security Expert surfaced during its review pass.
Important findings surface in your notification inbox — you don't need to go looking for them.

Architect resilience

v0.5 · 2026-05-02

The Architect agent no longer fails an entire generation when a single spec section hits a validation error.

What's new for you

Generations that previously stopped on a difficult section now complete, with low-confidence sections flagged in the package manifest rather than lost entirely.

Live generation progress

v0.5 · 2026-05-02

The Workspace progress bar now advances in real time as the Architect works through each section — no refresh needed.

What's new for you

The Workspace row label updates live — “Drafting · section 5 of 17 · est. ~12 min remaining” — so you know exactly where a generation stands.
Multiple open Workspace tabs stay in sync with each other automatically.
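
A label in the format shown above could be assembled like this. The formatting mirrors the example string; the time estimate, based on the average duration of completed sections, is an assumption for this sketch and may differ from how the Workspace actually estimates.

```python
import math


def estimate_minutes_remaining(elapsed_seconds: float, completed: int, total: int) -> int:
    """Estimate remaining minutes from the average time per completed section."""
    if completed <= 0:
        raise ValueError("need at least one completed section to estimate")
    average = elapsed_seconds / completed
    return math.ceil(average * (total - completed) / 60)


def progress_label(stage: str, current: int, total: int, minutes_remaining: int) -> str:
    """Format the row label, e.g. stage 'Drafting' at section 5 of 17."""
    return f"{stage} · section {current} of {total} · est. ~{minutes_remaining} min remaining"
```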

Internal observability improvements

v0.5 · 2026-05-02

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Test coverage and accessibility

v0.5 · 2026-05-01 → 02

A multi-day quality push brought every routable page, domain aggregate, application service, and LLM agent under automated test — the foundation that lets everything else move fast.

What's new for you

No user-visible changes — all changes are foundational.

AI Team expansion and agent consultation

v0.5 · 2026-05-01

The AI Team grew from four roles to a full catalog — including three legal specialists — and the Interviewer can now consult any agent mid-conversation.

What's new for you

A “Meet Your AI Team” panel above the project type selector previews every agent that will be consulted before the interview starts.
The Interviewer now pulls in other agents mid-conversation and shows a turn-by-turn summary of who contributed.
Your conversation keeps a Team panel showing which agents have joined, why each was consulted, and a way to remove one with a reason.
Legal-flagged agents — Privacy Attorney, Commercial Attorney, Internet & Tech Attorney — show a one-time acknowledgment modal as a reminder that their output is not legal advice.
New agents: Code Reviewer, Test Automation, Copy Editor, Privacy Attorney, Commercial Attorney, Internet & Tech Attorney.

Bring-your-own AI provider keys

v0.5 · 2026-05-01

Role-based access control now governs who can do what across the product, and users can supply their own AI provider key.

What's new for you

Bring your own AI provider key to bypass the included quota and use your own model access directly.
Each Workspace row shows a source-channel badge — Web, API, or MCP — so you can see how each generation was started.

File uploads and multimodal context

v0.5 · 2026-05-01

Attach reference documents — PDFs, images, YAML configs, hand-drawn diagrams — directly in the interview, and the AI Team reads them.

What's new for you

A paperclip and thumbnail row in the interview composer lets you attach files mid-conversation.
The Recommender, Architect, and DesignerCritic agents read your uploaded files directly — images, structured docs, and text formats alike.
Supported formats: PDF, DOCX, PNG/JPG, RTF, ODF, HTML, SVG, JSON, YAML, XML.

My Analytics

v0.5 · 2026-05-02

The Analytics view is now scoped to your own data by default, with an All Analytics dropdown available to privileged roles.

What's new for you

“Analytics” is now “My Analytics” — everything you see reflects your own generations, quota, and usage.

Shell and UX foundations

v0.5 · 2026-05-01

v0.5 lands with the official SpecStep mark, a global footer, a live inbox, and a dark-mode toggle that works.

What's new for you

The official SpecStep mark and favicon appear across all pages, with Open Graph meta for link previews.
A global footer adds About, Contact, and Support sections to every page.
The topbar bell opens a cross-device inbox; clicking a notification takes you directly to that generation.
Every “Loading...” placeholder — reconnect overlay and initial splash included — is now an animated brand loader.
The dark-mode toggle flips the theme immediately.
The version label shows cleanly without a build suffix; hover it to see the full build SHA.

Initial preview launch

v0.4 · 2026-04-29 → 30

SpecStep launched with a Web UI, a REST API, and an MCP server — three surfaces over one orchestration core — covering seven project types end-to-end and delivering packages directly to a GitHub repo as a pull request.

What's new for you

Start a generation from the browser, the REST API, or any MCP-capable client including Claude Code and Claude Desktop.
Choose Fast, Normal, or Extensive review depth; multi-provider review provides a fresh-eyes check on every package.
Watch your generation progress in real time — stage, agent, and state updates push to the UI without polling.
Finished packages download as a zip or land in your GitHub repo as a pull request, with optional Copilot review.
Email and SMS notifications, a cost dashboard, and GitHub source-control settings are all in Settings from day one.
Sign in with Microsoft Entra ID, Google, or GitHub.

Under the hood

Clean layered architecture; orchestration is surface-agnostic — the same core drives Web, REST, and MCP.
The pipeline aims for deterministic zip output: holding inputs, model versions, and prompt versions constant should produce a stable bundle. We test for this and any drift surfaces as a failing build.
Automated checks include redaction passes for credentials, financial identifiers, and personally identifying information before agents read content. We layer detection rules and review their coverage in audits.
Database and blob storage are backed by managed cloud redundancy tiers and routine restore drills; we don't publish the exact replication geometry.
Automated accessibility checks target WCAG 2.1 AA and run in CI. We treat the result as a continuous-improvement signal, not a third-party certification.
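
To make the deterministic-zip goal concrete, here is a minimal sketch of the archive-level part of the problem: sorted entry order, a fixed timestamp, and fixed permissions, so identical file contents give an identical digest. The function names are hypothetical, and this says nothing about how SpecStep holds model or prompt versions constant.

```python
import hashlib
import io
import zipfile
from typing import Mapping

FIXED_TIMESTAMP = (1980, 1, 1, 0, 0, 0)  # earliest date the zip format allows


def build_bundle(files: Mapping[str, bytes]) -> bytes:
    """Build a zip whose bytes depend only on file names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # stable entry order
            info = zipfile.ZipInfo(name, date_time=FIXED_TIMESTAMP)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


def bundle_digest(files: Mapping[str, bytes]) -> str:
    """SHA-256 of the bundle, suitable for a drift check in CI."""
    return hashlib.sha256(build_bundle(files)).hexdigest()
```

A drift test can then build the bundle twice from the same inputs and fail the build if the digests differ.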
