What's new in SpecStep

Updated 2026-06-01.

Live generation status you can trust

v0.23 · 2026-06-01

The state, progress, time estimate, round count, and billing posture shown while a generation runs are now consistent across the API, the MCP tools, and the web detail page.

What's new for you

  • After you answer a clarification, every status surface reflects the resumed state immediately — no more reading “paused, awaiting your answer” for a minute after you’ve already answered.
  • The progress bar never moves backwards — pausing for a clarification, or re-reviewing after you resolve a blocker, no longer drops the percentage.
  • The time estimate degrades to “Finalizing…” in the home stretch instead of going blank when a run outpaces its forecast.
  • The review-round label always reads honestly — “round N of M” never shows a number past the total.
  • A generation paused for your input shows a distinct “paused — your turn” billing state, separate from a transient-error retry.
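
The display guarantees above can be sketched as a small view-model. This is a minimal illustration under stated assumptions, not SpecStep's implementation: the class name, the function names, and the rule that switches the estimate to “Finalizing…” once elapsed time passes the forecast are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StatusView:
    """Hypothetical view-model; names are illustrative, not SpecStep's API."""
    percent: float = 0.0

    def update_progress(self, reported: float) -> float:
        # Keep the highest value seen so the bar never moves backwards,
        # even if a pause or re-review reports a lower percentage.
        clamped = min(max(reported, 0.0), 100.0)
        self.percent = max(self.percent, clamped)
        return self.percent


def round_label(current: int, total: int) -> str:
    # Never show a round number past the total (or below 1).
    total = max(total, 1)
    shown = min(max(current, 1), total)
    return f"round {shown} of {total}"


def eta_label(elapsed_s: float, forecast_s: float) -> str:
    # Assumption: once a run outlasts its forecast, show "Finalizing…"
    # rather than a blank or negative estimate.
    remaining = forecast_s - elapsed_s
    if remaining <= 0:
        return "Finalizing…"
    minutes = max(1, round(remaining / 60))
    return f"about {minutes} min remaining"
```

Keeping the clamp in one place is what lets every surface (API, MCP tools, web) show the same value if they all read from it.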

Web Controls v1 — canonical design system

v0.23 · 2026-05-30 → 06-01

A visual refresh aligns every core control — buttons, inputs, cards, status badges, tabs, chips, and empty states — to a single consistent design system, in both light and dark themes.

What's new for you

  • Every core control now follows one visual spec, so the UI looks and feels coherent across the app.
  • Primary actions use high-contrast ink instead of the accent color, making the action hierarchy immediately clear.
  • The accent color is a consistent green across every surface that uses it.
  • Chips now use the smooth sans-serif font and no longer carry a bullet prefix, and dark-mode button hover is visibly stronger.

Under the hood

  • A single canonical token set drives color, radius, and spacing for all controls across both themes.

Sharper, more reliable generations

v0.23 · 2026-05-30 → 06-01

A generation-quality push that makes spec packages more internally consistent, better grounded, and more reliably delivered.

What's new for you

  • References between documents are validated and reconciled before the package is assembled — renumbered paths, orphaned requirement citations, and naming inconsistencies are resolved automatically.
  • The traceability matrix now derives its coverage columns from the same catalog every other document uses — no more placeholder rows.
  • Backend and data products include an operational-readiness section automatically; AI-feature products include an AI-safety section automatically.
  • When a genuine architectural decision is unresolved, the run pauses and asks you rather than guessing and shipping.
  • Flagged-issue severity is scaled to your project: a small standalone tool is not evaluated against an enterprise-platform bar.
  • A deployment-feasibility check flags a contradiction before delivery — for example a serverless or static-only host paired with a component that needs a long-running self-hosted process — instead of shipping a plan that can’t be built as described.
  • The package’s stated generation cost is reconciled to one total, so the manifest, the handoff document, and the API never disagree on what the package cost to generate.

Under the hood

  • The Refining stage — which fills stub sections and reconciles documents before delivery — now runs in production; it had been registered but not active.
  • Stub-fill drafts are parallelized within a pass, and the stage re-runs on resume so the filled package is always what ships.
  • Mid-pass interruptions resume from the last checkpoint instead of failing the whole pass, and project attributes are re-fetched on every resume so a resumed run can't inherit a stale configuration.
  • Stale findings — flagged issues that a later step already resolved — are down-ranked or pruned automatically.
  • A refinement-audit summary is now visible on the generation detail and in the handoff document.

Consistent data tables across the app

v0.23 · 2026-05-30 → 06-01

Every data table in the app now shares the same component, the same visual style in light and dark themes, and server-side sort and search across the whole dataset — not just the visible page.

What's new for you

  • Tables on Projects, workspace packages, and billing and usage panels look the same in light and dark themes.
  • Sorting and search run against the full dataset — results don't change when you move to the next page.
  • Pagination is consistent across all tables, including a “load more” option that works without a known total.
  • Every empty table tells you what will appear there and how to add the first item.
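
One common way to get whole-dataset sort and search together with a “load more” control that needs no total is to filter and sort before slicing, then fetch one extra row to learn whether more exist. The sketch below uses an in-memory list as a stand-in for the database; the function name, the `name` search field, and the offset-based paging are assumptions, and SpecStep may implement this differently.

```python
from typing import Any


def query_page(rows: list[dict[str, Any]], search: str, sort_key: str,
               offset: int, limit: int) -> tuple[list[dict[str, Any]], bool]:
    """Return (page, has_more).

    Filtering and sorting happen before slicing, so results stay stable
    when the caller moves to the next page.
    """
    needle = search.lower()
    matched = [r for r in rows if needle in str(r["name"]).lower()]
    matched.sort(key=lambda r: r[sort_key])
    # Fetch one extra row: if it exists there is more to load,
    # and no total count is ever needed.
    window = matched[offset:offset + limit + 1]
    return window[:limit], len(window) > limit
```

A caller would show the “load more” control only while `has_more` is true.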

Build Lessons & Rules — the pipeline learns from its own history

v0.23 · 2026-05-30 → 06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Triage-flow dashboards

v0.23 · 2026-06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

“At Every Step” branding

v0.23 · 2026-05-31

The new brand tagline — “At Every Step” — now appears as a typographic lockup across page headers, social share cards, and the /brand page.

What's new for you

  • Page headers and social share previews now carry the “At Every Step” lockup alongside the SpecStep wordmark.
  • The /brand page reflects the updated tagline for anyone building with or alongside SpecStep assets.

See which AI client made each change

v0.23 · 2026-05-30

SpecStep now records which AI client — and which version — made each session-state write, using identity captured at connection time.

What's new for you

  • Each session-state record shows which agent made the change and its version.
  • The Connected MCP clients panel in Settings now lists API-key clients, not just OAuth ones.

Under the hood

  • Client identity is captured at connection time and stored durably, so attribution survives a cache miss.

v0.23 polish & infrastructure

v0.23 · 2026-05-30 → 06-01

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Specs refine themselves before delivery

v0.22 · 2026-05-30

Every package now goes through a final Refining stage before it’s handed off — filling in placeholder sections, reconciling contradictions between documents, and resolving genuine blockers.

What's new for you

  • Packages get a final Refining pass that fills placeholder sections and removes dangling references before delivery.
  • Contradictions between documents are reconciled automatically, with a reconciliation summary on the generation.
  • When a real blocker can’t be resolved from your inputs, the run pauses and asks you a focused clarifying question instead of guessing.
  • Refinement, reconciliation, and blocker-resolution summaries show up on the generation, the read APIs, and handoff.md.

Under the hood

  • An interrupted run resumes into the Refining stage instead of restarting, and reconciliation only redrafts the documents that actually disagree.

Sharper, more internally consistent specs

v0.22 · 2026-05-30

A wave of accuracy work keeps requirement IDs lined up across documents, adds a canonical architecture-decisions section, and cuts false-positive blockers.

What's new for you

  • Acceptance criteria and requirement references come from one canonical set, so IDs line up across every document in the package.
  • Packages include an architecture-decisions section the spec binds to instead of re-deriving.
  • The required safety and security review must pass before a package is delivered.
  • Far fewer false-positive blockers — measurable criteria are no longer flagged for wording, and contradictions are caught before delivery.
  • An age-appropriateness check catches a case that could surface older-tier content to minors.

Dates in your local timezone

v0.22 · 2026-05-30

Dates and times now render in your browser’s local timezone instead of UTC.

What's new for you

  • Timestamps across Projects, Generation, and your Settings tables now show in your local timezone.
  • Times stay correct even in background tabs.

v0.22 · 2026-05-29

A new search box in the top bar — plus a dedicated /search page — finds pages across the site.

What's new for you

  • Search from the top bar in both the marketing and signed-in views.
  • / and ⌘K open search, and your recent searches are remembered.
  • A /search results page whose links work even before the page is fully interactive.

More reliable generations and honest ETAs

v0.22 · 2026-05-27

Progress and time-remaining on a running generation are now accurate and self-healing.

What's new for you

  • The generation page self-heals if a live update is dropped, so progress stops freezing.
  • Time-remaining stays honest past the original estimate instead of reading “finishing up” indefinitely.
  • Progress holds at its last point on failure instead of resetting to zero.
  • A resumed run reflects your current plan tier.

Redesigned “Explain this package” with downloads

v0.22 · 2026-05-27

The package-explanation modal got a clearer redesign and now lets you download the explanation.

What's new for you

  • A clearer, more readable explanation modal.
  • Download the explanation as Markdown, plain text, PDF, or Word.

Self-service session-state and projects

v0.22 · 2026-05-30

The session-state and project tools are now self-service for any signed-in user, with stronger per-user isolation and a default-project setting.

What's new for you

  • Session-state and project tools work for any signed-in user automatically.
  • A default-project setting so new work lands where you expect; organization sharing is preserved.
  • Stronger isolation keeps your session state and projects scoped to you.

v0.22 polish & infrastructure

v0.22 · 2026-05-27 → 05-30

Reliability fixes, consistency-checker calibration, and expanded test coverage across the series.

What's new for you

  • The top navigation stays on one line.

Auto-resume telemetry made honest

v0.21 · 2026-05-27

When a generation is interrupted and automatically resumes, the recovery used to read as “expensive and stalled” across the API and the web even though the run completed correctly. This release makes recovery read honestly — interrupted, recovering, completed.

What's new for you

  • Progress no longer jumps backward when a run resumes — get_generation, wait_for_generation, and list_generations now show the same monotonic progress the REST endpoint and the web already showed.
  • A new host_restart_resume_count field (MCP + REST) and a matching recovery badge on the workspace progress chip show when a run recovered — a run that resumed several times reads as honest recovery, not a silent stall.
  • The cost forecast is adjusted for resume-prone runs, so a recovered run's estimate reflects the extra work instead of reading far below the actual.
  • The generation event stream gained an auto-resume-completed event and now records every resume; the “time queued” signal measures the latest interruption, not time since the run first started.

Build-readiness and self-consistency in the package

v0.21 · 2026-05-27

A round of package-quality fixes: the package now reports its own build-readiness, its review summary no longer contradicts its contents, and every generation ships the AI-coder instruction files.

What's new for you

  • A structured build_readiness field on specstep.yaml — readiness is queryable, not just rendered, so your tooling can gate on it.
  • The package's reviews[] summary reconciles against the actual review payload, so it no longer says “not applicable” while carrying findings.
  • Generations started from the web “Start generation” button now include the AI-coder instruction files (CLAUDE.md, AGENTS.md, .cursorrules, .github/copilot-instructions.md) by default, matching every other way to start a run.
  • The generated CLAUDE.md is a clean verbatim mirror with an MCP-first reading order.
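
Gating on build_readiness from your own tooling might look like the sketch below. Only the top-level `build_readiness` field name comes from this entry; the sub-keys `unresolved_blockers` and `dangling_references` are hypothetical placeholders, so check the real shape in your own specstep.yaml before relying on them. The function takes an already-parsed manifest, for example the mapping returned by a YAML parser.

```python
from collections.abc import Mapping
from typing import Any


def ready_to_build(manifest: Mapping[str, Any]) -> tuple[bool, list[str]]:
    """Return (ok, reasons) for a parsed specstep.yaml mapping."""
    readiness = manifest.get("build_readiness")
    if not isinstance(readiness, Mapping):
        # Treat a missing or malformed field as "not ready" rather than guessing.
        return False, ["build_readiness missing"]
    reasons = []
    # Hypothetical sub-keys; the real field may use different names.
    for key in ("unresolved_blockers", "dangling_references"):
        items = readiness.get(key) or []
        if items:
            reasons.append(f"{key}: {len(items)}")
    return not reasons, reasons
```

A CI step could call this and fail the job when `ok` is false, printing `reasons` as the explanation.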

Migrate existing docs — the in-app project card

v0.21 · 2026-05-27

The doc-migration capability shipped earlier this series as an API; now there's a UI for it — a project card that walks upload → preview → commit.

What's new for you

  • A “Migrate existing docs” card on your project: upload a ZIP, see each file's proposed destination in the SpecStep layout, re-route any row that landed wrong, then commit — no generation, no cost.
  • Migrated packages are first-class — they count in your usage view and appear in package search alongside generated ones.

Migrate existing docs into a SpecStep package

v0.21 · 2026-05-26

Upload a ZIP of documentation you already have, review a dry-run mapping of how each file will be organized, then commit it into a package linked to your project — no generation needed.

What's new for you

  • Upload a ZIP of existing documentation to get a preview of how each file maps to its canonical location in the package — review it before anything is written.
  • Confirm the mapping and commit it in one step. Non-markdown assets are preserved exactly as uploaded.
  • Migrated packages appear alongside generated packages in list_packages and get_package — your AI coder sees them immediately via the MCP connection.
  • Packages created this way are included in account data deletion, the same as any other package.
  • Available now via the REST API (POST /v1/doc-migrations/preview and /v1/doc-migrations/commit) and MCP tools (preview_doc_migration and commit_doc_migration).
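
The two REST steps above can be sketched as request builders. Only the two paths and the POST method come from this entry; the host, the bearer-token header, and the body format are assumptions, so the sketch leaves the body and content type to the caller and builds the request without sending it.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host, not the real one


def build_migration_request(step: str, api_key: str, body: bytes,
                            content_type: str) -> urllib.request.Request:
    """Build (but do not send) a POST for the preview or commit step."""
    if step not in ("preview", "commit"):
        raise ValueError("step must be 'preview' or 'commit'")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/doc-migrations/{step}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: bearer auth
            "Content-Type": content_type,          # caller supplies the real format
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; the intended order is preview first, review the mapping, then commit.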

Project analytics dashboard

v0.21 · 2026-05-26

Three new sections on your project page give you a live read on velocity, flow health, and how much of the scope has been built.

What's new for you

  • Velocity — KPI tiles and a weekly throughput chart show how much is getting done and how that rate is trending.
  • Flow health — a cumulative-flow chart and a flow-efficiency stat surface where work is accumulating or moving freely.
  • Progress vs. scope — a burn-up chart, a weekly lead-time view, and a cycle-time scatter let you see what's shipped against what was planned.
  • Starting an interview now links it to a project automatically, so your analytics start accruing without a manual step.
  • The linked project appears at the top of the interview page, with a dropdown to reassign it if needed.
  • Connecting a repository is more robust when the GitHub App isn't set up yet — it no longer errors.

Security — resource access checks hardened

v0.21 · 2026-05-26

Your project, package, and organization data stays scoped to your own account, and this release closes a gap in package access checks.

What's new for you

  • Your data is accessible only to your account and organization — strengthened checks apply at every layer.
  • A gap that could allow access to packages across accounts is closed.

Project-scoped API keys with in-place rotation

v0.21 · 2026-05-26

Create an API key bound to a single project — it can only see that project's data. Rotate the secret in place without deleting and recreating the key.

What's new for you

  • The “API key” card on the project detail page lets you create a scoped developer key tied to that project.
  • Rotate the secret with an inline confirm step; the new secret is shown once immediately after rotation.
  • A scoped key returns only the data belonging to its project — useful for giving a Claude Code, Cursor, or Copilot integration access to one project without broader reach.
  • The rotation endpoint is available via the REST API (POST /v1/api-keys/{id}/rotate).

Generation build-readiness, platform-aware recommendations, and lifecycle events

v0.21 · 2026-05-26

Several improvements to what you get out of a generation and how clearly it tells you what to do next.

What's new for you

  • The handoff document now includes a build-readiness banner listing unresolved blockers, low-confidence sections, empty reviews, and dangling references — the first thing your AI coder sees when it opens the package.
  • A file that's referenced but not yet generated ships as a navigable stub with a clear deferred status, instead of a dead link.
  • Clarification answers you gave during the interview are now honored if a generation is resumed — they were being stored but not applied.
  • The specialist caption shows a live “X of Y complete” count with an accurate roster as the generation runs.
  • Named lifecycle events appear in the generation event feed, so tools like Claude Code and Cursor can track generation progress precisely.
  • The platform now recommends services native to your chosen hosting target — pick a serverless host and the recommended stack favors that host's own database, key-value, and object-storage offerings instead of a generic default.
  • A launch-confidence warning appears when a generation skips a Security, Risk, Data Model, or Reliability review on a risk-bearing profile.

Generation resilience — restart recovery

v0.21 · 2026-05-26

A generation interrupted by an infrastructure restart recovers more reliably and picks up where it left off on the next boot.

AI Agents — internal tooling update

v0.21 · 2026-05-26

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Public API reference — cleaner surface and consistent summaries

v0.21 · 2026-05-26

The API reference at /api-docs/reference is cleaner and easier to navigate.

What's new for you

  • Internal and permission-gated endpoints are excluded from the public document — the reference shows only what's available to you.
  • Operations are grouped into readable categories by route, replacing a flat undifferentiated list.
  • Every public operation — 125 in total — now has a one-line summary, so scanning the reference gives you a clear picture of what each endpoint does.
  • The document title is now “SpecStep API” rather than the assembly name that was leaking into client code generators.

v0.21 polish & infrastructure

v0.21 · 2026-05-26

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Organizations for teams

v0.20 · 2026-05-26

Group your team under one organization. On the Teams plan, create an organization from your profile and become its primary contact — from then on, the work you create is associated with it automatically.

What's new for you

  • A new Organization field on your profile. On the Teams plan, create an organization right there — name, address, and phone — and you become its primary contact and first member.
  • Not on the Teams plan yet? The same field links straight to the plan you need to create one.
  • Once you belong to an organization, the interviews, generations, projects, and other records you create are associated with it automatically — no extra step.
  • Membership is optional — the rest of SpecStep works exactly the same whether or not you’re in an organization.

Projects workspace with a doc-vs-built dashboard

v0.20 · 2026-05-25

A full Projects area — group your builds, decisions, and backlog under named projects, track what’s been built against what was designed, and pull in live repository metrics when the SpecStep GitHub App is installed.

What's new for you

  • A new Projects section in the nav rail gives each project its own page — editable name, description, linked GitHub repo, and start date, plus paginated and filterable tables for build sessions, decisions, and backlog items.
  • A metrics dashboard shows generation counts, last-activity dates, and a weekly decision-velocity chart per project.
  • When the SpecStep GitHub App is installed on a linked repo, the project page surfaces repository metrics: pull-request count, code size, primary language, and test-file count.
  • Link a project to its documentation package — from the project page, the package page, or the new-project form, which auto-fills the project name and description from the chosen package.
  • The Build progress view parses the linked package’s phase plan into phases and individual tasks, marks each task built or not-built by scanning the repo’s commit history, and shows per-phase and overall completion bars — with a manual override available per task.

Under the hood

  • Generated phase plans now use a consistent, parseable task format so the commit-history scan is reliable.
  • Built status is computed from the repo’s commit history without cloning the repository.

Internally consistent generated packages

v0.20 · 2026-05-25

Packages no longer ship contradictory sections. When an AI agent’s output conflicts with an upstream document — architecture decisions, stack rationale — the platform catches it, reconciles the affected section, and marks it with a visible note so the resolution is transparent.

What's new for you

  • Contradictions between a specialist’s output and upstream documents are caught before the package ships — the conflicting section is reconciled and marked with an authoritative banner.
  • The banner persists if a section is redrafted later, so the reconciliation stays visible across edits.

Richer, safer, and more reliable generations

v0.20 · 2026-05-25

Four improvements to what you get from a generation and how reliably it arrives.

What's new for you

  • Security, data-model, and risk reviews now run on every tier — not just higher tiers.
  • Generated packages include AI-coder instruction files and an AGENTS.md by default, and recommend the SpecStep MCP server for session state — so Claude Code, Cursor, Copilot, and similar tools pick up your project context without extra setup.
  • The generation list shows live state and a counting-down ETA while a generation is running.
  • A generation interrupted by a transient infrastructure issue resumes automatically — your progress is preserved and the run picks up where it left off.
  • A pre-ship check drops stale blocker warnings that no longer match the final content, so the delivered package isn’t cluttered with issues that were already resolved.

Projects page polish

v0.20 · 2026-05-25

The Projects page got a visual refresh, and your active project now lives in a card on the page itself instead of the top bar.

What's new for you

  • The active-project switcher moved from the top bar into an “Active project” card on the Projects page.
  • The Projects list and per-project pages were restyled with sortable, paginated tables and clearer dates.

Under the hood

  • Internal operator tooling was redesigned for readability and reliability.

Accurate running-generation count and status filter

v0.20 · 2026-05-26

The generations list sometimes showed finished generations as still running and mis-sorted them in the status filter; the count and the filter are now accurate.

What's new for you

  • The running-generation count reflects reality — completed, failed, and cancelled runs no longer appear as running.
  • The status filter sorts each generation into the correct bucket.

Generation progress you can trust

v0.20 · 2026-05-26

The progress bar and time-remaining estimate now reflect what the platform is actually doing — moving smoothly, never jumping backward, and telling you when it’s revising.

What's new for you

  • The progress bar never jumps backward — progress only moves forward.
  • During a revision pass the status reads “Revising — round N of M” so you know the platform is refining your package, not stalled.
  • The time-remaining estimate counts down steadily from a stable baseline and accounts for revision rounds — it no longer resets mid-run.
  • Progress climbs continuously within each phase, not only when crossing a phase boundary.

Real lines of code on the cost panel — total and per-agent

v0.20 · 2026-05-26

Completed runs now show the real lines of code produced — both the package total and a per-agent breakdown — on the cost panel.

What's new for you

  • The cost panel for a completed run shows the total lines of code in your generated package, counted from the real output, not an estimate.
  • It also breaks the lines down per agent, so you can see how much each contributor added or removed.

Consistency checks cover more review specialists

v0.20 · 2026-05-25

The consistency checks that catch and reconcile contradicting sections now cover more of the review specialists, so more conflicts are resolved before your package is delivered.

What's new for you

  • A wider set of specialist reviewers now participates in the contradiction-detection pass — more conflicts are caught and marked with an authoritative resolution before the package ships.

Connect a GitHub repository to a project

v0.20 · 2026-05-25

You can now connect a GitHub repository to a project directly from the project page, using a guided connect flow and a repo picker.

What's new for you

  • A “Connect GitHub repository” button on the project page walks you through connecting your GitHub account; once connected, a picker lists your accessible repositories so you can bind one to the project.
  • Projects you’d already linked by URL are connected automatically — no extra step needed.

The interview knows when it has enough — and you can nudge it to finish

v0.20 · 2026-05-25

Otto now recognizes when it has gathered enough to start generating, and you can nudge it to wrap up whenever you’re ready.

What's new for you

  • When Otto decides it has enough information, it offers to wrap up the interview rather than asking another question.
  • A “Wrap up and generate” button lets you finish the interview on your own schedule — no need to wait for Otto to reach the same conclusion.

Failed or cancelled runs don’t count against your quota

v0.20 · 2026-05-25

If a generation fails or you cancel it, that run no longer counts against your monthly quota.

What's new for you

  • Failed and cancelled generations are excluded from your monthly usage count — only completed runs consume quota.
  • The quota panel now explains this clearly so you can see what does and doesn’t count.
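
The counting rule is simple enough to state as code. This is an illustrative sketch only: the status strings and function names are assumptions, not SpecStep's actual values.

```python
def quota_used(run_statuses: list[str]) -> int:
    # Only completed runs consume quota; failed, cancelled, and
    # still-running generations are not counted.
    return sum(1 for status in run_statuses if status == "completed")


def quota_remaining(run_statuses: list[str], monthly_limit: int) -> int:
    # Never report a negative remainder.
    return max(monthly_limit - quota_used(run_statuses), 0)
```

For example, a month with two completed runs, one failed run, and one cancelled run uses two units of quota.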

Known gaps front-and-center in the handoff doc + locked requirements traceability

v0.20 · 2026-05-25

Generated packages now surface known gaps up front in the handoff document, and requirements identifiers stay consistent across every document in the package.

What's new for you

  • Known gaps appear at the top of the handoff document — the first thing an AI coder or developer sees when they open the package.
  • Requirement identifiers are locked at generation time so they stay consistent across the traceability matrix, acceptance criteria, and all other documents.

Session-state tooling — cross-computer session continuity

v0.20 · 2026-05-25

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.20 polish and infrastructure

v0.20 · 2026-05-25

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Generation reliability and trust

v0.19 · 2026-05-25

A round of reliability work so a generation recovers automatically from transient interruptions, always shows what's really happening, and handles large or complex specs gracefully.

What's new for you

  • Generations recover automatically from a transient infrastructure interruption — instead of failing, a run shows “brief interruption — resuming shortly, your progress is preserved” and picks up where it left off.
  • The status always reflects what's really happening — a finished generation reads Complete instead of appearing stuck on its last phase.
  • Large or complex specs are handled gracefully — a longer allowance for each step, and a spec too large for the model is caught early with a clear, actionable message instead of an error.
  • When the system detects a genuine platform issue, it flags it for us to fix — so problems get caught and addressed without you having to report them.

v0.19 polish and infrastructure (late series)

v0.19 · 2026-05-25

Mostly under-the-hood work this round, with one visible refinement.

What's new for you

  • The agent activity log on the generation detail page now reads as a clean, even-width log.

Projects

v0.19 · 2026-05-25

A new Projects surface lets you group your work under named projects.

What's new for you

  • A new /projects page and per-project detail page, reachable from the nav rail, let you group generations and session-state under named projects.
  • Decision-log entries, build sessions, and backlog items can be scoped to a project.

SpecStep as a session-state MCP server

v0.19 · 2026-05-25

SpecStep can now serve as your AI agent's persistent session-state memory — its decision log, build sessions, and backlog — over MCP and REST, with a UI to browse it. Your agent's context survives across sessions and machines, so it can resume work without re-reading every file each time.

What's new for you

  • Store and retrieve your AI agent's decision log, build sessions, and backlog through the same tools over MCP or REST — both credentials work on either surface.
  • Browse the stored state in the app: list and detail views for each, plus cross-aggregate views that tie a session to its decisions and backlog.
  • Bring an existing decision log or backlog into the store with markdown import — paste it through the agent or upload the file.
  • Full-text search across decision-log entries, backlog items, and build sessions, including resume-by-description on build sessions.

SpecStep-as-session-state-MCP-server — foundation in place

v0.19 · 2026-05-23

Foundation of the arc that turns SpecStep itself into a session-state MCP server — AI coders working on any project will be able to track build sessions, decision logs, and backlog items via the same discipline SpecStep uses on its own build.

Under the hood

  • The first vertical — build sessions — ships end-to-end with five new MCP tools. The next verticals (decision-log and backlog) follow shortly; the final piece is the end-to-end workflow that an AI coder uses to resume work across sessions.

CI reliability — runner self-heal + smarter capacity alerts

v0.19 · 2026-05-23

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Agent conversation feed stays visible on every terminal state

v0.19 · 2026-05-23

The agent-conversation feed on a generation detail page used to disappear once the run reached a terminal state. It now stays visible regardless of state.

What's new for you

  • Diagnostic context (conversation turns, retry events, cost trace) stays visible after a generation finishes, regardless of its outcome.

Cost visibility on the generation detail page

v0.19 · 2026-05-23

Two new permissions and a five-PR arc put cost and per-agent lines of code on the generation detail page. Cost visibility is permission-gated: real provider cost is available to users with the new permission, while customers continue to see billed pricing only.

What's new for you

  • With the new cost-visibility permission, real provider cost renders alongside billed cost on every agent bubble and on the cost rail card.
  • A new tokens-visibility permission gates per-agent input and output token counts on the rail card.
  • The agent-conversation feed shows lines-of-code per response again — a regression dating to two weeks earlier is resolved.

Intake routing covers three more specialists

v0.19 · 2026-05-23

Three specialists were silently under-routed because the intake-extractor's attributes shape was missing the canonical keys those agents key off. The gap is closed.

What's new for you

  • Generations that need a Marc-transcript pass (compliance / regulated industries), prompt-engineering subject matter, or product-management scoping now actually route to the right specialist.

New-users inbox — internal review tooling

v0.19 · 2026-05-23

No user-visible changes — internal review tooling for newly-signed-up users.

Marketing — agent visibility + HowItWorks animations restored

v0.19 · 2026-05-23

Closes the agent-visibility work that started earlier this series. The marketing home page now shows only the agents flagged for the home roster; the specialist count updates automatically. A new page lists agents currently in development. The HowItWorks step animations and modal popups were restored after a recent change broke them.

What's new for you

  • The marketing home page roster updates without a manual edit when a new agent is launched-out-of-mission.
  • A new /agents/top-secret-mission page lists agents currently in development.
  • The HowItWorks section animates step-by-step again, and clicking a step opens a modal with the long-form detail.

Quality regressions caught automatically + new monitoring alerts

v0.19 · 2026-05-23

Two reliability arcs closed end-to-end. The system catches five named generation-quality failures automatically and files a bug report so the team sees the issue before the customer does. New monitoring covers week-over-week build-confidence drops, response-time regressions, and wire-shape drift across multiple users.

What's new for you

  • You hear back on quality issues without having to file them — the system catches its own regressions and surfaces them to the team automatically.

Cite a feedback ID in a commit and it auto-resolves on merge

v0.19 · 2026-05-23

A commit body containing Resolves feedback <id> now auto-resolves that feedback row when the PR merges, stamping the commit and PR URL on the row. The convention shortens the feedback-to-resolved cycle from "file feedback, ship fix, then triage" to "file feedback, ship fix" — the triage step happens automatically.

What's new for you

  • Submitted feedback auto-resolves when the corresponding fix lands, with a link to the commit and PR right on the feedback row.
  • Multiple feedback IDs per commit are supported; the keyword is case-insensitive.
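
A minimal sketch of how the commit-body convention could be parsed. The ID format is not specified in these notes, so the pattern assumes IDs are runs of letters, digits, and hyphens; extract_feedback_ids is a hypothetical helper name, not a SpecStep API.

```python
import re

# Assumption: feedback IDs are runs of letters, digits, and hyphens.
# The real ID format is not documented in these notes.
_RESOLVES = re.compile(r"resolves\s+feedback\s+([A-Za-z0-9-]+)", re.IGNORECASE)


def extract_feedback_ids(commit_body: str) -> list[str]:
    """Return every feedback ID cited in a commit body, in order, without duplicates."""
    seen: dict[str, None] = {}
    for match in _RESOLVES.finditer(commit_body):
        seen.setdefault(match.group(1), None)
    return list(seen)
```

The case-insensitive flag mirrors the keyword rule above, and iterating over every match covers the multiple-IDs-per-commit case.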

v0.19 polish and infrastructure

v0.19 · 2026-05-23

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Cheaper Architect runs + Comparator handles large packages + Interview prompt hardening

v0.18 · 2026-05-22

A pipeline-reliability sweep across the LLM-call surface. Architect runs cache stable user content above the model's empirical caching threshold, which materially reduces cost per generation. The Comparator handles large packages without context overflow and returns a job id immediately so MCP callers don't block on the model. Interview prompts gained settled-answer and completion-intent rules so a conversation no longer loops on a topic the user has signalled is done.

What's new for you

  • Architect runs cost less per generation after the cache change.
  • compare_packages returns a job id immediately and handles large packages reliably.
  • Interviews stop on closure cues ("we're done", "MVP is complete") instead of looping back to settled topics.
  • The intake-extractor returns better-structured stack recommendations — a previously-failing shape variant is now schema-constrained at the model boundary.

Architect handles longer sections + Interview replies on clarification request

v0.18 · 2026-05-23

Three coordinated fixes after the auto-filing pipeline caught the first concrete defects on real generations.

What's new for you

  • An Architect run that previously truncated silently on a long section now either completes successfully or surfaces an actionable error.
  • The Interview page lets you respond to a clarification request directly — the composer renders on AwaitingClarification instead of locking out.

Truthful uptime + /security page + cleaner OpenAPI for anonymous endpoints

v0.18 · 2026-05-22

Four PRs closing two launch-blocker bug reports on the public surfaces.

What's new for you

  • The /status page shows truthful uptime — "Unknown" displays for periods before monitoring started, instead of an inferred 100%.
  • The new /security page lists the platform's security posture in plain language and cross-links from /status.
  • The published OpenAPI spec correctly marks anonymous public REST endpoints as unauthenticated.
  • Status-page incident history surfaces the monitoring start date and a ticket-routing link; the subscribe form documents its scope and the double-opt-in path.

OAuth and MCP-client unblock

v0.18 · 2026-05-22

Four PRs across the OAuth and MCP-client surface that close registration and tool-catalog gaps.

What's new for you

  • MCP clients that follow the OAuth 2.1 + refresh-token flow can now register at /oauth/register instead of being rejected.
  • OAuth-authenticated MCP sessions match the API-key flow's tool catalog — an OAuth session and an API key with the same permissions get the same MCP tools.
  • A new bulk_resolve_alerts MCP tool resolves alert cohorts in one call.

Reliability fixes — interview turns, bug-report PATCH, feedback interstitial

v0.18 · 2026-05-21

Three coordinated fixes from the post-launch triage day.

What's new for you

  • Second-and-subsequent interview turns no longer fail with a concurrency error on quick double-submit.
  • PATCH /v1/bug-reports/{id} accepts both PascalCase and snake_case status values, matching the error response's advertised shape.
  • Anonymous visitors hitting /feedback see a branded interstitial explaining the page instead of an abrupt sign-in redirect.

MCP server respects JSON-RPC notifications

v0.18 · 2026-05-21

The MCP HTTP-streamable transport now respects JSON-RPC 2.0 notifications: methods sent without an id are answered with HTTP 202 + empty body, and the lifecycle methods silently no-op instead of returning an error.

What's new for you

  • MCP clients that strictly follow the JSON-RPC 2.0 + MCP specs complete the handshake against https://specstep.com/mcp without modification.
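
A rough sketch of the notification rule, for readers writing their own transport. In JSON-RPC 2.0 a message without an id is a notification and must not receive a result. The dispatch callable and the handle_jsonrpc name are hypothetical, and swallowing handler errors on notifications is an assumption about how the no-op behaviour could be implemented.

```python
import json


def handle_jsonrpc(raw: str, dispatch) -> tuple[int, str]:
    """Return (http_status, body) for one JSON-RPC 2.0 message.

    `dispatch` is a hypothetical callable(method, params) -> result.
    """
    message = json.loads(raw)
    if "id" not in message:
        # Notification: no result may be sent back.
        try:
            dispatch(message.get("method"), message.get("params"))
        except Exception:
            pass  # Assumption: a failing notification handler is a silent no-op.
        return 202, ""
    result = dispatch(message["method"], message.get("params"))
    body = {"jsonrpc": "2.0", "id": message["id"], "result": result}
    return 200, json.dumps(body)
```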

CI — merge-queue rollout and revert + workflow isolation + dynamic ports

v0.18 · 2026-05-22

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.18 polish and infrastructure (late series)

v0.18 · 2026-05-21 → 22

Cross-cutting polish that doesn't belong in a single themed entry: nightly workflow auto-resolves bug-report rows on next-green-nightly; the api-docs reference embed adopts SpecStep meta chrome; bug-report promotion auto-resolves the source feedback row and cascade-closes findings; sitemap completeness improvements; 17 missing agent detail pages were built out; documentation refresh on MCP tooling args.

What's new for you

  • 17 previously-missing agent detail pages are now live on the marketing site.
  • Bug-report promotion from a feedback row now also auto-resolves the source feedback row.
  • /api-docs/reference (the Scalar embed) carries SpecStep meta chrome consistent with the rest of the site.

Quality feedback: validate, amend, and structured evidence

v0.18 · 2026-05-21

The quality-feedback surface gains a dry-run validate tool, a self-correction window for submitters, machine-readable typed evidence on findings, and a slimmed list shape that cuts multi-MB payloads.

What's new for you

  • validate_feedback (MCP) — dry-run validates a feedback submission shape and returns {valid, errors[]} without persisting. Fix template/section-id/cap violations before spending a submit_feedback call.
  • amend_feedback (MCP) + PATCH /v1/feedback/{id}/amend (REST) — the original submitter can self-correct an Open feedback row within a 10-minute window: title, summary, full_report, evidence, or tags. No review-queue intervention needed.
  • Findings now carry a typed_evidence array — structured machine-readable evidence alongside the prose string. Kinds: HTTP response, route, console error, MCP tool call, transcript turn, screenshot, JSON payload.
  • GET /v1/feedback and /v1/feedback/me now return a summary shape — scalars, a 200-character excerpt, and counts — instead of full bodies. The full record is one by-id call away. Multi-MB list payloads are gone.
  • Submitting a rubric template that doesn't pair with the feedback type now fails fast with FEEDBACK_TEMPLATE_TYPE_MISMATCH instead of being silently accepted.
  • Promoting feedback to a bug report is now atomic — a single staged commit — so a failure can no longer orphan a half-created bug report.
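
The amend rule above can be sketched as a single predicate. This is an illustration only: can_amend is a hypothetical name, and treating the 10-minute boundary as inclusive is an assumption the notes do not confirm.

```python
from datetime import datetime, timedelta, timezone

AMEND_WINDOW = timedelta(minutes=10)  # window stated in the release note
AMENDABLE_FIELDS = {"title", "summary", "full_report", "evidence", "tags"}


def can_amend(status: str, submitted_at: datetime, now: datetime,
              is_submitter: bool, fields: set[str]) -> bool:
    """True when the original submitter edits allowed fields on an Open row in time."""
    return (
        is_submitter
        and status == "Open"
        and now - submitted_at <= AMEND_WINDOW  # assumption: boundary is inclusive
        and bool(fields)
        and fields <= AMENDABLE_FIELDS
    )
```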

Interview turns default to async

v0.18 · 2026-05-21

submit_interview_turn now defaults to async — the call returns a job_id immediately instead of blocking — so long interview turns no longer risk a ~60s gateway timeout.

What's new for you

  • submit_interview_turn now defaults to mode: "async": you get a job_id to poll via get_interview_turn_status, or subscribe to the live push. You no longer need to opt in.
  • mode: "sync" remains for callers that want the inline reply, but it's subject to the ~60s gateway ceiling and is scheduled for removal.
  • The web Interview page runs async end-to-end with a live push and a polling fallback.
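
A minimal polling sketch for the async flow. The get_status argument stands in for a call to get_interview_turn_status; the response shape (a dict with a "status" key) is an assumption, and the terminal values are the ones named elsewhere in these notes (completed, failed, cancelled).

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed", "cancelled"}


def poll_turn(job_id: str, get_status: Callable[[str], dict],
              interval_s: float = 1.0, timeout_s: float = 120.0,
              sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(job_id)
        if status.get("status") in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"interview turn {job_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop testable without real delays; subscribing to the live push avoids polling altogether.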

v0.18 platform and CI foundation

v0.18 · 2026-05-21

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Retry visibility on running generations — see when the platform is retrying and why

v0.17 · 2026-05-19

When the platform retries an in-flight AI call (rate-limited upstream, transient blip), polling clients now see exactly what's happening instead of guessing from a climbing cost meter. Closes the last sub-finding from one of our beta customers' feedback reports about “is this run still healthy?”

What's new for you

  • GET /v1/generations/{id} (REST) + get_generation + wait_for_generation (MCP) carry four new fields: retry_count, last_retry_at, next_retry_at, recoverable_error_category (one of rate_limit, provider_timeout, provider_server_error, schema_violation, other).
  • The real-time push payload also carries the same four fields, so any live dashboard you build stays in sync without re-polling.
  • Retry counts start at zero on every fresh attempt and only increment forward — you can reason about “has the platform retried since I last looked?” with a simple integer compare.
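
The integer compare mentioned above might look like this. The field name retry_count comes from the note; the helper name and the choice to treat a missing or null value as zero are assumptions.

```python
def retried_since(previous: dict, current: dict) -> bool:
    """True if retry_count grew between two polls of the same attempt."""
    before = previous.get("retry_count") or 0  # assumption: null means zero
    after = current.get("retry_count") or 0
    return after > before
```

Because the counter resets to zero on a fresh attempt, the comparison is only meaningful between two snapshots of the same attempt.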

Progress chip in the topbar + live-updating workspace card

v0.17 · 2026-05-19

A persistent progress chip in the topbar so you don't lose track of running generations when you navigate away from the workspace. The workspace card itself also now stays in sync with the API in real time.

What's new for you

  • New progress chip in the topbar next to the bell. Hidden when nothing is running; shows the slowest in-flight generation's progress when one or more are. Hover (mouse) or click (touch / keyboard) to expand a per-row list of every running generation with its own progress bar.
  • A small x2 / x3 badge overlays the bar when more than one generation is running.
  • Color tells you the worst state at a glance: blue when work is actively progressing, amber when a retry is in flight, amber pulse when an interview clarification is waiting on you.
  • The workspace row's retry counter, billing state, and started-work time now update from the real-time push instead of waiting for a page refresh.
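
As an illustration of the chip rules above, here is a sketch that reduces a list of running generations to one chip state. The input keys (progress_percent, retrying, awaiting_clarification) and the severity order (clarification outranks retry) are assumptions made for the example, not the web client's actual model.

```python
from typing import Optional


def chip_summary(generations: list[dict]) -> Optional[dict]:
    """Summarize running generations; None means the chip is hidden."""
    if not generations:
        return None
    # The chip shows the slowest in-flight generation's progress.
    slowest = min(g["progress_percent"] for g in generations)
    # Assumption: a pending clarification is the worst state, then a retry.
    if any(g.get("awaiting_clarification") for g in generations):
        color = "amber-pulse"
    elif any(g.get("retrying") for g in generations):
        color = "amber"
    else:
        color = "blue"
    badge = f"x{len(generations)}" if len(generations) > 1 else None
    return {"progress_percent": slowest, "color": color, "badge": badge}
```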

Better error when an explanation takes too long

v0.17 · 2026-05-19

The “Explain This To Me” modal occasionally surfaced a confusing HTTP 502 when the upstream AI was slow. Replaced with a 75-second wall-clock budget and a friendlier retry prompt.

What's new for you

  • When an explanation takes longer than expected, the modal now shows “Took longer than expected. Try again — most explanations finish within 30 seconds.” (was: “Couldn't generate the explanation (HTTP 502)”.)
  • Most “Try again” clicks succeed within a few seconds — the timeout fires when the upstream is genuinely slow, not on a healthy call.

Per-finding statuses on multi-issue feedback

v0.17 · 2026-05-18

When you submit feedback with multiple findings (e.g., a package-quality review covering five different concerns), each finding now tracks its own resolution status independently. The parent feedback row stays Open until every child finding has been resolved or dismissed.

What's new for you

  • Multi-issue feedback rows now expose per-finding statuses through the MCP and REST surfaces. You can see exactly which of your reported issues are in triage, which are resolved, and which were dismissed — not just a single status for the whole submission.
  • New MCP tool to walk findings cross-row: list_feedback_findings answers “show me every open finding I've reported, sorted by severity” without re-walking each parent.
  • A recurrence-chain tool (list_feedback_recurrences) traces every follow-up feedback row that referenced an earlier submission as the seed.

No more “Server restarted. Please refresh.” modal

v0.17 · 2026-05-18

The browser session now reconnects on its own after a transient infrastructure interruption. The previously-needed manual page refresh is gone.

What's new for you

  • The page reconnects automatically when the session hiccups. No more “Server restarted. Please refresh.” modal interrupting whatever you were doing.

v0.17 polish & infrastructure

v0.17 · 2026-05-18 / 19

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Explain a package for an audience — one click, audience-tailored markdown

v0.17 · 2026-05-18

Packages now ship with a one-click "Explain what's in it" surface that rewrites the package as a short, audience-tailored markdown summary. Pick the audience — executive, product manager, engineering manager, new engineer, investor, or security reviewer — and the result is generated, cached, and copy/paste-able. Repeats for the same audience are free.

What's new for you

  • New "Explain what's in it" button on the Generation Detail Package-ready card, and a lightbulb icon on every Workspace Complete row — both open the same audience picker. The first request for a given audience runs the AI in a few seconds; subsequent picks of the same audience return the cached markdown instantly at no cost.
  • Six curated audiences ship at launch: Executive (what was decided, business consequences, no jargon), Product manager (scope, user stories, acceptance-criteria summary), Engineering manager (architecture, risks, de-risking plan), New engineer (where to start reading, mental model, gotchas), Investor (what shipped, market position, what's next), and Security (threat surface, trust boundaries, what got reviewed).
  • Three new REST endpoints — GET /v1/explain/audiences (public catalog), POST /v1/packages/{id}/explain (cold or cached), GET /v1/packages/{id}/explanations (list cached audiences for a package). Two new MCP tools — list_audiences and explain_package — mirror the REST surface for agent callers.
  • Per-tier monthly quotas keep the cost predictable: 20 cold explanations / month on Free, 200 / month on Pro, unlimited on Team. Warm-cache hits never count.
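
The quota rule can be sketched as follows. The tier limits are the ones listed above; the function name and the way usage is counted are assumptions for illustration.

```python
from typing import Optional

# Cold explanations per month, per tier; None means unlimited.
MONTHLY_COLD_QUOTA: dict[str, Optional[int]] = {"Free": 20, "Pro": 200, "Team": None}


def may_explain(tier: str, cold_used_this_month: int, cache_hit: bool) -> bool:
    """Warm-cache hits are always allowed; cold runs are checked against the tier quota."""
    if cache_hit:
        return True
    limit = MONTHLY_COLD_QUOTA[tier]
    return limit is None or cold_used_this_month < limit
```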

Clearer signals on running generations — billing state, started-work timestamp, completion forecast, active specialist

v0.17 · 2026-05-18

Polling a running generation now returns five new fields that let callers tell "actively working" from "paused on a transient error" without re-scanning the events stream. Closes a gap where climbing cost could look like runaway billing even when the platform was making real progress.

What's new for you

  • billing_state: NotStarted, Active, PausedRetrying, or Complete. Pair with running_cost_usd to disambiguate "healthy" climbing cost from runaway: Active + climbing cost = the platform is earning the spend; PausedRetrying + climbing cost = something's stuck and you should look.
  • started_work_at — the exact moment the dispatcher claimed the row. Lets callers compute "how long has this been actively working?" without scanning events.
  • estimated_time_remaining_seconds and estimated_completion_at — best-effort forecast computed from the historical-median model. Null while queued, terminal, or when the forecast is unavailable.
  • active_specialist — during specialist review, the slug of the most-recently-completed specialist in the current round (one of codd / halo / tally / vera / trip / merlin / polo). A pragmatic single-value summary of a parallel fan-out.
  • progress_explanation on the MCP get_generation / wait_for_generation responses — one-sentence narration of what's happening at the current progress_percent (e.g., "Specialists are reviewing the draft in parallel"). Closes the same understanding gap for agent callers.
  • All five fields are strictly additive; null on generations that started before the rollout.
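
One way a polling client could apply the billing_state guidance, comparing two consecutive snapshots. The field names come from the list above; the return labels and the function name are invented for this sketch.

```python
def classify_spend(previous: dict, current: dict) -> str:
    """Label the spend trend between two polls of the same generation."""
    state = current.get("billing_state")
    if state is None:
        return "unknown"  # generation started before the rollout
    climbing = current["running_cost_usd"] > previous["running_cost_usd"]
    if climbing and state == "PausedRetrying":
        return "investigate"  # cost is rising while the run is paused on retries
    if climbing and state == "Active":
        return "healthy"  # cost is rising because work is progressing
    return "steady"
```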

Cancel an in-flight async interview turn

v0.17 · 2026-05-18

If you submitted an interview turn in async mode and realize the message was wrong — or the LLM call is dragging on — you no longer have to wait for it (or the stuck-job timeout). A new cancel surface ships on both REST and MCP.

What's new for you

  • New REST endpoint POST /v1/interviews/turns/{jobId}/cancel and new MCP tool cancel_interview_turn. Queued jobs cancel cleanly; running jobs cancel best-effort — the job's terminal status will be cancelled, but the agent reply MAY still appear in the interview transcript if a mid-pipeline commit landed before the cancel did.
  • Idempotent on already-cancelled jobs. Returns 409 INTERVIEW_TURN_NOT_CANCELLABLE when the job already reached completed or failed — the work landed; read the result via the existing poll endpoint.
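
A small sketch of how a caller might interpret the cancel response. The 409 status and the INTERVIEW_TURN_NOT_CANCELLABLE code are from the note; the assumption here is that the error code arrives in a "code" field of the JSON body, which these notes do not confirm.

```python
def interpret_cancel(http_status: int, body: dict) -> str:
    """Map a cancel response to the caller's next step."""
    # Assumption: the error code is carried in body["code"].
    if http_status == 409 and body.get("code") == "INTERVIEW_TURN_NOT_CANCELLABLE":
        return "read-result"  # the job already completed or failed; poll for it
    if 200 <= http_status < 300:
        return "cancelled"  # includes repeat calls, since cancel is idempotent
    return "error"
```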

Marketing site & api-docs polish — skip-link fixes, metadata, mobile rendering

v0.17 · 2026-05-18

A focused pass on the public-facing surfaces ahead of v1 promotion — SEO metadata, accessibility, and mobile rendering across the marketing site, status pages, and api-docs reference.

What's new for you

  • Skip-to-content links and section TOC anchors are now route-qualified everywhere — pressing Tab on a non-root page now jumps to the page's main content instead of being silently hijacked to the homepage by <base href="/">. Affects every non-root marketing route.
  • SEO metadata closed across the homepage and status sub-pages — route-specific meta descriptions, canonical URLs, visible h1s on /status/uptime and /status/history.
  • /api-docs/reference/ no longer overflows horizontally on narrow viewports — document-level max-width: 100vw clamp added to the Scalar reference mount.
  • The release-notes back-to-top pill now scrolls to the Contents heading instead of doing nothing (was a stale empty fragment).
  • Public OpenAPI document slimmed — an internal-only webhook callback is no longer listed (it was never a customer-integrator binding target). Narrows the documented surface without removing any caller-relevant route.

Package consistency checks catch cross-doc contradictions before delivery

v0.16 · 2026-05-17

A new consistency-checking stage runs across the assembled package immediately before delivery. Each check looks for a class of contradiction across the package’s docs — project-name drift, missing referenced files, requirement-identifier orphans, schema feasibility against acceptance criteria, architecture decisions that conflict across docs, storage assumed in-memory while requirements imply durable state, stale traceability matrix, JSON field-naming conventions that disagree between API design and acceptance criteria, and review-report freshness.

What's new for you

  • Packages now ship with a consistency_findings array in the manifest plus a banner in handoff.md when Critical-severity contradictions are detected. The AI coder reading the package gets a flagged punch list instead of discovering the contradiction at build time.
  • Catches the contradictions that previously made retest packages fail at the wire boundary: project-name drift across docs, missing referenced files, FR / NFR / AC identifier orphans, schema feasibility against AC, architecture-decision conflicts (JWT vs. session cookies, Postgres vs. MongoDB, etc.), storage-durability vs. stateful-requirement mismatches, stale traceability matrix, JSON naming-convention mixups, and empty / placeholder review reports.
  • Severity stratified — Critical bubbles to the handoff banner; High and Medium live in the manifest for awareness without blocking delivery.

Interview pipeline hardening — idempotency, async submission, completion auto-handoff

v0.16 · 2026-05-17

The interview submission path picked up four classes of hardening: client-driven idempotency on the wire, an opt-in async mode for long turns, automatic generation handoff on interview completion, and graceful shutdown so a deploy mid-turn doesn’t strand work.

What's new for you

  • submit_interview_turn accepts a client_request_id — retry a request and the server returns the original response instead of re-running the model. Same shape on REST and MCP.
  • Pass mode: "async" to submit_interview_turn for an immediate job-id response when you don’t want to block the caller on a long turn. Poll status via the new get_interview_turn_status tool.
  • MCP unknown-argument errors return a structured envelope (typed code + offending key + suggestion) instead of a flat string.
  • When the interview finishes, the generation auto-starts and the response carries a next_action field pointing the agent at the running generation. No more “interview done, now what?” deadlock.
  • Deploys mid-turn no longer strand interview work — queued and in-flight rows rewind cleanly on graceful shutdown so the next revision picks them up.
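
The client_request_id behaviour follows the usual idempotency-key pattern. The sketch below shows the idea with an in-memory store; the class name, the run_model callable, and the storage choice are all assumptions, and a real server would persist keys durably and handle concurrent duplicates.

```python
from typing import Callable


class IdempotentTurns:
    """Replay the stored response when a client_request_id is seen again."""

    def __init__(self, run_model: Callable[[str], dict]) -> None:
        self._run_model = run_model
        self._responses: dict[str, dict] = {}

    def submit(self, client_request_id: str, message: str) -> dict:
        if client_request_id in self._responses:
            # Retry of a known request: return the original response, no model call.
            return self._responses[client_request_id]
        response = self._run_model(message)
        self._responses[client_request_id] = response
        return response
```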

Feedback workflow maturity — recurrence threading, terminal-state notifications, four new rubric templates

v0.16 · 2026-05-17

The Feedback aggregate shipped in v0.15; this series matured the workflow around it — threading recurrences when a resolved issue comes back, notifying submitters when their row reaches a terminal state, having the server emit recommendation envelopes that AI agents can auto-file against, and adding rubric templates for the four feedback categories that the standing review called out.

What's new for you

  • When you file feedback or a bug report and a server-side fix lands, you get a notification when the row reaches Resolved / Won’t-fix.
  • If a resolved feedback comes back, file a new one with recurrence_of_feedback_id (or recurrence_of_bug_report_id) and the new row threads to the original — reviewers see the full history instead of starting over.
  • AI agents calling MCP tools that detect quality drag receive a feedback_recommendation envelope alongside the response, with a recommendation_token the agent can pass into submit_feedback. Repeat tokens within 30 days bump an occurrence counter on the existing row instead of filing a duplicate.
  • Four new feedback rubric templates ship alongside the existing end-to-end one: interview-quality, package-buildability, api-doc-quality, tooling-experience. Pick the one that matches the scope of your feedback — narrower rubrics keep the signal cleaner than the all-in-one.
  • New api_doc_quality feedback type pairs with the api-doc-quality rubric so feedback on the /api-docs/* surface has its own bucket.
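
The repeat-token rule can be illustrated with a small in-memory sketch. The 30-day window is from the note; measuring it from the first filing, the row shape, and the function name are assumptions.

```python
from datetime import datetime, timedelta

DEDUPE_WINDOW = timedelta(days=30)


def file_recommendation(rows: dict, token: str, now: datetime) -> dict:
    """Bump the existing row for a repeated token, or file a new row."""
    row = rows.get(token)
    # Assumption: the window is measured from when the row was first filed.
    if row is not None and now - row["first_filed_at"] <= DEDUPE_WINDOW:
        row["occurrences"] += 1
        return row
    row = {"token": token, "first_filed_at": now, "occurrences": 1}
    rows[token] = row
    return row
```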

Privacy-conscious visitor analytics on the marketing site

v0.16 · 2026-05-17

The marketing site picked up a privacy-conscious analytics pipeline so we can see what’s working on the public surface without tracking individual visitors across days.

What's new for you

  • First-visit / repeat-visit / signup-conversion analytics with no third-party tracker, no cross-day correlation, and country-level geo attribution. DNT: 1 is honored end-to-end.
  • The Privacy Policy §1.4 documents the analytics collection — a daily-rotating salt over hash(IP, user-agent) is the visitor identity; the salt is deleted 48 hours after rotation, so no database snapshot can be re-correlated with a fresh observation.
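
To show why a rotating salt prevents cross-day correlation, here is a minimal sketch. The exact construction SpecStep uses is not stated; keyed HMAC-SHA256 over the IP and user-agent is an assumption chosen for illustration, and visitor_id is a hypothetical name.

```python
import hashlib
import hmac


def visitor_id(ip: str, user_agent: str, daily_salt: bytes) -> str:
    """Derive a per-day visitor identifier; assumption: HMAC-SHA256 keyed by the salt."""
    message = f"{ip}|{user_agent}".encode("utf-8")
    return hmac.new(daily_salt, message, hashlib.sha256).hexdigest()
```

The same visitor maps to the same identifier within a day and to an unrelated one after rotation. Once the old salt is deleted, stored identifiers can no longer be recomputed from a fresh IP and user-agent pair.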

MCP wait_for_generation picks up progress and cost forecast

v0.16 · 2026-05-17

wait_for_generation is the recommended polling primitive but until now omitted the progress_percent and cost-forecast fields that get_generation returned. MCP clients had to call both tools to render a single progress screen. Closed in this series.

What's new for you

  • MCP clients polling a generation via wait_for_generation now get progress_percent (0–100), current_round, and the historical-median cost forecast (estimated_total_usd plus p25 / p75 / sample size) on every response.
  • Strictly additive — clients ignoring undocumented fields keep working.

Outbound email branding — SpecStep <donotreply@specstep.com>

v0.16 · 2026-05-17

Transactional emails now ship under the verified SpecStep custom-domain sender instead of a generic default address.

What's new for you

  • Transactional emails show SpecStep <donotreply@specstep.com> as the sender across every send surface (verification, password reset, support routing, retention warnings, terminal-state notifications). DNS-aligned, SPF / DKIM-verified.

v0.16 polish & infrastructure

v0.16 · 2026-05-17

Cross-cutting polish that doesn’t belong in a single themed entry.

What's new for you

  • Twelve custom monochrome rail icons replace the prior set across the left-rail navigation — consistent stroke weight, no-slope discipline, orthogonal glyphs for each major surface.
  • Layout cleanup: redundant section-level page titles dropped from a few internal layouts so the page title isn’t rendered twice.
  • The intake-extraction tool ships a tighter input schema so the model can’t produce shapes the validator would reject — closes an interview-side feedback recurrence.

Submit quality feedback on interviews, packages, and end-to-end runs

v0.15 · 2026-05-16

A new Feedback surface joins Bug Reports. Bug Reports captures broken behavior; Feedback captures quality reviews — was the interview good, is the generated package coherent, what's the build confidence for the end-to-end run.

What's new for you

  • New /feedback pages — submit a quality review on any interview, generated package, or end-to-end run from inside the app. Pick a target, score build confidence on a 0–10 scale, and narrate what worked and what didn't.
  • New submit_feedback MCP tool — same submission shape for AI clients. Kept distinct from submit_bug_report so quality signal stays separate from broken-behavior signal.
  • Built-in feedback templates ship out of the box for interview-rated, package-rated, and end-to-end-rated submissions.

Interview and package quality pass

v0.14 · 2026-05-16

A focused sweep on the agent-experience surface — tighter interview triggers, better project-name handling, clearer MCP errors, and a series of package-quality prompt tightenings.

What's new for you

  • The interview triggers child-safety questions when the discussed domain implicates minors; previously the trigger missed several common phrasings. The follow-up rating prompt fires earlier in the interview and the phase-progression check is stricter so the interview doesn't advance without the required signal.
  • Project-name extraction picks a canonical name even when the user gives a contradictory short name vs. long name — manifests, package zips, and emitted artifacts all agree on the same project name.
  • MCP tools surface unknown-argument typos instead of silently dropping them — misspell an argument name and the tool returns a structured error pointing at the offending key.
  • Generation packages ship with a decision-log stub and per-agent cost data, so downstream consumers can audit what each agent contributed.
  • Workspace-card progress numbers (cost, phase, elapsed) refine at request time so they're accurate instead of cached-stale.

Reliable cancellation and smoother first-render

v0.14 · 2026-05-16

Two reliability threads closed in the same week: cancel-button behavior on long-running generations, and a series of first-render crashes on interactive workspace and generation components.

What's new for you

  • Cancelling a generation now reliably propagates to the agent — the cancel button confirms with a spinner, the workspace card flips to Cancelling within seconds, and the agent halts at the next clean phase boundary instead of running to completion.
  • Workspace, Generation detail, and Animated cost components no longer flash an error on first render — the interactive bits now wait until the page is ready to handle them.

Privacy and retention SLA closure

v0.14 · 2026-05-16

The two outstanding privacy follow-ups from the legal-page refresh closed end-to-end. The policy text that previously hedged can now say what it means.

What's new for you

  • When you delete your account, identity-bearing fields in audit records are scrubbed at deletion time, not just identity-replaced.
  • When you hard-delete a generation or package from the Recycle Bin, the underlying file storage is purged immediately.
  • A nightly sweep cleans up anything that escapes the immediate purge, backing the 24-hour retention SLA in the Privacy Policy.

Pro tier “Coming soon”

v0.14 · 2026-05-16

Payment processing isn't yet enabled in production; the Pro tier is gated behind a “Coming soon” placeholder on the pricing surfaces until it is.

What's new for you

  • The Pricing page shows “Coming soon” in place of a Pro price. Free tier remains live and immediately subscribable.
  • Pro checkout is blocked at the system level until payment processing is enabled — no half-completed sessions.

v0.14 polish & infrastructure (continued)

v0.14 → v0.15 · 2026-05-16

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Consistent topbar across marketing and signed-in pages

v0.14 · 2026-05-15 → 16

The landing-page topbar and the in-app topbar drifted on four axes — font, brand-lockup size, redundant navigation links, and a noisy tier chip. Settled in two passes.

What's new for you

  • Identical topbar across landing and signed-in pages — same font, same brand-lockup size, same persona menu.
  • Click your name to open your workspace; the dropdown still surfaces Workspace, Settings, and Sign out as labelled choices.
  • Signed-in visitors on landing see their identity (avatar, name, notification bell) and the preview chip — the marketing shell no longer looks logged-out when you're logged in.
  • The persona button no longer shows the plan label next to your name; plan is one hover away in the dropdown.

Six new MCP tools — structured findings, change-request files, pre-flight validation, content diff

v0.14 · 2026-05-16

The MCP catalog grew from 50 to 56 tools in a single shipping pass that closed every deferred entry from the original MCP-additions plan.

What's new for you

  • Structured security and quality findings. get_security_findings and get_generation_quality_report return per-finding severity, topic, and title for the security, reliability, accessibility, cost, and risk reviews. Branch on max_severity instead of parsing the markdown.
  • File-level addendum access. list_change_request_files and get_change_request_file read individual files inside a change-request addendum zip without the fetch-then-unzip dance.
  • Pre-flight validation. validate_generation_request dry-runs a kickoff — same checks as start_generation, no enqueue. Returns the same error codes the live tool throws so you can branch before paying for the call.
  • Real content diff across packages. diff_package_files emits line-level unified diff across same-named files in 2–5 packages. Sister of compare_packages; returns actual delta, not byte counts.
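
For a feel of what a line-level unified diff across same-named files involves, here is a sketch built on Python's standard difflib. It is not the diff_package_files implementation: the input shape (package name mapped to a path-to-text dict) and the choice to diff every later package against the first are assumptions.

```python
import difflib


def diff_same_named_files(packages: dict[str, dict[str, str]]) -> dict[str, str]:
    """Unified diffs for files that share a path, each package against the first."""
    if not 2 <= len(packages) <= 5:
        raise ValueError("expected 2 to 5 packages")
    names = list(packages)
    base_name, base = names[0], packages[names[0]]
    diffs: dict[str, str] = {}
    for other_name in names[1:]:
        other = packages[other_name]
        for path in sorted(base.keys() & other.keys()):
            lines = list(difflib.unified_diff(
                base[path].splitlines(keepends=True),
                other[path].splitlines(keepends=True),
                fromfile=f"{base_name}/{path}",
                tofile=f"{other_name}/{path}",
            ))
            if lines:  # identical files produce no diff
                diffs[f"{other_name}:{path}"] = "".join(lines)
    return diffs
```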

v0.14 · 2026-05-16

Both legal pages refreshed to cover features that shipped since the last revision, plus parent-corporation disclosure and a stronger international transfer mechanism.

What's new for you

  • Privacy Policy — new disclosures for External Connectors, MCP OAuth tokens, bring-your-own LLM keys, outbound webhooks, source-control delivery, profile photos, Recycle Bin auto-purge, account deletion cascade, data export, and SMS notifications.
  • Privacy Policy §9 now bases the international transfer mechanism on the EU-US Data Privacy Framework with UK Extension and Swiss DPF, replacing the earlier consent-based mechanism.
  • Terms of Service — new §8.5 covering third-party integrations and credentials (external storage, MCP OAuth, BYO LLM keys, source-control delivery, outbound webhooks); a California auto-renewal disclosure in §6; §10 rewritten to reflect 24-hour deletion processing and 7-day Data Export URLs.
  • Both pages disclose No Compromise AI as the Delaware parent corporation. Texas governing law unchanged.
  • Terms version bumped — authenticated users are prompted to re-accept on next visit. API and MCP OAuth callers receive a structured re-acceptance response until acceptance is recorded.

External Connectors v1 — SharePoint live, browser-based attach from MCP, Google Drive promoted

v0.14 · 2026-05-15

External Connectors crossed the v1 launch bar — SharePoint joined OneDrive and Google Drive as a live provider, MCP clients can start a folder attach without leaving the agent, and new MCP clients self-register without manual setup.

What's new for you

  • SharePoint connector ships as the third live provider alongside OneDrive and Google Drive. Connect a SharePoint site during the interview; reference documents flow in the same way.
  • attach_external_folder MCP tool returns a one-time browser URL; you open it, complete OAuth + folder pick + first sync, and your MCP client polls a sister tool until the attach completes. No copy-pasted commands.
  • Google Drive promoted from “Coming soon” to live across pricing, FAQ, About, and API docs.
  • MCP clients can self-register at POST /oauth/register. Existing MCP clients keep working under their previous registration.
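
The attach flow above has the MCP client poll a sister tool until the browser steps finish. A minimal polling sketch follows. It assumes the status check is wrapped in a callable and that the status strings are "pending", "completed", and "failed"; those names are hypothetical, as is wait_for_attach itself.

```python
import time
from typing import Callable


def wait_for_attach(
    check_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll check_status until it reports a terminal state or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = check_status()
        if status in ("completed", "failed"):  # assumed terminal states
            return status
        if clock() >= deadline:
            raise TimeoutError("attach did not finish before the timeout")
        sleep(interval_s)


# Usage with a stand-in status source instead of a real MCP call.
_states = iter(["pending", "pending", "completed"])
print(wait_for_attach(lambda: next(_states), sleep=lambda _s: None))
```

Injecting sleep and clock keeps the loop testable without real waiting.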

Brand identity — public /brand page and vendor logos on integration tiles

v0.14 · 2026-05-15

The SpecStep marks had been on disk for weeks but never had a public home; integration partners and press had nowhere to grab them. That gap is now closed with a dedicated /brand page, plus vendor logos that replace emoji glyphs across integration surfaces.

What's new for you

  • New public page at /brand — logo downloads (mark, wordmark, horizontal lockup, stacked lockup with PNGs from 16px to 1024px), color tokens, typography, canonical naming, “Powered by SpecStep” live-preview badges with paste-snippet HTML, MCP-tile size recommendations, and a permissive license that says editorial and integration use are free with no permission needed.
  • Connect-a-folder modal and landing-page External Connectors section show real Microsoft / OneDrive / SharePoint / Google / Google Drive / GitHub product logos instead of emoji glyphs. Dropbox appears as a “Coming soon” tile.
  • Landing-page sign-in row treats Microsoft, Google, and GitHub as equal peers with identical button styling.

Account deletion and data export

v0.14 · 2026-05-14 → 15

Two GDPR-mandated surfaces shipped end-to-end — the right to erasure and the right to data portability.

What's new for you

  • Delete account in Settings — requests cascade-delete every owned resource (interviews, generations, packages, addendums, webhooks, external-connector credentials, API keys, audit-event identity replacement, blob deletion). Confirmation email arrives when the cascade completes.
  • Export data in Settings — requests a zip of every interview transcript, generation manifest, package, and account-metadata blob you own. Email arrives with a 7-day signed download link.
  • Both flows process within 24 hours of submission.

Consistent Save / Cancel and toast behavior across Settings

v0.14 · 2026-05-14 → 15

A focused pass on the form, save, and notification primitives that every Settings panel consumes. The user-visible result is identical save-cancel-toast behavior across every editable surface.

What's new for you

  • Every editable Settings panel uses the same Save / Cancel bar that anchors at the bottom while you're editing — no more guessing which surface uses which submit pattern.
  • Save success and failure show the same toast surface across every panel; transient errors no longer fail silently.
  • Form-section headings are consistent across Settings — same typeface, weight, and spacing on Profile, Notifications, Logs, Webhooks, Cost & Usage, and API Keys.
  • Empty states share one visual shape; the “Try refreshing” and “Try a different filter” affordances no longer drift per surface.

Workspace stat tiles — ISSUES and USAGE, plus linked filters and cost drill-down

v0.14 · 2026-05-15

The /workspace stats band gained two new tiles and every tile now navigates somewhere meaningful when clicked.

What's new for you

  • Two new tiles — ISSUES (count of generations in a failed terminal state) and USAGE (THIS MONTH) (total cost across the current calendar month).
  • IN FLIGHT and COMPLETED tiles are now anchors — clicking either jumps the table below to the matching filter, rows sorted most-recent-first.
  • Spend tiles navigate to the cost-and-usage drill-down for the matching period.

Wide-prose layout for /about, /support, and /release-notes

v0.14 · 2026-05-15

The three long-prose public pages were stuck in a 720-pixel column on desktop, leaving the right two-thirds of the screen blank. They now use the same wide layout the FAQ and API docs already use.

What's new for you

  • About, Support, and Release notes pages render full-width on desktop with a sticky right-rail navigation that tracks scroll position.
  • Page hero band hosts the title and last-updated stamp instead of stacking inline at the top of the prose.

Per-user retention preference and Recycle Bin hard-delete

v0.14 · 2026-05-14

Two retention-related surfaces. You can now set your own default retention window for new generations, and Recycle Bin gained a permanent “Delete forever” affordance for users who don't want soft-deleted rows hanging around for the standard 30-day window.

What's new for you

  • New Default retention setting in Settings — pick how long completed generations stay on your account before auto-purge (overrides the platform default for new generations only).
  • Recycle Bin has a Delete forever affordance per row — one-click hard-delete of a soft-deleted generation without waiting for the sweep.

v0.14 polish & infrastructure

v0.14 · 2026-05-14 → 16

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Pre-v1 launch polish across the entire product

v0.14 · 2026-05-14

A multi-session polish pass settled the user experience across every surface — marketing, API docs, onboarding, the generation flow, Settings, and account administration — before v1.

What's new for you

  • Marketing surfaces — pricing page cleaned up, hero accessibility and motion-calm pass, /about gains an inline jump-nav across seven sections, copy fixes on BillingSuccess and the team roster.
  • API docs — anchors corrected, enum rows restored, sub-nav reordered with a mobile-friendly TOC, code blocks colorized with a copy-button, semantic status pills on error and rate-limit tables, a dedicated page header with last-updated stamp.
  • Onboarding and Interview — clearer error and OAuth-consent copy, humanized timestamps on the Status page, first-load orientation on the Interview page, composer flatten plus legal-ack “Later” and profile-compare scroll and decision-table visibility, a tidier modal experience with Escape returning focus correctly.
  • Generation and Package — Workspace renders correctly on first paint and on mobile, real permalink and a cost share-of-budget bar on Generation Detail, the real package file tree, per-row short-id demoted to a hover tooltip.
  • Settings — sidebar reorganization, copy improvements plus empty-state teaching, a renamed Data Export path, semantic pill palette across the silent-deduped comparator reveal.
  • Account administration — search and filters and sortable column headers across the high-traffic tables, a fresh role-permissions audit-trail surface, TOC anchors on Engineering release notes resolve to the same page.
  • Visual settlement — Generation Detail folds progress and cost and pipeline into a single telemetry strip; the five system pages (Error, Access denied, Account disabled, Terms acceptance, OAuth Consent) share a common shell; the permalink URL element on Generation Detail gains a visible focus ring.

Under the hood

  • Cross-cutting sweeps — display helpers retire raw enum names, missing CSS primitives added, responsive stat-tile layout, sentence-case capitalization sweep across navigation and tab labels, an emoji-to-SVG sweep, public-copy-gate extension to catch internal identifiers across components.
  • Visual and accessibility baselines refreshed after the per-batch work; new capture workflows generate baselines on CI runners so refreshes are deterministic.

Extra Usage prepaid balance

v0.13 · 2026-05-13

Pro and Team accounts can now buy a prepaid Extra Usage block — additional generation credit that drains alongside the monthly allowance.

What's new for you

  • New Extra Usage card on the Billing page shows your current balance and lets you buy a block.
  • When your monthly allowance runs out, generations draw from the Extra Usage balance instead of failing.
  • Balance and purchase receipts are visible inline on the same surface.

AI provider preferences

v0.13 · 2026-05-13

Choose which AI providers SpecStep is allowed to use for your generations.

What's new for you

  • A new Settings panel lets you opt out of specific AI providers; agents pinned to a disabled provider fall back to your chosen default.

Contact form, refreshed About + FAQ, support routing

v0.13 · 2026-05-13

The Contact page is now a form (no email address visible); support tickets route to a dedicated support inbox; About and FAQ have been refreshed for the current product, and /faq gains a right-rail table of contents matching the API docs.

What's new for you

  • /contact is a form — pick a reason (Sales / Partnership / Press / Integration / Feedback / Other), type your message, send. No email address listed for scrapers.
  • Support tickets now go to a dedicated support inbox, separate from general inquiries.
  • /about and /faq are current — cover MCP, browser sign-in for MCP clients, External Connectors (SharePoint and OneDrive live; Google Drive and Dropbox coming), Addendums, Recycle Bin, Webhooks, and the renamed review profiles.
  • /faq has a sticky right-rail table of contents that scroll-spies the active section as you scroll.

Pricing page evolution

v0.13 · 2026-05-13

The Pricing comparison grid gained an AI Providers band, a Document Management band (SharePoint and OneDrive live; Dropbox and Google Drive coming), and a Team-tier "Coming soon" overlay so the upgrade CTA doesn't fire before the tier ships.

What's new for you

  • AI Providers band shows which providers each tier may use.
  • Document Management band lists the connector lineup with status pills.
  • Team upgrade CTA shows a "Coming soon" overlay until the tier ships.
  • Per-tier alignment + readability cleanup so checkmarks and pills line up.

OAuth login fixes

v0.13 · 2026-05-13

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Right-rail TOC + page-width polish

v0.13 · 2026-05-13

The layout width on /faq and /api-docs/* is now capped so the table of contents pins to the right edge and the article fills the center.

What's new for you

  • /faq + /api-docs/* article column now fills the available width up to the table of contents.
  • The TOC rail pins to the right edge of the centered container.
  • The rail's inner scrollbar is hidden visually (wheel-scrolling still works on the longer pages).

v0.13 polish & infrastructure

v0.13 · 2026-05-13

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

OAuth sign-in for MCP clients

v0.13 · 2026-05-13

MCP clients — Claude Desktop, Claude.ai, Cursor, Codex, GitHub Copilot, Continue, Cline — can now sign you in through a browser instead of requiring a manually pasted API key.

What's new for you

  • On first connection, your MCP client opens SpecStep's authorization page in a browser; you approve with your existing account session and the client receives a 90-day token automatically.
  • API keys keep working as-is — headless and CI flows are unaffected.
  • Settings → API keys gained a “Connected MCP clients” panel where you can see active MCP sessions and revoke any of them individually.

Under the hood

  • Full OAuth 2.1 browser flow with strict redirect enforcement — no redirect to arbitrary domains.
  • Authorization codes are single-use with a short TTL, consumed atomically so replay is structurally blocked.
  • REST and MCP share the same authentication model — either credential works on either surface.
  • New discovery and authorization endpoints are published at standard well-known paths.
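
To make "single-use with a short TTL, consumed atomically" concrete, here is an in-memory sketch of one way such a store can behave. It is not SpecStep's implementation: the AuthCodeStore class, the 60-second default TTL, and the lock-based approach are all assumptions chosen for illustration.

```python
import secrets
import threading
import time


class AuthCodeStore:
    """Issues authorization codes that can be redeemed at most once before expiry."""

    def __init__(self, ttl_s: float = 60.0, clock=time.monotonic):
        self._ttl_s = ttl_s  # assumed TTL; the real value is not stated
        self._clock = clock
        self._codes = {}  # code -> (client_id, expires_at)
        self._lock = threading.Lock()

    def issue(self, client_id: str) -> str:
        code = secrets.token_urlsafe(32)
        with self._lock:
            self._codes[code] = (client_id, self._clock() + self._ttl_s)
        return code

    def consume(self, code: str, client_id: str) -> bool:
        # Lookup and removal happen under one lock, so two concurrent
        # redemptions of the same code cannot both succeed.
        with self._lock:
            entry = self._codes.pop(code, None)
        if entry is None:
            return False
        owner, expires_at = entry
        return owner == client_id and self._clock() < expires_at


store = AuthCodeStore()
code = store.issue("client-a")
print(store.consume(code, "client-a"))  # first redemption
print(store.consume(code, "client-a"))  # replay attempt
```

A production store would typically do the same pop as a single atomic database operation instead of a process-local lock.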

Per-intake cost and duration estimates

v0.13 · 2026-05-13

Profile cards on the Interview page now show cost and duration ranges grounded in your actual usage history — not hardcoded approximations.

What's new for you

  • Each profile card shows a tight cost range and duration estimate derived from rolling 30-day medians across the agents that will run for your project.
  • Estimates factor in your tier and the project attributes Otto has detected so far — so the numbers shift as the interview progresses.
  • Cards without enough history show a baseline estimate plus a “projected” label so you know it isn't history-grounded yet.
  • The compare panel below the cards draws from a profile-wide median forecast, so the side-by-side view stays current as Otto refines what he knows about your project.

Under the hood

  • Estimates recompute after every interview turn — no manual refresh needed.

Interview profile picker — compare panel and grid fix

v0.13 · 2026-05-13

Two improvements to the profile selection step: a side-by-side comparison panel and a layout fix that was pushing the Fast profile out of position.

What's new for you

  • A collapsible “Compare profiles” panel sits between the three main cards and the Researcher option — a 7-row table covering best use case, review rounds, specialists included, estimated cost and duration, and tier required.
  • The Fast, Normal, and Extensive cards now sit correctly in a three-column grid — a misplaced element was displacing Fast into the second cell and wrapping Extensive onto a second row.

Profile renames and a per-feature comparison grid

v0.13 · 2026-05-13

“Thorough” is now “Normal” and “Exhaustive” is now “Extensive” — names that map directly to what each tier actually delivers. The Pricing comparison grid is restructured into three category bands so the tier differences are scannable at a glance.

What's new for you

  • The three standard profiles are now Fast / Normal / Extensive — the progression reads as a straight scale rather than a scale with a marketing name in the middle.
  • The Pricing comparison table is rebuilt around three category bands — Review profiles, External connectors, Agents included — each with a section heading and one sub-row per item, with a checkmark in every tier column that includes it.
  • The External connectors band carries a “Free: connect + preview only” sub-note so Free users see they can still run the auto-respond magic moment even though generating from connector data requires a paid tier.
  • Checkmarks render as styled checkmarks — they were appearing as escaped HTML text before the fix.
  • If your tier doesn't support the previously selected profile default, the Interview page automatically switches your selection to Normal.

Under the hood

  • A migration renames the stored profile values — no data loss, no manual step.
  • The Agents-included band reads from the agent role catalog at render time, sorted by pipeline position so the order matches the orchestrator's flow.

External Connectors — pull reference docs from SharePoint, OneDrive, and Google Drive

v0.13 · 2026-05-12 → 13

Connect a SharePoint site, a OneDrive folder, or a Google Drive folder to an interview and Otto summarizes the contents and feeds them into your spec as reference documents — without you typing a summary yourself.

What's new for you

  • Connect a folder or site during the interview; Otto runs a background summarization pass and posts a follow-up turn with what he found — no manual copy-paste.
  • If the background call fails, Otto posts a recovery turn explaining what happened and what to try next.
  • Attached Files shows a “via SharePoint,” “via OneDrive,” or “via Google Drive” badge on any file that came from a connector.
  • Generation Detail shows a “Used N references from [Provider]” chip so you can see what each connector contributed — available on REST and MCP as well.
  • AI agents now read scanned PDFs natively without pre-processing.
  • Dropbox is coming next.
  • Free accounts can connect a folder and watch Otto summarize it; generating a spec that uses connector-sourced references requires Pro or Team.

Under the hood

  • Provider adapters live behind a common connector abstraction so additional providers (Dropbox, others) plug in cleanly.
  • The OAuth flow for each provider is handled through dedicated REST endpoints; connector credentials are stored per-workspace, not per-user.
  • The premium gate is enforced at generation time: Free users see the summarization step, but the generation call is blocked with a clear tier explanation before any provider cost is incurred.

MCP-native positioning on the marketing site

v0.13 · 2026-05-13

The landing page now leads with SpecStep's programmable surface — a new hero strip and a Tools section that shows the specific tools AI coders call through MCP.

What's new for you

  • The hero now includes an MCP-native callout strip above the fold — alongside REST + OpenAPI, Webhooks, and “Same key everywhere” — so the programmable platform story is visible before you scroll.
  • A new Tools section shows the categories of tools available through MCP, giving AI coders a concrete sense of what they can automate.
  • The sitemap is now generated dynamically and pings search indexes after each deploy so crawlers pick up changes faster.

Under the hood

  • Marketing HTML is served with cache headers that prevent crawlers from serving a stale deploy to users following a link.

Webhook subscriptions land in Settings

v0.12 · 2026-05-12

Manage your webhook subscriptions directly from the browser — no more calling the REST API to register an endpoint. New Settings → Webhooks tab with an API-key picker, an inline create form, and per-row Test / Rotate / Delete actions.

What's new for you

  • New Settings tab lists every subscription registered against the API key you pick — URL, subscribed events, last delivery status + HTTP code, and a “needs rotation” warning if the signing secret needs to be refreshed.
  • Test fires a synthetic event against the destination and renders the live outcome inline — delivery time and HTTP status, or the failure reason if it didn't land.
  • Rotate issues a fresh signing secret, shown once with the same copy + “I've copied it” gate the API-key flow uses. The reveal modal includes a “How to verify the signature” expander covering the header name, the HMAC algorithm, and constant-time comparison.
  • Delete takes a single confirm and removes the subscription — future events for that endpoint stop firing immediately.
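
For receivers wiring up verification, the expander's three ingredients (header, HMAC, constant-time comparison) fit together roughly as sketched below. The header name and exact algorithm are shown in the reveal modal, not here, so this example assumes HMAC-SHA256 over the raw request body with a hex-encoded signature. Treat those details and the function names as placeholders.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


secret = b"example-signing-secret"
body = b'{"event": "generation.completed"}'  # illustrative payload
signature = sign(secret, body)
print(verify_signature(secret, body, signature))
print(verify_signature(secret, body + b" ", signature))
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization.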

Under the hood

  • Webhook create, list, delete, and rotate logic flows through a single Application service so the REST endpoints, MCP tools, and the new Settings page all share one path.

Generation detail polish — readable timestamps and cleaner now-playing

v0.12 · 2026-05-12

Small UI fixes on the generation detail and Interview pages based on user feedback. Timestamps switched from 24-hour to 12-hour with am/pm; the conversation feed's now-playing peek no longer looks like cards are stacking up underneath the active one; the Interview side panel reorders to keep the AI Team visible above the file list.

What's new for you

  • Generation header timestamps read “May 12 · 8:10pm” instead of “May 12 · 20:10”; the right-rail Started / Completed dates read “5/12/26 8:10pm” instead of “2026-05-12 20:10”.
  • The conversation feed bubble's per-turn timestamp also flips to 12-hour lowercase am/pm.
  • The conversation feed's now-playing peek card leans to the bottom-left and auto-fades after ~1.5 seconds so previous activity stops visibly stacking under the active card.
  • Interview side panel: AI team now sits above Attached files so the file list doesn't push the AI team out of view.

Independent security review — full sweep

v0.12 · 2026-05-12

An independent security review covered the codebase end-to-end. Authorization, webhook validation, and secret handling were hardened; the headline user-visible improvement is faster session revocation.

What's new for you

  • Revoking a sign-in session now takes effect within about 30 seconds, not on next sign-in.
  • API keys now carry per-key scopes — create a key scoped to the permissions you want and the server enforces the scope on every call. The Settings UI gained a scope picker so you can preview the scope set before issuing.
  • If there are terms-of-service updates you haven't accepted, you'll see the acceptance prompt the next time you sign in; a stale cookie no longer bypasses it silently.
  • REST, MCP, and webhook error fields no longer surface raw provider exception messages — you get a sanitized, category-derived string instead.

MCP surface expansion — 11 new tools

v0.12 · 2026-05-12

A surface review identified 11 high-value tools missing from the MCP surface. All 11 shipped, bringing the MCP tool count from 36 to 47.

What's new for you

  • Read change-request addenda from MCP — list_change_requests + get_change_request return the full addendum metadata plus a fresh download URL.
  • Pull a single intake artifact's full payload via get_intake_artifact — the list tool returns metadata only; the new singular tool returns full intake artifact details for an in-flight or completed run.
  • Estimate a generation's cost before kicking it off via estimate_generation_cost(profile) and an addendum's cost via estimate_change_request_cost — both return the rolling 30-day median and p25/p75 from your historical runs.
  • List packages scoped to one generation via list_packages_for_generation(generation_id) — sister to the existing user-scoped list.
  • Manage webhook subscriptions from MCP — list_my_webhooks, create_webhook, delete_webhook, rotate_webhook_secret, and test_webhook which fires a synthetic event and returns the live delivery outcome.
  • Compare your own packages from MCP via compare_packages — returns identity verdict, per-package report, and cross-package report.
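
The estimate tools above report a rolling 30-day median with p25/p75. The sketch below shows one way to compute those three numbers from a list of dated run costs. The input shape, the estimate_from_history name, and the choice of the "inclusive" quantile method are assumptions; the service may compute its percentiles differently.

```python
from datetime import date, timedelta
from statistics import median, quantiles


def estimate_from_history(runs, today, window_days=30):
    """Return median/p25/p75 of costs within the window, or None if history is thin.

    runs is an iterable of (run_date, cost) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    costs = [cost for run_date, cost in runs if cutoff <= run_date <= today]
    if len(costs) < 2:
        return None  # not enough history to quote a range
    p25, _, p75 = quantiles(costs, n=4, method="inclusive")
    return {"median": median(costs), "p25": p25, "p75": p75}


history = [
    (date(2026, 5, 1), 1.0),
    (date(2026, 5, 2), 2.0),
    (date(2026, 5, 3), 3.0),
    (date(2026, 5, 4), 4.0),
    (date(2026, 5, 5), 5.0),
    (date(2026, 3, 1), 100.0),  # outside the 30-day window, ignored
]
print(estimate_from_history(history, today=date(2026, 5, 12)))
```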

Performance pass

v0.12 · 2026-05-11 → 12

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Reliability fixes

v0.12 · 2026-05-11

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

v0.12 polish & infrastructure

v0.12 · 2026-05-11 → 12

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Mid-flight recovery covers every pipeline state

v0.11 · 2026-05-10

A generation interrupted by a transient infrastructure issue now resumes cleanly from any pipeline state, not just the original three.

What's new for you

  • A generation that fails or restarts mid-flight resumes from wherever it was — no lost progress, no double-billing for work that already landed.
  • GitHub delivery retries are idempotent — a retry after a transient interruption won't push the same commit twice if the first attempt landed but the response was lost.
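
The "landed but the response was lost" case is the classic argument for idempotent retries. The sketch below illustrates the general idea with a stand-in remote and a content-derived key; it does not describe how SpecStep talks to GitHub. FakeRemote, idempotency_key, and deliver are hypothetical names.

```python
import hashlib


def idempotency_key(generation_id: str, content: bytes) -> str:
    """Derive a stable key so the same delivery always maps to the same write."""
    return f"{generation_id}:{hashlib.sha256(content).hexdigest()}"


class FakeRemote:
    """Stand-in for a source-control host that can drop a response after writing."""

    def __init__(self):
        self.applied = {}  # key -> content actually stored
        self.lose_next_response = False

    def push(self, key: str, content: bytes) -> str:
        self.applied.setdefault(key, content)  # first write wins; repeats are no-ops
        if self.lose_next_response:
            self.lose_next_response = False
            raise ConnectionError("response lost after the write landed")
        return key


def deliver(remote: FakeRemote, generation_id: str, content: bytes, attempts: int = 3) -> str:
    key = idempotency_key(generation_id, content)
    for attempt in range(attempts):
        try:
            return remote.push(key, content)
        except ConnectionError:
            if attempt == attempts - 1:
                raise


remote = FakeRemote()
remote.lose_next_response = True
deliver(remote, "gen-1", b"spec contents")
print(len(remote.applied))  # number of distinct writes recorded
```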

REST auth fixed and GitHub integration coverage

v0.11 · 2026-05-10

Two long-overdue cleanups: REST callers now get a proper JSON 401/403 on auth failure instead of an HTML redirect, and GitHub integration reliability improved with expanded automated coverage.

What's new for you

  • REST API callers — CLI, MCP clients, programmatic integrations — get a 401 or 403 with a JSON body on auth failure instead of a redirect to an HTML page. Your client can now tell “wrong credentials” from “session expired” without parsing HTML.
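
A client-side sketch of what this enables: deciding what to do from the status code and JSON body alone. The body schema is not specified in these notes, so the "error" field and its values ("session_expired", "invalid_credentials") are invented for illustration, as is classify_auth_failure.

```python
import json

# Hypothetical error codes; check the API docs for the real body shape.
SIGN_IN_AGAIN_CODES = {"session_expired"}


def classify_auth_failure(status: int, body: str) -> str:
    """Map an auth failure response to a client-side action."""
    if status not in (401, 403):
        return "not_an_auth_failure"
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "unexpected_body"  # for example, an HTML page from a proxy
    error = payload.get("error") if isinstance(payload, dict) else None
    if status == 403:
        return "forbidden"
    if error in SIGN_IN_AGAIN_CODES:
        return "sign_in_again"
    return "check_credentials"


print(classify_auth_failure(401, '{"error": "session_expired"}'))
print(classify_auth_failure(401, '{"error": "invalid_credentials"}'))
```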

Test-automation reviewer pass

v0.11 · 2026-05-10

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Addenda are first-class generation rows

v0.11 · 2026-05-10

Addenda now run through the same dispatcher as full generations, with full lease, heartbeat, and sweep coverage alongside them.

What's new for you

  • Addendum POST returns the addendum's full shape — ID, download URL, cost — once the worker finishes. No more polling the page for status.
  • Expand a package row in the workspace to see all addenda attached to it without opening the detail page.
  • MCP list and search calls now show only the calling actor's own resources.

Intake-name extraction fixed and visual baselines committed

v0.11 · 2026-05-10

Two long-deferred items closed: new generations name themselves correctly from intake data, and visual-baseline end-to-end tests went from “feature exists but no baselines committed” to “18 baselines reviewed, committed, and gated in CI.”

What's new for you

  • New generations pick up their name from the intake — project names show through cleanly instead of falling back to “(unnamed).” Existing rows still show the old name; the rename pencil on the detail page handles those.

v0.11 polish & infrastructure

v0.11 · 2026-05-10

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Workspace tables get pagination, sort, and UX cleanup

v0.10 · 2026-05-09

The Generations and Packages tables in the workspace are now paginated and sortable. A batch of smaller paper cuts — broken date format, mismatched icon, missing project name in toast headlines — got fixed in the same pass.

What's new for you

  • Both tables paginate — default 5 rows per page, dropdown for 10, 25, 50, 100, or All; your choice persists across reloads.
  • Click any column header to sort ascending, descending, or off; the active column shows a directional glyph.
  • Inline rename on the generation detail page — click the pencil next to the title to rename in place, matching the workspace row pattern.
  • Failed-generation toast renders in red instead of the same color as a completed run, and the headline leads with the project name.
  • The package “Created” column reads in a human date format instead of ISO.
  • “Request a change” button in the packages row matches the Download and Delete chrome — a single icon button.
  • Detailed agent narration restored in the in-flight feed — specific rationale and confidence scores, not just a summary verb.
  • Package cost on new runs now reflects the real per-agent invocation sum; it previously showed zero.

Security review sweep — across every project

v0.10 · 2026-05-09

Independent security review completed; authorization, webhook validation, and secret handling hardened across every project in the codebase.

Reliability and notifications backlog burn-down

v0.10 · 2026-05-09

A focused pass closed 30+ backlog items, mostly in the notifications, source-control, and authorization layers.

What's new for you

  • Notifications are exactly-once on retried webhook events — duplicate event IDs no longer fire a second inbox row.
  • A failed notification can retry after a transient channel failure — the orchestrator no longer locks a failed row permanently.

Addendum reliability and GitHub adapter coverage

v0.10 · 2026-05-09

A second-pass review tackled specific reliability and correctness items — largest themes: addendum reliability, GitHub adapter test coverage, and aggregate cost on the package wire shape.

What's new for you

  • The package now shows an aggregate cost covering the original generation plus every addendum — you can see what the package actually cost end-to-end.
  • Addendum markdown now renders correctly; headings no longer appear as raw text on the page.
  • The recommender retries with an explicit “you missed these required fields” callout when its first response is incomplete.

v0.10 polish & infrastructure

v0.10 · 2026-05-08

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Change addendums for completed packages

v0.9 · 2026-05-08

Filing a focused change against a completed package no longer requires a full re-generation. Addendums are a single-LLM-call flow that produces a 5-section markdown bundle.

What's new for you

  • Every completed package row in the workspace has a “Request a change” button. Pick the mode (addendum or full re-generation), describe the change, submit.
  • Addendums produce a 5-file zip — background, change requirement, implementation guide, test plan, and a decision-log entry — attached as a sibling artifact, no version bump.
  • The generation detail page gained a “Change addenda” section listing every addendum filed against the package, newest first, with per-row download.
  • The workspace package row shows an addendum count next to the version when at least one addendum is attached.
  • The notification bell surfaces “Change addendum ready” on completion, linking directly to the parent generation.

Under the hood

  • Three new REST endpoints for addenda — create, list, and per-addendum download — plus a new MCP tool request_change mirroring the same shape.

Clickwrap terms acceptance and AI-output disclaimers

v0.9 · 2026-05-08

Sign-in now gates on an explicit terms-acceptance checkbox, and the terms themselves gained new sections covering AI-generated output and preview-edition expectations.

What's new for you

  • The landing-page sign-in card requires an “I agree to the Terms and Privacy Policy” checkbox before the OAuth buttons become active.
  • Three new terms sections cover AI-output limitations and your validation responsibility, preview-edition expectations (no SLA, data may move), and warranty and liability disclosures.
  • Users who hit a permission-gated route now see a real access-denied page instead of a broken 404.

Conversation feed — now-playing stack with diff chips

v0.9 · 2026-05-08

The live agent feed during a generation is now a single “now-playing” card stack instead of a scrolling list. Every action shows a +N/−N chip telling you exactly how much content the agent produced or removed.

What's new for you

  • While a generation runs, the feed centers the active agent's card and fades the previous one behind it — no more rapid-fire scrolling that was hard to follow.
  • Each completed action carries a green +N / red −N chip showing lines added or removed.
  • An expander shows the full prior history when you want it; the default view stays focused on what's happening now.
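
The +N / −N chip is a count of lines added and removed. One way to derive such counts from before-and-after text is sketched below with Python's difflib. How SpecStep computes its chips is not stated, so count_changes is an illustrative assumption.

```python
import difflib


def count_changes(before: str, after: str) -> tuple:
    """Return (lines_added, lines_removed) between two versions of a text."""
    added = removed = 0
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            removed += 1
        # Lines starting with "  " are unchanged; "? " lines are hints only.
    return added, removed


before = "intro\nold requirement\nsummary"
after = "intro\nsummary\nnew requirement\nacceptance criteria"
added, removed = count_changes(before, after)
print(f"+{added} / -{removed}")
```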

v0.9 · 2026-05-08

A search box above the workspace package list now searches inside every package you own in one query — file contents, not just project names.

What's new for you

  • The new search field returns ranked file hits across every package; each hit shows a highlighted snippet with matched terms.
  • Quoted phrases, OR alternation, and term exclusion all work — standard web-search syntax.
  • Per-package search and cross-package search are both available as REST endpoints and as MCP tools, so agents can find a package by content in one round trip.
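
To show how the three operators combine, the sketch below parses a query into required OR-groups and excluded terms. It reflects one plausible reading of "standard web-search syntax" (double quotes for phrases, uppercase OR between alternatives, a leading hyphen for exclusion). The parse_query function and its output shape are assumptions, not the service's parser.

```python
import re

# A quoted phrase (optionally negated), or any run of non-space characters.
TOKEN = re.compile(r'(-?)"([^"]+)"|(\S+)')


def parse_query(query: str) -> dict:
    """Split a query into required OR-groups and excluded terms."""
    required = []  # each entry is a list of alternatives joined by OR
    excluded = []
    join_next = False
    for match in TOKEN.finditer(query):
        negated, phrase, word = match.group(1), match.group(2), match.group(3)
        if word == "OR":
            join_next = bool(required)  # a leading OR has nothing to join
            continue
        if phrase is None and word.startswith("-") and len(word) > 1:
            negated, term = "-", word[1:]
        else:
            term = phrase if phrase is not None else word
        if negated:
            excluded.append(term)
            join_next = False
        elif join_next:
            required[-1].append(term)
            join_next = False
        else:
            required.append([term])
    return {"required": required, "excluded": excluded}


print(parse_query('"rate limit" retry OR backoff -draft'))
```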

MCP gap closure — lifecycle, files, capabilities, intake artifacts

v0.9 · 2026-05-08

Eight new MCP tools close the recovery and discovery gaps for AI agents driving SpecStep — lifecycle controls, per-file access, capabilities discovery, and intake-artifact listing.

What's new for you

  • cancel_generation, retry_generation, pause_generation, resume_generation — an agent observing a stuck or runaway run can bail out cleanly without re-deriving the intake.
  • list_package_files + get_package_file — inspect package structure and read individual files without downloading the full zip.
  • get_capabilities — discover valid review-profile names, project types, and schema versions before constructing a kickoff.
  • list_intake_artifacts — find ready-to-generate intakes without filtering inline.
  • update_generation_name + get_latest_package_for_generation — small metadata tools that close the agent-side ergonomics gap.

Auto-filed bug reports with diagnostic context

v0.9 · 2026-05-08

When a generation fails for a platform reason, SpecStep now auto-files a bug report with diagnostic context — no waiting for a user to notice and report.

What's new for you

  • System-detected failed runs can now auto-file a bug report for the team.
  • The failure card now says “We've automatically reported this to our team — you don't need to file a separate ticket,” so users aren't left wondering whether anyone knows.

v0.9 polish & infrastructure

v0.9 · 2026-05-07

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Lyra HTML mockups in chat and packages

v0.8 · 2026-05-07

Lyra now sketches HTML/CSS mockups inline in the interview chat and drops them into your final package zip — visual scaffolding you can open in a browser, share, and iterate from.

What's new for you

  • Ask Otto for UI work and Lyra produces real HTML/CSS mockups inline — sandboxed preview, click to expand into a lightbox.
  • Download from the lightbox to save the mockup as a standalone file you can open, share, or hand to a designer.
  • Mockups also land in your generated package under the design directory so your build agent can read them alongside the spec.
  • Upload screenshots or design references during the interview — Lyra sees those images when drafting mockups, so the proposed UI matches what you uploaded.

Two more specialists — Marc and Trip

v0.8 · 2026-05-07

Two more specialists join the AI team — Marc for industry-specific context and Trip for user-journey rigor — bringing the roster from 22 to 24 agents.

What's new for you

  • Marc (Business Analyst) covers domain models, regulatory landscape, comparable products, and customer-journey patterns for industry-specific projects. Downstream architecture agents read Marc's output when picking the stack.
  • Trip (UX Researcher / Workflow Analyst) joins on user-facing work — user journeys, task flows, edge cases, empty states, and accessibility-adjacent workflow issues.
  • The marketing site now shows the live specialist count, driven from the agent catalog — future additions update everywhere automatically.

Soft-delete, restore, and a Recycle Bin

v0.8 · 2026-05-07

Every interview, generation, and package is now reversible — soft-delete from the web app, the REST API, or MCP; restore from a Recycle Bin in Settings; and a 10-second Undo toast catches the common case before you have to dig.

What's new for you

  • Delete an interview, generation, or package from its detail page — allowed in any terminal state.
  • After every delete, a corner toast appears with an Undo button for ~10 seconds — one click and the row comes back.
  • Recycle Bin under Settings → Recycle Bin lists your own soft-deleted rows across interviews, generations, and packages; Restore returns a row to your workspace.
  • Soft-delete and restore are also available via the REST API and MCP — agents can delete and restore items from the same session where they're working.

Marketing and app feel like one product

v0.8 · 2026-05-07

The signed-in app now feels like the same product as the marketing site — same wordmark, same nav character, same chrome treatment, dark mode working end-to-end.

What's new for you

  • Dark mode now works end-to-end on the marketing site — backgrounds, cards, the orchestration timeline, and the “What you get” file tree all flip cleanly.
  • The signed-in app's topbar adopted the marketing site's editorial treatment: sticky, blurred backdrop, same lockup wordmark, same nav-link style. Account chrome — notification bell, avatar, plan badge — is preserved.
  • The full marketing nav (How it works / What you get / Meet the team / Pricing / About / FAQ / API docs / Support / Contact) is present in the signed-in topbar — you no longer hit a dead end looking for marketing pages.
  • The page heading no longer shows a stray focus outline after you click somewhere — screen readers still announce navigation normally.

Landing-page content evolution

v0.8 · 2026-05-07

A round of landing-page updates — fresher copy, an expanded orchestration timeline, and full section nav.

What's new for you

  • The top nav adds How it works, What you get, and Meet the team links so visitors can jump directly to any section.
  • Hero eyebrow updated to “Experts On Demand. 24 specialists, 1 conversation.”
  • The How it works orchestration timeline now shows all 24 agents and extends the sample conversation with new turns from Marc, Trip, Codd, Atlas, and Tally.
  • The “What you get” file tree adds the new design/mockups directory that Lyra writes; the third card now mentions Lyra mockups, Merlin, and Polo.

Account Tiers and Generation Types

v0.8 · 2026-05-07

Two new administrative surfaces — one for managing which subscription plans unlock which generation profiles, and one for editing per-profile labels in real time.

What's new for you

  • Each profile's display name and description are editable in real time, so changes show up in the picker without a deploy.
  • The interview profile picker reflects label changes within about 60 seconds.

Per-profile cost in My Analytics

v0.8 · 2026-05-07

My Analytics now breaks your monthly spend down by generation profile so you can see where the dollars actually go.

What's new for you

  • New “By profile” panel on Settings → My Analytics shows your average cost per generation across Fast, Normal, Extensive, and Researcher for the trailing window.
  • The same data is available via the REST API for anyone building an external dashboard.

Preview-mode user approval gate

v0.8 · 2026-05-07

While SpecStep is in preview, new accounts can complete an interview but generation kickoff stays disabled until a SpecStep admin approves the account.

What's new for you

  • New users land in the workspace, can run interviews, and see the catalog — but the Start generation button is disabled with a “your account is pending approval” notice until an admin approves it.
  • All accounts that existed before this change were automatically approved — no disruption to existing users.

Tighter acceptance criteria

v0.8 · 2026-05-07

The Architect now drafts acceptance criteria with machine-verifiable Then clauses, and the Critic flags subjective language as blocking before the package ships.

What's new for you

  • Acceptance criteria in your generated requirements now read as Given/When/Then where each Then clause is something a test or integration check can verify — “returns 204 within 200 ms,” not “is fast.”
  • If a Then clause uses subjective language (“user-friendly,” “intuitive,” “fast”), the Critic flags it as blocking and the Architect re-drafts before the package ships.
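
The Critic's check is performed by an agent, not by a fixed word list, but the idea can be illustrated with a small sketch. The code below flags the three example terms quoted above in a Then clause; the term list and the function name subjective_terms are illustrative assumptions, not the Critic's actual rules.

```python
import re

# Only the three examples from the release note; a real check would be broader.
SUBJECTIVE_TERMS = ("user-friendly", "intuitive", "fast")

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SUBJECTIVE_TERMS) + r")\b",
    re.IGNORECASE,
)


def subjective_terms(then_clause: str) -> list[str]:
    """Return the subjective terms found in a Then clause, in order of appearance."""
    return [match.group(1).lower() for match in _PATTERN.finditer(then_clause)]


def is_blocking(then_clause: str) -> bool:
    """A clause blocks the package if it contains any subjective term."""
    return bool(subjective_terms(then_clause))
```

Word boundaries keep the check from firing on unrelated words such as "breakfast".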

Orchestrator and LLM reliability

v0.8 · 2026-05-06

Three reliability fixes cleared a class of “this used to fail mysteriously” production issues.

What's new for you

  • A generation that was in progress when the host restarted now auto-resumes from the last checkpoint instead of staying stuck — the retry-from-interview button is a fallback, not the primary path anymore.
  • Cancel and Delete on the generation details page work correctly again — a regression had been swallowing both actions silently.
  • When the primary model returns an incomplete stack recommendation, the orchestrator now retries with a secondary model before failing the run.

API docs cover the Recycle Bin

v0.8 · 2026-05-07

The public API docs at /api-docs/rest and /api-docs/mcp now document the soft-delete, restore, and Recycle Bin endpoints and tools introduced this release.

What's new for you

  • The REST guide covers all user-facing restore and deleted-list endpoints, including the status codes returned for each operation.
  • Per-entity DELETE behavior is documented for interviews, generations, and packages — including what happens when you try to delete an active generation.
  • New MCP tool entries documented: delete_interview, restore_interview, delete_generation, restore_generation, and the full update_package tool with all three operations.

Bug reports from anywhere

v0.8 · 2026-05-06

File a bug from the browser, the REST API, or the MCP tool surface — every report lands in the same queue with the same context attached.

What's new for you

  • The Submit-a-ticket form's “Bug” category now files a real bug report — same queue as the API and MCP paths.
  • API callers can submit via POST /v1/bug-reports and read their own submissions back by list or by ID.
  • MCP-capable agents get submit_bug_report, list_my_bug_reports, and get_bug_report — an agent can file the bug from the same session where it noticed the problem.
  • Every submission automatically carries your account name, plan tier, build version, and the AI tool you're using — triage gets the context without a follow-up.

More MCP introspection

v0.8 · 2026-05-06

The MCP tool surface grew from 12 to 19 tools — agents can now list and fetch interviews, list generations in any state, and read bug reports they filed.

What's new for you

  • New list_interviews and get_interview MCP tools mirror the REST list/by-ID pattern, with a matching REST endpoint.
  • New list_generations covers in-flight, paused, failed, and cancelled runs — list_packages only sees runs that produced a package, so this fills the gap.
  • list_packages gained limit, offset, and order arguments, plus a generation_state field on every row so you can tell clean completions from partial or failed runs.
  • get_status was renamed get_generation and its response now includes progress percentage, failure category, and a cost forecast with p25/p75 confidence bounds.
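
As a sketch of how a caller might use the new limit and offset arguments, the helper below walks every page of a listing. The fetch_page callable stands in for whatever client call wraps list_packages and is hypothetical, as is the "Complete" value used to filter on generation_state.

```python
from typing import Callable, Iterator


def iter_all_rows(
    fetch_page: Callable[[int, int], list[dict]],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield every row by requesting successive (limit, offset) pages."""
    offset = 0
    while True:
        rows = fetch_page(limit, offset)
        yield from rows
        if len(rows) < limit:  # a short page means there is nothing left
            return
        offset += limit


def clean_completions(rows, complete_value: str = "Complete") -> list[dict]:
    """Keep rows whose generation_state equals the (assumed) completed value."""
    return [row for row in rows if row.get("generation_state") == complete_value]
```

Passing the fetch function in keeps the paging logic independent of whether the rows come from REST or MCP.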

Under the hood

  • The get_status → get_generation rename is a breaking change; callers using the old name need to update.

Reliability and concurrency hardening

v0.8 · 2026-05-06

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Workspace and marketing polish

v0.8 · 2026-05-06

A round of UI fixes: the notification bell opens on hover, the marketing sign-in flow works again, three more accessibility contrast gaps closed, and action icons show hover tooltips.

What's new for you

  • The notification bell opens on mouse-hover instead of requiring a click; a hover bridge keeps the dropdown open as your cursor moves to it. Clicking an item still navigates straight to its source.
  • The marketing site's “Sign in” call-to-action works correctly again — a regression had been swallowing the click.
  • Cancel is reachable from a generation paused awaiting clarification — previously only Resume and Answer were shown.
  • Three more contrast violations closed: agent thumbnail accents in the How It Works legend, accent text in dark mode, and agent accents in the Beats section all meet WCAG 2.1 AA.
  • The eight new specialist agents now have headshots and loader animations on the marketing Meet the Team section.

Eight new specialist agents

v0.8 · 2026-05-06

SpecStep's roster grows from 14 to 22 agents — eight new specialists for reliability, data, accessibility, localization, AI/ML, compliance, cost, and risk. Each runs only when your project actually needs it, based on signals Otto picks up during the interview.

What's new for you

  • Atlas (Reliability) drafts the operational story — SLOs, alert catalog, capacity plan, RPO/RTO, on-call playbook stubs.
  • Codd (Data) designs schema with constraints, an index strategy per query path, online-migration patterns, and retention tied to the privacy story.
  • Halo (Accessibility) writes the WCAG 2.1 AA contract for UI projects: keyboard maps, screen-reader expectations, focus-visible rules, color-contrast minimums.
  • Polo (Localization) scopes the i18n plan: locale catalog, pluralization, RTL, date/number/currency formatting, translation flow, and the fallback chain.
  • Merlin (Prompt Engineer) tunes AI features: model selection rationale, prompt-injection defenses, eval strategy, and cost ceilings.
  • Reg (Compliance) maps your SOC 2, HIPAA, PCI-DSS, or GDPR posture into specific controls and audit evidence — consultation-only, with the standard AI-isn't-legal-advice disclosure.
  • Tally (Cost) puts numbers on build hours, run-rate, build-vs-buy decisions, and budget alarms.
  • Hazard (Risk) compiles the risk register: schedule, vendor lock-in, regulatory drift, capacity, team, and the AI-coder traps to watch for.
  • Vera (Test Automation) is now available on the Free tier — every project gets a real testing strategy regardless of plan.
  • Each agent has a dedicated page at /agents/{slug} with bio, tagline, and accent color.

Soft-delete generations from the workspace

v0.8 · 2026-05-06

Delete a finished generation from your workspace listing without losing the audit trail.

What's new for you

  • A delete button appears on every terminal-state generation row (Complete, Failed, Cancelled) — click to remove it from your workspace view.
  • The same delete affordance is available from the Generation Details page.
  • The row drops out of every workspace view and spend tile; the underlying record is retained for audit and retention purposes.

Marketing accessibility pass

v0.8 · 2026-05-06

WCAG 2.1 AA contrast pass on the marketing site, driven by an automated accessibility audit.

What's new for you

  • The hero headline's accent color now meets WCAG AA contrast against the page background.
  • The pitch section's numbered badges meet contrast minimums on every background tone.
  • Dimmed agent thumbnails in the How It Works legend stay legible at their reduced opacity instead of fading into the page.

Mid-generation clarifications

v0.7 · 2026-05-05 → 06

When an agent realizes mid-generation that it's missing context it can't reasonably guess at, it now pauses the run and asks you — in the original interview chat — rather than guessing wrong or failing the whole package.

What's new for you

  • The Recommender, Architect, and DesignerCritic can pause a generation with a specific question when the spec is missing critical detail.
  • Paused generations show an “Answer required” card on the details page with the questions inline and a one-click jump back into the interview chat.
  • Workspace rows for blocked generations link straight to the interview in warning-toned styling — so they stand out from in-flight rows at a glance.
  • Answers feed back into the resumed run; agents re-draft the originally-stuck section with the missing context filled in.
  • API and MCP callers get a structured surface: GET /v1/generations/{id}/clarifications, POST .../clarifications/answers, and the answer_clarifications MCP tool.

Marketing site rewrite

v0.7 · 2026-05-05

New look for every public page — new typography, new palette, new hero, a How-It-Works timeline, and a Meet-the-Team section with animated agent loaders.

What's new for you

  • Refreshed homepage with a 60-second pitch, a live-feeling How-It-Works timeline, an annotated .specstep/ file tree, and a 14-agent roster with click-to-expand bios.
  • Each agent has its own accent color and animated full-body loader on the team detail modal — Otto cyan, Stax blue, Alan purple, Lyra pink, and so on.
  • Pricing, About, Privacy, Terms, FAQ, Contact, Support, and per-agent pages all restyled to match.
  • New brand lockups, favicons, and Open Graph cards.

Live agent conversation feed

v0.7 · 2026-05-05 → 06

The Generation Details page and workspace in-flight rows now feel alive while a generation is running — you see agents check in one at a time, cost tick up as new invocations land, and a real progress bar advance.

What's new for you

  • Agent-by-agent conversation feed on the Generation Details page: chat-bubble entries with the agent's name and accent color, a one-line action verb, a longer narration, duration, and a running cost total.
  • Workspace rows for in-flight generations show the latest two agent turns inline, a thin progress bar, and an animated cost counter.
  • The cost row on the details page shows a running total alongside a historical-median estimated range — so you have a sense of where this generation is likely to land.
  • An “Active agent” indicator shows who's working on your generation right now, with a pulsing live dot.

Pipeline reliability and cost realism

v0.7 · 2026-05-04 → 06

Parallel pipeline execution and a series of efficiency improvements — generations now finish faster, fail more cleanly, and cost what they actually cost.

What's new for you

  • Faster generations: the Architect drafts sections in parallel waves; fresh-eyes review runs concurrently for the Extensive profile; DesignerCritic runs alongside the Recommender on UI projects.
  • Setup-time failures — auth gate, quota, and similar — now persist as Failed with a clear reason instead of silently sticking in Queued.
  • The Critic now re-drafts only the sections it flagged instead of the whole package — meaningfully more efficient on every run.

Workspace and API surface polish

v0.7 · 2026-05-05

Smaller improvements you'll feel right away: better spend visibility, project names that flow through every read API, and the right destination when you click into a generation.

What's new for you

  • Two new spend tiles on the workspace — rolling 7-day and 30-day — alongside the existing 24-hour tile, each with an all-time total beneath it.
  • Clicking a workspace row now opens the full Generation Details page directly instead of a small overlay that duplicated the same content.
  • Project name, description, and a stable “specification package” kind label are now on every GET /v1/generations/{id} response and the equivalent MCP responses — so external tools can show what a generation is about without parsing the intake.
  • The workspace's per-row Download button now resolves the correct package ID server-side (was 404ing on a path mismatch).

Mid-flight recovery for generations

v0.7 · 2026-05-04

A generation that loses its host process or dispatcher worker mid-flight now resumes on the next host rather than silently stalling in Queued or vanishing entirely.

What's new for you

  • Generations whose host restarted while running now pick up where they left off instead of needing a manual retry.
  • The stuck-generation sweep that auto-fails abandoned rows now fires after 10 minutes, down from 30.
  • The workspace silently retries once when a system-caused failure happened within the first minute — you only see the error if it happens twice.
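
The two timing rules above can be written down as a small policy sketch. The 10-minute sweep threshold and the single silent retry come from the notes; measuring the one-minute window from the generation's start time, and all function and constant names, are assumptions made for illustration.

```python
from datetime import datetime, timedelta

SWEEP_AFTER = timedelta(minutes=10)         # down from 30, per the note above
SILENT_RETRY_WINDOW = timedelta(minutes=1)  # "within the first minute"
MAX_SILENT_RETRIES = 1                      # a second failure is shown to the user


def is_abandoned(last_activity: datetime, now: datetime) -> bool:
    """True once a row has been idle long enough for the sweep to auto-fail it."""
    return now - last_activity >= SWEEP_AFTER


def should_retry_silently(
    system_caused: bool,
    started_at: datetime,
    failed_at: datetime,
    silent_retries_so_far: int,
) -> bool:
    """True when a failure qualifies for the one-time silent retry."""
    return (
        system_caused
        and failed_at - started_at <= SILENT_RETRY_WINDOW
        and silent_retries_so_far < MAX_SILENT_RETRIES
    )
```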

Failure classification and retry UX

v0.7 · 2026-05-04

When a generation fails, the workspace tells you what kind of failure it was — in plain English — and gives you a one-click Retry instead of a generic error.

What's new for you

  • A failed generation row shows a short, plain-English reason with a “Show technical details” toggle for the full message.
  • An inline Retry button re-runs the same intake at the same profile in one click.
  • System-caused failures with a queue time under a minute auto-retry once, silently — you only see the error if it happens twice.
  • A friendly 401/403 page replaces the blank framework response when an unauthenticated or unauthorized request hits a protected page.
  • Review-budget exhaustion — the Critic ran out of rounds with blocking issues still open — gets its own surface with the issue summary and suggested next actions, not a generic error.

Internal observability improvements

v0.7 · 2026-05-04

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Tab-as-page routing for Settings and Billing

v0.7 · 2026-05-03 → 04

Each tab inside Settings and Billing is now its own routable page — deep links work, the back button works, and breadcrumbs anchor you in the navigation hierarchy.

What's new for you

  • Bookmark /settings/notifications directly — each tab is a real URL, not a query string.
  • The browser back button moves between tabs the way you'd expect.
  • Each section page shows breadcrumbs above the header so you always know where you are.

Public API documentation

v0.7 · 2026-05-03 → 04

A new /api-docs section on the marketing site documents every REST endpoint, error code, failure category, real-time contract, and MCP tool — with an OpenAPI spec auto-generated at build time.

What's new for you

  • Read the full REST API reference at /api-docs/rest before you write integration code.
  • Every endpoint — including retry, lifecycle controls (pause, resume, cancel, rename), and the status family — is documented with request and response examples.
  • The failure_category field on the generation response is documented alongside the additive contract that protects existing integrations.

Service status and uptime page

v0.7 · 2026-05-03

A public /status page shows the current state of every part of the SpecStep platform, with a 90-day uptime history and an incident workflow.

What's new for you

  • /status shows green/yellow/red live, with the last 90 days of uptime as a daily strip.
  • Subscribe to status updates by email; verify in one click, unsubscribe in one click.
  • A history view at /status/history shows every past incident with its full timeline.

v0.7 infrastructure & polish

v0.7 · 2026-05-03

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Security review hardening

v0.6 · 2026-05-03

Independent security review completed; authorization, webhook validation, and secret handling hardened.

Independent review sweep

v0.6 · 2026-05-03

Independent security review completed; correctness and security fixes landed across every layer of the codebase.

Accessibility and resilience polish

v0.6 · 2026-05-03

A short polish pass cleared color-contrast and form-label failures and tightened a handful of resilience edges — rate limiter, splash screen, and the 429 page.

What's new for you

  • The persona dropdown has a proper form label; the notification bell announces its role to screen readers.
  • The rate limiter no longer throttles static assets; 429 responses now render a friendly page instead of a blank framework response.
  • The splash screen waits for the real-time circuit to be ready before dismissing — no more blank page on slow connections.

Meet the Team

v0.6 · 2026-05-03

Each agent now has a public profile — accent color, mission, signal sources, and how they show up in your generation — reachable from a Meet the Team strip on the homepage.

What's new for you

  • The homepage Meet the Team strip links to individual pages for the Code Reviewer, Privacy Attorney, Security Expert, and every other agent.
  • Each agent page shows the accent color, public summary, project types it consults on, and how to reach it from REST or MCP.

Agent identity and per-agent detail pages

v0.6 · 2026-05-03

Each agent — Code Reviewer, Privacy Attorney, Security Expert, and the rest — now has its own visual identity and a public profile.

What's new for you

  • Each agent displays its own accent color and curated summary, distinct from every other agent.
  • Admins can write or edit an agent's public summary and choose whether it appears on the homepage.

Interview UX redesign

v0.6 · 2026-05-03

The interview — where you talk through your idea with the AI Team — has been rebuilt to make the conversation easier to read and to show you what's been captured as it happens.

What's new for you

  • Agent turns and your replies are now visually distinct, so you can follow the thread at a glance.
  • A “Captured so far” panel surfaces the structured output the AI Team is building in real time.
  • An inline divider marks exactly when a new agent joins the conversation mid-interview.
  • The Cancel button now stops the AI call immediately — previously it set a flag and waited for the response to finish.
  • Each source reference appears as a labeled pill; the Researcher card spans the full row width.

Cancellation and reliability

v0.6 · 2026-05-03

Generations now have a real Cancel button, automatic recovery for stalls, and a live view of exactly what's happening while you wait.

What's new for you

  • Cancel a running generation at any time — it stops immediately and shows a Cancelled state.
  • The in-flight panel now shows the current stage, which agent is working, last activity time, and recent events.
  • Unlimited Access plans no longer hit the concurrency cap that applies to standard plans.

Site-wide design pass

v0.6 · 2026-05-03

Every major surface — Workspace, Interview, Marketing, Pricing, Landing, About/FAQ/Privacy, system pages, ticket form, and Generation Detail — went through a structured design and implementation pass.

What's new for you

  • Generation Detail page rebuilt with a metadata header, pipeline progress strip, and two-column layout.
  • Pricing page gains a Most Popular pill, a full comparison table, and explicit upgrade calls to action.
  • Landing page proof moves above the fold; an anatomy-of-package section shows exactly what ships in a generation.
  • The ticket form deflects to the FAQ before submitting, with category routing and pre-filled name and email.
  • Account Disabled and Billing Success pages have rewritten copy — accurate and appropriately toned for each moment.

Settings split into three surfaces

v0.6 · 2026-05-02

The single Settings page has been split into separate surfaces for personal preferences, administration, and billing — each with its own URL and sub-section navigation.

What's new for you

  • Personal preferences, administration controls, and billing are no longer on the same page — each lives at its own URL.
  • Sub-section navigation within each surface is consistent across all three areas.

v0.6 infrastructure & polish

v0.6 · 2026-05-02 → 03

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Run Comparison

v0.5 · 2026-05-02

Researcher mode fans a single idea into three parallel documentation runs, then scores and grades each result so you can see which configuration produces the best output before you commit.

What's new for you

  • Start a Researcher run from the Workspace to generate three side-by-side documentation packages from one idea.
  • A letter-grade scorecard in the Workspace shows which package won and why — build quality, content judgment, and near-duplicate detection all factored in.
  • Per-tier generation profiles replace a hardcoded limit table, so quota behavior is consistent across plans.

Security hardening

v0.5 · 2026-05-02

Independent security review completed; authorization, webhook validation, and secret handling hardened. Resource access checks were strengthened across every read path.

Security Expert agent

v0.5 · 2026-05-02

The AI Team gained a Security Expert that reviews your documentation package and produces a dedicated security-findings artifact.

What's new for you

  • A 04-security-review.md file now appears in every documentation package, covering findings the Security Expert surfaced during its review pass.
  • Important findings surface in your notification inbox — you don't need to go looking for them.

Architect resilience

v0.5 · 2026-05-02

The Architect agent no longer fails an entire generation when a single spec section hits a validation error.

What's new for you

  • Generations that previously stopped on a difficult section now complete, with low-confidence sections flagged in the package manifest rather than lost entirely.

Live generation progress

v0.5 · 2026-05-02

The Workspace progress bar now advances in real time as the Architect works through each section — no refresh needed.

What's new for you

  • The Workspace row label updates live — “Drafting · section 5 of 17 · est. ~12 min remaining” — so you know exactly where a generation stands.
  • Multiple open Workspace tabs stay in sync with each other automatically.
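
For integrators who want to render a similar label from their own progress data, a minimal formatting sketch follows. The function name and parameters are hypothetical, and clamping the section number to the total is an assumption rather than documented behavior; the total is assumed to be at least 1.

```python
def progress_label(stage: str, section: int, total: int, minutes_left: int) -> str:
    """Format a row label such as 'Drafting · section 5 of 17 · est. ~12 min remaining'."""
    # Clamp so the label never reads "section 18 of 17" if counts drift briefly.
    section = max(1, min(section, total))
    return f"{stage} · section {section} of {total} · est. ~{minutes_left} min remaining"
```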

Internal observability improvements

v0.5 · 2026-05-02

No user-visible changes — internal infrastructure, security hardening, and reliability fixes.

Test coverage and accessibility

v0.5 · 2026-05-01 → 02

A multi-day quality push brought every routable page, domain aggregate, application service, and LLM agent under automated test — the foundation that lets everything else move fast.

What's new for you

  • No user-visible changes — all changes are foundational.

AI Team expansion and agent consultation

v0.5 · 2026-05-01

The AI Team grew from four roles to a full catalog — including three legal specialists — and the Interviewer can now consult any agent mid-conversation.

What's new for you

  • A “Meet Your AI Team” panel above the project type selector previews every agent that will be consulted before the interview starts.
  • The Interviewer now pulls in other agents mid-conversation and shows a turn-by-turn summary of who contributed.
  • Your conversation keeps a Team panel showing which agents have joined, why each was consulted, and a way to remove one with a reason.
  • Legal-flagged agents — Privacy Attorney, Commercial Attorney, Internet & Tech Attorney — show a one-time acknowledgment modal as a reminder that their output is not legal advice.
  • New agents: Code Reviewer, Test Automation, Copy Editor, Privacy Attorney, Commercial Attorney, Internet & Tech Attorney.

Bring-your-own AI provider keys

v0.5 · 2026-05-01

Role-based access control now governs who can do what across the product, and users can supply their own AI provider key.

What's new for you

  • Bring your own AI provider key to bypass the included quota and use your own model access directly.
  • Each Workspace row shows a source-channel badge — Web, API, or MCP — so you can see how each generation was started.

File uploads and multimodal context

v0.5 · 2026-05-01

Attach reference documents — PDFs, images, YAML configs, hand-drawn diagrams — directly in the interview, and the AI Team reads them.

What's new for you

  • A paperclip and thumbnail row in the interview composer lets you attach files mid-conversation.
  • The Recommender, Architect, and DesignerCritic agents read your uploaded files directly — images, structured docs, and text formats alike.
  • Supported formats: PDF, DOCX, PNG/JPG, RTF, ODF, HTML, SVG, JSON, YAML, XML.

My Analytics

v0.5 · 2026-05-02

The Analytics view is now scoped to your own data by default, with an All Analytics dropdown available to privileged roles.

What's new for you

  • “Analytics” is now “My Analytics” — everything you see reflects your own generations, quota, and usage.

Shell and UX foundations

v0.5 · 2026-05-01

v0.5 lands with the official SpecStep mark, a global footer, a live inbox, and a dark-mode toggle that works.

What's new for you

  • The official SpecStep mark and favicon appear across all pages, with Open Graph meta for link previews.
  • A global footer adds About, Contact, and Support sections to every page.
  • The topbar bell opens a cross-device inbox; clicking a notification takes you directly to that generation.
  • Every “Loading...” placeholder — reconnect overlay and initial splash included — is now an animated brand loader.
  • The dark-mode toggle flips the theme immediately.
  • The version label shows cleanly without a build-suffix; hover it to see the full build SHA.

Initial preview launch

v0.4 · 2026-04-29 → 30

SpecStep launched with a Web UI, a REST API, and an MCP server — three surfaces over one orchestration core — covering seven project types end-to-end and delivering packages directly to a GitHub repo as a pull request.

What's new for you

  • Start a generation from the browser, the REST API, or any MCP-capable client including Claude Code and Claude Desktop.
  • Choose Fast, Normal, or Extensive review depth; multi-provider review provides a fresh-eyes check on every package.
  • Watch your generation progress in real time — stage, agent, and state updates push to the UI without polling.
  • Finished packages download as a zip or land in your GitHub repo as a pull request, with optional Copilot review.
  • Email and SMS notifications, a cost dashboard, and GitHub source-control settings are all in Settings from day one.
  • Sign in with Microsoft Entra ID, Google, or GitHub.

Under the hood

  • Clean layered architecture; orchestration is surface-agnostic — the same core drives Web, REST, and MCP.
  • The pipeline aims for deterministic zip output: holding inputs, model versions, and prompt versions constant should produce a stable bundle. We test for this and any drift surfaces as a failing build.
  • Automated checks include redaction passes for credentials, financial identifiers, and personally identifying information before agents read content. We layer detection rules and review their coverage in audits.
  • Database and blob storage are backed by managed cloud redundancy tiers and routine restore drills; we don't publish the exact replication geometry.
  • Automated accessibility checks target WCAG 2.1 AA and run in CI. We treat the result as a continuous-improvement signal, not a third-party certification.
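
The deterministic-zip goal in the list above can be illustrated with a short sketch. This is not SpecStep's pipeline code; it only shows the usual ingredients for byte-stable archives with Python's standard zipfile module: sorted entry order, a fixed timestamp, and pinned file metadata. The name build_zip is hypothetical.

```python
import io
import zipfile

FIXED_TIMESTAMP = (1980, 1, 1, 0, 0, 0)  # earliest date the zip format can store


def build_zip(files: dict[str, bytes]) -> bytes:
    """Build a zip whose bytes depend only on file names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # stable entry order
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_TIMESTAMP)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3           # pin "Unix" so the host OS does not leak in
            info.external_attr = 0o644 << 16  # fixed file permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()
```

Comparing a hash of two builds from the same inputs is one way to turn drift into a failing check. Output stability still depends on using the same compression library version across builds.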