Release history and product news – short, dated, and with pointers to the
architectural decisions behind them. After installing the platform locally,
the full technical changelog is available under /changelog.
0.8.1 is a pure UX patch for the project
wizard – no new feature, no new roadmap phase, but four
visible improvements that lift the wizard from "works" to
"feels polished".
Progress stepper: visual indicator
1→2→3→4 at the top of every wizard step; completed
steps are links back, the current one is highlighted.
Makes "where am I right now?" trivial.
Objective preview in step 4:
collapsible card showing the fully composed initial
objective prompt. The user can see before
clicking "Create project" what the coding agent will
actually receive as first input.
Server-side required-field validation:
missing fields no longer slip silently through the
POST; the controller redirects back to step 2 with a
list of missing fields at the top of the page.
Already-entered values are preserved.
Version-cache empty state: when the
cache is still empty for a VERSION_PICK field, the
input shows "loading versions…" plus a link to the
admin cache page – instead of a silently empty
field.
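The server-side required-field check described above could look roughly like the following sketch. The class name, the field names and the map-based form model are hypothetical; the actual controller is not shown in this changelog.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: class, method and field names are assumptions for illustration.
final class RequiredFieldCheck {

    /** Returns the required field names that are absent or blank, in the order given. */
    static List<String> missingFields(List<String> required, Map<String, String> submitted) {
        List<String> missing = new ArrayList<>();
        for (String field : required) {
            String value = submitted.get(field);
            if (value == null || value.isBlank()) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

If the returned list is non-empty, a controller would redirect back to step 2, render the list at the top of the page, and re-populate the form from the submitted map so that already-entered values are kept.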
Wizard cost estimate, inline diffs, and browser notifications
With 0.8.0 the platform closes out two
deferred building blocks: the wizard now shows the rough cost
of a run before it even starts, and the approval view embeds
a real diff of the workspace changes instead of leaving you
with a blind "Approve / Reject". Browser notifications keep
you informed about run completion or approval requests even
with the tab closed.
Wizard step 4 – cost estimate (phase 4.5)
New WizardCostEstimator wires the existing
PromptComposer, the phase-9
TokenEstimator (JTokkit), and
ModelPricingProperties into a local EUR
estimate. No LLM call, no latency in the wizard.
Default model claude-sonnet-4-6, output
assumption = 1.5× input. Deliberately conservative so the
displayed number doesn't systematically understate real
run cost.
Models without a price entry show
"no price table configured" rather than fake numbers.
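The arithmetic behind such an estimate can be sketched as follows. This is not the actual WizardCostEstimator: the class, the record and the per-million price layout are assumptions, and the prices passed in are whatever the price table holds. Only the 1.5× output assumption and the "no price entry means no number" behaviour come from the notes above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the estimate; names and types are assumptions.
final class CostEstimateSketch {

    /** EUR prices per 1 M tokens for one model. */
    record Price(BigDecimal inputPerMillion, BigDecimal outputPerMillion) {}

    static final BigDecimal OUTPUT_FACTOR = new BigDecimal("1.5"); // output = 1.5 x input
    static final BigDecimal ONE_MILLION = new BigDecimal("1000000");

    /** Returns an EUR estimate, or empty when the model has no price entry. */
    static Optional<BigDecimal> estimateEur(long inputTokens, String modelId,
                                            Map<String, Price> prices) {
        Price price = prices.get(modelId);
        if (price == null) {
            return Optional.empty(); // caller shows "no price table configured"
        }
        BigDecimal input = BigDecimal.valueOf(inputTokens);
        BigDecimal output = input.multiply(OUTPUT_FACTOR);
        BigDecimal cost = input.multiply(price.inputPerMillion())
                .add(output.multiply(price.outputPerMillion()))
                .divide(ONE_MILLION, 4, RoundingMode.HALF_UP);
        return Optional.of(cost);
    }
}
```

With placeholder prices of 3.00 EUR (input) and 15.00 EUR (output) per 1 M tokens, a 10,000-token prompt yields 10,000 × 3.00 + 15,000 × 15.00 = 255,000, divided by 1,000,000, i.e. 0.255 EUR.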
Inline diffs before approval (phase 7.5b)
GitService#diffSeit shells out
git diff --no-color <sha> from the
workspace, capped at 256 KB so an accidentally huge diff
can't blow the heap.
The run detail view embeds a collapsible diff card when
the status is WAITING_FOR_APPROVAL, compared
against the earliest GitCheckpoint of the
run.
Approve / Reject is no longer blind; the diff shows
exactly what the agent changed.
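A capped diff read along these lines might look like the sketch below. It is not the actual GitService code: the class and method names are hypothetical, and only the git diff --no-color invocation and the 256 KB cap are taken from the notes above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical sketch; names are assumptions, not the platform's API.
final class CappedDiffSketch {

    static final int MAX_BYTES = 256 * 1024;

    /** Reads at most maxBytes from the stream and decodes them as UTF-8. */
    static String readCapped(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            int read = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (read < 0) {
                break; // end of stream before the cap was reached
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        return out.toString(StandardCharsets.UTF_8);
    }

    /** Runs git diff against the given commit inside the workspace. */
    static String diffSince(Path workspace, String sha) throws IOException {
        Process process = new ProcessBuilder("git", "diff", "--no-color", sha)
                .directory(workspace.toFile())
                .redirectErrorStream(true)
                .start();
        try (InputStream in = process.getInputStream()) {
            return readCapped(in, MAX_BYTES);
        } finally {
            process.destroy(); // stop git if it is still producing output
        }
    }
}
```

Because the cap is applied to bytes, a truncated diff can end in the middle of a hunk or a multi-byte character; a UI would probably want to flag truncated output.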
Browser notifications (phase 7.5a)
LogStreamService emits an extra
status SSE event per poll cycle whenever the
run status changes – no separate endpoint needed.
The detail view shows an "Enable" button for the Web
Notifications API. Permission is requested only on user
click (modern browsers silently deny otherwise).
Notifications fire on COMPLETED,
FAILED, TIMEOUT,
CANCELLED, NEEDS_CORRECTION and
WAITING_FOR_APPROVAL – the states where you
should come back from a closed tab.
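The two decisions involved (emit a status event only on change, notify only for selected states) can be sketched as below. The class is hypothetical and the RUNNING value is an assumption; the six notifying states are the ones listed above.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; the enum is a subset and RUNNING is an assumed value.
final class StatusNotificationSketch {

    enum RunStatus {
        RUNNING, PAUSED, WAITING_FOR_APPROVAL, NEEDS_CORRECTION,
        COMPLETED, FAILED, TIMEOUT, CANCELLED
    }

    static final Set<RunStatus> NOTIFIABLE = EnumSet.of(
            RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.TIMEOUT,
            RunStatus.CANCELLED, RunStatus.NEEDS_CORRECTION,
            RunStatus.WAITING_FOR_APPROVAL);

    private RunStatus lastSeen;

    /** Called once per poll cycle; returns the status only when it changed. */
    Optional<RunStatus> poll(RunStatus current) {
        if (current == lastSeen) {
            return Optional.empty();
        }
        lastSeen = current;
        return Optional.of(current);
    }

    /** True for the states that warrant a browser notification. */
    static boolean shouldNotify(RunStatus status) {
        return NOTIFIABLE.contains(status);
    }
}
```

In this sketch the server side would call poll once per cycle and write a status event when a value is present, while the browser side applies the shouldNotify rule before raising a notification.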
Around the release
985 tests green, JaCoCo thresholds
(80 % line / 80 % branch) held
throughout.
Run templates, container sandbox, live token stream
With 0.7.0 the Software Factory ships three
remaining building blocks from the security and reuse track:
a reusable run template, an optional container
sandbox per run, and a real live stream of token usage
instead of collecting only at the end.
Run templates (phase 7.5)
From a finished run you can now save a
RunTemplate (adapter, team, objective).
On the next quick-start it's the preferred suggestion –
skipping clicks on recurring jobs.
Templates are scoped per creator and optional project.
Applying a template seeds a new run with its values; the
template is not tied to its source run – if a project is
deleted, the template survives as a blueprint.
Audit: every template change is logged as
RUN_TEMPLATE_CREATED/UPDATED/DELETED.
Browser notifications and inline diffs are deliberately
deferred to a later polish sprint.
Container sandbox per run (phase 8, ADR-0011 B)
New ContainerProcessSandbox starts every
agent subprocess inside an ephemeral Docker or Podman
container with hard limits: --cpus 2,
--memory 4g,
--pids-limit 512,
--read-only root + /tmp as
tmpfs, --network=none by default.
The workspace is bind-mounted to
/workspace – everything else is invisible to
the agent. That's the answer to "does the agent see my
entire filesystem?".
Selection via setting
execution.sandbox.variant=container; if
Docker is missing from the PATH, the platform falls back
to the existing LocalProcessSandbox with a
log warning. Single-tenant hosts notice nothing;
Enterprise-tier installations can switch without code changes.
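The limits listed above translate into a container command line roughly as in the following sketch. It is not the actual ContainerProcessSandbox: the class name, the image argument and the --rm flag (one way to make the container ephemeral) are assumptions, while the resource and isolation flags are the ones named in these notes.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class name, image and --rm are assumptions.
final class ContainerCommandSketch {

    /** Builds the argument list for running an agent command inside a container. */
    static List<String> buildCommand(String runtime, Path workspace, String image,
                                     List<String> agentCommand) {
        List<String> cmd = new ArrayList<>(List.of(
                runtime, "run", "--rm",             // runtime is "docker" or "podman"
                "--cpus", "2",
                "--memory", "4g",
                "--pids-limit", "512",
                "--read-only",                      // read-only root filesystem
                "--tmpfs", "/tmp",                  // writable scratch space only in /tmp
                "--network=none",
                "-v", workspace.toAbsolutePath() + ":/workspace",
                "-w", "/workspace",
                image));
        cmd.addAll(agentCommand);
        return cmd;
    }
}
```

The resulting list can be handed to a ProcessBuilder. Since only the workspace is mounted, the agent process has no view of the rest of the host filesystem.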
Live token stream + local estimation (phase 9, ADR-0013)
New ClaudeStreamJsonParser reads
--output-format=stream-json line by line and
emits a typed ExecutionEvent.Usage per event
with input/output/cached token counts plus model id and
request id. The log viewer now sees token usage live
rather than only after the run ends.
TokenEstimator (with
com.knuddels:jtokkit:1.1.0) provides local
token estimates before a run starts – wizard
step 4 can predict cost without a vendor call.
We adopt the concepts as our own implementation rather
than depending on the claudecode4j library
(single-maintainer risk too high) – details in
ADR-0013.
Around the release
970+ tests green, JaCoCo thresholds
(80 % line / 80 % branch) held
throughout.
ADR-0011 (agent-sandbox) moved from
Proposed to Accepted; ADR-0013
(claudecode4j-konzept-uebernahme) is new.
Conductor: model routing, plugin sync, and repo import
With 0.6.0 the Software Factory stops being
a wrapper around Claude Code and starts becoming the
conductor of the entire setup. Three roadmap phases
deliver the leap: per-role model routing, automatic mapping
of locally installed plugins and skills into every run, and
a sixth wizard template that brings existing repositories
under platform control.
Per-role model routing (Phase 5)
Default mapping: Architect and
Documentation run on claude-opus-4-7,
Developer / QA / Merge-Release on
claude-sonnet-4-6, Reviewer and
Security-Reviewer on claude-haiku-4-5. Saves
tokens without sacrificing quality on the strategic
roles.
Every AgentDefinition now has a
preferredModel field (migration V14)
that the ClaudeCodeExecutionAdapter passes
through as a --model flag. Typo IDs fall
back to the CLI default – no build crash.
Drift detection: if the CLI ended up
using a different model (e.g. because a subagent
delegation rerouted), a warning shows up in the log.
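Passing the preferred model through and checking for drift can be sketched as follows. The class and method names are hypothetical and the warning text is made up for illustration; only the --model flag and the comparison of requested versus reported model come from the notes above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not the ClaudeCodeExecutionAdapter itself.
final class ModelRoutingSketch {

    /** Appends --model <id> when a preferred model is configured. */
    static List<String> withModelFlag(List<String> baseCommand, String preferredModel) {
        List<String> cmd = new ArrayList<>(baseCommand);
        if (preferredModel != null && !preferredModel.isBlank()) {
            cmd.add("--model");
            cmd.add(preferredModel);
        }
        return cmd;
    }

    /** Returns a warning when the CLI reported a different model than requested. */
    static Optional<String> driftWarning(String requested, String reported) {
        if (requested == null || reported == null || requested.equals(reported)) {
            return Optional.empty();
        }
        return Optional.of("Model drift: requested " + requested
                + " but CLI reported " + reported);
    }
}
```

A role without a preferred model leaves the command untouched, so the CLI default applies.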
Plugin and skills sync (Phase 6)
New conductor module: scans
~/.claude/plugins/ and
~/.claude/skills/ on first access, caches
results, supports manual refresh. Path traversal is
guarded against symlinks pointing outside the base.
Before every run, the platform writes
.claude/settings.local.json into the
workspace (plugin whitelist + skills path). Claude Code
respects workspace settings automatically.
Per active team member, a
.claude/agents/<role>.md file is
written with YAML frontmatter (name,
description, model,
role). Subagents can then be delegated to natively by the CLI.
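One common way to guard a directory scan against symlinks that point outside the base is to compare fully resolved paths, as in this sketch. The class and method names are hypothetical and the conductor module may implement the check differently.

```java
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical sketch of a symlink-aware containment check.
final class PathGuardSketch {

    /**
     * True when the candidate, after resolving symlinks and ".." segments,
     * still lies inside the base directory. Both paths must exist.
     */
    static boolean isInsideBase(Path base, Path candidate) throws IOException {
        Path realBase = base.toRealPath();           // follows symlinks by default
        Path realCandidate = candidate.toRealPath();
        return realCandidate.startsWith(realBase);   // compares whole path components
    }
}
```

Path.startsWith compares path components rather than raw strings, so a sibling directory whose name merely begins with the base name is not mistaken for a child.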
Repo-import template (Phase 7, partial)
Sixth wizard template
existing-repo-import:
instead of bootstrapping a new skeleton, it instructs
Claude Code to inspect an existing repo and write an
IMPORT_REPORT.md (read markdowns, scan
build config, summarise directory tree). No code is
touched before the report is reviewed.
Fields: repo path, primary language (9 options), build
tool (9 options), current state, next steps. Initial
prompts in DE and EN on the classpath.
Project memory via PROJECT_NOTES.md (Phase 7, partial)
The freeText field of
ProjectDefinition is now written as
PROJECT_NOTES.md into the workspace root at
run start. The coding agent finds the file in the repo and
uses it as extra context – a memory layer under user
control that persists between runs.
Deliberately not in this release
From phase 7, the following remain open:
RunTemplate entity ("save the last successful
run as a template"), browser notifications on run finish,
and inline diffs before approval. They will follow in a
subsequent iteration (phase 7.5), building on the conductor
infrastructure shipped now.
Around the release
900+ tests green, JaCoCo thresholds (80 % line /
80 % branch) held throughout.
New .trivyignore file with five
consciously accepted Go-stdlib CVEs inside the
eclipse-temurin:25-jre base image (container
helper binaries, not the Java code we execute).
With 0.5.0 the Software Factory becomes polyglot. The project wizard
used to offer two templates: Spring Boot Backend and Static Frontend.
Three polyglot templates join them – the platform now covers the most
common backend stacks for solo developers.
Three new templates
ASP.NET Core Backend (.NET) – pick the
.NET version (8 or 9), project flavor (WebAPI / MVC /
worker service), database via EF Core (PostgreSQL or SQL
Server) or Dapper, architecture style
Clean Architecture, Layered or
Vertical Slice, configurable root namespace.
Python Backend (FastAPI / Flask) – Python
3.12 or 3.13, web framework FastAPI or
Flask, database driver SQLAlchemy
with PostgreSQL or SQLite, package-manager choice between
uv, pip and poetry.
Node.js Backend (Express / Fastify / Hono) –
Node 20 or 22, three framework options, ORM Prisma
or Drizzle (or none), language TypeScript or
JavaScript, package manager
pnpm / npm / yarn.
Ten new quality-gate toggles
.NET: Roslyn analyzers with
TreatWarningsAsErrors,
SonarAnalyzer.CSharp as an additional rule set,
NuGetAudit for vulnerability scanning during
dotnet restore.
Python: Ruff as an all-in-one
linter and formatter, mypy in strict mode,
pytest-cov with a hard coverage gate
(--cov-fail-under=80), Bandit as a
security linter with optional SARIF upload.
Node: ESLint in flat-config form
(ESLint 9+), Vitest as test runner with v8
coverage, npm-audit-signatures for Sigstore
provenance.
Trivy for everyone
The Trivy container-scan toggle now applies to all
five templates – whether you ship a Java, .NET, Python or Node
app, the produced container image can be scanned for known
CVEs.
Version cache extended
Eleven new cache keys: ESLint, Vitest and Node LTS get daily
refreshes via the npm and GitHub Releases APIs; .NET, Python
and the tool versions are kept statically in the code catalog
and bumped with each platform release. The admin inspect page
at /einstellungen/wizard/versions now shows 16
entries instead of five.
Deliberately not in this release
Go, Rust, PHP and Ruby are missing β we wait for concrete
user demand before maintaining further templates. The
architecture is trivially extensible: new templates come from
additional code-as-data entries in TemplateRegistry
plus six Markdown files each (initial prompt DE/EN, three to
five toggle snippets DE/EN). No migration, no schema change.
Tell us which stack should be next.
Around the release
841 tests green, JaCoCo thresholds (80 % line / 80 %
branch) held throughout.
Roadmap doc docs/roadmap/projekt-wizard.md
updated: toggle catalog from 4 to 14 toggles, three
additional template entries.
Demo instance at demo.softwarefabrik.io
runs v0.5.0 – click through directly, no login.
v0.4.0 – May 7, 2026
Settings, wizard, quality gates: from cockpit to guided platform
0.4.0 moves the platform from "configure-it-yourself"
into a guided mode: global defaults have a dedicated area, a
four-step assistant creates new projects with the right stack and
opt-in quality gates, and a daily background job keeps the
"latest stable version" of every component up to date.
Global settings UI
New area /einstellungen
(ADMIN only) with sections for workspace, adapter,
per-adapter default model (claude / codex / gemini / aider)
and token budget caps (daily / weekly).
Override order PROJECT > USER > GLOBAL > YAML
via the new SettingService; consumers no longer
read directly from application.yml but go through
the service. Changes apply without a restart (5-minute cache).
Quick-start button in the run list launches
a run from the setup of the last successful run in under
three clicks.
Audit log for every setting change; Flyway V9
(app_setting).
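The override order PROJECT > USER > GLOBAL > YAML amounts to a first-match lookup across layers, sketched below. This is not the actual SettingService: the types and the example key are hypothetical, and the 5-minute cache is left out.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of layered setting resolution; names are assumptions.
final class SettingResolutionSketch {

    /** Declared from highest to lowest precedence. */
    enum Scope { PROJECT, USER, GLOBAL, YAML }

    /** Returns the value from the highest-precedence layer that defines the key. */
    static Optional<String> resolve(String key, Map<Scope, Map<String, String>> layers) {
        for (Scope scope : Scope.values()) { // iterates in declaration order
            String value = layers.getOrDefault(scope, Map.of()).get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

A value set at USER level therefore wins over the same key in application.yml, and a PROJECT-level value would win over both.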
Four-step project wizard
New path /wizard: pick a template
→ answer template-specific questions → toggle
quality gates → review summary. Submission lands in the
normal project editor with all fields pre-filled.
Drafts survive browser reloads and app
restarts; "continue draft" appears on the entry page.
A daily cleanup deletes open drafts older than 30 days.
Two initial templates:
modern Spring Boot backend (Java version, database,
architecture style, base package) and
static frontend (Node.js version, Vite / Astro /
plain HTML).
Initial prompts are stored on the classpath in DE and EN;
placeholders are substituted server-side.
/projects/new is preserved as the
"quick create" path for experienced users.
Quality-gate toggles
ArchUnit layered tests, OWASP
Dependency-Check, Trivy container scan,
and Playwright E2E tests – opt-in via the
wizard, filtered by template (Playwright only appears on
templates with a web UI).
Markdown snippets in DE and EN under
wizard/snippets/<lang>/<id>.md;
appended to the initial prompt for the coding agent.
Tool versions (ArchUnit, OWASP plugin, Trivy CLI,
@playwright/test) are injected from the version
cache – snippets aren't pinned to stale values.
Version cache with background refresh
Daily @Scheduled job at 03:00
server time pulls the "latest stable" version of every
component from Maven Central, GitHub Releases and the npm
registry.
Java-computed staleness (no Postgres-specific
computed column – keeps the H2 test profile working).
Stale badge appears in the wizard and on the admin inspect
page after 25 hours.
Admin inspect page at
/einstellungen/wizard/versions with table, last
refresh timestamp, error column and a manual refresh button.
Flyway V12 (wizard_draft) and
V13 (version_cache).
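Computing staleness in Java rather than in a database column can be as small as the following sketch. The class name and the treatment of a never-refreshed entry are assumptions; the 25-hour limit is the one stated above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; names and null handling are assumptions.
final class StalenessSketch {

    static final Duration STALE_AFTER = Duration.ofHours(25);

    /** True when the last refresh is more than 25 hours before now. */
    static boolean isStale(Instant lastRefresh, Instant now) {
        if (lastRefresh == null) {
            return true; // assumption: an entry that was never refreshed counts as stale
        }
        return Duration.between(lastRefresh, now).compareTo(STALE_AFTER) > 0;
    }
}
```

Passing now as a parameter keeps the check deterministic in tests. A 25-hour limit presumably leaves the daily 03:00 job an hour of slack before the badge appears.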
Security
pgJDBC pinned to 42.7.11 –
fixes CVE-2026-42198 (HIGH, pgjdbc client-side denial of
service). Trivy image scan is HIGH/CRITICAL-clean again.
docker-compose.yml now binds
Postgres explicitly to 127.0.0.1:5432 instead of
0.0.0.0:5432, so the database port is no
longer exposed to any network with TCP routing to the host.
Around the release
17 static pages now carry rel="canonical" plus
hreflang alternates – search engines can
tell DE/EN variants apart and stop treating www / non-www
duplicates as separate pages.
Imprint extended with an "about me" section linking to
janda.io
and the German VAT identification number.
810+ tests green, JaCoCo thresholds (80 % line /
80 % branch) held throughout.
v0.3.0 – April 27, 2026
Tokens, costs, live logs: from black box to cockpit
0.3.0 is a frontend leap: the dashboard now shows
14-day token and cost charts, a dedicated analytics page splits
tokens, costs, budget and adapter comparison, logs stream live to
the browser via Server-Sent Events, and runs can be paused, resumed
or cancelled in batch.
Token and cost schema
Run metrics aggregate per run: input, output
and cached tokens, duration in milliseconds, EUR cost and
phase status. Migration V7 creates the table and
adds the token columns to execution_log.
Model price table in application.yml
(input / output / cached per 1 M tokens) – sourced from
the official price page for gpt-5.4-mini,
gpt-5-mini, claude-sonnet-4-6,
claude-opus-4-7 and claude-haiku-4-5.
Claude Code adapter calls the CLI with
--output-format=json and parses token usage and
model ID from the result payload.
Mock adapter reports deterministic pseudo
tokens (derived from the markdown length) – dashboard and
analytics show meaningful values even without a real vendor run.
Dashboard upgrade and dark mode
New widgets: success rate, average run
duration, 14-day cost total, token usage (input / output split)
as an SVG bar chart, EUR cost curve.
Dark mode with sun / moon toggle in the
header, persisted to localStorage; an explicit
user choice rather than automatic
prefers-color-scheme.
CSS-only tooltips on the KPI tiles, skeleton
class for lazily loading lists (respecting
prefers-reduced-motion), hamburger menu below
720 px.
Analytics page with budget enforcement
New menu entry /analytics with four tabs:
token usage (input / output / cached stacked), costs
(daily curve plus per-model-ID breakdown), budget
(manage monthly budget per project), and adapter comparison
(runs / tokens / costs / success rate per adapter).
CSV export per tab, RFC-4180-compliant.
Monthly budget per project (migration
V8): soft threshold (default 80 %) plus
optional hard block mode that prevents new run creation
at > 100 % utilisation with a clear error message.
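The soft threshold and the optional hard block can be read as a three-way decision, sketched below. The class, the enum and the handling of a missing budget are assumptions; the 80 % default and the block above 100 % utilisation come from the notes above.

```java
// Hypothetical sketch of the budget check; names are assumptions.
final class BudgetCheckSketch {

    enum Decision { OK, WARN, BLOCK }

    /**
     * @param softThreshold utilisation at which a warning starts, e.g. 0.8
     * @param hardBlock     whether new runs are refused above 100 % utilisation
     */
    static Decision evaluate(double spentEur, double budgetEur,
                             double softThreshold, boolean hardBlock) {
        if (budgetEur <= 0) {
            return Decision.OK; // assumption: no budget configured means no enforcement
        }
        double utilisation = spentEur / budgetEur;
        if (hardBlock && utilisation > 1.0) {
            return Decision.BLOCK;
        }
        if (utilisation >= softThreshold) {
            return Decision.WARN;
        }
        return Decision.OK;
    }
}
```

With a 100 EUR budget and the default threshold, 85 EUR spent yields WARN, and 120 EUR spent yields BLOCK only when hard block mode is on.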
Live logs and run control
SSE endpoint /runs/<id>/logs/stream: logs appear live
without reload, heartbeat every 20 s, auto-reconnect
after 5 s on connection loss, auto-scroll toggle.
Pause / resume: new status
PAUSED, context-sensitive buttons in the run
detail view (only visible in the matching state).
Batch cancel: checkboxes on the run list plus
"Cancel selected" – runs that have already reached a
terminal state are skipped without error.
Deliberately not included
Still missing from the phase-3 roadmap: Gantt phase timeline,
phase rollback, context inject into running runs, and the
pre-run dry-run mode. They will follow in a later iteration –
this release delivers the cockpit leap without introducing new
adapter risk.
Around the release
New runbook docs/runbooks/deploy-demo.md with
atomic JAR swap and rollback path for
demo.softwarefabrik.io.
628 tests green, JaCoCo thresholds (80 % line /
80 % branch / 80 % instruction) held
throughout.
v0.2.5 – April 26, 2026
Review layer and quality gate: from "writes code" to "checks code"
0.2.5 brings the platform's second pillar:
alongside the executing agents (Claude Code, Codex, Gemini, Aider)
there is now a dedicated read-only review layer
with an aggregating quality gate. Whoever writes
gets reviewed – cleanly separated in the classpath and in
responsibility.
New in this release
Read-only reviewers: aider-review
and claude-review invoke their respective CLIs in
read-only mode; security,
architecture-reviewer and
hallucination-review run as static heuristics
without any external tool.
Quality gate with configurable policy
(strict / lenient): aggregates reviewer findings into a single
decision (PASSED / WARNING / FAILED / SKIPPED / ERROR),
including confidence-score aggregation.
Special rules independent of the policy:
SECURITY/HIGH and ARCHITECTURE/CRITICAL are always blocking;
reviewer crashes materialise as ERROR rather than being
swallowed.
Quality-gate UI at
/runs/<id>/quality-gate with reviewer
selection, policy choice and result view (blocking /
non-blocking, confidence, reviewer steps).
Dashboard charts: 14-day run activity as an
inline SVG bar chart, project and run status distributions as
horizontal bars – entirely server-rendered, no
JavaScript required.
Demo profile for
demo.softwarefabrik.io: auto-login as the
demo user, demo banner in the header, daily DB
reset via cron recommended.
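The policy-independent special rules of the quality gate can be expressed as a small predicate. The sketch below takes the two rules literally (SECURITY with HIGH, ARCHITECTURE with CRITICAL); whether higher severities are also covered is not stated here, and the enum values beyond those named in the text are assumptions.

```java
// Hypothetical sketch; enums and record are assumptions for illustration.
final class GateRulesSketch {

    enum Category { SECURITY, ARCHITECTURE, HALLUCINATION }
    enum Severity { LOW, MEDIUM, HIGH, CRITICAL }

    record Finding(Category category, Severity severity) {}

    /** True for findings that block regardless of the strict or lenient policy. */
    static boolean isAlwaysBlocking(Finding finding) {
        return (finding.category() == Category.SECURITY
                        && finding.severity() == Severity.HIGH)
                || (finding.category() == Category.ARCHITECTURE
                        && finding.severity() == Severity.CRITICAL);
    }
}
```

A gate implementation would apply this predicate before consulting the configured policy, so that neither policy can downgrade such a finding.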
Deliberately not included
Continue (continue.dev) is not integrated –
rationale in docs/review-and-quality-gate.md. Short
version: Continue does not have a stable non-interactive CLI mode
that can be reliably called from a server app in read-only
operation.
Around the release
New documentation chapter
docs/review-and-quality-gate.md and runbook
docs/runbooks/demo-instanz.md.
Prepared Word supplement
docs/word-supplement-v1.4.md for the next
revision of the concept paper.
Version bump to 0.2.5 – visible in the
platform footer and under /changelog.
v0.2.0 – April 25, 2026
Four adapters instead of one: Codex, Gemini and Aider have joined
Until now the Software Factory was hard-wired to Claude Code
as the executing development agent. With the new adapter registry, you can
now pick the agent for each run – and three additional vendor adapters
ship with this release.
What changed
Adapter registry: all adapters live in the backend
simultaneously. The concrete adapter is selected in the run wizard
and persisted on the run.
OpenAI Codex (codex exec): OpenAI's open
terminal CLI, conceptually equivalent to Claude Code.
Google Gemini (gemini -p): Google's open
terminal CLI, with a generous free tier via Google sign-in.
Aider: mature open-source agent with a
configurable model backend (Anthropic / OpenAI / Gemini /
local via Ollama).
Mock remains the DEMO default – no license, no
external tool, always available.
Licensing
All vendor adapters are unlocked from Community upwards.
In DEMO mode, the mock adapter remains the only option so the platform
runs out-of-the-box without sign-up.
Accompanying changes
New in-app page /changelog with the full version history.
Current version visible in the platform footer with a link to the
history.
DB migration V3 adds an adapter ID column to the run table.
v0.1.0 – April 17, 2026
First production-capable release
The Agentic Software Factory ships as a local control plane for
AI-assisted software development – with Claude Code as the initial
development agent, a lease-based licensing stack (Keycloak +
Spring-Boot license service, RS256 JWT), and an accessibility-conscious
Thymeleaf UI.
Project-idea wizard with a Markdown generator
(PROJECT.md, INSTRUCTIONS.md,
AGENTS.md, ...).
Run lifecycle with phases, statuses, and audit log.
Workspace bootstrap with git init and build gate (mvn verify).
Approval policies, Spring Security login, bootstrap admin.
Air-gap-capable lease system with COMMUNITY/PROFESSIONAL/ENTERPRISE tiers.