This page explains how the Agentic Software Factory is built, from the UI layer down to the adapters that talk to the coding CLIs. It targets architects and developers who want a feel for layers, modules, data flows and the key design decisions without diving straight into the source. If you'd rather have a step-by-step usage walkthrough, the introduction is the better starting point.
The platform is a control plane for AI-assisted software development. It bundles four jobs into a single web application: project capture, artifact generation, run orchestration and quality assessment. You don't drive the Claude Code, Codex, Gemini or Aider CLIs from your shell; the platform invokes them as subprocesses, collects their output, persists it in the database and surfaces it live in the UI.
The architecture follows two very classic principles, deliberately not dressed up with buzzwords for v1:
- Each module (…, run, wizard, settings) has its own domain, application, web and infrastructure layer. Domain depends on nothing, application only on domain, web on application, infrastructure implements domain ports.

Browser -> Spring MVC + Thymeleaf -> Application services -> Domain model
                                                                  |
PostgreSQL -- Repositories -- Adapters (Claude / Codex / Gemini / Aider / Mock / Git / Filesystem)
Inside the Maven module app, all functional modules live under io.softwarefabrik.app.<module>. Each module has at most four sub-packages:
- domain/: entities, value objects, repository interfaces, business services. No Spring, JPA or web annotations.
- application/: use-case services that orchestrate domain methods. This is where transaction boundaries (@Transactional) live.
- web/: Spring MVC controllers, Thymeleaf bindings, UI DTOs, REST endpoints for SSE.
- infrastructure/: JPA implementations of repositories, external adapters, filesystem access, scheduled jobs.

Dependency rules:

- domain depends on nothing.
- application depends only on domain.
- web depends on application, not directly on domain.
- infrastructure implements domain ports and may use Spring magic.
- Modules do not reach into another module's domain package; access goes only via that module's application service.

ArchUnit tests under app/src/test/java/.../architecture/ pin those rules. A wrong dependency turns CI red: architecture is enforced by automated tests, not by good intentions.
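The dependency rules above can be read as a small lookup table. The following is a minimal sketch in plain Java, not the project's ArchUnit suite; the names `LayerRules` and `isAllowed` are hypothetical, and restricting infrastructure to domain is an assumption based on the phrase "implements domain ports".

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the layer rules; the real checks live in ArchUnit tests. */
final class LayerRules {

    enum Layer { DOMAIN, APPLICATION, WEB, INFRASTRUCTURE }

    // Which other layers each layer may depend on.
    private static final Map<Layer, Set<Layer>> ALLOWED = new EnumMap<>(Layer.class);

    static {
        ALLOWED.put(Layer.DOMAIN, EnumSet.noneOf(Layer.class));      // domain depends on nothing
        ALLOWED.put(Layer.APPLICATION, EnumSet.of(Layer.DOMAIN));    // application only on domain
        ALLOWED.put(Layer.WEB, EnumSet.of(Layer.APPLICATION));       // web on application, not on domain
        ALLOWED.put(Layer.INFRASTRUCTURE, EnumSet.of(Layer.DOMAIN)); // assumption: ports live in domain
    }

    /** A dependency inside the same layer is always fine. */
    static boolean isAllowed(Layer from, Layer to) {
        return from == to || ALLOWED.get(from).contains(to);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(Layer.WEB, Layer.APPLICATION));    // true
        System.out.println(isAllowed(Layer.WEB, Layer.INFRASTRUCTURE)); // false
    }
}
```

An ArchUnit test expresses the same table declaratively against compiled classes, which is why a wrong import fails the build instead of waiting for a review.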
v0.8.1 modules at a glance:
Short profiles per module, handy when reading the code or hunting for a feature:
Manages the agent roles (Architect, Developer, Reviewer, QA, Security, Documentation, Merge/Release) including preferredModel per role. Main classes: AgentDefinition, AgentRoleService. Routes: /agents.
Append-only audit log for security-relevant events: login attempts, setting changes, run starts, approval decisions. Main class: AuditEvent. Table audit_event.
Shared utilities without business meaning: ID generator, clock abstraction, generic validators. Kept deliberately small.
New since v0.6.0: before every run, this module writes .claude/settings.local.json (plugin whitelist + skills path) and one .claude/agents/<role>.md per active team member into the workspace. Main classes: PluginCatalog, SkillsCatalog, ConductorWorkspaceWriter, ConductorRunPreparation.
The heart of run execution: ExecutionAdapterRegistry, the sandbox implementations LocalProcessSandbox and ContainerProcessSandbox with ExecutionSandboxFactory (selection via the setting execution.sandbox.variant), and adapters for Claude, Codex, Gemini, Aider and Mock. There is one infrastructure/<name> sub-package per adapter. Since v0.7.0, claudecode/ additionally contains ClaudeStreamJsonParser for live token events and TokenEstimator for local pre-estimates.
Workspace Git operations: git init, auto-commits, diff and log output for the run detail page. Uses JGit.
License logic: DEMO / COMMUNITY / FULL tiers, lease-JWT verification against the Keycloak public key, limit enforcement. Routes: /license, /admin/license.
Approval policies: which phases need a human gate, which run through. Stores ApprovalDecision records.
The ProjectDefinition entity with all editor fields (vision, audience, tech, architecture, security, …). Routes: /projects, /projects/{id}/edit.
Generates the six Markdown artifacts (PROJECT.md, INSTRUCTIONS.md, AGENTS.md, WORKFLOW.md, DEFINITION_OF_DONE.md, README.md) from project fields. Templates live under resources/prompts/.
Aggregates reviewer findings into the quality-gate verdict (PASSED / WARNING / FAILED / SKIPPED / ERROR). Special rules for SECURITY/HIGH and ARCHITECTURE/CRITICAL. Routes: /runs/{id}/quality-gate.
Reviewer implementations: aider-review, claude-review, security, architecture-reviewer, hallucination-review. CLI invocations in read-only mode or static heuristics.
The run model with phases, status, logs, token usage and workspace path. Main classes: Run, RunPhase, RunOrchestrationService, WorkspaceService. Routes: /runs, /runs/{id}.
Encrypted storage of API keys (Anthropic, OpenAI, Gemini) in Postgres. AES-GCM with master key from SOFTWAREFABRIK_SECRETS_MASTER_KEY. Routes: /integrations.
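As an illustration of AES-GCM storage, here is a minimal sketch using only the JDK's `javax.crypto` API. The class name `SecretCipherSketch`, the storage layout (IV followed by ciphertext and tag) and the way the key bytes are obtained are assumptions; how the platform actually decodes SOFTWAREFABRIK_SECRETS_MASTER_KEY is not described here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: encrypts a secret as IV || ciphertext || tag. */
final class SecretCipherSketch {
    private static final int IV_BYTES = 12;  // 96-bit nonce, the usual size for GCM
    private static final int TAG_BITS = 128; // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Assumption: the master key arrives as 16, 24 or 32 raw bytes. */
    static SecretKey keyFromBytes(byte[] raw) {
        return new SecretKeySpec(raw, "AES");
    }

    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh nonce per value; never reuse one with the same key
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + ciphertext.length).put(iv).put(ciphertext).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    static String decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            // The first IV_BYTES of the stored value are the nonce.
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed or data was tampered with", e);
        }
    }
}
```

Because GCM authenticates the ciphertext, a modified database value fails on decrypt instead of yielding garbage.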
Spring Security configuration: login, bootstrap admin, role model (USER / ADMIN), CSRF, BCrypt passwords. Main class: SecurityConfig.
Global platform defaults at /einstellungen with a 5-minute TTL cache and override order PROJECT > USER > GLOBAL > YAML. Table app_setting (V9). Since v0.7.0 also the key execution.sandbox.variant for the container sandbox.
Per-project team composition from agent roles. Serialised into AGENTS.md.
Per-run build validation: mvn verify, npm run build or a configured build command. Writes BuildResult.
UI cross-cutting concerns: global layouts, kopfbereich.html, theme toggle, dashboard controller. No business logic.
The four-step assistant at /wizard. Main classes: WizardController, WizardService, WizardDraft, TemplateRegistry, ToggleRegistry, VersionLookupClient, WizardCostEstimator. Tables wizard_draft (V12), version_cache (V13). Six templates registered (Spring Boot, Static Frontend, .NET, Python, Node, Existing-Repo-Import); v0.8.x adds progress stepper, objective preview and local cost estimate in step 4.
A typical project lifecycle has five stations. Each one writes persistently to Postgres, each is visible in the UI:
+----------+     +------------+     +------------+     +-------+     +--------------+
| Wizard   | --> | Project    | --> | Artifacts  | --> | Run   | --> | Quality gate |
| /wizard  |     | (DRAFT)    |     | (Markdown) |     |       |     | (Verdict)    |
+----------+     +------------+     +------------+     +-------+     +--------------+
     |                 |                  |                |                |
     v                 v                  v                v                v
wizard_draft     project_def        prompt_artifact  run, run_phase,  quality_gate_run
                                                     run_log,
                                                     token_usage
- Wizard: the draft is persisted in wizard_draft (survives browser reload and server restart). On completion a ProjectDefinition is created and the draft flips to completed.
- Project: starts as DRAFT and moves to READY after artifact generation.
- Artifacts: generated by PromptAssemblyService from project fields. They're editable; the agent always reads the most recently saved version.
- Run: RunOrchestrationService is asynchronous (Spring @Async); the UI doesn't poll, it listens via SSE.

The platform uses PostgreSQL as the single data source. Schema changes happen exclusively via Flyway migrations under app/src/main/resources/db/migration/. Each migration is numbered (V<n>__<name>.sql) and immutable once it has been in production: new changes go in as additional migrations, never as edits to old ones.
| Migration | Content |
|---|---|
| V1 | Initial schema: project_definition, agent_definition, team, run, run_phase. |
| V2 | Logs and token usage: run_log, token_usage_event. |
| V3 | Approval policies: approval_policy, approval_decision. |
| V4 | Encrypted secrets: integration_secret. |
| V5 | Audit log: audit_event. |
| V6 | Quality-gate tables: quality_gate_run, review_finding. |
| V7 | License tables: license, license_lease. |
| V8 | Build results: build_result. |
| V9 | v0.4.0: app_setting with scope column (GLOBAL / USER / PROJECT) and audit trail. |
| V10 | Run tags and quick-start markers. |
| V11 | v0.4.0: agent_preferred_model, a column for role-specific model selection. |
| V12 | v0.4.0: wizard_draft with JSON state column, step counter, template ID, cleanup TTL. |
| V13 | v0.4.0: version_cache with cache key, JSON value, refreshed_at, last_error. Java-computed staleness, no DB computed column (so the H2 test profile keeps working). |
| V14 | v0.6.0: extended agent_definition for the conductor (mission, active skills). |
| V15 | v0.7.0: run_template with adapter, team, objective, optional project scope, audit fields. Source for "start run from template". |
Migrations avoid Postgres-only constructs such as GENERATED ALWAYS AS so that the H2 test profile keeps working. Where unavoidable, a db.migration.h2/ override file fills the gap.

Adapters are Spring beans that implement the ExecutionAdapter interface. At startup the ExecutionAdapterRegistry collects all available beans into a map from name to adapter. The name comes from a constant on the adapter (e.g. "claudecode", "codex", "gemini", "aider", "mock").
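The registry idea can be sketched without Spring. In the sketch below, the type names follow the text, but the `name()` method, the constructor signature and the error handling are assumptions made for illustration; with Spring, the list would typically be injected with every ExecutionAdapter bean.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Assumed minimal shape of the adapter port; the real interface has more methods. */
interface ExecutionAdapter {
    String name(); // e.g. "claudecode", "codex", "gemini", "aider", "mock"
}

/** Hypothetical sketch of a name -> adapter registry built once at startup. */
final class ExecutionAdapterRegistry {
    private final Map<String, ExecutionAdapter> adapters = new LinkedHashMap<>();

    ExecutionAdapterRegistry(List<ExecutionAdapter> available) {
        for (ExecutionAdapter adapter : available) {
            // Two beans with the same name would make resolution ambiguous, so fail fast.
            if (adapters.putIfAbsent(adapter.name(), adapter) != null) {
                throw new IllegalStateException("Duplicate adapter name: " + adapter.name());
            }
        }
    }

    ExecutionAdapter byName(String name) {
        ExecutionAdapter adapter = adapters.get(name);
        if (adapter == null) {
            throw new IllegalArgumentException("Unknown adapter: " + name);
        }
        return adapter;
    }
}
```

Failing on an unknown name keeps a typo in a setting from silently selecting a different adapter.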
When a run starts, the adapter is resolved in this order:
- preferredAdapter.
- app_setting with scope GLOBAL.
- application.yml as a last fallback.

This override order (PROJECT > USER > GLOBAL > YAML) is implemented by SettingService and applies everywhere on the platform, not just to adapters.
// simplified
ExecutionAdapter adapter = registry.byName(
settingService.resolve("execution.adapter.default", scope)
);
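The override order PROJECT > USER > GLOBAL > YAML boils down to a narrow-to-wide search. The sketch below is a simplified illustration, not the actual SettingService: the class name `SettingResolutionSketch` is hypothetical, and it keeps a single value per scope instead of distinguishing individual projects or users.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of narrow-to-wide setting resolution. */
final class SettingResolutionSketch {

    // Declared narrow-to-wide so that iteration order is the override order.
    enum Scope { PROJECT, USER, GLOBAL }

    private final Map<Scope, Map<String, String>> stored = new EnumMap<>(Scope.class);
    private final Map<String, String> yamlDefaults; // stands in for application.yml

    SettingResolutionSketch(Map<String, String> yamlDefaults) {
        this.yamlDefaults = yamlDefaults;
        for (Scope scope : Scope.values()) {
            stored.put(scope, new HashMap<>());
        }
    }

    void put(Scope scope, String key, String value) {
        stored.get(scope).put(key, value);
    }

    /** Returns the first value found, falling back to the YAML default. */
    Optional<String> resolve(String key) {
        for (Scope scope : Scope.values()) {
            String value = stored.get(scope).get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.ofNullable(yamlDefaults.get(key));
    }
}
```

Under these assumptions, a GLOBAL value for execution.adapter.default shadows the YAML default, and a PROJECT value shadows both.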
So that the wizard can always pre-fill current stable versions, the platform keeps a cache of version lookups. The cache refreshes daily at 03:00 via a @Scheduled job; a second job at 04:00 deletes expired wizard_draft rows (TTL 30 days).
Version lookups follow the strategy pattern: a VersionLookupClient interface, three implementations:
- Maven Central: https://search.maven.org/solrsearch/select for Java libraries (Spring Boot, ArchUnit, OWASP Dependency-Check).
- GitHub Releases: https://api.github.com/repos/<owner>/<repo>/releases/latest (e.g. for the Trivy CLI).
- npm registry: https://registry.npmjs.org/<package> for Node packages (Playwright).

The cache key is a functional constant, not a technical path, e.g. spring-boot.3.x, archunit.latest, owasp.dependency-check.latest, trivy.cli.latest, playwright.npm.latest. That lets us swap the source without breaking consumers.
Staleness is computed in Java (threshold: 25 hours, a small buffer over the daily 03:00 refresh): now - refreshed_at > 25h. We deliberately do not use a Postgres GENERATED column, because the H2 test profile would trip over it.
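The staleness rule translates directly into `java.time`. A minimal sketch follows; the class name `VersionCacheStaleness` is hypothetical, and treating a row that was never refreshed as stale is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the Java-side staleness check. */
final class VersionCacheStaleness {
    static final Duration THRESHOLD = Duration.ofHours(25);

    /** True when now - refreshedAt is strictly greater than 25 hours. */
    static boolean isStale(Instant refreshedAt, Instant now) {
        if (refreshedAt == null) {
            return true; // assumption: never refreshed counts as stale
        }
        return Duration.between(refreshedAt, now).compareTo(THRESHOLD) > 0;
    }

    public static void main(String[] args) {
        Instant refreshed = Instant.parse("2025-01-01T03:00:00Z");
        System.out.println(isStale(refreshed, Instant.parse("2025-01-02T03:30:00Z"))); // false
        System.out.println(isStale(refreshed, Instant.parse("2025-01-02T04:00:01Z"))); // true
    }
}
```

Passing `now` in explicitly keeps the check deterministic in tests and independent of the database.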
Admins can inspect the cache at /einstellungen/wizard/versions. Each row shows cache key, returned value, refreshed_at, last_error and a Stale badge; a button "Refresh now" triggers a synchronous lookup for that key.
Every run gets its own workspace: a local directory under SOFTWAREFABRIK_WORKSPACES_ROOT/<project-slug>/<run-id>. The adapter (e.g. claudecode) is launched in that directory as its own process (LocalProcessSandbox). Each run thus has its own Git history, its own build output, its own node_modules or target folder, so runs cannot accidentally clobber each other.
LocalProcessSandbox sets environment variables cleanly per process (no System.setenv at platform level), terminates processes hard on cancel/timeout (destroyForcibly()) and writes stdout and stderr line-by-line into the run_log table while pushing the same lines into the SSE stream.
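The behaviour described for the local sandbox maps onto `ProcessBuilder`. The sketch below is an illustration under assumptions, not the actual LocalProcessSandbox: the class and method names are hypothetical, stderr is merged into stdout for brevity, and the `lineSink` callback stands in for the run_log write and the SSE push.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a per-run process launch. */
final class LocalProcessSketch {

    static ProcessBuilder prepare(List<String> command, Path workspace, Map<String, String> env) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workspace.toFile()); // the CLI runs inside the run's workspace
        builder.environment().putAll(env);     // affects only the child, not the platform JVM
        builder.redirectErrorStream(true);     // simplification: one merged line stream
        return builder;
    }

    static int run(ProcessBuilder builder, Consumer<String> lineSink, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = builder.start();
        // Read output on a separate thread so the timeout below is not blocked by readLine().
        Thread pump = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lineSink.accept(line);
                }
            } catch (IOException ignored) {
                // The stream closes when the process is killed.
            }
        });
        pump.start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // hard stop on timeout
            process.waitFor();
        }
        pump.join();
        return process.exitValue();
    }
}
```

Because the environment is set on the `ProcessBuilder`, two concurrent runs can carry different API keys without interfering with each other.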
ContainerProcessSandbox ships alongside (ADR-0011 Accepted, variant B). It starts every agent in an ephemeral Docker or Podman container with --cpus 2 --memory 4g --pids-limit 512 --read-only, a bind mount on the workspace and --network=none by default. It is enabled via the setting execution.sandbox.variant=container; if Docker is missing from the PATH, the ExecutionSandboxFactory falls back to the local variant with a log warning. Single-tenant hosts notice nothing; the Enterprise tier can flip without code changes.

Security in this architecture is not one module, it's a cross-cutting discipline. The points that matter:
- Bootstrap admin: created from SOFTWAREFABRIK_ADMIN_USER + ...PASSWORD. If the password is omitted, no admin is created; weak defaults would be a bug.
- Secrets: API keys are stored encrypted (AES-GCM) with the master key from SOFTWAREFABRIK_SECRETS_MASTER_KEY in Postgres. Never plaintext, never in logs.
- Audit: security-relevant events go to audit_event. Append-only, with timestamp, user subject and a diff of the change.
- Approvals: gated phases (…, EXECUTION, COMPLETION) wait for an explicit user decision. The default policy is conservative: approval enabled at plan and review boundaries.
- Network: Postgres is bound to 127.0.0.1:5432 instead of 0.0.0.0:5432, so the port isn't accidentally exposed externally. Trivy/OWASP scans run per CI build.

The settings module is the platform's control panel. It combines three design choices that work together:
- Three scopes (GLOBAL, USER, PROJECT). SettingService.resolve(key, context) searches narrow-to-wide: PROJECT first, then USER, then GLOBAL, then YAML.
- 5-minute TTL cache: SettingService caches resolved values, so a resolve is usually served from memory. UI changes don't hard-invalidate the cache; at most 5 minutes later the new value is live. No app restart required.
- Audit trail: every change to app_setting creates an audit_event entry with subject, key, old value, new value. Who changed what and when is always traceable.

Concrete settings managed today:
| Key | Meaning | Typical value |
|---|---|---|
| workspace.root | Root directory for all run workspaces | /var/softwarefabrik/workspaces |
| git.user.name | Git author for auto-commits | Software Factory Bot |
| git.user.email | Git email | bot@softwarefabrik.local |
| execution.adapter.default | Default execution adapter | claudecode |
| execution.model.claudecode | Default model for Claude Code | claude-sonnet-4-6 |
| execution.model.codex | Default model for Codex | gpt-5 |
| budget.tokens.daily | Daily token cap | 2000000 |
| budget.tokens.weekly | Weekly token cap | 10000000 |
| budget.threshold.soft | Soft threshold in percent (warns only) | 80 |
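The 5-minute TTL cache in front of these settings can be sketched as follows. This is an illustration under assumptions, not the SettingService implementation: the class name `TtlCacheSketch`, the injected time source and the loader function are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of a time-to-live cache with lazy reload. */
final class TtlCacheSketch {

    private static final class Entry {
        final String value;
        final Instant loadedAt;

        Entry(String value, Instant loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Supplier<Instant> now;           // injected so tests can control time
    private final Function<String, String> loader; // e.g. a lookup in app_setting

    TtlCacheSketch(Duration ttl, Supplier<Instant> now, Function<String, String> loader) {
        this.ttl = ttl;
        this.now = now;
        this.loader = loader;
    }

    String get(String key) {
        Instant current = now.get();
        Entry entry = entries.get(key);
        // Reload when nothing is cached yet or the cached value has reached its TTL.
        if (entry == null || Duration.between(entry.loadedAt, current).compareTo(ttl) >= 0) {
            entry = new Entry(loader.apply(key), current);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

No explicit invalidation is needed: a changed value simply becomes visible once the cached entry expires, which matches the "at most 5 minutes later" behaviour.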
The run detail page streams three things live from the backend to the browser: logs, phase updates and the token counter. Instead of polling, the platform uses Server-Sent Events (SSE), an open HTTP stream that pushes one-way messages from the backend.
- Heartbeat: the server regularly sends a :keepalive comment. That keeps the connection alive and exposes dead TCP sockets quickly.
- Reconnect: the stream sets retry: 5000, so the browser waits 5 s before reconnecting.

SSE endpoints sit at /runs/{id}/stream/logs, /runs/{id}/stream/phases, /runs/{id}/stream/tokens. If a reverse proxy blocks HTTP streaming (some corporate setups do), the UI falls back to a 2-second polling loop with the same payload.
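On the wire, SSE is plain text: a line starting with a colon is a comment, `retry:` sets the reconnect delay in milliseconds, and a blank line ends each message. The sketch below formats such frames; the class name `SseFrames` and the event name `log` are hypothetical, and the platform's actual payload format is not shown here.

```java
/** Hypothetical helper that renders Server-Sent Events frames as text. */
final class SseFrames {

    /** Comment frame: ignored by the browser, but keeps the connection busy. */
    static String keepAlive() {
        return ":keepalive\n\n";
    }

    /** Tells the browser how long to wait before reconnecting. */
    static String retry(long millis) {
        return "retry: " + millis + "\n\n";
    }

    /** Named event; multi-line data needs one "data:" field per line. */
    static String event(String name, String data) {
        StringBuilder frame = new StringBuilder("event: ").append(name).append('\n');
        for (String line : data.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        return frame.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(retry(5000));
        System.out.print(event("log", "Compiling module app"));
        System.out.print(keepAlive());
    }
}
```

In a Spring MVC application, a class such as `SseEmitter` usually takes care of this formatting; the sketch only shows what ends up on the stream.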
Tests are not optional; they are part of the architecture. Three pillars:
- Unit tests: app/src/test/java/.../<module>/<…Service>Test.java. Fast, no Spring context, Mockito-based.
- Slice tests (@WebMvcTest, @DataJpaTest) against H2.
- Architecture tests: wire web straight into infrastructure and a test fails.

Coverage gate: JaCoCo 80/80 (lines + branches). A service merged without tests doesn't pass mvn verify. That's intentional: the platform is no longer a prototype.
CI pipeline (.github/workflows/ci.yml): build + test + ArchUnit + JaCoCo + OWASP Dependency-Check + Trivy file scan. OWASP runs non-blocking without an NVD key, Trivy only in its dedicated job.
What the platform omits is as important as what it does. These omissions keep v1 readable and extensible:
- Builds (mvn verify or npm run build) run inside the workspace, not on a build cluster. Anyone wanting to scale that builds their own infrastructure.

Besides the product backend there is a separate stack for identity and licensing, decoupled from the platform DB.
The client starts in a registration-free DEMO mode (mock adapter only, hard-coded limits, no server contact). After registration via the device flow, a COMMUNITY license is created automatically; an admin can upgrade the tier to FULL in the admin UI. The lease determines which adapter may be used and how run creation and team size are limited.