openai/symphony
Mirrored from https://github.com/openai/symphony
SPEC.md
# Symphony Service Specification

Status: Draft v1 (language-agnostic)

Purpose: Define a service that orchestrates coding agents to get project work done.

## 1. Problem Statement

Symphony is a long-running automation service that continuously reads work from an issue tracker
(Linear in this specification version), creates an isolated workspace for each issue, and runs a
coding agent session for that issue inside the workspace.

The service solves four operational problems:

- It turns issue execution into a repeatable daemon workflow instead of manual scripts.
- It isolates agent execution in per-issue workspaces so agent commands run only inside per-issue
  workspace directories.
- It keeps the workflow policy in-repo (`WORKFLOW.md`) so teams version the agent prompt and runtime
  settings with their code.
- It provides enough observability to operate and debug multiple concurrent agent runs.

Implementations are expected to document their trust and safety posture explicitly. This
specification does not require a single approval, sandbox, or operator-confirmation policy; some
implementations may target trusted environments with a high-trust configuration, while others may
require stricter approvals or sandboxing.

Important boundary:

- Symphony is a scheduler/runner and tracker reader.
- Ticket writes (state transitions, comments, PR links) are typically performed by the coding agent
  using tools available in the workflow/runtime environment.
- A successful run may end at a workflow-defined handoff state (for example `Human Review`), not
  necessarily `Done`.

## 2. Goals and Non-Goals

### 2.1 Goals

- Poll the issue tracker on a fixed cadence and dispatch work with bounded concurrency.
- Maintain a single authoritative orchestrator state for dispatch, retries, and reconciliation.
- Create deterministic per-issue workspaces and preserve them across runs.
- Stop active runs when issue state changes make them ineligible.
- Recover from transient failures with exponential backoff.
- Load runtime behavior from a repository-owned `WORKFLOW.md` contract.
- Expose operator-visible observability (at minimum structured logs).
- Support restart recovery without requiring a persistent database.

### 2.2 Non-Goals

- Rich web UI or multi-tenant control plane.
- Prescribing a specific dashboard or terminal UI implementation.
- General-purpose workflow engine or distributed job scheduler.
- Built-in business logic for how to edit tickets, PRs, or comments. (That logic lives in the
  workflow prompt and agent tooling.)
- Mandating strong sandbox controls beyond what the coding agent and host OS provide.
- Mandating a single default approval, sandbox, or operator-confirmation posture for all
  implementations.

## 3. System Overview

### 3.1 Main Components

1. `Workflow Loader`
   - Reads `WORKFLOW.md`.
   - Parses YAML front matter and prompt body.
   - Returns `{config, prompt_template}`.

2. `Config Layer`
   - Exposes typed getters for workflow config values.
   - Applies defaults and environment variable indirection.
   - Performs validation used by the orchestrator before dispatch.

3. `Issue Tracker Client`
   - Fetches candidate issues in active states.
   - Fetches current states for specific issue IDs (reconciliation).
   - Fetches terminal-state issues during startup cleanup.
   - Normalizes tracker payloads into a stable issue model.

4. `Orchestrator`
   - Owns the poll tick.
   - Owns the in-memory runtime state.
   - Decides which issues to dispatch, retry, stop, or release.
   - Tracks session metrics and retry queue state.

5. `Workspace Manager`
   - Maps issue identifiers to workspace paths.
   - Ensures per-issue workspace directories exist.
   - Runs workspace lifecycle hooks.
   - Cleans workspaces for terminal issues.

6. `Agent Runner`
   - Creates workspace.
   - Builds prompt from issue + workflow template.
   - Launches the coding agent app-server client.
   - Streams agent updates back to the orchestrator.

7. `Status Surface` (optional)
   - Presents human-readable runtime status (for example terminal output, dashboard, or other
     operator-facing view).

8. `Logging`
   - Emits structured runtime logs to one or more configured sinks.

### 3.2 Abstraction Levels

Symphony is easiest to port when kept in these layers:

1. `Policy Layer` (repo-defined)
   - `WORKFLOW.md` prompt body.
   - Team-specific rules for ticket handling, validation, and handoff.

2. `Configuration Layer` (typed getters)
   - Parses front matter into typed runtime settings.
   - Handles defaults, environment tokens, and path normalization.

3. `Coordination Layer` (orchestrator)
   - Polling loop, issue eligibility, concurrency, retries, reconciliation.

4. `Execution Layer` (workspace + agent subprocess)
   - Filesystem lifecycle, workspace preparation, coding-agent protocol.

5. `Integration Layer` (Linear adapter)
   - API calls and normalization for tracker data.

6. `Observability Layer` (logs + optional status surface)
   - Operator visibility into orchestrator and agent behavior.

### 3.3 External Dependencies

- Issue tracker API (Linear for `tracker.kind: linear` in this specification version).
- Local filesystem for workspaces and logs.
- Optional workspace population tooling (for example Git CLI, if used).
- Coding-agent executable that supports JSON-RPC-like app-server mode over stdio.
- Host environment authentication for the issue tracker and coding agent.

## 4. Core Domain Model

### 4.1 Entities

#### 4.1.1 Issue

Normalized issue record used by orchestration, prompt rendering, and observability output.

Fields:

- `id` (string)
  - Stable tracker-internal ID.
- `identifier` (string)
  - Human-readable ticket key (example: `ABC-123`).
- `title` (string)
- `description` (string or null)
- `priority` (integer or null)
  - Lower numbers are higher priority in dispatch sorting.
- `state` (string)
  - Current tracker state name.
- `branch_name` (string or null)
  - Tracker-provided branch metadata if available.
- `url` (string or null)
- `labels` (list of strings)
  - Normalized to lowercase.
- `blocked_by` (list of blocker refs)
  - Each blocker ref contains:
    - `id` (string or null)
    - `identifier` (string or null)
    - `state` (string or null)
- `created_at` (timestamp or null)
- `updated_at` (timestamp or null)

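As an illustration only, the issue record above could be modeled with plain dataclasses. The class names `Issue` and `BlockerRef` and the helper `normalize_labels` are hypothetical; the specification defines only the fields, not a concrete type.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BlockerRef:
    """One entry of `blocked_by`; every field may be null in tracker data."""
    id: Optional[str] = None
    identifier: Optional[str] = None
    state: Optional[str] = None


@dataclass(frozen=True)
class Issue:
    """Normalized issue record (hypothetical shape, fields from Section 4.1.1)."""
    id: str
    identifier: str
    title: str
    state: str
    description: Optional[str] = None
    priority: Optional[int] = None
    branch_name: Optional[str] = None
    url: Optional[str] = None
    labels: tuple[str, ...] = ()
    blocked_by: tuple[BlockerRef, ...] = ()
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None


def normalize_labels(raw_labels):
    """Lowercase labels, as the field description requires."""
    return tuple(label.lower() for label in raw_labels)


issue = Issue(
    id="9f1c",
    identifier="ABC-123",
    title="Fix login redirect",
    state="Todo",
    labels=normalize_labels(["Bug", "UI"]),
)
```

Immutable records are a design choice of this sketch, not a requirement: they make it harder for components other than the orchestrator to mutate shared issue data.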
#### 4.1.2 Workflow Definition

Parsed `WORKFLOW.md` payload:

- `config` (map)
  - YAML front matter root object.
- `prompt_template` (string)
  - Markdown body after front matter, trimmed.

#### 4.1.3 Service Config (Typed View)

Typed runtime values derived from `WorkflowDefinition.config` plus environment resolution.

Examples:

- poll interval
- workspace root
- active and terminal issue states
- concurrency limits
- coding-agent executable/args/timeouts
- workspace hooks

#### 4.1.4 Workspace

Filesystem workspace assigned to one issue identifier.

Fields (logical):

- `path` (workspace path; current runtime typically uses absolute paths, but relative roots are
  possible if configured without path separators)
- `workspace_key` (sanitized issue identifier)
- `created_now` (boolean, used to gate `after_create` hook)

#### 4.1.5 Run Attempt

One execution attempt for one issue.

Fields (logical):

- `issue_id`
- `issue_identifier`
- `attempt` (integer or null, `null` for first run, `>=1` for retries/continuation)
- `workspace_path`
- `started_at`
- `status`
- `error` (optional)

#### 4.1.6 Live Session (Agent Session Metadata)

State tracked while a coding-agent subprocess is running.

Fields:

- `session_id` (string, `<thread_id>-<turn_id>`)
- `thread_id` (string)
- `turn_id` (string)
- `codex_app_server_pid` (string or null)
- `last_codex_event` (string/enum or null)
- `last_codex_timestamp` (timestamp or null)
- `last_codex_message` (summarized payload)
- `codex_input_tokens` (integer)
- `codex_output_tokens` (integer)
- `codex_total_tokens` (integer)
- `last_reported_input_tokens` (integer)
- `last_reported_output_tokens` (integer)
- `last_reported_total_tokens` (integer)
- `turn_count` (integer)
  - Number of coding-agent turns started within the current worker lifetime.

#### 4.1.7 Retry Entry

Scheduled retry state for an issue.

Fields:

- `issue_id`
- `identifier` (best-effort human ID for status surfaces/logs)
- `attempt` (integer, 1-based for retry queue)
- `due_at_ms` (monotonic clock timestamp)
- `timer_handle` (runtime-specific timer reference)
- `error` (string or null)

#### 4.1.8 Orchestrator Runtime State

Single authoritative in-memory state owned by the orchestrator.

Fields:

- `poll_interval_ms` (current effective poll interval)
- `max_concurrent_agents` (current effective global concurrency limit)
- `running` (map `issue_id -> running entry`)
- `claimed` (set of issue IDs reserved/running/retrying)
- `retry_attempts` (map `issue_id -> RetryEntry`)
- `completed` (set of issue IDs; bookkeeping only, not dispatch gating)
- `codex_totals` (aggregate tokens + runtime seconds)
- `codex_rate_limits` (latest rate-limit snapshot from agent events)

### 4.2 Stable Identifiers and Normalization Rules

- `Issue ID`
  - Use for tracker lookups and internal map keys.
- `Issue Identifier`
  - Use for human-readable logs and workspace naming.
- `Workspace Key`
  - Derive from `issue.identifier` by replacing any character not in `[A-Za-z0-9._-]` with `_`.
  - Use the sanitized value for the workspace directory name.
- `Normalized Issue State`
  - Compare states after `trim` + `lowercase`.
- `Session ID`
  - Compose from coding-agent `thread_id` and `turn_id` as `<thread_id>-<turn_id>`.

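The normalization rules above are small enough to sketch directly. The function names below are hypothetical; only the character class, the `trim` + `lowercase` comparison, and the `<thread_id>-<turn_id>` format come from this section.

```python
import re

# Any character outside [A-Za-z0-9._-] is replaced with "_".
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def workspace_key(identifier: str) -> str:
    """Derive the workspace directory name from an issue identifier."""
    return _UNSAFE_CHARS.sub("_", identifier)


def normalize_state(state: str) -> str:
    """Normalize a tracker state name for comparison (trim + lowercase)."""
    return state.strip().lower()


def session_id(thread_id: str, turn_id: str) -> str:
    """Compose the session ID from the coding-agent thread and turn IDs."""
    return f"{thread_id}-{turn_id}"
```

For example, `workspace_key("team/ABC 123")` yields `team_ABC_123`, and `normalize_state("  In Progress ")` yields `in progress`.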
## 5. Workflow Specification (Repository Contract)

### 5.1 File Discovery and Path Resolution

Workflow file path precedence:

1. Explicit application/runtime setting (set by CLI startup path).
2. Default: `WORKFLOW.md` in the current process working directory.

Loader behavior:

- If the file cannot be read, return `missing_workflow_file` error.
- The workflow file is expected to be repository-owned and version-controlled.

### 5.2 File Format

`WORKFLOW.md` is a Markdown file with optional YAML front matter.

Design note:

- `WORKFLOW.md` should be self-contained enough to describe and run different workflows (prompt,
  runtime settings, hooks, and tracker selection/config) without requiring out-of-band
  service-specific configuration.

Parsing rules:

- If the file starts with `---`, parse lines until the next `---` as YAML front matter.
- Remaining lines become the prompt body.
- If front matter is absent, treat the entire file as prompt body and use an empty config map.
- YAML front matter must decode to a map/object; non-map YAML is an error.
- Prompt body is trimmed before use.

Returned workflow object:

- `config`: front matter root object (not nested under a `config` key).
- `prompt_template`: trimmed Markdown body.

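A minimal sketch of the splitting step in the parsing rules above, using only the standard library. `split_workflow` is a hypothetical name, and raising `ValueError` for an unterminated front matter block is an assumption; the specification does not say how that case is reported. Decoding the returned YAML text into a map is left to a YAML library and is not shown.

```python
from typing import Optional, Tuple


def split_workflow(text: str) -> Tuple[Optional[str], str]:
    """Split WORKFLOW.md text into (front_matter_yaml, prompt_body).

    Returns None for the front matter when the file does not start with '---'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No front matter: the whole file is the prompt body.
        return None, text.strip()

    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front_matter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).strip()
            return front_matter, body

    # Assumption: an unterminated block is treated as a parse failure.
    raise ValueError("front matter opened with '---' but never closed")


example = "---\ntracker:\n  kind: linear\n---\n\nFix {{ issue.identifier }}.\n"
front_matter, prompt_body = split_workflow(example)
```

With the example text, `front_matter` holds the two YAML lines between the delimiters and `prompt_body` holds the trimmed template line.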
### 5.3 Front Matter Schema

Top-level keys:

- `tracker`
- `polling`
- `workspace`
- `hooks`
- `agent`
- `codex`

Unknown keys should be ignored for forward compatibility.

Note:

- The workflow front matter is extensible. Optional extensions may define additional top-level keys
  (for example `server`) without changing the core schema above.
- Extensions should document their field schema, defaults, validation rules, and whether changes
  apply dynamically or require restart.
- Common extension: `server.port` (integer) enables the optional HTTP server described in Section
  13.7.

#### 5.3.1 `tracker` (object)

Fields:

- `kind` (string)
  - Required for dispatch.
  - Current supported value: `linear`
- `endpoint` (string)
  - Default for `tracker.kind == "linear"`: `https://api.linear.app/graphql`
- `api_key` (string)
  - May be a literal token or `$VAR_NAME`.
  - Canonical environment variable for `tracker.kind == "linear"`: `LINEAR_API_KEY`.
  - If `$VAR_NAME` resolves to an empty string, treat the key as missing.
- `project_slug` (string)
  - Required for dispatch when `tracker.kind == "linear"`.
- `active_states` (list of strings or comma-separated string)
  - Default: `Todo`, `In Progress`
- `terminal_states` (list of strings or comma-separated string)
  - Default: `Closed`, `Cancelled`, `Canceled`, `Duplicate`, `Done`

#### 5.3.2 `polling` (object)

Fields:

- `interval_ms` (integer or string integer)
  - Default: `30000`
  - Changes should be re-applied at runtime and affect future tick scheduling without restart.

#### 5.3.3 `workspace` (object)

Fields:

- `root` (path string or `$VAR`)
  - Default: `<system-temp>/symphony_workspaces`
  - `~` and strings containing path separators are expanded.
  - Bare strings without path separators are preserved as-is (relative roots are allowed but
    discouraged).

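One possible reading of the `workspace.root` rules, sketched with the standard library. The function name is hypothetical, and two points are assumptions rather than spec text: "expanded" is taken to mean home expansion followed by conversion to an absolute path, and an unset `$VAR` falls back to the default.

```python
import os
import tempfile


def resolve_workspace_root(value, env=None):
    """Resolve `workspace.root` into the path the workspace manager will use."""
    env = os.environ if env is None else env
    default = os.path.join(tempfile.gettempdir(), "symphony_workspaces")

    if not value:
        return default
    if value.startswith("$"):
        # `$VAR` indirection; assumption: empty or unset falls back to the default.
        value = env.get(value[1:], "")
        if not value:
            return default
    if value.startswith("~") or "/" in value or os.sep in value:
        # Assumption: expansion means "~" expansion plus an absolute path.
        return os.path.abspath(os.path.expanduser(value))
    # Bare strings without path separators are preserved as-is.
    return value
```

Under these assumptions, `resolve_workspace_root("workspaces")` returns `"workspaces"` unchanged, while `resolve_workspace_root("~/ws")` returns an absolute path under the user's home directory.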
#### 5.3.4 `hooks` (object)

Fields:

- `after_create` (multiline shell script string, optional)
  - Runs only when a workspace directory is newly created.
  - Failure aborts workspace creation.
- `before_run` (multiline shell script string, optional)
  - Runs before each agent attempt after workspace preparation and before launching the coding
    agent.
  - Failure aborts the current attempt.
- `after_run` (multiline shell script string, optional)
  - Runs after each agent attempt (success, failure, timeout, or cancellation) once the workspace
    exists.
  - Failure is logged but ignored.
- `before_remove` (multiline shell script string, optional)
  - Runs before workspace deletion if the directory exists.
  - Failure is logged but ignored; cleanup still proceeds.
- `timeout_ms` (integer, optional)
  - Default: `60000`
  - Applies to all workspace hooks.
  - Non-positive values should be treated as invalid and fall back to the default.
  - Changes should be re-applied at runtime for future hook executions.

#### 5.3.5 `agent` (object)

Fields:

- `max_concurrent_agents` (integer or string integer)
  - Default: `10`
  - Changes should be re-applied at runtime and affect subsequent dispatch decisions.
- `max_turns` (integer)
  - Default: `20`
  - Upper bound on back-to-back coding-agent turns within one worker session (see Section 7.1).
- `max_retry_backoff_ms` (integer or string integer)
  - Default: `300000` (5 minutes)
  - Changes should be re-applied at runtime and affect future retry scheduling.
- `max_concurrent_agents_by_state` (map `state_name -> positive integer`)
  - Default: empty map.
  - State keys are normalized (`trim` + `lowercase`) for lookup.
  - Invalid entries (non-positive or non-numeric) are ignored.

#### 5.3.6 `codex` (object)

Fields:

For Codex-owned config values such as `approval_policy`, `thread_sandbox`, and
`turn_sandbox_policy`, supported values are defined by the targeted Codex app-server version.
Implementors should treat them as pass-through Codex config values rather than relying on a
hand-maintained enum in this spec. To inspect the installed Codex schema, run
`codex app-server generate-json-schema --out <dir>` and inspect the relevant definitions referenced
by `v2/ThreadStartParams.json` and `v2/TurnStartParams.json`. Implementations may validate these
fields locally if they want stricter startup checks.

- `command` (string shell command)
  - Default: `codex app-server`
  - The runtime launches this command via `bash -lc` in the workspace directory.
  - The launched process must speak a compatible app-server protocol over stdio.
- `approval_policy` (Codex `AskForApproval` value)
  - Default: implementation-defined.
- `thread_sandbox` (Codex `SandboxMode` value)
  - Default: implementation-defined.
- `turn_sandbox_policy` (Codex `SandboxPolicy` value)
  - Default: implementation-defined.
- `turn_timeout_ms` (integer)
  - Default: `3600000` (1 hour)
- `read_timeout_ms` (integer)
  - Default: `5000`
- `stall_timeout_ms` (integer)
  - Default: `300000` (5 minutes)
  - If `<= 0`, stall detection is disabled.

### 5.4 Prompt Template Contract

The Markdown body of `WORKFLOW.md` is the per-issue prompt template.

Rendering requirements:

- Use a strict template engine (Liquid-compatible semantics are sufficient).
- Unknown variables must fail rendering.
- Unknown filters must fail rendering.

Template input variables:

- `issue` (object)
  - Includes all normalized issue fields, including labels and blockers.
- `attempt` (integer or null)
  - `null`/absent on first attempt.
  - Integer on retry or continuation run.

Fallback prompt behavior:

- If the workflow prompt body is empty, the runtime may use a minimal default prompt
  (`You are working on an issue from Linear.`).
- Workflow file read/parse failures are configuration/validation errors and should not silently fall
  back to a prompt.

### 5.5 Workflow Validation and Error Surface

Error classes:

- `missing_workflow_file`
- `workflow_parse_error`
- `workflow_front_matter_not_a_map`
- `template_parse_error` (during prompt rendering)
- `template_render_error` (unknown variable/filter, invalid interpolation)

Dispatch gating behavior:

- Workflow file read/YAML errors block new dispatches until fixed.
- Template errors fail only the affected run attempt.

## 6. Configuration Specification

### 6.1 Source Precedence and Resolution Semantics

Configuration precedence:

1. Workflow file path selection (runtime setting -> cwd default).
2. YAML front matter values.
3. Environment indirection via `$VAR_NAME` inside selected YAML values.
4. Built-in defaults.

Value coercion semantics:

- Path/command fields support:
  - `~` home expansion
  - `$VAR` expansion for env-backed path values
- Apply expansion only to values intended to be local filesystem paths; do not rewrite URIs or
  arbitrary shell command strings.

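The precedence list can be illustrated with a small lookup helper: take the front matter value, resolve a whole-value `$VAR_NAME` reference for fields that opt in, and fall back to the built-in default when nothing usable remains. The helper name `lookup` and the `allow_env` flag are hypothetical; which fields accept indirection is decided by the schema in Section 5.3, not by this sketch.

```python
import os
import re

# Matches a value that consists only of a `$VAR_NAME` reference.
_VAR_REF = re.compile(r"^\$([A-Za-z_][A-Za-z0-9_]*)$")


def lookup(config, dotted_key, default=None, allow_env=False, env=None):
    """Read `a.b.c` from the front matter map with env indirection and defaults."""
    env = os.environ if env is None else env
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]

    if allow_env and isinstance(node, str):
        match = _VAR_REF.match(node)
        if match:
            # An unset variable resolves to an empty string, i.e. "missing".
            node = env.get(match.group(1), "")

    if node is None or node == "":
        return default
    return node


config = {"tracker": {"kind": "linear", "api_key": "$LINEAR_API_KEY"}, "polling": {}}
api_key = lookup(config, "tracker.api_key", allow_env=True, env={"LINEAR_API_KEY": "secret"})
interval = lookup(config, "polling.interval_ms", default=30000)
```

Here `api_key` resolves through the environment map and `interval` falls through to the built-in default because the key is absent from the front matter.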
### 6.2 Dynamic Reload Semantics

Dynamic reload is required:

- The software should watch `WORKFLOW.md` for changes.
- On change, it should re-read and re-apply workflow config and prompt template without restart.
- The software should attempt to adjust live behavior to the new config (for example polling
  cadence, concurrency limits, active/terminal states, codex settings, workspace paths/hooks, and
  prompt content for future runs).
- Reloaded config applies to future dispatch, retry scheduling, reconciliation decisions, hook
  execution, and agent launches.
- Implementations are not required to restart in-flight agent sessions automatically when config
  changes.
- Extensions that manage their own listeners/resources (for example an HTTP server port change) may
  require restart unless the implementation explicitly supports live rebind.
- Implementations should also re-validate/reload defensively during runtime operations (for example
  before dispatch) in case filesystem watch events are missed.
- Invalid reloads should not crash the service; keep operating with the last known good effective
  configuration and emit an operator-visible error.

### 6.3 Dispatch Preflight Validation

This validation is a scheduler preflight run before attempting to dispatch new work. It validates
the workflow/config needed to poll and launch workers, not a full audit of all possible workflow
behavior.

Startup validation:

- Validate configuration before starting the scheduling loop.
- If startup validation fails, fail startup and emit an operator-visible error.

Per-tick dispatch validation:

- Re-validate before each dispatch cycle.
- If validation fails, skip dispatch for that tick, keep reconciliation active, and emit an
  operator-visible error.

Validation checks:

- Workflow file can be loaded and parsed.
- `tracker.kind` is present and supported.
- `tracker.api_key` is present after `$` resolution.
- `tracker.project_slug` is present when required by the selected tracker kind.
- `codex.command` is present and non-empty.

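The checks above could be collected into one function that returns a list of problems, so the caller can fail startup or skip a tick as described. This is a sketch under assumptions: `preflight_errors` and its message strings are hypothetical, the workflow file is assumed to have been loaded and parsed into `config` already, and an absent `codex.command` is assumed to take the documented default.

```python
SUPPORTED_TRACKERS = {"linear"}
DEFAULT_CODEX_COMMAND = "codex app-server"


def preflight_errors(config, env):
    """Return human-readable validation errors; an empty list means dispatch may proceed."""
    errors = []
    tracker = config.get("tracker") or {}

    kind = tracker.get("kind")
    if kind not in SUPPORTED_TRACKERS:
        errors.append("tracker.kind is missing or unsupported")

    api_key = tracker.get("api_key") or ""
    if api_key.startswith("$"):
        # Resolve `$VAR_NAME`; an unset or empty variable counts as missing.
        api_key = env.get(api_key[1:], "")
    if not api_key:
        errors.append("tracker.api_key is missing")

    if kind == "linear" and not tracker.get("project_slug"):
        errors.append("tracker.project_slug is required for linear")

    command = (config.get("codex") or {}).get("command", DEFAULT_CODEX_COMMAND)
    if not isinstance(command, str) or not command.strip():
        errors.append("codex.command is empty")

    return errors


good = {"tracker": {"kind": "linear", "api_key": "$LINEAR_API_KEY", "project_slug": "core"}}
```

Returning every failure at once, instead of stopping at the first, is a choice of this sketch that makes the operator-visible error more useful.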
### 6.4 Config Fields Summary (Cheat Sheet)

This section is intentionally redundant so a coding agent can implement the config layer quickly.

- `tracker.kind`: string, required, currently `linear`
- `tracker.endpoint`: string, default `https://api.linear.app/graphql` when `tracker.kind=linear`
- `tracker.api_key`: string or `$VAR`, canonical env `LINEAR_API_KEY` when `tracker.kind=linear`
- `tracker.project_slug`: string, required when `tracker.kind=linear`
- `tracker.active_states`: list/string, default `Todo, In Progress`
- `tracker.terminal_states`: list/string, default `Closed, Cancelled, Canceled, Duplicate, Done`
- `polling.interval_ms`: integer, default `30000`
- `workspace.root`: path, default `<system-temp>/symphony_workspaces`
- `hooks.after_create`: shell script or null
- `hooks.before_run`: shell script or null
- `hooks.after_run`: shell script or null
- `hooks.before_remove`: shell script or null
- `hooks.timeout_ms`: integer, default `60000`
- `agent.max_concurrent_agents`: integer, default `10`
- `agent.max_turns`: integer, default `20`
- `agent.max_retry_backoff_ms`: integer, default `300000` (5m)
- `agent.max_concurrent_agents_by_state`: map of positive integers, default `{}`
- `codex.command`: shell command string, default `codex app-server`
- `codex.approval_policy`: Codex `AskForApproval` value, default implementation-defined
- `codex.thread_sandbox`: Codex `SandboxMode` value, default implementation-defined
- `codex.turn_sandbox_policy`: Codex `SandboxPolicy` value, default implementation-defined
- `codex.turn_timeout_ms`: integer, default `3600000`
- `codex.read_timeout_ms`: integer, default `5000`
- `codex.stall_timeout_ms`: integer, default `300000`
- `server.port` (extension): integer, optional; enables the optional HTTP server, `0` may be used
  for ephemeral local bind, and CLI `--port` overrides it

## 7. Orchestration State Machine

The orchestrator is the only component that mutates scheduling state. All worker outcomes are
reported back to it and converted into explicit state transitions.

### 7.1 Issue Orchestration States

This is not the same as tracker states (`Todo`, `In Progress`, etc.). This is the service's internal
claim state.

1. `Unclaimed`
   - Issue is not running and has no retry scheduled.

2. `Claimed`
   - Orchestrator has reserved the issue to prevent duplicate dispatch.
   - In practice, claimed issues are either `Running` or `RetryQueued`.

3. `Running`
   - Worker task exists and the issue is tracked in `running` map.

4. `RetryQueued`
   - Worker is not running, but a retry timer exists in `retry_attempts`.

5. `Released`
   - Claim removed because issue is terminal, non-active, missing, or retry path completed without
     re-dispatch.

Important nuance:

- A successful worker exit does not mean the issue is done forever.
- The worker may continue through multiple back-to-back coding-agent turns before it exits.
- After each normal turn completion, the worker re-checks the tracker issue state.
- If the issue is still in an active state, the worker should start another turn on the same live
  coding-agent thread in the same workspace, up to `agent.max_turns`.
- The first turn should use the full rendered task prompt.
- Continuation turns should send only continuation guidance to the existing thread, not resend the
  original task prompt that is already present in thread history.
- Once the worker exits normally, the orchestrator still schedules a short continuation retry
  (about 1 second) so it can re-check whether the issue remains active and needs another worker
  session.

### 7.2 Run Attempt Lifecycle

A run attempt transitions through these phases:

1. `PreparingWorkspace`
2. `BuildingPrompt`
3. `LaunchingAgentProcess`
4. `InitializingSession`
5. `StreamingTurn`
6. `Finishing`
7. `Succeeded`
8. `Failed`
9. `TimedOut`
10. `Stalled`
11. `CanceledByReconciliation`

Distinct terminal reasons are important because retry logic and logs differ.

### 7.3 Transition Triggers

- `Poll Tick`
  - Reconcile active runs.
  - Validate config.
  - Fetch candidate issues.
  - Dispatch until slots are exhausted.

- `Worker Exit (normal)`
  - Remove running entry.
  - Update aggregate runtime totals.
  - Schedule continuation retry (attempt `1`) after the worker exhausts or finishes its in-process
    turn loop.

- `Worker Exit (abnormal)`
  - Remove running entry.
  - Update aggregate runtime totals.
  - Schedule exponential-backoff retry.

- `Codex Update Event`
  - Update live session fields, token counters, and rate limits.

- `Retry Timer Fired`
  - Re-fetch active candidates and attempt re-dispatch, or release claim if no longer eligible.

- `Reconciliation State Refresh`
  - Stop runs whose issue states are terminal or no longer active.

- `Stall Timeout`
  - Kill worker and schedule retry.

### 7.4 Idempotency and Recovery Rules

- The orchestrator serializes state mutations through one authority to avoid duplicate dispatch.
- `claimed` and `running` checks are required before launching any worker.
- Reconciliation runs before dispatch on every tick.
- Restart recovery is tracker-driven and filesystem-driven (no durable orchestrator DB required).
- Startup terminal cleanup removes stale workspaces for issues already in terminal states.

| 679 | ## 8. Polling, Scheduling, and Reconciliation |
| 680 | |
| 681 | ### 8.1 Poll Loop |
| 682 | |
| 683 | At startup, the service validates config, performs startup cleanup, schedules an immediate tick, and |
| 684 | then repeats every `polling.interval_ms`. |
| 685 | |
| 686 | The effective poll interval should be updated when workflow config changes are re-applied. |
| 687 | |
| 688 | Tick sequence: |
| 689 | |
| 690 | 1. Reconcile running issues. |
| 691 | 2. Run dispatch preflight validation. |
| 692 | 3. Fetch candidate issues from tracker using active states. |
| 693 | 4. Sort issues by dispatch priority. |
| 694 | 5. Dispatch eligible issues while slots remain. |
| 695 | 6. Notify observability/status consumers of state changes. |
| 696 | |
| 697 | If per-tick validation fails, dispatch is skipped for that tick, but reconciliation still happens |
| 698 | first. |
| 699 | |
| 700 | ### 8.2 Candidate Selection Rules |
| 701 | |
| 702 | An issue is dispatch-eligible only if all are true: |
| 703 | |
| 704 | - It has `id`, `identifier`, `title`, and `state`. |
| 705 | - Its state is in `active_states` and not in `terminal_states`. |
| 706 | - It is not already in `running`. |
| 707 | - It is not already in `claimed`. |
| 708 | - Global concurrency slots are available. |
| 709 | - Per-state concurrency slots are available. |
| 710 | - Blocker rule for `Todo` state passes: |
| 711 | - If the issue state is `Todo`, do not dispatch when any blocker is non-terminal. |
| 712 | |
| 713 | Sorting order (stable intent): |
| 714 | |
| 715 | 1. `priority` ascending (1..4 are preferred; null/unknown sorts last) |
| 716 | 2. `created_at` oldest first |
| 717 | 3. `identifier` lexicographic tie-breaker |
| 718 | |
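
The blocker rule and the sort order can be expressed as two small helpers. This is a sketch under stated assumptions: issues are plain dicts, each `blocked_by` entry carries a `state` field, state names are compared after trimming and lowercasing, priorities outside 1..4 count as unknown, and a missing `created_at` sorts after known timestamps. None of those details are fixed by the rules above.

```python
from datetime import datetime, timezone
from typing import Iterable

# Sentinel used only so that missing timestamps stay comparable.
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def todo_blockers_pass(issue: dict, terminal_states: Iterable[str]) -> bool:
    """A `Todo` issue is held back while any blocker is non-terminal."""
    if issue["state"].strip().lower() != "todo":
        return True
    terminal = {s.strip().lower() for s in terminal_states}
    # A blocker with an unknown state is treated as non-terminal (assumption).
    return all(
        (blocker.get("state") or "").strip().lower() in terminal
        for blocker in issue.get("blocked_by", [])
    )


def dispatch_sort_key(issue: dict) -> tuple:
    """Sort key: priority ascending, then oldest created_at, then identifier."""
    priority = issue.get("priority")
    known = (
        isinstance(priority, int)
        and not isinstance(priority, bool)
        and 1 <= priority <= 4
    )
    created_at = issue.get("created_at")
    return (
        not known,                  # null/unknown priority sorts last
        priority if known else 0,   # 1..4 ascending
        created_at is None,         # missing timestamps after known ones
        created_at or _EPOCH,       # oldest first
        issue["identifier"],        # lexicographic tie-breaker
    )
```

Usage: `sorted(candidates, key=dispatch_sort_key)` yields the dispatch order; Python's sort is stable, so equal keys keep their fetch order.
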
| 719 | ### 8.3 Concurrency Control |
| 720 | |
| 721 | Global limit: |
| 722 | |
| 723 | - `available_slots = max(max_concurrent_agents - running_count, 0)` |
| 724 | |
| 725 | Per-state limit: |
| 726 | |
| 727 | - `max_concurrent_agents_by_state[state]` if present (state key normalized) |
| 728 | - otherwise fallback to global limit |
| 729 | |
| 730 | The runtime counts issues by their current tracked state in the `running` map. |
| 731 | |
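
The two limits can be computed as follows. This is an illustrative sketch: the helper names are hypothetical, the per-state map is assumed to be keyed by already-normalized state names, and "normalized" is assumed to mean trimmed and lowercased.

```python
from typing import List, Mapping


def normalize_state(state: str) -> str:
    # Assumed normalization: trim and lowercase.
    return state.strip().lower()


def available_global_slots(max_concurrent_agents: int, running_count: int) -> int:
    return max(max_concurrent_agents - running_count, 0)


def available_state_slots(
    state: str,
    running_states: List[str],
    max_concurrent_agents: int,
    max_concurrent_agents_by_state: Mapping[str, int],
) -> int:
    """Slots left for `state`, falling back to the global limit when unset."""
    key = normalize_state(state)
    limit = max_concurrent_agents_by_state.get(key, max_concurrent_agents)
    # Count running issues by their current tracked state.
    in_state = sum(1 for s in running_states if normalize_state(s) == key)
    return max(limit - in_state, 0)
```

An issue is dispatchable only when both functions return a positive number.
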
| 732 | ### 8.4 Retry and Backoff |
| 733 | |
| 734 | Retry entry creation: |
| 735 | |
| 736 | - Cancel any existing retry timer for the same issue. |
| 737 | - Store `attempt`, `identifier`, `error`, `due_at_ms`, and new timer handle. |
| 738 | |
| 739 | Backoff formula: |
| 740 | |
| 741 | - Normal continuation retries after a clean worker exit use a short fixed delay of `1000` ms. |
| 742 | - Failure-driven retries use `delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms)`. |
| 743 | - The computed delay is capped by `agent.max_retry_backoff_ms` (default `300000` ms, i.e. 5 minutes). |
| 744 | |
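
As a worked example of the backoff formula, the helper below computes the delay for a given attempt. It is a sketch that assumes `attempt` is 1-based for failure-driven retries; the function name and the `continuation` flag are illustrative, not part of the specification.

```python
def retry_delay_ms(
    attempt: int,
    max_retry_backoff_ms: int = 300_000,
    continuation: bool = False,
) -> int:
    """Return the retry delay in milliseconds."""
    if continuation:
        # Clean worker exit: short fixed delay.
        return 1000
    exponent = max(attempt - 1, 0)
    return min(10_000 * 2 ** exponent, max_retry_backoff_ms)
```

With the default cap, attempts 1 through 5 wait 10 s, 20 s, 40 s, 80 s, and 160 s; attempt 6 would be 320 s and is therefore capped at 300 s.
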
| 745 | Retry handling behavior: |
| 746 | |
| 747 | 1. Fetch active candidate issues (not all issues). |
| 748 | 2. Find the specific issue by `issue_id`. |
| 749 | 3. If not found, release claim. |
| 750 | 4. If found and still candidate-eligible: |
| 751 | - Dispatch if slots are available. |
| 752 | - Otherwise requeue with error `no available orchestrator slots`. |
| 753 | 5. If found but no longer active, release claim. |
| 754 | |
| 755 | Note: |
| 756 | |
| 757 | - Terminal-state workspace cleanup is handled by startup cleanup and active-run reconciliation |
| 758 | (including terminal transitions for currently running issues). |
| 759 | - Retry handling mainly operates on active candidates and releases claims when the issue is absent, |
| 760 | rather than performing terminal cleanup itself. |
| 761 | |
| 762 | ### 8.5 Active Run Reconciliation |
| 763 | |
| 764 | Reconciliation runs every tick and has two parts. |
| 765 | |
| 766 | Part A: Stall detection |
| 767 | |
| 768 | - For each running issue, compute `elapsed_ms` since: |
| 769 | - `last_codex_timestamp` if any event has been seen, else |
| 770 | - `started_at` |
| 771 | - If `elapsed_ms > codex.stall_timeout_ms`, terminate the worker and queue a retry. |
| 772 | - If `codex.stall_timeout_ms <= 0`, skip stall detection entirely. |
| 773 | |
| 774 | Part B: Tracker state refresh |
| 775 | |
| 776 | - Fetch current issue states for all running issue IDs. |
| 777 | - For each running issue: |
| 778 | - If tracker state is terminal: terminate worker and clean workspace. |
| 779 | - If tracker state is still active: update the in-memory issue snapshot. |
| 780 | - If tracker state is neither active nor terminal: terminate worker without workspace cleanup. |
| 781 | - If state refresh fails, keep workers running and try again on the next tick. |
| 782 | |
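
Both reconciliation parts reduce to small pure decisions, sketched below. The function names and the returned action strings are hypothetical, timestamps are assumed to be milliseconds on a shared clock, and state names are assumed to compare case-insensitively after trimming.

```python
from typing import Iterable, Optional


def is_stalled(
    now_ms: int,
    started_at_ms: int,
    last_codex_timestamp_ms: Optional[int],
    stall_timeout_ms: int,
) -> bool:
    """Part A: stall detection for one running issue."""
    if stall_timeout_ms <= 0:
        return False  # Stall detection disabled.
    if last_codex_timestamp_ms is None:
        reference_ms = started_at_ms  # No event seen yet.
    else:
        reference_ms = last_codex_timestamp_ms
    return now_ms - reference_ms > stall_timeout_ms


def refresh_action(
    tracker_state: str,
    active_states: Iterable[str],
    terminal_states: Iterable[str],
) -> str:
    """Part B: what to do with a running issue after a state refresh."""
    state = tracker_state.strip().lower()
    if state in {s.strip().lower() for s in terminal_states}:
        return "terminate_and_clean_workspace"
    if state in {s.strip().lower() for s in active_states}:
        return "update_snapshot"
    return "terminate_keep_workspace"
```

A failed state refresh produces no action at all: the caller keeps workers running and retries on the next tick.
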
| 783 | ### 8.6 Startup Terminal Workspace Cleanup |
| 784 | |
| 785 | When the service starts: |
| 786 | |
| 787 | 1. Query tracker for issues in terminal states. |
| 788 | 2. For each returned issue identifier, remove the corresponding workspace directory. |
| 789 | 3. If the terminal-issues fetch fails, log a warning and continue startup. |
| 790 | |
| 791 | This prevents stale terminal workspaces from accumulating after restarts. |
| 792 | |
| 793 | ## 9. Workspace Management and Safety |
| 794 | |
| 795 | ### 9.1 Workspace Layout |
| 796 | |
| 797 | Workspace root: |
| 798 | |
| 799 | - `workspace.root` (normalized path; the current config layer expands path-like values and preserves |
| 800 | bare relative names) |
| 801 | |
| 802 | Per-issue workspace path: |
| 803 | |
| 804 | - `<workspace.root>/<sanitized_issue_identifier>` |
| 805 | |
| 806 | Workspace persistence: |
| 807 | |
| 808 | - Workspaces are reused across runs for the same issue. |
| 809 | - Successful runs do not auto-delete workspaces. |
| 810 | |
| 811 | ### 9.2 Workspace Creation and Reuse |
| 812 | |
| 813 | Input: `issue.identifier` |
| 814 | |
| 815 | Algorithm summary: |
| 816 | |
| 817 | 1. Sanitize identifier to `workspace_key`. |
| 818 | 2. Compute workspace path under workspace root. |
| 819 | 3. Ensure the workspace path exists as a directory. |
| 820 | 4. Mark `created_now=true` only if the directory was created during this call; otherwise |
| 821 | `created_now=false`. |
| 822 | 5. If `created_now=true`, run `after_create` hook if configured. |
| 823 | |
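
The algorithm above can be sketched with the standard library. This is an illustration under assumptions: `ensure_workspace` and the `after_create` callable are hypothetical names, hook timeouts are omitted, and rejecting the keys `""`, `"."`, and `".."` is an extra guard that is not spelled out in the steps above.

```python
import os
import re
from typing import Callable, Optional, Tuple


def ensure_workspace(
    root: str,
    identifier: str,
    after_create: Optional[Callable[[str], None]] = None,
) -> Tuple[str, bool]:
    """Create or reuse the per-issue workspace; return (path, created_now)."""
    # Step 1: sanitize the identifier into a workspace key.
    key = re.sub(r"[^A-Za-z0-9._-]", "_", identifier)
    if key in ("", ".", ".."):
        # Assumption: refuse keys that would resolve to the root or its parent.
        raise ValueError(f"unusable workspace key: {key!r}")
    # Step 2: compute the path under the workspace root.
    path = os.path.join(root, key)
    # Steps 3 and 4: create the directory and record whether this call made it.
    try:
        os.makedirs(path)
        created_now = True
    except FileExistsError:
        if not os.path.isdir(path):
            raise  # The path exists but is not a directory.
        created_now = False
    # Step 5: run the hook only for brand-new workspaces.
    if created_now and after_create is not None:
        after_create(path)
    return path, created_now
```

Relying on `FileExistsError` rather than checking for existence first means two concurrent calls cannot both observe `created_now=True`.
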
| 824 | Notes: |
| 825 | |
| 826 | - This section does not assume any specific repository/VCS workflow. |
| 827 | - Workspace preparation beyond directory creation (for example dependency bootstrap, checkout/sync, |
| 828 | code generation) is implementation-defined and is typically handled via hooks. |
| 829 | |
| 830 | ### 9.3 Optional Workspace Population (Implementation-Defined) |
| 831 | |
| 832 | The spec does not require any built-in VCS or repository bootstrap behavior. |
| 833 | |
| 834 | Implementations may populate or synchronize the workspace using implementation-defined logic and/or |
| 835 | hooks (for example `after_create` and/or `before_run`). |
| 836 | |
| 837 | Failure handling: |
| 838 | |
| 839 | - Workspace population/synchronization failures return an error for the current attempt. |
| 840 | - If failure happens while creating a brand-new workspace, implementations may remove the partially |
| 841 | prepared directory. |
| 842 | - Reused workspaces should not be destructively reset on population failure unless that policy is |
| 843 | explicitly chosen and documented. |
| 844 | |
| 845 | ### 9.4 Workspace Hooks |
| 846 | |
| 847 | Supported hooks: |
| 848 | |
| 849 | - `hooks.after_create` |
| 850 | - `hooks.before_run` |
| 851 | - `hooks.after_run` |
| 852 | - `hooks.before_remove` |
| 853 | |
| 854 | Execution contract: |
| 855 | |
| 856 | - Execute in a local shell context appropriate to the host OS, with the workspace directory as |
| 857 | `cwd`. |
| 858 | - On POSIX systems, `sh -lc <script>` (or a stricter equivalent such as `bash -lc <script>`) is a |
| 859 | conforming default. |
| 860 | - Hook timeout uses `hooks.timeout_ms`; default: `60000 ms`. |
| 861 | - Log hook start, failures, and timeouts. |
| 862 | |
| 863 | Failure semantics: |
| 864 | |
| 865 | - `after_create` failure or timeout is fatal to workspace creation. |
| 866 | - `before_run` failure or timeout is fatal to the current run attempt. |
| 867 | - `after_run` failure or timeout is logged and ignored. |
| 868 | - `before_remove` failure or timeout is logged and ignored. |
| 869 | |
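
A hook runner that follows the execution contract and failure semantics might look like the sketch below. It assumes a POSIX host with `sh` on the `PATH`; `run_hook`, `HookError`, and the logger name are hypothetical, and hook output is discarded for brevity where a real implementation would probably capture it for diagnostics.

```python
import logging
import subprocess

logger = logging.getLogger("symphony.hooks")  # hypothetical logger name

# Hooks whose failure or timeout aborts the surrounding operation.
FATAL_HOOKS = frozenset({"after_create", "before_run"})


class HookError(RuntimeError):
    """Raised when a fatal hook fails or times out."""


def run_hook(name: str, script: str, workspace: str, timeout_ms: int = 60_000) -> bool:
    """Run one hook script in the workspace; return True on success."""
    logger.info("hook=%s status=started", name)
    try:
        completed = subprocess.run(
            ["sh", "-lc", script],
            cwd=workspace,
            timeout=timeout_ms / 1000,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        reason = None if completed.returncode == 0 else f"exit_code={completed.returncode}"
    except subprocess.TimeoutExpired:
        reason = "timeout"
    if reason is None:
        return True
    logger.warning("hook=%s status=failed reason=%s", name, reason)
    if name in FATAL_HOOKS:
        raise HookError(f"{name} hook failed: {reason}")
    return False  # after_run / before_remove: logged and ignored.
```

Callers can treat `HookError` as the signal to abort workspace creation or the current run attempt, and ignore a `False` return from the non-fatal hooks.
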
| 870 | ### 9.5 Safety Invariants |
| 871 | |
| 872 | This is the most important portability constraint. |
| 873 | |
| 874 | Invariant 1: Run the coding agent only in the per-issue workspace path. |
| 875 | |
| 876 | - Before launching the coding-agent subprocess, validate: |
| 877 | - `cwd == workspace_path` |
| 878 | |
| 879 | Invariant 2: Workspace path must stay inside workspace root. |
| 880 | |
| 881 | - Normalize both paths to absolute. |
| 882 | - Require `workspace_path` to have `workspace_root` as a prefix directory. |
| 883 | - Reject any path outside the workspace root. |
| 884 | |
| 885 | Invariant 3: Workspace key is sanitized. |
| 886 | |
| 887 | - Only `[A-Za-z0-9._-]` allowed in workspace directory names. |
| 888 | - Replace all other characters with `_`. |
| 889 | |
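
Invariants 2 and 3 can be checked with a few lines of standard-library code. This is a sketch with hypothetical helper names: it compares path components rather than raw string prefixes, it does not resolve symlinks (an implementation may prefer `os.path.realpath`), and rejecting a workspace path equal to the root is an assumption stricter than the text above.

```python
import os
import re


def sanitize_workspace_key(identifier: str) -> str:
    """Invariant 3: replace every character outside [A-Za-z0-9._-] with '_'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", identifier)


def is_inside_root(workspace_path: str, workspace_root: str) -> bool:
    """Invariant 2: the workspace path must be a strict descendant of the root."""
    root = os.path.abspath(workspace_root)
    path = os.path.abspath(workspace_path)  # Also collapses ".." segments.
    if path == root:
        return False
    # commonpath works on whole components, so "/ws-evil" does not match "/ws".
    return os.path.commonpath([root, path]) == root
```

A plain `startswith` check would accept `/ws-evil/ABC-1` for the root `/ws`, which is why the comparison is done per path component.
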
| 890 | ## 10. Agent Runner Protocol (Coding Agent Integration) |
| 891 | |
| 892 | This section defines the language-neutral contract for integrating a coding agent app-server. |
| 893 | |
| 894 | Compatibility profile: |
| 895 | |
| 896 | - The normative contract is message ordering, required behaviors, and the logical fields that must |
| 897 | be extracted (for example session IDs, completion state, approval handling, and usage/rate-limit |
| 898 | telemetry). |
| 899 | - Exact JSON field names may vary slightly across compatible app-server versions. |
| 900 | - Implementations should tolerate equivalent payload shapes when they carry the same logical |
| 901 | meaning, especially for nested IDs, approval requests, user-input-required signals, and |
| 902 | token/rate-limit metadata. |
| 903 | |
| 904 | ### 10.1 Launch Contract |
| 905 | |
| 906 | Subprocess launch parameters: |
| 907 | |
| 908 | - Command: `codex.command` |
| 909 | - Invocation: `bash -lc <codex.command>` |
| 910 | - Working directory: workspace path |
| 911 | - Stdout/stderr: separate streams |
| 912 | - Framing: line-delimited protocol messages on stdout (JSON-RPC-like JSON per line) |
| 913 | |
| 914 | Notes: |
| 915 | |
| 916 | - The default command is `codex app-server`. |
| 917 | - Approval policy, cwd, and prompt are expressed in the protocol messages in Section 10.2. |
| 918 | |
| 919 | Recommended additional process settings: |
| 920 | |
| 921 | - Max line size: 10 MB (for safe buffering) |
| 922 | |
| 923 | ### 10.2 Session Startup Handshake |
| 924 | |
| 925 | Reference: https://developers.openai.com/codex/app-server/ |
| 926 | |
| 927 | The client must send the protocol messages below in order. The numbered steps after the transcript describe each message. |
| 928 | |
| 929 | Illustrative startup transcript (equivalent payload shapes are acceptable if they preserve the same |
| 930 | semantics): |
| 931 | |
| 932 | ```json |
| 933 | {"id":1,"method":"initialize","params":{"clientInfo":{"name":"symphony","version":"1.0"},"capabilities":{}}} |
| 934 | {"method":"initialized","params":{}} |
| 935 | {"id":2,"method":"thread/start","params":{"approvalPolicy":"<implementation-defined>","sandbox":"<implementation-defined>","cwd":"/abs/workspace"}} |
| 936 | {"id":3,"method":"turn/start","params":{"threadId":"<thread-id>","input":[{"type":"text","text":"<rendered prompt-or-continuation-guidance>"}],"cwd":"/abs/workspace","title":"ABC-123: Example","approvalPolicy":"<implementation-defined>","sandboxPolicy":{"type":"<implementation-defined>"}}} |
| 937 | ``` |
| 938 | |
| 939 | 1. `initialize` request |
| 940 | - Params include: |
| 941 | - `clientInfo` object (for example `{name, version}`) |
| 942 | - `capabilities` object (may be empty) |
| 943 | - If the targeted Codex app-server requires capability negotiation for dynamic tools, include the |
| 944 | necessary capability flag(s) here. |
| 945 | - Wait for response (`read_timeout_ms`) |
| 946 | 2. `initialized` notification |
| 947 | 3. `thread/start` request |
| 948 | - Params include: |
| 949 | - `approvalPolicy` = implementation-defined session approval policy value |
| 950 | - `sandbox` = implementation-defined session sandbox value |
| 951 | - `cwd` = absolute workspace path |
| 952 | - If optional client-side tools are implemented, include their advertised tool specs using the |
| 953 | protocol mechanism supported by the targeted Codex app-server version. |
| 954 | 4. `turn/start` request |
| 955 | - Params include: |
| 956 | - `threadId` |
| 957 | - `input` = single text item containing rendered prompt for the first turn, or continuation |
| 958 | guidance for later turns on the same thread |
| 959 | - `cwd` |
| 960 | - `title` = `<issue.identifier>: <issue.title>` |
| 961 | - `approvalPolicy` = implementation-defined turn approval policy value |
| 962 | - `sandboxPolicy` = implementation-defined object-form sandbox policy payload when required by |
| 963 | the targeted app-server version |
| 964 | |
| 965 | Session identifiers: |
| 966 | |
| 967 | - Read `thread_id` from `thread/start` result `result.thread.id` |
| 968 | - Read `turn_id` from each `turn/start` result `result.turn.id` |
| 969 | - Emit `session_id = "<thread_id>-<turn_id>"` |
| 970 | - Reuse the same `thread_id` for all continuation turns inside one worker run |
| 971 | |
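
The `turn/start` payload and the session identifier can be assembled as sketched below. The field names follow the illustrative transcript above and may differ across app-server versions; the function names are hypothetical, and `sandboxPolicy` is left out because it is only needed for some versions.

```python
from typing import Any, Dict


def turn_start_message(
    request_id: int,
    thread_id: str,
    text: str,
    workspace: str,
    issue_identifier: str,
    issue_title: str,
    approval_policy: str,
) -> Dict[str, Any]:
    """Build a `turn/start` request for the first or a continuation turn."""
    return {
        "id": request_id,
        "method": "turn/start",
        "params": {
            "threadId": thread_id,
            "input": [{"type": "text", "text": text}],  # single text item
            "cwd": workspace,
            "title": f"{issue_identifier}: {issue_title}",
            "approvalPolicy": approval_policy,
        },
    }


def session_id(thread_start_response: Dict[str, Any], turn_start_response: Dict[str, Any]) -> str:
    """Derive `<thread_id>-<turn_id>` from the two responses."""
    thread_id = thread_start_response["result"]["thread"]["id"]
    turn_id = turn_start_response["result"]["turn"]["id"]
    return f"{thread_id}-{turn_id}"
```

For continuation turns, only the `turn/start` response changes, so the thread half of the session identifier stays constant within one worker run.
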
| 972 | ### 10.3 Streaming Turn Processing |
| 973 | |
| 974 | The client reads line-delimited messages until the turn terminates. |
| 975 | |
| 976 | Completion conditions: |
| 977 | |
| 978 | - `turn/completed` -> success |
| 979 | - `turn/failed` -> failure |
| 980 | - `turn/cancelled` -> failure |
| 981 | - turn timeout (`turn_timeout_ms`) -> failure |
| 982 | - subprocess exit -> failure |
| 983 | |
| 984 | Continuation processing: |
| 985 | |
| 986 | - If the worker decides to continue after a successful turn, it should issue another `turn/start` |
| 987 | on the same live `threadId`. |
| 988 | - The app-server subprocess should remain alive across those continuation turns and be stopped only |
| 989 | when the worker run is ending. |
| 990 | |
| 991 | Line handling requirements: |
| 992 | |
| 993 | - Read protocol messages from stdout only. |
| 994 | - Buffer partial stdout lines until newline arrives. |
| 995 | - Attempt JSON parse on complete stdout lines. |
| 996 | - Stderr is not part of the protocol stream: |
| 997 | - ignore it or log it as diagnostics |
| 998 | - do not attempt protocol JSON parsing on stderr |
| 999 | |
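
The line handling requirements amount to a small buffering parser for stdout, sketched below. The class name and the `("message", ...)` / `("malformed", ...)` tuple shape are hypothetical; enforcement of a maximum line size is omitted, and skipping blank lines is an assumption.

```python
import json
from typing import Any, List, Tuple


class StdoutLineBuffer:
    """Accumulate stdout chunks and yield one parse result per complete line."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[Tuple[str, Any]]:
        self._buf.extend(chunk)
        events: List[Tuple[str, Any]] = []
        while True:
            newline = self._buf.find(b"\n")
            if newline < 0:
                break  # Partial line: keep buffering until a newline arrives.
            line = bytes(self._buf[:newline])
            del self._buf[: newline + 1]
            if not line.strip():
                continue  # Blank lines are skipped (assumption).
            try:
                events.append(("message", json.loads(line)))
            except ValueError:
                # Covers invalid JSON and invalid UTF-8.
                events.append(("malformed", line.decode("utf-8", "replace")))
        return events
```

Stderr never passes through this buffer; it is read separately and either ignored or logged as diagnostics.
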
| 1000 | ### 10.4 Emitted Runtime Events (Upstream to Orchestrator) |
| 1001 | |
| 1002 | The app-server client emits structured events to the orchestrator callback. Each event should |
| 1003 | include: |
| 1004 | |
| 1005 | - `event` (enum/string) |
| 1006 | - `timestamp` (UTC timestamp) |
| 1007 | - `codex_app_server_pid` (if available) |
| 1008 | - optional `usage` map (token counts) |
| 1009 | - payload fields as needed |
| 1010 | |
| 1011 | Important emitted events may include: |
| 1012 | |
| 1013 | - `session_started` |
| 1014 | - `startup_failed` |
| 1015 | - `turn_completed` |
| 1016 | - `turn_failed` |
| 1017 | - `turn_cancelled` |
| 1018 | - `turn_ended_with_error` |
| 1019 | - `turn_input_required` |
| 1020 | - `approval_auto_approved` |
| 1021 | - `unsupported_tool_call` |
| 1022 | - `notification` |
| 1023 | - `other_message` |
| 1024 | - `malformed` |
| 1025 | |
| 1026 | ### 10.5 Approval, Tool Calls, and User Input Policy |
| 1027 | |
| 1028 | Approval, sandbox, and user-input behavior is implementation-defined. |
| 1029 | |
| 1030 | Policy requirements: |
| 1031 | |
| 1032 | - Each implementation should document its chosen approval, sandbox, and operator-confirmation |
| 1033 | posture. |
| 1034 | - Approval requests and user-input-required events must not leave a run stalled indefinitely. An |
| 1035 | implementation should either satisfy them, surface them to an operator, auto-resolve them, or |
| 1036 | fail the run according to its documented policy. |
| 1037 | |
| 1038 | Example high-trust behavior: |
| 1039 | |
| 1040 | - Auto-approve command execution approvals for the session. |
| 1041 | - Auto-approve file-change approvals for the session. |
| 1042 | - Treat user-input-required turns as hard failure. |
| 1043 | |
| 1044 | Unsupported dynamic tool calls: |
| 1045 | |
| 1046 | - Supported dynamic tool calls that are explicitly implemented and advertised by the runtime should |
| 1047 | be handled according to their extension contract. |
| 1048 | - If the agent requests a dynamic tool call (`item/tool/call`) that is not supported, return a tool |
| 1049 | failure response and continue the session. |
| 1050 | - This prevents the session from stalling on unsupported tool execution paths. |
| 1051 | |
| 1052 | Optional client-side tool extension: |
| 1053 | |
| 1054 | - An implementation may expose a limited set of client-side tools to the app-server session. |
| 1055 | - Current optional standardized tool: `linear_graphql`. |
| 1056 | - If implemented, supported tools should be advertised to the app-server session during startup |
| 1057 | using the protocol mechanism supported by the targeted Codex app-server version. |
| 1058 | - Unsupported tool names should still return a failure result and continue the session. |
| 1059 | |
| 1060 | `linear_graphql` extension contract: |
| 1061 | |
| 1062 | - Purpose: execute a raw GraphQL query or mutation against Linear using Symphony's configured |
| 1063 | tracker auth for the current session. |
| 1064 | - Availability: only meaningful when `tracker.kind == "linear"` and valid Linear auth is configured. |
| 1065 | - Preferred input shape: |
| 1066 | |
| 1067 | ```json |
| 1068 | { |
| 1069 | "query": "single GraphQL query or mutation document", |
| 1070 | "variables": { |
| 1071 | "optional": "graphql variables object" |
| 1072 | } |
| 1073 | } |
| 1074 | ``` |
| 1075 | |
| 1076 | - `query` must be a non-empty string. |
| 1077 | - `query` must contain exactly one GraphQL operation. |
| 1078 | - `variables` is optional and, when present, must be a JSON object. |
| 1079 | - Implementations may additionally accept a raw GraphQL query string as shorthand input. |
| 1080 | - Execute one GraphQL operation per tool call. |
| 1081 | - If the provided document contains multiple operations, reject the tool call as invalid input. |
| 1082 | - `operationName` selection is intentionally out of scope for this extension. |
| 1083 | - Reuse the configured Linear endpoint and auth from the active Symphony workflow/runtime config; do |
| 1084 | not require the coding agent to read raw tokens from disk. |
| 1085 | - Tool result semantics: |
| 1086 | - transport success + no top-level GraphQL `errors` -> `success=true` |
| 1087 | - top-level GraphQL `errors` present -> `success=false`, but preserve the GraphQL response body |
| 1088 | for debugging |
| 1089 | - invalid input, missing auth, or transport failure -> `success=false` with an error payload |
| 1090 | - Return the GraphQL response or error payload as structured tool output that the model can inspect |
| 1091 | in-session. |
| 1092 | |
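
Input validation and result mapping for `linear_graphql` can be sketched as below. This is illustrative only: the function names and the `response` and `error` field names are hypothetical (only `success` comes from the contract above), a non-empty `errors` list is what counts as "errors present", and the single-operation check is omitted because it needs a GraphQL parser.

```python
from typing import Any, Dict


def parse_tool_input(raw: Any) -> Dict[str, Any]:
    """Validate the tool input; raise ValueError on invalid input."""
    if isinstance(raw, str):
        raw = {"query": raw}  # Optional shorthand: a raw query string.
    if not isinstance(raw, dict):
        raise ValueError("input must be an object or a query string")
    query = raw.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    variables = raw.get("variables")
    if variables is not None and not isinstance(variables, dict):
        raise ValueError("variables must be a JSON object")
    return {"query": query, "variables": variables if variables is not None else {}}


def tool_result(response_body: Dict[str, Any]) -> Dict[str, Any]:
    """Map a GraphQL response body to a tool result, preserving the body."""
    has_errors = bool(response_body.get("errors"))
    return {"success": not has_errors, "response": response_body}


def tool_error(message: str) -> Dict[str, Any]:
    """Result for invalid input, missing auth, or transport failure."""
    return {"success": False, "error": message}
```

Preserving the full response body on GraphQL errors lets the model inspect the error messages in-session and correct its query.
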
| 1093 | Illustrative responses (equivalent payload shapes are acceptable if they preserve the same outcome): |
| 1094 | |
| 1095 | ```json |
| 1096 | {"id":"<approval-id>","result":{"approved":true}} |
| 1097 | {"id":"<tool-call-id>","result":{"success":false,"error":"unsupported_tool_call"}} |
| 1098 | ``` |
| 1099 | |
| 1100 | Hard failure on user input requirement: |
| 1101 | |
| 1102 | - If the agent requests user input, fail the run attempt immediately. |
| 1103 | - The client detects this via: |
| 1104 | - explicit method (`item/tool/requestUserInput`), or |
| 1105 | - turn methods/flags indicating input is required. |
| 1106 | |
| 1107 | ### 10.6 Timeouts and Error Mapping |
| 1108 | |
| 1109 | Timeouts: |
| 1110 | |
| 1111 | - `codex.read_timeout_ms`: request/response timeout during startup and sync requests |
| 1112 | - `codex.turn_timeout_ms`: total turn stream timeout |
| 1113 | - `codex.stall_timeout_ms`: enforced by orchestrator based on event inactivity |
| 1114 | |
| 1115 | Error mapping (recommended normalized categories): |
| 1116 | |
| 1117 | - `codex_not_found` |
| 1118 | - `invalid_workspace_cwd` |
| 1119 | - `response_timeout` |
| 1120 | - `turn_timeout` |
| 1121 | - `port_exit` |
| 1122 | - `response_error` |
| 1123 | - `turn_failed` |
| 1124 | - `turn_cancelled` |
| 1125 | - `turn_input_required` |
| 1126 | |
| 1127 | ### 10.7 Agent Runner Contract |
| 1128 | |
| 1129 | The `Agent Runner` wraps workspace + prompt + app-server client. |
| 1130 | |
| 1131 | Behavior: |
| 1132 | |
| 1133 | 1. Create/reuse workspace for issue. |
| 1134 | 2. Build prompt from workflow template. |
| 1135 | 3. Start app-server session. |
| 1136 | 4. Forward app-server events to orchestrator. |
| 1137 | 5. On any error, fail the worker attempt (the orchestrator will retry). |
| 1138 | |
| 1139 | Note: |
| 1140 | |
| 1141 | - Workspaces are intentionally preserved after successful runs. |
| 1142 | |
| 1143 | ## 11. Issue Tracker Integration Contract (Linear-Compatible) |
| 1144 | |
| 1145 | ### 11.1 Required Operations |
| 1146 | |
| 1147 | An implementation must support these tracker adapter operations: |
| 1148 | |
| 1149 | 1. `fetch_candidate_issues()` |
| 1150 | - Return issues in configured active states for a configured project. |
| 1151 | |
| 1152 | 2. `fetch_issues_by_states(state_names)` |
| 1153 | - Used for startup terminal cleanup. |
| 1154 | |
| 1155 | 3. `fetch_issue_states_by_ids(issue_ids)` |
| 1156 | - Used for active-run reconciliation. |
| 1157 | |
| 1158 | ### 11.2 Query Semantics (Linear) |
| 1159 | |
| 1160 | Linear-specific requirements (these apply when `tracker.kind == "linear"`): |
| 1161 | |
| 1163 | - GraphQL endpoint (default `https://api.linear.app/graphql`) |
| 1164 | - Auth token sent in `Authorization` header |
| 1165 | - `tracker.project_slug` maps to Linear project `slugId` |
| 1166 | - Candidate issue query filters project using `project: { slugId: { eq: $projectSlug } }` |
| 1167 | - Issue-state refresh query uses GraphQL issue IDs with variable type `[ID!]` |
| 1168 | - Pagination required for candidate issues |
| 1169 | - Page size default: `50` |
| 1170 | - Network timeout: `30000 ms` |
| 1171 | |
| 1172 | Important: |
| 1173 | |
| 1174 | - Linear GraphQL schema details can drift. Keep query construction isolated and test the exact query |
| 1175 | fields/types required by this specification. |
| 1176 | |
| 1177 | A non-Linear implementation may change transport details, but the normalized outputs must match the |
| 1178 | domain model in Section 4. |
| 1179 | |
| 1180 | ### 11.3 Normalization Rules |
| 1181 | |
| 1182 | Candidate issue normalization should produce fields listed in Section 4.1.1. |
| 1183 | |
| 1184 | Additional normalization details: |
| 1185 | |
| 1186 | - `labels` -> lowercase strings |
| 1187 | - `blocked_by` -> derived from inverse relations where relation type is `blocks` |
| 1188 | - `priority` -> integer only (non-integers become null) |
| 1189 | - `created_at` and `updated_at` -> parse ISO-8601 timestamps |
| 1190 | |
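
The scalar normalization rules can be sketched as follows. The helper names are hypothetical; treating floats such as `2.0` as non-integers and returning `None` for unparseable timestamps are assumptions. The trailing `Z` is rewritten because `datetime.fromisoformat` only accepts it from Python 3.11 onward.

```python
from datetime import datetime
from typing import Any, Iterable, List, Optional


def normalize_labels(labels: Optional[Iterable[Any]]) -> List[str]:
    """Labels become lowercase strings."""
    return [str(label).lower() for label in (labels or [])]


def normalize_priority(value: Any) -> Optional[int]:
    """Integers pass through; everything else becomes None."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


def parse_timestamp(value: Any) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp, or return None if it cannot be parsed."""
    if not isinstance(value, str):
        return None
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None
```
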
| 1191 | ### 11.4 Error Handling Contract |
| 1192 | |
| 1193 | Recommended error categories: |
| 1194 | |
| 1195 | - `unsupported_tracker_kind` |
| 1196 | - `missing_tracker_api_key` |
| 1197 | - `missing_tracker_project_slug` |
| 1198 | - `linear_api_request` (transport failures) |
| 1199 | - `linear_api_status` (non-200 HTTP status) |
| 1200 | - `linear_graphql_errors` |
| 1201 | - `linear_unknown_payload` |
| 1202 | - `linear_missing_end_cursor` (pagination integrity error) |
| 1203 | |
| 1204 | Orchestrator behavior on tracker errors: |
| 1205 | |
| 1206 | - Candidate fetch failure: log and skip dispatch for this tick. |
| 1207 | - Running-state refresh failure: log and keep active workers running. |
| 1208 | - Startup terminal cleanup failure: log warning and continue startup. |
| 1209 | |
| 1210 | ### 11.5 Tracker Writes (Important Boundary) |
| 1211 | |
| 1212 | Symphony does not require first-class tracker write APIs in the orchestrator. |
| 1213 | |
| 1214 | - Ticket mutations (state transitions, comments, PR metadata) are typically handled by the coding |
| 1215 | agent using tools defined by the workflow prompt. |
| 1216 | - The service remains a scheduler/runner and tracker reader. |
| 1217 | - Workflow-specific success often means "reached the next handoff state" (for example |
| 1218 | `Human Review`) rather than tracker terminal state `Done`. |
| 1219 | - If the optional `linear_graphql` client-side tool extension is implemented, it is still part of |
| 1220 | the agent toolchain rather than orchestrator business logic. |
| 1221 | |
| 1222 | ## 12. Prompt Construction and Context Assembly |
| 1223 | |
| 1224 | ### 12.1 Inputs |
| 1225 | |
| 1226 | Inputs to prompt rendering: |
| 1227 | |
| 1228 | - `workflow.prompt_template` |
| 1229 | - normalized `issue` object |
| 1230 | - optional `attempt` integer (retry/continuation metadata) |
| 1231 | |
| 1232 | ### 12.2 Rendering Rules |
| 1233 | |
| 1234 | - Render with strict variable checking. |
| 1235 | - Render with strict filter checking. |
| 1236 | - Convert issue object keys to strings for template compatibility. |
| 1237 | - Preserve nested arrays/maps (labels, blockers) so templates can iterate. |
| 1238 | |
| 1239 | ### 12.3 Retry/Continuation Semantics |
| 1240 | |
| 1241 | `attempt` should be passed to the template because the workflow prompt may provide different |
| 1242 | instructions for: |
| 1243 | |
| 1244 | - first run (`attempt` null or absent) |
| 1245 | - continuation run after a successful prior session |
| 1246 | - retry after error/timeout/stall |
| 1247 | |
| 1248 | ### 12.4 Failure Semantics |
| 1249 | |
| 1250 | If prompt rendering fails: |
| 1251 | |
| 1252 | - Fail the run attempt immediately. |
| 1253 | - Let the orchestrator treat it like any other worker failure and decide retry behavior. |
| 1254 | |
| 1255 | ## 13. Logging, Status, and Observability |
| 1256 | |
| 1257 | ### 13.1 Logging Conventions |
| 1258 | |
| 1259 | Required context fields for issue-related logs: |
| 1260 | |
| 1261 | - `issue_id` |
| 1262 | - `issue_identifier` |
| 1263 | |
| 1264 | Required context for coding-agent session lifecycle logs: |
| 1265 | |
| 1266 | - `session_id` |
| 1267 | |
| 1268 | Message formatting requirements: |
| 1269 | |
| 1270 | - Use stable `key=value` phrasing. |
| 1271 | - Include action outcome (`completed`, `failed`, `retrying`, etc.). |
| 1272 | - Include concise failure reason when present. |
| 1273 | - Avoid logging large raw payloads unless necessary. |
| 1274 | |
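
One way to produce stable `key=value` log lines is sketched below. The `log_line` helper and the choice to JSON-quote values that contain whitespace, quotes, or `=` are assumptions; the specification only requires stable `key=value` phrasing.

```python
import json
from typing import Any


def _format_value(value: Any) -> str:
    text = str(value)
    if text == "" or any(ch.isspace() or ch in '"=' for ch in text):
        return json.dumps(text)  # Quote values that would break key=value parsing.
    return text


def log_line(**fields: Any) -> str:
    """Render fields as space-separated key=value pairs, skipping None values."""
    return " ".join(
        f"{key}={_format_value(value)}"
        for key, value in fields.items()
        if value is not None
    )
```

For example, `log_line(event="dispatch", outcome="failed", issue_id="abc123", issue_identifier="MT-649", reason="no available orchestrator slots")` keeps the context fields first and the quoted failure reason last, because keyword arguments preserve their order.
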
| 1275 | ### 13.2 Logging Outputs and Sinks |
| 1276 | |
| 1277 | The spec does not prescribe where logs must go (stderr, file, remote sink, etc.). |
| 1278 | |
| 1279 | Requirements: |
| 1280 | |
| 1281 | - Operators must be able to see startup/validation/dispatch failures without attaching a debugger. |
| 1282 | - Implementations may write to one or more sinks. |
| 1283 | - If a configured log sink fails, the service should continue running when possible and emit an |
| 1284 | operator-visible warning through any remaining sink. |
| 1285 | |
| 1286 | ### 13.3 Runtime Snapshot / Monitoring Interface (Optional but Recommended) |
| 1287 | |
| 1288 | If the implementation exposes a synchronous runtime snapshot (for dashboards or monitoring), it |
| 1289 | should return: |
| 1290 | |
| 1291 | - `running` (list of running session rows) |
| 1292 | - each running row should include `turn_count` |
| 1293 | - `retrying` (list of retry queue rows) |
| 1294 | - `codex_totals` |
| 1295 | - `input_tokens` |
| 1296 | - `output_tokens` |
| 1297 | - `total_tokens` |
| 1298 | - `seconds_running` (aggregate runtime seconds as of snapshot time, including active sessions) |
| 1299 | - `rate_limits` (latest coding-agent rate limit payload, if available) |
| 1300 | |
| 1301 | Recommended snapshot error modes: |
| 1302 | |
| 1303 | - `timeout` |
| 1304 | - `unavailable` |
| 1305 | |
| 1306 | ### 13.4 Optional Human-Readable Status Surface |
| 1307 | |
| 1308 | A human-readable status surface (terminal output, dashboard, etc.) is optional and |
| 1309 | implementation-defined. |
| 1310 | |
| 1311 | If present, it should draw from orchestrator state/metrics only and must not be required for |
| 1312 | correctness. |
| 1313 | |
| 1314 | ### 13.5 Session Metrics and Token Accounting |
| 1315 | |
| 1316 | Token accounting rules: |
| 1317 | |
| 1318 | - Agent events may include token counts in multiple payload shapes. |
| 1319 | - Prefer absolute thread totals when available, such as: |
| 1320 | - `thread/tokenUsage/updated` payloads |
| 1321 | - `total_token_usage` within token-count wrapper events |
| 1322 | - Ignore delta-style payloads such as `last_token_usage` for dashboard/API totals. |
| 1323 | - Extract input/output/total token counts leniently from common field names within the selected |
| 1324 | payload. |
| 1325 | - For absolute totals, track deltas relative to last reported totals to avoid double-counting. |
| 1326 | - Do not treat generic `usage` maps as cumulative totals unless the event type defines them that |
| 1327 | way. |
| 1328 | - Accumulate aggregate totals in orchestrator state. |
| 1329 | |
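
Delta tracking against absolute thread totals can be sketched as below. The `TokenAccumulator` class is hypothetical, keying the last-seen totals by thread ID is an assumption, and a reported value lower than the last one is treated as "no change" rather than as a reset.

```python
from typing import Dict, Mapping

TOKEN_FIELDS = ("input_tokens", "output_tokens", "total_tokens")


class TokenAccumulator:
    """Aggregate token totals across threads without double-counting."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = {field: 0 for field in TOKEN_FIELDS}
        self._last_seen: Dict[str, Dict[str, int]] = {}

    def apply_absolute(self, thread_id: str, reported: Mapping[str, int]) -> None:
        """Apply an absolute thread total by adding only the delta since last time."""
        previous = self._last_seen.get(thread_id, {field: 0 for field in TOKEN_FIELDS})
        current: Dict[str, int] = {}
        for field in TOKEN_FIELDS:
            value = max(int(reported.get(field, previous[field])), previous[field])
            self.totals[field] += value - previous[field]
            current[field] = value
        self._last_seen[thread_id] = current
```

Replaying the same absolute payload twice leaves the aggregate unchanged, which is the property that makes absolute totals safer than delta-style payloads.
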
| 1330 | Runtime accounting: |
| 1331 | |
| 1332 | - Runtime should be reported as a live aggregate at snapshot/render time. |
| 1333 | - Implementations may maintain a cumulative counter for ended sessions and add active-session |
| 1334 | elapsed time derived from `running` entries (for example `started_at`) when producing a |
| 1335 | snapshot/status view. |
| 1336 | - Add run duration seconds to the cumulative ended-session runtime when a session ends (normal exit |
| 1337 | or cancellation/termination). |
| 1338 | - Continuous background ticking of runtime totals is not required. |
| 1339 | |
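
The live runtime aggregate can be derived at snapshot time as sketched below, assuming timestamps are seconds on a shared clock. The function and parameter names are illustrative.

```python
from typing import Iterable


def seconds_running(
    ended_seconds: float,
    running_started_at: Iterable[float],
    now: float,
) -> float:
    """Cumulative ended-session runtime plus elapsed time of active sessions."""
    active = sum(max(now - started_at, 0.0) for started_at in running_started_at)
    return ended_seconds + active
```

Because the active portion is computed on demand, no background timer has to update the totals between snapshots.
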
| 1340 | Rate-limit tracking: |
| 1341 | |
| 1342 | - Track the latest rate-limit payload seen in any agent update. |
| 1343 | - Any human-readable presentation of rate-limit data is implementation-defined. |
| 1344 | |
| 1345 | ### 13.6 Humanized Agent Event Summaries (Optional) |
| 1346 | |
| 1347 | Humanized summaries of raw agent protocol events are optional. |
| 1348 | |
| 1349 | If implemented: |
| 1350 | |
| 1351 | - Treat them as observability-only output. |
| 1352 | - Do not make orchestrator logic depend on humanized strings. |
| 1353 | |
| 1354 | ### 13.7 Optional HTTP Server Extension |
| 1355 | |
| 1356 | This section defines an optional HTTP interface for observability and operational control. |
| 1357 | |
| 1358 | If implemented: |
| 1359 | |
| 1360 | - The HTTP server is an extension and is not required for conformance. |
| 1361 | - The implementation may serve server-rendered HTML or a client-side application for the dashboard. |
| 1362 | - The dashboard/API must be observability/control surfaces only and must not become required for |
| 1363 | orchestrator correctness. |
| 1364 | |
| 1365 | Enablement (extension): |
| 1366 | |
| 1367 | - Start the HTTP server when a CLI `--port` argument is provided. |
| 1368 | - Start the HTTP server when `server.port` is present in `WORKFLOW.md` front matter. |
| 1369 | - `server.port` is extension configuration and is intentionally not part of the core front-matter |
| 1370 | schema in Section 5.3. |
| 1371 | - Precedence: CLI `--port` overrides `server.port` when both are present. |
| 1372 | - `server.port` must be an integer. Positive values bind that port. `0` may be used to request an |
| 1373 | ephemeral port for local development and tests. |
| 1374 | - Implementations should bind loopback by default (`127.0.0.1` or host equivalent) unless explicitly |
| 1375 | configured otherwise. |
| 1376 | - Changes to HTTP listener settings (for example `server.port`) do not need to hot-rebind; |
| 1377 | restart-required behavior is conformant. |
| 1378 | |
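
The enablement and precedence rules can be sketched as a single resolver. `resolve_http_listener` is a hypothetical helper; returning `None` to mean "do not start the server" and rejecting ports above 65535 are assumptions.

```python
from typing import Any, Optional, Tuple


def resolve_http_listener(
    cli_port: Optional[Any],
    workflow_port: Optional[Any],
    host: str = "127.0.0.1",
) -> Optional[Tuple[str, int]]:
    """Return (host, port) to bind, or None when the HTTP server is disabled."""
    # CLI --port overrides server.port when both are present.
    port = cli_port if cli_port is not None else workflow_port
    if port is None:
        return None
    if isinstance(port, bool) or not isinstance(port, int):
        raise ValueError("port must be an integer")
    if port < 0 or port > 65535:
        raise ValueError("port out of range")
    return (host, port)  # Port 0 requests an ephemeral port.
```

Comparing against `None` instead of relying on truthiness keeps `--port 0` meaningful: it still overrides `server.port` and requests an ephemeral port.
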
| 1379 | #### 13.7.1 Human-Readable Dashboard (`/`) |
| 1380 | |
| 1381 | - Host a human-readable dashboard at `/`. |
| 1382 | - The returned document should depict the current state of the system (for example active sessions, |
| 1383 | retry delays, token consumption, runtime totals, recent events, and health/error indicators). |
| 1384 | - It is up to the implementation whether this is server-generated HTML or a client-side app that |
| 1385 | consumes the JSON API below. |
| 1386 | |
| 1387 | #### 13.7.2 JSON REST API (`/api/v1/*`) |
| 1388 | |
| 1389 | Provide a JSON REST API under `/api/v1/*` for current runtime state and operational debugging. |
| 1390 | |
| 1391 | Minimum endpoints: |
| 1392 | |
| 1393 | - `GET /api/v1/state` |
| 1394 | - Returns a summary view of the current system state (running sessions, retry queue/delays, |
| 1395 | aggregate token/runtime totals, latest rate limits, and any additional tracked summary fields). |
| 1396 | - Suggested response shape: |
| 1397 | |
```json
{
  "generated_at": "2026-02-24T20:15:30Z",
  "counts": {
    "running": 2,
    "retrying": 1
  },
  "running": [
    {
      "issue_id": "abc123",
      "issue_identifier": "MT-649",
      "state": "In Progress",
      "session_id": "thread-1-turn-1",
      "turn_count": 7,
      "last_event": "turn_completed",
      "last_message": "",
      "started_at": "2026-02-24T20:10:12Z",
      "last_event_at": "2026-02-24T20:14:59Z",
      "tokens": {
        "input_tokens": 1200,
        "output_tokens": 800,
        "total_tokens": 2000
      }
    }
  ],
  "retrying": [
    {
      "issue_id": "def456",
      "issue_identifier": "MT-650",
      "attempt": 3,
      "due_at": "2026-02-24T20:16:00Z",
      "error": "no available orchestrator slots"
    }
  ],
  "codex_totals": {
    "input_tokens": 5000,
    "output_tokens": 2400,
    "total_tokens": 7400,
    "seconds_running": 1834.2
  },
  "rate_limits": null
}
```
| 1441 | |
| 1442 | - `GET /api/v1/<issue_identifier>` |
| 1443 | - Returns issue-specific runtime/debug details for the identified issue, including any information |
| 1444 | the implementation tracks that is useful for debugging. |
| 1445 | - Suggested response shape: |
| 1446 | |
```json
{
  "issue_identifier": "MT-649",
  "issue_id": "abc123",
  "status": "running",
  "workspace": {
    "path": "/tmp/symphony_workspaces/MT-649"
  },
  "attempts": {
    "restart_count": 1,
    "current_retry_attempt": 2
  },
  "running": {
    "session_id": "thread-1-turn-1",
    "turn_count": 7,
    "state": "In Progress",
    "started_at": "2026-02-24T20:10:12Z",
    "last_event": "notification",
    "last_message": "Working on tests",
    "last_event_at": "2026-02-24T20:14:59Z",
    "tokens": {
      "input_tokens": 1200,
      "output_tokens": 800,
      "total_tokens": 2000
    }
  },
  "retry": null,
  "logs": {
    "codex_session_logs": [
      {
        "label": "latest",
        "path": "/var/log/symphony/codex/MT-649/latest.log",
        "url": null
      }
    ]
  },
  "recent_events": [
    {
      "at": "2026-02-24T20:14:59Z",
      "event": "notification",
      "message": "Working on tests"
    }
  ],
  "last_error": null,
  "tracked": {}
}
```
| 1494 | |
| 1495 | - If the issue is unknown to the current in-memory state, return `404` with an error response (for |
| 1496 | example `{"error":{"code":"issue_not_found","message":"..."}}`). |
| 1497 | |
| 1498 | - `POST /api/v1/refresh` |
| 1499 | - Queues an immediate tracker poll + reconciliation cycle (best-effort trigger; implementations |
| 1500 | may coalesce repeated requests). |
| 1501 | - Suggested request body: empty body or `{}`. |
| 1502 | - Suggested response (`202 Accepted`) shape: |
| 1503 | |
```json
{
  "queued": true,
  "coalesced": false,
  "requested_at": "2026-02-24T20:15:30Z",
  "operations": ["poll", "reconcile"]
}
```
| 1512 | |
| 1513 | API design notes: |
| 1514 | |
| 1515 | - The JSON shapes above are the recommended baseline for interoperability and debugging ergonomics. |
| 1516 | - Implementations may add fields, but should avoid breaking existing fields within a version. |
| 1517 | - Endpoints should be read-only except for operational triggers like `/refresh`. |
| 1518 | - Unsupported methods on defined routes should return `405 Method Not Allowed`. |
| 1519 | - API errors should use a JSON envelope such as `{"error":{"code":"...","message":"..."}}`. |
| 1520 | - If the dashboard is a client-side app, it should consume this API rather than duplicating state |
| 1521 | logic. |
| 1522 | |
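The routing and error-envelope notes above can be illustrated with a framework-free dispatcher. This is a sketch under stated assumptions: the function names, the `issues` lookup table, the truncated response bodies, and the error codes `not_found` and `method_not_allowed` are hypothetical; only `issue_not_found` and the status codes come from the text above.

```python
from typing import Any, Dict, Tuple

API_PREFIX = "/api/v1/"
# Fixed routes and their single allowed method; any other path under the
# prefix is treated as GET /api/v1/<issue_identifier>.
FIXED_ROUTES = {"/api/v1/state": "GET", "/api/v1/refresh": "POST"}


def error_body(code: str, message: str) -> Dict[str, Any]:
    """Build the JSON error envelope suggested by the API design notes."""
    return {"error": {"code": code, "message": message}}


def route(method: str, path: str, issues: Dict[str, Dict[str, Any]]) -> Tuple[int, Dict[str, Any]]:
    """Map a request to (status, JSON body) using in-memory state only."""
    if not path.startswith(API_PREFIX) or path == API_PREFIX:
        return 404, error_body("not_found", f"no route for {path}")

    allowed = FIXED_ROUTES.get(path, "GET")
    if method != allowed:
        return 405, error_body("method_not_allowed", f"{method} is not allowed on {path}")

    if path == "/api/v1/state":
        # Truncated summary; a real response would follow the suggested shape.
        return 200, {"counts": {"running": len(issues)}}
    if path == "/api/v1/refresh":
        return 202, {"queued": True, "operations": ["poll", "reconcile"]}

    identifier = path[len(API_PREFIX):]
    detail = issues.get(identifier)
    if detail is None:
        return 404, error_body("issue_not_found", f"unknown issue {identifier}")
    return 200, detail
```

Keeping the dispatcher a pure function of the request and a state snapshot makes the `404` and `405` cases easy to cover in deterministic tests.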
| 1523 | ## 14. Failure Model and Recovery Strategy |
| 1524 | |
| 1525 | ### 14.1 Failure Classes |
| 1526 | |
| 1527 | 1. `Workflow/Config Failures` |
| 1528 | - Missing `WORKFLOW.md` |
| 1529 | - Invalid YAML front matter |
| 1530 | - Unsupported tracker kind or missing tracker credentials/project slug |
| 1531 | - Missing coding-agent executable |
| 1532 | |
| 1533 | 2. `Workspace Failures` |
| 1534 | - Workspace directory creation failure |
| 1535 | - Workspace population/synchronization failure (implementation-defined; may come from hooks) |
| 1536 | - Invalid workspace path configuration |
| 1537 | - Hook timeout/failure |
| 1538 | |
| 1539 | 3. `Agent Session Failures` |
| 1540 | - Startup handshake failure |
| 1541 | - Turn failed/cancelled |
| 1542 | - Turn timeout |
| 1543 | - User input requested (hard fail) |
| 1544 | - Subprocess exit |
| 1545 | - Stalled session (no activity) |
| 1546 | |
| 1547 | 4. `Tracker Failures` |
| 1548 | - API transport errors |
| 1549 | - Non-200 status |
| 1550 | - GraphQL errors |
| 1551 | - Malformed payloads |
| 1552 | |
| 1553 | 5. `Observability Failures` |
| 1554 | - Snapshot timeout |
| 1555 | - Dashboard render errors |
| 1556 | - Log sink configuration failure |
| 1557 | |
| 1558 | ### 14.2 Recovery Behavior |
| 1559 | |
| 1560 | - Dispatch validation failures: |
| 1561 | - Skip new dispatches. |
| 1562 | - Keep service alive. |
| 1563 | - Continue reconciliation where possible. |
| 1564 | |
| 1565 | - Worker failures: |
| 1566 | - Convert to retries with exponential backoff. |
| 1567 | |
| 1568 | - Tracker candidate-fetch failures: |
| 1569 | - Skip this tick. |
| 1570 | - Try again on next tick. |
| 1571 | |
| 1572 | - Reconciliation state-refresh failures: |
| 1573 | - Keep current workers. |
| 1574 | - Retry on next tick. |
| 1575 | |
| 1576 | - Dashboard/log failures: |
| 1577 | - Do not crash the orchestrator. |
| 1578 | |
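For the worker-failure rule above, the retry delay can be sketched as follows. This assumes the delay starts at 10 seconds and doubles per attempt, capped by `agent.max_retry_backoff_ms` (default 5 minutes); the exact formula and the function name `retry_delay_ms` are assumptions for illustration.

```python
def retry_delay_ms(attempt: int, max_backoff_ms: int = 300_000, base_ms: int = 10_000) -> int:
    """Delay before a failure-driven retry, in milliseconds.

    Assumption: attempt 1 waits base_ms, and each later attempt doubles the
    previous delay until the configured cap is reached.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(base_ms * 2 ** (attempt - 1), max_backoff_ms)
```

With the defaults, attempts 1 through 5 wait 10 s, 20 s, 40 s, 80 s, and 160 s, and every later attempt waits the 5-minute cap.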
| 1579 | ### 14.3 Partial State Recovery (Restart) |
| 1580 | |
| 1581 | The current design intentionally keeps scheduler state in memory only. |
| 1582 | |
| 1583 | After restart: |
| 1584 | |
| 1585 | - No retry timers are restored from prior process memory. |
| 1586 | - No running sessions are assumed recoverable. |
| 1587 | - Service recovers by: |
| 1588 | - startup terminal workspace cleanup |
| 1589 | - fresh polling of active issues |
| 1590 | - re-dispatching eligible work |
| 1591 | |
| 1592 | ### 14.4 Operator Intervention Points |
| 1593 | |
| 1594 | Operators can control behavior by: |
| 1595 | |
| 1596 | - Editing `WORKFLOW.md` (prompt and most runtime settings). |
| 1597 | - `WORKFLOW.md` changes should be detected and re-applied automatically without restart. |
| 1598 | - Changing issue states in the tracker: |
| 1599 | - terminal state -> running session is stopped and workspace cleaned when reconciled |
| 1600 | - non-active state -> running session is stopped without cleanup |
| 1601 | - Restarting the service for process recovery or deployment (not as the normal path for applying |
| 1602 | workflow config changes). |
| 1603 | |
| 1604 | ## 15. Security and Operational Safety |
| 1605 | |
| 1606 | ### 15.1 Trust Boundary Assumption |
| 1607 | |
| 1608 | Each implementation defines its own trust boundary. |
| 1609 | |
| 1610 | Operational safety requirements: |
| 1611 | |
| 1612 | - Implementations should state clearly whether they are intended for trusted environments, more |
| 1613 | restrictive environments, or both. |
| 1614 | - Implementations should state clearly whether they rely on auto-approved actions, operator |
| 1615 | approvals, stricter sandboxing, or some combination of those controls. |
| 1616 | - Workspace isolation and path validation are important baseline controls, but they are not a |
| 1617 | substitute for whatever approval and sandbox policy an implementation chooses. |
| 1618 | |
| 1619 | ### 15.2 Filesystem Safety Requirements |
| 1620 | |
| 1621 | Mandatory: |
| 1622 | |
| 1623 | - Workspace path must remain under configured workspace root. |
| 1624 | - Coding-agent cwd must be the per-issue workspace path for the current run. |
| 1625 | - Workspace directory names must use sanitized identifiers. |
| 1626 | |
| 1627 | Recommended additional hardening for ports: |
| 1628 | |
| 1629 | - Run under a dedicated OS user. |
| 1630 | - Restrict workspace root permissions. |
| 1631 | - Mount workspace root on a dedicated volume if possible. |
| 1632 | |
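The mandatory filesystem rules above can be sketched as a path builder that sanitizes the identifier and then verifies root containment. The allowed character set, the `_` replacement, and both function names are assumptions for this example rather than requirements stated in this section.

```python
import re
from pathlib import Path

# Assumption: anything outside this conservative set is replaced with "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_identifier(identifier: str) -> str:
    """Turn an issue identifier into a safe single directory name."""
    cleaned = _UNSAFE.sub("_", identifier)
    if cleaned in ("", ".", ".."):
        raise ValueError(f"identifier {identifier!r} has no safe directory name")
    return cleaned


def workspace_path(root: str, identifier: str) -> Path:
    """Return the per-issue workspace path, refusing anything outside root."""
    root_path = Path(root).resolve()
    candidate = (root_path / sanitize_identifier(identifier)).resolve()
    # Containment check: the resolved path must sit strictly below the root.
    if root_path not in candidate.parents:
        raise ValueError(f"workspace path {candidate} escapes root {root_path}")
    return candidate
```

The resulting path is what the coding agent would receive as its cwd; resolving before the containment check also catches symlinked components that point outside the root.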
| 1633 | ### 15.3 Secret Handling |
| 1634 | |
| 1635 | - Support `$VAR` indirection in workflow config. |
| 1636 | - Do not log API tokens or secret env values. |
| 1637 | - Validate presence of secrets without printing them. |
| 1638 | |
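A minimal sketch of the secret-handling rules above, assuming a config value counts as an indirection when it starts with `$`. The helper names `resolve_secret` and `describe_secret` are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_secret(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve `$VAR` indirection; literal values pass through unchanged."""
    env = os.environ if env is None else env
    if value.startswith("$") and len(value) > 1:
        name = value[1:]
        resolved = env.get(name, "")
        if not resolved:
            # Report only the variable name, never a secret value.
            raise ValueError(f"environment variable {name} is unset or empty")
        return resolved
    return value


def describe_secret(secret: str) -> str:
    """Log-safe description that confirms presence without printing the value."""
    return "<redacted>" if secret else "<missing>"
```

Startup validation could call `resolve_secret` for `tracker.api_key` and log only `describe_secret(...)`, so a missing key is operator-visible while the token itself never reaches the logs.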
| 1639 | ### 15.4 Hook Script Safety |
| 1640 | |
| 1641 | Workspace hooks are arbitrary shell scripts from `WORKFLOW.md`. |
| 1642 | |
| 1643 | Implications: |
| 1644 | |
| 1645 | - Hooks are fully trusted configuration. |
| 1646 | - Hooks run inside the workspace directory. |
| 1647 | - Hook output should be truncated in logs. |
| 1648 | - Hook timeouts are required to avoid hanging the orchestrator. |
| 1649 | |
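The hook implications above (workspace cwd, bounded runtime, truncated output) can be sketched with the standard library. The use of `sh -c`, the `HookResult` type, and the 2000-character log limit are assumptions for illustration; the 60000 ms default mirrors `hooks.timeout_ms`.

```python
import subprocess
from dataclasses import dataclass

MAX_LOG_CHARS = 2000  # hypothetical truncation limit for logged hook output


@dataclass
class HookResult:
    ok: bool
    reason: str
    output: str


def truncate(text: str, limit: int = MAX_LOG_CHARS) -> str:
    """Cap hook output so a noisy script cannot flood the logs."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} chars]"


def run_hook(script: str, workspace: str, timeout_ms: int = 60_000) -> HookResult:
    """Run a hook script inside the workspace directory with a hard timeout."""
    try:
        proc = subprocess.run(
            ["sh", "-c", script],  # assumption: hooks run through a POSIX shell
            cwd=workspace,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_ms / 1000,
        )
    except subprocess.TimeoutExpired as exc:
        partial = exc.output or ""
        if isinstance(partial, bytes):  # partial output may arrive undecoded
            partial = partial.decode(errors="replace")
        return HookResult(False, "timeout", truncate(partial))
    reason = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
    return HookResult(proc.returncode == 0, reason, truncate(proc.stdout))
```

Whether a failed result aborts the attempt (`before_run`) or is only logged (`after_run`, `before_remove`) is left to the caller, matching the per-hook rules in Section 17.2.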
| 1650 | ### 15.5 Harness Hardening Guidance |
| 1651 | |
| 1652 | Running Codex agents against repositories, issue trackers, and other inputs that may contain |
| 1653 | sensitive data or externally-controlled content can be dangerous. A permissive deployment can lead |
| 1654 | to data leaks, destructive mutations, or full machine compromise if the agent is induced to execute |
| 1655 | harmful commands or use overly-powerful integrations. |
| 1656 | |
| 1657 | Implementations should explicitly evaluate their own risk profile and harden the execution harness |
| 1658 | where appropriate. This specification intentionally does not mandate a single hardening posture, but |
| 1659 | ports should not assume that tracker data, repository contents, prompt inputs, or tool arguments are |
| 1660 | fully trustworthy just because they originate inside a normal workflow. |
| 1661 | |
| 1662 | Possible hardening measures include: |
| 1663 | |
| 1664 | - Tightening Codex approval and sandbox settings described elsewhere in this specification instead |
| 1665 | of running with a maximally permissive configuration. |
| 1666 | - Adding external isolation layers such as OS/container/VM sandboxing, network restrictions, or |
| 1667 | separate credentials beyond the built-in Codex policy controls. |
| 1668 | - Filtering which Linear issues, projects, teams, labels, or other tracker sources are eligible for |
| 1669 | dispatch so untrusted or out-of-scope tasks do not automatically reach the agent. |
| 1670 | - Narrowing the optional `linear_graphql` tool so it can only read or mutate data inside the |
| 1671 | intended project scope, rather than exposing general workspace-wide tracker access. |
| 1672 | - Reducing the set of client-side tools, credentials, filesystem paths, and network destinations |
| 1673 | available to the agent to the minimum needed for the workflow. |
| 1674 | |
| 1675 | The correct controls are deployment-specific, but implementations should document them clearly and |
| 1676 | treat harness hardening as part of the core safety model rather than an optional afterthought. |
| 1677 | |
| 1678 | ## 16. Reference Algorithms (Language-Agnostic) |
| 1679 | |
| 1680 | ### 16.1 Service Startup |
| 1681 | |
```text
function start_service():
  configure_logging()
  start_observability_outputs()
  start_workflow_watch(on_change=reload_and_reapply_workflow)

  state = {
    poll_interval_ms: get_config_poll_interval_ms(),
    max_concurrent_agents: get_config_max_concurrent_agents(),
    running: {},
    claimed: set(),
    retry_attempts: {},
    completed: set(),
    codex_totals: {input_tokens: 0, output_tokens: 0, total_tokens: 0, seconds_running: 0},
    codex_rate_limits: null
  }

  validation = validate_dispatch_config()
  if validation is not ok:
    log_validation_error(validation)
    fail_startup(validation)

  startup_terminal_workspace_cleanup()
  schedule_tick(delay_ms=0)

  event_loop(state)
```
| 1709 | |
| 1710 | ### 16.2 Poll-and-Dispatch Tick |
| 1711 | |
```text
on_tick(state):
  state = reconcile_running_issues(state)

  validation = validate_dispatch_config()
  if validation is not ok:
    log_validation_error(validation)
    notify_observers()
    schedule_tick(state.poll_interval_ms)
    return state

  issues = tracker.fetch_candidate_issues()
  if issues failed:
    log_tracker_error()
    notify_observers()
    schedule_tick(state.poll_interval_ms)
    return state

  for issue in sort_for_dispatch(issues):
    if no_available_slots(state):
      break

    if should_dispatch(issue, state):
      state = dispatch_issue(issue, state, attempt=null)

  notify_observers()
  schedule_tick(state.poll_interval_ms)
  return state
```
| 1741 | |
| 1742 | ### 16.3 Reconcile Active Runs |
| 1743 | |
```text
function reconcile_running_issues(state):
  state = reconcile_stalled_runs(state)

  running_ids = keys(state.running)
  if running_ids is empty:
    return state

  refreshed = tracker.fetch_issue_states_by_ids(running_ids)
  if refreshed failed:
    log_debug("keep workers running")
    return state

  for issue in refreshed:
    if issue.state in terminal_states:
      state = terminate_running_issue(state, issue.id, cleanup_workspace=true)
    else if issue.state in active_states:
      state.running[issue.id].issue = issue
    else:
      state = terminate_running_issue(state, issue.id, cleanup_workspace=false)

  return state
```
| 1767 | |
| 1768 | ### 16.4 Dispatch One Issue |
| 1769 | |
```text
function dispatch_issue(issue, state, attempt):
  worker = spawn_worker(
    fn -> run_agent_attempt(issue, attempt, parent_orchestrator_pid) end
  )

  if worker spawn failed:
    return schedule_retry(state, issue.id, next_attempt(attempt), {
      identifier: issue.identifier,
      error: "failed to spawn agent"
    })

  state.running[issue.id] = {
    worker_handle,
    monitor_handle,
    identifier: issue.identifier,
    issue,
    session_id: null,
    codex_app_server_pid: null,
    last_codex_message: null,
    last_codex_event: null,
    last_codex_timestamp: null,
    codex_input_tokens: 0,
    codex_output_tokens: 0,
    codex_total_tokens: 0,
    last_reported_input_tokens: 0,
    last_reported_output_tokens: 0,
    last_reported_total_tokens: 0,
    retry_attempt: normalize_attempt(attempt),
    started_at: now_utc()
  }

  state.claimed.add(issue.id)
  state.retry_attempts.remove(issue.id)
  return state
```
| 1806 | |
| 1807 | ### 16.5 Worker Attempt (Workspace + Prompt + Agent) |
| 1808 | |
```text
function run_agent_attempt(issue, attempt, orchestrator_channel):
  workspace = workspace_manager.create_for_issue(issue.identifier)
  if workspace failed:
    fail_worker("workspace error")

  if run_hook("before_run", workspace.path) failed:
    fail_worker("before_run hook error")

  session = app_server.start_session(workspace=workspace.path)
  if session failed:
    run_hook_best_effort("after_run", workspace.path)
    fail_worker("agent session startup error")

  max_turns = config.agent.max_turns
  turn_number = 1

  while true:
    prompt = build_turn_prompt(workflow_template, issue, attempt, turn_number, max_turns)
    if prompt failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("prompt error")

    turn_result = app_server.run_turn(
      session=session,
      prompt=prompt,
      issue=issue,
      on_message=(msg) -> send(orchestrator_channel, {codex_update, issue.id, msg})
    )

    if turn_result failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("agent turn error")

    refreshed_issue = tracker.fetch_issue_states_by_ids([issue.id])
    if refreshed_issue failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("issue state refresh error")

    issue = refreshed_issue[0] or issue

    if issue.state is not active:
      break

    if turn_number >= max_turns:
      break

    turn_number = turn_number + 1

  app_server.stop_session(session)
  run_hook_best_effort("after_run", workspace.path)

  exit_normal()
```
| 1866 | |
| 1867 | ### 16.6 Worker Exit and Retry Handling |
| 1868 | |
```text
on_worker_exit(issue_id, reason, state):
  running_entry = state.running.remove(issue_id)
  state = add_runtime_seconds_to_totals(state, running_entry)

  if reason == normal:
    state.completed.add(issue_id)  # bookkeeping only
    state = schedule_retry(state, issue_id, 1, {
      identifier: running_entry.identifier,
      delay_type: continuation
    })
  else:
    state = schedule_retry(state, issue_id, next_attempt_from(running_entry), {
      identifier: running_entry.identifier,
      error: format("worker exited: %reason")
    })

  notify_observers()
  return state
```
| 1889 | |
```text
on_retry_timer(issue_id, state):
  retry_entry = state.retry_attempts.pop(issue_id)
  if missing:
    return state

  candidates = tracker.fetch_candidate_issues()
  if fetch failed:
    return schedule_retry(state, issue_id, retry_entry.attempt + 1, {
      identifier: retry_entry.identifier,
      error: "retry poll failed"
    })

  issue = find_by_id(candidates, issue_id)
  if issue is null:
    state.claimed.remove(issue_id)
    return state

  if available_slots(state) == 0:
    return schedule_retry(state, issue_id, retry_entry.attempt + 1, {
      identifier: issue.identifier,
      error: "no available orchestrator slots"
    })

  return dispatch_issue(issue, state, attempt=retry_entry.attempt)
```
| 1916 | |
| 1917 | ## 17. Test and Validation Matrix |
| 1918 | |
| 1919 | A conforming implementation should include tests that cover the behaviors defined in this |
| 1920 | specification. |
| 1921 | |
| 1922 | Validation profiles: |
| 1923 | |
| 1924 | - `Core Conformance`: deterministic tests required for all conforming implementations. |
| 1925 | - `Extension Conformance`: required only for optional features that an implementation chooses to |
| 1926 | ship. |
| 1927 | - `Real Integration Profile`: environment-dependent smoke/integration checks recommended before |
| 1928 | production use. |
| 1929 | |
| 1930 | Unless otherwise noted, Sections 17.1 through 17.7 are `Core Conformance`. Bullets that begin with |
| 1931 | `If ... is implemented` are `Extension Conformance`. |
| 1932 | |
| 1933 | ### 17.1 Workflow and Config Parsing |
| 1934 | |
| 1935 | - Workflow file path precedence: |
| 1936 | - explicit runtime path is used when provided |
| 1937 | - cwd default is `WORKFLOW.md` when no explicit runtime path is provided |
| 1938 | - Workflow file changes are detected and trigger re-read/re-apply without restart |
| 1939 | - Invalid workflow reload keeps last known good effective configuration and emits an |
| 1940 | operator-visible error |
| 1941 | - Missing `WORKFLOW.md` returns typed error |
| 1942 | - Invalid YAML front matter returns typed error |
| 1943 | - Front matter non-map returns typed error |
| 1944 | - Config defaults apply when optional values are missing |
| 1945 | - `tracker.kind` validation enforces currently supported kind (`linear`) |
| 1946 | - `tracker.api_key` works (including `$VAR` indirection) |
| 1947 | - `$VAR` resolution works for tracker API key and path values |
| 1948 | - `~` path expansion works |
| 1949 | - `codex.command` is preserved as a shell command string |
| 1950 | - Per-state concurrency override map normalizes state names and ignores invalid values |
| 1951 | - Prompt template renders `issue` and `attempt` |
| 1952 | - Prompt rendering fails on unknown variables (strict mode) |
| 1953 | |
| 1954 | ### 17.2 Workspace Manager and Safety |
| 1955 | |
| 1956 | - Deterministic workspace path per issue identifier |
| 1957 | - Missing workspace directory is created |
| 1958 | - Existing workspace directory is reused |
| 1959 | - Existing non-directory path at workspace location is handled safely (replace or fail per |
| 1960 | implementation policy) |
| 1961 | - Optional workspace population/synchronization errors are surfaced |
| 1962 | - Temporary artifacts (`tmp`, `.elixir_ls`) are removed during prep |
| 1963 | - `after_create` hook runs only on new workspace creation |
| 1964 | - `before_run` hook runs before each attempt and failures/timeouts abort the current attempt |
| 1965 | - `after_run` hook runs after each attempt and failures/timeouts are logged and ignored |
| 1966 | - `before_remove` hook runs on cleanup and failures/timeouts are ignored |
| 1967 | - Workspace path sanitization and root containment invariants are enforced before agent launch |
| 1968 | - Agent launch uses the per-issue workspace path as cwd and rejects out-of-root paths |
| 1969 | |
| 1970 | ### 17.3 Issue Tracker Client |
| 1971 | |
| 1972 | - Candidate issue fetch uses active states and project slug |
| 1973 | - Linear query uses the specified project filter field (`slugId`) |
| 1974 | - Empty `fetch_issues_by_states([])` returns empty without API call |
| 1975 | - Pagination preserves order across multiple pages |
| 1976 | - Blockers are normalized from inverse relations of type `blocks` |
| 1977 | - Labels are normalized to lowercase |
| 1978 | - Issue state refresh by ID returns minimal normalized issues |
| 1979 | - Issue state refresh query uses GraphQL ID typing (`[ID!]`) as specified in Section 11.2 |
| 1980 | - Error mapping for request errors, non-200 responses, GraphQL errors, and malformed payloads |
| 1981 | |
| 1982 | ### 17.4 Orchestrator Dispatch, Reconciliation, and Retry |
| 1983 | |
| 1984 | - Dispatch sort order is priority then oldest creation time |
| 1985 | - `Todo` issue with non-terminal blockers is not eligible |
| 1986 | - `Todo` issue with terminal blockers is eligible |
| 1987 | - Active-state issue refresh updates running entry state |
| 1988 | - Non-active state stops running agent without workspace cleanup |
| 1989 | - Terminal state stops running agent and cleans workspace |
| 1990 | - Reconciliation with no running issues is a no-op |
| 1991 | - Normal worker exit schedules a short continuation retry (attempt 1) |
| 1992 | - Abnormal worker exit increments retries with 10s-based exponential backoff |
| 1993 | - Retry backoff cap uses configured `agent.max_retry_backoff_ms` |
| 1994 | - Retry queue entries include attempt, due time, identifier, and error |
| 1995 | - Stall detection kills stalled sessions and schedules retry |
| 1996 | - Slot exhaustion requeues retries with explicit error reason |
| 1997 | - If a snapshot API is implemented, it returns running rows, retry rows, token totals, and rate |
| 1998 | limits |
| 1999 | - If a snapshot API is implemented, timeout/unavailable cases are surfaced |
| 2000 | |
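The dispatch-order and blocker checks listed above can be exercised against a small model like the one below. It is a sketch with several assumptions: the `Issue` fields are hypothetical, a lower numeric priority is taken to mean more urgent, issues without a priority sort last, the identifier is used as a final tie-breaker, and state names are compared case-insensitively.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Issue:
    identifier: str
    state: str
    created_at: datetime
    priority: Optional[int] = None  # assumption: lower number = more urgent
    blocker_states: List[str] = field(default_factory=list)


def sort_for_dispatch(issues: Iterable[Issue]) -> List[Issue]:
    """Order by priority, then oldest creation time, then identifier."""
    return sorted(
        issues,
        key=lambda i: (i.priority is None, i.priority or 0, i.created_at, i.identifier),
    )


def is_blocked(issue: Issue, terminal_states: Iterable[str]) -> bool:
    """A `Todo` issue is ineligible while any blocker is in a non-terminal state."""
    terminal = {s.lower() for s in terminal_states}
    if issue.state.lower() != "todo":
        return False  # assumption: the blocker rule applies to `Todo` issues only
    return any(s.lower() not in terminal for s in issue.blocker_states)
```

A deterministic test can build a handful of `Issue` values with fixed timestamps and assert on the resulting identifier order.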
| 2001 | ### 17.5 Coding-Agent App-Server Client |
| 2002 | |
| 2003 | - Launch command uses workspace cwd and invokes `bash -lc <codex.command>` |
| 2004 | - Startup handshake sends `initialize`, `initialized`, `thread/start`, `turn/start` |
| 2005 | - `initialize` includes client identity/capabilities payload required by the targeted Codex |
| 2006 | app-server protocol |
| 2007 | - Policy-related startup payloads use the implementation's documented approval/sandbox settings |
| 2008 | - `thread/start` and `turn/start` parse nested IDs and emit `session_started` |
| 2009 | - Request/response read timeout is enforced |
| 2010 | - Turn timeout is enforced |
| 2011 | - Partial JSON lines are buffered until newline |
| 2012 | - Stdout and stderr are handled separately; protocol JSON is parsed from stdout only |
| 2013 | - Non-JSON stderr lines are logged but do not crash parsing |
| 2014 | - Command/file-change approvals are handled according to the implementation's documented policy |
| 2015 | - Unsupported dynamic tool calls are rejected without stalling the session |
| 2016 | - User input requests are handled according to the implementation's documented policy and do not |
| 2017 | stall indefinitely |
| 2018 | - Usage and rate-limit payloads are extracted from nested payload shapes |
| 2019 | - Compatible payload variants for approvals, user-input-required signals, and usage/rate-limit |
| 2020 | telemetry are accepted when they preserve the same logical meaning |
| 2021 | - If optional client-side tools are implemented, the startup handshake advertises the supported tool |
| 2022 | specs required for discovery by the targeted app-server version |
| 2023 | - If the optional `linear_graphql` client-side tool extension is implemented: |
| 2024 | - the tool is advertised to the session |
| 2025 | - valid `query` / `variables` inputs execute against configured Linear auth |
| 2026 | - top-level GraphQL `errors` produce `success=false` while preserving the GraphQL body |
| 2027 | - invalid arguments, missing auth, and transport failures return structured failure payloads |
| 2028 | - unsupported tool names still fail without stalling the session |
| 2029 | |
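The buffering requirement above (partial JSON lines are held until a newline arrives) can be sketched as a small stdout reader. The class name `JsonLineBuffer` is hypothetical, and collecting malformed lines in `rejected` instead of raising is an assumption about how an implementation might keep parsing alive.

```python
import json
from typing import Any, List


class JsonLineBuffer:
    """Accumulates stdout chunks and yields one parsed message per full line."""

    def __init__(self) -> None:
        self._pending = b""
        self.rejected: List[bytes] = []  # lines that were not valid JSON

    def feed(self, chunk: bytes) -> List[Any]:
        self._pending += chunk
        # Everything before the last newline is complete; the tail stays buffered.
        *complete, self._pending = self._pending.split(b"\n")
        messages: List[Any] = []
        for line in complete:
            line = line.strip()
            if not line:
                continue
            try:
                messages.append(json.loads(line))
            except ValueError:  # covers JSON decode and byte-decoding errors
                self.rejected.append(line)
        return messages
```

Stderr would be read through a separate stream and logged line by line, so that only stdout ever reaches this parser.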
| 2030 | ### 17.6 Observability |
| 2031 | |
| 2032 | - Validation failures are operator-visible |
| 2033 | - Structured logging includes issue/session context fields |
| 2034 | - Logging sink failures do not crash orchestration |
| 2035 | - Token/rate-limit aggregation remains correct across repeated agent updates |
| 2036 | - If a human-readable status surface is implemented, it is driven from orchestrator state and does |
| 2037 | not affect correctness |
| 2038 | - If humanized event summaries are implemented, they cover key wrapper/agent event classes without |
| 2039 | changing orchestrator behavior |
| 2040 | |
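One way to keep token aggregation correct across repeated agent updates is delta accounting against the last reported values, which is what the `last_reported_*` fields in Section 16.4 suggest. The sketch below assumes each update carries cumulative per-session totals; the `Usage` type and `apply_usage_update` are hypothetical names.

```python
from dataclasses import dataclass

FIELDS = ("input_tokens", "output_tokens", "total_tokens")


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0


def apply_usage_update(totals: Usage, last_reported: Usage, reported: Usage) -> None:
    """Add only the growth since the previous report to the global totals.

    Assumption: `reported` is cumulative for one session, so a repeated or
    stale update has a non-positive delta and is ignored.
    """
    for name in FIELDS:
        delta = getattr(reported, name) - getattr(last_reported, name)
        if delta > 0:
            setattr(totals, name, getattr(totals, name) + delta)
            setattr(last_reported, name, getattr(reported, name))
```

Each running entry keeps its own `last_reported` record, so several concurrent sessions can feed the same global totals without double counting.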
| 2041 | ### 17.7 CLI and Host Lifecycle |
| 2042 | |
| 2043 | - CLI accepts an optional positional workflow path argument (`path-to-WORKFLOW.md`) |
| 2044 | - CLI uses `./WORKFLOW.md` when no workflow path argument is provided |
| 2045 | - CLI errors on nonexistent explicit workflow path or missing default `./WORKFLOW.md` |
| 2046 | - CLI surfaces startup failure cleanly |
| 2047 | - CLI exits with success when application starts and shuts down normally |
| 2048 | - CLI exits nonzero when startup fails or the host process exits abnormally |
| 2049 | |
| 2050 | ### 17.8 Real Integration Profile (Recommended) |
| 2051 | |
| 2052 | These checks are recommended for production readiness and may be skipped in CI when credentials, |
| 2053 | network access, or external service permissions are unavailable. |
| 2054 | |
| 2055 | - A real tracker smoke test can be run with valid credentials supplied by `LINEAR_API_KEY` or a |
| 2056 | documented local bootstrap mechanism (for example `~/.linear_api_key`). |
| 2057 | - Real integration tests should use isolated test identifiers/workspaces and clean up tracker |
| 2058 | artifacts when practical. |
| 2059 | - A skipped real-integration test should be reported as skipped, not silently treated as passed. |
| 2060 | - If a real-integration profile is explicitly enabled in CI or release validation, failures should |
| 2061 | fail that job. |
| 2062 | |
| 2063 | ## 18. Implementation Checklist (Definition of Done) |
| 2064 | |
| 2065 | Use the same validation profiles as Section 17: |
| 2066 | |
| 2067 | - Section 18.1 = `Core Conformance` |
| 2068 | - Section 18.2 = `Extension Conformance` |
| 2069 | - Section 18.3 = `Real Integration Profile` |
| 2070 | |
| 2071 | ### 18.1 Required for Conformance |
| 2072 | |
| 2073 | - Workflow path selection supports explicit runtime path and cwd default |
| 2074 | - `WORKFLOW.md` loader with YAML front matter + prompt body split |
| 2075 | - Typed config layer with defaults and `$VAR` resolution |
| 2076 | - Dynamic `WORKFLOW.md` watch/reload/re-apply for config and prompt |
| 2077 | - Polling orchestrator with single-authority mutable state |
| 2078 | - Issue tracker client with candidate fetch + state refresh + terminal fetch |
| 2079 | - Workspace manager with sanitized per-issue workspaces |
| 2080 | - Workspace lifecycle hooks (`after_create`, `before_run`, `after_run`, `before_remove`) |
| 2081 | - Hook timeout config (`hooks.timeout_ms`, default `60000`) |
| 2082 | - Coding-agent app-server subprocess client with JSON line protocol |
| 2083 | - Codex launch command config (`codex.command`, default `codex app-server`) |
| 2084 | - Strict prompt rendering with `issue` and `attempt` variables |
| 2085 | - Exponential retry queue with continuation retries after normal exit |
| 2086 | - Configurable retry backoff cap (`agent.max_retry_backoff_ms`, default 5m) |
| 2087 | - Reconciliation that stops runs on terminal/non-active tracker states |
| 2088 | - Workspace cleanup for terminal issues (startup sweep + active transition) |
| 2089 | - Structured logs with `issue_id`, `issue_identifier`, and `session_id` |
| 2090 | - Operator-visible observability (structured logs; optional snapshot/status surface) |
| 2091 | |
| 2092 | ### 18.2 Recommended Extensions (Not Required for Conformance) |
| 2093 | |
| 2094 | - Optional HTTP server honors CLI `--port` over `server.port`, uses a safe default bind host, and |
| 2095 | exposes the baseline endpoints/error semantics in Section 13.7 if shipped. |
| 2096 | - Optional `linear_graphql` client-side tool extension exposes raw Linear GraphQL access through the |
| 2097 | app-server session using configured Symphony auth. |
| 2098 | - TODO: Persist retry queue and session metadata across process restarts. |
| 2099 | - TODO: Make observability settings configurable in workflow front matter without prescribing UI |
| 2100 | implementation details. |
| 2101 | - TODO: Add first-class tracker write APIs (comments/state transitions) in the orchestrator instead |
| 2102 | of only via agent tools. |
| 2103 | - TODO: Add pluggable issue tracker adapters beyond Linear. |
| 2104 | |
| 2105 | ### 18.3 Operational Validation Before Production (Recommended) |
| 2106 | |
| 2107 | - Run the `Real Integration Profile` from Section 17.8 with valid credentials and network access. |
| 2108 | - Verify hook execution and workflow path resolution on the target host OS/shell environment. |
| 2109 | - If the optional HTTP server is shipped, verify the configured port behavior and loopback/default |
| 2110 | bind expectations on the target environment. |
| 2111 | |