openai/symphony

Public

mirrored fromhttps://github.com/openai/symphonyAvailable

Watch0 Fork0 Star0

Code Commits Issues Pull requests Actions Insights Security

b0e0ff0082236a73c12a48483d0c6036fdd31fe1

Find a branch or tag

Branches

b0e0ff0082236a73c12a48483d0c6036fdd31fe1

Clone

HTTPS

Download ZIP

SPEC.md

2110lines · modecode

Raw Download

Latest commit unavailable.

unknown

1	`# Symphony Service Specification`
2
3	`Status: Draft v1 (language-agnostic)`
4
5	`Purpose: Define a service that orchestrates coding agents to get project work done.`
6
7	`## 1. Problem Statement`
8
9	`Symphony is a long-running automation service that continuously reads work from an issue tracker`
10	`(Linear in this specification version), creates an isolated workspace for each issue, and runs a`
11	`coding agent session for that issue inside the workspace.`
12
13	`The service solves four operational problems:`
14
15	`- It turns issue execution into a repeatable daemon workflow instead of manual scripts.`
16	`- It isolates agent execution in per-issue workspaces so agent commands run only inside per-issue`
17	`workspace directories.`
18	- It keeps the workflow policy in-repo (`WORKFLOW.md`) so teams version the agent prompt and runtime
19	`settings with their code.`
20	`- It provides enough observability to operate and debug multiple concurrent agent runs.`
21
22	`Implementations are expected to document their trust and safety posture explicitly. This`
23	`specification does not require a single approval, sandbox, or operator-confirmation policy; some`
24	`implementations may target trusted environments with a high-trust configuration, while others may`
25	`require stricter approvals or sandboxing.`
26
27	`Important boundary:`
28
29	`- Symphony is a scheduler/runner and tracker reader.`
30	`- Ticket writes (state transitions, comments, PR links) are typically performed by the coding agent`
31	`using tools available in the workflow/runtime environment.`
32	- A successful run may end at a workflow-defined handoff state (for example `Human Review`), not
33	necessarily `Done`.
34
35	`## 2. Goals and Non-Goals`
36
37	`### 2.1 Goals`
38
39	`- Poll the issue tracker on a fixed cadence and dispatch work with bounded concurrency.`
40	`- Maintain a single authoritative orchestrator state for dispatch, retries, and reconciliation.`
41	`- Create deterministic per-issue workspaces and preserve them across runs.`
42	`- Stop active runs when issue state changes make them ineligible.`
43	`- Recover from transient failures with exponential backoff.`
44	- Load runtime behavior from a repository-owned `WORKFLOW.md` contract.
45	`- Expose operator-visible observability (at minimum structured logs).`
46	`- Support restart recovery without requiring a persistent database.`
47
48	`### 2.2 Non-Goals`
49
50	`- Rich web UI or multi-tenant control plane.`
51	`- Prescribing a specific dashboard or terminal UI implementation.`
52	`- General-purpose workflow engine or distributed job scheduler.`
53	`- Built-in business logic for how to edit tickets, PRs, or comments. (That logic lives in the`
54	`workflow prompt and agent tooling.)`
55	`- Mandating strong sandbox controls beyond what the coding agent and host OS provide.`
56	`- Mandating a single default approval, sandbox, or operator-confirmation posture for all`
57	`implementations.`
58
59	`## 3. System Overview`
60
61	`### 3.1 Main Components`
62
63	1. `Workflow Loader`
64	- Reads `WORKFLOW.md`.
65	`- Parses YAML front matter and prompt body.`
66	- Returns `{config, prompt_template}`.
67
68	2. `Config Layer`
69	`- Exposes typed getters for workflow config values.`
70	`- Applies defaults and environment variable indirection.`
71	`- Performs validation used by the orchestrator before dispatch.`
72
73	3. `Issue Tracker Client`
74	`- Fetches candidate issues in active states.`
75	`- Fetches current states for specific issue IDs (reconciliation).`
76	`- Fetches terminal-state issues during startup cleanup.`
77	`- Normalizes tracker payloads into a stable issue model.`
78
79	4. `Orchestrator`
80	`- Owns the poll tick.`
81	`- Owns the in-memory runtime state.`
82	`- Decides which issues to dispatch, retry, stop, or release.`
83	`- Tracks session metrics and retry queue state.`
84
85	5. `Workspace Manager`
86	`- Maps issue identifiers to workspace paths.`
87	`- Ensures per-issue workspace directories exist.`
88	`- Runs workspace lifecycle hooks.`
89	`- Cleans workspaces for terminal issues.`
90
91	6. `Agent Runner`
92	`- Creates workspace.`
93	`- Builds prompt from issue + workflow template.`
94	`- Launches the coding agent app-server client.`
95	`- Streams agent updates back to the orchestrator.`
96
97	7. `Status Surface` (optional)
98	`- Presents human-readable runtime status (for example terminal output, dashboard, or other`
99	`operator-facing view).`
100
101	8. `Logging`
102	`- Emits structured runtime logs to one or more configured sinks.`
103
104	`### 3.2 Abstraction Levels`
105
106	`Symphony is easiest to port when kept in these layers:`
107
108	1. `Policy Layer` (repo-defined)
109	- `WORKFLOW.md` prompt body.
110	`- Team-specific rules for ticket handling, validation, and handoff.`
111
112	2. `Configuration Layer` (typed getters)
113	`- Parses front matter into typed runtime settings.`
114	`- Handles defaults, environment tokens, and path normalization.`
115
116	3. `Coordination Layer` (orchestrator)
117	`- Polling loop, issue eligibility, concurrency, retries, reconciliation.`
118
119	4. `Execution Layer` (workspace + agent subprocess)
120	`- Filesystem lifecycle, workspace preparation, coding-agent protocol.`
121
122	5. `Integration Layer` (Linear adapter)
123	`- API calls and normalization for tracker data.`
124
125	6. `Observability Layer` (logs + optional status surface)
126	`- Operator visibility into orchestrator and agent behavior.`
127
128	`### 3.3 External Dependencies`
129
130	- Issue tracker API (Linear for `tracker.kind: linear` in this specification version).
131	`- Local filesystem for workspaces and logs.`
132	`- Optional workspace population tooling (for example Git CLI, if used).`
133	`- Coding-agent executable that supports JSON-RPC-like app-server mode over stdio.`
134	`- Host environment authentication for the issue tracker and coding agent.`
135
136	`## 4. Core Domain Model`
137
138	`### 4.1 Entities`
139
140	`#### 4.1.1 Issue`
141
142	`Normalized issue record used by orchestration, prompt rendering, and observability output.`
143
144	`Fields:`
145
146	- `id` (string)
147	`- Stable tracker-internal ID.`
148	- `identifier` (string)
149	- Human-readable ticket key (example: `ABC-123`).
150	- `title` (string)
151	- `description` (string or null)
152	- `priority` (integer or null)
153	`- Lower numbers are higher priority in dispatch sorting.`
154	- `state` (string)
155	`- Current tracker state name.`
156	- `branch_name` (string or null)
157	`- Tracker-provided branch metadata if available.`
158	- `url` (string or null)
159	- `labels` (list of strings)
160	`- Normalized to lowercase.`
161	- `blocked_by` (list of blocker refs)
162	`- Each blocker ref contains:`
163	- `id` (string or null)
164	- `identifier` (string or null)
165	- `state` (string or null)
166	- `created_at` (timestamp or null)
167	- `updated_at` (timestamp or null)
168
169	`#### 4.1.2 Workflow Definition`
170
171	Parsed `WORKFLOW.md` payload:
172
173	- `config` (map)
174	`- YAML front matter root object.`
175	- `prompt_template` (string)
176	`- Markdown body after front matter, trimmed.`
177
178	`#### 4.1.3 Service Config (Typed View)`
179
180	Typed runtime values derived from `WorkflowDefinition.config` plus environment resolution.
181
182	`Examples:`
183
184	`- poll interval`
185	`- workspace root`
186	`- active and terminal issue states`
187	`- concurrency limits`
188	`- coding-agent executable/args/timeouts`
189	`- workspace hooks`
190
191	`#### 4.1.4 Workspace`
192
193	`Filesystem workspace assigned to one issue identifier.`
194
195	`Fields (logical):`
196
197	- `path` (workspace path; current runtime typically uses absolute paths, but relative roots are
198	`possible if configured without path separators)`
199	- `workspace_key` (sanitized issue identifier)
200	- `created_now` (boolean, used to gate `after_create` hook)
201
202	`#### 4.1.5 Run Attempt`
203
204	`One execution attempt for one issue.`
205
206	`Fields (logical):`
207
208	- `issue_id`
209	- `issue_identifier`
210	- `attempt` (integer or null, `null` for first run, `>=1` for retries/continuation)
211	- `workspace_path`
212	- `started_at`
213	- `status`
214	- `error` (optional)
215
216	`#### 4.1.6 Live Session (Agent Session Metadata)`
217
218	`State tracked while a coding-agent subprocess is running.`
219
220	`Fields:`
221
222	- `session_id` (string, `<thread_id>-<turn_id>`)
223	- `thread_id` (string)
224	- `turn_id` (string)
225	- `codex_app_server_pid` (string or null)
226	- `last_codex_event` (string/enum or null)
227	- `last_codex_timestamp` (timestamp or null)
228	- `last_codex_message` (summarized payload)
229	- `codex_input_tokens` (integer)
230	- `codex_output_tokens` (integer)
231	- `codex_total_tokens` (integer)
232	- `last_reported_input_tokens` (integer)
233	- `last_reported_output_tokens` (integer)
234	- `last_reported_total_tokens` (integer)
235	- `turn_count` (integer)
236	`- Number of coding-agent turns started within the current worker lifetime.`
237
238	`#### 4.1.7 Retry Entry`
239
240	`Scheduled retry state for an issue.`
241
242	`Fields:`
243
244	- `issue_id`
245	- `identifier` (best-effort human ID for status surfaces/logs)
246	- `attempt` (integer, 1-based for retry queue)
247	- `due_at_ms` (monotonic clock timestamp)
248	- `timer_handle` (runtime-specific timer reference)
249	- `error` (string or null)
250
251	`#### 4.1.8 Orchestrator Runtime State`
252
253	`Single authoritative in-memory state owned by the orchestrator.`
254
255	`Fields:`
256
257	- `poll_interval_ms` (current effective poll interval)
258	- `max_concurrent_agents` (current effective global concurrency limit)
259	- `running` (map `issue_id -> running entry`)
260	- `claimed` (set of issue IDs reserved/running/retrying)
261	- `retry_attempts` (map `issue_id -> RetryEntry`)
262	- `completed` (set of issue IDs; bookkeeping only, not dispatch gating)
263	- `codex_totals` (aggregate tokens + runtime seconds)
264	- `codex_rate_limits` (latest rate-limit snapshot from agent events)
265
266	`### 4.2 Stable Identifiers and Normalization Rules`
267
268	- `Issue ID`
269	`- Use for tracker lookups and internal map keys.`
270	- `Issue Identifier`
271	`- Use for human-readable logs and workspace naming.`
272	- `Workspace Key`
273	- Derive from `issue.identifier` by replacing any character not in `[A-Za-z0-9._-]` with `_`.
274	`- Use the sanitized value for the workspace directory name.`
275	- `Normalized Issue State`
276	- Compare states after `trim` + `lowercase`.
277	- `Session ID`
278	- Compose from coding-agent `thread_id` and `turn_id` as `<thread_id>-<turn_id>`.
279
280	`## 5. Workflow Specification (Repository Contract)`
281
282	`### 5.1 File Discovery and Path Resolution`
283
284	`Workflow file path precedence:`
285
286	`1. Explicit application/runtime setting (set by CLI startup path).`
287	2. Default: `WORKFLOW.md` in the current process working directory.
288
289	`Loader behavior:`
290
291	- If the file cannot be read, return `missing_workflow_file` error.
292	`- The workflow file is expected to be repository-owned and version-controlled.`
293
294	`### 5.2 File Format`
295
296	`WORKFLOW.md` is a Markdown file with optional YAML front matter.
297
298	`Design note:`
299
300	- `WORKFLOW.md` should be self-contained enough to describe and run different workflows (prompt,
301	`runtime settings, hooks, and tracker selection/config) without requiring out-of-band`
302	`service-specific configuration.`
303
304	`Parsing rules:`
305
306	- If file starts with `---`, parse lines until the next `---` as YAML front matter.
307	`- Remaining lines become the prompt body.`
308	`- If front matter is absent, treat the entire file as prompt body and use an empty config map.`
309	`- YAML front matter must decode to a map/object; non-map YAML is an error.`
310	`- Prompt body is trimmed before use.`
311
312	`Returned workflow object:`
313
314	- `config`: front matter root object (not nested under a `config` key).
315	- `prompt_template`: trimmed Markdown body.
316
317	`### 5.3 Front Matter Schema`
318
319	`Top-level keys:`
320
321	- `tracker`
322	- `polling`
323	- `workspace`
324	- `hooks`
325	- `agent`
326	- `codex`
327
328	`Unknown keys should be ignored for forward compatibility.`
329
330	`Note:`
331
332	`- The workflow front matter is extensible. Optional extensions may define additional top-level keys`
333	(for example `server`) without changing the core schema above.
334	`- Extensions should document their field schema, defaults, validation rules, and whether changes`
335	`apply dynamically or require restart.`
336	- Common extension: `server.port` (integer) enables the optional HTTP server described in Section
337	`13.7.`
338
339	#### 5.3.1 `tracker` (object)
340
341	`Fields:`
342
343	- `kind` (string)
344	`- Required for dispatch.`
345	- Current supported value: `linear`
346	- `endpoint` (string)
347	- Default for `tracker.kind == "linear"`: `https://api.linear.app/graphql`
348	- `api_key` (string)
349	- May be a literal token or `$VAR_NAME`.
350	- Canonical environment variable for `tracker.kind == "linear"`: `LINEAR_API_KEY`.
351	- If `$VAR_NAME` resolves to an empty string, treat the key as missing.
352	- `project_slug` (string)
353	- Required for dispatch when `tracker.kind == "linear"`.
354	- `active_states` (list of strings or comma-separated string)
355	- Default: `Todo`, `In Progress`
356	- `terminal_states` (list of strings or comma-separated string)
357	- Default: `Closed`, `Cancelled`, `Canceled`, `Duplicate`, `Done`
358
359	#### 5.3.2 `polling` (object)
360
361	`Fields:`
362
363	- `interval_ms` (integer or string integer)
364	- Default: `30000`
365	`- Changes should be re-applied at runtime and affect future tick scheduling without restart.`
366
367	#### 5.3.3 `workspace` (object)
368
369	`Fields:`
370
371	- `root` (path string or `$VAR`)
372	- Default: `<system-temp>/symphony_workspaces`
373	- `~` and strings containing path separators are expanded.
374	`- Bare strings without path separators are preserved as-is (relative roots are allowed but`
375	`discouraged).`
376
377	#### 5.3.4 `hooks` (object)
378
379	`Fields:`
380
381	- `after_create` (multiline shell script string, optional)
382	`- Runs only when a workspace directory is newly created.`
383	`- Failure aborts workspace creation.`
384	- `before_run` (multiline shell script string, optional)
385	`- Runs before each agent attempt after workspace preparation and before launching the coding`
386	`agent.`
387	`- Failure aborts the current attempt.`
388	- `after_run` (multiline shell script string, optional)
389	`- Runs after each agent attempt (success, failure, timeout, or cancellation) once the workspace`
390	`exists.`
391	`- Failure is logged but ignored.`
392	- `before_remove` (multiline shell script string, optional)
393	`- Runs before workspace deletion if the directory exists.`
394	`- Failure is logged but ignored; cleanup still proceeds.`
395	- `timeout_ms` (integer, optional)
396	- Default: `60000`
397	`- Applies to all workspace hooks.`
398	`- Non-positive values should be treated as invalid and fall back to the default.`
399	`- Changes should be re-applied at runtime for future hook executions.`
400
401	#### 5.3.5 `agent` (object)
402
403	`Fields:`
404
405	- `max_concurrent_agents` (integer or string integer)
406	- Default: `10`
407	`- Changes should be re-applied at runtime and affect subsequent dispatch decisions.`
408	- `max_retry_backoff_ms` (integer or string integer)
409	- Default: `300000` (5 minutes)
410	`- Changes should be re-applied at runtime and affect future retry scheduling.`
411	- `max_concurrent_agents_by_state` (map `state_name -> positive integer`)
412	`- Default: empty map.`
413	- State keys are normalized (`trim` + `lowercase`) for lookup.
414	`- Invalid entries (non-positive or non-numeric) are ignored.`
415
416	#### 5.3.6 `codex` (object)
417
418	`Fields:`
419
420	For Codex-owned config values such as `approval_policy`, `thread_sandbox`, and
421	`turn_sandbox_policy`, supported values are defined by the targeted Codex app-server version.
422	`Implementors should treat them as pass-through Codex config values rather than relying on a`
423	`hand-maintained enum in this spec. To inspect the installed Codex schema, run`
424	`codex app-server generate-json-schema --out <dir>` and inspect the relevant definitions referenced
425	by `v2/ThreadStartParams.json` and `v2/TurnStartParams.json`. Implementations may validate these
426	`fields locally if they want stricter startup checks.`
427
428	- `command` (string shell command)
429	- Default: `codex app-server`
430	- The runtime launches this command via `bash -lc` in the workspace directory.
431	`- The launched process must speak a compatible app-server protocol over stdio.`
432	- `approval_policy` (Codex `AskForApproval` value)
433	`- Default: implementation-defined.`
434	- `thread_sandbox` (Codex `SandboxMode` value)
435	`- Default: implementation-defined.`
436	- `turn_sandbox_policy` (Codex `SandboxPolicy` value)
437	`- Default: implementation-defined.`
438	- `turn_timeout_ms` (integer)
439	- Default: `3600000` (1 hour)
440	- `read_timeout_ms` (integer)
441	- Default: `5000`
442	- `stall_timeout_ms` (integer)
443	- Default: `300000` (5 minutes)
444	- If `<= 0`, stall detection is disabled.
445
446	`### 5.4 Prompt Template Contract`
447
448	The Markdown body of `WORKFLOW.md` is the per-issue prompt template.
449
450	`Rendering requirements:`
451
452	`- Use a strict template engine (Liquid-compatible semantics are sufficient).`
453	`- Unknown variables must fail rendering.`
454	`- Unknown filters must fail rendering.`
455
456	`Template input variables:`
457
458	- `issue` (object)
459	`- Includes all normalized issue fields, including labels and blockers.`
460	- `attempt` (integer or null)
461	- `null`/absent on first attempt.
462	`- Integer on retry or continuation run.`
463
464	`Fallback prompt behavior:`
465
466	`- If the workflow prompt body is empty, the runtime may use a minimal default prompt`
467	(`You are working on an issue from Linear.`).
468	`- Workflow file read/parse failures are configuration/validation errors and should not silently fall`
469	`back to a prompt.`
470
471	`### 5.5 Workflow Validation and Error Surface`
472
473	`Error classes:`
474
475	- `missing_workflow_file`
476	- `workflow_parse_error`
477	- `workflow_front_matter_not_a_map`
478	- `template_parse_error` (during prompt rendering)
479	- `template_render_error` (unknown variable/filter, invalid interpolation)
480
481	`Dispatch gating behavior:`
482
483	`- Workflow file read/YAML errors block new dispatches until fixed.`
484	`- Template errors fail only the affected run attempt.`
485
486	`## 6. Configuration Specification`
487
488	`### 6.1 Source Precedence and Resolution Semantics`
489
490	`Configuration precedence:`
491
492	`1. Workflow file path selection (runtime setting -> cwd default).`
493	`2. YAML front matter values.`
494	3. Environment indirection via `$VAR_NAME` inside selected YAML values.
495	`4. Built-in defaults.`
496
497	`Value coercion semantics:`
498
499	`- Path/command fields support:`
500	- `~` home expansion
501	- `$VAR` expansion for env-backed path values
502	`- Apply expansion only to values intended to be local filesystem paths; do not rewrite URIs or`
503	`arbitrary shell command strings.`
504
505	`### 6.2 Dynamic Reload Semantics`
506
507	`Dynamic reload is required:`
508
509	- The software should watch `WORKFLOW.md` for changes.
510	`- On change, it should re-read and re-apply workflow config and prompt template without restart.`
511	`- The software should attempt to adjust live behavior to the new config (for example polling`
512	`cadence, concurrency limits, active/terminal states, codex settings, workspace paths/hooks, and`
513	`prompt content for future runs).`
514	`- Reloaded config applies to future dispatch, retry scheduling, reconciliation decisions, hook`
515	`execution, and agent launches.`
516	`- Implementations are not required to restart in-flight agent sessions automatically when config`
517	`changes.`
518	`- Extensions that manage their own listeners/resources (for example an HTTP server port change) may`
519	`require restart unless the implementation explicitly supports live rebind.`
520	`- Implementations should also re-validate/reload defensively during runtime operations (for example`
521	`before dispatch) in case filesystem watch events are missed.`
522	`- Invalid reloads should not crash the service; keep operating with the last known good effective`
523	`configuration and emit an operator-visible error.`
524
525	`### 6.3 Dispatch Preflight Validation`
526
527	`This validation is a scheduler preflight run before attempting to dispatch new work. It validates`
528	`the workflow/config needed to poll and launch workers, not a full audit of all possible workflow`
529	`behavior.`
530
531	`Startup validation:`
532
533	`- Validate configuration before starting the scheduling loop.`
534	`- If startup validation fails, fail startup and emit an operator-visible error.`
535
536	`Per-tick dispatch validation:`
537
538	`- Re-validate before each dispatch cycle.`
539	`- If validation fails, skip dispatch for that tick, keep reconciliation active, and emit an`
540	`operator-visible error.`
541
542	`Validation checks:`
543
544	`- Workflow file can be loaded and parsed.`
545	- `tracker.kind` is present and supported.
546	- `tracker.api_key` is present after `$` resolution.
547	- `tracker.project_slug` is present when required by the selected tracker kind.
548	- `codex.command` is present and non-empty.
549
550	`### 6.4 Config Fields Summary (Cheat Sheet)`
551
552	`This section is intentionally redundant so a coding agent can implement the config layer quickly.`
553
554	- `tracker.kind`: string, required, currently `linear`
555	- `tracker.endpoint`: string, default `https://api.linear.app/graphql` when `tracker.kind=linear`
556	- `tracker.api_key`: string or `$VAR`, canonical env `LINEAR_API_KEY` when `tracker.kind=linear`
557	- `tracker.project_slug`: string, required when `tracker.kind=linear`
558	- `tracker.active_states`: list/string, default `Todo, In Progress`
559	- `tracker.terminal_states`: list/string, default `Closed, Cancelled, Canceled, Duplicate, Done`
560	- `polling.interval_ms`: integer, default `30000`
561	- `workspace.root`: path, default `<system-temp>/symphony_workspaces`
562	- `hooks.after_create`: shell script or null
563	- `hooks.before_run`: shell script or null
564	- `hooks.after_run`: shell script or null
565	- `hooks.before_remove`: shell script or null
566	- `hooks.timeout_ms`: integer, default `60000`
567	- `agent.max_concurrent_agents`: integer, default `10`
568	- `agent.max_turns`: integer, default `20`
569	- `agent.max_retry_backoff_ms`: integer, default `300000` (5m)
570	- `agent.max_concurrent_agents_by_state`: map of positive integers, default `{}`
571	- `codex.command`: shell command string, default `codex app-server`
572	- `codex.approval_policy`: Codex `AskForApproval` value, default implementation-defined
573	- `codex.thread_sandbox`: Codex `SandboxMode` value, default implementation-defined
574	- `codex.turn_sandbox_policy`: Codex `SandboxPolicy` value, default implementation-defined
575	- `codex.turn_timeout_ms`: integer, default `3600000`
576	- `codex.read_timeout_ms`: integer, default `5000`
577	- `codex.stall_timeout_ms`: integer, default `300000`
578	- `server.port` (extension): integer, optional; enables the optional HTTP server, `0` may be used
579	for ephemeral local bind, and CLI `--port` overrides it
580
581	`## 7. Orchestration State Machine`
582
583	`The orchestrator is the only component that mutates scheduling state. All worker outcomes are`
584	`reported back to it and converted into explicit state transitions.`
585
586	`### 7.1 Issue Orchestration States`
587
588	This is not the same as tracker states (`Todo`, `In Progress`, etc.). This is the service's internal
589	`claim state.`
590
591	1. `Unclaimed`
592	`- Issue is not running and has no retry scheduled.`
593
594	2. `Claimed`
595	`- Orchestrator has reserved the issue to prevent duplicate dispatch.`
596	- In practice, claimed issues are either `Running` or `RetryQueued`.
597
598	3. `Running`
599	- Worker task exists and the issue is tracked in `running` map.
600
601	4. `RetryQueued`
602	- Worker is not running, but a retry timer exists in `retry_attempts`.
603
604	5. `Released`
605	`- Claim removed because issue is terminal, non-active, missing, or retry path completed without`
606	`re-dispatch.`
607
608	`Important nuance:`
609
610	`- A successful worker exit does not mean the issue is done forever.`
611	`- The worker may continue through multiple back-to-back coding-agent turns before it exits.`
612	`- After each normal turn completion, the worker re-checks the tracker issue state.`
613	`- If the issue is still in an active state, the worker should start another turn on the same live`
614	coding-agent thread in the same workspace, up to `agent.max_turns`.
615	`- The first turn should use the full rendered task prompt.`
616	`- Continuation turns should send only continuation guidance to the existing thread, not resend the`
617	`original task prompt that is already present in thread history.`
618	`- Once the worker exits normally, the orchestrator still schedules a short continuation retry`
619	`(about 1 second) so it can re-check whether the issue remains active and needs another worker`
620	`session.`
621
622	`### 7.2 Run Attempt Lifecycle`
623
624	`A run attempt transitions through these phases:`
625
626	1. `PreparingWorkspace`
627	2. `BuildingPrompt`
628	3. `LaunchingAgentProcess`
629	4. `InitializingSession`
630	5. `StreamingTurn`
631	6. `Finishing`
632	7. `Succeeded`
633	8. `Failed`
634	9. `TimedOut`
635	10. `Stalled`
636	11. `CanceledByReconciliation`
637
638	`Distinct terminal reasons are important because retry logic and logs differ.`
639
640	`### 7.3 Transition Triggers`
641
642	- `Poll Tick`
643	`- Reconcile active runs.`
644	`- Validate config.`
645	`- Fetch candidate issues.`
646	`- Dispatch until slots are exhausted.`
647
648	- `Worker Exit (normal)`
649	`- Remove running entry.`
650	`- Update aggregate runtime totals.`
651	- Schedule continuation retry (attempt `1`) after the worker exhausts or finishes its in-process
652	`turn loop.`
653
654	- `Worker Exit (abnormal)`
655	`- Remove running entry.`
656	`- Update aggregate runtime totals.`
657	`- Schedule exponential-backoff retry.`
658
659	- `Codex Update Event`
660	`- Update live session fields, token counters, and rate limits.`
661
662	- `Retry Timer Fired`
663	`- Re-fetch active candidates and attempt re-dispatch, or release claim if no longer eligible.`
664
665	- `Reconciliation State Refresh`
666	`- Stop runs whose issue states are terminal or no longer active.`
667
668	- `Stall Timeout`
669	`- Kill worker and schedule retry.`
670
671	`### 7.4 Idempotency and Recovery Rules`
672
673	`- The orchestrator serializes state mutations through one authority to avoid duplicate dispatch.`
674	- `claimed` and `running` checks are required before launching any worker.
675	`- Reconciliation runs before dispatch on every tick.`
676	`- Restart recovery is tracker-driven and filesystem-driven (no durable orchestrator DB required).`
677	`- Startup terminal cleanup removes stale workspaces for issues already in terminal states.`
678
679	`## 8. Polling, Scheduling, and Reconciliation`
680
681	`### 8.1 Poll Loop`
682
683	`At startup, the service validates config, performs startup cleanup, schedules an immediate tick, and`
684	then repeats every `polling.interval_ms`.
685
686	`The effective poll interval should be updated when workflow config changes are re-applied.`
687
688	`Tick sequence:`
689
690	`1. Reconcile running issues.`
691	`2. Run dispatch preflight validation.`
692	`3. Fetch candidate issues from tracker using active states.`
693	`4. Sort issues by dispatch priority.`
694	`5. Dispatch eligible issues while slots remain.`
695	`6. Notify observability/status consumers of state changes.`
696
697	`If per-tick validation fails, dispatch is skipped for that tick, but reconciliation still happens`
698	`first.`
699
700	`### 8.2 Candidate Selection Rules`
701
702	`An issue is dispatch-eligible only if all are true:`
703
704	- It has `id`, `identifier`, `title`, and `state`.
705	- Its state is in `active_states` and not in `terminal_states`.
706	- It is not already in `running`.
707	- It is not already in `claimed`.
708	`- Global concurrency slots are available.`
709	`- Per-state concurrency slots are available.`
710	- Blocker rule for `Todo` state passes:
711	- If the issue state is `Todo`, do not dispatch when any blocker is non-terminal.
712
713	`Sorting order (stable intent):`
714
715	1. `priority` ascending (1..4 are preferred; null/unknown sorts last)
716	2. `created_at` oldest first
717	3. `identifier` lexicographic tie-breaker
718
719	`### 8.3 Concurrency Control`
720
721	`Global limit:`
722
723	- `available_slots = max(max_concurrent_agents - running_count, 0)`
724
725	`Per-state limit:`
726
727	- `max_concurrent_agents_by_state[state]` if present (state key normalized)
728	`- otherwise fallback to global limit`
729
730	The runtime counts issues by their current tracked state in the `running` map.
731
732	`### 8.4 Retry and Backoff`
733
734	`Retry entry creation:`
735
736	`- Cancel any existing retry timer for the same issue.`
737	- Store `attempt`, `identifier`, `error`, `due_at_ms`, and new timer handle.
738
739	`Backoff formula:`
740
741	- Normal continuation retries after a clean worker exit use a short fixed delay of `1000` ms.
742	- Failure-driven retries use `delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms)`.
743	- Power is capped by the configured max retry backoff (default `300000` / 5m).
744
745	`Retry handling behavior:`
746
747	`1. Fetch active candidate issues (not all issues).`
748	2. Find the specific issue by `issue_id`.
749	`3. If not found, release claim.`
750	`4. If found and still candidate-eligible:`
751	`- Dispatch if slots are available.`
752	- Otherwise requeue with error `no available orchestrator slots`.
753	`5. If found but no longer active, release claim.`
754
755	`Note:`
756
757	`- Terminal-state workspace cleanup is handled by startup cleanup and active-run reconciliation`
758	`(including terminal transitions for currently running issues).`
759	`- Retry handling mainly operates on active candidates and releases claims when the issue is absent,`
760	`rather than performing terminal cleanup itself.`
761
762	`### 8.5 Active Run Reconciliation`
763
764	`Reconciliation runs every tick and has two parts.`
765
766	`Part A: Stall detection`
767
768	- For each running issue, compute `elapsed_ms` since:
769	- `last_codex_timestamp` if any event has been seen, else
770	- `started_at`
771	- If `elapsed_ms > codex.stall_timeout_ms`, terminate the worker and queue a retry.
772	- If `stall_timeout_ms <= 0`, skip stall detection entirely.
773
774	`Part B: Tracker state refresh`
775
776	`- Fetch current issue states for all running issue IDs.`
777	`- For each running issue:`
778	`- If tracker state is terminal: terminate worker and clean workspace.`
779	`- If tracker state is still active: update the in-memory issue snapshot.`
780	`- If tracker state is neither active nor terminal: terminate worker without workspace cleanup.`
781	`- If state refresh fails, keep workers running and try again on the next tick.`
782
783	`### 8.6 Startup Terminal Workspace Cleanup`
784
785	`When the service starts:`
786
787	`1. Query tracker for issues in terminal states.`
788	`2. For each returned issue identifier, remove the corresponding workspace directory.`
789	`3. If the terminal-issues fetch fails, log a warning and continue startup.`
790
791	`This prevents stale terminal workspaces from accumulating after restarts.`
792
793	`## 9. Workspace Management and Safety`
794
795	`### 9.1 Workspace Layout`
796
797	`Workspace root:`
798
799	- `workspace.root` (normalized path; the current config layer expands path-like values and preserves
800	`bare relative names)`
801
802	`Per-issue workspace path:`
803
804	- `<workspace.root>/<sanitized_issue_identifier>`
805
806	`Workspace persistence:`
807
808	`- Workspaces are reused across runs for the same issue.`
809	`- Successful runs do not auto-delete workspaces.`
810
811	`### 9.2 Workspace Creation and Reuse`
812
813	Input: `issue.identifier`
814
815	`Algorithm summary:`
816
817	1. Sanitize identifier to `workspace_key`.
818	`2. Compute workspace path under workspace root.`
819	`3. Ensure the workspace path exists as a directory.`
820	4. Mark `created_now=true` only if the directory was created during this call; otherwise
821	`created_now=false`.
822	5. If `created_now=true`, run `after_create` hook if configured.
823
824	`Notes:`
825
826	`- This section does not assume any specific repository/VCS workflow.`
827	`- Workspace preparation beyond directory creation (for example dependency bootstrap, checkout/sync,`
828	`code generation) is implementation-defined and is typically handled via hooks.`
829
830	`### 9.3 Optional Workspace Population (Implementation-Defined)`
831
832	`The spec does not require any built-in VCS or repository bootstrap behavior.`
833
834	`Implementations may populate or synchronize the workspace using implementation-defined logic and/or`
835	hooks (for example `after_create` and/or `before_run`).
836
837	`Failure handling:`
838
839	`- Workspace population/synchronization failures return an error for the current attempt.`
840	`- If failure happens while creating a brand-new workspace, implementations may remove the partially`
841	`prepared directory.`
842	`- Reused workspaces should not be destructively reset on population failure unless that policy is`
843	`explicitly chosen and documented.`
844
845	`### 9.4 Workspace Hooks`
846
847	`Supported hooks:`
848
849	- `hooks.after_create`
850	- `hooks.before_run`
851	- `hooks.after_run`
852	- `hooks.before_remove`
853
854	`Execution contract:`
855
856	`- Execute in a local shell context appropriate to the host OS, with the workspace directory as`
857	`cwd`.
858	- On POSIX systems, `sh -lc <script>` (or a stricter equivalent such as `bash -lc <script>`) is a
859	`conforming default.`
860	- Hook timeout uses `hooks.timeout_ms`; default: `60000 ms`.
861	`- Log hook start, failures, and timeouts.`
862
863	`Failure semantics:`
864
865	- `after_create` failure or timeout is fatal to workspace creation.
866	- `before_run` failure or timeout is fatal to the current run attempt.
867	- `after_run` failure or timeout is logged and ignored.
868	- `before_remove` failure or timeout is logged and ignored.
869
870	`### 9.5 Safety Invariants`
871
872	`This is the most important portability constraint.`
873
874	`Invariant 1: Run the coding agent only in the per-issue workspace path.`
875
876	`- Before launching the coding-agent subprocess, validate:`
877	- `cwd == workspace_path`
878
879	`Invariant 2: Workspace path must stay inside workspace root.`
880
881	`- Normalize both paths to absolute.`
882	- Require `workspace_path` to have `workspace_root` as a prefix directory.
883	`- Reject any path outside the workspace root.`
884
885	`Invariant 3: Workspace key is sanitized.`
886
887	- Only `[A-Za-z0-9._-]` allowed in workspace directory names.
888	- Replace all other characters with `_`.
889
890	`## 10. Agent Runner Protocol (Coding Agent Integration)`
891
892	`This section defines the language-neutral contract for integrating a coding agent app-server.`
893
894	`Compatibility profile:`
895
896	`- The normative contract is message ordering, required behaviors, and the logical fields that must`
897	`be extracted (for example session IDs, completion state, approval handling, and usage/rate-limit`
898	`telemetry).`
899	`- Exact JSON field names may vary slightly across compatible app-server versions.`
900	`- Implementations should tolerate equivalent payload shapes when they carry the same logical`
901	`meaning, especially for nested IDs, approval requests, user-input-required signals, and`
902	`token/rate-limit metadata.`
903
904	`### 10.1 Launch Contract`
905
906	`Subprocess launch parameters:`
907
908	- Command: `codex.command`
909	- Invocation: `bash -lc <codex.command>`
910	`- Working directory: workspace path`
911	`- Stdout/stderr: separate streams`
912	`- Framing: line-delimited protocol messages on stdout (JSON-RPC-like JSON per line)`
913
914	`Notes:`
915
916	- The default command is `codex app-server`.
917	`- Approval policy, cwd, and prompt are expressed in the protocol messages in Section 10.2.`
918
919	`Recommended additional process settings:`
920
921	`- Max line size: 10 MB (for safe buffering)`
922
923	`### 10.2 Session Startup Handshake`
924
925	`Reference: https://developers.openai.com/codex/app-server/`
926
927	`The client must send these protocol messages in order:`
928
929	`Illustrative startup transcript (equivalent payload shapes are acceptable if they preserve the same`
930	`semantics):`
931
932	```json
933	`{"id":1,"method":"initialize","params":{"clientInfo":{"name":"symphony","version":"1.0"},"capabilities":{}}}`
934	`{"method":"initialized","params":{}}`
935	`{"id":2,"method":"thread/start","params":{"approvalPolicy":"<implementation-defined>","sandbox":"<implementation-defined>","cwd":"/abs/workspace"}}`
936	`{"id":3,"method":"turn/start","params":{"threadId":"<thread-id>","input":[{"type":"text","text":"<rendered prompt-or-continuation-guidance>"}],"cwd":"/abs/workspace","title":"ABC-123: Example","approvalPolicy":"<implementation-defined>","sandboxPolicy":{"type":"<implementation-defined>"}}}`
937	```
938
939	1. `initialize` request
940	`- Params include:`
941	- `clientInfo` object (for example `{name, version}`)
942	- `capabilities` object (may be empty)
943	`- If the targeted Codex app-server requires capability negotiation for dynamic tools, include the`
944	`necessary capability flag(s) here.`
945	- Wait for response (`read_timeout_ms`)
946	2. `initialized` notification
947	3. `thread/start` request
948	`- Params include:`
949	- `approvalPolicy` = implementation-defined session approval policy value
950	- `sandbox` = implementation-defined session sandbox value
951	- `cwd` = absolute workspace path
952	`- If optional client-side tools are implemented, include their advertised tool specs using the`
953	`protocol mechanism supported by the targeted Codex app-server version.`
954	4. `turn/start` request
955	`- Params include:`
956	- `threadId`
957	- `input` = single text item containing rendered prompt for the first turn, or continuation
958	`guidance for later turns on the same thread`
959	- `cwd`
960	- `title` = `<issue.identifier>: <issue.title>`
961	- `approvalPolicy` = implementation-defined turn approval policy value
962	- `sandboxPolicy` = implementation-defined object-form sandbox policy payload when required by
963	`the targeted app-server version`
964
965	`Session identifiers:`
966
967	- Read `thread_id` from `thread/start` result `result.thread.id`
968	- Read `turn_id` from each `turn/start` result `result.turn.id`
969	- Emit `session_id = "<thread_id>-<turn_id>"`
970	- Reuse the same `thread_id` for all continuation turns inside one worker run
971
972	`### 10.3 Streaming Turn Processing`
973
974	`The client reads line-delimited messages until the turn terminates.`
975
976	`Completion conditions:`
977
978	- `turn/completed` -> success
979	- `turn/failed` -> failure
980	- `turn/cancelled` -> failure
981	- turn timeout (`turn_timeout_ms`) -> failure
982	`- subprocess exit -> failure`
983
984	`Continuation processing:`
985
986	- If the worker decides to continue after a successful turn, it should issue another `turn/start`
987	on the same live `threadId`.
988	`- The app-server subprocess should remain alive across those continuation turns and be stopped only`
989	`when the worker run is ending.`
990
991	`Line handling requirements:`
992
993	`- Read protocol messages from stdout only.`
994	`- Buffer partial stdout lines until newline arrives.`
995	`- Attempt JSON parse on complete stdout lines.`
996	`- Stderr is not part of the protocol stream:`
997	`- ignore it or log it as diagnostics`
998	`- do not attempt protocol JSON parsing on stderr`
999
1000	`### 10.4 Emitted Runtime Events (Upstream to Orchestrator)`
1001
1002	`The app-server client emits structured events to the orchestrator callback. Each event should`
1003	`include:`
1004
1005	- `event` (enum/string)
1006	- `timestamp` (UTC timestamp)
1007	- `codex_app_server_pid` (if available)
1008	- optional `usage` map (token counts)
1009	`- payload fields as needed`
1010
1011	`Important emitted events may include:`
1012
1013	- `session_started`
1014	- `startup_failed`
1015	- `turn_completed`
1016	- `turn_failed`
1017	- `turn_cancelled`
1018	- `turn_ended_with_error`
1019	- `turn_input_required`
1020	- `approval_auto_approved`
1021	- `unsupported_tool_call`
1022	- `notification`
1023	- `other_message`
1024	- `malformed`
1025
1026	`### 10.5 Approval, Tool Calls, and User Input Policy`
1027
1028	`Approval, sandbox, and user-input behavior is implementation-defined.`
1029
1030	`Policy requirements:`
1031
1032	`- Each implementation should document its chosen approval, sandbox, and operator-confirmation`
1033	`posture.`
1034	`- Approval requests and user-input-required events must not leave a run stalled indefinitely. An`
1035	`implementation should either satisfy them, surface them to an operator, auto-resolve them, or`
1036	`fail the run according to its documented policy.`
1037
1038	`Example high-trust behavior:`
1039
1040	`- Auto-approve command execution approvals for the session.`
1041	`- Auto-approve file-change approvals for the session.`
1042	`- Treat user-input-required turns as hard failure.`
1043
1044	`Unsupported dynamic tool calls:`
1045
1046	`- Supported dynamic tool calls that are explicitly implemented and advertised by the runtime should`
1047	`be handled according to their extension contract.`
1048	- If the agent requests a dynamic tool call (`item/tool/call`) that is not supported, return a tool
1049	`failure response and continue the session.`
1050	`- This prevents the session from stalling on unsupported tool execution paths.`
1051
1052	`Optional client-side tool extension:`
1053
1054	`- An implementation may expose a limited set of client-side tools to the app-server session.`
1055	- Current optional standardized tool: `linear_graphql`.
1056	`- If implemented, supported tools should be advertised to the app-server session during startup`
1057	`using the protocol mechanism supported by the targeted Codex app-server version.`
1058	`- Unsupported tool names should still return a failure result and continue the session.`
1059
1060	`linear_graphql` extension contract:
1061
1062	`- Purpose: execute a raw GraphQL query or mutation against Linear using Symphony's configured`
1063	`tracker auth for the current session.`
1064	- Availability: only meaningful when `tracker.kind == "linear"` and valid Linear auth is configured.
1065	`- Preferred input shape:`
1066
1067	```json
1068	`{`
1069	`"query": "single GraphQL query or mutation document",`
1070	`"variables": {`
1071	`"optional": "graphql variables object"`
1072	`}`
1073	`}`
1074	```
1075
1076	- `query` must be a non-empty string.
1077	- `query` must contain exactly one GraphQL operation.
1078	- `variables` is optional and, when present, must be a JSON object.
1079	`- Implementations may additionally accept a raw GraphQL query string as shorthand input.`
1080	`- Execute one GraphQL operation per tool call.`
1081	`- If the provided document contains multiple operations, reject the tool call as invalid input.`
1082	- `operationName` selection is intentionally out of scope for this extension.
1083	`- Reuse the configured Linear endpoint and auth from the active Symphony workflow/runtime config; do`
1084	`not require the coding agent to read raw tokens from disk.`
1085	`- Tool result semantics:`
1086	- transport success + no top-level GraphQL `errors` -> `success=true`
1087	- top-level GraphQL `errors` present -> `success=false`, but preserve the GraphQL response body
1088	`for debugging`
1089	- invalid input, missing auth, or transport failure -> `success=false` with an error payload
1090	`- Return the GraphQL response or error payload as structured tool output that the model can inspect`
1091	`in-session.`
1092
1093	`Illustrative responses (equivalent payload shapes are acceptable if they preserve the same outcome):`
1094
1095	```json
1096	`{"id":"<approval-id>","result":{"approved":true}}`
1097	`{"id":"<tool-call-id>","result":{"success":false,"error":"unsupported_tool_call"}}`
1098	```
1099
1100	`Hard failure on user input requirement:`
1101
1102	`- If the agent requests user input, fail the run attempt immediately.`
1103	`- The client detects this via:`
1104	- explicit method (`item/tool/requestUserInput`), or
1105	`- turn methods/flags indicating input is required.`
1106
1107	`### 10.6 Timeouts and Error Mapping`
1108
1109	`Timeouts:`
1110
1111	- `codex.read_timeout_ms`: request/response timeout during startup and sync requests
1112	- `codex.turn_timeout_ms`: total turn stream timeout
1113	- `codex.stall_timeout_ms`: enforced by orchestrator based on event inactivity
1114
1115	`Error mapping (recommended normalized categories):`
1116
1117	- `codex_not_found`
1118	- `invalid_workspace_cwd`
1119	- `response_timeout`
1120	- `turn_timeout`
1121	- `port_exit`
1122	- `response_error`
1123	- `turn_failed`
1124	- `turn_cancelled`
1125	- `turn_input_required`
1126
1127	`### 10.7 Agent Runner Contract`
1128
1129	The `Agent Runner` wraps workspace + prompt + app-server client.
1130
1131	`Behavior:`
1132
1133	`1. Create/reuse workspace for issue.`
1134	`2. Build prompt from workflow template.`
1135	`3. Start app-server session.`
1136	`4. Forward app-server events to orchestrator.`
1137	`5. On any error, fail the worker attempt (the orchestrator will retry).`
1138
1139	`Note:`
1140
1141	`- Workspaces are intentionally preserved after successful runs.`
1142
1143	`## 11. Issue Tracker Integration Contract (Linear-Compatible)`
1144
1145	`### 11.1 Required Operations`
1146
1147	`An implementation must support these tracker adapter operations:`
1148
1149	1. `fetch_candidate_issues()`
1150	`- Return issues in configured active states for a configured project.`
1151
1152	2. `fetch_issues_by_states(state_names)`
1153	`- Used for startup terminal cleanup.`
1154
1155	3. `fetch_issue_states_by_ids(issue_ids)`
1156	`- Used for active-run reconciliation.`
1157
1158	`### 11.2 Query Semantics (Linear)`
1159
1160	Linear-specific requirements for `tracker.kind == "linear"`:
1161
1162	- `tracker.kind == "linear"`
1163	- GraphQL endpoint (default `https://api.linear.app/graphql`)
1164	- Auth token sent in `Authorization` header
1165	- `tracker.project_slug` maps to Linear project `slugId`
1166	- Candidate issue query filters project using `project: { slugId: { eq: $projectSlug } }`
1167	- Issue-state refresh query uses GraphQL issue IDs with variable type `[ID!]`
1168	`- Pagination required for candidate issues`
1169	- Page size default: `50`
1170	- Network timeout: `30000 ms`
1171
1172	`Important:`
1173
1174	`- Linear GraphQL schema details can drift. Keep query construction isolated and test the exact query`
1175	`fields/types required by this specification.`
1176
1177	`A non-Linear implementation may change transport details, but the normalized outputs must match the`
1178	`domain model in Section 4.`
1179
1180	`### 11.3 Normalization Rules`
1181
1182	`Candidate issue normalization should produce fields listed in Section 4.1.1.`
1183
1184	`Additional normalization details:`
1185
1186	- `labels` -> lowercase strings
1187	- `blocked_by` -> derived from inverse relations where relation type is `blocks`
1188	- `priority` -> integer only (non-integers become null)
1189	- `created_at` and `updated_at` -> parse ISO-8601 timestamps
1190
1191	`### 11.4 Error Handling Contract`
1192
1193	`Recommended error categories:`
1194
1195	- `unsupported_tracker_kind`
1196	- `missing_tracker_api_key`
1197	- `missing_tracker_project_slug`
1198	- `linear_api_request` (transport failures)
1199	- `linear_api_status` (non-200 HTTP)
1200	- `linear_graphql_errors`
1201	- `linear_unknown_payload`
1202	- `linear_missing_end_cursor` (pagination integrity error)
1203
1204	`Orchestrator behavior on tracker errors:`
1205
1206	`- Candidate fetch failure: log and skip dispatch for this tick.`
1207	`- Running-state refresh failure: log and keep active workers running.`
1208	`- Startup terminal cleanup failure: log warning and continue startup.`
1209
1210	`### 11.5 Tracker Writes (Important Boundary)`
1211
1212	`Symphony does not require first-class tracker write APIs in the orchestrator.`
1213
1214	`- Ticket mutations (state transitions, comments, PR metadata) are typically handled by the coding`
1215	`agent using tools defined by the workflow prompt.`
1216	`- The service remains a scheduler/runner and tracker reader.`
1217	`- Workflow-specific success often means "reached the next handoff state" (for example`
1218	`Human Review`) rather than tracker terminal state `Done`.
1219	- If the optional `linear_graphql` client-side tool extension is implemented, it is still part of
1220	`the agent toolchain rather than orchestrator business logic.`
1221
1222	`## 12. Prompt Construction and Context Assembly`
1223
1224	`### 12.1 Inputs`
1225
1226	`Inputs to prompt rendering:`
1227
1228	- `workflow.prompt_template`
1229	- normalized `issue` object
1230	- optional `attempt` integer (retry/continuation metadata)
1231
1232	`### 12.2 Rendering Rules`
1233
1234	`- Render with strict variable checking.`
1235	`- Render with strict filter checking.`
1236	`- Convert issue object keys to strings for template compatibility.`
1237	`- Preserve nested arrays/maps (labels, blockers) so templates can iterate.`
1238
1239	`### 12.3 Retry/Continuation Semantics`
1240
1241	`attempt` should be passed to the template because the workflow prompt may provide different
1242	`instructions for:`
1243
1244	- first run (`attempt` null or absent)
1245	`- continuation run after a successful prior session`
1246	`- retry after error/timeout/stall`
1247
1248	`### 12.4 Failure Semantics`
1249
1250	`If prompt rendering fails:`
1251
1252	`- Fail the run attempt immediately.`
1253	`- Let the orchestrator treat it like any other worker failure and decide retry behavior.`
1254
1255	`## 13. Logging, Status, and Observability`
1256
1257	`### 13.1 Logging Conventions`
1258
1259	`Required context fields for issue-related logs:`
1260
1261	- `issue_id`
1262	- `issue_identifier`
1263
1264	`Required context for coding-agent session lifecycle logs:`
1265
1266	- `session_id`
1267
1268	`Message formatting requirements:`
1269
1270	- Use stable `key=value` phrasing.
1271	- Include action outcome (`completed`, `failed`, `retrying`, etc.).
1272	`- Include concise failure reason when present.`
1273	`- Avoid logging large raw payloads unless necessary.`
1274
1275	`### 13.2 Logging Outputs and Sinks`
1276
1277	`The spec does not prescribe where logs must go (stderr, file, remote sink, etc.).`
1278
1279	`Requirements:`
1280
1281	`- Operators must be able to see startup/validation/dispatch failures without attaching a debugger.`
1282	`- Implementations may write to one or more sinks.`
1283	`- If a configured log sink fails, the service should continue running when possible and emit an`
1284	`operator-visible warning through any remaining sink.`
1285
1286	`### 13.3 Runtime Snapshot / Monitoring Interface (Optional but Recommended)`
1287
1288	`If the implementation exposes a synchronous runtime snapshot (for dashboards or monitoring), it`
1289	`should return:`
1290
1291	- `running` (list of running session rows)
1292	- each running row should include `turn_count`
1293	- `retrying` (list of retry queue rows)
1294	- `codex_totals`
1295	- `input_tokens`
1296	- `output_tokens`
1297	- `total_tokens`
1298	- `seconds_running` (aggregate runtime seconds as of snapshot time, including active sessions)
1299	- `rate_limits` (latest coding-agent rate limit payload, if available)
1300
1301	`Recommended snapshot error modes:`
1302
1303	- `timeout`
1304	- `unavailable`
1305
1306	`### 13.4 Optional Human-Readable Status Surface`
1307
1308	`A human-readable status surface (terminal output, dashboard, etc.) is optional and`
1309	`implementation-defined.`
1310
1311	`If present, it should draw from orchestrator state/metrics only and must not be required for`
1312	`correctness.`
1313
1314	`### 13.5 Session Metrics and Token Accounting`
1315
1316	`Token accounting rules:`
1317
1318	`- Agent events may include token counts in multiple payload shapes.`
1319	`- Prefer absolute thread totals when available, such as:`
1320	- `thread/tokenUsage/updated` payloads
1321	- `total_token_usage` within token-count wrapper events
1322	- Ignore delta-style payloads such as `last_token_usage` for dashboard/API totals.
1323	`- Extract input/output/total token counts leniently from common field names within the selected`
1324	`payload.`
1325	`- For absolute totals, track deltas relative to last reported totals to avoid double-counting.`
1326	- Do not treat generic `usage` maps as cumulative totals unless the event type defines them that
1327	`way.`
1328	`- Accumulate aggregate totals in orchestrator state.`
1329
1330	`Runtime accounting:`
1331
1332	`- Runtime should be reported as a live aggregate at snapshot/render time.`
1333	`- Implementations may maintain a cumulative counter for ended sessions and add active-session`
1334	elapsed time derived from `running` entries (for example `started_at`) when producing a
1335	`snapshot/status view.`
1336	`- Add run duration seconds to the cumulative ended-session runtime when a session ends (normal exit`
1337	`or cancellation/termination).`
1338	`- Continuous background ticking of runtime totals is not required.`
1339
1340	`Rate-limit tracking:`
1341
1342	`- Track the latest rate-limit payload seen in any agent update.`
1343	`- Any human-readable presentation of rate-limit data is implementation-defined.`
1344
1345	`### 13.6 Humanized Agent Event Summaries (Optional)`
1346
1347	`Humanized summaries of raw agent protocol events are optional.`
1348
1349	`If implemented:`
1350
1351	`- Treat them as observability-only output.`
1352	`- Do not make orchestrator logic depend on humanized strings.`
1353
1354	`### 13.7 Optional HTTP Server Extension`
1355
1356	`This section defines an optional HTTP interface for observability and operational control.`
1357
1358	`If implemented:`
1359
1360	`- The HTTP server is an extension and is not required for conformance.`
1361	`- The implementation may serve server-rendered HTML or a client-side application for the dashboard.`
1362	`- The dashboard/API must be observability/control surfaces only and must not become required for`
1363	`orchestrator correctness.`
1364
1365	`Enablement (extension):`
1366
1367	- Start the HTTP server when a CLI `--port` argument is provided.
1368	- Start the HTTP server when `server.port` is present in `WORKFLOW.md` front matter.
1369	- `server.port` is extension configuration and is intentionally not part of the core front-matter
1370	`schema in Section 5.3.`
1371	- Precedence: CLI `--port` overrides `server.port` when both are present.
1372	- `server.port` must be an integer. Positive values bind that port. `0` may be used to request an
1373	`ephemeral port for local development and tests.`
1374	- Implementations should bind loopback by default (`127.0.0.1` or host equivalent) unless explicitly
1375	`configured otherwise.`
1376	- Changes to HTTP listener settings (for example `server.port`) do not need to hot-rebind;
1377	`restart-required behavior is conformant.`
1378
1379	#### 13.7.1 Human-Readable Dashboard (`/`)
1380
1381	- Host a human-readable dashboard at `/`.
1382	`- The returned document should depict the current state of the system (for example active sessions,`
1383	`retry delays, token consumption, runtime totals, recent events, and health/error indicators).`
1384	`- It is up to the implementation whether this is server-generated HTML or a client-side app that`
1385	`consumes the JSON API below.`
1386
1387	#### 13.7.2 JSON REST API (`/api/v1/*`)
1388
1389	Provide a JSON REST API under `/api/v1/*` for current runtime state and operational debugging.
1390
1391	`Minimum endpoints:`
1392
1393	- `GET /api/v1/state`
1394	`- Returns a summary view of the current system state (running sessions, retry queue/delays,`
1395	`aggregate token/runtime totals, latest rate limits, and any additional tracked summary fields).`
1396	`- Suggested response shape:`
1397
1398	```json
1399	`{`
1400	`"generated_at": "2026-02-24T20:15:30Z",`
1401	`"counts": {`
1402	`"running": 2,`
1403	`"retrying": 1`
1404	`},`
1405	`"running": [`
1406	`{`
1407	`"issue_id": "abc123",`
1408	`"issue_identifier": "MT-649",`
1409	`"state": "In Progress",`
1410	`"session_id": "thread-1-turn-1",`
1411	`"turn_count": 7,`
1412	`"last_event": "turn_completed",`
1413	`"last_message": "",`
1414	`"started_at": "2026-02-24T20:10:12Z",`
1415	`"last_event_at": "2026-02-24T20:14:59Z",`
1416	`"tokens": {`
1417	`"input_tokens": 1200,`
1418	`"output_tokens": 800,`
1419	`"total_tokens": 2000`
1420	`}`
1421	`}`
1422	`],`
1423	`"retrying": [`
1424	`{`
1425	`"issue_id": "def456",`
1426	`"issue_identifier": "MT-650",`
1427	`"attempt": 3,`
1428	`"due_at": "2026-02-24T20:16:00Z",`
1429	`"error": "no available orchestrator slots"`
1430	`}`
1431	`],`
1432	`"codex_totals": {`
1433	`"input_tokens": 5000,`
1434	`"output_tokens": 2400,`
1435	`"total_tokens": 7400,`
1436	`"seconds_running": 1834.2`
1437	`},`
1438	`"rate_limits": null`
1439	`}`
1440	```
1441
1442	- `GET /api/v1/<issue_identifier>`
1443	`- Returns issue-specific runtime/debug details for the identified issue, including any information`
1444	`the implementation tracks that is useful for debugging.`
1445	`- Suggested response shape:`
1446
1447	```json
1448	`{`
1449	`"issue_identifier": "MT-649",`
1450	`"issue_id": "abc123",`
1451	`"status": "running",`
1452	`"workspace": {`
1453	`"path": "/tmp/symphony_workspaces/MT-649"`
1454	`},`
1455	`"attempts": {`
1456	`"restart_count": 1,`
1457	`"current_retry_attempt": 2`
1458	`},`
1459	`"running": {`
1460	`"session_id": "thread-1-turn-1",`
1461	`"turn_count": 7,`
1462	`"state": "In Progress",`
1463	`"started_at": "2026-02-24T20:10:12Z",`
1464	`"last_event": "notification",`
1465	`"last_message": "Working on tests",`
1466	`"last_event_at": "2026-02-24T20:14:59Z",`
1467	`"tokens": {`
1468	`"input_tokens": 1200,`
1469	`"output_tokens": 800,`
1470	`"total_tokens": 2000`
1471	`}`
1472	`},`
1473	`"retry": null,`
1474	`"logs": {`
1475	`"codex_session_logs": [`
1476	`{`
1477	`"label": "latest",`
1478	`"path": "/var/log/symphony/codex/MT-649/latest.log",`
1479	`"url": null`
1480	`}`
1481	`]`
1482	`},`
1483	`"recent_events": [`
1484	`{`
1485	`"at": "2026-02-24T20:14:59Z",`
1486	`"event": "notification",`
1487	`"message": "Working on tests"`
1488	`}`
1489	`],`
1490	`"last_error": null,`
1491	`"tracked": {}`
1492	`}`
1493	```
1494
1495	- If the issue is unknown to the current in-memory state, return `404` with an error response (for
1496	example `{\"error\":{\"code\":\"issue_not_found\",\"message\":\"...\"}}`).
1497
1498	- `POST /api/v1/refresh`
1499	`- Queues an immediate tracker poll + reconciliation cycle (best-effort trigger; implementations`
1500	`may coalesce repeated requests).`
1501	- Suggested request body: empty body or `{}`.
1502	- Suggested response (`202 Accepted`) shape:
1503
1504	```json
1505	`{`
1506	`"queued": true,`
1507	`"coalesced": false,`
1508	`"requested_at": "2026-02-24T20:15:30Z",`
1509	`"operations": ["poll", "reconcile"]`
1510	`}`
1511	```
1512
1513	`API design notes:`
1514
1515	`- The JSON shapes above are the recommended baseline for interoperability and debugging ergonomics.`
1516	`- Implementations may add fields, but should avoid breaking existing fields within a version.`
1517	- Endpoints should be read-only except for operational triggers like `/refresh`.
1518	- Unsupported methods on defined routes should return `405 Method Not Allowed`.
1519	- API errors should use a JSON envelope such as `{"error":{"code":"...","message":"..."}}`.
1520	`- If the dashboard is a client-side app, it should consume this API rather than duplicating state`
1521	`logic.`
1522
1523	`## 14. Failure Model and Recovery Strategy`
1524
1525	`### 14.1 Failure Classes`
1526
1527	1. `Workflow/Config Failures`
1528	- Missing `WORKFLOW.md`
1529	`- Invalid YAML front matter`
1530	`- Unsupported tracker kind or missing tracker credentials/project slug`
1531	`- Missing coding-agent executable`
1532
1533	2. `Workspace Failures`
1534	`- Workspace directory creation failure`
1535	`- Workspace population/synchronization failure (implementation-defined; may come from hooks)`
1536	`- Invalid workspace path configuration`
1537	`- Hook timeout/failure`
1538
1539	3. `Agent Session Failures`
1540	`- Startup handshake failure`
1541	`- Turn failed/cancelled`
1542	`- Turn timeout`
1543	`- User input requested (hard fail)`
1544	`- Subprocess exit`
1545	`- Stalled session (no activity)`
1546
1547	4. `Tracker Failures`
1548	`- API transport errors`
1549	`- Non-200 status`
1550	`- GraphQL errors`
1551	`- malformed payloads`
1552
1553	5. `Observability Failures`
1554	`- Snapshot timeout`
1555	`- Dashboard render errors`
1556	`- Log sink configuration failure`
1557
1558	`### 14.2 Recovery Behavior`
1559
1560	`- Dispatch validation failures:`
1561	`- Skip new dispatches.`
1562	`- Keep service alive.`
1563	`- Continue reconciliation where possible.`
1564
1565	`- Worker failures:`
1566	`- Convert to retries with exponential backoff.`
1567
1568	`- Tracker candidate-fetch failures:`
1569	`- Skip this tick.`
1570	`- Try again on next tick.`
1571
1572	`- Reconciliation state-refresh failures:`
1573	`- Keep current workers.`
1574	`- Retry on next tick.`
1575
1576	`- Dashboard/log failures:`
1577	`- Do not crash the orchestrator.`
1578
1579	`### 14.3 Partial State Recovery (Restart)`
1580
1581	`Current design is intentionally in-memory for scheduler state.`
1582
1583	`After restart:`
1584
1585	`- No retry timers are restored from prior process memory.`
1586	`- No running sessions are assumed recoverable.`
1587	`- Service recovers by:`
1588	`- startup terminal workspace cleanup`
1589	`- fresh polling of active issues`
1590	`- re-dispatching eligible work`
1591
1592	`### 14.4 Operator Intervention Points`
1593
1594	`Operators can control behavior by:`
1595
1596	- Editing `WORKFLOW.md` (prompt and most runtime settings).
1597	- `WORKFLOW.md` changes should be detected and re-applied automatically without restart.
1598	`- Changing issue states in the tracker:`
1599	`- terminal state -> running session is stopped and workspace cleaned when reconciled`
1600	`- non-active state -> running session is stopped without cleanup`
1601	`- Restarting the service for process recovery or deployment (not as the normal path for applying`
1602	`workflow config changes).`
1603
1604	`## 15. Security and Operational Safety`
1605
1606	`### 15.1 Trust Boundary Assumption`
1607
1608	`Each implementation defines its own trust boundary.`
1609
1610	`Operational safety requirements:`
1611
1612	`- Implementations should state clearly whether they are intended for trusted environments, more`
1613	`restrictive environments, or both.`
1614	`- Implementations should state clearly whether they rely on auto-approved actions, operator`
1615	`approvals, stricter sandboxing, or some combination of those controls.`
1616	`- Workspace isolation and path validation are important baseline controls, but they are not a`
1617	`substitute for whatever approval and sandbox policy an implementation chooses.`
1618
1619	`### 15.2 Filesystem Safety Requirements`
1620
1621	`Mandatory:`
1622
1623	`- Workspace path must remain under configured workspace root.`
1624	`- Coding-agent cwd must be the per-issue workspace path for the current run.`
1625	`- Workspace directory names must use sanitized identifiers.`
1626
1627	`Recommended additional hardening for ports:`
1628
1629	`- Run under a dedicated OS user.`
1630	`- Restrict workspace root permissions.`
1631	`- Mount workspace root on a dedicated volume if possible.`
1632
1633	`### 15.3 Secret Handling`
1634
1635	- Support `$VAR` indirection in workflow config.
1636	`- Do not log API tokens or secret env values.`
1637	`- Validate presence of secrets without printing them.`
1638
1639	`### 15.4 Hook Script Safety`
1640
1641	Workspace hooks are arbitrary shell scripts from `WORKFLOW.md`.
1642
1643	`Implications:`
1644
1645	`- Hooks are fully trusted configuration.`
1646	`- Hooks run inside the workspace directory.`
1647	`- Hook output should be truncated in logs.`
1648	`- Hook timeouts are required to avoid hanging the orchestrator.`
1649
1650	`### 15.5 Harness Hardening Guidance`
1651
1652	`Running Codex agents against repositories, issue trackers, and other inputs that may contain`
1653	`sensitive data or externally-controlled content can be dangerous. A permissive deployment can lead`
1654	`to data leaks, destructive mutations, or full machine compromise if the agent is induced to execute`
1655	`harmful commands or use overly-powerful integrations.`
1656
1657	`Implementations should explicitly evaluate their own risk profile and harden the execution harness`
1658	`where appropriate. This specification intentionally does not mandate a single hardening posture, but`
1659	`ports should not assume that tracker data, repository contents, prompt inputs, or tool arguments are`
1660	`fully trustworthy just because they originate inside a normal workflow.`
1661
1662	`Possible hardening measures include:`
1663
1664	`- Tightening Codex approval and sandbox settings described elsewhere in this specification instead`
1665	`of running with a maximally permissive configuration.`
1666	`- Adding external isolation layers such as OS/container/VM sandboxing, network restrictions, or`
1667	`separate credentials beyond the built-in Codex policy controls.`
1668	`- Filtering which Linear issues, projects, teams, labels, or other tracker sources are eligible for`
1669	`dispatch so untrusted or out-of-scope tasks do not automatically reach the agent.`
1670	- Narrowing the optional `linear_graphql` tool so it can only read or mutate data inside the
1671	`intended project scope, rather than exposing general workspace-wide tracker access.`
1672	`- Reducing the set of client-side tools, credentials, filesystem paths, and network destinations`
1673	`available to the agent to the minimum needed for the workflow.`
1674
1675	`The correct controls are deployment-specific, but implementations should document them clearly and`
1676	`treat harness hardening as part of the core safety model rather than an optional afterthought.`
1677
1678	`## 16. Reference Algorithms (Language-Agnostic)`
1679
1680	`### 16.1 Service Startup`
1681
1682	```text
1683	`function start_service():`
1684	`configure_logging()`
1685	`start_observability_outputs()`
1686	`start_workflow_watch(on_change=reload_and_reapply_workflow)`
1687
1688	`state = {`
1689	`poll_interval_ms: get_config_poll_interval_ms(),`
1690	`max_concurrent_agents: get_config_max_concurrent_agents(),`
1691	`running: {},`
1692	`claimed: set(),`
1693	`retry_attempts: {},`
1694	`completed: set(),`
1695	`codex_totals: {input_tokens: 0, output_tokens: 0, total_tokens: 0, seconds_running: 0},`
1696	`codex_rate_limits: null`
1697	`}`
1698
1699	`validation = validate_dispatch_config()`
1700	`if validation is not ok:`
1701	`log_validation_error(validation)`
1702	`fail_startup(validation)`
1703
1704	`startup_terminal_workspace_cleanup()`
1705	`schedule_tick(delay_ms=0)`
1706
1707	`event_loop(state)`
1708	```
1709
1710	`### 16.2 Poll-and-Dispatch Tick`
1711
1712	```text
1713	`on_tick(state):`
1714	`state = reconcile_running_issues(state)`
1715
1716	`validation = validate_dispatch_config()`
1717	`if validation is not ok:`
1718	`log_validation_error(validation)`
1719	`notify_observers()`
1720	`schedule_tick(state.poll_interval_ms)`
1721	`return state`
1722
1723	`issues = tracker.fetch_candidate_issues()`
1724	`if issues failed:`
1725	`log_tracker_error()`
1726	`notify_observers()`
1727	`schedule_tick(state.poll_interval_ms)`
1728	`return state`
1729
1730	`for issue in sort_for_dispatch(issues):`
1731	`if no_available_slots(state):`
1732	`break`
1733
1734	`if should_dispatch(issue, state):`
1735	`state = dispatch_issue(issue, state, attempt=null)`
1736
1737	`notify_observers()`
1738	`schedule_tick(state.poll_interval_ms)`
1739	`return state`
1740	```
1741
1742	`### 16.3 Reconcile Active Runs`
1743
1744	```text
1745	`function reconcile_running_issues(state):`
1746	`state = reconcile_stalled_runs(state)`
1747
1748	`running_ids = keys(state.running)`
1749	`if running_ids is empty:`
1750	`return state`
1751
1752	`refreshed = tracker.fetch_issue_states_by_ids(running_ids)`
1753	`if refreshed failed:`
1754	`log_debug("keep workers running")`
1755	`return state`
1756
1757	`for issue in refreshed:`
1758	`if issue.state in terminal_states:`
1759	`state = terminate_running_issue(state, issue.id, cleanup_workspace=true)`
1760	`else if issue.state in active_states:`
1761	`state.running[issue.id].issue = issue`
1762	`else:`
1763	`state = terminate_running_issue(state, issue.id, cleanup_workspace=false)`
1764
1765	`return state`
1766	```
1767
1768	`### 16.4 Dispatch One Issue`
1769
1770	```text
1771	`function dispatch_issue(issue, state, attempt):`
1772	`worker = spawn_worker(`
1773	`fn -> run_agent_attempt(issue, attempt, parent_orchestrator_pid) end`
1774	`)`
1775
1776	`if worker spawn failed:`
1777	`return schedule_retry(state, issue.id, next_attempt(attempt), {`
1778	`identifier: issue.identifier,`
1779	`error: "failed to spawn agent"`
1780	`})`
1781
1782	`state.running[issue.id] = {`
1783	`worker_handle,`
1784	`monitor_handle,`
1785	`identifier: issue.identifier,`
1786	`issue,`
1787	`session_id: null,`
1788	`codex_app_server_pid: null,`
1789	`last_codex_message: null,`
1790	`last_codex_event: null,`
1791	`last_codex_timestamp: null,`
1792	`codex_input_tokens: 0,`
1793	`codex_output_tokens: 0,`
1794	`codex_total_tokens: 0,`
1795	`last_reported_input_tokens: 0,`
1796	`last_reported_output_tokens: 0,`
1797	`last_reported_total_tokens: 0,`
1798	`retry_attempt: normalize_attempt(attempt),`
1799	`started_at: now_utc()`
1800	`}`
1801
1802	`state.claimed.add(issue.id)`
1803	`state.retry_attempts.remove(issue.id)`
1804	`return state`
1805	```
1806
1807	`### 16.5 Worker Attempt (Workspace + Prompt + Agent)`
1808
1809	```text
1810	`function run_agent_attempt(issue, attempt, orchestrator_channel):`
1811	`workspace = workspace_manager.create_for_issue(issue.identifier)`
1812	`if workspace failed:`
1813	`fail_worker("workspace error")`
1814
1815	`if run_hook("before_run", workspace.path) failed:`
1816	`fail_worker("before_run hook error")`
1817
1818	`session = app_server.start_session(workspace=workspace.path)`
1819	`if session failed:`
1820	`run_hook_best_effort("after_run", workspace.path)`
1821	`fail_worker("agent session startup error")`
1822
1823	`max_turns = config.agent.max_turns`
1824	`turn_number = 1`
1825
1826	`while true:`
1827	`prompt = build_turn_prompt(workflow_template, issue, attempt, turn_number, max_turns)`
1828	`if prompt failed:`
1829	`app_server.stop_session(session)`
1830	`run_hook_best_effort("after_run", workspace.path)`
1831	`fail_worker("prompt error")`
1832
1833	`turn_result = app_server.run_turn(`
1834	`session=session,`
1835	`prompt=prompt,`
1836	`issue=issue,`
1837	`on_message=(msg) -> send(orchestrator_channel, {codex_update, issue.id, msg})`
1838	`)`
1839
1840	`if turn_result failed:`
1841	`app_server.stop_session(session)`
1842	`run_hook_best_effort("after_run", workspace.path)`
1843	`fail_worker("agent turn error")`
1844
1845	`refreshed_issue = tracker.fetch_issue_states_by_ids([issue.id])`
1846	`if refreshed_issue failed:`
1847	`app_server.stop_session(session)`
1848	`run_hook_best_effort("after_run", workspace.path)`
1849	`fail_worker("issue state refresh error")`
1850
1851	`issue = refreshed_issue[0] or issue`
1852
1853	`if issue.state is not active:`
1854	`break`
1855
1856	`if turn_number >= max_turns:`
1857	`break`
1858
1859	`turn_number = turn_number + 1`
1860
1861	`app_server.stop_session(session)`
1862	`run_hook_best_effort("after_run", workspace.path)`
1863
1864	`exit_normal()`
1865	```
1866
1867	`### 16.6 Worker Exit and Retry Handling`
1868
1869	```text
1870	`on_worker_exit(issue_id, reason, state):`
1871	`running_entry = state.running.remove(issue_id)`
1872	`state = add_runtime_seconds_to_totals(state, running_entry)`
1873
1874	`if reason == normal:`
1875	`state.completed.add(issue_id) # bookkeeping only`
1876	`state = schedule_retry(state, issue_id, 1, {`
1877	`identifier: running_entry.identifier,`
1878	`delay_type: continuation`
1879	`})`
1880	`else:`
1881	`state = schedule_retry(state, issue_id, next_attempt_from(running_entry), {`
1882	`identifier: running_entry.identifier,`
1883	`error: format("worker exited: %reason")`
1884	`})`
1885
1886	`notify_observers()`
1887	`return state`
1888	```
1889
1890	```text
1891	`on_retry_timer(issue_id, state):`
1892	`retry_entry = state.retry_attempts.pop(issue_id)`
1893	`if missing:`
1894	`return state`
1895
1896	`candidates = tracker.fetch_candidate_issues()`
1897	`if fetch failed:`
1898	`return schedule_retry(state, issue_id, retry_entry.attempt + 1, {`
1899	`identifier: retry_entry.identifier,`
1900	`error: "retry poll failed"`
1901	`})`
1902
1903	`issue = find_by_id(candidates, issue_id)`
1904	`if issue is null:`
1905	`state.claimed.remove(issue_id)`
1906	`return state`
1907
1908	`if available_slots(state) == 0:`
1909	`return schedule_retry(state, issue_id, retry_entry.attempt + 1, {`
1910	`identifier: issue.identifier,`
1911	`error: "no available orchestrator slots"`
1912	`})`
1913
1914	`return dispatch_issue(issue, state, attempt=retry_entry.attempt)`
1915	```
1916
1917	`## 17. Test and Validation Matrix`
1918
1919	`A conforming implementation should include tests that cover the behaviors defined in this`
1920	`specification.`
1921
1922	`Validation profiles:`
1923
1924	- `Core Conformance`: deterministic tests required for all conforming implementations.
1925	- `Extension Conformance`: required only for optional features that an implementation chooses to
1926	`ship.`
1927	- `Real Integration Profile`: environment-dependent smoke/integration checks recommended before
1928	`production use.`
1929
1930	Unless otherwise noted, Sections 17.1 through 17.7 are `Core Conformance`. Bullets that begin with
1931	`If ... is implemented` are `Extension Conformance`.
1932
1933	`### 17.1 Workflow and Config Parsing`
1934
1935	`- Workflow file path precedence:`
1936	`- explicit runtime path is used when provided`
1937	- cwd default is `WORKFLOW.md` when no explicit runtime path is provided
1938	`- Workflow file changes are detected and trigger re-read/re-apply without restart`
1939	`- Invalid workflow reload keeps last known good effective configuration and emits an`
1940	`operator-visible error`
1941	- Missing `WORKFLOW.md` returns typed error
1942	`- Invalid YAML front matter returns typed error`
1943	`- Front matter non-map returns typed error`
1944	`- Config defaults apply when optional values are missing`
1945	- `tracker.kind` validation enforces currently supported kind (`linear`)
1946	- `tracker.api_key` works (including `$VAR` indirection)
1947	- `$VAR` resolution works for tracker API key and path values
1948	- `~` path expansion works
1949	- `codex.command` is preserved as a shell command string
1950	`- Per-state concurrency override map normalizes state names and ignores invalid values`
1951	- Prompt template renders `issue` and `attempt`
1952	`- Prompt rendering fails on unknown variables (strict mode)`
1953
1954	`### 17.2 Workspace Manager and Safety`
1955
1956	`- Deterministic workspace path per issue identifier`
1957	`- Missing workspace directory is created`
1958	`- Existing workspace directory is reused`
1959	`- Existing non-directory path at workspace location is handled safely (replace or fail per`
1960	`implementation policy)`
1961	`- Optional workspace population/synchronization errors are surfaced`
1962	- Temporary artifacts (`tmp`, `.elixir_ls`) are removed during prep
1963	- `after_create` hook runs only on new workspace creation
1964	- `before_run` hook runs before each attempt and failure/timeouts abort the current attempt
1965	- `after_run` hook runs after each attempt and failure/timeouts are logged and ignored
1966	- `before_remove` hook runs on cleanup and failures/timeouts are ignored
1967	`- Workspace path sanitization and root containment invariants are enforced before agent launch`
1968	`- Agent launch uses the per-issue workspace path as cwd and rejects out-of-root paths`
1969
1970	`### 17.3 Issue Tracker Client`
1971
1972	`- Candidate issue fetch uses active states and project slug`
1973	- Linear query uses the specified project filter field (`slugId`)
1974	- Empty `fetch_issues_by_states([])` returns empty without API call
1975	`- Pagination preserves order across multiple pages`
1976	- Blockers are normalized from inverse relations of type `blocks`
1977	`- Labels are normalized to lowercase`
1978	`- Issue state refresh by ID returns minimal normalized issues`
1979	- Issue state refresh query uses GraphQL ID typing (`[ID!]`) as specified in Section 11.2
1980	`- Error mapping for request errors, non-200, GraphQL errors, malformed payloads`
1981
1982	`### 17.4 Orchestrator Dispatch, Reconciliation, and Retry`
1983
1984	`- Dispatch sort order is priority then oldest creation time`
1985	- `Todo` issue with non-terminal blockers is not eligible
1986	- `Todo` issue with terminal blockers is eligible
1987	`- Active-state issue refresh updates running entry state`
1988	`- Non-active state stops running agent without workspace cleanup`
1989	`- Terminal state stops running agent and cleans workspace`
1990	`- Reconciliation with no running issues is a no-op`
1991	`- Normal worker exit schedules a short continuation retry (attempt 1)`
1992	`- Abnormal worker exit increments retries with 10s-based exponential backoff`
1993	- Retry backoff cap uses configured `agent.max_retry_backoff_ms`
1994	`- Retry queue entries include attempt, due time, identifier, and error`
1995	`- Stall detection kills stalled sessions and schedules retry`
1996	`- Slot exhaustion requeues retries with explicit error reason`
1997	`- If a snapshot API is implemented, it returns running rows, retry rows, token totals, and rate`
1998	`limits`
1999	`- If a snapshot API is implemented, timeout/unavailable cases are surfaced`
2000
2001	`### 17.5 Coding-Agent App-Server Client`
2002
2003	- Launch command uses workspace cwd and invokes `bash -lc <codex.command>`
2004	- Startup handshake sends `initialize`, `initialized`, `thread/start`, `turn/start`
2005	- `initialize` includes client identity/capabilities payload required by the targeted Codex
2006	`app-server protocol`
2007	`- Policy-related startup payloads use the implementation's documented approval/sandbox settings`
2008	- `thread/start` and `turn/start` parse nested IDs and emit `session_started`
2009	`- Request/response read timeout is enforced`
2010	`- Turn timeout is enforced`
2011	`- Partial JSON lines are buffered until newline`
2012	`- Stdout and stderr are handled separately; protocol JSON is parsed from stdout only`
2013	`- Non-JSON stderr lines are logged but do not crash parsing`
2014	`- Command/file-change approvals are handled according to the implementation's documented policy`
2015	`- Unsupported dynamic tool calls are rejected without stalling the session`
2016	`- User input requests are handled according to the implementation's documented policy and do not`
2017	`stall indefinitely`
2018	`- Usage and rate-limit payloads are extracted from nested payload shapes`
2019	`- Compatible payload variants for approvals, user-input-required signals, and usage/rate-limit`
2020	`telemetry are accepted when they preserve the same logical meaning`
2021	`- If optional client-side tools are implemented, the startup handshake advertises the supported tool`
2022	`specs required for discovery by the targeted app-server version`
2023	- If the optional `linear_graphql` client-side tool extension is implemented:
2024	`- the tool is advertised to the session`
2025	- valid `query` / `variables` inputs execute against configured Linear auth
2026	- top-level GraphQL `errors` produce `success=false` while preserving the GraphQL body
2027	`- invalid arguments, missing auth, and transport failures return structured failure payloads`
2028	`- unsupported tool names still fail without stalling the session`
2029
2030	`### 17.6 Observability`
2031
2032	`- Validation failures are operator-visible`
2033	`- Structured logging includes issue/session context fields`
2034	`- Logging sink failures do not crash orchestration`
2035	`- Token/rate-limit aggregation remains correct across repeated agent updates`
2036	`- If a human-readable status surface is implemented, it is driven from orchestrator state and does`
2037	`not affect correctness`
2038	`- If humanized event summaries are implemented, they cover key wrapper/agent event classes without`
2039	`changing orchestrator behavior`
2040
2041	`### 17.7 CLI and Host Lifecycle`
2042
2043	- CLI accepts an optional positional workflow path argument (`path-to-WORKFLOW.md`)
2044	- CLI uses `./WORKFLOW.md` when no workflow path argument is provided
2045	- CLI errors on nonexistent explicit workflow path or missing default `./WORKFLOW.md`
2046	`- CLI surfaces startup failure cleanly`
2047	`- CLI exits with success when application starts and shuts down normally`
2048	`- CLI exits nonzero when startup fails or the host process exits abnormally`
2049
2050	`### 17.8 Real Integration Profile (Recommended)`
2051
2052	`These checks are recommended for production readiness and may be skipped in CI when credentials,`
2053	`network access, or external service permissions are unavailable.`
2054
2055	- A real tracker smoke test can be run with valid credentials supplied by `LINEAR_API_KEY` or a
2056	documented local bootstrap mechanism (for example `~/.linear_api_key`).
2057	`- Real integration tests should use isolated test identifiers/workspaces and clean up tracker`
2058	`artifacts when practical.`
2059	`- A skipped real-integration test should be reported as skipped, not silently treated as passed.`
2060	`- If a real-integration profile is explicitly enabled in CI or release validation, failures should`
2061	`fail that job.`
2062
2063	`## 18. Implementation Checklist (Definition of Done)`
2064
2065	`Use the same validation profiles as Section 17:`
2066
2067	- Section 18.1 = `Core Conformance`
2068	- Section 18.2 = `Extension Conformance`
2069	- Section 18.3 = `Real Integration Profile`
2070
2071	`### 18.1 Required for Conformance`
2072
2073	`- Workflow path selection supports explicit runtime path and cwd default`
2074	- `WORKFLOW.md` loader with YAML front matter + prompt body split
2075	- Typed config layer with defaults and `$` resolution
2076	- Dynamic `WORKFLOW.md` watch/reload/re-apply for config and prompt
2077	`- Polling orchestrator with single-authority mutable state`
2078	`- Issue tracker client with candidate fetch + state refresh + terminal fetch`
2079	`- Workspace manager with sanitized per-issue workspaces`
2080	- Workspace lifecycle hooks (`after_create`, `before_run`, `after_run`, `before_remove`)
2081	- Hook timeout config (`hooks.timeout_ms`, default `60000`)
2082	`- Coding-agent app-server subprocess client with JSON line protocol`
2083	- Codex launch command config (`codex.command`, default `codex app-server`)
2084	- Strict prompt rendering with `issue` and `attempt` variables
2085	`- Exponential retry queue with continuation retries after normal exit`
2086	- Configurable retry backoff cap (`agent.max_retry_backoff_ms`, default 5m)
2087	`- Reconciliation that stops runs on terminal/non-active tracker states`
2088	`- Workspace cleanup for terminal issues (startup sweep + active transition)`
2089	- Structured logs with `issue_id`, `issue_identifier`, and `session_id`
2090	`- Operator-visible observability (structured logs; optional snapshot/status surface)`
2091
2092	`### 18.2 Recommended Extensions (Not Required for Conformance)`
2093
2094	- Optional HTTP server honors CLI `--port` over `server.port`, uses a safe default bind host, and
2095	`exposes the baseline endpoints/error semantics in Section 13.7 if shipped.`
2096	- Optional `linear_graphql` client-side tool extension exposes raw Linear GraphQL access through the
2097	`app-server session using configured Symphony auth.`
2098	`- TODO: Persist retry queue and session metadata across process restarts.`
2099	`- TODO: Make observability settings configurable in workflow front matter without prescribing UI`
2100	`implementation details.`
2101	`- TODO: Add first-class tracker write APIs (comments/state transitions) in the orchestrator instead`
2102	`of only via agent tools.`
2103	`- TODO: Add pluggable issue tracker adapters beyond Linear.`
2104
2105	`### 18.3 Operational Validation Before Production (Recommended)`
2106
2107	- Run the `Real Integration Profile` from Section 17.8 with valid credentials and network access.
2108	`- Verify hook execution and workflow path resolution on the target host OS/shell environment.`
2109	`- If the optional HTTP server is shipped, verify the configured port behavior and loopback/default`
2110	`bind expectations on the target environment.`
2111

openai/symphony

Branches

Tags

Clone