# AgentServer Sessions Architecture

---

## Motivation

Today, the agent server has a single, implicit session shared by all connected clients. The Shell application compensates by saving chat history to an HTML file on disk, giving it de-facto session persistence. The CLI has no such mechanism and starts fresh every time it connects. Neither client exposes user-level session management — users cannot resume a previous conversation, name a context, or discard one they no longer need.

As the project shifts toward a client-server model built around the agent server, it is increasingly clear that the session lifecycle belongs in the server layer, not in individual clients. This document proposes a design for explicit, named sessions with full CRUD semantics exposed via the existing RPC protocol.

---

## Goals

- **Create** a new named session on demand.
- **Resume** a previous session (chat history, conversation memory, display log) by ID.
- **List** available sessions, optionally filtered by a name substring match.
- **Rename** a session.
- **Delete** a session and its persisted data.
- Sessions are identified by a **GUID** and carry a human-readable **name** and **client count** so clients can make informed join decisions.
- Clients specify an **optional session ID** at `joinSession()` time; omitting it connects to the default session, or auto-creates one if none exist. The server always resolves to `"default"` rather than the most recently active session because in a multi-client environment, "most recently active" is a server-wide concept — a CLI user spinning up independently should not be silently dropped into an active Shell session. Clients that want last-used behavior should remember their last session ID locally and pass it explicitly.
- Session isolation is **client-enforced** for now — the server provides the signals, clients decide the policy.

## Non-Goals

- **Authentication or access control on the WebSocket endpoint.** Any process that can reach the agentServer's WebSocket port can call any RPC method, including `deleteSession`. Securing the endpoint itself is out of scope for v1.
- **Per-session access control.** The server does not restrict which clients can join or delete which sessions. Clients are trusted. See Open Questions for a future path.
- **Multi-user or multi-machine session sharing.** Sessions are local to a single agentServer instance.

---

## Current State

### What Exists Today

The dispatcher already has the scaffolding for session persistence:

- `Session.restoreLastSession()` — loads the most recently used session on startup.
- `persistSession: true` + `persistDir` — persists chat history, conversation memory, display log, and session config to `~/.typeagent/`.
- `sessions.json` + `sessions/<sessionDir>/data.json` — per-session on-disk records.

However, this is **transparent to clients**: there is no protocol-level API to list, choose, or delete sessions. The server always resumes whatever was last active.

### Instance Storage vs. Session Storage

The dispatcher exposes two storage scopes to agents via `SessionContext`:

- **`instanceStorage`** — scoped to `instanceDir` when present, falling back to `persistDir` when `instanceDir` is absent (standalone Shell, CLI, tests). Intended for configuration and data that should **survive across dispatcher sessions** (e.g. agent auth tokens, user preferences, learned config). Agents write here and expect to read it back regardless of which session the user is in.
- **`sessionStorage`** — scoped to `persistDir/sessions/<sessionId>/`. Intended for ephemeral, session-local data (e.g. caches, in-progress state) that is discarded when the user creates a new session.

In `sessionContext.ts`, the mapping is explicit:

```typescript
const storage = storageProvider.getStorage(name, sessionDirPath); // sessionStorage
const instanceStorage =
  (context.instanceDir ?? context.persistDir)
    ? storageProvider!.getStorage(
        name,
        context.instanceDir ?? context.persistDir!,
      )
    : undefined; // instanceStorage
```

This contract — `instanceStorage` survives, `sessionStorage` is ephemeral — holds today in both the standalone Shell and the CLI.
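
The fallback rule above can be reduced to a single expression. The following is a minimal sketch, assuming a simplified `StorageRoots` type (hypothetical; the real context carries many more fields) in place of the dispatcher context:

```typescript
// Hypothetical, simplified stand-in for the relevant fields of the dispatcher context.
type StorageRoots = {
  instanceDir?: string; // global root, when the host provides one
  persistDir?: string; // session root, also the fallback instanceStorage root
};

// Returns the directory instanceStorage should be rooted at, or undefined
// when neither directory is configured (nothing is persisted).
function resolveInstanceStorageRoot(roots: StorageRoots): string | undefined {
  return roots.instanceDir ?? roots.persistDir;
}
```

With only `persistDir` set (standalone Shell, CLI, tests), the function returns `persistDir`, which is the behavior the contract relies on today.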

### The Problem with Scoping `persistDir` per Server Session

Naively scoping each server-session's `persistDir` to `server-sessions/<server-session-id>/` breaks this contract:

```
server-sessions/<server-session-id>/                        ← persistDir → instanceStorage root
server-sessions/<server-session-id>/sessions/<session-id>/  ← sessionStorage
```

**Every time a new server session is created, both `instanceStorage` and `sessionStorage` start fresh.** Agent configuration data (auth tokens, user preferences, learned state) is silently discarded whenever the user connects to a new server session. The fix is a split storage root described in Section 4.

### One Shared Context for All Clients

A critical detail: `createSharedDispatcher()` calls `initializeCommandHandlerContext()` **once** at startup, producing a single `context`. Every subsequent `join()` call creates a `Dispatcher` via `createDispatcherFromContext(context, connectionId, ...)` — all clients share the same underlying session context. Chat history, conversation memory, and session config are fully shared state. The `connectionId` only isolates `ClientIO` routing (display output reaches the right client), not the conversation itself.

This means a second client connecting mid-conversation sees — and appends to — the same chat history the first client was using. There is effectively no per-client session isolation today.
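
To make the sharing concrete, here is a small illustrative model. `SharedContext`, `DispatcherHandle`, and `createHandle` are hypothetical names, not the dispatcher's actual types; the point is only that every handle closes over the same context object:

```typescript
// Hypothetical, minimal model of the shared context; the real
// CommandHandlerContext carries far more state.
type SharedContext = { chatHistory: string[] };

type DispatcherHandle = {
  connectionId: string;
  say: (text: string) => void;
  history: () => readonly string[];
};

// Every handle closes over the same context object, so only the
// connectionId differs between clients.
function createHandle(context: SharedContext, connectionId: string): DispatcherHandle {
  return {
    connectionId,
    say: (text) => {
      context.chatHistory.push(`${connectionId}: ${text}`);
    },
    history: () => context.chatHistory,
  };
}

const sharedContext: SharedContext = { chatHistory: [] }; // created once at startup
const shellClient = createHandle(sharedContext, "0");
const cliClient = createHandle(sharedContext, "1");

shellClient.say("play some jazz");
cliClient.say("what's on my calendar?");
// cliClient.history() now holds both entries, including the one from the Shell.
```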

### Key Gap

The `join()` call today accepts only:

```typescript
type DispatcherConnectOptions = {
  filter?: boolean;
  clientType?: "shell" | "extension";
};
```

There is no way for a client to specify which session to use, or to perform session management at all.

---

## Proposed Design

### 1. Session Identity

Each session is identified by:

| Field         | Type                | Description                                                                                                                                               |
| ------------- | ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `sessionId`   | `string` (UUIDv4)   | Stable, globally unique identifier                                                                                                                        |
| `name`        | `string`            | Human-readable label (1–256 chars), set by the caller at `createSession()` time. Not enforced unique.                                                     |
| `createdAt`   | `string` (ISO 8601) | When the session was first created                                                                                                                        |
| `clientCount` | `number`            | Number of clients currently connected to this session (runtime-derived; `0` if the session is not loaded in memory). **Never persisted** — see Section 2. |
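
A minimal sketch of how a persisted record could be built from these fields. `validateSessionName` and `makeSessionRecord` are hypothetical helpers; the ID generator and clock are injected to keep the example deterministic, and the length check counts UTF-16 code units, which is an assumption about how "chars" is measured:

```typescript
// Persisted shape only: clientCount is runtime-derived and intentionally absent.
type SessionRecord = {
  sessionId: string;
  name: string;
  createdAt: string; // ISO 8601
};

const MAX_SESSION_NAME_LENGTH = 256;

// Throws if the name falls outside the 1-256 character range from the table.
// Uniqueness is deliberately not checked.
function validateSessionName(name: string): string {
  if (name.length < 1 || name.length > MAX_SESSION_NAME_LENGTH) {
    throw new Error(`Session name must be 1-${MAX_SESSION_NAME_LENGTH} characters`);
  }
  return name;
}

// newId and now are injected for determinism; a real implementation would
// presumably use a UUIDv4 generator and the system clock.
function makeSessionRecord(name: string, newId: () => string, now: Date): SessionRecord {
  const validName = validateSessionName(name);
  return { sessionId: newId(), name: validName, createdAt: now.toISOString() };
}
```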

### 2. Session Metadata

A `sessions.json` file lives at `instanceDir/server-sessions/sessions.json` and is the authoritative registry:

```json
{
  "sessions": [
    {
      "sessionId": "a1b2c3d4-...",
      "name": "workout playlist setup",
      "createdAt": "2026-04-01T10:00:00Z"
    }
  ]
}
```

Each session's ephemeral data (chat history, conversation memory, display log, session config) is stored in `instanceDir/server-sessions/<sessionId>/`. Agent `instanceStorage` (config, auth tokens, learned state) is stored directly under `instanceDir/<agentName>/`, **shared across all server sessions**.

> **Note:** `clientCount` is a runtime-only field — it is **never written to `sessions.json`**. It is populated at query time by inspecting the live dispatcher pool.

> **Note:** Legacy session data (from standalone Shell runs) is left in place and coexists on disk with the agentServer's session registry. The agentServer does not read, migrate, or touch legacy session directories. The standalone Shell continues to use its own session management independently for now.

### 3. Protocol Changes

#### Extended `DispatcherConnectOptions`

```typescript
type DispatcherConnectOptions = {
  filter?: boolean;
  clientType?: "shell" | "extension";

  // Session management (new)
  sessionId?: string; // Join a specific session by UUID. If omitted → connects to the default session.
};
```

#### New `AgentServerInvokeFunctions`

The existing `join` RPC is replaced by `joinSession`. A `leaveSession` call is added for explicit session departure. Full session CRUD is exposed:

```typescript
type AgentServerInvokeFunctions = {
  // Replaces the old `join`
  joinSession: (
    options?: DispatcherConnectOptions,
  ) => Promise<JoinSessionResult>;
  leaveSession: (sessionId: string) => Promise<void>;

  // Session CRUD
  createSession: (name: string) => Promise<SessionInfo>;
  listSessions: (name?: string) => Promise<SessionInfo[]>;
  renameSession: (sessionId: string, newName: string) => Promise<void>;
  deleteSession: (sessionId: string) => Promise<void>;
};
```

#### `JoinSessionResult`

```typescript
type JoinSessionResult = {
  connectionId: string;
  sessionId: string; // The session that was joined or auto-created
};
```

> **Migration note:** The old `join()` returned `Promise<string>` (the `connectionId`). The fork avoids this breaking change by renaming the method to `joinSession()` and keeping `connectDispatcher()` as a `@deprecated` backward-compatible wrapper in `agentServerClient.ts`.

#### `SessionInfo`

```typescript
type SessionInfo = {
  sessionId: string;
  name: string;
  clientCount: number;
  createdAt: string;
};
```

### 4. Server-Side: `SessionManager` and `SharedDispatcher`

Today, `createSharedDispatcher()` creates one global dispatcher with one session. Under the new design, a `SessionManager` maintains a **pool of per-session `SharedDispatcher` instances** — one per active session, shared by all clients connected to that session.

```
AgentServer
  └── SessionManager
        ├── SessionRecord[session-A]
        │     └── SharedDispatcher ← clients 0, 1 (both connected to session A)
        └── SessionRecord[session-B]
              └── SharedDispatcher ← client 2 (connected to session B)
```

#### Storage Split: `instanceDir` vs. `persistDir`

To preserve the `instanceStorage` / `sessionStorage` contract across server sessions, the dispatcher must be initialized with **two distinct root directories** rather than one:

| Directory     | Purpose                                                                                                                                 | Lifetime                                                                                                  |
| ------------- | --------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| `instanceDir` | Global instance root — maps to `instanceStorage` for all agents. Contains agent config, auth tokens, user preferences, embedding cache. | Lives for the lifetime of the agentServer process (or the user profile). Never scoped per server session. |
| `persistDir`  | Per-server-session root — maps to `sessionStorage` and holds chat history, conversation memory, display log, and session config.        | Scoped to `instanceDir/server-sessions/<sessionId>/`. Discarded with the session.                         |

**Concrete paths:**

```
~/.typeagent/profiles/dev/                                              ← instanceDir (global)
~/.typeagent/profiles/dev/server-sessions/<sessionId>/                  ← persistDir (per session)
~/.typeagent/profiles/dev/server-sessions/<sessionId>/sessions/<id>/    ← sessionStorage
~/.typeagent/profiles/dev/<agentName>/                                  ← instanceStorage (global)
```

#### `DispatcherOptions` changes

`initializeCommandHandlerContext()` today accepts a single `persistDir`. To support the split, a new optional `instanceDir` field is added:

```typescript
type DispatcherOptions = {
  // ...existing fields...
  persistDir?: string; // per-server-session directory (chat history, memory, config)
  instanceDir?: string; // global instance directory for cross-session agent storage
  // ...
};
```

When `instanceDir` is provided, `instanceStorage` is rooted there instead of at `persistDir`. When `instanceDir` is omitted (standalone Shell, CLI, tests), behavior is unchanged — `instanceStorage` falls back to `persistDir`, preserving full backward compatibility.

#### `SessionContext` wiring

In `sessionContext.ts`, the `instanceStorage` base changes from `context.persistDir` to the new `context.instanceDir` (falling back to `context.persistDir` when `instanceDir` is absent):

```typescript
const instanceStorage =
  (context.instanceDir ?? context.persistDir)
    ? storageProvider!.getStorage(
        name,
        context.instanceDir ?? context.persistDir!,
      )
    : undefined;
```

This is the only change needed in the storage wiring — no changes to the `Storage` interface or agent code.

#### Server initialization

When the agentServer starts up, it resolves `instanceDir` once; each per-session dispatcher then receives that shared `instanceDir` together with its own `persistDir`:

```typescript
const instanceDir = getProfilePath("dev"); // e.g. ~/.typeagent/profiles/dev
const persistDir = path.join(instanceDir, "server-sessions", sessionId); // per-session subdirectory

initializeCommandHandlerContext("agentServer", {
  instanceDir, // global — never changes between sessions
  persistDir, // scoped to this server session
  persistSession: true,
  // ...
});
```

#### `CommandHandlerContext` changes

A new `instanceDir` field is added alongside the existing `persistDir`:

```typescript
export type CommandHandlerContext = {
  // ...existing fields...
  readonly persistDir: string | undefined; // per-server-session root (chat, memory, config)
  readonly instanceDir: string | undefined; // global instance root (agent config, auth tokens)
  // ...
};
```

Each session's `SharedDispatcher` is created lazily on first `joinSession()` and calls `initializeCommandHandlerContext()` with a `persistDir` scoped to `server-sessions/<sessionId>/` and a shared `instanceDir`, giving it fully isolated chat history and session config while preserving agent configuration across session boundaries. Clients connecting to the same session share one dispatcher instance and its `ClientIO` routing table, consistent with how the current single dispatcher works today.

`SharedDispatcher.join()` calls `createDispatcherFromContext(context, connectionId, ...)` per client — producing a lightweight `Dispatcher` handle bound to a unique `connectionId` but sharing the same underlying context. Output routing is per-client via `connectionId`; conversation state is shared across all clients in the session.

Idle session dispatchers are automatically evicted from memory after 5 minutes with no connected clients, freeing resources without requiring explicit lifecycle management.
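
One way to picture lazy creation and idle eviction together is the sketch below. `DispatcherPool` is a hypothetical class, the dispatcher is reduced to a `close` hook, and timestamps are passed in explicitly; how the sweep is actually triggered (timer, per-call check) is not specified in this document and is left to the caller here:

```typescript
const IDLE_TIMEOUT_MS = 5 * 60 * 1000; // 5 minutes, per the text above

type PoolEntry = {
  clientCount: number;
  idleSince: number | undefined; // ms timestamp when the last client left
  close: () => void;
};

class DispatcherPool {
  private entries = new Map<string, PoolEntry>();

  // Lazily creates the entry on first join, then counts the client in.
  join(sessionId: string, createDispatcher: () => { close: () => void }): void {
    let entry = this.entries.get(sessionId);
    if (entry === undefined) {
      entry = { clientCount: 0, idleSince: undefined, close: createDispatcher().close };
      this.entries.set(sessionId, entry);
    }
    entry.clientCount++;
    entry.idleSince = undefined;
  }

  leave(sessionId: string, now: number): void {
    const entry = this.entries.get(sessionId);
    if (entry === undefined || entry.clientCount === 0) return;
    entry.clientCount--;
    if (entry.clientCount === 0) entry.idleSince = now;
  }

  // Closes and removes every entry that has had no clients for at least
  // IDLE_TIMEOUT_MS, returning the evicted session IDs.
  evictIdle(now: number): string[] {
    const evicted: string[] = [];
    for (const [sessionId, entry] of this.entries) {
      if (entry.idleSince !== undefined && now - entry.idleSince >= IDLE_TIMEOUT_MS) {
        entry.close();
        this.entries.delete(sessionId);
        evicted.push(sessionId);
      }
    }
    return evicted;
  }

  isLoaded(sessionId: string): boolean {
    return this.entries.has(sessionId);
  }
}
```

Eviction only drops the in-memory dispatcher; the session's registry entry and on-disk data are untouched, so a later `joinSession()` would lazily recreate it.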

#### Channel Namespacing

Each session uses namespaced WebSocket channels to allow multiple sessions over a single WebSocket connection:

- `dispatcher:<sessionId>` — the dispatcher RPC channel for that session
- `clientio:<sessionId>` — the ClientIO RPC channel for that session
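
A short sketch of building and parsing these channel names. `channelName` and `parseChannelName` are hypothetical helpers; parsing splits at the first colon only, so it does not depend on the session ID format:

```typescript
type ChannelKind = "dispatcher" | "clientio";

function channelName(kind: ChannelKind, sessionId: string): string {
  return `${kind}:${sessionId}`;
}

// Returns undefined for anything that is not a well-formed session channel.
function parseChannelName(
  channel: string,
): { kind: ChannelKind; sessionId: string } | undefined {
  const separator = channel.indexOf(":");
  if (separator <= 0) return undefined;
  const kind = channel.slice(0, separator);
  const sessionId = channel.slice(separator + 1);
  if (kind !== "dispatcher" && kind !== "clientio") return undefined;
  if (sessionId === "") return undefined;
  return { kind, sessionId };
}
```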

### 5. Session Lifecycle on `joinSession()`

```
Client calls joinSession({ sessionId?, clientType, filter })
  │
  ├─ sessionId provided?
  │   ├─ Yes → look up instanceDir/server-sessions/sessions.json
  │   │   ├─ Found → load SharedDispatcher for this session (lazy init if not in memory pool)
  │   │   └─ Not found → return error: "Session not found"
  │   └─ No → connect to the default session
  │       ├─ Session named "default" exists → use it
  │       └─ No sessions exist → auto-create session named "default"
  │           ├─ Create instanceDir/server-sessions/<sessionId>/     ← persistDir
  │           └─ Init dispatcher with instanceDir (global) + persistDir (session-scoped)
  │
  ├─ Register client in session's SharedDispatcher routing table
  └─ Return JoinSessionResult { connectionId, sessionId }
```
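
The session-resolution branch of this flow can be sketched as a pure function. `resolveSessionForJoin` is a hypothetical name, the `registry` array stands in for `sessions.json`, and the sketch assumes that `"default"` is created whenever no session by that name exists (the diagram only shows the empty-registry case):

```typescript
type SessionEntry = { sessionId: string; name: string };

const DEFAULT_SESSION_NAME = "default";

// registry stands in for sessions.json; newId is injected for determinism.
function resolveSessionForJoin(
  registry: SessionEntry[],
  requestedId: string | undefined,
  newId: () => string,
): SessionEntry {
  if (requestedId !== undefined) {
    const found = registry.find((s) => s.sessionId === requestedId);
    if (found === undefined) throw new Error("Session not found");
    return found;
  }
  const existing = registry.find((s) => s.name === DEFAULT_SESSION_NAME);
  if (existing !== undefined) return existing;
  // Assumption: create "default" whenever it is missing, which also covers
  // the "no sessions exist" case from the diagram.
  const created: SessionEntry = { sessionId: newId(), name: DEFAULT_SESSION_NAME };
  registry.push(created);
  return created;
}
```

Dispatcher initialization and client registration would follow once the session entry is known; they are omitted here.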

### 6. `listSessions(name?)`

Returns the sessions from `sessions.json`. If `name` is provided, only sessions whose `name` contains the substring (case-insensitive) are returned. If `name` is omitted, all sessions are returned. `clientCount` is populated from the live dispatcher pool for sessions currently loaded in memory and is `0` for sessions that are not loaded.

```typescript
// Response shape
SessionInfo[]
// Example:
[
  {
    sessionId: "a1b2c3d4-e5f6-...",
    name: "workout playlist setup",
    createdAt: "2026-04-01T10:00:00Z",
    clientCount: 1
  },
  {
    sessionId: "f7e8d9c0-...",
    name: "flight research",
    createdAt: "2026-03-28T09:15:00Z",
    clientCount: 0
  }
]
```
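
A minimal sketch of the filtering and `clientCount` merge, assuming the persisted records and the live client counts are available as plain values (`StoredSession` and the `liveClientCounts` map are hypothetical shapes for illustration):

```typescript
type StoredSession = { sessionId: string; name: string; createdAt: string };
type SessionInfo = StoredSession & { clientCount: number };

function listSessions(
  stored: StoredSession[],
  liveClientCounts: Map<string, number>, // sessionId -> connected clients, loaded sessions only
  nameFilter?: string,
): SessionInfo[] {
  const needle = nameFilter?.toLowerCase();
  return stored
    .filter((s) => needle === undefined || s.name.toLowerCase().includes(needle))
    .map((s) => ({ ...s, clientCount: liveClientCounts.get(s.sessionId) ?? 0 }));
}
```

Because `clientCount` is merged in at query time, the persisted records never need to carry it.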

### 7. `deleteSession(sessionId)`

1. Close all active client dispatcher handles for the session.
2. Shut down and evict the session's `SharedDispatcher` from the in-memory pool.
3. Remove `instanceDir/server-sessions/<sessionId>/` from disk (recursive delete of the `persistDir` subtree only, best-effort). **Agent `instanceStorage` under `instanceDir/<agentName>/` is not touched.**
4. Remove the entry from `sessions.json`.

> **Note:** Any connected client can call `deleteSession` on any session, including sessions they are not currently connected to. The calling client's session-namespaced channels are cleaned up immediately; other clients connected to the deleted session have their dispatcher handles closed when `SharedDispatcher.close()` is called. Server-side authorization is out of scope for v1 (see Open Questions).
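
The ordering of the four steps, with the disk removal treated as best-effort, can be sketched as follows. `DeleteDeps` and its members are hypothetical collaborators injected for illustration, not existing APIs:

```typescript
// Hypothetical collaborators, injected so the ordering can be exercised in memory.
type DeleteDeps = {
  closeClientHandles: (sessionId: string) => void;
  evictDispatcher: (sessionId: string) => void;
  removeSessionDir: (sessionId: string) => void; // may throw
  removeRegistryEntry: (sessionId: string) => void;
};

function deleteSession(sessionId: string, deps: DeleteDeps): void {
  deps.closeClientHandles(sessionId); // step 1
  deps.evictDispatcher(sessionId); // step 2
  try {
    deps.removeSessionDir(sessionId); // step 3, best-effort
  } catch {
    // A failed disk cleanup does not block the remaining step; a real server
    // would likely log the failure.
  }
  deps.removeRegistryEntry(sessionId); // step 4
}
```

Only the session's own directory is passed to `removeSessionDir`, so agent `instanceStorage` stays out of reach of the delete path by construction.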

---

## Client Integration

### CLI

The CLI implements the full session management surface described in this document, with client-side session persistence.

#### `connect` — join a session

```bash
agent-cli connect                        # connect to the 'CLI' session (created if absent)
agent-cli connect --resume               # resume the last used session
agent-cli connect --session <id>         # connect to a specific session by ID
agent-cli connect --port <port>          # connect to a server on a non-default port (default: 8999)
```

By default (no flags), `connect` targets a session named `"CLI"`. It calls `listSessions("CLI")` and joins the first match, or calls `createSession("CLI")` if none exists.

Pass `--resume` / `-r` to instead resume the last used session, whose ID is persisted client-side in `~/.typeagent/cli-state.json`. If that session is no longer found on the server, the user is prompted to join the `"CLI"` session (find-or-create). If the user declines, the stale ID is cleared and the command exits.

Pass `--session` / `-s <id>` to connect to a specific session by UUID. This takes priority over `--resume` if both are provided; errors propagate as-is without the recovery prompt.

On every successful connection the connected session ID is written to `~/.typeagent/cli-state.json` for use by future `--resume` invocations.
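
The flag precedence described above can be sketched as a pure planning step. `planConnect`, `ConnectFlags`, `CliState`, and `ConnectPlan` are hypothetical names; the sketch also assumes that `--resume` with no saved session ID falls through to the `"CLI"` session, which the text does not state explicitly:

```typescript
type ConnectFlags = { session?: string; resume?: boolean };
type CliState = { lastSessionId?: string }; // contents of cli-state.json

type ConnectPlan =
  | { kind: "joinById"; sessionId: string; promptOnMissing: boolean }
  | { kind: "findOrCreate"; name: string };

const CLI_SESSION_NAME = "CLI";

function planConnect(flags: ConnectFlags, state: CliState): ConnectPlan {
  if (flags.session !== undefined) {
    // --session wins over --resume; errors surface without a recovery prompt.
    return { kind: "joinById", sessionId: flags.session, promptOnMissing: false };
  }
  if (flags.resume && state.lastSessionId !== undefined) {
    // A stale ID leads to the prompt to join the "CLI" session instead.
    return { kind: "joinById", sessionId: state.lastSessionId, promptOnMissing: true };
  }
  // Assumption: --resume with no saved ID behaves like the default path.
  return { kind: "findOrCreate", name: CLI_SESSION_NAME };
}
```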

#### `sessions` topic — session CRUD

| Command                                    | RPC call                     |
| ------------------------------------------ | ---------------------------- |
| `agent-cli sessions create <name>`         | `createSession(name)`        |
| `agent-cli sessions list [--name <sub>]`   | `listSessions(name?)`        |
| `agent-cli sessions rename <id> <newName>` | `renameSession(id, newName)` |
| `agent-cli sessions delete <id> [--yes]`   | `deleteSession(id)`          |

`sessions create`, `list`, `rename`, and `delete` use `connectAgentServer()` directly (no `joinSession()`) — they are management operations that do not require joining a session.

`sessions delete` prompts `Delete session <id> and all its data? (y/N)` before calling `deleteSession()`. Pass `--yes` / `-y` to skip the prompt.

`sessions list` renders a fixed-width table with columns `SESSION ID`, `NAME`, `CLIENTS`, and `CREATED AT`. Pass `--name <substring>` to filter by name (case-insensitive).
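
As an illustration of the table output, the sketch below pads each column to a fixed width per render. Only the column headers come from this document; `renderSessionsTable`, the two-space column gap, and sizing each column to its longest cell are assumptions about the layout:

```typescript
type SessionRow = {
  sessionId: string;
  name: string;
  clientCount: number;
  createdAt: string;
};

const HEADERS = ["SESSION ID", "NAME", "CLIENTS", "CREATED AT"];

function renderSessionsTable(sessions: SessionRow[]): string {
  const rows = sessions.map((s) => [
    s.sessionId,
    s.name,
    String(s.clientCount),
    s.createdAt,
  ]);
  // Each column is as wide as its longest cell, header included.
  const widths = HEADERS.map((header, col) =>
    Math.max(header.length, ...rows.map((row) => row[col].length)),
  );
  // Pad every cell to its column width, then drop trailing whitespace.
  const formatRow = (cells: string[]): string =>
    cells
      .map((cell, col) => cell.padEnd(widths[col]))
      .join("  ")
      .replace(/\s+$/, "");
  return [HEADERS, ...rows].map(formatRow).join("\n");
}
```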

### Shell

Session management only applies when the Shell is running in **connected mode** (`--connect <port>`). In standalone mode the Shell runs an in-process dispatcher and manages its own session state independently.

When connected to the agentServer:

- On startup, Shell calls `listSessions()` and presents a session picker (or auto-connects to the default session).
- A session management panel allows listing, switching, renaming, and deleting sessions.
- When resuming a session, the Shell loads chat history from the server via `getDisplayHistory()` (which already exists on the `Dispatcher` interface and works per-session since each session has its own `DisplayLog` instance) rather than its local HTML file. How this history is rendered in the Shell UI is an open question — see Open Questions item 2.

---

## Out of Scope (Future Iterations)

- **Session sharing across machines** — sessions are local to one agentServer instance for now.
- **Session export/import** — useful for backup/restore, but not required for v1.
- **Per-session agent configuration** — sessions inherit the global agent enable/disable config for now.
- **Pagination for `listSessions()`** — the full index is loaded on every call; pagination (`limit`/`offset`) can be added once session counts grow large enough to matter.
- **Graceful drain on `deleteSession()`** — currently `deleteSession()` closes all client dispatcher handles immediately with no drain window. A future iteration should notify connected clients and await in-flight `processCommand()` calls (with a timeout) before tearing down the session.
- **Per-session concurrency lock** — concurrent `joinSession()` calls targeting the same session ID before lazy initialization completes could race. A per-session async mutex should be added in a follow-up.
- **LLM-generated session summaries** — auto-generating a one-line summary after the first exchange is a useful future addition for session discoverability, but is deferred in favor of explicit `name` for now.

---

## Open Questions

1. **Session size limits:** Should there be a maximum number of sessions? A maximum disk size per session? These constraints are useful for operational hygiene but depend on expected usage patterns.

2. **Shell history rendering on session resume:** When the Shell resumes a session in connected mode, it calls `getDisplayHistory()` to load prior history. The open question is how this is rendered: does the Shell rebuild the chat view in place from the returned entries, swap its local HTML file, or render history on demand? This is a Shell-specific UX and architecture decision to be resolved when this work is tackled.

3. **Client-Enforced Session Isolation:** Session isolation is currently **client-enforced**. The server provides `clientCount` as a signal, but nothing prevents a poorly behaved client from joining a session it shouldn't. Whether to add server-side enforcement (e.g., max connections per session, explicit session locking) is an open question for a future iteration.

---

## Summary

This design adds explicit session management to the agentServer without fundamentally restructuring its architecture. The core additions are:

- A `sessions.json` registry for discoverable, GUID-keyed named sessions.
- A `SessionManager` that maintains a pool of per-session `SharedDispatcher` instances, each with its own isolated `initializeCommandHandlerContext()` call, chat history, conversation memory, and persist directory.
- Six new RPC methods on the `AgentServer` channel: `joinSession`, `leaveSession`, `createSession`, `listSessions`, `renameSession`, `deleteSession`.
- `sessionId` in `DispatcherConnectOptions` so clients can resume a specific session by ID.
- `listSessions(name?)` with optional substring filtering as the primary session discovery mechanism.
- Session-namespaced WebSocket channels (`dispatcher:<id>`, `clientio:<id>`) enabling multiple concurrent sessions over a single connection.
- Idle dispatcher eviction after 5 minutes to free memory for inactive sessions.
- **A split storage root**: `instanceDir` (global, shared across all server sessions) and `persistDir` (per-server-session, discarded with the session). `instanceStorage` is rooted at `instanceDir`, preserving agent configuration and auth tokens across session boundaries. `sessionStorage` and all ephemeral dispatcher data (chat history, memory, display log) remain scoped to `persistDir`. A new `instanceDir` field is added to `DispatcherOptions` and `CommandHandlerContext`; when absent, behavior falls back to `persistDir` for full backward compatibility with the standalone Shell, CLI, and tests.

The server enforces no policy on who can join or delete a session — `clientCount` gives clients the signal to make that decision themselves.

Session state is local to the server — the underlying LLM API is stateless, and the server owns all history management. This lifts that responsibility from individual clients (Shell, CLI) into the shared server layer, where it belongs.