AgenticCode/openspec/specs/chat-streaming/spec.md

## Purpose

Define the streaming AI response pipeline: the backend proxy to the Responses API, SSE delivery to the WASM client, configuration, and error handling.

## Requirements

### Requirement: Chat endpoint proxies to Responses API

The API backend SHALL expose `POST /api/chat` that accepts a list of messages and proxies the request to the local Responses API at a configurable base URL using the `POST /v1/responses` endpoint.

#### Scenario: Successful proxy request

- **WHEN** the client sends a POST to `/api/chat` with a message list
- **THEN** the API forwards the messages to the Responses API with the configured model and returns the response
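
As an illustration of the forwarding step, the following minimal sketch builds the upstream request from a client message list. It assumes the local Responses API accepts an OpenAI-style body with `model`, `input`, and `stream` fields; the function name `build_upstream_request` and the `role`/`content` message shape are hypothetical and not defined by this spec. Python is used only as compact notation.

```python
import json


def build_upstream_request(messages, base_url, model):
    """Return (url, body_bytes) for the upstream POST /v1/responses call.

    Assumption: each message is a dict with "role" and "content" keys, and
    the Responses API accepts them as items of the "input" array.
    """
    # Tolerate a trailing slash in the configured base URL.
    url = base_url.rstrip("/") + "/v1/responses"
    payload = {
        "model": model,
        "input": [{"role": m["role"], "content": m["content"]} for m in messages],
        # Assumption: streaming is requested with a boolean "stream" flag.
        "stream": True,
    }
    return url, json.dumps(payload).encode("utf-8")
```

Copying only `role` and `content` keeps any extra client-side fields from leaking into the upstream request.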

### Requirement: Streaming response delivery

The API backend SHALL stream the Responses API's SSE events back to the WASM client as `text/event-stream`, forwarding `response.output_text.delta` events so the client can render tokens incrementally.

#### Scenario: Tokens stream to client

- **WHEN** the Responses API emits `response.output_text.delta` events
- **THEN** the backend forwards each delta as an SSE event to the client containing the text fragment

#### Scenario: Stream completes

- **WHEN** the Responses API emits `response.completed`
- **THEN** the backend signals stream completion to the client
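
The two scenarios above can be sketched as a small relay loop. This is a sketch under assumptions, not the backend's actual code: it assumes upstream events arrive as standard SSE `event:`/`data:` lines whose JSON payload carries the fragment in a `delta` field. The downstream event names `delta` and `done` are hypothetical, since the spec does not fix the wire format toward the client.

```python
import json


def _field_value(line, name):
    """Return the value of an SSE field line, dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield (event or "message", "\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))
        # Comment lines (starting with ":") and other fields are ignored.


def forward_events(upstream_lines):
    """Yield client-facing SSE frames: one per text delta, then a completion frame."""
    for event, data in parse_sse(upstream_lines):
        if event == "response.output_text.delta":
            fragment = json.loads(data)["delta"]  # assumed payload field
            yield "event: delta\ndata: " + json.dumps({"text": fragment}) + "\n\n"
        elif event == "response.completed":
            yield "event: done\ndata: {}\n\n"  # hypothetical completion signal
            return
        # All other upstream event types are dropped in this sketch.
```

Re-encoding each fragment as JSON keeps newlines inside a delta from breaking the SSE framing toward the client.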

### Requirement: Configurable proxy target

The Responses API base URL and model name SHALL be configurable via `appsettings.json` in the API project, not hardcoded.

#### Scenario: Configuration read at startup

- **WHEN** the API starts
- **THEN** it reads `ResponsesApi:BaseUrl` and `ResponsesApi:Model` from configuration
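
The colon-delimited keys above correspond to nested sections in `appsettings.json`. The sketch below shows that mapping with a small lookup helper. The URL and model values are placeholders, the helper names are hypothetical, and failing fast on a missing key is an assumption rather than something this spec requires. Unlike .NET configuration, this lookup is case-sensitive.

```python
import json

# Example appsettings.json content; both values are placeholders.
APPSETTINGS_EXAMPLE = """
{
  "ResponsesApi": {
    "BaseUrl": "http://localhost:8000",
    "Model": "example-model"
  }
}
"""


def get_setting(config, key):
    """Resolve a colon-delimited key such as 'ResponsesApi:BaseUrl'."""
    node = config
    for part in key.split(":"):
        if not isinstance(node, dict) or part not in node:
            # Assumption: a missing setting is treated as a startup error.
            raise KeyError(f"Missing required setting: {key}")
        node = node[part]
    return node


def load_proxy_settings(raw_json):
    """Read the two settings the proxy needs from appsettings-style JSON."""
    config = json.loads(raw_json)
    return {
        "base_url": get_setting(config, "ResponsesApi:BaseUrl"),
        "model": get_setting(config, "ResponsesApi:Model"),
    }
```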

### Requirement: Client streams from backend

The WASM client SHALL send `POST /api/chat` using an `HttpRequestMessage` with `SetBrowserResponseStreamingEnabled(true)` and pass `HttpCompletionOption.ResponseHeadersRead` to `SendAsync`, then iterate the SSE stream to update the UI token by token.

#### Scenario: Client reads streaming response

- **WHEN** the client sends a chat request
- **THEN** it reads the response stream incrementally and appends each text delta to the assistant message in real time
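
One detail worth illustrating is that network chunks do not line up with SSE event boundaries, so the client has to buffer until a full frame has arrived. The sketch below shows that buffering logic in Python as compact notation; the real client would use the .NET APIs named above. It assumes the backend emits hypothetical `delta` and `done` events, with a JSON `text` field on each delta and `\n\n` between frames. None of these are fixed by this spec.

```python
import json


class AssistantMessage:
    """Minimal stand-in for the assistant message shown in the UI."""

    def __init__(self):
        self.text = ""
        self.done = False


def consume_chunks(chunks, message, on_update=lambda m: None):
    """Append each text delta to `message` as soon as its frame is complete."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Process every complete frame; keep any partial frame in the buffer.
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            event, data = "message", ""
            for line in frame.split("\n"):
                if line.startswith("event:"):
                    event = line[len("event:"):].strip()
                elif line.startswith("data:"):
                    data += line[len("data:"):].strip()
            if event == "delta":
                message.text += json.loads(data)["text"]
                on_update(message)  # e.g. trigger a UI re-render
            elif event == "done":
                message.done = True
                return
```

The `on_update` callback is where a UI framework would be told to re-render, so each fragment becomes visible as it arrives.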

### Requirement: Error propagation

If the Responses API returns an error or is unreachable, the API backend SHALL return an appropriate HTTP error status and the client SHALL display the error to the user.

#### Scenario: Responses API unreachable

- **WHEN** the Responses API is not running
- **THEN** the backend returns an HTTP error status and the client displays an error message instead of an assistant response
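
The spec leaves the exact status code open. As one possible reading, the sketch below maps both upstream failure modes to `502 Bad Gateway` and shows the client choosing between an error and an assistant message. The status choice, the function names, and the message texts are all assumptions made for illustration.

```python
def to_client_error(upstream_status=None, unreachable=False):
    """Return (status, message) for a failed upstream call, or None on success.

    Assumption: both an unreachable upstream and an upstream error response
    are reported to the client as 502 Bad Gateway.
    """
    if unreachable:
        return 502, "The Responses API could not be reached."
    if upstream_status is not None and upstream_status >= 400:
        return 502, f"The Responses API returned HTTP {upstream_status}."
    return None


def render_turn(status, body):
    """Decide what the chat UI shows for a backend reply."""
    if status >= 400:
        # Show the error in place of an assistant response.
        return {"kind": "error", "text": body}
    return {"kind": "assistant", "text": body}
```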