## Purpose
Define the streaming AI response pipeline: the backend chat endpoint built on Semantic Kernel, SSE delivery to the WASM client, configuration, and error handling.
## Requirements
### Requirement: Chat endpoint proxies to Responses API
The API backend SHALL expose `POST /api/chat` that accepts a `ChatRequest` containing messages, an optional system prompt, and optional model settings. The request is processed using a Semantic Kernel chat completion service. When a system prompt is provided, it SHALL be added as the first system message in the `ChatHistory`. When model settings are provided, non-null values SHALL be applied to the execution settings. A separate `POST /api/chat/extract` endpoint SHALL handle extraction-specific requests with few-shot prompting.
#### Scenario: Successful chat request with system prompt
- **WHEN** the client sends a POST to `/api/chat` with messages and a system prompt
- **THEN** the API creates a `ChatHistory` with the system prompt as the first message, followed by the conversation messages, and processes them through Semantic Kernel
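
The ordering rule in this scenario can be illustrated with a small language-neutral sketch. It is written in Python for brevity and is not the backend's Semantic Kernel code; `Message` and `build_history` are hypothetical names, and treating an empty string as "no system prompt" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    role: str     # "system", "user", or "assistant"
    content: str


def build_history(messages: List[Message],
                  system_prompt: Optional[str] = None) -> List[Message]:
    """Order the history as the requirement describes.

    Assumption: an empty string counts as "no system prompt provided".
    """
    history: List[Message] = []
    if system_prompt:
        # The system prompt, when present, is always the first entry.
        history.append(Message("system", system_prompt))
    history.extend(messages)  # conversation messages keep their order
    return history


conversation = [Message("user", "Summarise this email."),
                Message("assistant", "Sure, which one?")]
with_prompt = build_history(conversation, "You are a concise assistant.")
without_prompt = build_history(conversation)
```

With a prompt, the first entry has role `system`; without one, the history is exactly the conversation messages.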
#### Scenario: Successful chat request with model settings
- **WHEN** the client sends a POST to `/api/chat` with messages and model settings (e.g., Temperature=0.3)
- **THEN** the API applies the settings to `OpenAIPromptExecutionSettings` before invoking the Semantic Kernel chat completion service
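
The "non-null values are applied" rule can be sketched as a merge over defaults. This is a minimal Python illustration, not the backend code: `ModelSettings`, `apply_settings`, and the `top_p` and `max_tokens` fields are hypothetical (the spec names only Temperature).

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ModelSettings:
    # Only temperature is named in the spec; the other fields are illustrative.
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    max_tokens: Optional[int] = None


def apply_settings(defaults: ModelSettings,
                   requested: Optional[ModelSettings]) -> ModelSettings:
    """Copy only the non-null requested values onto the defaults."""
    if requested is None:
        return defaults
    overrides = {
        f.name: getattr(requested, f.name)
        for f in fields(requested)
        # "is not None" (not truthiness) so that 0.0 is still applied.
        if getattr(requested, f.name) is not None
    }
    return replace(defaults, **overrides)


defaults = ModelSettings(temperature=1.0, max_tokens=512)
effective = apply_settings(defaults, ModelSettings(temperature=0.3))
```

Testing against `None` rather than truthiness matters here: a requested temperature of `0.0` is a real value and should override the default.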
#### Scenario: Successful chat request without optional fields
- **WHEN** the client sends a POST to `/api/chat` with only messages (no system prompt, no settings)
- **THEN** the API processes the request with default behavior (no system message, default execution settings)
#### Scenario: Extraction request routed to dedicated endpoint
- **WHEN** the client sends a POST to `/api/chat/extract` with email HTML
- **THEN** the API uses the few-shot `ChatHistory` prefix and extraction tools instead of the general chat configuration
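
The spec does not define the shape of the few-shot prefix. One common arrangement, shown here purely as an assumption, is a run of example user/assistant pairs placed before the real email. The function name and the example strings below are hypothetical placeholders, and the extraction tools are out of scope for this sketch.

```python
from typing import List, Sequence, Tuple

Turn = Tuple[str, str]  # (role, content)


def build_extraction_history(examples: Sequence[Tuple[str, str]],
                             email_html: str) -> List[Turn]:
    """Prefix the real email with worked examples (assumed layout)."""
    history: List[Turn] = []
    for example_input, expected_output in examples:
        # Each example is replayed as a user turn plus the ideal answer.
        history.append(("user", example_input))
        history.append(("assistant", expected_output))
    # The email to extract from always comes last.
    history.append(("user", email_html))
    return history


examples = [("<p>example email A</p>", "expected output A"),
            ("<p>example email B</p>", "expected output B")]
history = build_extraction_history(examples, "<p>real email</p>")
```

Keeping the prefix separate from the general chat history is what lets `/api/chat/extract` differ from `/api/chat` without changing the request shape.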
### Requirement: Streaming response delivery
The API backend SHALL stream the Semantic Kernel's chat completion response back to the WASM client as `text/event-stream`, forwarding text content so the client can render tokens incrementally. The SSE event format MUST remain `data: {"text":"..."}\n\n` for text deltas and `data: [DONE]\n\n` for completion.
#### Scenario: Tokens stream to client
- **WHEN** the Semantic Kernel emits streaming chat message content
- **THEN** the backend forwards each content chunk as an SSE event to the client containing the text fragment
#### Scenario: Stream completes
- **WHEN** the Semantic Kernel streaming response completes
- **THEN** the backend signals stream completion to the client with `data: [DONE]\n\n`
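
The wire format required above can be made concrete with a short sketch. It is written in Python to show the bytes on the wire, not the backend's actual streaming code; `sse_events` is a hypothetical name, and skipping empty chunks is an assumption.

```python
import json
from typing import Iterable, Iterator, Optional

DONE_EVENT = "data: [DONE]\n\n"


def format_event(payload: dict) -> str:
    """Serialize one SSE data event with compact JSON."""
    return "data: " + json.dumps(payload, separators=(",", ":")) + "\n\n"


def sse_events(chunks: Iterable[Optional[str]]) -> Iterator[str]:
    """Turn text chunks into SSE events, ending with the DONE marker."""
    for chunk in chunks:
        if not chunk:
            continue  # assumption: chunks with no text are not forwarded
        yield format_event({"text": chunk})
    yield DONE_EVENT


events = list(sse_events(["Hel", None, "lo\n"]))
```

JSON encoding escapes the newline inside a chunk, so each event stays on a single `data:` line and the blank-line event separator remains unambiguous.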
### Requirement: Configurable proxy target
The CLIProxyAPI base URL and model name SHALL be configurable via `appsettings.json` in the API project, not hardcoded. These values are used to configure the Semantic Kernel OpenAI connector.
#### Scenario: Configuration read at startup
- **WHEN** the API starts
- **THEN** it reads `ResponsesApi:BaseUrl` and `ResponsesApi:Model` from configuration to configure the Semantic Kernel OpenAI connector
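
The colon-separated keys map onto nested JSON sections in `appsettings.json`. The sketch below mimics that lookup in Python to show the expected file shape. The URL and model values are placeholders, `get_setting` is a hypothetical helper, and failing fast on a missing key is an assumption the spec does not state. Unlike .NET configuration, this sketch is case-sensitive.

```python
import json

# Placeholder values only; the real base URL and model name are deployment-specific.
APPSETTINGS_JSON = """
{
  "ResponsesApi": {
    "BaseUrl": "http://proxy.example.invalid/v1",
    "Model": "placeholder-model"
  }
}
"""


def get_setting(config: dict, key: str) -> str:
    """Resolve a 'Section:Key' path against nested JSON objects."""
    node = config
    for part in key.split(":"):
        if not isinstance(node, dict) or part not in node:
            # Assumption: a missing key is a startup error, not a silent default.
            raise KeyError(f"Missing required configuration key: {key}")
        node = node[part]
    return node


config = json.loads(APPSETTINGS_JSON)
base_url = get_setting(config, "ResponsesApi:BaseUrl")
model = get_setting(config, "ResponsesApi:Model")
```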
### Requirement: Client streams from backend
The WASM client SHALL call `POST /api/chat` with `SetBrowserResponseStreamingEnabled(true)` and `HttpCompletionOption.ResponseHeadersRead`, then iterate the SSE stream to update the UI token by token.
#### Scenario: Client reads streaming response
- **WHEN** the client sends a chat request
- **THEN** it reads the response stream incrementally and appends each text delta to the assistant message in real time
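
The client-side loop amounts to reading the stream line by line, decoding each `data:` payload, and appending the text. The Python sketch below shows that parsing logic only; it does not model the WASM client's HTTP calls, and `read_text_deltas` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator

DATA_PREFIX = "data: "


def read_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from SSE lines until the DONE marker."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith(DATA_PREFIX):
            continue  # blank separator lines and anything that is not data
        payload = line[len(DATA_PREFIX):]
        if payload == "[DONE]":
            return  # stream finished; stop reading
        event = json.loads(payload)
        if "text" in event:
            yield event["text"]


stream = ['data: {"text":"Hel"}\n', "\n",
          'data: {"text":"lo"}\n', "\n",
          "data: [DONE]\n", "\n"]

assistant_message = ""
for delta in read_text_deltas(stream):
    assistant_message += delta  # the UI would re-render after each append
```

Because the function is a generator, each delta is available as soon as its line arrives, which is what allows token-by-token rendering.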
### Requirement: Error propagation
If the LLM service returns an error or is unreachable, the API backend SHALL return an error SSE event and the client SHALL display the error to the user.
#### Scenario: LLM service unreachable
- **WHEN** the CLIProxyAPI proxy is not running
- **THEN** the API returns an error SSE event and the client displays an error message instead of an assistant response
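
The spec does not define the payload of the error SSE event. The sketch below assumes a hypothetical `{"error":"..."}` shape, chosen only to mirror the `{"text":"..."}` events, and shows both sides in Python: the backend converting a failure into an event, and the client preferring the error over any partial text. All function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def format_event(payload: dict) -> str:
    return "data: " + json.dumps(payload, separators=(",", ":")) + "\n\n"


def stream_with_error_event(chunks: Iterable[str]) -> Iterator[str]:
    """Backend side: forward text, or emit one error event on failure."""
    try:
        for chunk in chunks:
            yield format_event({"text": chunk})
        yield "data: [DONE]\n\n"
    except Exception as exc:
        # Assumed payload shape; a real service may prefer a sanitized message.
        yield format_event({"error": str(exc)})


def render(events: Iterable[str]) -> str:
    """Client side: show the error instead of an assistant response."""
    text = ""
    for event in events:
        payload = event[len("data: "):].strip()
        if payload == "[DONE]":
            break
        data = json.loads(payload)
        if "error" in data:
            return "Error: " + data["error"]
        text += data["text"]
    return text


def unreachable_service() -> Iterator[str]:
    yield "Hel"
    raise ConnectionError("LLM service unreachable")


failed = list(stream_with_error_event(unreachable_service()))
shown = render(failed)
```

Whether `[DONE]` should follow an error event is left open by the spec; this sketch simply ends the stream after the error.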