AgenticCode/openspec/specs/chat-streaming/spec.md
local 471e9ce935 feat: migrate chat backend to Semantic Kernel with tool calling support
Replace manual HTTP proxy in ChatController with Semantic Kernel's
OpenAI chat completion service pointed at CLIProxyAPI. Add extraction
plugin with validation function for structured field extraction from
natural language, enabling an agentic loop with auto-retry and
human-in-the-loop escalation.

- Add Microsoft.SemanticKernel 1.74.0 with OpenAI connector
- Create ExtractedFields schema and ValidationResult models
- Create ExtractionPlugin with [KernelFunction] validation
- Rewrite ChatController to use IChatCompletionService streaming
- Configure FunctionChoiceBehavior.Auto() for tool calling
- Preserve existing SSE contract (client unchanged)
- Update tests to mock SK services, add plugin and integration tests
- Archive multi-turn-conversations and migrate-to-semantic-kernel changes
- Sync specs for agent-extraction, semantic-kernel-integration, chat-streaming

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:59:13 +01:00

Purpose

Define the streaming AI response pipeline: the backend chat endpoint built on Semantic Kernel, SSE delivery to the WASM client, configuration, and error handling.

Requirements

Requirement: Chat endpoint proxies to Responses API

The API backend SHALL expose POST /api/chat that accepts a list of messages and processes them using a Semantic Kernel chat completion service. The kernel is configured with an OpenAI connector pointed at the existing CLIProxyAPI proxy.

Scenario: Successful chat request

  • WHEN the client sends a POST to /api/chat with a message list
  • THEN the API processes the messages through the Semantic Kernel and returns the response

Requirement: Streaming response delivery

The API backend SHALL stream the Semantic Kernel's chat completion response back to the WASM client as text/event-stream, forwarding text content so the client can render tokens incrementally. The SSE event format MUST remain data: {"text":"..."}\n\n for text deltas and data: [DONE]\n\n for completion.

Scenario: Tokens stream to client

  • WHEN the Semantic Kernel emits streaming chat message content
  • THEN the backend forwards each content chunk to the client as an SSE event containing the text fragment

Scenario: Stream completes

  • WHEN the Semantic Kernel streaming response completes
  • THEN the backend signals stream completion to the client with data: [DONE]\n\n
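
The wire format required above can be illustrated with a short sketch. It is written in Python purely as a language-neutral illustration of the framing and is not the C# backend. The names text_event, DONE_EVENT, and encode_stream are hypothetical, and skipping empty chunks is an assumption rather than something this spec states.

```python
import json
from typing import Iterable, Iterator

# Completion sentinel required by the SSE contract.
DONE_EVENT = "data: [DONE]\n\n"


def text_event(fragment: str) -> str:
    """Frame one text delta as `data: {"text":"..."}` followed by a blank line."""
    # Compact separators keep the payload in the {"text":"..."} form with no spaces.
    # json.dumps also escapes newlines and quotes, so a fragment cannot break the frame.
    payload = json.dumps({"text": fragment}, separators=(",", ":"))
    return f"data: {payload}\n\n"


def encode_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per non-empty chunk, then the [DONE] sentinel."""
    for chunk in chunks:
        if chunk:  # assumption: chunks that carry no text are skipped
            yield text_event(chunk)
    yield DONE_EVENT


frames = list(encode_stream(["Hel", "", "lo"]))
```

Each frame ends with a blank line, which is what lets an SSE reader detect event boundaries; JSON-encoding the fragment keeps embedded newlines from being mistaken for one.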

Requirement: Configurable proxy target

The CLIProxyAPI base URL and model name SHALL be configurable via appsettings.json in the API project, not hardcoded. These values are used to configure the Semantic Kernel OpenAI connector.

Scenario: Configuration read at startup

  • WHEN the API starts
  • THEN it reads ResponsesApi:BaseUrl and ResponsesApi:Model from configuration to configure the Semantic Kernel
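
A configuration fragment matching the two keys above might look like the following. The section and key names come from the scenario; the URL and model values are placeholders chosen for illustration, not the project's actual settings.

```json
{
  "ResponsesApi": {
    "BaseUrl": "http://localhost:8080/v1",
    "Model": "example-model"
  }
}
```

In .NET configuration, the nested ResponsesApi object is addressed with colon-separated keys, which is why the scenario refers to ResponsesApi:BaseUrl and ResponsesApi:Model.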

Requirement: Client streams from backend

The WASM client SHALL call POST /api/chat with SetBrowserResponseStreamingEnabled(true) and HttpCompletionOption.ResponseHeadersRead, then iterate the SSE stream to update the UI token by token.

Scenario: Client reads streaming response

  • WHEN the client sends a chat request
  • THEN it reads the response stream incrementally and appends each text delta to the assistant message in real time
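
The client-side read loop can be sketched as follows. This is again a Python illustration of the parsing logic rather than the Blazor WASM client; iter_text_deltas and the simulated body are hypothetical, and the sketch assumes the response has already been split into lines by a stream reader.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text of each `data: {"text":...}` line until `data: [DONE]`."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separator lines and non-data fields are ignored
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # completion sentinel: stop reading
        event = json.loads(data)
        if "text" in event:
            yield event["text"]


# Simulated response body, split into lines as a stream reader would deliver them.
body = [
    'data: {"text":"Hel"}',
    "",
    'data: {"text":"lo"}',
    "",
    "data: [DONE]",
    "",
]

assistant_message = ""
for delta in iter_text_deltas(body):
    assistant_message += delta  # the UI would re-render after each append
```

Appending each delta as it arrives, instead of waiting for the whole body, is what produces the token-by-token rendering the requirement describes.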

Requirement: Error propagation

If the LLM service returns an error or is unreachable, the API backend SHALL return an error SSE event and the client SHALL display the error to the user.

Scenario: LLM service unreachable

  • WHEN the CLIProxyAPI proxy is not running
  • THEN the client displays an error message instead of an assistant response